
DOPS-242: Ingesting with Cloudera DataFlow and Apache NiFi
One of the most critical functions of a data-driven enterprise is the ability to manage ingest and data flow across complex ecosystems. Does your team have the tools and skill sets to succeed at this? This four-day course provides the fundamental concepts and experience necessary to automate the ingress, flow, transformation, and egress of data using Apache NiFi. The course also covers tuning, troubleshooting, and monitoring the dataflow process, as well as how to integrate a dataflow with the Cloudera CDP hybrid ecosystem and with external systems.
During this course, you learn how to:
- Define, configure, organize, and manage dataflows
- Transform and trace data as it flows to its destination
- Track changes to dataflows with NiFi Registry
- Use the NiFi Expression Language to control dataflows (see the sketch after this list)
- Optimize dataflows for better performance and maintainability
- Connect dataflows with other systems, such as Apache Kafka, Apache Hive, and HDFS
- Utilize the DataFlow Service
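For a taste of the Expression Language objective, the sketch below shows a few expressions as they might be entered as property values on an UpdateAttribute processor. It is written as a Python dictionary purely for readability; in NiFi itself these values go into the processor's configuration dialog. The destination property names are hypothetical, while filename and fileSize are core FlowFile attributes, and substringAfterLast, toUpper, now, format, and gt are standard Expression Language functions.

```python
# A minimal sketch of NiFi Expression Language usage, presented as the
# property values one might set on an UpdateAttribute processor.
# The destination attribute names (ext, filename.upper, ingest.date,
# is.large) are illustrative only.
update_attribute_properties = {
    # Pull the file extension out of the core "filename" attribute.
    "ext": "${filename:substringAfterLast('.')}",
    # Uppercase the filename for a downstream naming convention.
    "filename.upper": "${filename:toUpper()}",
    # Stamp each FlowFile with the ingest date.
    "ingest.date": "${now():format('yyyy-MM-dd')}",
    # Boolean a later RouteOnAttribute processor can branch on:
    # is the FlowFile larger than 1 MiB (1048576 bytes)?
    "is.large": "${fileSize:gt(1048576)}",
}
```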
Course Outline

Introduction to Cloudera Flow Management
- Overview of Cloudera Data-in-Motion
- The NiFi User Interface
- DataFlow Catalog
- ReadyFlows
- Instructor-Led Demo: NiFi User Interface
- Hands-On Exercise: Build Your First Dataflow

Processors
- Overview of Processors
- Processor Surface Panel
- Processor Configuration
- Hands-On Exercise: Start Building a Dataflow Using Processors

Connections
- Overview of Connections
- Connection Configuration
- Connector Context Menu
- Hands-On Exercise: Connect Processors in a Dataflow

Dataflows
- Command and Control of a Dataflow
- Processor Relationships
- Back Pressure
- Prioritizers
- Labels
- Hands-On Exercise: Build a More Complex Dataflow
- Hands-On Exercise: Creating a Fork Using Relationships
- Hands-On Exercise: Set Back Pressure Thresholds

Process Groups
- Anatomy of a Process Group
- Input and Output Ports
- Hands-On Exercise: Simplify Dataflows Using Process Groups

FlowFile Provenance
- Data Provenance Events
- FlowFile Lineage
- Replaying a FlowFile
- Hands-On Exercise: Using Data Provenance

Parameters
- Parameter Contexts
- Referencing Parameters
- Managing Parameters
- Migrating from Variables
- Hands-On Exercise: Creating, Using, and Managing Parameters

Flow Definitions and Templates
- Flow Definition Overview
- Creating a Flow Definition
- Importing and Deploying a Flow
- Using (Migrating from) Templates
- Hands-On Exercise: Creating, Using, and Managing Flow Definitions

Apache NiFi Registry
- Apache NiFi Registry Overview
- Using the Registry
- Hands-On Exercise: Versioning Flows Using NiFi Registry

FlowFile Attributes
- FlowFile Attribute Overview
- Routing on Attributes
- Hands-On Exercise: Working with FlowFile Attributes

NiFi Expression Language
- NiFi Expression Language Overview
- Syntax
- Expression Language Editor
- Setting Conditional Values
- Hands-On Exercise: Using the NiFi Expression Language

Controller Services
- Controller Services Overview
- Common Controller Services
- Hands-On Exercise: Adding an Apache Hive Controller

Record-Based Components
- Record-Oriented Data
- Record-Based Processors
- Avro Schema Registry
- Schema Format
- Reading and Writing Record Data
- Querying Record Data
- QueryRecord Processor
- Writing Record Data
- Hands-On Exercise: TBD (Creating a function to read and write data?)
- Enriching Record Data
- ETL Operations
- Split and Join Processors
- Update Record Processors
- Wait and Notify Processors

NiFi Architecture Overview
- Public Cloud Architecture
- Private Cloud Architecture

DataFlow Functions
- Overview
- Serverless Functions
- Demo: Deploying a Flow Definition as a Function

Dataflow Optimization
- Control Rate
- Managing Compute
- Hands-On Exercise: Building an Optimized Dataflow

Monitoring, Reporting, and Troubleshooting
- Monitoring from NiFi (see the sketch after this outline)
- Reporting
- Examples of Common Reporting Tasks
- Hands-On Exercise: Monitoring and Reporting

NiFi Security
- NiFi Security Overview
- Securing Access to the NiFi UI
- Metadata Management

Integrating NiFi
- NiFi Integration Architecture
- Available ReadyFlows
- A Closer Look at NiFi and Apache Hive
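As a preview of the monitoring material, the following minimal sketch polls a NiFi instance's aggregate flow status over the REST API, the same counters the UI shows in its status bar. It assumes an unsecured lab instance at http://localhost:8080; a secured deployment would also need a bearer token or client certificate, and the /flow/status endpoint and field names used here reflect recent NiFi releases.

```python
# Minimal monitoring sketch: poll NiFi's aggregate flow status via REST.
# Assumes an unsecured lab instance; add authentication for real clusters.
import requests

NIFI_URL = "http://localhost:8080/nifi-api"  # assumed lab endpoint

def flow_status() -> dict:
    """Return instance-wide thread and queue counts from /flow/status."""
    resp = requests.get(f"{NIFI_URL}/flow/status", timeout=10)
    resp.raise_for_status()
    status = resp.json()["controllerStatus"]
    return {
        "active_threads": status["activeThreadCount"],
        "flowfiles_queued": status["flowFilesQueued"],
        "bytes_queued": status["bytesQueued"],
    }

if __name__ == "__main__":
    # A scheduled script could alert when queues grow without bound, a
    # symptom the back pressure and optimization modules above address.
    print(flow_status())
```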
This course is designed for developers, data engineers, administrators, and others with an interest in learning NiFi’s innovative no-code, graphical approach to data ingest. Although programming experience is not required, basic experience with Linux is presumed, and previous exposure to big data concepts and applications is helpful.



