Cloudera

DOPS-244: Apache Kafka on Cloudera

This four-day instructor-led course begins by introducing Apache Kafka, explaining its key concepts and architecture, and discussing several common use cases. Building on this foundation, you will learn how to plan a Kafka deployment, and then gain hands-on experience by installing and configuring your own cloud-based, multi-node cluster running Kafka on Cloudera. You will use this cluster in the more than 20 hands-on exercises that follow. These cover a range of essential skills, starting with how to create Kafka topics, producers, and consumers, and continuing through progressively more challenging aspects of Kafka operations and development, such as scalability, reliability, and performance problems. Throughout the course, you will learn and use Cloudera's recommended tools for working with Kafka, including Cloudera Manager, Schema Registry, Streams Messaging Manager, and Cruise Control.

28 hours · Virtual

During this course, you learn how to:
  Plan, deploy, and operate Kafka clusters
  Create and manage topics
  Develop producers and consumers
  Use replication to improve fault tolerance
  Use partitioning to improve scalability
  Troubleshoot common problems and performance issues

Kafka Overview
  High-Level Architecture
  Common Use Cases
  Cloudera's Distribution of Apache Kafka

Deploying Apache Kafka
  System Requirements and Dependencies
  Service Roles
  Planning Your Deployment
  Deploying Kafka Services
  Exercise: Preparing the Exercise Environment
  Exercise: Installing the Kafka Service with Cloudera Manager
  Exercise (optional): Create Metrics Dashboards
  Exercise (optional): Using the CM API

Kafka Command Line Basics
  Create and Manage Topics
  Running Producers and Consumers

Using Streams Messaging Manager (SMM)
  Streams Messaging Manager Overview
  Producers, Topics, and Consumers
  Data Explorer
  Brokers
  Topic Management
  Exercise: Managing Topics using the CLI
  Exercise: Connecting Producers and Consumers from the Command Line

Kafka Java API Basics
  Overview of Kafka's APIs
  Topic Management from the Java API
  Exercise (optional): Managing Kafka Topics Using the Java API
  Using Producers and Consumers from the Java API
  Exercise: Developing Producers and Consumers with the Java API

Improving Availability through Replication
  Replication
  Exercise: Observing Downtime Due to Broker Failure
  Considerations for the Replication Factor
  Exercise: Adding Replicas to Improve Availability

Improving Application Scalability
  Partitioning
  How Messages are Partitioned
  Exercise: Observing How Partitioning Affects Performance
  Consumer Groups
  Exercise: Implementing Consumer Groups
  Consumer Rebalancing
  Exercise: Using a Key to Control Partition Assignment

Improving Application Reliability
  Delivery Semantics
  Demonstration (optional): ISRs vs. ACKs
  Producer Delivery
  Exercise: Idempotent Producer
  Transactions
  Exercise: Transactional Producers and Consumers
  Handling Consumer Failure
  Offset Management
  Exercise: Detecting and Suppressing Duplicate Messages
  Exercise: Handling Invalid Records
  Handling Producer Failure

Analyzing Kafka Clusters with SMM
  End-to-End Latency
  Notifiers
  Alert Policies
  Use Cases

Monitoring Kafka
  Monitoring Overview
  Monitoring using Cloudera Manager
  Charts and Reports in CM
  Monitoring Recommendations
  Metrics for Troubleshooting
  Diagnosing Service Failure
  Exercise: Monitoring Kafka

Managing Kafka
  Managing Kafka Topic Storage
  Demonstration (optional): Message Retention Period
  Log Cleanup and Collection
  Rebalancing Partitions
  Cruise Control
  Exercise: Installing Cruise Control
  Exercise: Troubleshooting Kafka Topics
  Unclean Leader Election
  Exercise: Unclean Leader Election
  Adding and Removing Brokers
  Exercise: Adding and Removing Brokers
  Best Practices

Message Structure, Format, and Versioning
  Message Structure
  Schema Registry
  Defining Schemas
  Schema Evolution and Versioning
  Schema Registry Client
  Exercise: Using an Avro Schema

Improving Application Performance
  Message Size
  Batching
  Compression
  Exercise: Observing How Compression Affects Performance

Improving Kafka Service Performance
  Performance Tuning Strategies for the Administrator
  Cluster Sizing
  Exercise: Planning Capacity Needed for a Use Case

Securing the Kafka Cluster
  Encryption
  Authentication
  Authorization
  Auditing

This course is designed for system administrators, data engineers, and developers. All students are expected to have basic Linux experience, and basic proficiency with the Java programming language is recommended. No prior experience with Apache Kafka is necessary.

Upcoming Sessions

Contact us for upcoming dates

There are currently no upcoming sessions scheduled for this course.
