Cloudera

Cloudera Training for Apache HBase

This course enables participants to store and access massive quantities of multi-structured data and perform hundreds of thousands of operations per second. Apache HBase is a distributed, scalable, NoSQL database built on Apache Hadoop. HBase can store data in massive tables consisting of billions of rows and millions of columns, serve data to many users and applications in real time, and provide fast, random read/write access to users and applications.

21 hours · Virtual

Through interactive sessions with hands-on exercises, students will learn:
- The use cases for HBase, Hadoop, and RDBMS.
- How to use the HBase shell to directly manipulate HBase tables.
- How to design optimal HBase schemas for efficient data storage and recovery.
- How to connect to HBase using the Java API to insert and retrieve data in real time.
- Best practices for identifying and resolving performance bottlenecks.

Introduction to Hadoop and HBase
- Introducing Hadoop
- Core Hadoop Components
- What Is HBase?
- Why Use HBase?
- Strengths of HBase
- HBase in Production
- Weaknesses of HBase

HBase Tables
- HBase Concepts
- HBase Table Fundamentals
- Thinking About Table Design

The HBase Shell
- Creating Tables with the HBase Shell
- Working with Tables
- Working with Table Data

HBase Architecture Fundamentals
- HBase Regions
- HBase Cluster Architecture
- HBase and HDFS Data Locality

HBase Schema Design
- General Design Considerations
- Application-Centric Design
- Designing HBase Row Keys
- Other HBase Table Features

Basic Data Access with the HBase API
- Options to Access HBase Data
- Creating and Deleting HBase Tables
- Retrieving Data with Get
- Retrieving Data with Scan
- Inserting and Updating Data
- Deleting Data

More Advanced HBase API Features
- Filtering Scans
- Best Practices
- HBase Coprocessors

HBase Write Path
- HBase Write Path
- Compaction
- Splits

HBase Read Path
- How HBase Reads Data
- Block Caches for Reading

HBase on the Cluster
- How HBase Uses HDFS
- Compactions and Splits

HBase Performance Tuning
- Column Family Considerations
- Schema Design Considerations
- Configuring for Caching
- Memory Considerations
- Dealing with Time Series and Sequential Data
- Pre-Splitting Regions

HBase Administration and Cluster Management
- HBase Daemons
- ZooKeeper Considerations
- HBase High Availability
- Using the HBase Balancer
- Fixing Tables with hbck
- HBase Security

HBase Replication and Backup
- HBase Replication
- HBase Backup
- MapReduce and HBase Clusters

Using Hive and Impala with HBase
- Using Hive and Impala with HBase

Conclusion

Appendix A: Accessing Data with Python and Thrift
- Thrift Usage
- Working with Tables
- Getting and Putting Data
- Scanning Data
- Deleting Data
- Counters
- Filters

Appendix B: OpenTSDB

This course is appropriate for developers and administrators who intend to use HBase. Prior experience with databases and data modeling is helpful, but not required. Knowledge of Java is assumed. Prior knowledge of Hadoop is not required.

Upcoming Sessions

Contact us for upcoming dates

There are currently no upcoming sessions scheduled for this course.
