Managing HBase | 6.3.x | Cloudera Documentation (2023)

Cloudera Manager requires certain additional steps to set up and configure the HBase service.

Creating the HBase Root Directory

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

When adding the HBase service, the Add Service wizard automatically creates a root directory for HBase in HDFS. If you quit the Add Service wizard or it does not finish, you can create the root directory outside the wizard by doing these steps:

  1. Choose Create Root Directory from the Actions menu in the HBase > Status tab.
  2. Click Create Root Directory again to confirm.

Graceful Shutdown

Minimum Required Role: Operator (also provided by Configurator, Cluster Administrator, Full Administrator)

A graceful shutdown of an HBase RegionServer allows the regions hosted by that RegionServer to be moved to other RegionServers before stopping the RegionServer. Cloudera Manager provides the following configuration options to perform a graceful shutdown of either an HBase RegionServer or the entire service.

To increase the speed of a rolling restart of the HBase service, set the Region Mover Threads property to a higher value. This increases the number of regions that can be moved in parallel, but places additional strain on the HMaster. In most cases, Region Mover Threads should be set to 5 or lower.

Gracefully Shutting Down an HBase RegionServer

  1. Go to the HBase service.
  2. Click the Instances tab.
  3. From the list of Role Instances, select the RegionServer you want to shut down gracefully.
  4. Select Actions for Selected > Decommission (Graceful Stop).
  5. Cloudera Manager attempts to gracefully shut down the RegionServer for the interval configured in the Graceful Shutdown Timeout configuration option, which defaults to 3 minutes. If the graceful shutdown fails, Cloudera Manager forcibly stops the process by sending a SIGKILL (kill -9) signal. HBase will perform recovery actions on regions that were on the forcibly stopped RegionServer.
  6. If you cancel the graceful shutdown before the Graceful Shutdown Timeout expires, you can still manually stop a RegionServer by selecting Actions for Selected > Stop, which sends a SIGTERM (kill -15) signal.
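
The escalation described in these steps (a polite stop request, a bounded wait, then a forced kill) can be sketched in Python. This is a generic illustration of the pattern on a POSIX system, not Cloudera Manager's implementation; the function name stop_with_timeout and the child process used in the demo are hypothetical.

```python
import signal
import subprocess
import sys


def stop_with_timeout(proc, timeout_s):
    """Ask a process to stop, escalating to a forced kill after timeout_s.

    Returns the name of the signal that was needed to end the process.
    """
    proc.terminate()  # sends SIGTERM on POSIX; the process may clean up
    try:
        proc.wait(timeout=timeout_s)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()
        return "SIGKILL"


# Demo with a stand-in child process (assumption: any long-running command).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
used = stop_with_timeout(child, timeout_s=10)
print(used, child.returncode)
```

A process ended by SIGKILL gets no chance to clean up, which is why HBase has to run recovery for regions on a forcibly stopped RegionServer.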

Gracefully Shutting Down the HBase Service

  1. Go to the HBase service.
  2. Select Actions > Stop. This tries to perform an HBase Master-driven graceful shutdown for the length of the configured Graceful Shutdown Timeout (three minutes by default), after which it abruptly shuts down the whole service.

Configuring the Graceful Shutdown Timeout Property

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

This timeout only affects a graceful shutdown of the entire HBase service, not individual RegionServers. Therefore, if you have a large cluster with many RegionServers, you should strongly consider increasing the timeout from its default of 180 seconds.

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > HBASE-1 (Service Wide).
  4. Use the Search box to search for the Graceful Shutdown Timeout property and edit the value.
  5. Click Save Changes to save this setting.

Configuring the HBase Thrift Server Role

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

The Thrift Server role is not added by default when you install HBase, but it is required before you can use certain other features such as the Hue HBase browser. To add the Thrift Server role:

  1. Go to the HBase service.
  2. Click the Instances tab.
  3. Click the Add Role Instances button.
  4. Select the host(s) where you want to add the Thrift Server role (you only need one for Hue) and click Continue. The Thrift Server role should appear in the instances list for the HBase server.
  5. Select the Thrift Server role instance.
  6. Select Actions for Selected > Start.

Enabling HBase Indexing

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

HBase indexing is dependent on the Key-Value Store Indexer service. The Key-Value Store Indexer service uses the Lily HBase Indexer Service to index the stream of records being added to HBase tables. Indexing allows you to query data stored in HBase with the Solr service.

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > HBASE-1 (Service Wide).
  4. Select Category > Backup.
  5. Select the Enable Replication and Enable Indexing properties.
  6. Click Save Changes.

Adding a Custom Coprocessor

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

The HBase coprocessor framework provides a way to extend HBase with custom functionality. To configure the coprocessor properties in Cloudera Manager:

  1. Select the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > All.
  4. Select Category > All.
  5. Type HBase Coprocessor in the Search box.
  6. You can configure the values of the following properties:
    • HBase Coprocessor Abort on Error (Service-Wide)
    • HBase Coprocessor Master Classes (Master Default Group)
    • HBase Coprocessor Region Classes (RegionServer Default Group)
  7. Enter a Reason for change, and then click Save Changes to commit the changes.

Disabling Loading of Coprocessors

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

In CDH 5.7 and higher, you can disable loading of system (HBase-wide) or user (table-wide) coprocessors. Cloudera recommends against disabling loading of system coprocessors, because HBase security functionality is implemented using system coprocessors. However, disabling loading of user coprocessors may be appropriate.

  1. Select the HBase service.
  2. Click the Configuration tab.
  3. Search for HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.
  4. To disable loading of all coprocessors, add a new property with the name hbase.coprocessor.enabled and set its value to false. Cloudera does not recommend this setting.
  5. To disable loading of user coprocessors, add a new property with the name hbase.coprocessor.user.enabled and set its value to false.
  6. Enter a Reason for change, and then click Save Changes to commit the changes.
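
If you paste the safety valve entry as XML rather than using the name and value fields, it takes the usual hbase-site.xml property form. The following Python sketch builds such an entry for the user-coprocessor property named in the steps above; the helper name safety_valve_property is hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def safety_valve_property(name, value):
    """Build one <property> element in hbase-site.xml form and return it as text."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Disable loading of user (table-wide) coprocessors only.
snippet = safety_valve_property("hbase.coprocessor.user.enabled", "false")
print(snippet)
```

The same helper could produce the hbase.coprocessor.enabled entry, although that setting also disables system coprocessors and is not recommended.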

Enabling Hedged Reads on HBase

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Go to the HBase service.
  2. Click the Configuration tab.
  3. Select Scope > HBASE-1 (Service-Wide).
  4. Select Category > Performance.
  5. Configure the HDFS Hedged Read Threadpool Size and HDFS Hedged Read Delay Threshold properties. The descriptions for each of these properties on the configuration pages provide more information.
  6. Enter a Reason for change, and then click Save Changes to commit the changes.
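
For background, a hedged read means that if a read from one replica has not returned within the delay threshold, the client starts a second read against another replica and uses whichever finishes first. The Python sketch below illustrates that idea only; it is not the HDFS client code, and hedged_read, slow_replica, and fast_replica are hypothetical names. The thread pool stands in for the threadpool size property and delay_threshold_s for the delay threshold.

```python
import concurrent.futures as cf
import time


def hedged_read(replica_reads, delay_threshold_s, pool):
    """Read from the first replica; hedge with the second if the first is slow.

    replica_reads is a list of zero-argument callables that return the same data.
    """
    primary = pool.submit(replica_reads[0])
    done, _ = cf.wait([primary], timeout=delay_threshold_s)
    if done or len(replica_reads) < 2:
        return primary.result()
    # Primary is slower than the threshold: start a second, parallel read.
    hedge = pool.submit(replica_reads[1])
    done, _ = cf.wait([primary, hedge], return_when=cf.FIRST_COMPLETED)
    winner = primary if primary in done else hedge
    return winner.result()


def slow_replica():
    time.sleep(0.5)  # simulated slow DataNode
    return "block from slow replica"


def fast_replica():
    return "block from fast replica"


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    result = hedged_read([slow_replica, fast_replica], 0.05, pool)
print(result)
```

A lower threshold triggers hedged reads more often and so uses more threads and I/O, which is the trade-off the two properties let you tune.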

Moving HBase Master Role to Another Host

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

  1. Select the HBase service in Cloudera Manager.
  2. Click the Instances tab.
  3. Click Add Role Instances to add role instances to HBase.
  4. Choose the desired new host under Master.
  5. Click Continue.
  6. Start the newly added HBase Master role.

    As a result, the state of this role becomes Started.

  7. Wait until the type of the newly added HBase Master role becomes Master (Backup).
  8. Stop any other non-active HBase Master role instances.
  9. Stop the remaining active HBase Master role.

    As a result, the type of the newly added HBase Master role automatically becomes Master (Active).

  10. Delete the old HBase Master role instances on hosts that are not wanted.

Advanced Configuration for Write-Heavy Workloads

HBase includes several advanced configuration parameters for adjusting the number of threads available to service flushes and compactions in the presence of write-heavy workloads. Tuning these parameters incorrectly can severely degrade performance and is not necessary for most HBase clusters. If you use Cloudera Manager, configure these options using the HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml.

hbase.hstore.flusher.count
The number of threads available to flush writes from memory to disk. Never increase hbase.hstore.flusher.count to more than 50% of the number of disks available to HBase. For example, if you have 8 solid-state drives (SSDs), hbase.hstore.flusher.count should never exceed 4. This allows scanners and compactions to proceed even in the presence of very high writes.
hbase.regionserver.thread.compaction.large and hbase.regionserver.thread.compaction.small
The number of threads available to handle large and small compactions, respectively. Never increase either of these options to more than 50% of the number of disks available to HBase.

Ideally, hbase.regionserver.thread.compaction.small should be greater than or equal to hbase.regionserver.thread.compaction.large, since the large compaction threads do more intense work and will be in use longer for a given operation.
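
The limits above can be expressed as a small sanity check. This Python sketch encodes only the rules stated in this section (each thread count at most 50% of the disks, and small compaction threads at least equal to large); the function name check_write_heavy_settings and the sample values are hypothetical.

```python
def check_write_heavy_settings(disks, flusher_count, compaction_small, compaction_large):
    """Return a list of findings; an empty list means the guidance is met."""
    limit = disks / 2  # 50% of the disks available to HBase
    settings = {
        "hbase.hstore.flusher.count": flusher_count,
        "hbase.regionserver.thread.compaction.small": compaction_small,
        "hbase.regionserver.thread.compaction.large": compaction_large,
    }
    findings = [
        f"{name}={value} exceeds 50% of {disks} disks"
        for name, value in settings.items()
        if value > limit
    ]
    # Recommendation rather than a hard limit.
    if compaction_small < compaction_large:
        findings.append("compaction.small should be >= compaction.large")
    return findings


# With 8 disks, every thread count may be at most 4.
print(check_write_heavy_settings(8, 4, 4, 2))
print(check_write_heavy_settings(8, 5, 2, 3))
```

The first call reports nothing; the second flags the flusher count and the small/large ordering.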

In addition to the above, if you use compression on some column families, more CPU is used when writing these column families to disk during flushes or compactions. The impact on CPU usage depends on the size of the flush or the amount of data to be decompressed and compressed during compactions.
