Introduction to HBase

HBase is a NoSQL database that has seen a tremendous increase in popularity in recent years. There are many great sources that explain details of the architecture or guide you through the installation of your own HBase cluster. But when I started my research, I missed a simple overview that answers questions such as: What are the requirements for HBase? Which data structure is used to meet these requirements? And how is this data structure integrated into the architecture of HBase? I am convinced that anyone who can answer these questions will find it much easier to understand the details of HBase architecture, development, configuration and data modeling. That was my motivation for answering these questions in this blog post.

This blog post gives an introduction to HBase which

comes to the point
uses bullet points where possible
uses intuitive drawings

The goal is that, after reading it, you know

why this technology is needed
the basic concepts of HBase
how it can be used
which architecture is used
how the underlying data structure works
why architecture and data structure fulfill all requirements

Requirements

The requirements for database systems have changed over the past decade with respect to the following factors:

Volume
Variety
Velocity

Some typical applications which produce this kind of data are

Internet, Social Media
Natural sciences: Genome Data, Large Hadron Collider (CERN)
Logistics
Production
(And in the future) Internet of things, wearables

Besides these new challenges, it is important not to forget some classical database system requirements:

Atomicity
Consistency
Isolation
Durability

The need for HBase

Hadoop is a framework for storing, processing and managing large amounts of data. Among others, it offers the following tools and features:

Resource management
Fault tolerance
Distributed file system HDFS
Large scale batch processing with MapReduce
Runs on commodity Hardware
Hadoop is a growing platform with many tools and a good integration into other systems

Out of the box, Hadoop can handle a high volume of multi-structured data. But it cannot handle a high velocity of random reads and writes, and it cannot change a file without completely rewriting it.

HBase is a NoSQL database
It is designed on top of Hadoop, dealing with the drawbacks of HDFS
It can also be used with other file systems
HBase allows fast random reads and writes.
Although HBase allows fast random writes, it is read optimized

Index Data Structure

Requirements for the index data structure

Fast random reads
Fast random writes
Consistency and fail-safety
Based on HDFS

The problem of Hadoop

Fast random reads require the data to be stored structured (ordered).
The only way to modify a file stored on HDFS without rewriting it is to append to it.
Fast random writes into sorted files using only appends seem to be impossible.
The solution to this problem is the Log-Structured Merge Tree (LSM Tree).
The HBase data structure is based on LSM Trees

The Log-Structured Merge Tree

The LSM Tree works the following way

All puts (insertions) are appended to a write ahead log (this can be done fast on HDFS, and the log can be used to restore the database in case anything goes wrong)
An in-memory data structure (MemStore) stores the most recent puts (fast and ordered)
From time to time the MemStore is flushed to disk as a sorted, immutable file.
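
The three steps above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions, not how HBase is implemented: the class name `ToyLSMStore`, the `flush_threshold` parameter and the in-memory lists standing in for files on HDFS are all made up for illustration, and the log is simply cleared after a flush.

```python
import bisect


class ToyLSMStore:
    """Toy LSM write path: append to a log, buffer in memory, flush sorted runs."""

    def __init__(self, flush_threshold=3):
        self.wal = []            # append-only write ahead log (stand-in for a file on HDFS)
        self.memstore = {}       # most recent puts, kept in memory
        self.flushed_files = []  # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def put(self, key, value):
        self.wal.append((key, value))  # 1. append to the log first
        self.memstore[key] = value     # 2. update the in-memory store
        if len(self.memstore) >= self.flush_threshold:
            self.flush()               # 3. flush once the buffer is full

    def flush(self):
        if self.memstore:
            # Write the buffered puts as one sorted, immutable run.
            self.flushed_files.append(sorted(self.memstore.items()))
            self.memstore = {}
            self.wal = []  # simplification: flushed entries need no replay

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        # Newest run first, so the most recent value wins.
        for run in reversed(self.flushed_files):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Note that `get` may have to look into every flushed run, which is exactly the read problem discussed next.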

This results in the following structure

This results in many small files on HDFS.
HDFS works better with a few large files than with many small ones.
A get or scan potentially has to look into all small files. So fast random reads are not possible as described so far.
That is why HBase constantly checks if it is necessary to combine several small files into one larger one.
This process is called compaction. There are two different kinds of compactions.
Minor Compactions merge a few small ordered files into one larger ordered one without touching the data.
Major Compactions merge all files into one file. During this process outdated or deleted values are removed.
Guarantees on the maximum number of compactions per entry can be made because of the way HBase triggers compactions.
Bloom Filters (stored in the Metadata of the files on HDFS) can be used for a fast exclusion of files when looking for a specific key.
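
The difference between the two kinds of compaction can be illustrated with a small sketch. It assumes each file is a list of `(key, timestamp, value)` tuples sorted by key and, within a key, by descending timestamp. The `TOMBSTONE` marker and both function names are hypothetical, and the major compaction here keeps only the newest version per key, which is a simplification.

```python
import heapq

TOMBSTONE = object()  # hypothetical delete marker


def minor_compact(files):
    """Merge sorted runs into one sorted run, keeping every entry untouched."""
    return list(heapq.merge(*files, key=lambda e: (e[0], -e[1])))


def major_compact(files):
    """Merge all runs, keep the newest version per key, drop deleted keys."""
    result = []
    last_key = object()  # sentinel that equals no real key
    for key, ts, value in minor_compact(files):
        if key == last_key:
            continue  # outdated version of a key that was already handled
        last_key = key
        if value is not TOMBSTONE:
            result.append((key, ts, value))
    return result
```

Because every input is already sorted, the merge is a single sequential pass over the files, which suits HDFS well.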

Data Model and Properties

HBase uses the following data model

Every entry in a Table is indexed by a RowKey
For every RowKey an unlimited number of attributes can be stored in Columns
There is no strict schema with respect to the Columns. New Columns can be added during runtime
HBase Tables are sparse. A missing value doesn’t need any space
Several versions can be stored for every attribute, each with a different Timestamp.
Once a value is written to HBase it cannot be changed. Instead another version with a more recent Timestamp can be added.
To delete a value from HBase a Tombstone value has to be added.
The Columns are grouped into ColumnFamilies. The ColumnFamilies are defined at table creation time. Changing them later requires an administrative schema change, so they should be chosen carefully up front.
HBase is a distributed system. It is guaranteed that all values belonging to the same RowKey and ColumnFamily are stored together.

Alternatively HBase can also be seen as a sparse, multidimensional, sorted map with the following structure:

(Table, RowKey, ColumnFamily, Column, Timestamp) → Value

Or in an object oriented way:

Table ← SortedMap<RowKey, Row>
Row ← List<ColumnFamily>
ColumnFamily ← SortedMap<Column, List<Entry>>
Entry ← Tuple<Timestamp,Value>
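
The nested-map view can be written down directly with Python dictionaries. This is only a sketch of the logical model, not of the storage format. The helper names `new_table`, `put` and `latest` are invented for this example, and plain dictionaries stand in for sorted maps.

```python
from collections import defaultdict


def new_table():
    # table[row_key][column_family][column] -> list of (timestamp, value)
    return defaultdict(lambda: defaultdict(lambda: defaultdict(list)))


def put(table, row_key, family, column, timestamp, value):
    versions = table[row_key][family][column]
    versions.append((timestamp, value))
    # Keep the newest version first.
    versions.sort(key=lambda entry: entry[0], reverse=True)


def latest(table, row_key, family, column):
    # .get() avoids creating entries for missing cells, so the table stays sparse.
    versions = table.get(row_key, {}).get(family, {}).get(column, [])
    return versions[0][1] if versions else None
```

A cell that was never written simply has no entry in the map, which is what makes the table sparse.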

HBase supports the following operations:

Get: Returns the values for a given RowKey. Filters can be used to restrict the results to specific ColumnFamilies, Columns or versions.
Put: Adds a new entry. The Timestamp can be set automatically or manually.
Scan: Returns the values for a range of RowKeys. Scans are very efficient in HBase. Filters can also be used to narrow down the results. HBase 0.98.0 (which was released last week) also allows backward scans.
Delete: Adds a Tombstone marker
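
To make the semantics of the four operations concrete, here is a toy in-memory table. It is not the HBase client API: `ToyTable`, its method signatures and the `TOMBSTONE` object are assumptions made for this sketch. Columns are written as `"family:column"` strings, timestamps come from a simple counter, and the scan treats the start row as inclusive and the stop row as exclusive.

```python
import bisect
import itertools

TOMBSTONE = object()  # hypothetical delete marker


class ToyTable:
    def __init__(self):
        self._clock = itertools.count(1)  # automatic, increasing timestamps
        self._cells = {}  # (row_key, column) -> [(timestamp, value), ...]

    def put(self, row_key, column, value, timestamp=None):
        ts = next(self._clock) if timestamp is None else timestamp
        # Existing versions are never changed; a new one is added.
        self._cells.setdefault((row_key, column), []).append((ts, value))

    def delete(self, row_key, column):
        # A delete is just another write: it adds a tombstone version.
        self.put(row_key, column, TOMBSTONE)

    def get(self, row_key):
        row = {}
        for (rk, column), versions in self._cells.items():
            if rk == row_key:
                _, value = max(versions, key=lambda v: v[0])  # newest version
                if value is not TOMBSTONE:
                    row[column] = value
        return row

    def scan(self, start_row, stop_row):
        # Rows are visited in sorted key order: start inclusive, stop exclusive.
        row_keys = sorted({rk for rk, _ in self._cells})
        lo = bisect.bisect_left(row_keys, start_row)
        hi = bisect.bisect_left(row_keys, stop_row)
        return [(rk, self.get(rk)) for rk in row_keys[lo:hi] if self.get(rk)]
```

For example, after `t.put("row1", "cf:a", "v1")` followed by `t.delete("row1", "cf:a")`, both versions are still stored internally, but `t.get("row1")` returns an empty row.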

Architecture

HBase is a distributed database
The data is partitioned based on the RowKeys into Regions.
Each Region contains a range of RowKeys based on their binary order.
A RegionServer can contain several Regions.
All Regions contained in a RegionServer share one write ahead log (WAL).
Regions are automatically split if they become too large.
Every Region creates a Log-Structured Merge Tree for every ColumnFamily. That’s why fine tuning like compression can be done on ColumnFamily level. This should be considered when defining the ColumnFamilies.
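
How a RowKey is mapped to a Region can be sketched with a sorted list of region start keys and a binary search. The class `RegionMap`, the server names and the manual `split` call are hypothetical. In HBase itself, splitting and assignment are handled by the system.

```python
import bisect


class RegionMap:
    """Maps row keys to regions; a region covers [start_key, next start_key)."""

    def __init__(self):
        self.start_keys = [b""]      # the first region starts at the empty key
        self.servers = ["server-1"]  # hypothetical RegionServer names

    def locate(self, row_key):
        # Find the last region whose start key is <= row_key.
        i = bisect.bisect_right(self.start_keys, row_key) - 1
        return self.start_keys[i], self.servers[i]

    def split(self, split_key, new_server):
        # The region containing split_key is cut in two at split_key.
        i = bisect.bisect_right(self.start_keys, split_key)
        self.start_keys.insert(i, split_key)
        self.servers.insert(i, new_server)
```

Python compares `bytes` values byte by byte, which matches the binary order mentioned above. One consequence is that the key `b"Z"` sorts before `b"m"`, because uppercase letters have smaller byte values than lowercase ones.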

HBase uses ZooKeeper to manage all required services.
The assignment of Regions to RegionServers and the splitting of Regions is managed by a separate service, the HMaster
The ROOT and the META table are two special kinds of HBase tables which are used for efficiently identifying which RegionServer is responsible for a specific RowKey in case of a read or write request.
When performing a get or scan, the client asks ZooKeeper where to find the ROOT Table. Then the client asks the ROOT Table for the correct META Table. Finally it can ask the META Table for the correct RegionServer.
The client stores information about ROOT and META Tables to speed up future lookups.
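Using these three layers is efficient for a practically unlimited number of RegionServers. As of HBase 0.96 the ROOT Table has been removed: the location of the META Table is stored directly in ZooKeeper, which shortens the lookup by one step.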

Mission completed?

Does HBase fulfill all “new” requirements?

Volume: By adding new servers to the cluster HBase scales horizontally to an arbitrary amount of data.
Variety: The sparse and flexible table structure is optimal for multi-structured data. Only the ColumnFamilies have to be predefined.
Velocity: By adding new servers, HBase scales horizontally to arbitrarily high rates of read and write requests. The key to this is the LSM-Tree structure.

What about ACID (Atomicity, Consistency, Isolation, Durability)?

HBase is not fully ACID.
ACID guarantees can only be made for changes within a single row.
Lars Hofhansl has written a nice blog post where he explains how and when ACID can be guaranteed. [1]

Whether this is a limitation depends on the use case. For many Big Data use cases it isn’t.

The CAP Theorem?

A distributed system cannot be consistent, available and tolerant to network partitions at the same time.
Since every distributed system has to be tolerant to network partitions (communication between the nodes may be disturbed), one has to choose between availability (the system will always accept and process read and write requests) and consistency (an update is applied to all relevant nodes at the same time).
HBase is partition tolerant and consistent (CP). System failures may result in unprocessed requests.
Coda Hale wrote a really good article on the CAP theorem [2]

When to use HBase?

HBase should be considered in the following cases

Existing Hadoop cluster
Huge amount of data
Fast random reads and/or writes
Well known access patterns

Don’t use HBase in the following cases

New data only needs to be appended
Batch processing instead of random reads
Complicated access patterns (such as joins)
Full ANSI SQL support required
A single node can deal with the volume and the velocity of the complete data set

How to use HBase?

The possibilities for SQL querying and data interaction are restricted. Simple SQL access patterns are possible using Hive.
HBase has a Java API
HBase tables can be used as input to MapReduce jobs (including Pig and Hive)
HBase is a perfect candidate for the serving layer in a Lambda Architecture which combines real time analytics with batch processing.

YMC – Your HBase Expert in Switzerland

Understanding the fundamentals is essential. However, running HBase in production can be challenging. To ensure high throughput and low latency, one has to design a proper schema, derived from the query patterns. Data model, row key design, manual splitting and monitoring compactions are just a few of the key challenges. We at YMC have been running a production cluster for one of our customers since 2011 and have gained a lot of valuable knowledge. On this journey we developed Hannibal, an open-source tool for monitoring HBase.


[1] Lars Hofhansl: ACID in HBase
[2] Coda Hale: You Can’t Sacrifice Partition Tolerance