SkillTect provides result-driven consulting and cloud training for tech companies and IT professionals.

Cassandra: An Overview of the Highly Scalable NoSQL Database


Introduction to Cassandra

Cassandra is an open-source, wide-column NoSQL database designed to handle very large amounts of data while remaining highly available. Its architecture has no single point of failure. Cassandra was originally developed at Facebook, was later accepted into the Apache Incubator, and is now a top-level Apache project. It is used by large companies such as Twitter, Cisco, and Netflix.

Key Features of Cassandra

Below are some of the characteristics of the Cassandra database:

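  1. No single point of failure: Every node can serve requests, so the cluster stays available when individual nodes fail.
  2. Easily scalable: You can add more nodes to accommodate growing data volumes.
  3. Limited transaction support: Cassandra does not provide full ACID (Atomicity, Consistency, Isolation and Durability) transactions in the relational sense. Writes are atomic and isolated within a single partition and durable through the commit log, consistency is tunable per query, and lightweight transactions (compare-and-set operations such as INSERT ... IF NOT EXISTS) cover cases that need stronger guarantees.
  4. Fast writes: Cassandra is optimized for high write throughput while still serving reads efficiently.
  5. Replication: Cassandra replicates data across nodes to ensure durability and availability.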
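
To make the replication and fault-tolerance points more concrete, here is a minimal Python sketch of how copies of a key could be spread over several nodes. It is a simplified model, not Cassandra's implementation: the node names, the `token` and `replicas_for` helpers, and the use of SHA-256 are all assumptions made for illustration (Cassandra's default partitioner is Murmur3-based and its replica placement is more involved).

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a position on the ring (SHA-256 here, purely for illustration)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Return the nodes that would hold copies of `key` in this simplified model."""
    if not nodes:
        return []
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # First node at or after the key's position, wrapping around the ring.
    start = bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    # Walk clockwise to pick the remaining replicas.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node cluster keeping three copies of every key.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = replicas_for("user:42", nodes, replication_factor=3)

# If the first replica fails, the remaining nodes still hold the key.
survivors = [n for n in nodes if n != owners[0]]
fallback = replicas_for("user:42", survivors, replication_factor=3)
```

In this model, losing one node leaves the other two copies in place, which is the intuition behind having no single point of failure.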

CQL and cqlsh Commands

Cassandra has a peer-to-peer architecture in which all nodes of the cluster play the same role. It uses a gossip protocol to share state information across nodes and to detect failed nodes.
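
The idea behind gossip can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not Cassandra's actual protocol: the node names, the `gossip_round` and `run_gossip` functions, and the rule of one random peer per round are hypothetical. Each node keeps a heartbeat counter per known peer, and when two nodes exchange state, the higher counter wins.

```python
import random


def gossip_round(views, rng):
    """One round: every node bumps its own heartbeat, then syncs with one random peer."""
    names = list(views)
    for name in names:
        views[name][name] += 1  # advertise that this node is still alive
    for name in names:
        peers = [n for n in names if n != name]
        if not peers:
            continue
        peer = rng.choice(peers)
        # Merge both views; the most recent heartbeat for each node wins.
        for known in set(views[name]) | set(views[peer]):
            newest = max(views[name].get(known, 0), views[peer].get(known, 0))
            views[name][known] = newest
            views[peer][known] = newest


def run_gossip(node_names, rounds, seed=0):
    """Start with every node knowing only itself and gossip for `rounds` rounds."""
    rng = random.Random(seed)
    views = {name: {name: 0} for name in node_names}
    for _ in range(rounds):
        gossip_round(views, rng)
    return views


names = ["n1", "n2", "n3", "n4", "n5", "n6"]
views = run_gossip(names, rounds=30)
```

After enough rounds every node has heard about every other node without any central coordinator. A peer whose heartbeat stops advancing in these views would be a candidate for being marked as failed; that detection step is not modeled here.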

Cassandra is accessed through the Cassandra Query Language (CQL), and a client can connect to any node in the cluster. CQL statements can be entered interactively in cqlsh, the CQL shell that ships with Cassandra.

Below are some of the cqlsh commands that you can use to perform operations on Cassandra:

  • COPY: copies data to and from Cassandra tables
  • DESCRIBE: describes the current cluster and its objects, such as keyspaces and tables
  • SHOW: displays current session information such as version and host
  • SOURCE: executes a file that contains CQL statements
  • EXIT: terminates your cqlsh session

CQL statements also support familiar SQL-style constructs, for example:

  • Clauses like WHERE and ORDER BY.
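
As a rough illustration of WHERE and ORDER BY in CQL, the Python sketch below builds a few statements and sends them through a session object. The `demo` keyspace, the `events` table, its columns, and the `recent_events` helper are hypothetical names chosen for this example. The `RecordingSession` class is only a stand-in so the sketch runs without a cluster; it mimics the `execute(query, parameters)` call shape of a driver session.

```python
# Keyspace and table definitions (names are made up for this example).
CREATE_KEYSPACE = (
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo.events ("
    "user_id text, event_time timestamp, action text, "
    "PRIMARY KEY (user_id, event_time))"
)
# WHERE restricts the partition key; ORDER BY sorts on the clustering column.
SELECT_RECENT = (
    "SELECT event_time, action FROM demo.events "
    "WHERE user_id = %s ORDER BY event_time DESC"
)


def recent_events(session, user_id):
    """Create the schema if needed, then fetch one user's events, newest first."""
    session.execute(CREATE_KEYSPACE)
    session.execute(CREATE_TABLE)
    return session.execute(SELECT_RECENT, (user_id,))


class RecordingSession:
    """Stand-in session that records statements instead of contacting a cluster."""

    def __init__(self):
        self.statements = []

    def execute(self, query, parameters=None):
        self.statements.append((query, parameters))
        return []


session = RecordingSession()
rows = recent_events(session, "alice")
```

With the DataStax Python driver installed and a node listening locally (both assumptions), a real session from `Cluster(["127.0.0.1"]).connect()` could be passed to `recent_events` in place of the stand-in. The same statements can also be typed directly into cqlsh, with a literal value in place of `%s`.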

Architecture of Cassandra

The main components of Cassandra’s architecture are described below:

  • Node: A single Cassandra instance; this is where data is stored.
  • Datacenter: A collection of related nodes.
  • Cluster: A collection of one or more datacenters.
  • Commit log: An append-only log on disk that records every write operation, so that data can be recovered after a crash.
  • Memtable: An in-memory structure to which data is written after it has been recorded in the commit log.
  • SSTable: An immutable file on disk. Once a memtable fills up, its contents are flushed to an SSTable.
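
The write path through the commit log, memtable, and SSTable can be sketched as a small Python class. This is a toy model under stated assumptions, not Cassandra's storage engine: `TinyStore`, `memtable_limit`, and the `read` method are hypothetical, and real flush thresholds are based on memory usage rather than a key count.

```python
class TinyStore:
    """Toy model of the write path: commit log, then memtable, then SSTable."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # assumed threshold, counted in keys
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # immutable, sorted snapshots standing in for disk files

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. record the write for recovery
        self.memtable[key] = value            # 2. apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. memtable is full

    def flush(self):
        """Freeze the memtable into a sorted, immutable table and start afresh."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need the log

    def read(self, key):
        """Check the memtable first, then the tables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


store = TinyStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)   # reaches the limit and triggers a flush
store.write("a", 3)   # newer value stays in the memtable for now
```

Because a write only appends to a log and updates memory, it stays cheap, which is one reason this design favors fast writes. The read method is included only to show that newer data shadows older flushed data.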

Comparison with HBase

Cassandra and HBase are both wide-column stores, so they share several traits but also differ in notable ways. Some of the similarities and differences are listed below:

  • Similarities:
    1. Both Cassandra and HBase are NoSQL wide-column databases.
    2. Both are linearly scalable.
    3. Both use replication for fault tolerance.
  • Differences:
    1. Cassandra has CQL, which resembles SQL and is far richer than the query interface HBase offers.
    2. Cassandra's documentation is generally considered more thorough than that of HBase.

Cassandra official documentation

Article by Harsh Shrivastav
