Date: 2021-12-02

1. Introduction

Apache Cassandra is an open-source, distributed, wide-column NoSQL database that can handle massive amounts of data across many commodity servers with no single point of failure. It was originally developed at Facebook, is now maintained by the Apache Software Foundation, and is written in Java. In this article, we will go through the step-by-step process of installing Cassandra on CentOS 7 Linux.

2. Pre-requisites

All commands below should be run as root or with sudo.

2.1. Install Python 2.7

On CentOS 7, Python 2.7 comes pre-installed. If it's missing for some reason, you can use the following command to install it:

# yum -y install python

# python --version

Python 2.7.5

2.2. Install Java

Use the following commands to install the latest version of Java 8 and verify the installation:

# yum install java-1.8.0-openjdk-devel

# java -version

Sample output:

openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
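
The version check above can also be scripted, which is handy when preparing several nodes. The following is a minimal sketch, not part of the original procedure: the helper name is_java8 is hypothetical, and it assumes the first line of the java -version output has the same form as in the sample output.

```shell
#!/bin/sh
# Hypothetical helper: reads `java -version` output on stdin and succeeds
# (exit status 0) only if the first line reports a 1.8.x runtime.
is_java8() {
    head -n 1 | grep -q 'version "1\.8\.'
}

# Run the check only when a java binary is on PATH.
if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr, so merge it into stdout first.
    if java -version 2>&1 | is_java8; then
        echo "Java 8 is the active JVM"
    else
        echo "The active JVM is not Java 8" >&2
    fi
fi
```

If several Java versions are installed, a different one may be the active default, in which case the check reports that Java 8 is not active.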

3. How to install Cassandra

First, add the Cassandra repository. To do so, create a file named cassandra.repo in the /etc/yum.repos.d/ directory:

# vi /etc/yum.repos.d/cassandra.repo

Add the following lines to it:

[cassandra]

name=Apache Cassandra

baseurl=https://www.apache.org/dist/cassandra/redhat/40x/

gpgcheck=1

repo_gpgcheck=1

gpgkey=https://www.apache.org/dist/cassandra/KEYS

Press the ESC key and type :wq to save and close the file.
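
If you prefer not to edit the file interactively, the same repository definition can be written with a here-document. This is only a sketch under assumptions: the function name write_cassandra_repo and the REPO_FILE variable are hypothetical, and the script defaults to a temporary file so it can be tried without touching /etc.

```shell
#!/bin/sh
# Hypothetical helper: write the Cassandra repository definition shown above
# to the file given as the first argument.
write_cassandra_repo() {
    cat > "$1" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
}

# REPO_FILE is an assumed variable name; it defaults to a temporary file.
REPO_FILE="${REPO_FILE:-$(mktemp)}"
write_cassandra_repo "$REPO_FILE"
echo "Repository definition written to $REPO_FILE"
```

To write the real file, run the script as root with REPO_FILE set to /etc/yum.repos.d/cassandra.repo.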

Verify that the Cassandra repository has been added. The following command lists the enabled repositories:

# yum repolist

[Figure: List enabled yum repositories]

After adding the repository, run the following command to install Cassandra on your CentOS system:

# yum -y install cassandra

Enable and start the Cassandra service:

# systemctl enable cassandra

cassandra.service is not a native service, redirecting to /sbin/chkconfig.

Executing /sbin/chkconfig cassandra on

# systemctl start cassandra

Check the status of the Cassandra service:

# systemctl status cassandra

[Figure: Check Cassandra status]

Use the following command to get details of the cluster, such as its state, load and host IDs:

# nodetool status

Sample output:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  69.08 KiB  16      100.0%            bf2df7a9-54bc-41c9-8c6c-0b9322d10e71  rack1

[Figure: View the cluster details]

In the output:

UN - Up and Normal.

Address - IP address of the node.

Load - The amount of file system data under the Cassandra data directory, excluding all content in the snapshots subdirectory. It is updated every 90 seconds.

Tokens - The number of tokens assigned to the node.

Owns - The percentage of the data the node owns, replicas included. For example, a node can hold 33% of the ring but display 100% if the replication factor is 3.

Host ID - The network ID of the node.

Rack - The rack in which the node is located.
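
The status codes described above lend themselves to a simple health check. The following is a minimal sketch, assuming the column layout of the sample output (status code in the first column, address in the second); the helper name list_unhealthy_nodes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `nodetool status` output on stdin and prints the
# address of every node whose two-letter status/state code is not UN.
list_unhealthy_nodes() {
    # Node rows start with a status letter (U or D) followed by a state
    # letter (N, L, J or M); header lines never match this pattern.
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Query the local node only when nodetool is installed.
if command -v nodetool >/dev/null 2>&1; then
    nodetool status | list_unhealthy_nodes
fi
```

On the single-node setup from this article the script prints nothing, because the only node is in the UN state.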

4. Cqlsh – CLI for Cassandra

cqlsh is a command-line interface for interacting with Cassandra using CQL (Cassandra Query Language). It is included in every Cassandra package and can be found alongside the cassandra executable in the bin/ directory. cqlsh is implemented with the Python native protocol driver and connects to a single node.

To launch cqlsh, run:

# cqlsh

Sample output:

Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.1 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>

[Figure: Launch Cqlsh]

5. CQL Sample commands

5.1. Create Key Space

In Cassandra, a keyspace serves as a data container, similar to a database in a relational database management system (RDBMS).

cqlsh> CREATE KEYSPACE IF NOT EXISTS OsTechNix WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 };

cqlsh>

List the keyspaces in the system with the following command:

cqlsh> SELECT * FROM system_schema.keyspaces;

[Figure: Check keyspaces]

To show all keyspaces, run:

cqlsh> desc keyspaces;

All the keyspaces on the cluster will be listed. The new keyspace appears as ostechnix because CQL converts unquoted identifiers to lowercase:

ostechnix system_auth system_schema system_views system system_distributed system_traces system_virtual_schema

[Figure: Show all keyspaces]

5.2. Create table and insert sample data

cqlsh> CREATE TABLE ostechnix.sample_table ( id UUID PRIMARY KEY, name text, birthday timestamp, nationality text, weight text, height text );

cqlsh>

cqlsh> INSERT INTO ostechnix.sample_table (id, name, nationality) VALUES (5b6962dd-3f90-4c93-8f61-eabfa4a803e2, 'KARTHICK', 'Indian');

cqlsh> INSERT INTO ostechnix.sample_table (id, name, nationality, weight) VALUES (5b6962dd-3f90-4c93-8f61-eabfa4a804e3, 'MOHAN', 'Indian', '85');

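Each INSERT statement adds one row, so run one statement per row you want to add. Columns that are not listed in an INSERT statement are left empty (null) for that row.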
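
When many rows need to be loaded, it can help to generate the INSERT statements from a script. The sketch below is an illustration rather than part of the original steps: cql_quote and build_insert are hypothetical helper names, the sample values are made up, and it relies on the CQL uuid() function so that Cassandra generates the id instead of a hand-typed UUID.

```shell
#!/bin/sh
# Hypothetical helper: wrap a value in single quotes for use as a CQL string
# literal, doubling any embedded single quote as CQL requires.
cql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Hypothetical helper: build an INSERT for ostechnix.sample_table from a name
# and a nationality; uuid() asks Cassandra to generate the id.
build_insert() {
    printf 'INSERT INTO ostechnix.sample_table (id, name, nationality) VALUES (uuid(), %s, %s);\n' \
        "$(cql_quote "$1")" "$(cql_quote "$2")"
}

# Made-up sample values, printed rather than executed.
build_insert 'ARUN' 'Indian'
```

The printed statement can then be passed to cqlsh non-interactively with its -e option, for example cqlsh -e "$(build_insert 'ARUN' 'Indian')".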

5.3. Querying the table

cqlsh> SELECT * FROM ostechnix.sample_table;

[Figure: Querying table]

To filter rows on a column that is not part of the primary key, such as weight, append ALLOW FILTERING to the query:

cqlsh> SELECT * FROM ostechnix.sample_table WHERE weight = '85' ALLOW FILTERING;

[Figure: Filter items from table]
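
ALLOW FILTERING makes Cassandra read rows that do not match the condition, which becomes expensive as the table grows. One common alternative, sketched here under the assumption that queries on weight are frequent, is a secondary index on that column. The index name sample_table_weight_idx and the variable name CQL are hypothetical.

```shell
#!/bin/sh
# CQL to index the weight column and then query it without ALLOW FILTERING.
# The index name is a hypothetical choice.
CQL=$(cat <<'EOF'
CREATE INDEX IF NOT EXISTS sample_table_weight_idx ON ostechnix.sample_table (weight);
SELECT * FROM ostechnix.sample_table WHERE weight = '85';
EOF
)

# Show the statements, then run them only when cqlsh is installed.
printf '%s\n' "$CQL"
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -e "$CQL"
fi
```

On a large table the index is built in the background, so the query may not be usable until the build completes. Secondary indexes have costs of their own, so treat this as one option and not a general recommendation.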

6. Summary

In this article, we have gone through the Cassandra installation procedure and a few sample CQL commands. We will take a deeper look at Cassandra operations in upcoming articles.

