Introduction
Apache Cassandra is a free, open-source NoSQL database server commonly used with distributed systems. Cassandra can replicate high volumes of data across multiple cloud locations with low latency. When designing an application that requires high availability even in cases where a full data location goes down, Cassandra is a strong choice for a database server. Examples of these applications include banking systems, healthcare management systems, social media applications, and e-commerce stores.
This guide explains how to install, use, and secure the Apache Cassandra database server on an Ubuntu 22.04 server.
Prerequisites
Before you begin:
-
Deploy an Ubuntu 22.04 server with at least 8 GB RAM in development, or 32 GB in production.
-
Using SSH, access the server.
-
Switch to the sudo user account.
# su example-user
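To check the memory prerequisite from the command line, the following is a minimal sketch rather than an official requirements check. It assumes a Linux host that exposes MemTotal in /proc/meminfo. The function name classify_memory and the rounding to the nearest GiB are illustrative choices, not part of the original steps.

```shell
#!/usr/bin/env bash
# classify_memory MEM_KB
# Compares a memory size in kB (the unit used by /proc/meminfo) with the
# sizes named in this guide: 8 GB for development, 32 GB for production.
# Assumption: round to the nearest GiB, because a server sold as "8 GB"
# usually reports slightly less than 8 GiB of usable memory.
classify_memory() {
    local mem_kb="$1"
    local mem_gb=$(( (mem_kb + 524288) / 1048576 ))
    if [ "$mem_gb" -ge 32 ]; then
        echo "production"
    elif [ "$mem_gb" -ge 8 ]; then
        echo "development"
    else
        echo "insufficient"
    fi
}

# Classify the current host when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    classify_memory "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
fi
```

A result of insufficient suggests resizing the server before continuing with the installation.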
Install the Apache Cassandra Database Server
In this section, install the Apache Cassandra dependencies, such as the Java runtime environment, and then install Apache Cassandra itself as described in the steps below.
-
Install the Java development kit.
$ sudo apt install default-jdk -y
-
Install the apt-transport-https and gnupg2 dependency packages.
$ sudo apt install apt-transport-https gnupg2 -y
-
Download and add the Apache Cassandra GPG key to the server.
$ sudo wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
-
Add the latest Apache Cassandra repository to the APT sources list.
$ echo "deb https://debian.cassandra.apache.org 41x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
The above command adds the repository for the Cassandra 4.1 release series. To use a different version, visit the Apache Cassandra download page.
-
Update the server packages to enable the Cassandra repository.
$ sudo apt update
-
Install Apache Cassandra.
$ sudo apt install cassandra -y
-
When the installation is successful, Apache Cassandra takes an average of 2 minutes to start up. View the application logs to verify that it starts correctly.
$ tail /var/log/cassandra/system.log
Your output should look like the one below:
... INFO [main] ... - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
INFO [OptionalTasks:1] ... Created default superuser role 'cassandra'
As displayed in the above output, the Apache Cassandra database is ready to use with a new default superuser role, cassandra.
-
Check the Apache Cassandra database server status, and verify that it's active.
$ sudo systemctl status cassandra
Output:
● cassandra.service - LSB: distributed storage system for structured data
     Loaded: loaded (/etc/init.d/cassandra; generated)
     Active: active (running) ...
...
-
Verify the Cassandra node status.
$ nodetool status
Output:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  104.33 KiB  16      100.0%            15addadb-68b2-48f4-b076-ed020c2e926b  rack1
As displayed in the Apache Cassandra status outputs, the database server is active and ready to use.
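Because startup can take a couple of minutes, a small polling helper can stand in for repeated manual checks of the log. The sketch below makes several assumptions: wait_for_log_line is a hypothetical helper name, and the 180-second timeout and 5-second interval are arbitrary defaults. The search string in the usage comment is the CQL listener message shown in the log output earlier in this section.

```shell
#!/usr/bin/env bash
# wait_for_log_line FILE PATTERN [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 as soon as the fixed string PATTERN appears in FILE.
# Returns 1 if the pattern has not appeared once TIMEOUT_SECONDS have passed.
wait_for_log_line() {
    local file="$1" pattern="$2" timeout="${3:-180}" interval="${4:-5}"
    local waited=0
    while [ "$waited" -le "$timeout" ]; do
        # -F treats the pattern as a literal string, -q suppresses output.
        if grep -qF -- "$pattern" "$file" 2>/dev/null; then
            return 0
        fi
        sleep "$interval"
        waited=$(( waited + interval ))
    done
    return 1
}

# Example usage (not executed here):
#   wait_for_log_line /var/log/cassandra/system.log \
#       "Starting listening for CQL clients" 180 5 && echo "Cassandra is ready"
```

The helper only reads the log file, so it does not confirm that queries succeed. Running nodetool status afterwards remains a useful second check.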
Use the Apache Cassandra Command-line Interface
Apache Cassandra uses the cqlsh command-line tool, which accepts Cassandra Query Language (CQL) commands, to interact with the database server. In this section, access the database server, and add sample data as described in the steps below.
-
Log in to the Cassandra database server.
$ cqlsh
-
Create a new sample keyspace my_company.
cqlsh> CREATE KEYSPACE my_company WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor' : 1 };
The above command creates a new my_company keyspace. In Cassandra, a keyspace is a namespace that groups tables and defines how their data is replicated across nodes.
-
Switch to the keyspace.
cqlsh> USE my_company;
-
Create a new sample products table with the following columns.
cqlsh:my_company> CREATE TABLE products ( product_id UUID PRIMARY KEY, product_name TEXT, retail_price DOUBLE );
The above code creates a table in which the product_id column is the PRIMARY KEY that uniquely identifies records in the products table. The product_name column stores product names, and retail_price stores the actual price that customers pay for a product. product_id uses the Universally Unique Identifier (UUID) data type. Because any node can generate a UUID without coordinating with other nodes, product_id values stay unique across the Cassandra cluster even when some nodes are down.
-
Add sample data to the products table.
cqlsh:my_company> INSERT INTO products (product_id, product_name, retail_price) VALUES (UUID(), 'PLIERS', 3.52);
cqlsh:my_company> INSERT INTO products (product_id, product_name, retail_price) VALUES (UUID(), 'FILE', 4.38);
cqlsh:my_company> INSERT INTO products (product_id, product_name, retail_price) VALUES (UUID(), 'PADLOCK', 14.30);
In the above commands, the UUID() function generates a new product_id for each record.
-
View the products table data.
cqlsh:my_company> SELECT product_id, product_name, retail_price FROM products;
Output:
 product_id                           | product_name | retail_price
--------------------------------------+--------------+--------------
 56c0f7f7-be59-4f0d-a474-5a2174c19060 | FILE         | 4.38
 12e720cf-f191-4e59-8c66-360339deae65 | PADLOCK      | 14.3
 9c62c3a0-0e31-41ab-90d1-09f4d35b6e26 | PLIERS       | 3.52

(3 rows)
-
Exit the Cassandra database console.
cqlsh:my_company> QUIT;
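The interactive steps above can also be run non-interactively. The sketch below writes equivalent statements to a temporary file and passes that file to cqlsh with the -f option. The IF NOT EXISTS clauses and the fully qualified my_company.products table name are additions that do not appear in the steps above; they are included so the script can be re-run without errors. Each run still inserts a new PLIERS row, because UUID() generates a fresh key every time.

```shell
#!/usr/bin/env bash
# Write the sample schema and data to a temporary CQL file.
cql_file="$(mktemp)"
cat > "$cql_file" <<'CQL'
CREATE KEYSPACE IF NOT EXISTS my_company
  WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE IF NOT EXISTS my_company.products (
  product_id UUID PRIMARY KEY,
  product_name TEXT,
  retail_price DOUBLE
);
INSERT INTO my_company.products (product_id, product_name, retail_price)
  VALUES (UUID(), 'PLIERS', 3.52);
SELECT product_id, product_name, retail_price FROM my_company.products;
CQL

# Run the file only when cqlsh is installed; otherwise keep it for later use.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -f "$cql_file"
else
    echo "cqlsh not found; statements saved to $cql_file"
fi
```

Keeping the statements in a file makes the setup repeatable and easy to place under version control.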
Secure Apache Cassandra with Password Authentication
It's important to protect the Apache Cassandra database server from unauthorized access. In this section, enable password authentication and replace the default superuser account as described in the steps below.
-
Using a text editor such as Nano, edit the main Cassandra configuration file.
$ sudo nano /etc/cassandra/cassandra.yaml
-
Find the following authenticator: directive.
...
authenticator: AllowAllAuthenticator
...
-
Change the authenticator: value from AllowAllAuthenticator to PasswordAuthenticator as shown below.
...
authenticator: PasswordAuthenticator
...
Save and close the file.
In the above configuration, the default AllowAllAuthenticator does not prompt users for a password to access the Cassandra database, while PasswordAuthenticator enforces password authentication for all database users.
-
Restart the Apache Cassandra server to apply the changes.
$ sudo systemctl restart cassandra
The above command restarts Apache Cassandra. Wait for at least 1 minute before re-accessing the database console. Check the application logs to verify that the startup is complete.
$ tail /var/log/cassandra/system.log
Output:
... INFO [main]... CassandraDaemon.java:768 - Startup complete
... Deleting sstable: /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-8-big
...
-
Log in to the Cassandra database console using the default username (cassandra) and password (cassandra).
$ cqlsh -u cassandra -p cassandra
-
Create a new superuser account. Replace db_administrator with your desired username, and EXAMPLE_PASSWORD with a strong password.
cassandra@cqlsh> CREATE ROLE db_administrator WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'EXAMPLE_PASSWORD';
-
Exit the Cassandra database console.
cassandra@cqlsh> QUIT;
-
To verify that the user is available, log in to the Cassandra database server as the new user.
$ cqlsh -u db_administrator
When prompted, enter the user password you created earlier, and press ENTER to proceed.
-
To tighten server security, disable the default superuser cassandra.
db_administrator@cqlsh> ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;
-
Exit the database console.
db_administrator@cqlsh> QUIT;
-
Verify that the default user account can no longer access the database server.
$ cqlsh -u cassandra -p cassandra
Output:
... Connection error: ('Unable to connect to any servers', {'127.0.0.1:9042': AuthenticationFailed('Failed to authenticate to 127.0.0.1:9042: Error from server: code=0100 [Bad credentials] message="cassandra is not permitted to log in"')})
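The manual edit of cassandra.yaml earlier in this section can also be scripted. The following sketch assumes the file contains the single-line form authenticator: AllowAllAuthenticator shown in that step, as in the 4.1 release used in this guide; other releases may lay out this setting differently. The helper name enable_password_auth is hypothetical, and the function keeps a backup copy of the file with a .bak suffix.

```shell
#!/usr/bin/env bash
# enable_password_auth FILE
# Rewrites the top-level "authenticator: AllowAllAuthenticator" line in FILE
# to use PasswordAuthenticator. The original file is kept as FILE.bak.
# Lines that do not match exactly (for example "authorizer:") are untouched.
enable_password_auth() {
    local file="$1"
    sed -i.bak \
        's/^authenticator:[[:space:]]*AllowAllAuthenticator[[:space:]]*$/authenticator: PasswordAuthenticator/' \
        "$file"
}

# Example usage from a root shell (not executed here):
#   enable_password_auth /etc/cassandra/cassandra.yaml
#   systemctl restart cassandra
```

If the pattern does not match, the file is left unchanged, so it is worth confirming the result with grep '^authenticator:' on the file before restarting Cassandra.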
Conclusion
In this guide, you installed, used, and secured the Apache Cassandra database server on an Ubuntu 22.04 server. Using the cqlsh command, you accessed the database server and used CQL commands to create a sample keyspace and table, then added data to the table. For more information and configuration options, visit the Apache Cassandra official documentation.