Predix Columnar Store Service Overview
About Columnar Store
Predix Columnar Store is a data storage service based on Cassandra, a NoSQL database designed to handle large data workloads across multiple nodes with no single point of failure.
Cassandra has a peer-to-peer distributed architecture in which data is distributed among multiple homogeneous nodes. Nodes are organized into data centers, and one or more data centers form a cluster. Data is replicated across nodes and data centers to protect against catastrophic loss and to speed request processing. Any authenticated user can connect to any node in any data center and access data by using CQL (Cassandra Query Language, which is similar to SQL). Read and write requests can be sent to any node in a cluster; the recipient node acts as a proxy between the client application and the nodes where the requested data is located. If a node or data center is down, data is retrieved from the nearest available node, and changes are synchronized when the failed node or data center is restored.
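The routing and replication behavior described above can be illustrated with a toy model. This is a simplified sketch, not Cassandra's actual implementation: it assigns one token per node and hashes with MD5 for convenience (Cassandra's default partitioner is Murmur3-based), and the class name `ToyRing`, its methods, and the node names are hypothetical.

```python
import hashlib
from bisect import bisect_right


class ToyRing:
    """Toy token ring: a key is stored on the first node clockwise from
    its hash, plus the next (replication_factor - 1) nodes on the ring."""

    def __init__(self, nodes, replication_factor=3):
        if replication_factor > len(nodes):
            raise ValueError("replication factor exceeds node count")
        self.rf = replication_factor
        # Each node gets a single token derived from its name (an assumption
        # made for brevity; real clusters assign tokens differently).
        self.ring = sorted((self._token(n), n) for n in nodes)
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def _token(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas_for(self, key):
        """Return the nodes that hold copies of the given key."""
        start = bisect_right(self.tokens, self._token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.rf)]

    def read(self, key, down=()):
        """Any node can coordinate; it forwards to the first live replica."""
        for node in self.replicas_for(key):
            if node not in down:
                return node
        raise RuntimeError("no live replica for key")


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
replicas = ring.replicas_for("sensor-42")
# If the first replica is down, the read is served by the next one.
served_by = ring.read("sensor-42", down={replicas[0]})
```

Because every key has several replicas, a read still succeeds when one replica is unavailable, which is the property the paragraph above describes.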
Cassandra Infrastructure Components
- Node: Data is stored in nodes, which can be virtual or physical locations.
- Data center: A group of related nodes, either physical or virtual, in the same physical location. Replication is configured at this level, and data can be written to multiple data centers. Distinct workloads should be handled by separate data centers to keep requests close to each other and reduce data latency.
- Cluster: A group of one or more data centers that can be distributed across multiple physical locations.
- Commit log: Data is first written to this log for durability, and then flushed to disk as SSTables when the in-memory write buffer (the memtable) is full. After the corresponding data is flushed to disk, commit log segments can be archived, deleted, or recycled.
- SSTable: A sorted string table file to which Cassandra writes data. These tables are append-only, stored to disk sequentially, and maintained for each Cassandra table.
- CQL Table: A collection of ordered columns that has a primary key and is fetched by table row.
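The relationship between the commit log and SSTables can be sketched as a toy write path. In Cassandra, writes are also held in an in-memory structure called a memtable, which is flushed to a new SSTable when it fills up. The class below is an illustration under that model only: `ToyWritePath`, its attribute names, and the `memtable_limit` parameter are hypothetical, and Python lists and tuples stand in for files on disk.

```python
class ToyWritePath:
    """Toy model of the write path: commit log -> memtable -> SSTable."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit
        self.commit_log = []  # append-only record kept for durability
        self.memtable = {}    # in-memory writes that are not yet flushed
        self.sstables = []    # immutable, key-sorted segments ("on disk")

    def write(self, key, value):
        # 1. Record the write in the commit log first.
        self.commit_log.append((key, value))
        # 2. Apply it to the in-memory table.
        self.memtable[key] = value
        # 3. Flush to a new SSTable once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # SSTables are written once, sorted by key, and never modified.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed entries no longer need replay, so the log can be recycled.
        self.commit_log.clear()

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables newest first.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


store = ToyWritePath(memtable_limit=2)
store.write("b", 1)
store.write("a", 2)  # memtable is full, so it is flushed to an SSTable
store.write("a", 3)  # newer value stays in the memtable until the next flush
```

Because SSTables are never updated in place, a newer value for the same key lives in a later segment (or in the memtable), and reads resolve to the most recent one.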
Features and Benefits
Columnar Store provides all of the power and flexibility of the Cassandra database within the Predix platform, with pre-built infrastructure, integration, and easy provisioning.
- Decentralized: Masterless architecture means all nodes are equal, and there is no single point of failure. Data can be written to and read from all nodes and is automatically distributed among nodes. Hardware failures therefore do not impact your important data, and network bottlenecks are eliminated.
- Fault tolerant: Columnar Store distributes your data across multiple nodes and data centers to provide even more failover protection. When nodes fail, they can be easily restored or replaced, and the commit log design prevents data loss.
- Scalable: Easy provisioning means you can quickly scale from three to n nodes as your needs evolve.
- Fully replicated: You can customize data replication by selecting a replication factor that meets your requirements.
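In Cassandra itself, the replication factor is declared per keyspace, and with `NetworkTopologyStrategy` it is set per data center. The helper below is a minimal sketch that only builds the corresponding CQL statement as a string. The function name `create_keyspace_cql`, the keyspace name, and the data center names are hypothetical, and how replication is actually configured for a provisioned Columnar Store instance is not covered by this overview.

```python
import re


def create_keyspace_cql(keyspace, replication_by_dc):
    """Build a CQL CREATE KEYSPACE statement with per-data-center
    replication factors (NetworkTopologyStrategy)."""
    # Unquoted CQL identifiers: a letter followed by letters, digits, or _.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", keyspace):
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    if not replication_by_dc:
        raise ValueError("at least one data center is required")

    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, factor in replication_by_dc.items():
        if not isinstance(factor, int) or factor < 1:
            raise ValueError(f"replication factor for {dc!r} must be >= 1")
        # Escape single quotes inside the CQL string literal by doubling them.
        escaped_dc = dc.replace("'", "''")
        options.append(f"'{escaped_dc}': {factor}")

    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{{', '.join(options)}}};"
    )


# Hypothetical example: three copies in one data center, two in another.
statement = create_keyspace_cql("sensor_data", {"dc1": 3, "dc2": 2})
```

A higher replication factor improves availability and read locality at the cost of additional storage and write traffic.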
About Cassandra
Columnar Store is based on Cassandra, a non-relational database that offers benefits not found in traditional RDBMS products.
The changing data landscape of today's online applications has created a need for data storage technologies with low latency, massive scalability, continuous uptime, and global data distribution with the ability to read and write in any location. These key requirements, along with the desire to reduce software and operational costs, are the reasons behind the growing popularity of non-relational database technologies.
Cassandra differs from a more traditional relational database, such as PostgreSQL, in the following ways:
Relational Database | Cassandra |
---|---|
Supports moderate incoming data velocity | Supports high incoming data velocity |
Incoming data from one or few locations | Incoming data from many locations |
Designed to manage mostly structured data | Designed to manage all types of data |
Supports complex and nested transactions | Supports simple transactions |
Single point of failure with failover | No single point of failure with continuous uptime |
Handles moderate data volumes | Handles very high data volumes |
Centralized architecture and deployment | Decentralized architecture and deployment |
Most data written in a single location | Data written in many locations |
Read scalability support, with consistency sacrifices | Read and write scalability support |
Vertical scale-up deployment | Horizontal scale-out deployment |

When deciding whether Cassandra fits your needs, consider the following questions:
- What volume of incoming data do you need to store?
- Do you anticipate that data volume will grow over time?
- What is the expected incoming data velocity?
- How many locations generate the data you need to store?
- Is your data structured or unstructured?
- What level of transaction complexity support do you need?
- How important are continuous uptime and data durability?
Columnar Store Architecture

Predix Columnar Store can exchange data with Cloud Foundry apps and receive inputs from other Predix services. Cloud Foundry apps can send data to Columnar Store and to other Predix services. External cloud instances of apps and services cannot access Columnar Store or any other component of the Predix Data Services Virtual Private Cloud (VPC).