Stackable

Open Source Data Platform

The open source data platform

Combining best practices

Popular data apps, easy to use

Stackable provides you with a curated selection of the best open source data apps like Apache Kafka®, Apache Druid, Trino and Apache Spark™. Store, process and visualize your data with the latest versions. Stay ahead of the curve, not behind it.

All data apps work together seamlessly and can be added or removed in no time. Based on Kubernetes, it runs everywhere – on-premises or in the cloud.

Use it to create unique and enterprise-class data architectures. For example, it supports modern Data Warehouses, Data Lakes, Event Streaming, Machine Learning or Data Meshes. 

An illustration of the best open source data apps

Cloud-native Kubernetes Operators

Stackable modules are regular Kubernetes operators. We chose Rust as the programming language because of its excellent performance, low memory requirements, and memory and thread safety.

Kafka logo
Stackable Operator for Apache Kafka
The Stackable Operator for Apache Kafka is a tool for automatically rolling out and managing Apache Kafka in Kubernetes clusters. It supports Stackable authorization and monitoring.
Druid logo
Stackable Operator for Apache Druid
The Stackable Operator for Apache Druid is a tool that can manage Apache Druid clusters. Apache Druid is a real-time database to power modern analytics applications.
Apache Spark logo
Stackable Operator for Apache Spark
The Stackable Operator for Apache Spark is a tool that makes it possible to manage Spark clusters on Kubernetes. It also lets you start Spark jobs on the cluster.
Superset logo
Stackable Operator for Apache Superset
The Stackable Operator for Apache Superset is a tool that can manage Apache Superset. Apache Superset is a modern data exploration and visualization platform. With Stackable, Superset is configured to work with Trino and Apache Druid.
Trino logo
Stackable Operator for Trino
The Stackable Operator for Trino is a tool that manages Trino clusters and configures them to access data stored in Apache HDFS or any S3-compatible cloud storage. Trino is a fast, highly parallel and distributed query engine for Big Data analytics.
Apache Airflow logo
Stackable Operator for Apache Airflow
The Stackable Operator for Apache Airflow is a tool that can manage Apache Airflow clusters. Airflow is a workflow engine that lets you programmatically create, run, and monitor data pipelines, and is a replacement for Apache Oozie.
Apache NiFi logo
Stackable Operator for Apache NiFi
The Stackable Operator for Apache NiFi is a tool for automatically rolling out and managing Apache NiFi. NiFi supports powerful and scalable data flows.
Open Policy Agent logo
Stackable Operator for OPA (Open Policy Agent)
The Stackable Operator for OPA (Open Policy Agent) is a tool that can manage OPA servers. With OPA, rules and guidelines for data access can be flexibly defined “as code”.
Apache HBase logo
Stackable Operator for Apache HBase
The Stackable Operator for Apache HBase is a tool that can manage Apache HBase clusters. HBase is a distributed, scalable Big Data store.
Hadoop logo
Stackable Operator for Apache Hadoop HDFS
The Stackable Operator for Apache Hadoop HDFS is a tool that can manage Apache HDFS clusters. HDFS is a distributed file system that provides high-throughput access to application data.
Hive logo
Stackable Operator for Apache Hive
The Stackable Operator for Apache Hive is a tool that can manage Apache Hive. Currently, it supports the Hive Metastore. The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
ZooKeeper logo
Stackable Operator for Apache ZooKeeper
The Stackable Operator for Apache ZooKeeper is a tool that can automatically roll out and manage Apache ZooKeeper ensembles. Apache ZooKeeper is used by many Big Data products as a highly reliable coordinator of distributed systems.

How it works

From simple to complex environments with infrastructure-as-code

Stackable gives you the flexibility to define both simple and complex data scenarios. Either way, the setup is always as simple as this:

1. Select the Stackable operators for the data apps you need for your data platform and install them using stackablectl or directly via Helm.

2. Install your data apps in the Kubernetes cluster by passing the appropriate configurations (CRDs) to the operators using stackablectl or directly via kubectl, as sketched below.
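
For example, a minimal sketch of these two steps on the command line might look like the following. The operator names, Helm repository URL and chart names are assumptions based on Stackable's public defaults, and the exact stackablectl syntax may vary between versions:

    # Step 1: install the operators with stackablectl ...
    stackablectl operator install zookeeper kafka

    # ... or directly via Helm (repository URL and chart names assumed)
    helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/
    helm install zookeeper-operator stackable-stable/zookeeper-operator
    helm install kafka-operator stackable-stable/kafka-operator

    # Step 2: install the data apps by applying your cluster definitions
    kubectl apply -f zookeeper-cluster.yaml -f kafka-cluster.yaml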

All of these definitions are maintained in an infrastructure-as-code fashion, so even the setup remains testable and repeatable and allows for standardization.
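
Such a definition is just a small declarative manifest that can be kept in version control. The snippet below is a hypothetical sketch of a ZooKeeper cluster definition; the apiVersion and field names are illustrative assumptions that depend on the operator version, so consult the operator documentation for the authoritative schema:

    # Hypothetical cluster definition – field names are illustrative, not authoritative
    kubectl apply -f - <<EOF
    apiVersion: zookeeper.stackable.tech/v1alpha1
    kind: ZookeeperCluster
    metadata:
      name: simple-zookeeper
    spec:
      image:
        productVersion: "3.8.0"   # ZooKeeper version to roll out (assumed field)
      servers:
        roleGroups:
          default:
            replicas: 3           # three-node ensemble (assumed field)
    EOF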

Stackable logo icon
Operator Framework
The Stackable Operator Framework is a Rust library that supports fast, unified development of Kubernetes controllers and operators.
Docker logo
Docker Images Repository
The Stackable Docker Image Repository contains Dockerfiles and scripts to build base images of the open source products supported by and for use within Stackable.
An illustration of a laptop and phone on a desk

Newsletter

Subscribe to the newsletter

With the Stackable newsletter you’ll always stay up to date on everything happening around Stackable!