The open source data platform
Combining best practices
Popular data apps, simple to use
Stackable provides you with a curated selection of the best open source data apps like Apache Kafka®, Apache Druid, Trino and Apache Spark™. Store, process and visualize your data with the latest versions. Stay ahead of the curve, not behind it.
All data apps work together seamlessly and can be added or removed in no time. Because it is based on Kubernetes, it runs everywhere: on-premises or in the cloud.
Use it to create unique, enterprise-class data architectures. For example, it supports modern data warehouses, data lakes, event streaming, machine learning and data meshes.
Operators of the Platform
Stackable modules are regular Kubernetes operators. Because of its excellent performance, low memory footprint, and memory and thread safety, we chose Rust as the programming language.
Stackable Operator for Apache Kafka
The Stackable Operator for Apache Kafka is a tool for automatically rolling out and managing Apache Kafka in Kubernetes clusters. It supports Stackable authorization and monitoring.
Stackable Operator for Apache Druid
The Stackable Operator for Apache Druid is a tool that can manage Apache Druid clusters. Apache Druid is a real-time database to power modern analytics applications.
Stackable Operator for Apache Spark
The Stackable Operator for Apache Spark is a tool that makes it possible to manage Spark clusters on Kubernetes. It can also submit Spark jobs to the cluster.
Stackable Operator for Apache Superset
The Stackable Operator for Apache Superset is a tool that can manage Apache Superset. Apache Superset is a modern data exploration and visualization platform. With Stackable, Superset is configured to work with Trino and Apache Druid.
Stackable Operator for Trino
The Stackable Operator for Trino is a tool that manages Trino clusters and can be configured to access data stored in Apache HDFS or any S3-compatible cloud storage. Trino is a fast, highly parallel and distributed query engine for big data analytics.
Stackable Operator for Apache Airflow
The Stackable Operator for Apache Airflow is a tool that can manage Apache Airflow clusters. Airflow is a workflow engine that lets you programmatically create, run, and monitor data pipelines, and is a natural replacement if you use Apache Oozie.
Stackable Operator for Apache HBase
The Stackable Operator for Apache HBase is a tool that can manage Apache HBase clusters. Apache HBase is a distributed, scalable store for big data, built on top of Apache HDFS.
How it works
From simple to complex environments with infrastructure-as-code
Stackable gives you the flexibility to define both simple and complex data scenarios. Either way, the setup is always as simple as this:
1. Select the Stackable operators for the data apps you need for your data platform and install them using stackablectl or directly via Helm.
2. Install your data apps in the Kubernetes cluster by passing the appropriate configurations (custom resources) to the operators using stackablectl or directly via kubectl.
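Assuming a running Kubernetes cluster and an installed stackablectl CLI, the two steps might look like the following sketch. Operator names, the Helm repository URL and the exact command syntax can vary between releases, so consult the current documentation before copying these commands:

```shell
# Step 1: install the operators for the data apps you need
stackablectl operator install kafka superset

# Alternatively, install an operator directly via Helm
helm repo add stackable https://repo.stackable.tech/repository/helm-stable/
helm install kafka-operator stackable/kafka-operator

# Step 2: deploy a data app by applying its configuration
# (kafka-cluster.yaml is a placeholder for your own manifest)
kubectl apply -f kafka-cluster.yaml
```

Because every step is a plain CLI command operating on declarative files, the whole setup can live in version control alongside the rest of your infrastructure code.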
All of these definitions are maintained in an infrastructure-as-code fashion so that even the setup remains testable, repeatable and allows for standardization.
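As an illustration of such a definition, here is a minimal sketch of a custom resource for a small Apache Kafka cluster. The `apiVersion`, `kind` and field names are assumptions that depend on the operator version, so treat this as illustrative rather than a definitive manifest:

```yaml
# Illustrative custom resource for a three-broker Kafka cluster
# (field names and versions are assumptions; check the operator docs)
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
  name: simple-kafka
spec:
  image:
    productVersion: 3.4.0   # desired Kafka version
  brokers:
    roleGroups:
      default:
        replicas: 3         # number of broker pods
```

Applying this file with kubectl hands it to the Kafka operator, which then creates and manages the corresponding pods, services and configuration.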
Docker Images Repository
Subscribe to the newsletter
With the Stackable newsletter you'll always be up to date on everything happening around Stackable!