Stackable

Open Source Data Platform

The open source data platform

Combining best practices

Popular data apps, easy to use

Stackable provides you with a curated selection of the best open source data apps like Apache Kafka®, Apache Druid, Trino and Apache Spark™. Store, process and visualize your data with the latest versions. Stay ahead of the curve, not behind it.

All data apps work together seamlessly and can be added or removed in no time. Based on Kubernetes, it runs everywhere – on-premises or in the cloud.

Use it to create unique and enterprise-class data architectures. For example, it supports modern Data Warehouses, Data Lakes, Event Streaming, Machine Learning or Data Meshes. 

An illustration of the best open source data apps

Cloud-native Kubernetes Operators

Stackable modules are regular Kubernetes operators. We chose Rust as the programming language because of its excellent performance, low memory requirements, and memory and thread safety.

Kafka logo
Stackable Operator for Apache Kafka
The Stackable Operator for Apache Kafka is a tool for automatically rolling out and managing Apache Kafka in Kubernetes clusters. It supports Stackable authorization and monitoring.
Druid logo
Stackable Operator for Apache Druid
The Stackable Operator for Apache Druid is a tool that can manage Apache Druid clusters. Apache Druid is a real-time database to power modern analytics applications.
Apache Spark logo
Stackable Operator for Apache Spark
The Stackable Operator for Apache Spark is a tool that makes it possible to manage Spark clusters on Kubernetes. It also lets you start Spark jobs on the cluster.
Superset logo
Stackable Operator for Apache Superset
The Stackable Operator for Apache Superset is a tool that can manage Apache Superset. Apache Superset is a modern data exploration and visualization platform. With Stackable, Superset is configured to work with Trino and Apache Druid.
Trino logo
Stackable Operator for Trino
The Stackable Operator for Trino is a tool that manages Trino clusters and configures them to access data stored in Apache HDFS or any S3-compatible cloud storage. Trino is a fast, highly parallel and distributed query engine for Big Data analytics.
Apache Airflow logo
Stackable Operator for Apache Airflow
The Stackable Operator for Apache Airflow is a tool that can manage Apache Airflow clusters. Airflow is a workflow engine that lets you programmatically create, run, and monitor data pipelines, and is a replacement for Apache Oozie.
Apache NiFi logo
Stackable Operator for Apache NiFi
The Stackable Operator for Apache NiFi is a tool for automatically rolling out and managing Apache NiFi. NiFi supports powerful and scalable data flows.
Open Policy Agent logo
Stackable Operator for OPA (Open Policy Agent)
The Stackable Operator for OPA (Open Policy Agent) is a tool that can manage OPA servers. With OPA, rules and guidelines for data access can be flexibly defined “as code”.
Apache HBase logo
Stackable Operator for Apache HBase
The Stackable Operator for Apache HBase is a tool that can manage Apache HBase clusters. HBase is a distributed, scalable Big Data store.
Hadoop logo
Stackable Operator for Apache Hadoop HDFS
The Stackable Operator for Apache Hadoop HDFS is a tool that can manage Apache HDFS clusters. HDFS is a distributed file system that provides high-throughput access to application data.
Hive logo
Stackable Operator for Apache Hive
The Stackable Operator for Apache Hive is a tool that can manage Apache Hive. Currently, it supports the Hive Metastore. The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
ZooKeeper logo
Stackable Operator for Apache ZooKeeper
The Stackable Operator for Apache ZooKeeper is a tool that can automatically roll out and manage Apache ZooKeeper ensembles. Apache ZooKeeper is used by many Big Data products as a highly reliable coordinator of distributed systems.

How it works

From simple to complex environments with infrastructure-as-code

Stackable gives you the flexibility to define both simple and complex data scenarios. Either way, the setup is always as simple as this:

1. Select the Stackable operators for the data apps you need for your data platform and install them using stackablectl or directly via Helm.

2. Install your data apps in the Kubernetes cluster by passing the appropriate configurations (CRDs) to the operators using stackablectl or directly via kubectl, as sketched below.
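
For example, a minimal sketch of these two steps on the command line might look like the following. The operator names, Helm repository URL and chart names are assumptions based on Stackable's public defaults, and the exact stackablectl syntax may vary between versions:

    # Step 1: install the operators with stackablectl ...
    stackablectl operator install zookeeper kafka

    # ... or directly via Helm (repository URL and chart names assumed)
    helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/
    helm install zookeeper-operator stackable-stable/zookeeper-operator
    helm install kafka-operator stackable-stable/kafka-operator

    # Step 2: install the data apps by applying your cluster definitions
    kubectl apply -f zookeeper-cluster.yaml -f kafka-cluster.yaml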

All of these definitions are maintained in an infrastructure-as-code fashion, so even the setup remains testable and repeatable and allows for standardization.
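
Such a definition is just a small declarative manifest that can be kept in version control. The snippet below is a hypothetical sketch of a ZooKeeper cluster definition; the apiVersion and field names are illustrative assumptions that depend on the operator version, so consult the operator documentation for the authoritative schema:

    # Hypothetical cluster definition – field names are illustrative, not authoritative
    kubectl apply -f - <<EOF
    apiVersion: zookeeper.stackable.tech/v1alpha1
    kind: ZookeeperCluster
    metadata:
      name: simple-zookeeper
    spec:
      image:
        productVersion: "3.8.0"   # ZooKeeper version to roll out (assumed field)
      servers:
        roleGroups:
          default:
            replicas: 3           # three-node ensemble (assumed field)
    EOF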

Stackable logo icon
Operator Framework
The Stackable Operator Framework is a Rust library that supports fast, unified development of Kubernetes controllers and operators.
Docker logo
Docker Images Repository
The Stackable Docker Image Repository contains Dockerfiles and scripts to build base images of the open source products supported by and for use within Stackable.
An illustration of a laptop and phone on a desk

Newsletter

Subscribe to the newsletter

With the Stackable newsletter you’ll always stay up to date on everything happening around Stackable!