Stackable Data Platform (SDP) Release 26.3

We’re excited to introduce Stackable Data Platform 26.3 – a release packed with powerful new capabilities, significant product upgrades, and meaningful platform improvements across the board.

This release adds generic object overrides across the entire platform, so you can customize any Kubernetes object that the operators create.
The Stackable operator for OpenSearch receives a major upgrade with TLS configuration, service discovery, and keystore support. SDP 26.3 ships with support for OpenSearch 3.4.0. Want to explore AI-powered search on your own data? Our new OpenSearch RAG demo shows you how to build a Retrieval Augmented Generation pipeline with OpenSearch as the vector store.
The User Info Fetcher (UIF) now has experimental support for OpenLDAP as a backend, and the Entra backend has been stabilized. The restart-controller is now enabled for nearly all products, ensuring Pods are automatically restarted on configuration changes.

With 114 CVEs fixed – including 9 critical and 40 high-severity vulnerabilities – and all operator base images upgraded from UBI9 to UBI10, the platform is more secure and easier to operate than ever.

SDP 26.3 now supports Kubernetes 1.31-1.35 and Red Hat OpenShift 4.18-4.20.

On the product side, Apache Airflow 3.1 introduces Human-in-the-Loop (HITL) workflows, allowing humans to interact with, approve, and guide running data pipelines – a major step forward for AI-assisted and GenAI workflows. Apache NiFi 2.7.2 brings native Iceberg support, enabling direct writes to Iceberg tables from NiFi flows.

This release also marks an important milestone: Apache Spark 4.1.1 and Apache Superset 6.0 are now fully supported and no longer experimental. Apache Spark 4.1.1 debuts Declarative Pipelines, letting engineers define the desired state of their data pipelines rather than imperative execution steps. Apache Superset 6.0 delivers a complete visual overhaul with true dark mode, dynamic theming, and a new Gantt chart visualization – two major versions of progress in a single release. Finally, Kafka users can now migrate from ZooKeeper to KRaft, future-proofing clusters ahead of Kafka 4.x.

New Platform Features

General

Generic Object Overrides (objectOverrides): All operators now support merging user-supplied configurations into any Kubernetes object they create (StatefulSets, Listeners, ConfigMaps, and more), giving you full control over platform-level customization without forking operator logic. Priority order: configOverrides → podOverrides → objectOverrides. See the concepts documentation.
New RAG Demo with OpenSearch: A new demo showcasing Retrieval Augmented Generation (RAG) with OpenSearch as the vector store, making it easy to explore AI-powered search use cases on the Stackable platform. See the demo documentation.
CRD Self-Management: All operators now maintain their own CRDs independently of Helm, with the conversion webhook running alongside the controller to support future CRD versioning.
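
To make the idea of merging a user-supplied fragment into a generated object more concrete, here is a minimal Python sketch. It is not the operators' implementation: the recursive merge rule (nested mappings merged key by key, everything else replaced), the `deep_merge` helper, and the example StatefulSet are all assumptions chosen for illustration, and the actual merge semantics of objectOverrides may differ. See the concepts documentation for the authoritative behavior.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` merged in recursively.

    Nested dicts are merged key by key; any other value (including lists)
    is replaced wholesale. This rule is a simplification for illustration.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical StatefulSet as an operator might generate it.
generated = {
    "kind": "StatefulSet",
    "metadata": {"name": "demo-default", "labels": {"app": "demo"}},
    "spec": {"replicas": 1},
}

# Hypothetical user-supplied override fragment for that object.
object_override = {
    "metadata": {"labels": {"team": "data-platform"}},
    "spec": {"replicas": 3},
}

result = deep_merge(generated, object_override)
```

In this sketch the override adds a label and changes the replica count, while the generated object itself is left untouched.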

Apache Airflow

Git-sync now supports SSH key authentication in addition to basic auth. The operator can also automatically convert the credentialsSecret field to the new credentials: basicAuthSecretName format, making this upgrade non-breaking. See the DAG mounting guide.
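
The field conversion described above can be pictured with a small Python sketch. It only illustrates the shape change from `credentialsSecret` to `credentials: basicAuthSecretName`; the helper name `convert_git_sync_credentials`, the surrounding `repo` key, and the rule for what happens when both fields are present are assumptions, not the operator's actual logic.

```python
def convert_git_sync_credentials(git_sync: dict) -> dict:
    """Rewrite a legacy `credentialsSecret` entry into the newer
    `credentials: {basicAuthSecretName: ...}` shape.

    The input is not modified. If the new `credentials` field is already
    present, it is kept and the legacy field is simply dropped (an
    assumption made for this sketch).
    """
    converted = dict(git_sync)
    legacy_secret = converted.pop("credentialsSecret", None)
    if legacy_secret is not None and "credentials" not in converted:
        converted["credentials"] = {"basicAuthSecretName": legacy_secret}
    return converted


# Hypothetical legacy git-sync entry.
legacy = {
    "repo": "https://example.com/dags.git",
    "credentialsSecret": "git-credentials",
}

converted = convert_git_sync_credentials(legacy)
```

After the conversion, `converted` carries the Secret name under `credentials.basicAuthSecretName` and no longer contains `credentialsSecret`.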

Apache HBase

The hbase.rest.endpoint field is now published to the REST server rolegroup discovery ConfigMap, making it easier to advertise and consume the HBase REST API from other services.

Apache Kafka

OPA with TLS: Kafka’s OPA integration now supports TLS (non-TLS mode is still available), giving you flexibility in how you secure policy enforcement. Note: TLS is not yet supported for the Kafka controller in Kafka 4.x.
ZooKeeper to KRaft Migration: Existing Kafka 3.9.1 clusters can now be migrated from ZooKeeper to KRaft, reducing operational complexity and preparing your clusters for Kafka 4.x. See the KRaft migration guide.

Apache Spark

Controlled Failure Handling: Spark applications no longer resubmit automatically on failure, giving teams more control over retry behavior. The previous behavior can be restored via spec.job.retryOnFailureCount. Driver pods are deleted on terminal state; executor pods are cleaned up on driver or submit failure.
S3 Support for Spark Connect: First-class S3 bucket and connection support for Spark Connect servers, making it straightforward to give clients access to data in S3. See the Spark Connect usage guide.
Spark Application Templates: Frequently used application configurations can now be centralized and referenced from SparkApplication objects, reducing duplication and making management easier. See the template usage guide.
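
As a rough illustration of the retry setting mentioned under Controlled Failure Handling, the following Python sketch sets `spec.job.retryOnFailureCount` on a SparkApplication manifest held as a plain dictionary. Only that field path comes from the text above. The `apiVersion`, the other fields, the helper name `set_retry_count`, and the value 2 are assumptions for the example, so check the operator documentation for the real schema and for the value that matches the previous behavior.

```python
import json

# Hypothetical minimal SparkApplication. Apart from the path
# spec.job.retryOnFailureCount, names and values are illustrative assumptions.
spark_app = {
    "apiVersion": "spark.stackable.tech/v1alpha1",
    "kind": "SparkApplication",
    "metadata": {"name": "example-job"},
    "spec": {"mode": "cluster"},
}


def set_retry_count(manifest: dict, retries: int) -> dict:
    """Return a copy of `manifest` with spec.job.retryOnFailureCount set."""
    if retries < 0:
        raise ValueError("retries must not be negative")
    # A JSON round trip is a cheap deep copy for JSON-like data.
    updated = json.loads(json.dumps(manifest))
    job = updated.setdefault("spec", {}).setdefault("job", {})
    job["retryOnFailureCount"] = retries
    return updated


patched = set_retry_count(spark_app, 2)
# JSON is also valid YAML, so this output can be inspected or applied as-is.
print(json.dumps(patched, indent=2))
```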

Open Policy Agent

OpenLDAP Backend for UIF: The User Info Fetcher now has experimental OpenLDAP support, extending enterprise identity integration beyond Keycloak and Entra. See the OpenLDAP backend documentation.
CLI Overrides: OPA command-line arguments can now be customized via cliOverrides in the CRD, allowing fine-tuned control over OPA’s runtime behavior. See the OPA documentation.

OpenSearch

TLS Configuration: TLS for server and internal communication can now be configured via SecretClasses, bringing OpenSearch in line with the security standards of the rest of the platform. See the security usage guide.
Keystore Support: Secrets such as S3 credentials for backups can now be added to the OpenSearch keystore via the operator. See the keystore guide.
Service Discovery: A new discovery ConfigMap exposes connection parameters, making it easy to configure OpenSearch clients without manual coordination. See the discovery documentation.
Security Plugin Configuration: The security plugin is now configurable within the OpenSearchCluster spec in two modes: API-managed (post-initialization) or operator-managed.
Entra Backend for UIF: Now stabilized

⚠️ Upgrading from 25.11 requires removing existing podOverrides and configOverrides related to security settings. Consult the OpenSearch upgrade guide before upgrading.

Stackable Listener Operator

Service Overrides on ListenerClasses: serviceOverrides can now be configured on ListenerClasses, analogous to podOverrides on stacklets, enabling arbitrary modifications to created Services.
Independent ListenerClass Preset Deployment: The listener-operator now deploys the selected ListenerClass preset itself.

Stackable Secret Operator

The new secrets.stackable.tech/provision-parts annotation on secret volumes enables fine-grained control over which parts of secret material are provisioned – for example, provisioning only the ca.crt from an autoTls backend, or only the krb5.conf from a kerberosKeytab backend. This unlocks more targeted secret consumption patterns for complex deployments. See the volume documentation.

Platform Improvements

General

UBI10 Base Images: All operator images upgraded from UBI9 to UBI10.
Restart Controller: Now enabled for all products except Apache Hadoop – Pods restart automatically on ConfigMap or Secret changes, reducing manual intervention for configuration updates.
Graceful Shutdown: All operators now consistently forward SIGTERM and shut down concurrent tasks correctly.
Vector Log Fix: Log entries could previously be sent multiple times to the Vector aggregator after a sidecar restart. Fixed by persisting Vector state across restarts.

Apache Airflow

Extended Provider Packages: All available extra packages for Airflow 3+ are now included in product images, with explicit exclusions documented – giving you a broader set of integrations out of the box.
Celery Redis Reconnect Fix: Celery workers in Airflow 3 would sometimes stop executing tasks after a Redis reconnect. Resolved by bumping the celery package.
Reduced Default API Workers: The default number of webserver API workers has been reduced from 4 to 1, lowering resource consumption for typical deployments. You can raise it again via configOverrides, or scale by increasing the replica count.

Apache Druid

⚠️ Breaking change: The router’s CPU request and limit have been increased from 100m/400m to 300m/1200m to reflect Druid 35’s higher resource requirements.

Apache Hadoop

The format-namenodes init container script now includes a warning and exit condition to detect corrupted data after formatting. Shell output from init containers that was previously not aggregated is now captured correctly.

Apache NiFi

⚠️ Breaking change – Authorization Restructuring: The authorization configuration now mirrors NiFi’s own interfaces. opa, singleUser, and standard are explicitly set options, with standard supporting file-based authorization including an initialAdminUser. LDAP users without OPA must now explicitly configure standard. See the security usage guide.
Static Authorization Files: Static users.xml, authorizer.xml, and authorizations.xml files are now supported, useful for organizations sourcing user and group data from AD/Entra.
Port-forward Fix: NiFi pods now listen on the loopback interface, enabling Kubernetes port-forwarding for local debugging.
OPA Package Name Fix: The operator now uses spec.clusterConfig.authorization.opa.package instead of hard-coding nifi, defaulting to the NifiCluster name.
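
The defaulting rule from the OPA Package Name Fix can be sketched in a few lines of Python. The field path `spec.clusterConfig.authorization.opa.package` and the fallback to the NifiCluster name come from the text above; the helper `resolve_opa_package` and the example manifests are hypothetical and only illustrate the lookup, not the operator's code.

```python
def resolve_opa_package(nifi_cluster: dict) -> str:
    """Return the OPA package for a NifiCluster manifest (as a dict).

    Uses spec.clusterConfig.authorization.opa.package when it is set and
    falls back to the cluster's metadata.name otherwise.
    """
    opa = (
        nifi_cluster.get("spec", {})
        .get("clusterConfig", {})
        .get("authorization", {})
        .get("opa")
        or {}
    )
    return opa.get("package") or nifi_cluster["metadata"]["name"]


# Hypothetical manifests: one with an explicit package, one relying on the default.
explicit = {
    "metadata": {"name": "nifi-prod"},
    "spec": {"clusterConfig": {"authorization": {"opa": {"package": "flows"}}}},
}
defaulted = {
    "metadata": {"name": "nifi-prod"},
    "spec": {"clusterConfig": {"authorization": {"opa": {}}}},
}

explicit_package = resolve_opa_package(explicit)
default_package = resolve_opa_package(defaulted)
```

Here `explicit_package` is `"flows"`, while `default_package` falls back to the cluster name `"nifi-prod"`.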

Apache Spark

Pod/node affinities are now supported for Spark application jobs, allowing more precise workload placement. See the pod placement guide.
Driver pods are now garbage collected on job completion, keeping your cluster clean.
Fixed a duplicate volume issue when both the history server and an S3 connection reference the same SecretClass.

Open Policy Agent

The Entra UIF backend is now stable with OpaCluster v1alpha2, with automatic conversion between v1alpha1 and v1alpha2.
Removed a spurious log warning caused by a superfluous service name in bundle POST requests.

Trino

⚠️ Breaking change – Column Masking: The operator now sets opa.policy.batch-column-masking-uri instead of opa.policy.column-masking-uri, enabling Trino to fetch multiple column masks in a single request for better performance. A batchColumnMasks rule is required in the OPA Trino ruleset. This can be disabled by setting enableColumnMasking: false. Note: opa under authorization is now a mandatory enum variant.
spec.connector.iceberg.metastore in TrinoCatalog is now optional, supporting REST and other non-metastore catalog types via configOverrides.

Apache ZooKeeper

Fixed a bug where zkCli commands could fail due to ZOOKEEPER-4985 by setting the ZOOCFGDIR environment variable.

OpenSearch

Fixed a log file rollover bug where reaching the 5 MB limit caused repeated errors instead of rotating the file.

New Product Versions

The following new product versions are now supported (full list here):

Product	New version(s)	What’s new?
Airflow	3.1.6	Human-in-the-Loop (HITL) Workflows: HITLOperator, ApprovalOperator, HITLBranchOperator, and HITLEntryOperator allow users to approve, reject, branch, or provide input to running DAGs mid-run – especially powerful for GenAI workflows requiring human review. Authorization support via `is_authorized_hitl_task()` added in 3.1.6. (PR #59399) Deadline Alerts (AIP-86): Time-based deadlines on DAG runs with notification callbacks when they are at risk – replacing the legacy SLA mechanism with a more flexible and reliable approach. Calendar and Gantt Chart Views: New calendar view for spotting patterns in DAG run history and Gantt chart for understanding task execution timelines and parallelism. Grid view gains keyboard navigation and expanded filtering. ⚠️ `AUTH_OID` removed – migrate to `AUTH_OAUTH`.
Druid	35.0.1	Virtual Storage / Fabric Mode (Experimental): Historical servers can now serve more segments than their physical disk holds by loading them on-demand from deep storage, effectively decoupling compute from local storage capacity. MSQ Engine Moves to Core: The Multi-Stage Query engine is now a core capability – no extension needed. ⚠️ If you previously added `druid-multi-stage-query` to `druid.extensions.loadList` via the Stackable Druid operator’s extension configuration, you must remove it manually before upgrading – the operator will not do this automatically. Java 21 Support, Java 11 Dropped: Java 17 (up to version 34) or Java 21 (for newer versions) is required. Exact Count Bitmap Extension: `druid-exact-count-bitmap` provides exact cardinality counting via Roaring Bitmap, useful when HyperLogLog approximations aren’t sufficient. Query Performance: 40% speedup in interval deserialization, vectorization for CASE/IF/timestamp expressions. Jetty 12 Upgrade: Stricter RFC 3986 URI and SNI compliance. May require `druid.server.http.uriCompliance` adjustments.
HBase	2.6.4 (LTS)	Maintenance release with 86 resolved issues. Bug fixes and stability improvements – no new major features. hbase-operator-tools bumped to 1.3.0.
Hive	4.2.0	Iceberg V3 Deletion Vectors: Efficient row-level deletes without rewriting entire data files. (HIVE-29006) Iceberg ViewCatalog via REST API: Access Iceberg view catalogs through the REST API, plus column defaults with ALTER commands. (HIVE-29036, HIVE-29252) JDK 21 as Minimum. (HIVE-29027) LDAP Group Filtering for Kerberos Users: Kerberos-authenticated users can now be filtered by LDAP group membership, improving security integration in enterprise environments. (HIVE-29211) Note: For more flexible, policy-driven access control, consider Stackable’s OPA integration as an alternative. HMS HTTPS Support: The Hive Metastore now supports HTTPS for its Catalog and Property servlets. (HIVE-29112)
Kafka	4.1.1 (experimental)	Bug fix release with 11 fixes and 2 improvements. Notable fixes for memory-mapped file handling on Linux and Kafka Streams at-least-once delivery guarantees. Note: Kafka 3.9.1 is the last version to support ZooKeeper. Kafka 4.x uses KRaft exclusively.
NiFi	2.7.2 (LTS)	PutIcebergRecord Processor: Write structured records directly to Iceberg tables with AWS and Azure FileIO support and REST catalog / Parquet formatting – bringing NiFi into the modern lakehouse stack. ConsumeKinesis Processor: Consume Amazon Kinesis streams via Kinesis Client Library 3 for improved performance and reliability. Couchbase 3 Support: New GetCouchbase and PutCouchbase processors for Couchbase data integration. Parquet Content Viewer: View Parquet file contents directly in the NiFi UI, without external tools. GCP Workload Identity Federation: New GCP authentication option alongside enhanced Azure Event Hub auth and AWS SDK v2 upgrades. Critical Fix – QueryRecord Wrong Results: A longstanding issue since NiFi 2.0 causing the QueryRecord processor to return incorrect results has been fixed. Note: Iceberg support requires S3 and REST catalog. Hive Metastore and HDFS are not supported. See the NiFi Iceberg documentation. For general Iceberg handling in NiFi, see the blog post “How Apache NiFi 2 Integrates Apache Iceberg”.
Spark	3.5.8 (LTS), 4.1.1	Declarative Pipelines: Specify the desired state of tables and data flows – Spark handles execution ordering, parallelism, checkpoints, and retries automatically, reducing pipeline boilerplate significantly. (SPARK-51727) SQL Scripting GA: Now enabled by default, with CONTINUE HANDLER, multiple DECLARE variables, and improved NULL handling. (SPARK-54499) VARIANT Type GA: Generally available for semi-structured data, with CSV, XML, and Parquet scan support and colon-sign operator for field access. (SPARK-54454) Real-time Structured Streaming: Single-digit millisecond latency for stateless workloads with AQE support, overcoming the limitations of the traditional micro-batch model – a major step for low-latency streaming pipelines. (SPARK-53736) JDBC Driver for Spark Connect: Standard BI tool and JDBC application connectivity to Spark Connect servers. (SPARK-53484) Python Arrow UDF/UDTF: Native Arrow integration eliminates serialization overhead for Python UDFs and UDTFs, delivering significant performance gains. (SPARK-52214) Recursive CTE Support: Long-awaited support for recursive Common Table Expressions, enabling hierarchical and graph queries in SQL. (SPARK-24497) 77 New Built-in Functions: Including approx_top_k, KLL quantiles sketch, Theta Sketch, BITMAP_AND_AGG, and try_to_date. Note: Iceberg and Delta runtime libraries not yet available for Spark 4.1.1.
Superset	6.0.0 (LTS)	Dark Mode and Dynamic Themes: True dark mode with Ant Design v5 token-based theming. Themes can be created, edited, imported/exported, and applied system-wide or per-dashboard – giving data teams full control over their BI environment’s look and feel. Bootstrap and Font Awesome dependencies removed entirely. Gantt Chart Type: New Gantt chart visualization for timeline-based data, alongside enhanced time series with date range timeshift support. New Database Connectors: Superset 5.0 added Parseable, Firebolt, YDB, Denodo Virtual DataPort, and Apache Doris. Superset 6.0 adds SingleStore and improves existing Snowflake, DuckDB, and Databricks connectors. Dataset Folders and Management: Datasets can now be organized in folders and created without entering the Explore view first – a significant UX improvement for data teams managing large numbers of datasets. OAuth2 Database Support: OAuth2 authentication for database connections, including BigQuery and Trino OAuth2, enabling modern authentication workflows. Multi-tab PDF Export: Export multi-tab dashboards as PDF; CSV/Excel exports respect `SQL_MAX_ROW`. React 19 and Frontend Modernization: React 19, Ant Design 5, TypeScript v5, DayJS replacing `Moment.js`. Note: OpenStreetMap is now the default for Deck.gl visualizations – no API key required. Consult the official update notes before upgrading. ⚠️ `AUTH_OID` removed – migrate to `AUTH_OAUTH`.
Open Policy Agent	1.12.3	String Interpolation in Rego (1.12.0): `$"..."` template syntax with `{expression}` placeholders – a more readable alternative to `sprintf` that makes policies easier to write and maintain. (v1.12.0) SQL Filter Compilation (1.9.0): Generate database-specific SQL WHERE clauses from Rego via the Compile API – previously exclusive to Enterprise OPA. (v1.9.0) Faster Bundle Loading (1.11.0): Concurrent Rego parsing significantly reduces startup times for large policy bundles. (v1.11.0) 10% Faster Compilation (1.12.0): Improved visitor implementation and reduced allocations deliver faster policy compilation. (v1.12.0) Security Fix – Memory Exhaustion via Forged Gzip (1.11.1): A malicious HTTP request could trigger out-of-memory conditions, bypassing token auth – only mTLS fully mitigates. Patched in 1.11.1. (v1.11.1)
OpenSearch	3.4.0	gRPC Transport: Graduated from experimental plugin to full module, supporting Match, Boolean, Range, Wildcard, Fuzzy, Nested, and more – providing a high-performance alternative transport protocol for OpenSearch clusters. Streaming Aggregations: New streaming cardinality and numeric terms aggregators via Apache Flight and Arrow, enabling real-time aggregation over large datasets without waiting for complete results. S3 Storage Improvements: S3CrtClient for higher upload throughput (3.3); server-side and client-side encryption options for repository storage (3.4). Rule-based Auto-tagging: Automatic security attribute tagging with ACL-aware routing – useful for compliance and access control scenarios. Pull-based Ingestion: All-active ingestion mode, async periodic flush, message mappers, and dynamic consumer config updates – providing a more flexible ingestion pipeline. FIPS Compliance Tooling: Build and test tooling for FIPS-compliant environments, important for government and regulated-industry deployments. Lucene 10.3.2: Upgraded from 10.1.x, bringing search performance and correctness improvements.
Trino	479	Automatic Internal TLS (479): Trino now auto-generates TLS certificates for internal cluster communication via ANNOUNCE node discovery – dramatically simplifying cluster security with no manual certificate management required. While Stackable already handled this at the infrastructure level, Trino now brings native TLS into the process itself. (Docs) Experimental Vectorized Serialization (479): Optimized worker data exchange on modern CPU architectures (Graviton 3, Skylake, Icelake, Zen 4+), improving query throughput for data-intensive operations. New Array Functions (479): `array_first()`, `array_last()`, and row literals with field name declarations. Encrypted Parquet Reading in Hive Connector (478): Also improves complex predicate performance on the `$path` column. Iceberg Connector Improvements (478, 479): Token-exchange config, `expire_snapshots` options, memory optimization for highly nested fields, improved sorted table write performance. OPA Query ID Propagation (478): Query ID now passed to the OPA authorizer, enabling more granular per-query policy decisions. Note: Storage-connector remains based on Trino 477. ⚠️ Breaking (479): JDK 25 required. `task.statistics-cpu-timer-enabled` and `prefer_streaming_operators` removed.

stackablectl

In parallel with SDP 26.3, stackablectl 1.4.0 is now available. This release fixes a crash during the release upgrade from SDP 25.11. The crash was caused by a 404 when looking up CRD files for the secret-operator, which now manages its own CRDs independently of Helm. See the release notes.

More Info

Further details on this release and upgrade instructions can be found in the release notes and the changelogs of the individual operators:

Airflow, Druid, HBase, HDFS, Hive, Kafka, NiFi, OpenPolicyAgent, OpenSearch, Spark, Superset, Trino, ZooKeeper