We’re excited to introduce Stackable Data Platform 26.3 – a release packed with powerful new capabilities, significant product upgrades, and meaningful platform improvements across the board.
This release adds generic object overrides across the entire platform, giving operators unprecedented flexibility to customize any Kubernetes object.
The Stackable operator for OpenSearch receives a major upgrade with TLS configuration, service discovery, and keystore support. SDP 26.3 ships with support for OpenSearch 3.4.0. Want to explore AI-powered search on your own data? Our new OpenSearch RAG demo shows you how to build a Retrieval Augmented Generation pipeline with OpenSearch as the vector store.
The User Info Fetcher (UIF) now has experimental support for OpenLDAP as a backend, and the Entra backend has been stabilized. The restart-controller is now enabled for nearly all products, ensuring Pods are automatically restarted on configuration changes.
With 114 CVEs fixed – including 9 critical and 40 high-severity vulnerabilities – and all operator base images upgraded from UBI9 to UBI10, the platform is more secure and easier to operate than ever.
SDP 26.3 now supports Kubernetes 1.31-1.35 and Red Hat OpenShift 4.18-4.20.
On the product side, Apache Airflow 3.1 introduces Human-in-the-Loop (HITL) workflows, allowing humans to interact with, approve, and guide running data pipelines – a major step forward for AI-assisted and GenAI workflows. Apache NiFi 2.7.2 brings native Iceberg support, enabling direct writes to Iceberg tables from NiFi flows.
This release also marks an important milestone: Apache Spark 4.1.1 and Apache Superset 6.0 are now fully supported – no longer experimental. Apache Spark 4.1.1 debuts Declarative Pipelines, letting engineers define the desired state of their data pipelines rather than imperative execution steps. And Apache Superset 6.0 delivers a complete visual overhaul with true dark mode, dynamic theming, and a new Gantt chart visualization – two major versions of progress in a single release. And Kafka users can now migrate from ZooKeeper to KRaft, future-proofing clusters ahead of Kafka 4.x.
New Platform Features
General
- Generic Object Overrides (objectOverrides): All operators now support merging user-supplied configurations into any Kubernetes object they create (StatefulSets, Listeners, ConfigMaps, and more), giving you full control over platform-level customization without forking operator logic. Priority order: configOverrides → podOverrides → objectOverrides. See the concepts documentation.
- New RAG Demo with OpenSearch: A new demo showcasing Retrieval Augmented Generation (RAG) with OpenSearch as the vector store, making it easy to explore AI-powered search use cases on the Stackable platform. See the demo documentation.
- CRD Self-Management: All operators now maintain their own CRDs independently of Helm, with the conversion webhook running alongside the controller to support future CRD versioning.
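As a rough illustration of what merging a user-supplied fragment into a generated Kubernetes object can look like, here is a minimal Python sketch. It is not the operators' actual merge logic. The `deep_merge` helper, the example StatefulSet fields, and the override values are all hypothetical, and the sketch assumes that maps are merged key by key while scalar values from the override win.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override merged in; override wins on conflicts."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are maps: merge them key by key.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars, lists, and new keys are taken from the override as-is.
            result[key] = deepcopy(value)
    return result


# Hypothetical object, standing in for something an operator generates.
generated = {
    "kind": "StatefulSet",
    "metadata": {"name": "example", "labels": {"app": "example"}},
    "spec": {"replicas": 1},
}

# Hypothetical user-supplied fragment, standing in for an objectOverrides entry.
user_override = {
    "metadata": {"labels": {"team": "data"}},
    "spec": {"replicas": 3},
}

merged = deep_merge(generated, user_override)
print(merged)
```

In this sketch the generated object is left untouched and only the returned copy carries the user's changes, which keeps the example easy to reason about.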
Apache Airflow
Git-sync now supports SSH key authentication in addition to basic auth. The operator can also automatically convert the credentialsSecret field to the new credentials: basicAuthSecretName format, making this upgrade non-breaking. See the DAG mounting guide.
Apache HBase
The hbase.rest.endpoint field is now published to the REST server rolegroup discovery ConfigMap, making it easier to advertise and consume the HBase REST API from other services.
Apache Kafka
- OPA with TLS: Kafka’s OPA integration now supports TLS (non-TLS mode is still available), giving you flexibility in how you secure policy enforcement (note: TLS not yet supported for the Kafka Controller in Kafka 4.x).
- ZooKeeper to KRaft Migration: Existing Kafka 3.9.1 clusters can now be migrated from ZooKeeper to KRaft, reducing operational complexity and preparing your clusters for Kafka 4.x. See the KRaft migration guide.
Apache Spark
- Controlled Failure Handling: Spark applications no longer resubmit automatically on failure, giving teams more control over retry behavior. The previous behavior can be restored via spec.job.retryOnFailureCount. Driver pods are deleted on terminal state; executor pods are cleaned up on driver or submit failure.
- S3 Support for Spark Connect: First-class S3 bucket and connection support for Spark Connect servers, making it straightforward to give clients access to data in S3. See the Spark Connect usage guide.
- Spark Application Templates: Frequently used application configurations can now be centralized and referenced from SparkApplication objects, reducing duplication and making management easier. See the template usage guide.
Open Policy Agent
- OpenLDAP Backend for UIF: The User Info Fetcher now has experimental OpenLDAP support, extending enterprise identity integration beyond Keycloak and Entra. See the OpenLDAP backend documentation.
- CLI Overrides: OPA command-line arguments can now be customized via cliOverrides in the CRD, allowing fine-tuned control over OPA’s runtime behavior. See the OPA documentation.
OpenSearch
- TLS Configuration: TLS for server and internal communication can now be configured via SecretClasses, bringing OpenSearch in line with the security standards of the rest of the platform. See the security usage guide.
- Keystore Support: Secrets such as S3 credentials for backups can now be added to the OpenSearch keystore via the operator. See the keystore guide.
- Service Discovery: A new discovery ConfigMap exposes connection parameters, making it easy to configure OpenSearch clients without manual coordination. See the discovery documentation.
- Security Plugin Configuration: The security plugin is now configurable within the OpenSearchCluster spec in two modes: API-managed (post-initialization) or operator-managed.
- Entra Backend for UIF: Now stabilized.
⚠️ Upgrading from 25.11 requires removing existing podOverrides and configOverrides related to security settings. Consult the OpenSearch upgrade guide before upgrading.
Stackable Listener Operator
- Service Overrides on ListenerClasses: serviceOverrides can now be configured on ListenerClasses, analogous to podOverrides on stacklets, enabling arbitrary modifications to created Services.
- Independent ListenerClass Preset Deployment: The listener-operator now deploys the selected ListenerClass preset itself.
Stackable Secret Operator
The new secrets.stackable.tech/provision-parts annotation on secret volumes enables fine-grained control over which parts of secret material are provisioned – for example, provisioning only the ca.crt from an autoTls backend, or only the krb5.conf from a kerberosKeytab backend. This unlocks more targeted secret consumption patterns for complex deployments. See the volume documentation.
Platform Improvements
General
- UBI10 Base Images: All operator images upgraded from UBI9 to UBI10.
- Restart Controller: Now enabled for all products except Apache Hadoop – Pods restart automatically on ConfigMap or Secret changes, reducing manual intervention for configuration updates.
- Graceful Shutdown: All operators now consistently forward SIGTERM and shut down concurrent tasks correctly.
- Vector Log Fix: Log entries could previously be sent multiple times to the Vector aggregator after a sidecar restart. Fixed by persisting Vector state across restarts.
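To show why persisting state across restarts prevents duplicate log delivery, here is a generic checkpointing sketch in Python. It does not reflect Vector's implementation. The `ship_new_lines` function, the file names, and the JSON checkpoint format are hypothetical and only illustrate the idea of storing a read offset outside the process.

```python
import json
import os
import tempfile


def ship_new_lines(log_path: str, state_path: str) -> list[str]:
    """Return lines not shipped yet, then persist the new read offset."""
    offset = 0
    if os.path.exists(state_path):
        # Resume from the offset saved by a previous run.
        with open(state_path) as f:
            offset = json.load(f)["offset"]
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()
    with open(state_path, "w") as f:
        json.dump({"offset": new_offset}, f)
    return data.decode().splitlines()


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "app.log")
state_path = os.path.join(workdir, "checkpoint.json")

with open(log_path, "w") as f:
    f.write("first\nsecond\n")
first_run = ship_new_lines(log_path, state_path)

# Simulated restart: nothing in memory survives, only the checkpoint file.
with open(log_path, "a") as f:
    f.write("third\n")
second_run = ship_new_lines(log_path, state_path)

print(first_run, second_run)
```

If the checkpoint file were lost on restart, the second call would start again at offset 0 and re-send the first two lines, which is the kind of duplication the fix addresses.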
Apache Airflow
- Extended Provider Packages: All available extra packages for Airflow 3+ are now included in product images, with explicit exclusions documented – giving you a broader set of integrations out of the box.
- Celery Redis Reconnect Fix: Celery workers in Airflow 3 would sometimes stop executing tasks after a Redis reconnect. Resolved by bumping the celery package.
- Reduced Default API Workers: Default webserver API workers reduced from 4 to 1, lowering resource consumption for typical deployments. Easily adjustable via configOverrides or by increasing the replica count.
Apache Druid
⚠️ Breaking change: The router’s CPU request and limit have been increased from 100m/400m to 300m/1200m to reflect Druid 35’s higher resource requirements.
Apache Hadoop
The format-namenodes init container script now includes a warning and exit condition to detect corrupted data after formatting. Shell output from init containers that was previously not aggregated is now captured correctly.
Apache NiFi
- ⚠️ Breaking change – Authorization Restructuring: The authorization configuration now mirrors NiFi’s own interfaces. opa, singleUser, and standard are explicitly set options, with standard supporting file-based authorization including an initialAdminUser. LDAP users without OPA must now explicitly configure standard. See the security usage guide.
- Static Authorization Files: Static users.xml, authorizer.xml, and authorizations.xml files are now supported, useful for organizations sourcing user and group data from AD/Entra.
- Port-forward Fix: NiFi pods now listen on the loopback interface, enabling Kubernetes port-forwarding for local debugging.
- OPA Package Name Fix: The operator now uses spec.clusterConfig.authorization.opa.package instead of hard-coding nifi, defaulting to the NifiCluster name.
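The package-name fallback described in the OPA Package Name Fix can be sketched in a few lines of Python. This is an illustration of the lookup order only, not the operator's code. The `resolve_opa_package` helper and the example cluster dictionaries are hypothetical, and the dictionary layout simply mirrors the field path named above.

```python
def resolve_opa_package(nifi_cluster: dict) -> str:
    """Use the configured OPA package if set, otherwise the cluster name."""
    opa = (
        nifi_cluster.get("spec", {})
        .get("clusterConfig", {})
        .get("authorization", {})
        .get("opa")
        or {}
    )
    # Fall back to metadata.name when no package is configured.
    return opa.get("package") or nifi_cluster["metadata"]["name"]


# Hypothetical cluster with an explicit package name.
with_package = {
    "metadata": {"name": "my-nifi"},
    "spec": {"clusterConfig": {"authorization": {"opa": {"package": "custom_rules"}}}},
}

# Hypothetical cluster that enables OPA without naming a package.
without_package = {
    "metadata": {"name": "my-nifi"},
    "spec": {"clusterConfig": {"authorization": {"opa": {}}}},
}

print(resolve_opa_package(with_package))
print(resolve_opa_package(without_package))
```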
Apache Spark
- Pod/node affinities are now supported for Spark application jobs, allowing more precise workload placement. See the pod placement guide.
- Driver pods are now garbage collected on job completion, keeping your cluster clean.
- Fixed a duplicate volume issue when both the history server and an S3 connection reference the same SecretClass.
Open Policy Agent
- The Entra UIF backend is now stable with OpaCluster v1alpha2, with automatic conversion between v1alpha1 and v1alpha2.
- Removed a spurious log warning caused by a superfluous service name in bundle POST requests.
Trino
- ⚠️ Breaking change – Column Masking: The operator now sets opa.policy.batch-column-masking-uri instead of opa.policy.column-masking-uri, enabling Trino to fetch multiple column masks in a single request for better performance. A batchColumnMasks rule is required in the OPA Trino ruleset. This can be disabled by setting enableColumnMasking: false. Note: opa under authorization is now a mandatory enum variant.
- spec.connector.iceberg.metastore in TrinoCatalog is now optional, supporting REST and other non-metastore catalog types via configOverrides.
Apache ZooKeeper
Fixed a bug where zkCli commands could fail due to ZOOKEEPER-4985 by setting the ZOOCFGDIR environment variable.
OpenSearch
Fixed a log file rollover bug where reaching the 5 MB limit caused repeated errors instead of rotating the file.
New Product Versions
The following new product versions are now supported (full list here):
| Product | New version(s) | What’s new? |
|---|---|---|
| Airflow | 3.1.6 | Human-in-the-Loop (HITL) Workflows HITLOperator, ApprovalOperator, HITLBranchOperator, and HITLEntryOperator allow users to approve, reject, branch, or provide input to running DAGs mid-run – especially powerful for GenAI workflows requiring human review. Authorization support via is_authorized_hitl_task() added in 3.1.6. (PR #59399) Deadline Alerts (AIP-86) Time-based deadlines on DAG runs with notification callbacks when they are at risk – replacing the legacy SLA mechanism with a more flexible and reliable approach. Calendar and Gantt Chart Views New calendar view for spotting patterns in DAG run history and Gantt chart for understanding task execution timelines and parallelism. Grid view gains keyboard navigation and expanded filtering. ⚠️ AUTH_OID removed – migrate to AUTH_OAUTH. |
| Druid | 35.0.1 | Virtual Storage / Fabric Mode (Experimental) Historical servers can now serve more segments than their physical disk holds by loading them on-demand from deep storage, effectively decoupling compute from local storage capacity. MSQ Engine Moves to Core The Multi-Stage Query engine is now a core capability – no extension needed. ⚠️ If you previously added druid-multi-stage-query to druid.extensions.loadList via the Stackable Druid operator’s extension configuration, you must remove it manually before upgrading – the operator will not do this automatically. Java 21 Support, Java 11 Dropped, Java 17 (until version 34) or 21 (for newer versions) required. Exact Count Bitmap Extension druid-exact-count-bitmap provides exact cardinality counting via Roaring Bitmap, useful when HyperLogLog approximations aren’t sufficient. Query Performance 40% speedup in interval deserialization, vectorization for CASE/IF/timestamp expressions. Jetty 12 Upgrade Stricter RFC 3986 URI and SNI compliance. May require druid.server.http.uriCompliance adjustments. |
| HBase | 2.6.4 (LTS) | Maintenance release with 86 resolved issues. Bug fixes and stability improvements – no new major features. hbase-operator-tools bumped to 1.3.0. |
| Apache Hive | 4.2.0 | Iceberg V3 Deletion Vectors Efficient row-level deletes without rewriting entire data files. (HIVE-29006) Iceberg ViewCatalog via REST API Access Iceberg view catalogs through the REST API, plus column defaults with ALTER commands. (HIVE-29036, HIVE-29252) JDK 21 as Minimum (HIVE-29027) LDAP Group Filtering for Kerberos Users Kerberos-authenticated users can now be filtered by LDAP group membership, improving security integration in enterprise environments. (HIVE-29211) Note: For more flexible, policy-driven access control, consider Stackable’s OPA integration as an alternative. HMS HTTPS Support The Hive Metastore now supports HTTPS for its Catalog and Property servlets. (HIVE-29112) |
| Kafka | 4.1.1 (experimental) | Bug fix release with 11 fixes and 2 improvements. Notable fixes for memory-mapped file handling on Linux and Kafka Streams at-least-once delivery guarantees. Note: Kafka 3.9.1 is the last version to support ZooKeeper. Kafka 4.x uses KRaft exclusively. |
| NiFi | 2.7.2 (LTS) | PutIcebergRecord Processor Write structured records directly to Iceberg tables with AWS and Azure FileIO support and REST catalog / Parquet formatting – bringing NiFi into the modern lakehouse stack. ConsumeKinesis Processor Consume Amazon Kinesis streams via Kinesis Client Library 3 for improved performance and reliability. Couchbase 3 Support New GetCouchbase and PutCouchbase processors for Couchbase data integration. Parquet Content Viewer View Parquet file contents directly in the NiFi UI, without external tools. GCP Workload Identity Federation New GCP authentication option alongside enhanced Azure Event Hub auth and AWS SDK v2 upgrades. Critical Fix: QueryRecord Wrong Results A longstanding issue since NiFi 2.0 causing the QueryRecord processor to return incorrect results has been fixed. Note: Iceberg support requires S3 and REST catalog. Hive Metastore and HDFS are not supported. See the NiFi Iceberg documentation. For general Iceberg handling in NiFi see Blog: How Apache NiFi 2 Integrates Apache Iceberg |
| Spark | 3.5.8 (LTS) 4.1.1 | Declarative Pipelines Specify the desired state of tables and data flows – Spark handles execution ordering, parallelism, checkpoints, and retries automatically, reducing pipeline boilerplate significantly. (SPARK-51727) SQL Scripting GA Now enabled by default, with CONTINUE HANDLER, multiple DECLARE variables, and improved NULL handling. (SPARK-54499) VARIANT Type GA Generally available for semi-structured data, with CSV, XML, and Parquet scan support and colon-sign operator for field access. (SPARK-54454) Real-time Structured Streaming Single-digit millisecond latency for stateless workloads with AQE support, overcoming the limitations of the traditional micro-batch model – a major step for low-latency streaming pipelines. (SPARK-53736) JDBC Driver for Spark Connect Standard BI tool and JDBC application connectivity to Spark Connect servers. (SPARK-53484) Python Arrow UDF/UDTF Native Arrow integration eliminates serialization overhead for Python UDFs and UDTFs, delivering significant performance gains. (SPARK-52214) Recursive CTE Support Long-awaited support for recursive Common Table Expressions, enabling hierarchical and graph queries in SQL. (SPARK-24497) 77 New Built-in Functions Including approx_top_k, KLL quantiles sketch, Theta Sketch, BITMAP_AND_AGG, and try_to_date. Note: Iceberg and Delta runtime libraries not yet available for Spark 4.1.1. |
| Superset | 6.0.0 (LTS) | Dark Mode and Dynamic Themes True dark mode with Ant Design v5 token-based theming. Themes can be created, edited, imported/exported, and applied system-wide or per-dashboard – giving data teams full control over their BI environment’s look and feel. Bootstrap and Font Awesome dependencies removed entirely. Gantt Chart Type New Gantt chart visualization for timeline-based data, alongside enhanced time series with date range timeshift support. New Database Connectors Superset 5.0 added Parseable, Firebolt, YDB, Denodo Virtual DataPort, and Apache Doris. Superset 6.0 adds SingleStore and improves existing Snowflake, DuckDB, and Databricks connectors. Dataset Folders and Management Datasets can now be organized in folders and created without entering the Explore view first – a significant UX improvement for data teams managing large numbers of datasets. OAuth2 Database Support OAuth2 authentication for database connections, including BigQuery and Trino OAuth2, enabling modern authentication workflows. Multi-tab PDF Export Export multi-tab dashboards as PDF; CSV/Excel exports respect SQL_MAX_ROW. React 19 and Frontend Modernization React 19, Ant Design 5, TypeScript v5, DayJS replacing Moment.js. Note: OpenStreetMap is now the default for Deck.gl visualizations – no API key required. Consult the official update notes before upgrading. ⚠️ AUTH_OID removed – migrate to AUTH_OAUTH. |
| Open Policy Agent | 1.12.3 | String Interpolation in Rego (1.12.0) $"..." template syntax with {expression} placeholders – a more readable alternative to sprintf that makes policies easier to write and maintain. (v1.12.0) SQL Filter Compilation (1.9.0) Generate database-specific SQL WHERE clauses from Rego via the Compile API – previously exclusive to Enterprise OPA. (v1.9.0) Faster Bundle Loading (1.11.0) Concurrent Rego parsing significantly reduces startup times for large policy bundles. (v1.11.0) 10% Faster Compilation (1.12.0) Improved visitor implementation and reduced allocations deliver faster policy compilation. (v1.12.0) Security Fix: Memory Exhaustion via Forged Gzip (1.11.1) A malicious HTTP request could trigger out-of-memory conditions, bypassing token auth – only mTLS fully mitigates. Patched in 1.11.1. (v1.11.1) |
| OpenSearch | 3.4.0 | gRPC Transport Graduated from experimental plugin to full module, supporting Match, Boolean, Range, Wildcard, Fuzzy, Nested, and more – providing a high-performance alternative transport protocol for OpenSearch clusters. Streaming Aggregations New streaming cardinality and numeric terms aggregators via Apache Flight and Arrow, enabling real-time aggregation over large datasets without waiting for complete results. S3 Storage Improvements S3CrtClient for higher upload throughput (3.3); server-side and client-side encryption options for repository storage (3.4). Rule-based Auto-tagging Automatic security attribute tagging with ACL-aware routing – useful for compliance and access control scenarios. Pull-based Ingestion All-active ingestion mode, async periodic flush, message mappers, and dynamic consumer config updates – providing a more flexible ingestion pipeline. FIPS Compliance Tooling Build and test tooling for FIPS-compliant environments, important for government and regulated-industry deployments. Lucene 10.3.2 Upgraded from 10.1.x, bringing search performance and correctness improvements. |
| Trino | 479 | Automatic Internal TLS (479) Trino now auto-generates TLS certificates for internal cluster communication via ANNOUNCE node discovery – dramatically simplifying cluster security with no manual certificate management required. While Stackable already handled this at the infrastructure level, Trino now brings native TLS into the process itself. (Docs) Experimental Vectorized Serialization (479) Optimized worker data exchange on modern CPU architectures (Graviton 3, Skylake, Icelake, Zen 4+), improving query throughput for data-intensive operations. New Array Functions (479) array_first(), array_last(), and row literals with field name declarations. Encrypted Parquet Reading in Hive Connector (478) Plus improved complex predicate performance on the $path column. Iceberg Connector Improvements (478, 479) Token-exchange config, expire_snapshots options, memory optimization for highly nested fields, improved sorted table write performance. OPA Query ID Propagation (478) Query ID now passed to the OPA authorizer, enabling more granular per-query policy decisions. Note: Storage-connector remains based on Trino 477. ⚠️ Breaking (479): JDK 25 required. task.statistics-cpu-timer-enabled and prefer_streaming_operators removed. |
stackablectl
In parallel with SDP 26.3, stackablectl 1.4.0 is now available. This patch fixes a crash during release upgrade from SDP 25.11, caused by a 404 when looking up CRD files for the secret-operator, which now manages its own CRDs independently of Helm. See the release notes.
More Info
Further details on this release and upgrade instructions can be found in the release notes and the changelogs of the individual operators:
Airflow, Druid, HBase, HDFS, Hive, Kafka, NiFi, OpenPolicyAgent, OpenSearch, Spark, Superset, Trino, ZooKeeper