Stackable Data Platform (SDP) Release 25.11

We’re excited to introduce Stackable Data Platform 25.11, a release focused on expanding your data platform’s capabilities with new features, enhanced security, and improved observability.

This version brings experimental support for OpenSearch 3.1.0, enabling powerful search and analytics directly within the Stackable ecosystem. It also introduces KRaft-managed Kafka clusters, replacing ZooKeeper for modern, scalable Kafka deployments, and TLS encryption for Open Policy Agent (OPA), ensuring secure communication between OPA and your data services.

With 37 CVEs addressed, including 2 critical and 18 high-severity fixes, and enhanced metrics collection across all operators, SDP 25.11 makes your platform more secure and easier to monitor. The release also includes improved Trino performance, Airflow 3.0.6 (LTS) with Triggerer support, and Spark 4.0.1 (experimental), along with expanded Kubernetes 1.31-1.34 and Red Hat OpenShift 4.18-4.20 support. Combined with our OCI-first delivery on oci.stackable.tech, SDP 25.11 empowers you to build smarter, more resilient data architectures faster than ever.

New Platform Features

General enhancements
- Experimental OpenSearch Operator: Deploy and manage OpenSearch clusters with Stackable’s new operator, supporting version 3.1.0.
- End-of-Support (EoS) Warnings: All operators now emit warnings when they approach their end-of-support date, helping you stay compliant and up to date.
- KRaft for Apache Kafka: Experimental support for KRaft-managed Kafka clusters, replacing ZooKeeper for cluster management in Kafka 4.0+.
- Improved Trino Performance: Batch queries using the OPA Batch interface are now significantly faster.
Security
- TLS for Open Policy Agent (OPA): Encrypt traffic between OPA and clients, with automatic integration for Trino and NiFi authorizers.
- SecretClass v1alpha2: New features and stability improvements, including non-experimental samAccountName generation and certManager backend.
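
Once OPA serves TLS, a client has to trust the CA that issued OPA's certificate. The following is a minimal sketch of a policy query against OPA's Data API (`POST /v1/data/<path>`) over HTTPS, using only the Python standard library. The host name, port, CA bundle path, and policy path are hypothetical placeholders and are not taken from this release; substitute the values from your own cluster.

```python
import json
import ssl
import urllib.request

# Hypothetical values: none of these come from the release notes.
OPA_BASE_URL = "https://opa.example.svc.cluster.local:8443"
CA_BUNDLE_PATH = "/path/to/ca.crt"


def build_opa_request(base_url: str, policy_path: str, input_doc: dict) -> urllib.request.Request:
    """Build a POST request for OPA's Data API (/v1/data/<policy path>)."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_opa(request: urllib.request.Request, ca_bundle_path: str) -> dict:
    """Send the request, verifying OPA's certificate against the given CA bundle."""
    context = ssl.create_default_context(cafile=ca_bundle_path)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)


# Building the request does not touch the network; query_opa() does.
request = build_opa_request(OPA_BASE_URL, "myapp/allow", {"user": "alice"})
print(request.full_url)
```

Calling `query_opa(request, CA_BUNDLE_PATH)` would perform the actual TLS-verified call. The Trino and NiFi authorizers are integrated automatically, so a hand-written client like this is only relevant for custom applications.
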
Observability
- Enhanced Metrics: All operators now expose Prometheus annotations for HTTP(S) scheme, metrics path, and port, improving metrics scraping and monitoring.
- Log Integrity: Resolved issues causing corrupted log entries in several supported products.
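
To illustrate how a scraper can use scheme, path, and port annotations, here is a small sketch that assembles a scrape URL from a set of annotations. It assumes the widely used `prometheus.io/scheme`, `prometheus.io/path`, and `prometheus.io/port` keys; the exact keys set by the Stackable operators are an assumption here, and the address and port in the example are made up.

```python
# Conventional annotation keys; whether the operators use exactly these keys
# is an assumption of this sketch.
SCHEME_KEY = "prometheus.io/scheme"
PATH_KEY = "prometheus.io/path"
PORT_KEY = "prometheus.io/port"


def scrape_url(host: str, annotations: dict[str, str]) -> str:
    """Assemble a metrics URL from annotations, with common fallbacks."""
    scheme = annotations.get(SCHEME_KEY, "http")
    path = annotations.get(PATH_KEY, "/metrics")
    port = annotations.get(PORT_KEY)
    if port is None:
        raise ValueError(f"missing required annotation {PORT_KEY}")
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{host}:{port}{path}"


# Hypothetical annotations and pod address, for illustration only.
example_annotations = {
    SCHEME_KEY: "https",
    PATH_KEY: "/metrics",
    PORT_KEY: "8081",
}
print(scrape_url("10.0.0.12", example_annotations))
```

In practice a Prometheus relabeling configuration does this work; the sketch only shows which piece of the URL each annotation supplies.
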
Authorization
- User Info Fetcher (UIF): No longer experimental, enabling seamless user group fetching for authorization.
ArgoCD Demo
- New demo showcasing GitOps integration with Stackable operators and Airflow.

Platform Improvements

Security & vulnerabilities
- 37 CVEs Fixed: Addressed 2 critical and 18 high-severity vulnerabilities across Stackable product images.
Apache Airflow
- Triggerer Support: Enables deferrable operators, keeping worker slots free and enhancing High Availability.
- DAG-Processor Role: Optional individual role for separate configuration and dedicated container execution.
- Database Initialization: Can now be deactivated for troubleshooting, with clear warnings about potential risks.
Apache Spark
- ServiceAccount Overrides: Override Spark application ServiceAccounts using podOverrides.
- Spark 4.0.1 (Experimental): Added support, with known compatibility issues with HBase and Iceberg.
Apache NiFi
- SNI Check Workaround: Disable SNI checks for NiFi in scenarios where external names are not in certificates.
- Monitoring Guidance: Updated documentation for scraping NiFi 2 metrics using mTLS.
Open Policy Agent
- Dedicated Metrics Service: Per-rolegroup -metrics Service for Prometheus scraping, plus additional environment and bundle load metrics.
Trino
- Lakehouse Connector: Trino 477 adds support for the new, universal Lakehouse connector.
Listener & Secret Operators
- ListenerClass Defaults: .spec.externalTrafficPolicy now defaults to null for broader LoadBalancer compatibility.
- Secret-Operator Helm Chart: Now deploys as two parts (Deployment for controller, DaemonSet for CSI), with separate resource control.

New Product Versions

The following new product versions are now supported (get the list of all supported product versions here):

Product	New version(s)	What’s new?
Airflow	3.0.6 (LTS)	Database Downgrade Support: Allows database downgrades from Airflow 3.x to 2.11, providing a safety net for upgrades. PRs: #54399, #54508. Grid View Performance Improvements: Significantly improved Grid view performance and responsiveness with optimized data loading, making the UI much faster for large deployments. PRs: #52718, #52822, #52919. Fixed Connection Editing Security Issue: Fixed a critical bug where sensitive fields such as passwords and extras were lost when updating connections through the UI. PRs: #53943, #53973. Task Log Viewer Enhancements: The task log viewer now uses virtualized rendering for better performance on large logs, making extensive log files easier to navigate and view. PR: #50746. Improved Graph View Visualization: Fixed graph view edge rendering issues for nested task groups and added auto-switching to triggered DAG runs when manually triggering. PRs: #54412 (3.0.5), #54336 (3.0.6).
Apache Hadoop	3.4.2 (LTS)	S3A: Support for S3 Conditional Writes: The S3A client now uses S3 conditional overwrite PUT requests for atomic file creation, eliminating the need for a separate HEAD request and providing true atomic operations. Issue: HADOOP-19256. S3A: Analytics Accelerator Support for Parquet Performance: Added support for the analytics-accelerator-s3 library to significantly improve Parquet read performance through optimized input streams. Issues: HADOOP-19363, HADOOP-19348. ABFS: FNS Over Blob Endpoint Support: Comprehensive support for using the Azure Blob Storage endpoint with Flat Namespace (FNS) accounts, including listing optimizations and duplicate removal across iterations for better performance. Issues: HADOOP-19179, HADOOP-19474, HADOOP-19543. Optimized Vectored I/O Read Thresholds: Improved performance for cloud storage reads by increasing coalescing thresholds. S3A and ABFS now use a 128 KB minimum and a 2 MB maximum for merging adjacent read ranges, based on Facebook Velox research. Issue: HADOOP-19229. S3A: Checksum Support for S3 Objects: Added the ability to set checksums on S3 file uploads, with support for the CRC32, CRC32C, SHA1, and SHA256 algorithms via the `fs.s3a.create.checksum.algorithm` configuration option. Issue: HADOOP-15224.
Druid	34.0.0	SET Statements for Query Context Parameters: You can now use SET statements to define query context parameters in SQL queries through both the web console and the API, making it much easier to configure query settings without JSON formatting. Hadoop-Based Ingestion: Support for Hadoop-based ingestion is scheduled to end with Druid 37.0.0. Hadoop-based ingestion has been deprecated since Druid 32.0 and will be removed as early as Druid 35.0.0. We recommend one of Druid’s other supported ingestion methods, such as SQL-based ingestion or MiddleManager-less ingestion using Kubernetes. As part of this change, you must now opt in to using the deprecated `index_hadoop` task type. If you don’t, your Hadoop-based ingestion tasks will fail. Historical Cloning (Experimental): Configure clone Historicals for rolling updates and high-availability scenarios. Clones mirror segment assignments from original Historicals without counting toward replica counts. Improved Concurrency for Batch and Streaming Ingestion Tasks: Enhanced parallel processing capabilities for both batch and streaming ingestion tasks, improving overall ingestion performance. Embedded Kill Tasks on the Overlord (Experimental): Kill tasks now run directly on the Overlord, which brings several benefits: they kill segments as soon as they’re eligible, don’t consume task slots, finish faster thanks to optimized metadata queries, skip locked intervals to avoid blocking, and can keep up with large numbers of unused segments. Multi-Stream Supervisors (Experimental): Multiple supervisors can now ingest data into the same datasource, enabling more flexible ingestion architectures.
HBase	2.6.3 (LTS)	LDAP Admin Access Control for Web UI: Added support for configuring LDAP users as administrators for all HBase Web UIs. Privileged pages can now be accessed only by designated admin users through the new configuration property `hbase.security.authentication.ldap.admin.users`. Issue: HBASE-29244. Upgraded OpenTelemetry Libraries: Upgraded OpenTelemetry packages from three-year-old versions to the latest libraries. Moved from `io.opentelemetry:opentelemetry-semconv` to the relocated `io.opentelemetry.semconv:opentelemetry-semconv` group ID and refactored semantic attributes. Issue: HBASE-29293. Dependency Updates for Security and Stability: Multiple critical dependency upgrades across the codebase, including byte-buddy, hbase-thirdparty, and JRuby. Issues: HBASE-29327, HBASE-29317, HBASE-29360.
Hive	4.1.0	REST-based Catalog Server: A REST-based catalog server backed by the Hive Metastore was introduced. Issue: HIVE-28059. Enhanced Iceberg Integration: Enhanced Iceberg table support, including partition-level column statistics. JDK 17 Support: Hive 4.1.0 adds JDK 17 support. Issue: HIVE-26473. IPv6 Compatibility: IPv6 compatibility was added to the metastore. Issues: HIVE-28784, HIVE-28783, HIVE-28782, HIVE-28781.
Kafka	4.1.0 (experimental)	KRaft Mode Becomes Default (Removes ZooKeeper Dependency): ZooKeeper support has been completely removed, with KRaft becoming the exclusive metadata management solution, simplifying deployment and reducing operational overhead. Links: KIP-500, KAFKA-9119. Next-Generation Consumer Rebalance Protocol: The new consumer rebalance protocol dramatically improves performance by eliminating stop-the-world rebalances, with broker-managed partition assignment and incremental rebalancing that reduces downtime and latency. Links: KIP-848, KAFKA-14048. Queues for Kafka (Preview Release): Moves from Early Access to Preview status, bringing queue-like semantics closer to production readiness with parallel partition consumption, out-of-order processing, and individual message acknowledgements. Links: KIP-932, KAFKA-15349. Enhanced Client Resilience: Enables clients to proactively rebootstrap when no metadata updates occur within a timeout period or when servers signal them to rebootstrap, preventing clients from being stuck with outdated metadata. Links: KIP-1102, KAFKA-15639. Kafka Streams Rebalance Protocol: Introduces a dedicated protocol for Kafka Streams with broker-managed task assignment, eliminating the need to piggyback on the consumer rebalance protocol and solving production incidents caused by client-side rebalancing logic. Links: KIP-1071, KAFKA-16554. Improved Transactional Producer Error Handling: Updates error handling logic and documentation for all transaction APIs, making it simpler to build robust applications with clear exception categories for recovery strategies. Links: KIP-1050, KAFKA-15400. JWT Bearer OAuth 2.0 Support: Adds support for the jwt-bearer grant type for OAuth in addition to client_credentials, enabling secure access management without static secrets in configurations. Links: KIP-1139, KAFKA-16965.
NiFi	2.6.0 (LTS)	– Azure Git DevOps Flow Registry Client – Add Amazon Glue Schema Reference Reader – StandardProtobufReader supporting Schema Registries with Protobuf – Support SASL/OAUTHBEARER in Kafka processors – Add support for Binary XLS format to ExcelReader – Allow an unversioned PG to start tracking to a version in a registry client – Options to deal with parameter values in git registry clients when versioning a flow – Normalization of the JSON flow definitions when using git registry clients – Introduce Inject Metadata Output Strategy in ConsumeKafka – Add OAUTH support to SnowflakeComputingConnectionPool
OpenSearch	3.1.0 (LTS)	The entire product is new to the stack, so the full feature set is not listed here.
Spark	3.5.7 (LTS), 4.0.1 (experimental)	ANSI SQL Mode by Default: Spark 4.0 enables ANSI SQL mode by default, improving SQL compatibility and making it easier to migrate SQL workloads from other databases. Issue: SPARK-44444. String Collation Support: Added support for string collation, enabling locale-aware string comparison and sorting, which is crucial for internationalization and compatibility with traditional databases. SQL Pipe Syntax: Makes it easy to compose queries by specifying a sequence of SQL clauses separated by the pipe token |>, where each operator represents a fully defined transformation of the preceding relation. Each pipe operator may refer only to the names and rows generated by the preceding pipe operator; otherwise, each step is stateless. Issue: SPARK-49555. New VARIANT Data Type: Introduced the VARIANT data type for semi-structured data, improving support for complex, nested data formats. Issue: SPARK-45827.
Superset	4.1.4 (LTS)	– Bug Fixes and Stability Improvements – Security and Dependency Updates
Trino	477 (LTS)	In general: – Add Lakehouse connector. (#25347) – Add support for `ALTER MATERIALIZED VIEW ... SET AUTHORIZATION`. (#25910) – Add support for default column values when creating tables or adding new columns. (#25679) – Add support for `ALTER VIEW ... REFRESH`. (#25906) – Add support for managing and querying table branches. (#25751, #26300, #26136) – Add the `cosine_distance()` function for sparse vectors. (#24027) – ⚠️ Breaking change: Improve precision and scale inference for arithmetic operations with decimal values. The previous behavior can be restored by setting the `deprecated.legacy-arithmetic-decimal-operators` config property to `true`. (#26422) – ⚠️ Breaking change: Remove the HTTP server event listener plugin from the server binary distribution and the Docker container. (#25967) – ⚠️ Breaking change: Enforce requirement for catalogs to be deployed in all nodes. (#26063) – Add physical data scan tracking to resource groups. (#25003) – Add `internal_network_input_bytes` column to `system.runtime.tasks` table. (#26524) – Add support for `Geometry` type in `to_geojson_geometry()`. (#26451) Hive Connector: – Add support for using GCS without credentials. (#25810) – Add support for reading tables using the Esri JSON format. (#25241) – Add support for `extended_boolean_literal` in text-file formats. (#21156) – Add metrics for data read from filesystem cache in `EXPLAIN ANALYZE VERBOSE` output. (#26342) – Add support for Twitter Elephantbird protobuf deserialization. (#26305) – Improve throughput for write-heavy queries on Azure when the `azure.multipart-write-enabled` config option is set to `true`. (#26225) – Reduce query failures due to S3 throttling. (#26407) – Avoid worker crashes due to out-of-memory errors when decoding unusually large Parquet footers. (#25973) – Fix incorrect results when reading from Parquet files produced by old versions of PyArrow. (#26058) – Add support for reading Hive OpenCSV tables with quoting and escaping disabled. (#26619) Iceberg Connector: – Add support for `SIGV4` as an independent authentication scheme. It can be enabled by setting the `iceberg.rest-catalog.security` config property to `SIGV4`. The `iceberg.rest-catalog.sigv4-enabled` config property is no longer supported. (#26218) – Add support for using GCS without credentials. (#25810) – Allow configuring the compression codec to use for reading a table via the `compression_codec` table property. The `compression_codec` session property is no longer supported. (#25755) – Add metrics for data read from filesystem cache in `EXPLAIN ANALYZE VERBOSE` output. (#26342) – Improve performance of `expire_snapshots` procedure. (#26230) – Improve performance of `remove_orphan_files` procedure. (#26326, #26438) – Improve performance of queries on `$files` metadata table. (#25677) – Improve performance of writes to Iceberg tables when task retries are enabled. (#26620) – Reduce memory usage of `remove_orphan_files` procedure. (#25847) – Improve throughput for write-heavy queries on Azure when the `azure.multipart-write-enabled` config option is set to `true`. (#26225) – Reduce query failures due to S3 throttling. (#26407, #26432) – Avoid worker crashes due to out-of-memory errors when decoding unusually large Parquet footers. (#25973) – Fix performance regression and potential query failures for `REFRESH MATERIALIZED VIEW`. (#26051)
ZooKeeper	3.9.4 (LTS)	Upgrade Netty to Fix CVE-2025-24970: Addresses a critical security vulnerability in Netty, ensuring safer network communication. Issue: ZOOKEEPER-4897. Use FIPS-Style Hostname Verification: When no custom truststore is specified, ZooKeeper now uses FIPS-compliant hostname verification, improving security in regulated environments. Issue: ZOOKEEPER-4954. Log Full Exception Details for Server JAAS Config Failure: Enhanced logging for JAAS configuration failures, making it easier to diagnose authentication issues. Issue: ZOOKEEPER-4906. Check Permissions Individually During Admin Server Auth: Admin server authentication now checks permissions individually, providing finer-grained access control and security. Issue: ZOOKEEPER-4964.
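
The Hadoop 3.4.2 entry in the table mentions merging adjacent read ranges with a 128 KB minimum and a 2 MB maximum. The sketch below shows the general idea of such coalescing: ranges separated by a gap smaller than the minimum are merged into one read, as long as the merged read stays within the maximum size. It is a simplified illustration with hypothetical function and parameter names, not Hadoop's actual implementation.

```python
MIN_SEEK = 128 * 1024             # gaps smaller than this are read through
MAX_MERGED_SIZE = 2 * 1024 * 1024  # upper bound for one merged read


def coalesce(ranges, min_seek=MIN_SEEK, max_size=MAX_MERGED_SIZE):
    """Merge (offset, length) ranges that are close together.

    Two neighbouring ranges are merged when the gap between them is smaller
    than min_seek and the merged range does not exceed max_size.
    Returns a list of (offset, length) tuples sorted by offset.
    """
    merged = []  # list of (start, end) with end exclusive
    for offset, length in sorted(ranges):
        end = offset + length
        if merged:
            cur_start, cur_end = merged[-1]
            gap = offset - cur_end
            new_end = max(cur_end, end)
            if gap < min_seek and new_end - cur_start <= max_size:
                merged[-1] = (cur_start, new_end)
                continue
        merged.append((offset, end))
    return [(start, end - start) for start, end in merged]


# Two nearby 4 KB reads collapse into one; the distant read stays separate.
print(coalesce([(0, 4096), (8192, 4096), (1_000_000, 4096)]))
```

Larger thresholds trade some wasted bytes (the gaps that are read and discarded) for fewer round trips to object storage, which is usually the better deal for S3 and ABFS.
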

stackablectl

In parallel with the Stackable Data Platform release, stackablectl version 1.2.0 now automatically detects the Kubernetes environment and chooses a sensible ListenerClass preset by default. The ListenerClass preset can also be configured explicitly. For more details, take a look at the separate release notes.

More Info

Further details on our release and how to upgrade can be found in the release notes as well as in the change logs of the individual operators:

Airflow, Druid, HBase, HDFS, Hive, Kafka, NiFi, OpenPolicyAgent, OpenSearch, Spark, Superset, Trino, ZooKeeper