Stackable

Author Archives: Jim Halfpenny

Stackable and Trino Part 2: Setting up a Hive migration sandbox

In Part 1 of this blog series I looked at how Stackable and Trino fit in with the wider Apache Hadoop SQL ecosystem and especially with Apache Hive to provide a means of smoothing the migration away from Hive. Now … Read More

Stackable and Trino Part 1: A Rosetta Stone for Apache Hive

Apache Hive is a common feature in many Hadoop deployments and it’s not unusual to find Hadoop clusters where the primary use case boils down to SQL queries on structured data. Hive has strong roots in the Hadoop ecosystem and … Read More

The Stackable Docathon: building a data pipeline

Last month we ran our first-ever Documentation-Hackathon – or “Docathon” – at Stackable. The result is a guide showing how to build a simple data pipeline, which can be found here: As anyone who has been involved in software … Read More

What Hadoop users need to know about our platform

Why does Stackable make the ideal choice for your modern data platform? Hadoop was first created in 2005, and as this adolescent technology rapidly approaches adulthood we find ourselves wondering what’s next on its life journey. Many folks have sounded … Read More

A Brief History of Open Source Big Data Distributions

This blog post is based on a lecture at Berlin Buzzwords by Lars Francke and Sönke Liebau on June 15th, 2021. You can find the full version of the lecture on YouTube. If large amounts of data are to be stored, … Read More

Building a New Big Data Distribution Based on Kubernetes – With a Twist!

This blog post is based on the presentation at Berlin Buzzwords by Lars Francke and Sönke Liebau on 2021-06-15. You can watch the full version of the talk on YouTube. A brief history of open source big data distributions If … Read More