Stackable

Author Archives: Jim Halfpenny

Stackable and Trino Part 3: Migrating Hive Tables Using CTAS

Using CREATE TABLE AS SELECT (CTAS) SQL statements is a well-established method of copying and transforming structured data. Trino’s inherent ability to manipulate data from many different sources, sinks and formats makes this a particularly effective way to move data … Read More

Stackable and Trino Part 2: Setting up a Hive migration sandbox

In Part 1 of this blog series I looked at how Stackable and Trino fit in with the wider Apache Hadoop SQL ecosystem and especially with Apache Hive to provide a means of smoothing the migration away from Hive. Now … Read More

Stackable and Trino Part 1: A Rosetta Stone for Apache Hive

Apache Hive is a common feature in many Hadoop deployments and it’s not unusual to find Hadoop clusters where the primary use case boils down to SQL queries on structured data. Hive has strong roots in the Hadoop ecosystem and … Read More

The Stackable Docathon: building a data pipeline

Last month we ran our first ever Documentation-Hackathon – or “Docathon” – at Stackable. The result is a guide showing how to build a simple data pipeline which can be found here: As anyone who has been involved in software … Read More

What Hadoop users need to know about our platform

Why does Stackable make the ideal choice for your modern data platform? Hadoop was first created in 2005, and as this adolescent technology rapidly approaches adulthood, we find ourselves wondering what’s next on its life journey. Many folks have sounded … Read More

A Brief History of Open Source Big Data Distributions

This blog post is based on a lecture at Berlin Buzzwords by Lars Francke and Sönke Liebau on June 15th, 2021. You can find the full version of the lecture on YouTube. If large amounts of data are to be stored, … Read More

Building a New Big Data Distribution Based on Kubernetes – With a Twist!

This blog post is based on the presentation at Berlin Buzzwords by Lars Francke and Sönke Liebau on 2021-06-15. You can watch the full version of the talk on YouTube. A brief history of open source big data distributions If … Read More