Using CREATE TABLE AS SELECT (CTAS) SQL statements is a well-established method of copying and transforming structured data. Trino’s inherent ability to manipulate data from many different sources, sinks and formats makes this a particularly effective way to move data … Read More
Author Archives: Jim Halfpenny
Stackable and Trino Part 2: Setting up a Hive migration sandbox
In Part 1 of this blog series I looked at how Stackable and Trino fit in with the wider Apache Hadoop SQL ecosystem and especially with Apache Hive to provide a means of smoothing the migration away from Hive. Now … Read More
Stackable and Trino Part 1: A Rosetta Stone for Apache Hive
Apache Hive is a common feature in many Hadoop deployments and it’s not unusual to find Hadoop clusters where the primary use case boils down to SQL queries on structured data. Hive has strong roots in the Hadoop ecosystem and … Read More
The Stackable Docathon: building a data pipeline
Last month we ran our first-ever Documentation-Hackathon – or “Docathon” – at Stackable. The result is a guide showing how to build a simple data pipeline, which can be found here: As anyone who has been involved in software … Read More
What Hadoop users need to know about our platform
Why does Stackable make the ideal choice for your modern data platform? Hadoop was first created in 2005 and, as this adolescent technology rapidly approaches adulthood, we find ourselves wondering what’s next on its life journey. Many folks have sounded … Read More
A Brief History of Open Source Big Data Distributions
This blog post is based on a lecture at Berlin Buzzwords by Lars Francke and Sönke Liebau on June 15th, 2021. You can find the full version of the lecture on YouTube. If large amounts of data are to be stored, … Read More
Building a New Big Data Distribution Based on Kubernetes – With a Twist!
This blog post is based on the presentation at Berlin Buzzwords by Lars Francke and Sönke Liebau on 2021-06-15. You can watch the full version of the talk on YouTube. A brief history of open source big data distributions If … Read More