Apache Druid Data Modeling (2024)

About this course

The Apache Druid® database powers analytical applications for organisations large and small around the world. Under the hood, a highly optimised data format and a shared-nothing, microservices architecture help deliver on performance, resilience, and availability. On the surface, a fully fledged SQL dialect enables you to create flexible, interactive analytics UI components for your applications using real-time and batch-ingested data.

Who is this course for?

This course guides you, as a relatively new Apache Druid® user, through the functionality and first principles of effective ingestion and data modelling.

Hear from experts on how and why to apply different techniques.
Get hands-on with links to Python notebooks on specific Druid database features.
Gain a certification that you can share on social media.

What's in the course?

There are four units in the course. The first walks through the basics of Druid table schemas, before moving on to the second, which focuses on where best to employ Druid's processing power. Next, unit three covers data layout in Druid and why it matters so much to query performance. Finally, unit four walks through summarisation and approximation in Apache Druid, both key techniques that can help you boost query performance.
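To give a feel for what summarisation means before you reach that unit, here is a minimal conceptual sketch in plain Python. It is not Druid's API or ingestion syntax: the event rows, the column names, and the `summarise_hourly` helper are all hypothetical, chosen only to show how rows that share a time bucket and dimension value can be collapsed into one pre-aggregated row.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: (timestamp, channel, bytes_added).
events = [
    ("2024-01-01T10:05:00", "web", 120),
    ("2024-01-01T10:40:00", "web", 80),
    ("2024-01-01T10:59:00", "mobile", 50),
    ("2024-01-01T11:10:00", "web", 30),
]


def summarise_hourly(rows):
    """Collapse rows that share the same hour and channel into one summary row."""
    totals = defaultdict(lambda: {"count": 0, "bytes_added": 0})
    for ts, channel, bytes_added in rows:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        key = (hour.isoformat(), channel)
        totals[key]["count"] += 1
        totals[key]["bytes_added"] += bytes_added
    return dict(totals)


summary = summarise_hourly(events)
for (hour, channel), agg in sorted(summary.items()):
    print(hour, channel, agg["count"], agg["bytes_added"])
```

In this sketch four raw events become three summary rows. The trade-off is that queries have fewer rows to scan, but the individual events within each hour can no longer be recovered from the summary.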

Each unit contains presentations from experts in Apache Druid, together with specific Python notebooks to try out.

You can return to the course as often as you like – even retaking the final certification exam to up that score!

What do I need?

All of the parts of this course are optional - you can skip straight to the final exam if that's all you need!

To run the Python notebooks that walk you through features in Apache Druid, you will need to be able to run the learning environment. Visit the site and make sure that you can get the learn-druid environment running.

Curriculum

Introduction
Course introduction
Welcome!
Set up your learning environment
Design a good schema

With Apache Druid, you can create TABLEs for storing event data, and LOOKUPs for key-value pair data.

Hear from experts in both areas as they describe what these two structures are and how to use them effectively.
Expert interview
Exercises
Put processing in the right place
Expert interview
Exercises: functions
Exercises: JOINs
Learn more: JSON-based ingestion
Optimize segment layout and location
Segments and infrastructure
Expert interview
Exercises: partitioning and clustering
Learn more
Learn more: tiering in action
Summarise and sketch
Expert interview
Exercises: summarised tables
Approximation
Exercises: approximation
Tables
Exercises: UNION ALL
Learn more
Exam
Join the community
Feedback questionnaire
Exam introduction
Exam
Next steps

Data modeling is the key to leveraging your Apache Druid® database. Learn how to ingest data into Druid data models that are fast and scalable.
