Apache Druid Data Modeling (2022)

Data modeling is the key to leveraging your Apache Druid® database. Learn how to ingest data into Druid data models that are fast and scalable.

About this course

The Apache Druid® database is often used to power applications that are fun, fluid, and fast. It's a database designed for the last mile, sitting behind a UI where customers, suppliers, and employees are waiting for something to happen. When that's your aim, you need to understand enough of how the database works to get the best out of it, in particular the things that affect how you insert your data into Druid and how you query it. Take this course and walk through six key principles of data modeling for Apache Druid:

  1. Create each table datasource for a specific set of query shapes,
  2. Transform data, as much as possible, before storage,
  3. Eliminate unused columns,
  4. Filter out unnecessary rows,
  5. Combine rows using query time granularity, approximation and rollup, and
  6. Organize segments for fast queries.
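
To make principles 3 to 5 more concrete, here is a minimal Python sketch of the idea behind dropping columns, filtering rows, and rollup. It is not Druid code and calls no Druid API: the event fields (`ts`, `channel`, `status`, `bytes`), the hourly granularity, and the helper name `rollup` are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime


def rollup(events, keep_status="ok"):
    """Filter rows, truncate timestamps to the hour, and aggregate per (hour, channel).

    Hypothetical helper: it only illustrates the concept and is not how Druid
    implements rollup internally.
    """
    totals = defaultdict(lambda: {"count": 0, "bytes": 0})
    for event in events:
        # Principle 4: filter out rows that no query will need.
        if event["status"] != keep_status:
            continue
        # Principle 5: coarsen the timestamp so that rows can be combined.
        hour = datetime.fromisoformat(event["ts"]).replace(
            minute=0, second=0, microsecond=0
        )
        # Principle 3: only "channel" is kept as a column; "status" is dropped.
        key = (hour.isoformat(), event["channel"])
        totals[key]["count"] += 1
        totals[key]["bytes"] += event["bytes"]
    return dict(totals)


# Made-up sample events, for illustration only.
events = [
    {"ts": "2022-01-01T10:05:00", "channel": "web", "status": "ok", "bytes": 120},
    {"ts": "2022-01-01T10:40:00", "channel": "web", "status": "ok", "bytes": 80},
    {"ts": "2022-01-01T10:59:00", "channel": "web", "status": "debug", "bytes": 5},
    {"ts": "2022-01-01T11:02:00", "channel": "app", "status": "ok", "bytes": 300},
]
rolled = rollup(events)
```

In this sketch, four input events become two stored rows: the two `web` events in the 10:00 hour are combined, and the `debug` event is filtered out. Fewer, narrower rows is the effect the principles aim for, at the cost of no longer being able to query the individual events.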

Who is this course for?

This course is most useful to application developers, the people who will insert and select data. It's also useful to the database operators who work with application developers: you'll get a feel for why they make the data engineering decisions that they do.

What's in the course?

You'll walk through the phases of Druid ingestion in detail and learn how to customize them. You'll also learn how the six data modeling principles apply to your ingestion, and the effect this has on the user experience.

  • Hear about the importance of time in Druid
  • See how data transformations work in Druid and why they matter
  • Get hands-on with schema design in Druid
  • Experience the value of ingestion-time data summarization in Druid
  • Find out how Druid organizes its data to maximize parallelization

Each unit in this course consists of an introductory lecture followed by exercises in a virtual lab environment with Druid running.

The final stage of this course is a certification exam.  Once you complete all essential parts of the course and pass the exam, you’ll receive an email from Imply with instructions on how to download your certificate and display it on LinkedIn to indicate your expertise.

What do I need?

It's a good idea to have some experience of Druid before you begin. If you're just starting out, complete Apache Druid Basics or try the Druid quickstart. If you're not new to Druid, this course is still for you: it brings together knowledge and experience from project team members that you may not have heard before, and it can help you even in a deployment you're already running.

The course is browser-based, so there is no need to download or set up anything before you start.

Curriculum

  • Course Introduction
      • Welcome to Apache Druid® Ingestion and Data Modeling
      • Course Overview (preview)
      • Data Modeling Principles
      • Helpful lab tips
      • Learn more...
  • Ingestion and Data Modeling Introduction
      • Ingestion and Data Modeling - Introduction
      • Lab 1: Ingestion and Data Modeling Overview
      • Learn more
  • Timestamps
      • Timestamps - Introduction
      • Lab 2: Ingestion and Data Modeling Timestamps
      • Learn more
  • Transforms
      • Transforms - Introduction
      • Lab 3: Ingestion and Data Modeling Transforms
      • Learn more
  • Dimensions
      • Dimensions - Introduction
      • Lab 4: Ingestion and Data Modeling Dimensions
      • Learn more
  • Rollup and Metrics
      • Rollup and Metrics - Introduction
      • Lab 5: Ingestion and Data Modeling Rollup and Metrics
      • Learn more
  • Segment Organization
      • Segments - Introduction
      • Lab 6: Ingestion and Data Modeling Segments
      • Learn more
  • Druid Ingestion and Data Modeling Accreditation Exam
      • Exam Introduction
      • Druid Ingestion and Data Modeling Accreditation Exam
  • Be part of it!
      • Please give us your feedback!
