
Parallelism in Druid

Watch the following video to learn about parallelism in Apache Druid.

Demo code sample

Here is the query featured in the demo video. It uses FILTER (WHERE ...) clauses so that each COUNT covers only the records from a single day. To run this query as is, first ingest the flights data into a table called "example-flights".

SELECT
COUNT(*) FILTER (WHERE TIME_IN_INTERVAL("__time", '2005-11-01/P1D')) AS "day-1",
COUNT(*) FILTER (WHERE TIME_IN_INTERVAL("__time", '2005-11-02/P1D')) AS "day-2",
COUNT(*) FILTER (WHERE TIME_IN_INTERVAL("__time", '2005-11-03/P1D')) AS "day-3"
FROM "example-flights"
WHERE "Reporting_Airline" = 'DL'

Exercise

Try adapting the flight records query to use a table that you've ingested yourself. For many more examples of Druid SQL, and a broader view of what Druid can do, see the learn-druid repository.
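
One way to approach the exercise is to generate the per-day columns instead of typing them by hand. The Python sketch below is only an illustration: daily_count_query is a hypothetical helper (it is not part of Druid or the learn-druid repository), "my-table" is a placeholder name, and the code assumes your table uses Druid's standard "__time" column.

```python
from datetime import date, timedelta


def daily_count_query(table, first_day, days, where=None):
    """Return Druid SQL with one filtered COUNT(*) column per day.

    Hypothetical helper for illustration; values are interpolated without
    escaping, so pass only trusted table names and conditions.
    """
    columns = []
    for i in range(days):
        day = first_day + timedelta(days=i)
        # ISO 8601 interval: a start date plus a one-day period (P1D).
        interval = f"{day.isoformat()}/P1D"
        columns.append(
            'COUNT(*) FILTER (WHERE TIME_IN_INTERVAL("__time", '
            f"'{interval}')) AS \"day-{i + 1}\""
        )
    query = "SELECT\n" + ",\n".join(columns) + f'\nFROM "{table}"'
    if where:
        query += f"\nWHERE {where}"
    return query


# "my-table" is a placeholder; substitute a table you have ingested.
print(daily_count_query("my-table", date(2024, 1, 1), 3))
```

Calling the helper with "example-flights", a start date of 2005-11-01, three days, and the Reporting_Airline condition should reproduce the demo query above.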

Learn more

Refer to the following resources to learn more about the SQL dialect of Apache Druid, as well as details on how interactive queries execute.

The learn-druid repository contains many examples of Druid SQL in action. See Using the Druid SQL API to get started with Druid SQL.
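
As a rough sketch of what a call to the Druid SQL API can look like, the Python below builds a POST request to the /druid/v2/sql endpoint with the query in a JSON body. The base URL http://localhost:8888 is an assumption (the usual router address in a local quickstart), and build_sql_request and run_sql are hypothetical helper names; adjust both to match your deployment.

```python
import json
import urllib.request


def build_sql_request(query, base_url="http://localhost:8888"):
    """Build (but do not send) a POST request for Druid's SQL endpoint.

    base_url is an assumed local router address; change it for your cluster.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(query, base_url="http://localhost:8888"):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_sql_request(query, base_url)) as resp:
        return json.load(resp)


# Example (requires a running Druid cluster with the table ingested):
# rows = run_sql('SELECT COUNT(*) AS "n" FROM "example-flights"')
```

Keeping request construction separate from sending makes it easy to inspect the payload before running anything against a live cluster.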