💡 This example will show how to calculate the Top 3 suppliers who have the highest sales for every tumbling 5 minutes window.
The source table (orders
) is backed by the faker
connector, which continuously generates rows in memory based on Java Faker expressions.
In our previous recipes we've shown how you can aggregate time series data using the TUMBLE
function and also how you can get continuous Top-N results.
In this recipe, you will use the Window Top-N
feature to display the top 3 suppliers with the highest sales every 5 minutes.
The difference between the regular Top-N and this Window Top-N, is that Window Top-N only emits final results, which is the total top N records at the end of the window.
CREATE TABLE orders (
bidtime TIMESTAMP(3),
price DOUBLE,
item STRING,
supplier STRING,
WATERMARK FOR bidtime AS bidtime - INTERVAL '5' SECONDS
) WITH (
'connector' = 'faker',
'fields.bidtime.expression' = '#{date.past ''30'',''SECONDS''}',
'fields.price.expression' = '#{Number.randomDouble ''2'',''1'',''150''}',
'fields.item.expression' = '#{Commerce.productName}',
'fields.supplier.expression' = '#{regexify ''(Alice|Bob|Carol|Alex|Joe|James|Jane|Jack)''}',
'rows-per-second' = '100'
);
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
FROM (
SELECT window_start, window_end, supplier, SUM(price) as price, COUNT(*) as cnt
FROM TABLE(
TUMBLE(TABLE orders, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
GROUP BY window_start, window_end, supplier
)
) WHERE rownum <= 3;