Skip to content

Latest commit

 

History

History
82 lines (61 loc) · 3.03 KB

File metadata and controls

82 lines (61 loc) · 3.03 KB

02 Building the Union of Multiple Streams

💡 This example will show how you can use the set operation UNION ALL to combine several streams of data.

See our documentation for a full list of fantastic set operations Apache Flink supports.

The Sources

The examples assumes you are building an application that is tracking visits 🦊 on foreign planets 🌰. There are three sources of visits. The universe of Rick and Morty, the very real world of NASA and such, and the not so real world of Hitchhikers Guide To The Galaxy.

All three tables are unbounded and backed by the faker connector.

All sources of tracked visits have the location and visit_time in common. Some have visitors, some have spacecrafts and one has both.

CREATE TEMPORARY TABLE rickandmorty_visits ( 
    visitor STRING,
    location STRING, 
    visit_time TIMESTAMP(3)
) WITH (
  'connector' = 'faker', 
  'fields.visitor.expression' = '#{RickAndMorty.character}',
  'fields.location.expression' =  '#{RickAndMorty.location}',
  'fields.visit_time.expression' =  '#{date.past ''15'',''5'',''SECONDS''}'
);

CREATE TEMPORARY TABLE spaceagency_visits ( 
    spacecraft STRING,
    location STRING, 
    visit_time TIMESTAMP(3)
) WITH (
  'connector' = 'faker', 
  'fields.spacecraft.expression' = '#{Space.nasaSpaceCraft}',
  'fields.location.expression' =  '#{Space.star}',
  'fields.visit_time.expression' =  '#{date.past ''15'',''5'',''SECONDS''}'
);

CREATE TEMPORARY TABLE hitchhiker_visits ( 
    visitor STRING,
    starship STRING,
    location STRING, 
    visit_time TIMESTAMP(3)
) WITH (
  'connector' = 'faker', 
  'fields.visitor.expression' = '#{HitchhikersGuideToTheGalaxy.character}',
  'fields.starship.expression' = '#{HitchhikersGuideToTheGalaxy.starship}',
  'fields.location.expression' =  '#{HitchhikersGuideToTheGalaxy.location}',
  'fields.visit_time.expression' =  '#{date.past ''15'',''5'',''SECONDS''}'
);

The Query

We are using UNION ALL because it doesn't try to combine equivalent rows like UNION would do. That is also the reason why UNION can only be used with bounded streams.

SELECT visitor, '' AS spacecraft, location, visit_time FROM rickandmorty_visits
UNION ALL
SELECT '' AS visitor, spacecraft, location, visit_time FROM spaceagency_visits
UNION ALL
SELECT visitor, starship AS spacecraft, location, visit_time FROM hitchhiker_visits;

👽 As we are using CREATE TEMPORARY TABLE, you need to run both the CREATE TABLE and the SELECT statements together.

The Beauty in VVP

screeny

The result is a combined stream of people visiting a location in one of those fantastic universes. We are sure you'll understand why this is one of our favorite queries.

🐦 Let us know about your favorite streaming SQL Query.