Skip to content

Latest commit

 

History

History
112 lines (80 loc) · 3.8 KB

week-7.adoc

File metadata and controls

112 lines (80 loc) · 3.8 KB

Week 7 - Jungle of German Election Landscape

Graph-Analysis Wahl-O-Mat Bundestagswahl 2021

For every German Election the "Wahl-O-Mat" offers an interactive tool to compare your views on a number of topics/theses with the polical parties up for election.

You can try the interactive comparison at: https://www.wahl-o-mat.de

Source Data

The data can be downloaded from the site, the zip-file contains an excel sheet, that we can export as a CSV.

Our analysis is only for educational and scientific use.

Die Bundeszentrale für politische Bildung ist Urheber des nachfolgend veröffentlichten "Wahl-O-Mat-Datensatzes". Die Veröffentlichung des Datensatzes dient lediglich dem Zugang zu den in ihm enthaltenen Informationen. Jede Nutzung des Datensatzes, egal welcher Art, wird untersagt. Die Schranken des Urheberrechts durch gesetzlich erlaubte Nutzung bleiben hiervon unberührt.

Eine Ausnahme gilt nur für die Analyse des Datensatzes zu wissenschaftlichen oder journalistischen Zwecken sowie für die Veröffentlichung der Ergebnisse dieser Analyse. Dabei muss jederzeit klar erkennbar sein, dass die Bundeszentrale für politische Bildung nicht Urheber dieser Analyse ist.

Graph Database

We’re loading and exploring our data in Neo4j Aura Free.

Data Model

The data model is pretty straightforward, we create a node for Party and Topic each and store the stance of the party as a weight on POSITION releationhip, where

  • disagree is 0.0

  • neutral is 0.5

  • agree is 1.0

election party topic

Data Import

The CSV can be loaded directly into the Database using LOAD CSV and this script:

Create Indexes
create index on :Party(name);
create index on :Topic(name);
create constraint if not exists on (p:Party) assert p.id is unique;
create constraint if not exists on (t:Topic) assert t.id is unique;
Load Data
LOAD CSV WITH HEADERS FROM
"https://github.com/neo4j-examples/discoveraurafree/raw/main/data/wom-btw-2021.csv"
AS row

MERGE  (p:Party {id:toInteger(row.`Partei: Nr.`)})
ON CREATE SET p.text = row.`Partei: Name`, p.name = row.`Partei: Kurzbezeichnung`
MERGE (t:Topic {id:toInteger(row.`These: Nr.`)})
ON CREATE SET t.name = row.`These: Titel`, t.text = row.`These: These`

MERGE (p)-[pos:POSITION]->(t)
ON CREATE SET pos.text = row.`Position: Begründung`,
pos.weight =
CASE row.`Position: Position`
  WHEN "stimme nicht zu" THEN 0.0
  WHEN "stimme zu" THEN 1.0
  WHEN "neutral" THEN 0.5
END

RETURN count(*);

Exploration

We can just visualize the Data by querying with Neo4j Browser or Neo4j Bloom (where we can style the results based on attributes).

top 5 parties, same positions
MATCH (p:Party)-[r:POSITION]->(t)
WHERE p.id <= 6 AND t.id <10 AND r.weight = 1
RETURN *;

We can also compute the similarity (distance) between parties, similar to ratings of movies by the sum or avg of weight distances.

similarity (distance)
MATCH (p1:Party)-[r1:POSITION]->(t:Topic)<-[r2:POSITION]-(p2:Party)
WHERE id(p1)>id(p2)
RETURN p1.name,p2.name, sum(abs(r1.weight-r2.weight)) AS sum,
avg(abs(r1.weight-r2.weight)) AS avg
ORDER BY sum ASC;

We can turn that similarity (distance) when it’s below a certain threshhold into a relationship:

similarity relationship
MATCH (p1:Party)-[r1:POSITION]->(t:Topic)<-[r2:POSITION]-(p2:Party)
WHERE id(p1)>id(p2)
WITH p1, p2, avg(abs(r1.weight-r2.weight)) AS dist
WHERE dist < 0.3
MERGE (p1)-[s:SIMILAR]-(p2) SET s.weight = 1-dist;