This project sets up Neo4J and PostgreSQL databases containing participant and voting data from all Eurovision contests going back to 1956. You can use these and the provided services to answer interesting, complex questions about the contest over the years. See Use Cases for some cool examples.
Flyway is used for schema migrations and to import data from provided CSV files. These files are generated by the Eurovision Scraper companion project, which collects Eurovision participant and voting data from Wikipedia.
The database is containerized using Docker Compose for easy setup and deployment.
-
Clone the repository:
git clone https://github.com/jekrch/eurovision-analytics.git cd eurovision-analytics
-
Source data is included via the following CSV files, contained in the
/data
directory:eurovision_vote_data.csv
eurovision_participant_data.csv
You can either use the provided default source files or generate new data using my Eurovision Scraper project.
-
Start the containers:
docker-compose up -d
This command will start both databases, and run the Flyway migrations, which generate the db schema and import data from CSV files. 3 interactive services will also be started which you can use to start exploring the data.
-
Connect to interactive data services
Service URL Frontend Home http://localhost:3000 GraphQL http://localhost:4000 Neo4J http://localhost:7474 -
Connect to the PostgreSQL database:
You can connect using a database client with the following details:
- Host:
localhost
- Port:
5432
- Database:
eurovision
- Username:
postgres
- Password:
postgres
The staging tables will be available in the
eurovision
database, and you can query them to access the imported data. - Host:
Explore and analyze ESC contest data using graph-based queries and visualizations.
The Neo4j database is automatically set up and populated with data via docker-compose up -d
. The provided docker-compose.yml
file includes the necessary configuration to start a Neo4j container and execute the data migration scripts.
- Open a web browser and visit
http://localhost:7474
. - Click
Connect
. No credentials required
With the Eurovision data loaded into Neo4j, you can perform various graph-based analyses and visualizations. Here are a few examples:
MATCH (sw:Songwriter)<-[:WRITTEN_BY]-(s:Song)
WITH sw, collect(s) as songs
ORDER BY size(songs) DESC
RETURN sw, songs
LIMIT 10
MATCH (c:Country)<-[:REPRESENTS]-(s:Song)-[:PLACED]->(:FinalPlace {place: 1})
MATCH (s)-[:HAS_SONG]-(y:Year)
WITH c, collect({song: s, year: y}) AS songs, count(s) AS count
ORDER BY count DESC
LIMIT 5
RETURN c, songs
MATCH (c:Country)-[:REPRESENTS]-(s:Song)-[:RECEIVED]->(n:ZeroPoints), (s)-[:HAS_SONG]-(y:Year)
WITH c, s, y
ORDER BY c.name, y.year
WHERE y.year <> 1956 // points were not awared in 1956
WITH c, COLLECT(s) AS songs, COLLECT(y) AS years
WHERE SIZE(songs) > 1
RETURN c, songs, years
MATCH (ro:FinalRunningOrder)<-[:RUNNING_ORDER]-(s:Song)-[:PLACED]->(fp:FinalPlace {place: 1})
MATCH (s)-[:PERFORMED_BY]->(a:Artist)
MATCH (s)-[:HAS_SONG]-(y:Year)
WITH ro, collect(s) as songs, collect(a) as artists, collect(y) as years
ORDER BY size(songs) DESC
LIMIT 3
UNWIND songs as s
UNWIND artists as a
UNWIND years as y
RETURN ro, s, a, y
In addition to Neo4J, this project also provides a PostgreSQL database. You can answer all kinds of interesting, complex questions with normalized data. Here are several examples to get you started.
SELECT year, country, broadcaster, artist, song, language
FROM song_view
WHERE language = 'Imaginary'
year | country | broadcaster | artist | song | language |
---|---|---|---|---|---|
2003 | Belgium | RTBF | Urban Trad | Sanomi | Imaginary |
2008 | Belgium | VRT | Ishtar | Julissi | Imaginary |
... | ... | ... | ... | ... | ... |
SELECT
COUNT(*) AS song_count,
artist,
STRING_AGG(CAST(year AS TEXT), ', ' ORDER BY year ASC) as contest_years,
STRING_AGG(DISTINCT country, ', ') as country
FROM song_view
GROUP BY artist
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
song_count | artist | contest_years | country |
---|---|---|---|
4 | Peter, Sue and Marc | 1971, 1976, 1979, 1981 | Switzerland |
4 | Valentina Monetta | 2012, 2013, 2014, 2017 | San Marino |
4 | Fud Leclerc | 1956, 1958, 1960, 1962 | Belgium |
3 | Hot Eyes | 1984, 1985, 1988 | Denmark |
3 | Lys Assia | 1956, 1957, 1958 | Switzerland |
... | ... | ... | ... |
SELECT
COUNT(DISTINCT sv.song_id) AS song_count,
c.name,
STRING_AGG(CAST(sv.year AS TEXT), ', ' ORDER BY year ASC) as contest_years,
STRING_AGG(DISTINCT sv.country, ', ') as country
FROM song_view sv
JOIN song_composer sc ON sc.song_id = sv.song_id
JOIN composer c ON c.id = sc.composer_id
GROUP BY c.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
song_count | name | contest_years | country |
---|---|---|---|
24 | Ralph Siegel | 1974, 1976, 1979, 1980, 1980, 1981, 1982, 1985, 1987, 1988, 1990, 1992, 1994, 1997, 1999, 2002, 2003, 2006, 2009, 2012, 2013, 2014, 2015, 2017 | Germany, Luxembourg, Montenegro, San Marino, Switzerland |
16 | Bernd Meinunger | 1979, 1980, 1980, 1981, 1982, 1985, 1987, 1988, 1992, 1994, 1997, 1999, 2002, 2003, 2006, 2009 | Germany, Luxembourg, Montenegro, Switzerland |
12 | Thomas G:son | 2001, 2006, 2007, 2010, 2012, 2012, 2013, 2015, 2015, 2016, 2016, 2018 | Cyprus, Denmark, Georgia, Malta, Spain, Sweden |
9 | Dimitris Kontopoulos | 2009, 2013, 2014, 2016, 2017, 2018, 2019, 2021, 2021 | Azerbaijan, Greece, Moldova, Russia |
... | ... | ... | ... |
This project also exposes the Neo4j Eurovision data through a GraphQL API, allowing you to query and retrieve data using GraphQL queries.
- The GraphQL API is automatically set up and made available using Docker Compose.
docker-compose up -d
. - Use the API at
http://localhost:4000
Here's an example GraphQL query that retrieves all songs from Croatia that have a final place:
query {
songs(where: { country: { name: "Croatia" }, finalPlace: { place_NOT: null } }) {
id
name
year {
year
}
artist {
name
}
finalPlace {
place
}
totalPoints
}
}