Skip to content

A solution for analyzing historical Eurovision contest data using PostgreSQL, Neo4j, and GraphQL.

License

Notifications You must be signed in to change notification settings

jekrch/eurovision-analytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

63 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Eurovision Analytics 🌐

This project sets up Neo4J and PostgreSQL databases containing participant and voting data from all Eurovision contests going back to 1956. You can use these and the provided services to answer interesting, complex questions about the contest over the years. See Use Cases for some cool examples.

Flyway is used for schema migrations and to import data from provided CSV files. These files are generated by the Eurovision Scraper companion project, which collects Eurovision participant and voting data from Wikipedia.

The database is containerized using Docker Compose for easy setup and deployment.

Getting Started

  1. Clone the repository:

    git clone https://github.com/jekrch/eurovision-analytics.git
    cd eurovision-analytics
    
  2. Source data is included via the following CSV files, contained in the /data directory:

    • eurovision_vote_data.csv
    • eurovision_participant_data.csv

    You can either use the provided default source files or generate new data using my Eurovision Scraper project.

  3. Start the containers:

    docker-compose up -d
    

    This command will start both databases, and run the Flyway migrations, which generate the db schema and import data from CSV files. 3 interactive services will also be started which you can use to start exploring the data.

  4. Connect to interactive data services

    Service URL
    Frontend Home http://localhost:3000
    GraphQL http://localhost:4000
    Neo4J http://localhost:7474
  5. Connect to the PostgreSQL database:

    You can connect using a database client with the following details:

    • Host: localhost
    • Port: 5432
    • Database: eurovision
    • Username: postgres
    • Password: postgres

    The staging tables will be available in the eurovision database, and you can query them to access the imported data.

Neo4j graph data modeling πŸ“Š

Explore and analyze ESC contest data using graph-based queries and visualizations.

The Neo4j database is automatically set up and populated with data via docker-compose up -d. The provided docker-compose.yml file includes the necessary configuration to start a Neo4j container and execute the data migration scripts.

image

Access the Neo4j Browser:

  1. Open a web browser and visit http://localhost:7474.
  2. Click Connect. No credentials required

Graph-Based Analytics

With the Eurovision data loaded into Neo4j, you can perform various graph-based analyses and visualizations. Here are a few examples:

The top ten most prolific Eurovision songwriters

MATCH (sw:Songwriter)<-[:WRITTEN_BY]-(s:Song)
WITH sw, collect(s) as songs
ORDER BY size(songs) DESC
RETURN sw, songs
LIMIT 10
image

The top five winningest countries

MATCH (c:Country)<-[:REPRESENTS]-(s:Song)-[:PLACED]->(:FinalPlace {place: 1})
MATCH (s)-[:HAS_SONG]-(y:Year)
WITH c, collect({song: s, year: y}) AS songs, count(s) AS count
ORDER BY count DESC
LIMIT 5
RETURN c, songs
image

Countries with more than one song receiving Nil Poi in the finals

MATCH (c:Country)-[:REPRESENTS]-(s:Song)-[:RECEIVED]->(n:ZeroPoints), (s)-[:HAS_SONG]-(y:Year)
WITH c, s, y
ORDER BY c.name, y.year
WHERE y.year <> 1956 // points were not awared in 1956
WITH c, COLLECT(s) AS songs, COLLECT(y) AS years
WHERE SIZE(songs) > 1
RETURN c, songs, years
image

The top 3 running orders with the most songs winning the finals

MATCH (ro:FinalRunningOrder)<-[:RUNNING_ORDER]-(s:Song)-[:PLACED]->(fp:FinalPlace {place: 1})
MATCH (s)-[:PERFORMED_BY]->(a:Artist)
MATCH (s)-[:HAS_SONG]-(y:Year)
WITH ro, collect(s) as songs, collect(a) as artists, collect(y) as years
ORDER BY size(songs) DESC
LIMIT 3
UNWIND songs as s
UNWIND artists as a
UNWIND years as y
RETURN ro, s, a, y
image

PostgreSQL Use Cases πŸ”Ž

In addition to Neo4J, this project also provides a PostgreSQL database. You can answer all kinds of interesting, complex questions with normalized data. Here are several examples to get you started.

1. Which Eurovision songs were written in an imaginary language?

SELECT year, country, broadcaster, artist, song, language
FROM song_view
WHERE language = 'Imaginary'
year country broadcaster artist song language
2003 Belgium RTBF Urban Trad Sanomi Imaginary
2008 Belgium VRT Ishtar Julissi Imaginary
... ... ... ... ... ...

2. Which artists competed in the most contests?

SELECT
  COUNT(*) AS song_count,
  artist,
  STRING_AGG(CAST(year AS TEXT), ', ' ORDER BY year ASC) as contest_years,
  STRING_AGG(DISTINCT country, ', ') as country
FROM song_view
GROUP BY artist
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
song_count artist contest_years country
4 Peter, Sue and Marc 1971, 1976, 1979, 1981 Switzerland
4 Valentina Monetta 2012, 2013, 2014, 2017 San Marino
4 Fud Leclerc 1956, 1958, 1960, 1962 Belgium
3 Hot Eyes 1984, 1985, 1988 Denmark
3 Lys Assia 1956, 1957, 1958 Switzerland
... ... ... ...

3. Which composers had the most songs in Eurovision?

SELECT
  COUNT(DISTINCT sv.song_id) AS song_count,
  c.name,
  STRING_AGG(CAST(sv.year AS TEXT), ', ' ORDER BY year ASC) as contest_years,
  STRING_AGG(DISTINCT sv.country, ', ') as country
FROM song_view sv
JOIN song_composer sc ON sc.song_id = sv.song_id
JOIN composer c ON c.id = sc.composer_id
GROUP BY c.name
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
song_count name contest_years country
24 Ralph Siegel 1974, 1976, 1979, 1980, 1980, 1981, 1982, 1985, 1987, 1988, 1990, 1992, 1994, 1997, 1999, 2002, 2003, 2006, 2009, 2012, 2013, 2014, 2015, 2017 Germany, Luxembourg, Montenegro, San Marino, Switzerland
16 Bernd Meinunger 1979, 1980, 1980, 1981, 1982, 1985, 1987, 1988, 1992, 1994, 1997, 1999, 2002, 2003, 2006, 2009 Germany, Luxembourg, Montenegro, Switzerland
12 Thomas G:son 2001, 2006, 2007, 2010, 2012, 2012, 2013, 2015, 2015, 2016, 2016, 2018 Cyprus, Denmark, Georgia, Malta, Spain, Sweden
9 Dimitris Kontopoulos 2009, 2013, 2014, 2016, 2017, 2018, 2019, 2021, 2021 Azerbaijan, Greece, Moldova, Russia
... ... ... ...

GraphQL API πŸš€

This project also exposes the Neo4j Eurovision data through a GraphQL API, allowing you to query and retrieve data using GraphQL queries.

Accessing the GraphQL API

  1. The GraphQL API is automatically set up and made available using Docker Compose. docker-compose up -d.
  2. Use the API at http://localhost:4000

Example GraphQL Query

Here's an example GraphQL query that retrieves all songs from Croatia that have a final place:

query {
  songs(where: { country: { name: "Croatia" }, finalPlace: { place_NOT: null } }) {
    id
    name
    year {
      year
    }
    artist {
      name
    }
    finalPlace {
      place
    }
    totalPoints
  }
}
image

About

A solution for analyzing historical Eurovision contest data using PostgreSQL, Neo4j, and GraphQL.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published