This study investigates the thematic structure and narrative depth of The Guardian's editorial content published between January 1, 2024, and November 30, 2024. It applies two natural language processing (NLP) techniques, Latent Dirichlet Allocation (LDA) and BERT embeddings with clustering, to identify and analyze key themes. The analysis identifies 15 distinct topics across the editorial corpus, including governance, healthcare, climate change, political dynamics, international conflicts, and social justice. The LDA results highlight overarching themes such as fiscal policies, electoral strategies, and geopolitical conflicts, which points to The Guardian's emphasis on systemic issues. The BERT-based clustering complements these insights by uncovering more detailed narratives, such as patient experiences in healthcare, cultural reflections, and the complexities of global diplomacy.
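To make the LDA step more concrete, here is a minimal sketch of topic inference by collapsed Gibbs sampling, written with only the Python standard library. It is not the code used in this repository: the function names `fit_lda` and `top_words`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for illustration. The BERT embedding and clustering step is not covered here.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling over tokenized documents.

    Illustrative sketch only; alpha and beta are assumed hyperparameters.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                       # n(k)

    # Assign every token to a random topic to start.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return up to n highest-count words for each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical toy corpus; real editorials would be tokenized and cleaned first.
TOY_DOCS = [
    "budget tax fiscal spending tax budget".split(),
    "election vote party campaign vote election".split(),
    "tax spending budget fiscal deficit".split(),
    "party campaign election vote poll".split(),
]

doc_topic, topic_word = fit_lda(TOY_DOCS, num_topics=2)
for topic_id, words in enumerate(top_words(topic_word)):
    print(topic_id, words)
```

For a corpus like the one described above, `num_topics` would be set to 15 to match the reported topic count. In practice a library implementation would likely be preferable to this hand-written sampler, which is meant only to show what the model estimates: a topic mixture per document and a word distribution per topic.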
adityanarayan-rai/editorials_topic_modeling
About
This repo contains files from the Research Note assignment done for my NLP course.