ANTS: Knowledge Graph Abstractive Entity Summarization


This repository contains the implementation of ANTS, a research paper accepted at ESWC 2025 (Research Track). ANTS addresses the challenges of abstractive entity summarization in Knowledge Graphs (KGs) by generating optimal summaries that integrate existing triples with inferred (absent) triples. It leverages Knowledge Graph Embeddings (KGE) and Large Language Models (LLMs) to enhance summarization quality.

🚀 Table of Contents

  • About the Project
  • Installation
  • Repository Structure
  • Datasets
  • Usage
  • How to Cite
  • Contact

📌 About the Project

ANTS generates entity summaries in natural language from Knowledge Graphs by leveraging both KGE and LLM techniques. It addresses the problem of missing information by predicting absent triples and verbalizing them into readable summaries.


βš™οΈ Installation

To run the ANTS framework, you need the following:

python 3.7+
torch

1. Create and activate a Conda environment:

conda create --name ants python=3.7
conda activate ants

2. Download the project:

git clone https://github.com/dice-group/ANTS.git

# Navigate to the ANTS directory
cd ANTS

3. Install the required packages:

pip install torch
pip install -r requirements.txt

⚠️ Important Note: Ensure that all dependencies are correctly installed.


📂 Repository Structure

├── data
│   ├── ESBM-DBpedia
│   │   ├── ESSUM
│   │   │   ├── silver-standard-summaries
│   │   │   └── absent
│   │   ├── predictions
│   │   │   ├── ANTS
│   │   │   ├── baselines
│   │   │   ├── KGE
│   │   │   └── LLM
│   │   └── elist.txt
│   └── FACES
│       ├── ESSUM
│       │   ├── silver-standard-summaries
│       │   └── absent
│       ├── predictions
│       │   ├── ANTS
│       │   ├── baselines
│       │   ├── KGE
│       │   └── LLM
│       └── elist.txt
├── src
│   ├── evaluation-modules
│   ├── KGE-triples
│   ├── LLM-triples
│   ├── ranking-modules
│   └── verbalizing-modules
├── LICENSE
└── README.md

📊 Datasets

ESSUM (Silver-standard-summaries)

A silver-standard dataset combining entities from ESBM-DBpedia and FACES. For each entity, we extract sentences with mentioned entities from the first paragraph of its Wikipedia page. In our experiment, we created two subsets: (1) ESSUM-DBpedia: 110 entities from ESBM-DBpedia, and (2) ESSUM-FACES: 50 entities from FACES.

ESSUM-ABSENT

Derived by randomly removing 20% of triples from ESBM-DBpedia and FACES. These omitted triples serve as ground-truth absent triples to evaluate a model’s ability to infer missing facts.
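
The construction of such a held-out split can be sketched as follows. This is a minimal illustration of the idea, not the original dataset-generation code; the triple format and the 20% ratio follow the description above.

```python
import random

def split_absent_triples(triples, absent_ratio=0.2, seed=42):
    """Randomly hold out a fraction of an entity's triples as
    ground-truth 'absent' triples, keeping the rest as visible input."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    n_absent = round(len(shuffled) * absent_ratio)
    absent = shuffled[:n_absent]      # to be inferred by the model
    remaining = shuffled[n_absent:]   # visible input triples
    return remaining, absent

# Hypothetical triples for one entity
triples = [
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("dbr:Berlin", "dbo:populationTotal", "3644826"),
    ("dbr:Berlin", "dbo:areaCode", "030"),
    ("dbr:Berlin", "dbo:leader", "dbr:Kai_Wegner"),
    ("dbr:Berlin", "dbo:elevation", "34.0"),
]
remaining, absent = split_absent_triples(triples)
print(len(remaining), len(absent))  # 4 1
```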


πŸ› οΈ Usage

1️⃣ KGE-Triples

Setup LiteralE Framework

cd src/KGE-triples

# Clone the LiteralE repository
git clone https://github.com/SmartDataAnalytics/LiteralE.git

# Navigate to the LiteralE directory and download the DBpedia dataset
cd LiteralE/data
wget https://zenodo.org/records/10991461/files/dbpedia34k.tar.gz
tar -xvf dbpedia34k.tar.gz

# back to KGE-triples folder
cd ../..

# Update LiteralE modules
bash update-LiteralE-modules.sh

Run Missing Triple Predictions

# Navigate to the KGE-triples directory
cd src/KGE-triples

# Execute the script for missing triples prediction
python run_missing_triples_prediction.py --dataset dbpedia34k --system Conve_text --input_drop 0.2 --embedding_dim 100 --batch_size 1 --epochs 100 --lr 0.001 --process True

2️⃣ LLM-Triples

This component leverages a Large Language Model (LLM), such as GPT-4, for knowledge graph (KG) completion tasks, including triple classification, relation prediction, and the completion of missing triples. ANTS integrates the LLM-triples component to address the inherent limitations of KGE methods in inferring literal triples.

cd src/LLM-triples
# Execute the script for missing triples prediction
python run_missing_triples_prediction.py --model <gpt-model> --system gpt-4 --dataset ESSUM-DBpedia

# Execute the script for post-processing 
python post_processing.py --system gpt-4 --dataset ESSUM-DBpedia
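
The prompting idea behind this component can be sketched as below. This is an illustrative assumption, not the repository's actual prompt template; the function name and wording are hypothetical.

```python
def build_completion_prompt(entity, known_triples, n_predictions=5):
    """Build an LLM prompt asking for plausible missing triples for an
    entity, given its known triples (illustrative sketch only)."""
    lines = [f"({s}, {p}, {o})" for s, p, o in known_triples]
    return (
        f"The entity {entity} has the following known triples:\n"
        + "\n".join(lines)
        + f"\n\nPredict {n_predictions} plausible missing triples for "
        f"{entity} in the same (subject, predicate, object) format. "
        "Include literal values (e.g. dates, numbers) where appropriate."
    )

prompt = build_completion_prompt(
    "dbr:Albert_Einstein",
    [("dbr:Albert_Einstein", "dbo:field", "dbr:Physics")],
)
print(prompt)
```

The returned string would then be sent to the chosen GPT model, and the model's answer parsed back into triples in the post-processing step.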

3️⃣ Triple-Ranking And Entity Summary

Triple ranking uses the frequency of predicate occurrences within the knowledge graph (e.g., DBpedia): triples whose predicates occur most frequently are placed at the top of the list. Run the triples-ranking process, which includes both the ranking step and entity-summary generation.

# Navigate to ranking-modules directory
cd src/ranking-modules

# Run triple-ranking and entity summary
python triples-ranking.py  --kge_model conve_text --llm_model gpt-4 --combined_model conve_text_gpt-4 --dataset ESSUM-DBpedia --base_model ANTS
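
The frequency-based ranking described above can be sketched as follows. The predicate counts here are made-up illustrative numbers, not real DBpedia statistics.

```python
from collections import Counter

def rank_triples(triples, kg_predicate_counts):
    """Rank candidate triples by how frequently their predicate occurs
    in the reference KG (most frequent first)."""
    return sorted(triples,
                  key=lambda t: kg_predicate_counts.get(t[1], 0),
                  reverse=True)

# Hypothetical predicate frequencies from a KG such as DBpedia
kg_counts = Counter({"dbo:birthPlace": 1_500_000,
                     "dbo:occupation": 900_000,
                     "dbo:spouse": 120_000})

candidates = [
    ("dbr:Ada_Lovelace", "dbo:spouse", "dbr:William_King"),
    ("dbr:Ada_Lovelace", "dbo:birthPlace", "dbr:London"),
    ("dbr:Ada_Lovelace", "dbo:occupation", "dbr:Mathematician"),
]
for s, p, o in rank_triples(candidates, kg_counts):
    print(p)  # dbo:birthPlace, then dbo:occupation, then dbo:spouse
```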

4️⃣ Evaluation Protocol

Provides automatic evaluation of verbalized summaries using multiple NLP metrics.

Step 1: Verbalizing Entity Summary

Requirements:

  • Download the pre-trained model for verbalizing the abstractive summaries. Link to the verbalization-P2 model: https://zenodo.org/records/10984714
  • Move the pre-trained model to the verbalizing-modules directory.
# Navigate to the verbalizing-modules directory
cd src/verbalizing-modules

# Execute the script for verbalizing entity summary
python verbalization-process.py --dataset ESSUM-DBpedia --system conve_text_gpt-4 --base_model ANTS --semantic_constraints True
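
Before a seq2seq model can verbalize a summary, the ranked triples are typically linearized into a single input string. The `<H>/<R>/<T>` tag scheme below is an assumption for illustration; the actual verbalization-P2 model may expect a different input format.

```python
def clean(term):
    """Strip common URI prefixes and underscores for readability."""
    return term.split(":")[-1].replace("_", " ")

def linearize_triples(triples):
    """Linearize (subject, predicate, object) triples into one input
    string for a seq2seq verbalization model (format is assumed)."""
    parts = []
    for s, p, o in triples:
        parts += ["<H>", clean(s), "<R>", clean(p), "<T>", clean(o)]
    return " ".join(parts)

print(linearize_triples([("dbr:Berlin", "dbo:country", "dbr:Germany")]))
# <H> Berlin <R> country <T> Germany
```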

Step 2: Convert Triples into Evaluation Format

# Navigate to evaluation-modules directory
cd src/evaluation-modules

# Run converting verbalization results to evaluation format
python converting-to-evaluation-format.py --system "conve_text_gpt-4" --dataset "ESSUM-DBpedia" --base_model "ANTS" --semantic_constraints

Step 3: Evaluate Experiment Results using BLEU, METEOR, ChrF++, BLEURT

# Make sure you are still in the src/evaluation-modules directory
cd GenerationEval

# Execute the script to perform automatic evaluation
python eval.py -R ../../data/ESBM-DBpedia/predictions/ANTS/semantic-constraints/conve_text_gpt-4/evaluation/refs.txt -H ../../data/ESBM-DBpedia/predictions/ANTS/semantic-constraints/conve_text_gpt-4/evaluation/hyp.txt -lng en -nr 1 -m bleu,meteor,chrf++,ter,bert,bleurt
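
To give an intuition for what these metrics measure, the core of BLEU is modified n-gram precision: the fraction of hypothesis n-grams that also appear in the reference, with counts clipped. This is a simplified sketch for illustration, not the scoring code used by GenerationEval.

```python
from collections import Counter

def ngram_precision(hypothesis, reference, n=2):
    """Modified n-gram precision (the core of BLEU): the clipped fraction
    of hypothesis n-grams that also appear in the reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    hyp, ref = ngrams(hypothesis.split(), n), ngrams(reference.split(), n)
    if not hyp:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return overlap / sum(hyp.values())

hyp = "Berlin is the capital of Germany"
ref = "Berlin is the capital city of Germany"
print(round(ngram_precision(hyp, ref, n=2), 2))  # 0.8
```

The full BLEU score combines such precisions for n = 1..4 with a brevity penalty; METEOR, ChrF++, and BLEURT capture synonymy, character-level overlap, and learned semantic similarity respectively.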

How to Cite

@inproceedings{ANTS2025,
  author    = {Firmansyah, Asep Fajar and Zahera, Hamada and Sherif, Mohamed Ahmed and Moussallem, Diego and Ngonga Ngomo, Axel-Cyrille},
  booktitle = {ESWC 2025},
  title     = {ANTS: Abstractive Entity Summarization in Knowledge Graphs},
  year      = {2025}
}

Contact

If you have any questions or feedback, feel free to contact us at asep.fajar.firmansyah@upb.de
