Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems
This repository provides an implementation of Poison-RAG, a framework designed to explore adversarial data poisoning attacks on Retrieval-Augmented Generation (RAG)-based recommender systems. The objective of these attacks is to manipulate the textual metadata of items—specifically tags and descriptions—to promote certain items (e.g., long-tail items) and demote others (e.g., popular items) within a recommendation pipeline that leverages Large Language Models (LLMs).
RAG-based recommender systems combine three essential stages to generate final recommendations:
- Retrieval: Relevant items are selected from a catalog based on user profiles and textual item metadata (e.g., tags, descriptions). LLM-based embeddings help bridge the semantic gap between user interests and item attributes.
- Augmentation: The retrieved items are enriched and re-ranked by combining user context with additional knowledge (e.g., item textual attributes and user’s long-term preferences).
- Generation: The LLM generates the final recommendations, grounded in the retrieved and augmented data, to present coherent and accurate suggestions to the user.
By modifying item metadata (e.g., adding adversarial tags), an attacker can subtly influence the retrieval process and, consequently, the downstream augmentation and generation stages.
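The effect of tag poisoning on retrieval can be illustrated with a minimal sketch. It assumes a toy bag-of-words vector as a stand-in for the LLM-based embeddings used in the notebooks, and the catalog, item names, and tags are hypothetical; it is not the repository's implementation.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stand-in for an LLM encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(user_profile, catalog):
    """Rank items by similarity between the user profile and item tags."""
    scores = {item: cosine(user_profile, embed(tags)) for item, tags in catalog.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

# Hypothetical user profile and catalog (item id -> space-separated tags).
user_profile = embed("space adventure robots sci-fi")
catalog = {
    "popular_item": "space adventure sci-fi epic",
    "long_tail_item": "quiet drama family",
}
clean_ranking, clean_scores = rank(user_profile, catalog)

# Poisoning: the attacker appends adversarial tags to the long-tail item.
poisoned = dict(catalog)
poisoned["long_tail_item"] += " space adventure robots sci-fi"
poisoned_ranking, poisoned_scores = rank(user_profile, poisoned)

print(clean_ranking, poisoned_ranking)
```

In this toy setup the long-tail item's similarity rises from 0 to about 0.76, just above the popular item's 0.75, so it moves to the top of the retrieved list even though its original tags are unchanged.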
The figure below illustrates the RAG-based recommender system pipeline. It highlights where adversarial data poisoning (in red) can be introduced to manipulate item exposure. By altering textual descriptions and tags at the retrieval stage, the attacker aims to shift item popularity or introduce specific biases that affect the final recommendations.
Figure: The three-stage RAG pipeline (Retrieval, Augmentation, Generation). The adversarial poisoning occurs in the Retrieval stage by modifying textual metadata, thereby influencing the ranking of items and ultimately affecting final recommendations.
- Textual Metadata Manipulation: By altering tags and descriptions, attackers can change how items are semantically represented. This influences the embeddings generated by LLMs and can push or pull items into or out of user recommendation sets.
- Local vs. Global Attacks:
  - Local Attacks: Introduce targeted changes to individual items, selecting contextually relevant tags that can subtly shift an item’s popularity class.
  - Global Attacks: Apply a uniform set of modifications across a broader set of items. While simpler, these can be less effective and sometimes even counterproductive.
- Defense Mechanisms: To mitigate such attacks, one can use:
  - Data Augmentation: Enriching item metadata (tags, descriptions) using LLMs makes the system more robust against subtle manipulations.
  - Auto-Tagging and NLP-based Enrichment: Automatically generated tags from rich item descriptions can reduce cold-start vulnerabilities and limit the impact of adversarially inserted tags.
The main findings are:
- Attacks are more effective at demoting popular items than promoting long-tail ones.
- Local, contextually aware modifications outperform global, uniform changes.
- Data augmentation and auto-tagging strategies can improve the system’s resilience, making it harder for attackers to significantly alter recommendation outcomes.
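The difference between the local and global attack strategies described above can be sketched as follows. This is an illustrative outline, not the code in the notebooks: the function names, the `relatedness` lookup (which might come from embedding similarity in practice), the `budget` parameter, and all tags are assumptions made for the example.

```python
def global_attack(item_tags, adversarial_tags):
    """Global: append the same fixed tag set to every targeted item."""
    return {
        item: tags + [t for t in adversarial_tags if t not in tags]
        for item, tags in item_tags.items()
    }

def local_attack(item_tags, candidate_tags, relatedness, budget=1):
    """Local: per item, add the `budget` candidate tags most related to its own tags.

    `relatedness` is a hypothetical lookup {(candidate_tag, existing_tag): score}.
    """
    poisoned = {}
    for item, tags in item_tags.items():
        ranked = sorted(
            (t for t in candidate_tags if t not in tags),
            key=lambda t: max((relatedness.get((t, own), 0.0) for own in tags), default=0.0),
            reverse=True,
        )
        poisoned[item] = tags + ranked[:budget]
    return poisoned

# Hypothetical long-tail items the attacker wants to promote.
item_tags = {
    "indie_film": ["drama", "slow-burn"],
    "nature_doc": ["wildlife", "ocean"],
}
candidate_tags = ["blockbuster", "oscar-winner", "family", "adventure"]
relatedness = {
    ("oscar-winner", "drama"): 0.8,
    ("blockbuster", "drama"): 0.3,
    ("adventure", "wildlife"): 0.7,
    ("family", "ocean"): 0.6,
}

global_result = global_attack(item_tags, ["blockbuster"])
local_result = local_attack(item_tags, candidate_tags, relatedness, budget=1)
print(global_result)
print(local_result)
```

The global variant gives every item the same tag regardless of fit, while the local variant picks a different, contextually plausible tag for each item, which is the distinction the findings above refer to.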
This repository includes multiple Jupyter notebooks, each focusing on different stages of the Poison-RAG pipeline:
- ECIR2025-advAttack_Code1_MovieLens_DataAugmentation&AutoTagging.ipynb:
  - Generates enriched textual metadata (descriptions, tags) for items.
  - Helps address cold-start issues and increases robustness of the recommendation system.
- ECIR2025-advAttack_Code2_MovieLens_ExtEmbeddings.ipynb:
  - Extracts embeddings from item metadata using LLM-based encoders.
  - Prepares semantic representations for downstream retrieval and recommendation tasks.
- ECIR2025-advAttack_Code3_MovieLens_RAG_attacks.ipynb:
  - Implements adversarial tag manipulation strategies (local and global) to poison the data.
  - Evaluates how these modifications influence item popularity and exposure metrics.
- ECIR2025-advAttack_Code4_MovieLens_RAG_phase1-Retrieval.ipynb:
  - Performs initial candidate retrieval from the full item catalog using LLM-based profiles.
  - Sets the stage for subsequent augmentation and re-ranking steps.
- ECIR2025-advAttack_Code5_MovieLens_RAG_phase2-Reranking.ipynb:
  - Enhances and re-ranks the retrieved candidates by integrating user context and augmented metadata.
  - Produces the final recommendation lists after applying Poison-RAG manipulations.
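As a rough illustration of the retrieval phase listed above, the sketch below builds a user profile and retrieves top-k candidates. It assumes the profile is the mean of the embeddings of items the user has interacted with, which is one common choice and not necessarily what the notebooks do; the item ids and 3-dimensional vectors are made up for the example.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(user_history, item_embeddings, k=2):
    """Return the k unseen items whose embeddings are closest to the user profile."""
    # Assumption: the user profile is the mean embedding of previously seen items.
    profile = mean_vector([item_embeddings[i] for i in user_history])
    candidates = [i for i in item_embeddings if i not in user_history]
    return sorted(
        candidates,
        key=lambda i: cosine(profile, item_embeddings[i]),
        reverse=True,
    )[:k]

# Hypothetical precomputed item embeddings (real LLM embeddings are much larger).
item_embeddings = {
    "m1": [1.0, 0.0, 0.0],
    "m2": [0.9, 0.1, 0.0],
    "m3": [0.0, 1.0, 0.0],
    "m4": [0.8, 0.0, 0.2],
    "m5": [0.0, 0.0, 1.0],
}
top = retrieve(["m1", "m2"], item_embeddings, k=2)
print(top)
```

Because poisoned metadata changes an item's embedding before this step runs, any shift in similarity carries through to the candidate set passed on to re-ranking.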
To reproduce the experiments or implement the Poison-RAG approach:
- Prepare the Dataset: Start with a dataset containing user-item interactions, item metadata, and any available textual attributes.
- Enrich Metadata: Use LLMs (e.g., GPT-3.5-turbo) to generate detailed item descriptions, and apply auto-tagging to create consistent, semantically meaningful tags.
- Implement the RAG Pipeline: Extract LLM embeddings for items and construct user profiles, run retrieval to produce candidate sets, and then apply augmentation and generation steps.
- Apply Attacks: Modify selected tags according to the local or global strategies. Evaluate how these changes affect recommendation exposure, popularity shifts, and relevance metrics.
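One way to evaluate the exposure shift mentioned in the last step is to compare how recommendation slots are split between popular and long-tail items before and after an attack. The sketch below is a hedged example: the 20% head cutoff, the function names, and the interaction counts and recommendation lists are all assumptions for illustration, not values from the paper.

```python
from collections import Counter

def popularity_classes(interaction_counts, head_fraction=0.2):
    """Label the most-interacted `head_fraction` of items 'popular', the rest 'long_tail'.

    Assumption: a simple rank-based cutoff; the actual class definition may differ.
    """
    ranked = sorted(interaction_counts, key=interaction_counts.get, reverse=True)
    n_head = max(1, round(len(ranked) * head_fraction))
    return {
        item: ("popular" if idx < n_head else "long_tail")
        for idx, item in enumerate(ranked)
    }

def exposure_by_class(rec_lists, classes):
    """Share of all recommendation slots occupied by each popularity class."""
    counts = Counter(classes[item] for recs in rec_lists.values() for item in recs)
    total = sum(counts.values())
    if total == 0:
        return {"popular": 0.0, "long_tail": 0.0}
    return {c: counts.get(c, 0) / total for c in ("popular", "long_tail")}

# Hypothetical interaction counts and per-user recommendation lists.
interaction_counts = {"a": 500, "b": 40, "c": 30, "d": 10, "e": 5}
classes = popularity_classes(interaction_counts)

recs_before = {"u1": ["a", "b"], "u2": ["a", "c"]}
recs_after = {"u1": ["b", "d"], "u2": ["a", "d"]}

before = exposure_by_class(recs_before, classes)
after = exposure_by_class(recs_after, classes)
print(before, after)
```

In this toy data the popular item's share of slots drops from 0.5 to 0.25 after the attack, which is the kind of demotion effect the evaluation is meant to surface.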
If you use or build upon this work, please cite the associated paper: