MarwahAlaofi/SIGIR-23-SRP-UQV100-GPT-Query-Variants

SIGIR'23 Short Paper Data for Reproduction

This repository contains the query variants used to generate the results presented in the SIGIR'23 short paper: Can Generative LLMs Create Query Variants for Test Collections? An Exploratory Study.

🔖 Paper Overview

The paper explores the utility of an LLM (GPT 3.5) for automatically generating queries and query variants from a description of an information need. Given a set of 100 information needs described as backstories in the UQV100 test collection, we explore how similar the queries generated by GPT 3.5 are to those generated by humans. We quantify the similarity using different metrics and examine how each set would contribute to document pooling when building test collections. Our results show potential in using LLMs to generate query variants. While the generated variants may not fully capture the wide variety of human-generated variants, they retrieve similar sets of relevant documents, reaching up to 71.1% overlap at depth 100 during pooling, which could make them a cost-effective option for constructing test collections.
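To make the pooling comparison concrete, here is a minimal sketch of how overlap between two document pools at a given depth could be computed. The function names, the toy rankings, and the overlap measure itself (the fraction of the human pool that also appears in the GPT pool) are assumptions for illustration; the paper may define overlap differently, for example by restricting it to relevant documents.

```python
def pool(rankings, depth):
    """Union of the top-`depth` document IDs over all rankings.

    `rankings` holds one ranked list of document IDs per query variant.
    """
    pooled = set()
    for ranked_docs in rankings:
        pooled.update(ranked_docs[:depth])
    return pooled


def pool_overlap(human_rankings, gpt_rankings, depth):
    """Fraction of the human pool also found in the GPT pool (assumed measure)."""
    human_pool = pool(human_rankings, depth)
    gpt_pool = pool(gpt_rankings, depth)
    if not human_pool:
        return 0.0
    return len(human_pool & gpt_pool) / len(human_pool)


# Toy rankings, not data from the paper.
human = [["d1", "d2", "d3"], ["d2", "d4"]]
gpt = [["d2", "d3", "d5"]]
overlap = pool_overlap(human, gpt, depth=2)
```

In this toy case the human pool at depth 2 is {d1, d2, d4} and the GPT pool is {d2, d3}, so one of the three human-pooled documents is shared.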

⛵️ Data

Query Variants

You can run the script variant_generation/generate_variants.py to generate query variants with GPT 3.5 for the backstories provided in the UQV100 test collection. For each backstory, the script builds a prompt from the task description (DESC_A) given in variant_generation/prompts.py, appends a random example, and then provides the input backstory (see Figure 1 in the paper). Note that you must provide your own access key to use the OpenAI API. Alternatively, the query variants generated for this paper at varying temperatures are available in: gpt_generated_variants/
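The prompt construction described above can be sketched roughly as follows. This is not the code from generate_variants.py: the function name `build_prompt`, the "Backstory:"/"Queries:" layout, and the example data are hypothetical, and `DESC_A` is a placeholder here rather than the actual text from variant_generation/prompts.py. The sketch stops before any API call.

```python
import random

# Placeholder only: the real task description is DESC_A in
# variant_generation/prompts.py.
DESC_A = "<task description from variant_generation/prompts.py>"


def build_prompt(task_description, examples, backstory, rng=random):
    """Join the task description, one random example, and the input backstory.

    `examples` is a list of (backstory, queries) pairs; the layout of the
    resulting prompt is an assumption, not the format used in the paper.
    """
    example_backstory, example_queries = rng.choice(examples)
    lines = [
        task_description,
        "",
        f"Backstory: {example_backstory}",
        "Queries: " + "; ".join(example_queries),
        "",
        f"Backstory: {backstory}",
        "Queries:",
    ]
    return "\n".join(lines)


# Hypothetical example and input, written for illustration only.
examples = [("You want to repaint a room.", ["how to paint a room", "room painting tips"])]
prompt = build_prompt(DESC_A, examples, "You are planning a trip to Rome.")
```

The resulting string would then be sent to the model with your own OpenAI API key and the temperature setting you want to study.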

Citation

If you find this paper useful, please cite it using the following BibTeX:

@INPROCEEDINGS{Alaofi23GptVariants,
    TITLE = {Can Generative LLMs Create Query Variants for Test Collections? An Exploratory Study},
    AUTHOR = {Alaofi, Marwah and Gallagher, Luke and Sanderson, Mark and Scholer, Falk and Thomas, Paul},
    BOOKTITLE = {{SIGIR} '23: The 46th International {ACM} {SIGIR}
                  Conference on Research and Development in
                  Information Retrieval},
    YEAR = {2023},
    URL = {https://doi.org/10.1145/3539618.3591960},
    DOI = {10.1145/3539618.3591960},
}
