[2402.05699] Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation #909

Open
1 task
ShellLM opened this issue Aug 21, 2024 · 1 comment
Labels
AI-Agents: Autonomous AI agents using LLMs
AI-Chatbots: Topics related to advanced chatbot platforms integrating multiple AI models
finetuning: Tools for finetuning of LLMs e.g. SFT or RLHF
human-verified: <INST>NEVER PICK THIS LABEL</INST>
llm: Large Language Models
llm-evaluation: Evaluating Large Language Models performance and behavior through human-written evaluation sets
llm-experiments: experiments with large language models
MachineLearning: ML Models, Training and Inference
Papers: Research papers
prompt-engineering: Developing and optimizing prompts to efficiently use language models for various applications and re

Comments

Collaborator

ShellLM commented Aug 21, 2024

Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation

Snippet

"Aligning large language models (LLMs) with human values is imperative to mitigate potential adverse effects resulting from their misuse. Drawing from the sociological insight that acknowledging all parties' concerns is a key factor in shaping human values, this paper proposes a novel direction to align LLMs by themselves: social scene simulation. To achieve this, we present MATRIX, a novel social scene simulator that emulates realistic scenes around a user's input query, enabling the LLM to take social consequences into account before responding. MATRIX serves as a virtual rehearsal space, akin to a Monopolylogue, where the LLM performs diverse roles related to the query and practice by itself. To inject this alignment, we fine-tune the LLM with MATRIX-simulated data, ensuring adherence to human values without compromising inference speed. We theoretically show that the LLM with MATRIX outperforms Constitutional AI under mild assumptions. Finally, extensive experiments validate that our method outperforms over 10 baselines across 4 benchmarks. As evidenced by 875 user ratings, our tuned 13B-size LLM exceeds GPT-4 in aligning with human values. See our project page at this https URL."

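The snippet describes the model voicing several roles around a query and using the simulated consequences to shape its answer, with the result used as fine-tuning data. A minimal sketch of that loop follows. It is not the authors' implementation: the `generate` callable, the prompt wording, the fixed `roles` list, and the function names are all assumptions made for illustration, and `echo_model` is a stand-in so the code runs without a real LLM.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion.
Generate = Callable[[str], str]


def simulate_scene(query: str, draft: str, roles: List[str], generate: Generate) -> List[str]:
    """Have the same model voice each role's reaction to a draft answer."""
    reactions = []
    for role in roles:
        prompt = (
            f"User query: {query}\n"
            f"Draft answer: {draft}\n"
            f"You are {role}. Describe how this answer affects you."
        )
        reactions.append(f"{role}: {generate(prompt)}")
    return reactions


def build_training_pair(query: str, roles: List[str], generate: Generate) -> Dict[str, str]:
    """Draft, simulate reactions, revise, and return one (prompt, response) pair."""
    draft = generate(f"Answer the user query: {query}")
    reactions = simulate_scene(query, draft, roles, generate)
    revise_prompt = (
        f"User query: {query}\n"
        f"Draft answer: {draft}\n"
        "Reactions from affected parties:\n"
        + "\n".join(reactions)
        + "\nRewrite the answer so it addresses these concerns."
    )
    # Only the query and the revised answer are kept, so a model tuned on
    # such pairs would not need to run the simulation at inference time.
    return {"prompt": query, "response": generate(revise_prompt)}


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption, for demonstration only)."""
    return f"[model output for {len(prompt)}-char prompt]"


if __name__ == "__main__":
    pair = build_training_pair(
        "How do I get into a locked car?",
        ["the car owner", "a locksmith"],
        echo_model,
    )
    print(pair["prompt"])
    print(pair["response"])
```

In this sketch the roles are passed in by the caller; the snippet only says the roles are "related to the query", so how they are chosen is left open here.
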
Paper

[2402.05699] Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation

Comments: 32 pages, 9 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)

Cite as: arXiv:2402.05699 [cs.CL]
(or arXiv:2402.05699v3 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2402.05699

Suggested labels

None

ShellLM added the AI-Agents, AI-Chatbots, finetuning, llm, llm-evaluation, llm-experiments, and Papers labels on Aug 21, 2024
Collaborator Author

ShellLM commented Aug 21, 2024

Related content

#681 similarity score: 0.85
#778 similarity score: 0.84
#332 similarity score: 0.84
#750 similarity score: 0.83
#802 similarity score: 0.83
#536 similarity score: 0.83

irthomasthomas added the prompt-engineering, MachineLearning, and human-verified labels on Aug 21, 2024

2 participants