[Feature] Add Model Hooks for Accessing and Customizing Model Activations #3266

shuyhere · 2025-02-03T05:44:46Z

Checklist

1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
2. Please use English; otherwise, it will be closed.

Motivation

Description

It would be beneficial to introduce model hooks that allow users to access and modify model activations. This feature would enable greater flexibility for tasks such as visualization, debugging, and custom processing of intermediate representations.

Use case

- Extract intermediate outputs for interpretability analysis, such as LogitLens-style investigations.
- Expose internal activations so that users can cache them and implement functions that edit, remove, or replace them dynamically during inference, for example for representation engineering.

While this may introduce some performance overhead, it would enhance interpretability research and enable efficient model editing.
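To make the request concrete, here is a minimal, framework-free sketch of the hook semantics the two use cases need. Everything in it (`ToyModel`, `register_hook`, the layer names) is hypothetical and is not an existing SGLang API. It only borrows the convention of PyTorch forward hooks, where a hook that returns `None` observes the output and a hook that returns a value replaces it.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
# A hook receives (layer_name, output). Returning None keeps the output;
# returning a vector replaces it (the convention PyTorch forward hooks use).
Hook = Callable[[str, Vector], Optional[Vector]]


class ToyModel:
    """Hypothetical stand-in for a model: a named sequence of layer functions."""

    def __init__(self, layers: Dict[str, Callable[[Vector], Vector]]) -> None:
        self.layers = layers
        self.hooks: Dict[str, List[Hook]] = {}

    def register_hook(self, layer_name: str, hook: Hook) -> Callable[[], None]:
        """Attach a hook to one layer and return a function that removes it."""
        self.hooks.setdefault(layer_name, []).append(hook)
        return lambda: self.hooks[layer_name].remove(hook)

    def forward(self, x: Vector) -> Vector:
        for name, layer in self.layers.items():
            x = layer(x)
            # Run hooks in registration order; each may replace the activation.
            for hook in self.hooks.get(name, []):
                replaced = hook(name, x)
                if replaced is not None:
                    x = replaced
        return x


model = ToyModel({
    "double": lambda v: [2.0 * a for a in v],
    "shift": lambda v: [a + 1.0 for a in v],
})

# Use case 1: cache an intermediate activation for later analysis.
cache: Dict[str, Vector] = {}


def cache_hook(name: str, out: Vector) -> Optional[Vector]:
    cache[name] = list(out)
    return None  # observe only, do not modify


# Use case 2: edit an activation during inference by adding a steering vector.
steering: Vector = [0.5, -0.5]


def steer_hook(name: str, out: Vector) -> Optional[Vector]:
    return [a + s for a, s in zip(out, steering)]


remove_cache = model.register_hook("double", cache_hook)
remove_steer = model.register_hook("double", steer_hook)

steered = model.forward([1.0, 2.0])   # hooks active
remove_steer()
remove_cache()
baseline = model.forward([1.0, 2.0])  # hooks removed
```

Returning a removal function from `register_hook` keeps hooks opt-in, so a run with no hooks registered pays only for a dictionary lookup per layer. A real implementation inside an inference engine would also have to account for batching and any graph capture or kernel fusion, which this sketch does not attempt to model.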

Related resources

model hook resources

related issues and use case