Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] unify LoopState and saved_break_states #16406

Merged
merged 2 commits into from
Feb 26, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 42 additions & 52 deletions crates/red_knot_python_semantic/src/semantic_index/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,22 +41,22 @@ use crate::Db;

mod except_handlers;

/// Are we in a state where a `break` statement is allowed?
#[derive(Clone, Copy, Debug)]
enum LoopState {
InLoop,
NotInLoop,
#[derive(Clone, Debug, Default)]
struct Loop {
/// Flow states at each `break` in the current loop.
break_states: Vec<FlowSnapshot>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked through all usages but I'm wondering if we could use a single break_states struct stored on SemanticIndexBuilder instead of allocating a Vec for each Loop.

What I'm thinking about:

  • Loop: stores the length of the "arena" at the time this loop was pushed
  • Getting all break states for the current loop is the same as arena[start..]
  • When popping a loop, truncate the arena to start to remove all snapshots belonging to the current state.

Copy link
Contributor Author

@carljm carljm Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had originally thought I might invest more time in reducing vec allocations here, but the simple version was completely neutral on benchmarks, so I'm not sure it's worth putting more time into this right now. I think perf optimization work should be motivated by some evidence.

I suspect that in practice loops are relatively uncommon, and loops with break in them even more uncommon, so in the great majority of cases we never allocate for this vec at all. I think if we want to improve performance of semantic indexing, there will be much higher-ROI things to look at.

}

impl LoopState {
fn is_inside(self) -> bool {
matches!(self, LoopState::InLoop)
impl Loop {
fn push_break(&mut self, state: FlowSnapshot) {
self.break_states.push(state);
}
}

struct ScopeInfo {
file_scope_id: FileScopeId,
loop_state: LoopState,
/// Current loop state; None if we are not currently visiting a loop
current_loop: Option<Loop>,
}

pub(super) struct SemanticIndexBuilder<'db> {
Expand All @@ -73,8 +73,6 @@ pub(super) struct SemanticIndexBuilder<'db> {
/// The name of the first function parameter of the innermost function that we're currently visiting.
current_first_parameter_name: Option<&'db str>,

/// Flow states at each `break` in the current loop.
loop_break_states: Vec<FlowSnapshot>,
/// Per-scope contexts regarding nested `try`/`except` statements
try_node_context_stack_manager: TryNodeContextStackManager,

Expand Down Expand Up @@ -106,7 +104,6 @@ impl<'db> SemanticIndexBuilder<'db> {
current_assignments: vec![],
current_match_case: None,
current_first_parameter_name: None,
loop_break_states: vec![],
try_node_context_stack_manager: TryNodeContextStackManager::default(),

has_future_annotations: false,
Expand Down Expand Up @@ -134,19 +131,20 @@ impl<'db> SemanticIndexBuilder<'db> {
builder
}

fn current_scope(&self) -> FileScopeId {
*self
.scope_stack
fn current_scope_info(&self) -> &ScopeInfo {
self.scope_stack
.last()
.map(|ScopeInfo { file_scope_id, .. }| file_scope_id)
.expect("SemanticIndexBuilder should have created a root scope")
}

fn loop_state(&self) -> LoopState {
fn current_scope_info_mut(&mut self) -> &mut ScopeInfo {
self.scope_stack
.last()
.last_mut()
.expect("SemanticIndexBuilder should have created a root scope")
.loop_state
}

fn current_scope(&self) -> FileScopeId {
self.current_scope_info().file_scope_id
}

/// Returns the scope ID of the surrounding class body scope if the current scope
Expand All @@ -167,11 +165,21 @@ impl<'db> SemanticIndexBuilder<'db> {
}
}

fn set_inside_loop(&mut self, state: LoopState) {
self.scope_stack
.last_mut()
.expect("Always to have a root scope")
.loop_state = state;
/// Push a new loop, returning the outer loop, if any.
fn push_loop(&mut self) -> Option<Loop> {
self.current_scope_info_mut()
.current_loop
.replace(Loop::default())
}

/// Pop a loop, replacing with the previous saved outer loop, if any.
fn pop_loop(&mut self, outer_loop: Option<Loop>) -> Loop {
std::mem::replace(&mut self.current_scope_info_mut().current_loop, outer_loop)
.expect("pop_loop() should not be called without a prior push_loop()")
}

fn current_loop_mut(&mut self) -> Option<&mut Loop> {
self.current_scope_info_mut().current_loop.as_mut()
}

fn push_scope(&mut self, node: NodeWithScopeRef) {
Expand Down Expand Up @@ -204,7 +212,7 @@ impl<'db> SemanticIndexBuilder<'db> {

self.scope_stack.push(ScopeInfo {
file_scope_id,
loop_state: LoopState::NotInLoop,
current_loop: None,
});
}

Expand Down Expand Up @@ -1208,15 +1216,9 @@ where
.current_visibility_constraints_mut()
.add_atom(later_predicate_id);

// Save aside any break states from an outer loop
let saved_break_states = std::mem::take(&mut self.loop_break_states);

// TODO: definitions created inside the body should be fully visible
// to other statements/expressions inside the body --Alex/Carl
let outer_loop_state = self.loop_state();
self.set_inside_loop(LoopState::InLoop);
let outer_loop = self.push_loop();
self.visit_body(body);
self.set_inside_loop(outer_loop_state);
let this_loop = self.pop_loop(outer_loop);

// If the body is executed, we know that we've evaluated the condition at least
// once, and that the first evaluation was True. We might not have evaluated the
Expand All @@ -1225,11 +1227,6 @@ where
let body_vis_constraint_id = first_vis_constraint_id;
self.record_visibility_constraint_id(body_vis_constraint_id);

// Get the break states from the body of this loop, and restore the saved outer
// ones.
let break_states =
std::mem::replace(&mut self.loop_break_states, saved_break_states);

// We execute the `else` once the condition evaluates to false. This could happen
// without ever executing the body, if the condition is false the first time it's
// tested. So the starting flow state of the `else` clause is the union of:
Expand All @@ -1250,7 +1247,7 @@ where

// Breaking out of a while loop bypasses the `else` clause, so merge in the break
// states after visiting `else`.
for break_state in break_states {
for break_state in this_loop.break_states {
let snapshot = self.flow_snapshot();
self.flow_restore(break_state);
self.record_visibility_constraint_id(body_vis_constraint_id);
Expand Down Expand Up @@ -1298,7 +1295,6 @@ where
self.record_ambiguous_visibility();

let pre_loop = self.flow_snapshot();
let saved_break_states = std::mem::take(&mut self.loop_break_states);

let current_assignment = match &**target {
ast::Expr::List(_) | ast::Expr::Tuple(_) => Some(CurrentAssignment::For {
Expand Down Expand Up @@ -1346,16 +1342,9 @@ where
self.pop_assignment();
}

// TODO: Definitions created by loop variables
// (and definitions created inside the body)
// are fully visible to other statements/expressions inside the body --Alex/Carl
let outer_loop_state = self.loop_state();
self.set_inside_loop(LoopState::InLoop);
let outer_loop = self.push_loop();
self.visit_body(body);
self.set_inside_loop(outer_loop_state);

let break_states =
std::mem::replace(&mut self.loop_break_states, saved_break_states);
let this_loop = self.pop_loop(outer_loop);

// We may execute the `else` clause without ever executing the body, so merge in
// the pre-loop state before visiting `else`.
Expand All @@ -1364,7 +1353,7 @@ where

// Breaking out of a `for` loop bypasses the `else` clause, so merge in the break
// states after visiting `else`.
for break_state in break_states {
for break_state in this_loop.break_states {
self.flow_merge(break_state);
}
}
Expand Down Expand Up @@ -1547,8 +1536,9 @@ where
}

ast::Stmt::Break(_) => {
if self.loop_state().is_inside() {
self.loop_break_states.push(self.flow_snapshot());
let snapshot = self.flow_snapshot();
if let Some(current_loop) = self.current_loop_mut() {
current_loop.push_break(snapshot);
}
// Everything in the current block after a terminal statement is unreachable.
self.mark_unreachable();
Expand Down
Loading