swap-ir-impl starting point #4

Soulthym · 2024-12-02T11:00:46Z

#[derive(Spanned)]
struct ... {
  #[span]
  span: ...
  ...
}

Implement converters for:

Link<NodeType> -> Link<Node>
Link<NodeType> -> Link<Leaf<T>>
Link<NodeType> -> Link<Graph>

Helper traits are available (IsNode, NotNode, IsGraph, NotGraph, IsLeaf, NotLeaf)

The updated check-list was moved to this comment: #4 (comment)

IsNode derive macro: support extra fields

bitwalker · 2024-12-04T18:58:49Z

@Soulthym I just realized that since this PR is in your fork, and not the 0xPolygonMiden repo, I can't leave review comments. Not sure if it's easier to open a PR stacked on 0xPolygonMiden/air-script#359 with these changes, or add me as a collaborator to this repo for reviews, but I'll have to wait for one or the other to provide my review feedback. Just let me know which!

Soulthym · 2024-12-05T08:45:03Z

Hey @bitwalker, we just added you as a contributor to our fork. We hope that solves the issue.

+ cargo fmt

* derive Hash for SpannedMirValue * ir3: better node splitting, no macro, simpler api missimg builder pattern * replaced Link<Op> by Link<Owner> for Operations and Leaves * added IndexAccess Operand * renamed IndexAccess to Accessor re use parser's AccessType impl Hash for AccessType * fix call being copied instead of pointed to by get_children() * typesafe builder pattern for most ops. Misssing structured ops * type-safe Builder pattern for structured ops * MirGraph Builder editing * Add Leaves to Op enum * ir3: comment top level items * isolate Leaf, Op, Owner, Root * add converters to enums * added converters to structs renamed IndexAccess to Acccessor * Initial merge ir3 > update_passes * remove comments * Update mod.rs * Graph: implement public api * Swapped in helpers of passes/mod.rs * Node: all node types + converters * Full inlining with Node + cargo fmt * Remove Link for FoldOperator * Swap unrolling * Swap lowering Mir -> Air * Update visitor (with suboptimal get_children() herlper, to refactor) * Add MirType to Parameter * Update inlining to handle evaluators * most of translated adapted * add type support for Parameter in translate * in build_statement, handle Owner::None (integrity or boundary constraint) --------- Co-authored-by: Thybault Alabarbe <thybault.alabarbe@gmail.com>

Update passes

Soulthym · 2024-12-12T17:08:10Z

Soulthym · 2024-12-20T16:06:31Z

Hey @bitwalker, here is a recap of the work left to do.

The latest version is available on a new branch of this fork: https://github.com/massalabs/air-script/tree/fix-ir-translate
It will get merged in the current branch once all issues are resolved.

Currently, the tests in the mir package (cargo test --package mir) translate from AST to MIR, and we've identified the problems detailed below in the obtained graphs. Otherwise the graphs seem consistent with what we expect.

There are currently 3 identified issues left to work on, plus the features not marked as resolved above.

Parameter/argument unpacking:

Parameter unpacking for *AccessBinding is currently done during translation from ast to mir, but should happen during inlining.

Don't unpack call arguments during build, and don't check the length of arguments.
Don't unpack parameters during translation from ast to Mir:
- Instead add a binding to a Vector<Op::Parameter>
- and replace Op<Value<... TraceAccessBinding>> in the function/evaluator's body by Accessor<Vector<Op::Parameter>, TraceAccess> to the corresponding bound value.
Unpack during inlining

Mutability is not really handled through enums and conversions.

Mutating one occurrence does not mutate the others, because conversion is done by cloning the underlying structure and wrapping the clone in a new Link.

Add Link<T> to enum variants' content, i.e.:

enum Op {
    Add(Add),
    ...
}

becomes

enum Op {
    Add(Link<Add>),
    ...
}
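A minimal sketch of the difference, assuming Link<T> behaves like Rc<RefCell<T>> (the real Link type is not shown in this thread, so the alias below is a stand-in, and OpByValue/OpByLink are hypothetical names for the "before" and "after" shapes of Op):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Stand-in for the project's Link<T>; assumed to behave like Rc<RefCell<T>>.
type Link<T> = Rc<RefCell<T>>;

#[derive(Clone, Debug, PartialEq)]
struct Add {
    lhs: u64,
    rhs: u64,
}

// "Before": the variant owns the struct, so converting clones the struct.
#[derive(Clone, Debug)]
enum OpByValue {
    Add(Add),
}

// "After": the variant holds a Link, so converting only clones the pointer.
#[derive(Clone, Debug)]
enum OpByLink {
    Add(Link<Add>),
}

fn to_op_by_value(add: &Link<Add>) -> Link<OpByValue> {
    Rc::new(RefCell::new(OpByValue::Add(add.borrow().clone())))
}

fn to_op_by_link(add: &Link<Add>) -> Link<OpByLink> {
    Rc::new(RefCell::new(OpByLink::Add(Rc::clone(add))))
}

/// Mutates the original `Add`, then reads `lhs` back through each wrapper.
fn lhs_after_mutation() -> (u64, u64) {
    let add: Link<Add> = Rc::new(RefCell::new(Add { lhs: 1, rhs: 2 }));
    let by_value = to_op_by_value(&add);
    let by_link = to_op_by_link(&add);

    add.borrow_mut().lhs = 10;

    let seen_by_value = match &*by_value.borrow() {
        OpByValue::Add(a) => a.lhs,
    };
    let seen_by_link = match &*by_link.borrow() {
        OpByLink::Add(a) => a.borrow().lhs,
    };
    (seen_by_value, seen_by_link)
}
```

The by-value wrapper keeps seeing the stale clone, while the Link-based wrapper observes the mutation.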

~~Rework the builder pattern accordingly:~~
- ~~Builder works on Links directly, and mutate references instead of consuming self.~~
- ~~remove the last .build() step to allow storing partial but valid objects as their corresponding enum variant without adding new variants (their types are different)~~
- remove unneeded .edit() abstraction.
Change parent from BackLink<Op> to Vec<BackLink<Op>> to allow sharing of an operation node:
- Enable automatic parent setting in the builder pattern when adding children. They are currently not set at all.
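As a rough illustration of the Vec<BackLink<Op>> idea, here is a sketch in which adding a child records the back-reference in the same step. It assumes Link and BackLink behave like Rc<RefCell<T>> and Weak<RefCell<T>>; Node, new_node, add_child and parent_names are hypothetical names, not the crate's API.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Stand-ins for the project's Link/BackLink (assumed semantics).
type Link<T> = Rc<RefCell<T>>;
type BackLink<T> = Weak<RefCell<T>>;

#[derive(Default)]
struct Node {
    name: &'static str,
    children: Vec<Link<Node>>,
    // Several parents may share the same child node.
    parents: Vec<BackLink<Node>>,
}

fn new_node(name: &'static str) -> Link<Node> {
    Rc::new(RefCell::new(Node { name, ..Node::default() }))
}

/// Adds `child` under `parent` and sets the parent back-reference at once,
/// so callers cannot forget to do it.
fn add_child(parent: &Link<Node>, child: &Link<Node>) {
    parent.borrow_mut().children.push(Rc::clone(child));
    child.borrow_mut().parents.push(Rc::downgrade(parent));
}

/// Names of all parents that are still alive.
fn parent_names(node: &Link<Node>) -> Vec<&'static str> {
    let mut names = Vec::new();
    for parent in &node.borrow().parents {
        if let Some(parent) = parent.upgrade() {
            let name = parent.borrow().name;
            names.push(name);
        }
    }
    names
}

fn shared_child_parents() -> Vec<&'static str> {
    let a = new_node("a");
    let b = new_node("b");
    let shared = new_node("shared");
    add_child(&a, &shared);
    add_child(&b, &shared);
    parent_names(&shared)
}
```

Weak back-references keep the parent/child cycle from leaking, and a stale parent simply fails to upgrade.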

Rework passes

After the previous is done, we will be able to work on:

inlining pass, which currently mostly doesn't mutate anything, or doesn't do so properly. It has been excluded temporarily from the pipeline while we work on fixing it.
unrolling pass which is not currently in the pipeline as it relies on the inlining pass.

cleanup the codebase

~~add a few tests for identified edge cases~~ tested via codegen diffs
~~convert relevant tests to compare parsed Mir between the input program and the expected optimized version.~~ tested via codegen diffs
document all public interfaces
refactor the boilerplate:
- extract the builder pattern into a derive macro Builder
- ~~extract converters into As* and TryAs* traits (AsNode, AsOp, AsRoot, AsAdd, ..., TryAsNode) and add generic implementations where possible.~~ Not needed

The updated check-list was moved to this comment: #4 (comment)

Soulthym · 2025-01-15T16:08:06Z

Hey,
After some experimentation, which you can find in the history of this branch, we have figured out a way to restore shared mutability when performing mutations on the IR.

For some context about why this change was needed, the current implementation has the following limitations:

mutating a struct into another type of struct is simply not supported, even if both are members of the same enum (e.g. changing a Link<Add> to a Link<Sub>, which are both Op members).
mutating a struct to a different struct with the same type does not propagate the update to existing enum wrappers (Link<Node>, Link<Op>, Link<Root>, Link<Leaf>, Link<Owner>)

Proposed solutions:

1) wrap struct in enums by default

we can wrap structs in an enum by default. This will allow us to mutate a struct to another variant of the same enum.
After some tests, we found that it doesn't need to be Node, and a more specialized enum can be used, such as Op and Root, which cover all node types without overlap.

main caveat:

accessing the underlying struct from the enum is not ergonomic, and it requires a match/if let statement to unwrap the enum when needed.
Converters to the struct type (as_${struct_name}) become fairly useless in this case, since we need to match on the enum to access the struct fields anyway.
Note: there may be a way to restore this functionality via Ref::map or Ref::map_split (from std::cell), but it isn't a priority and we have not tested it yet.
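An untested idea along those lines, sketched under the assumption that Link<Op> is equivalent to Rc<RefCell<Op>>: std's Ref::filter_map can project a borrow of the enum onto the inner struct, which gives back an as_add-style converter. The Add/Sub/Op definitions and the as_add name below are simplified stand-ins.

```rust
use std::cell::{Ref, RefCell};
use std::rc::Rc;

// Stand-in for the project's Link<T>.
type Link<T> = Rc<RefCell<T>>;

struct Add {
    lhs: i64,
    rhs: i64,
}

struct Sub {
    lhs: i64,
    rhs: i64,
}

enum Op {
    Add(Add),
    Sub(Sub),
}

/// Hypothetical converter: borrows the inner `Add` when the variant matches,
/// without a `match` at the call site.
fn as_add(op: &Link<Op>) -> Option<Ref<'_, Add>> {
    Ref::filter_map(op.borrow(), |op| match op {
        Op::Add(add) => Some(add),
        _ => None,
    })
    .ok()
}

fn demo() -> (Option<i64>, bool) {
    let op: Link<Op> = Rc::new(RefCell::new(Op::Add(Add { lhs: 2, rhs: 3 })));
    let alias = Rc::clone(&op);

    // The projected borrow is released as soon as the closure returns.
    let sum = as_add(&op).map(|add| add.lhs + add.rhs);

    // Swapping the variant in place is visible through every alias.
    *op.borrow_mut() = Op::Sub(Sub { lhs: 2, rhs: 3 });
    let still_add = as_add(&alias).is_some();

    (sum, still_add)
}
```

The returned Ref keeps the RefCell borrowed for as long as it lives, so callers still need to drop it before mutating the same node.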

2) Store Singletons

make enum wrappers singletons, so that they are shared across all instances of the struct.

2.a) we modify other enum wrappers (Node, Owner, Leaf) to take a BackLink<Op> or BackLink<Root> to avoid reference cycles.
2.b) we can then store a singleton of the struct's enum wrappers in the struct itself. The singleton will thus be shared across all instances of the struct.
2.c) we expose a set(&self, value: &Self) helper method on Link<Op> and Link<Root> that will update the singleton when the struct is mutated or replaced with another variant of the same enum.

main caveat:

mutating the struct manually will not update the singletons. Every mutation needs to happen through the obj.set(&value) helper method.

A minimal working example of the proposed solution is available here, with a patched Link implementation

Identified codebase changes to implement the solution:

The updated check-list was moved to this comment: #4 (comment)

bitwalker · 2025-01-24T15:42:22Z

I'm a bit concerned about the overall complexity of things here, largely resulting from the way Link/BackLink, and the various special entity types like Owner, Node, etc., interact. It seems particularly awkward, and the issues with mutating the IR I think largely fall out of the concrete representation we ended up with here, rather than the general idea behind it.

What do you think about simplifying this a bit, along the lines of the following approach:

Each operation type stands alone as a separate type, e.g. Call
Each operation type implements an object-safe Op trait, which exposes methods to query both children and parent of the op (see below), as well as request metadata that is generic over operations (e.g. arguments, results, attributes), and provide basic operations for mutating them.
A similar representation would be used for "values", so that you can operate on values generically, but also access the concrete type via downcasting when needed.
Define a type alias pub type Entity<T> = Rc<RefCell<T>>;. This will keep the visual clutter down, but also enable a lot of ergonomic improvements available via Rc that cannot be applied to custom smart pointer types without nightly features (for now at least). An equivalent alias for Weak would also be needed.

In short, you'd generally be operating on either an Entity<T> or an Entity<dyn Op>/Entity<dyn Value>. You can upcast from Entity<T> to Entity<dyn Trait>, and use the Rc::downcast_ref/Rc::downcast_mut methods to downcast to the concrete type, or Rc::is<T> for simple dynamic type checks. Weak references would be used for parent links, when the parent holds a strong reference to the child.
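For reference, one way to get the upcast and the dynamic type check on stable Rust is sketched below. It is only an illustration: the as_any method is a hypothetical hook added to the trait for downcasting, and the Call and Add types are reduced to the bare minimum.

```rust
use std::any::Any;
use std::cell::RefCell;
use std::rc::Rc;

// `?Sized` so the alias can also be written with trait objects.
type Entity<T: ?Sized> = Rc<RefCell<T>>;

trait Op: Any {
    fn name(&self) -> &'static str;
    /// Hypothetical hook: exposes the concrete type for downcasting.
    fn as_any(&self) -> &dyn Any;
}

struct Call {
    arity: usize,
}

struct Add;

impl Op for Call {
    fn name(&self) -> &'static str {
        "call"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Op for Add {
    fn name(&self) -> &'static str {
        "add"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Downcasts a type-erased op to `Call`, if that is what it really is.
fn call_arity(op: &Entity<dyn Op>) -> Option<usize> {
    let op = op.borrow();
    op.as_any().downcast_ref::<Call>().map(|call| call.arity)
}

fn example() -> (Option<usize>, Option<usize>, &'static str) {
    let call: Entity<Call> = Rc::new(RefCell::new(Call { arity: 2 }));
    // Upcast: Rc<RefCell<Call>> coerces to Rc<RefCell<dyn Op>>.
    let erased: Entity<dyn Op> = call.clone();
    let add: Entity<dyn Op> = Rc::new(RefCell::new(Add));
    let name = erased.borrow().name();
    (call_arity(&erased), call_arity(&add), name)
}
```

The upcast is a plain unsized coercion and shares the same allocation, so mutations through either handle are visible through the other.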

To sketch this out in more concrete terms, we'd end up with something that looks a bit like this:

pub type Entity<T> = Rc<RefCell<T>>;
pub type WeakEntity<T> = Weak<RefCell<T>>;

// The [Op] trait represents common behavior/actions over operations in the IR.
//
// NOTE: This trait definition omits some obviously useful methods to avoid
// too much clutter in this example. The key thing is that the trait must
// remain object-safe.
pub trait Op {
    // Operations with a parent must belong to a Block
    fn parent(&self) -> Option<Entity<Block>>;
    // This is needed to add/remove operations from a Block generically
    fn set_parent(&mut self, parent: Option<WeakEntity<Block>>);
    // A convenience for accessing the parent Op of the containing Block
    fn parent_op(&self) -> Option<Entity<dyn Op>> {
        let block = self.parent()?;
        // Borrow the block only long enough to read its owner
        let owner = block.borrow().owner();
        Some(owner)
    }
    // Get access to the argument vector for this operation
    fn arguments(&self) -> &[Entity<dyn Value>];
    // Get mutable access to the argument vector for this operation
    fn arguments_mut(&mut self) -> &mut [Entity<dyn Value>];
    // Get the result defined by this operation, if it produces one
    fn result(&self) -> Option<Entity<dyn Value>>;
    fn body(&self) -> Option<Entity<Block>>;
    fn has_body(&self) -> bool;
    fn is_primitive(&self) -> bool {
        !self.has_body()
    }
    fn has_uses(&self) -> bool {
        if let Some(value) = self.result() {
            // Bind the result so the borrow ends before `value` is dropped
            let used = value.borrow().is_used();
            used
        } else {
            // If an op is non-primitive, we treat it as presumptively live.
            !self.is_primitive()
        }
    }
}

// You can add convenience methods here making them available for all
// dyn Op references, whether &dyn Op, or Entity<dyn Op>, etc.
impl dyn Op {
    /// Insert this operation at the end of `block`
    pub fn insert_at_end(self: Entity<Self>, block: Entity<Block>) {
        assert!(self.borrow().parent().is_none(), "op is already attached to a block");
        {
            let mut op = self.borrow_mut();
            op.set_parent(Some(Entity::downgrade(&block)));
        }
        let mut block = block.borrow_mut();
        block.push(self);
    }
    
    /// Replaces any uses of `value` with `replacement` in the argument vector
    /// of this operation. It could be further extended to also visit any nested
    /// operations (if this op has a body).
    pub fn replace_uses_of(self: Entity<Self>, value: Entity<dyn Value>, replacement: Entity<dyn Value>) {
        let mut op = self.borrow_mut();
        for (i, arg) in op.arguments_mut().iter_mut().enumerate() {
            if Entity::ptr_eq(arg, &value) {
                let user = User {
                    user: Entity::downgrade(&self),
                    index: i,
                };
                arg.borrow_mut().remove_user(&user);
                *arg = replacement.clone();
                arg.borrow_mut().add_user(user);
            }
        }
    }
}

// NOTE: An example of a structured/non-primitive operation with a block
pub struct Function {
    /// The function name
    pub symbol: Ident,
    /// The function type signature
    pub signature: FunctionType,
    /// The function body
    ///
    /// NOTE: The block argument types must match the signature
    pub body: Entity<Block>,
}

impl Function {
    pub fn new(symbol: Ident, signature: FunctionType) -> Entity<Self> {
        Entity::new_cyclic(move |this| {
            let body = Block::new(this, &signature.arguments);
            Self {
                symbol,
                signature,
                body,
            }
        })
    }
}

impl Op for Function {
    fn parent(&self) -> Option<Entity<Block>> {
        // Functions are always top-level
        None
    }
    
    fn arguments(&self) -> &[Entity<dyn Value>] {
        // A function op never has arguments
        &[]
    }
    
    fn result(&self) -> Option<Entity<dyn Value>> {
        // A function never produces a result, only `call` does
        None
    }
    
    fn body(&self) -> Option<Entity<Block>> {
        Some(self.body.clone())
    }
    
    fn has_body(&self) -> bool {
        true
    }
}

/// The operation used to invoke [Function] operations (also [Evaluator]).
///
/// NOTE: This is an example of a primitive operation
pub struct Call {
    parent: Option<WeakEntity<Block>>,
    /// The function/evaluator to call
    ///
    /// This is dyn Op here, because both evaluators and functions are callable
    callee: Entity<dyn Op>,
    /// The values to use as arguments for the callee. These must match the
    /// callee type signature.
    ///
    /// NOTE: Ops do not own the individual arguments, but do hold strong refs
    /// to them.
    args: Vec<Entity<dyn Value>>,
    /// The result of the function/evaluator call. Evaluators produce none, 
    /// while functions always produce one.
    ///
    /// NOTE: Ops own their results
    result: Option<Entity<dyn Value>>,
}

impl Call {
    pub fn new<I>(callee: Entity<dyn Op>, args: I) -> Entity<Self>
    where
        I: IntoIterator<Item = Entity<dyn Value>>,
    {
        Entity::new_cyclic(|this| {
            let result_ty = Self::infer_result_type(&callee);
            let result = result_ty.map(|ty| OpResult::new(this, ty));
            Self {
                parent: None,
                callee,
                args: args.into_iter().collect(),
                result,
            }
        })
    }
}

impl Op for Call {
    fn parent(&self) -> Option<Entity<Block>> {
        // `Weak::upgrade` already returns an Option
        self.parent.as_ref()?.upgrade()
    }
    
    fn arguments(&self) -> &[Entity<dyn Value>] {
        &self.args
    }
    
    fn result(&self) -> Option<Entity<dyn Value>> {
        self.result.clone()
    }
    
    fn body(&self) -> Option<Entity<Block>> {
        None
    }
    
    fn has_body(&self) -> bool {
        false
    }
}

/// A special type used for type checking function-like operations
///
/// It is not a member of MirType.
pub struct FunctionType {
    pub arguments: Vec<MirType>,
    pub result: MirType,
}

/// Represents a basic block with arguments
pub struct Block {
    owner: WeakEntity<dyn Op>,
    /// The parameter list of the block. Each of these values represents an
    /// SSA value definition. Operations within a Block can reference values
    /// outside the block, and thus block arguments can be elided, however that
    /// is not permitted from the body of a [Function] or [Evaluator], without
    /// first cloning the Block so that it can be inlined at a callsite.
    pub arguments: Vec<Entity<BlockArgument>>,
    pub body: Vec<Entity<dyn Op>>,
}

impl Block {
    pub fn new(owner: WeakEntity<dyn Op>, arguments: &[MirType]) -> Entity<Self> {
        Entity::new_cyclic(|this| {
            let arguments = arguments.iter().map(|arg| {
                BlockArgument::new(this.clone(), arg)
            }).collect();
            Self {
                owner,
                arguments,
                body: vec![],
            }
        })
    }
    
    /// Callers must ensure that `op` has parent set to this block, and that it
    /// was not already attached to some other block.
    pub fn push(&mut self, op: Entity<dyn Op>) {
        self.body.push(op);
    }
    
    pub fn owner(&self) -> Entity<dyn Op> {
        WeakEntity::upgrade(&self.owner).expect("stale owner reference")
    }
}

pub trait Value {
    // NOTE: A value always has an owning op, but is internally stored as Weak.
    // Here, ownership refers to the entity that defines the value, never an
    // entity that uses a value.
    fn owner(&self) -> Entity<dyn Op>;
    fn ty(&self) -> MirType;
    fn users(&self) -> &UseList;
    fn users_mut(&mut self) -> &mut UseList;
    fn is_used(&self) -> bool {
        !self.users().is_empty()
    }
    fn remove_user(&mut self, user: &User) {
        self.users_mut().remove(&user);
    }
    fn add_user(&mut self, user: User) {
        self.users_mut().insert(user);
    }
}

/// Represents the set of users of some value definition
/// 
/// This information is used to trace from uses to defs, and to determine if
/// a given value is used. An operation without side effects that has no uses
/// of its result(s), can be considered dead and stripped from the program.
#[derive(Default, Clone)]
pub struct UseList {
    users: Vec<User>,
}

impl UseList {
    /// Indicates whether the containing value has any uses
    pub fn is_empty(&self) -> bool {
        self.users.is_empty()
    }
    
    /// Remove a user from the list
    pub fn remove(&mut self, user: &User) {
        // `Vec::remove` takes an index, so filter by equality instead
        self.users.retain(|u| u != user);
    }
    
    /// Add a new user to the list, if that use is not already present
    pub fn insert(&mut self, user: User) {
        if !self.users.contains(&user) {
            self.users.push(user);
        }
    }
    
    pub fn iter(&self) -> impl Iterator<Item = &User> {
        self.users.iter()
    }
}

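#[derive(Clone)]
pub struct User {
    /// The using operation
    pub user: WeakEntity<dyn Op>,
    /// The index of the use in the operation's argument vector
    pub index: usize,
}

// Users are compared by identity of the using operation plus the argument
// index; `Weak` has no `PartialEq` impl, so this cannot be derived.
impl PartialEq for User {
    fn eq(&self, other: &Self) -> bool {
        self.index == other.index && self.user.ptr_eq(&other.user)
    }
}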

/// The value representation for block parameters. These represent distinct
/// SSA values, so that transformations within a block can be done without
/// having to know how the block is reached. These are equivalent
/// to Phi nodes in SSA literature, but in a more intuitive form.
pub struct BlockArgument {
    owner: WeakEntity<Block>,
    ty: MirType,
    users: UseList,
}

impl BlockArgument {
    pub fn new(owner: WeakEntity<Block>, ty: MirType) -> Entity<Self> {
        Entity::new(RefCell::new(Self {
            owner,
            ty,
            users: UseList::default(),
        }))
    }
}

impl Value for BlockArgument {
    fn owner(&self) -> Entity<dyn Op> {
        let block = WeakEntity::upgrade(&self.owner).expect("stale owner reference");
        let owner = block.borrow().owner();
        owner
    }
    
    fn ty(&self) -> MirType {
        self.ty.clone()
    }
    
    fn users(&self) -> &UseList {
        &self.users
    }
    
    fn users_mut(&mut self) -> &mut UseList {
        &mut self.users
    }
}

/// The value representation for operation results. Much like BlockArgument,
/// these define new SSA values.
pub struct OpResult {
    owner: WeakEntity<dyn Op>,
    ty: MirType,
    users: UseList,
}

impl OpResult {
    pub fn new(owner: WeakEntity<dyn Op>, ty: MirType) -> Entity<Self> {
        // Wrap the value so the return type is really an Entity<Self>
        Entity::new(RefCell::new(Self {
            owner,
            ty,
            users: UseList::default(),
        }))
    }
}

impl Value for OpResult {
    fn owner(&self) -> Entity<dyn Op> {
        WeakEntity::upgrade(&self.owner).expect("stale owner reference")
    }
    
    fn ty(&self) -> MirType {
        self.ty.clone()
    }
    
    fn users(&self) -> &UseList {
        &self.users
    }
    
    fn users_mut(&mut self) -> &mut UseList {
        &mut self.users
    }
}

/// An example of a custom value type
pub struct TraceAccessBinding {
    owner: WeakEntity<dyn Op>,
    users: UseList,
    pub segment: Rc<TraceSegment>,
    /// The offset to the first column of the segment which is bound by this binding
    pub offset: usize,
    /// The number of columns which are bound
    pub size: usize,
}

impl Value for TraceAccessBinding {
    fn owner(&self) -> Entity<dyn Op> {
        WeakEntity::upgrade(&self.owner).expect("stale owner reference")
    }
    fn ty(&self) -> MirType {
        // A single bound column is a felt, several columns form a vector
        if self.size > 1 {
            MirType::Vector(self.size)
        } else {
            MirType::Felt
        }
    }
    
    fn users(&self) -> &UseList {
        &self.users
    }
    
    fn users_mut(&mut self) -> &mut UseList {
        &mut self.users
    }
}

You can then define some convenience builder APIs:

pub struct FunctionBuilder {
    function: Entity<Function>,
    block: Entity<Block>,
}

impl From<Entity<Function>> for FunctionBuilder {
    fn from(function: Entity<Function>) -> Self {
        let block = function.borrow().body.clone();
        Self {
            function,
            block,
        }
    }
}

impl FunctionBuilder {
    pub fn new(symbol: Ident, signature: FunctionType) -> Self {
        let function = Function::new(symbol, signature);
        Self::from(function)
    }
    
    pub fn get_argument(&self, index: usize) -> Entity<dyn Value> {
        let block = self.block.borrow();
        // `arguments` is a public field of Block, not a method
        block.arguments[index].clone()
    }
    
    pub fn ins(&mut self) -> InstBuilder<'_> {
        let owner = Entity::downgrade(&self.block);
        InstBuilder {
            owner,
            block: self.block.borrow_mut(),
        }
    }
    
    pub fn build(self) -> Entity<Function> {
        self.function
    }
}

pub struct InstBuilder<'f> {
    owner: WeakEntity<Block>,
    block: std::cell::RefMut<'f, Block>,
}

impl InstBuilder<'_> {
    /// This function returns a reference to the op because it may or may not
    /// produce a result, depending on what is being called.
    pub fn call<I>(&mut self, callee: Entity<dyn Op>, args: I) -> Entity<Call> 
    where
        I: IntoIterator<Item = Entity<dyn Value>>,
    {
        let call = Call::new(callee, args);
        self.insert_at_end(call.clone());
        call
    }
    
    /// This function returns a value reference, because it is a primitive op
    /// that always produces a single result.
    pub fn add(&mut self, lhs: Entity<dyn Value>, rhs: Entity<dyn Value>) -> Entity<dyn Value> {
        assert_eq!(lhs.borrow().ty(), rhs.borrow().ty(), "mismatched value types for binary operator");
        
        let add: Entity<dyn Op> = Add::new(lhs, rhs);
        let result = add.borrow().result().unwrap();
        self.insert_at_end(add);
        result
    }
    
    /// Block terminators need not return anything, but are expected to always
    /// be the last instruction in a block.
    pub fn ret(&mut self, result: Option<Entity<dyn Value>>) {
        self.insert_at_end(Ret::new(result));
    }
    
    fn insert_at_end(&mut self, op: Entity<dyn Op>) {
        {
            let mut op = op.borrow_mut();
            op.set_parent(Some(self.owner.clone()));
        }
        self.block.push(op);
    }
}

Those end up getting used something like this when constructing IR:

// Construct IR for a function defined as:
//
// fn adder(a: felt, b: felt) -> felt {
//     a + b
// }
let mut fb = FunctionBuilder::new("adder".into(), FunctionType::new([MirType::Felt, MirType::Felt], MirType::Felt));

let a = fb.get_argument(0);
let b = fb.get_argument(1);
let c = fb.ins().add(a, b);
fb.ins().ret(Some(c));

let adder = fb.build();

A key aspect of working with this IR structure is that the connections between operations are represented completely in terms of SSA values, and they are never mutated, only created (defined) or replaced/destroyed. So let's say that you want to replace something like x = y * 2 with x = y + y. The actual process of doing that would play out like this:

Insert x' = y + y in the block after x = y * 2
Replace all uses of x with x'
Remove x = y * 2, as x is now unused, and thus the expression y * 2 is dead
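The three steps above can be sketched on a deliberately tiny stand-in IR. Everything here is hypothetical (ValueId, Expr, Inst and this flat Block are not the types from the sketch above); values are plain indices so that the insert/replace/remove flow stays visible.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(usize);

#[derive(Clone, Debug, PartialEq, Eq)]
enum Expr {
    Param,
    Const(u64),
    Add(ValueId, ValueId),
    Mul(ValueId, ValueId),
}

impl Expr {
    fn operands_mut(&mut self) -> Vec<&mut ValueId> {
        match self {
            Expr::Add(a, b) | Expr::Mul(a, b) => vec![a, b],
            Expr::Param | Expr::Const(_) => vec![],
        }
    }

    fn uses(&self, value: ValueId) -> bool {
        matches!(self, Expr::Add(a, b) | Expr::Mul(a, b) if *a == value || *b == value)
    }
}

#[derive(Debug, PartialEq, Eq)]
struct Inst {
    result: ValueId,
    expr: Expr,
}

#[derive(Debug, Default)]
struct Block {
    insts: Vec<Inst>,
    ret: Option<ValueId>,
    next_id: usize,
}

impl Block {
    /// Defines a fresh value by inserting `expr` at `index`.
    fn insert(&mut self, index: usize, expr: Expr) -> ValueId {
        let result = ValueId(self.next_id);
        self.next_id += 1;
        self.insts.insert(index, Inst { result, expr });
        result
    }

    fn push(&mut self, expr: Expr) -> ValueId {
        self.insert(self.insts.len(), expr)
    }

    fn position_of(&self, value: ValueId) -> usize {
        self.insts
            .iter()
            .position(|inst| inst.result == value)
            .expect("value is not defined in this block")
    }

    fn replace_all_uses(&mut self, old: ValueId, new: ValueId) {
        for inst in &mut self.insts {
            for operand in inst.expr.operands_mut() {
                if *operand == old {
                    *operand = new;
                }
            }
        }
        if self.ret == Some(old) {
            self.ret = Some(new);
        }
    }

    fn is_used(&self, value: ValueId) -> bool {
        self.ret == Some(value) || self.insts.iter().any(|inst| inst.expr.uses(value))
    }

    /// Removes the definition of `value` only if nothing uses it anymore.
    fn remove_if_dead(&mut self, value: ValueId) -> bool {
        if self.is_used(value) {
            return false;
        }
        let index = self.position_of(value);
        self.insts.remove(index);
        true
    }
}

fn rewrite_mul_by_two() -> Block {
    let mut block = Block::default();
    let y = block.push(Expr::Param);
    let two = block.push(Expr::Const(2));
    let x = block.push(Expr::Mul(y, two));
    block.ret = Some(x);

    // 1. Insert x' = y + y right after x = y * 2.
    let after_x = block.position_of(x) + 1;
    let x_prime = block.insert(after_x, Expr::Add(y, y));
    // 2. Replace all uses of x with x'.
    block.replace_all_uses(x, x_prime);
    // 3. x is now unused, so its defining instruction is dead.
    let removed = block.remove_if_dead(x);
    assert!(removed, "x should have no remaining uses");
    block
}
```

Note that no existing instruction is edited in place: the old definition is only removed once nothing refers to it. The constant 2 also becomes dead here and could be removed the same way.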

I believe that will solve the issues you have encountered with handling mutation of the IR. More generally, the above structure is significantly simpler, as you need far fewer concepts (and corresponding structs/enums).

The primary awkwardness here is around the fact that all of the borrow checking is pushed to runtime, so you have to take care to limit when/how long you borrow the underlying data for an entity, to avoid mutable aliasing (and thus a panic). In practice though, this isn't too difficult with the structure I've outlined above, as adding ops to a Block doesn't require holding a mutable reference to the containing Op. It could also be made more ergonomic if we could use nightly features, but this is still fairly easy to work with.
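A small sketch of what that runtime checking looks like in practice, using the same Rc<RefCell<T>> alias. Block is reduced to a list of names here, and both helper functions are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Entity<T> = Rc<RefCell<T>>;

#[derive(Default)]
struct Block {
    body: Vec<&'static str>,
}

/// Reports whether a mutable borrow is possible while a shared borrow of the
/// same entity is still alive. `borrow_mut` would panic in this situation;
/// `try_borrow_mut` returns an error instead.
fn can_mutate_while_reading(block: &Entity<Block>) -> bool {
    let _reader = block.borrow();
    block.try_borrow_mut().is_ok()
}

/// Keeps the mutable borrow inside an inner scope so it is released before
/// the entity is borrowed again.
fn push_scoped(block: &Entity<Block>, op: &'static str) -> usize {
    {
        let mut block = block.borrow_mut();
        block.body.push(op);
    } // mutable borrow released here
    block.borrow().body.len()
}
```

Keeping borrows short and scoped, as in push_scoped, is what avoids the panics in most cases.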

* fix translate.rs not translating bodies * expand fn and ev args, missing scoping * expand all arguments in function and evaluators * unpack vec in all cases * lookup arguments in the access_map if not found in bindings * stop unpacking call arguments, add Vector<Params> to bindings * insert accessors to bound Vector<Param> * fix Value Builder types * insert links in enums * added missing Child/Parent traits * Remove fn edit(self) on builders * Make translate_from_mir.rs compile * fix double borrow * Handle multiple parents * remove some warns * Adapt Visitor and Inlining * Update inlining and visitor * Fix translate of functions * visitor with pre-scan * Add unrolling2 + fmt * Add comments to passes * Add comment to translate.rs pass * Update Inlining and Unrolling passes Stilll some debug todo * Cargo fmt * Add TODOs * Builder derive macro * fix Builder derive macro compilation * Builder derive macro: fix Default recursion * Builder derive macro: fix mutability on non-link, fix transistion derive for Function * Builder derive macro: handle Vec<BackLink> + impl Sub * Builder derive macro: docs & fix + test >2 required fields case * Builder derive macro: derive on all structs * remived unused buggy api to remoce_child * mutability examples and potention solutions: one bug left * double borrow bug remaining * fix test_mutability_wrap_enum: missing clone * Fix doouble borrow in Link::update * fix singleton updates in final design * cleanup intermediary versions * fix module name * Translate - Split function / ev parameter translation * Fix for_node children, don't put None selector * Fix inlining - Needs mutability * Cleanup unrolling * cargo fmt * fix warns * Improve comments * Improve context setting in Unrolling * refactor checkpoint: wrap structs in Op/Root enums * Fix let translation * Add prints to unrolling * Add diagnostics to translate * swap with ir + fix Builder enumwrapper checkpoint: does not compile * Setup diagnostics for inlining and unrolling * 
converer bug: return from local reference * avoid double option in BackLink::to_link() * wrap all structs in enum: checkpoint: compiles, 81 tests pass * revert examples * fix missing ; * fix compilation: 43 tests pass * Remove warns, fix translate_from_mir.rs * Fix Inlining * Fix inlining after git merge inconsistency * Have parameter reference ref_node, and better handle evaluator params / args * Update inlining2.rs * Only inline Params that target the ref_node we aim * Improve diags on Evaluator args and params mismatch * Improve diags again * Reworked visit_if to work with selectors, kept visit_if_old * cargo clippy * Clippy fix * Clippy fix and remove prints * cargo fmt * cargo clippy + fmt * Remove prints * Builder: ignore fields that start with underscore * singleton converters * patch link * api for shared mutability * transform BackLink to Link when comparing and hashing singlton enum wrappers * swap *obj.borrow_mut() = value to obj.set(value) * use ptr as hashmap key to preserve 1 key per instance restore PartialEq Invariant for HashMap key * use pointer as HashMap keys * Some fixes for translate + inlining * fix BackLink fields' mutability in Builder derive macro * cargo fmt * fix Parameter.ref_node cyclic reference + restric to Owner + compare via get_ptr fix Parameter.ref_node cyclic reference: Parameter.ref_Node used to store a Link to its parent which caused PartialEq and Hash to loop. 
Replaced with a BackLink + restricted the field from Node -> Owner + expose inner get_ptr on Owner + comparison and hash via get_ptr * Update translate, mod and inlining * Fix nested evaluators * fix Parameter PartialEq and Hash to work on disconnected identical graphs * add debugging method to Op/Root + Link/BackLink * filter stale node wrappers in visitor * fix edgecase in `Link<Op>::set` * manual pointer review checks to verify Op::set logic * fix parameters * Fix let translation * Debug air-ir tests, begin codegen tests * Set program name in Mir * Fix mutability test --------- Co-authored-by: Thybault Alabarbe <thybault.alabarbe@gmail.com>

Soulthym · 2025-02-04T11:14:50Z

Hey @bitwalker,
I think I understand the design from a high-level perspective, and overall think it would be a very welcome upgrade to the current implementation.
I am however somewhat still confused about the following points:

Phi Nodes in Block

/// The value representation for block parameters. These represent distinct
/// SSA values, so that transformations within a block can be done without
/// having to know how the block is reached. These are equivalent
/// to Phi nodes in SSA literature, but in a more intuitive form.
pub struct BlockArgument {
    owner: WeakEntity<Block>,
    ty: MirType,
    users: UseList,
}

I'm not sure I understand the relationship with Phi nodes. I thought those reconciled the results of several Blocks in SSA-style IRs, i.e. that they represent a ternary selection between both Blocks, as in an if/else statement?

Regardless, I understand their usage as replacement targets with bound values during inlining/unrolling, so it shouldn't be much of an issue in practice, I am mostly asking in case I missed other cases that would behave differently.

parent, owner, and use_list

In the current design, we use the field owner to store the operations that use an Op accessed with the Parent trait, and do not have the corresponding block stored.

Given that, am I understanding the following correctly?
Starting from the current design and ending at yours, semantically:

an owner becomes a user in the operation's UseList
a parent becomes an owner as in the parent in the Mir graph, minus the owning block (cf next)
parent now stores the operation's owning Block(Function, Evaluator, For, If,...) as well in all cases.

Future progressive migration

Checklist before merging

Our design is passing all existing tests correctly, but still needs some improvements.

Our plan is to do the following, roughly in this order:

add span fields
propagate spans through transformations (translation AST -> Mir and passes)
document public APIs

Checklist after merging (to do according to the priority of the other milestone)

Once merged upstream, we plan to:

simplify the boilerplate
refactor/simplify complex parts:
- really use the visitor in passes. Currently passes re-implement dispatching instead of using the internal dispatcher.
- replace the lazy part of singletons by creating them during struct instantiation, simplifying their update mechanism (fewer cases to cover)
add a pass of constant propagation/folding in the IR (this is the cause of the only differences in codegen for Winterfell):
- remove useless nodes, such as multiplying by 1, adding 0, etc.
- simplify exponentiation where possible. For example: 2.pow(3) turns into 2*2*2 but what we really want is 8.
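To make the intent of such a pass concrete, here is a minimal folding sketch over a hypothetical expression tree. It is not the Mir representation, and it uses wrapping u64 arithmetic as a stand-in for field arithmetic, which the real pass would need instead.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
enum Expr {
    Const(u64),
    Var(&'static str),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Pow(Box<Expr>, u32),
}

/// Folds constants bottom-up and drops the neutral elements `+ 0` and `* 1`.
/// NOTE: u64 wrapping arithmetic is an assumption made for this sketch only.
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (Expr::Const(0), other) | (other, Expr::Const(0)) => other,
            (lhs, rhs) => Expr::Add(Box::new(lhs), Box::new(rhs)),
        },
        Expr::Mul(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (Expr::Const(1), other) | (other, Expr::Const(1)) => other,
            (lhs, rhs) => Expr::Mul(Box::new(lhs), Box::new(rhs)),
        },
        Expr::Pow(base, exp) => match fold(*base) {
            Expr::Const(b) => Expr::Const(b.wrapping_pow(exp)),
            base => Expr::Pow(Box::new(base), exp),
        },
        // Constants and variables are already in their simplest form.
        leaf => leaf,
    }
}
```

With this, 2.pow(3) folds directly to the constant 8 instead of being expanded to 2*2*2, and x * 1 + 0 collapses to x.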

Migrate in parallel

At this point we plan on working on the next steps, while migrating to your design progressively in the background.

Migration checklist:

add parents automatically within the builder pattern; they are currently not set in most places.
implement an alternative update mechanism that mutates the contents of parent structs instead of primitive enums (Op and Root), to match your design's behaviour.
migrate shared mutability usage to the new update mechanism
rename functions/methods/fields in our API to match yours where update behaviour is the same.
implement the Block abstraction, like it is in your design.
add owner references, like in your design.
Then, simultaneously:
- replace our uses of primitive enums (Op and Root, those that own the inner struct) with their respective object-safe dyn Traits
- remove our uses of wrapper enums (mainly Owner and Node, those that contain a BackLink to the primitive enum) as they can now be used by dyn Traits
- swap current remaining structs' implementation with your design.
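To illustrate the last step, here is a minimal sketch of what going from an owning enum to an object-safe trait could look like. The `Operation` trait, the `Constant` and `Add` structs, and `count_nodes` are hypothetical names chosen for this example; they are not the API of either design.

```rust
use std::rc::Rc;

/// Hypothetical object-safe trait standing in for the target design's operation interface.
trait Operation {
    /// Short human-readable label for the node.
    fn label(&self) -> String;
    /// Operands of this node, shared through reference counting.
    fn operands(&self) -> Vec<Rc<dyn Operation>>;
}

struct Constant(u64);

struct Add {
    lhs: Rc<dyn Operation>,
    rhs: Rc<dyn Operation>,
}

impl Operation for Constant {
    fn label(&self) -> String {
        format!("const({})", self.0)
    }
    fn operands(&self) -> Vec<Rc<dyn Operation>> {
        Vec::new()
    }
}

impl Operation for Add {
    fn label(&self) -> String {
        "add".to_string()
    }
    fn operands(&self) -> Vec<Rc<dyn Operation>> {
        vec![self.lhs.clone(), self.rhs.clone()]
    }
}

/// Counts the nodes of the tree rooted at `root` without matching on an enum:
/// new node kinds only need to implement `Operation`.
fn count_nodes(root: &dyn Operation) -> usize {
    1 + root
        .operands()
        .iter()
        .map(|op| count_nodes(op.as_ref()))
        .sum::<usize>()
}
```

The point of the sketch is that generic traversals such as `count_nodes` no longer need a `match` over every variant, which is what would let us drop the wrapper enums afterwards.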

I have updated previous checklists, marking items with ~~strikethrough~~ that became irrelevant with our latest modifications.

swap-ir-impl starting point

b401c73

Soulthym mentioned this pull request Dec 2, 2024

MassaLabs: Implement a Intermediate Representation to improve the compilation process 0xPolygonMiden/air-script#359

Draft

Soulthym and others added 16 commits December 2, 2024 16:07

use absolute paths in IsNode derive macro

80aeb77

binary_op: move Add, implement Mul and Sub

9be932c

fix IsNode derive macro module name

29740bf

fix name conflict with rust keywords in IsNode derive macro

a536f8c

blocks: Function, Evaluator, If, For, Fold

ee7a499

structured_op: moved Fold, Call

48b8192

Implement Visitor on new structure

8e82494

Add some fields to Graph to match previous API

12c3831

IsNode derive macro: support enums

c80352d

unary_ops: Boundary and Enf

82145b5

IsNode derive macro: support extra fields

IsLeaf derive macro for enums

297dc91

leaf nodes: SpannedMirVariable + Parameter

ba9a535

aggregated_op: Vector and Matrix

8bd6946

represent Matrix as Node of Vectors

065e8be

First steps on updating pretty printer

5b5739f

Start introducing new node structure in Mir

8e7812a

Leo-Besancon added 10 commits December 6, 2024 11:02

Update passes

2e99206

Implement Unrolling (needs Hash & Option/Vecs)

9d70fbd

Implement Lowering MIR > AIR

61b7d8a

Avoid referencing a func/ev that has not been defined

038e7d6

+ cargo fmt

Update Unrolling and Inlining with helper

7275311

Comment out Old passes

67a423c

Cleaned up a bit

51fc61b

Add Access node usage to MIR

07bb97d

bugfix

9722ae5

Finished duplicate_node helper

6551843

Leo-Besancon and others added 9 commits December 10, 2024 09:06

rename Access => Accessor

a0ca0a6

Fix Accessor unrolling

7199c7e

cargo fmt

8e06ab2

Small bugfixes in Unrolling for Accessor

e5f0111

Merge pull request #6 from massalabs/update_passes

1bc73aa

Update passes

Cargo format and make Lowering Mir > Air build

f1330df

Clean both pipelines, removed unused structures

2dd46ea

removed: unused derive_graph, allow unused

62f1c8d

accept non-vectors as For.iterators

3d57fa7

Leo-Besancon and others added 15 commits January 29, 2025 09:26

cargo fmt

524c175

removed unused files

134f744

rename passes files

bd51468

Cargo check fixes

ecff9a4

cargo clippy fixes

f98d36f

Improve Accessor of trace columns handling

0982113

Add Mir Exp Node, fix Accessor logic

ec5acec

Update expected codegen results (equivalent, but less optimized)

a9cfcc3

Fix functions_complex and evaluator codegen tests

1cdce6c

fix uneeded unwrap

85d30f6

Make test pass

b805077

cargo clippy fixes

a7e170d

cargo fmt

57c8267

Defaults to Mir pipeline in CLI, expose CLI arg to bypass Mir

be67019
