swap-ir-impl starting point #4
base: feat-implement-ir
IsNode derive macro: support extra fields
@Soulthym I just realized that since this PR is in your fork, and not the 0xPolygonMiden repo, I can't leave review comments. Not sure if it's easier to open a PR stacked on 0xPolygonMiden/air-script#359 with these changes, or add me as a collaborator to this repo for reviews, but I'll have to wait for one or the other to provide my review feedback. Just let me know which!
Hey @bitwalker, we just added you as a contributor to our fork. We hope that solves the issue.
* derive Hash for SpannedMirValue
* ir3: better node splitting, no macro, simpler api; missing builder pattern
* replaced Link<Op> by Link<Owner> for Operations and Leaves
* added IndexAccess Operand
* renamed IndexAccess to Accessor; re-use parser's AccessType; impl Hash for AccessType
* fix call being copied instead of pointed to by get_children()
* typesafe builder pattern for most ops. Missing structured ops
* type-safe Builder pattern for structured ops
* MirGraph Builder editing
* Add Leaves to Op enum
* ir3: comment top level items
* isolate Leaf, Op, Owner, Root
* add converters to enums
* added converters to structs; renamed IndexAccess to Accessor
* Initial merge ir3 > update_passes
* remove comments
* Update mod.rs
* Graph: implement public api
* Swapped in helpers of passes/mod.rs
* Node: all node types + converters
* Full inlining with Node + cargo fmt
* Remove Link for FoldOperator
* Swap unrolling
* Swap lowering Mir -> Air
* Update visitor (with suboptimal get_children() helper, to refactor)
* Add MirType to Parameter
* Update inlining to handle evaluators
* most of translated adapted
* add type support for Parameter in translate
* in build_statement, handle Owner::None (integrity or boundary constraint)
---------
Co-authored-by: Thybault Alabarbe <thybault.alabarbe@gmail.com>
Update passes
Hey @bitwalker, We have some tests passing on the end2end pipeline, missing the lowering pass ones. Missing features:
Based on the mir crate tests, here is what is currently missing/broken:
Edit as of 13 Dec.: all current tests are fixed by relaxing constraints for iterables. Once the API is stabilized, we will refactor the boilerplate. The updated check-list was moved to this comment: #4 (comment)
Hey @bitwalker, here is a recap of the work left to do. The latest version is available on a new branch of this fork: https://github.com/massalabs/air-script/tree/fix-ir-translate
Currently, the tests in the mir package (
There are currently 3 identified issues left to work on, plus the features not marked as resolved above.
Parameter/argument unpacking:
Parameter unpacking for
Mutability is not really handled through enums and conversions.
Mutating one occurrence does not mutate the others, because conversion is done by cloning the underlying structure and wrapping the clone in a new Link.
enum Op {
Add(Add),
...
}
becomes
enum Op {
Add(Link<Add>)
...
}
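To make the difference between the two enum shapes concrete, here is a minimal sketch. It assumes `Link<T>` behaves like `Rc<RefCell<T>>` (the real `Link` in the fork may differ), and the `OpByValue`/`OpByLink` names, the `Add` fields, and both helper functions are hypothetical, introduced only for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Assumption: a plain alias standing in for the fork's `Link` type.
type Link<T> = Rc<RefCell<T>>;

// Hypothetical, simplified operation payload.
#[derive(Clone, Debug, PartialEq)]
struct Add {
    lhs: u32,
    rhs: u32,
}

// The variant stores the struct by value, so cloning the enum clones the data.
#[derive(Clone)]
enum OpByValue {
    Add(Add),
}

// The variant stores a Link, so cloning the enum only copies the pointer.
#[derive(Clone)]
enum OpByLink {
    Add(Link<Add>),
}

/// Mutates a clone of a by-value op and reports whether the original changed.
fn by_value_is_shared() -> bool {
    let original = OpByValue::Add(Add { lhs: 1, rhs: 2 });
    let mut copy = original.clone();
    let OpByValue::Add(add) = &mut copy;
    add.lhs = 10;
    let OpByValue::Add(orig) = &original;
    orig.lhs == 10
}

/// Mutates through a clone of a linked op and reports whether the original changed.
fn by_link_is_shared() -> bool {
    let original = OpByLink::Add(Rc::new(RefCell::new(Add { lhs: 1, rhs: 2 })));
    let copy = original.clone();
    let OpByLink::Add(add) = &copy;
    add.borrow_mut().lhs = 10;
    let OpByLink::Add(orig) = &original;
    let shared = orig.borrow().lhs == 10;
    shared
}
```

With the by-value variant the mutation stays local to the clone; with the linked variant every holder of the enum observes it, which is the behavior the change is after.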
Rework passes
After the previous is done, we will be able to work on:
cleanup the codebase
The updated check-list was moved to this comment: #4 (comment)
Hey, For some context about why this change was needed, the current implementation has the following limitations:
Proposed solutions:
1) wrap struct in enums by default
main caveat:
2) Store Singletons
make enum wrappers singletons, so that they are shared across all instances of the struct.
main caveat:
A minimum working example of the proposed solution is available Here, with a patched Link implementation
Identified codebase changes to implement the solution:
The updated check-list was moved to this comment: #4 (comment)
I'm a bit concerned about the overall complexity of things here, largely resulting from the way What do you think about simplifying this a bit, along the lines of the following approach:
In short, you'd generally be operating on either an
To sketch this out in more concrete terms, we'd end up with something that looks a bit like this:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
pub type Entity<T> = Rc<RefCell<T>>;
pub type WeakEntity<T> = Weak<RefCell<T>>;
// The [Op] trait represents common behavior/actions over operations in the IR.
//
// NOTE: This trait definition omits some obviously useful methods to avoid
// too much clutter in this example. The key thing is that the trait must
// remain object-safe.
pub trait Op {
// Operations with a parent must belong to a Block
fn parent(&self) -> Option<Entity<Block>>;
// This is needed to add/remove operations from a Block generically
fn set_parent(&mut self, parent: Option<WeakEntity<Block>>);
// A convenience for accessing the parent Op of the containing Block
fn parent_op(&self) -> Option<Entity<dyn Op>> {
let block = self.parent()?;
Some(block.owner())
}
// Get access to the argument vector for this operation
fn arguments(&self) -> &[Entity<dyn Value>];
// Get mutable access to the argument vector for this operation
fn arguments_mut(&mut self) -> &mut [Entity<dyn Value>];
// Get the result defined by this operation, if it produces one
fn result(&self) -> Option<Entity<dyn Value>>;
fn body(&self) -> Option<Entity<Block>>;
fn has_body(&self) -> bool;
fn is_primitive(&self) -> bool {
!self.has_body()
}
fn has_uses(&self) -> bool {
if let Some(value) = self.result() {
value.is_used()
} else {
// If an op is non-primitive, we treat it as presumptively live.
!self.is_primitive()
}
}
}
// You can add convenience methods here making them available for all
// dyn Op references, whether &dyn Op, or Entity<dyn Op>, etc.
impl dyn Op {
/// Insert this operation at the end of `block`
pub fn insert_at_end(self: Entity<Self>, block: Entity<Block>) {
assert!(self.parent().is_none(), "op is already attached to a block");
{
let mut op = self.borrow_mut();
op.set_parent(Some(Entity::downgrade(&block)));
}
let mut block = block.borrow_mut();
block.push(self);
}
/// Replaces any uses of `value` with `replacement` in the argument vector
/// of this operation. It could be further extended to also visit any nested
/// operations (if this op has a body).
pub fn replace_uses_of(self: Entity<Self>, value: Entity<dyn Value>, replacement: Entity<dyn Value>) {
let mut op = self.borrow_mut();
for (i, arg) in op.arguments_mut().iter_mut().enumerate() {
if Entity::ptr_eq(arg, &value) {
let user = User {
user: Entity::downgrade(&self),
index: i,
};
arg.remove_user(&user);
*arg = replacement.clone();
arg.add_user(user);
}
}
}
}
// NOTE: An example of a structured/non-primitive operation with a block
pub struct Function {
/// The function name
pub symbol: Ident,
/// The function type signature
pub signature: FunctionType,
/// The function body
///
/// NOTE: The block argument types must match the signature
pub body: Entity<Block>,
}
impl Function {
pub fn new(symbol: Ident, signature: FunctionType) -> Entity<Self> {
Entity::new_cyclic(move |this| {
let body = Block::new(this, &signature.arguments);
Self {
symbol,
signature,
body,
}
})
}
}
impl Op for Function {
fn parent(&self) -> Option<Entity<Block>> {
// Functions are always top-level
None
}
fn arguments(&self) -> &[Entity<dyn Value>] {
// A function op never has arguments
&[]
}
fn result(&self) -> Option<Entity<dyn Value>> {
// A function never produces a result, only `call` does
None
}
fn body(&self) -> Option<Entity<Block>> {
Some(self.body.clone())
}
fn has_body(&self) -> bool {
true
}
}
/// The operation used to invoke [Function] operations (also [Evaluator]).
///
/// NOTE: This is an example of a primitive operation
pub struct Call {
parent: Option<WeakEntity<Block>>,
/// The function/evaluator to call
///
/// This is dyn Op here, because both evaluators and functions are callable
callee: Entity<dyn Op>,
/// The values to use as arguments for the callee. These must match the
/// callee type signature.
///
/// NOTE: Ops do not own the individual arguments, but do hold strong refs
/// to them.
args: Vec<Entity<dyn Value>>,
/// The result of the function/evaluator call. Evaluators produce none,
/// while functions always produce one.
///
/// NOTE: Ops own their results
result: Option<Entity<dyn Value>>,
}
impl Call {
pub fn new<I>(callee: Entity<dyn Op>, args: I) -> Entity<Self>
where
I: IntoIterator<Item = Entity<dyn Value>>,
{
Entity::new_cyclic(|this| {
let result_ty = Self::infer_result_type(&callee);
let result = result_ty.map(|ty| OpResult::new(this, ty));
Self {
parent: None,
callee,
args: args.into_iter().collect(),
result,
}
})
}
}
impl Op for Call {
fn parent(&self) -> Option<Entity<Block>> {
// `Weak::upgrade` already returns an `Option`, so no conversion is needed
WeakEntity::upgrade(self.parent.as_ref()?)
}
fn arguments(&self) -> &[Entity<dyn Value>] {
&self.args
}
fn result(&self) -> Option<Entity<dyn Value>> {
self.result.clone()
}
fn body(&self) -> Option<Entity<Block>> {
None
}
fn has_body(&self) -> bool {
false
}
}
/// A special type used for type checking function-like operations
///
/// It is not a member of MirType.
pub struct FunctionType {
pub arguments: Vec<MirType>,
pub result: MirType,
}
/// Represents a basic block with arguments
pub struct Block {
owner: WeakEntity<dyn Op>,
/// The parameter list of the block. Each of these values represents an
/// SSA value definition. Operations within a Block can reference values
/// outside the block, and thus block arguments can be elided, however that
/// is not permitted from the body of a [Function] or [Evaluator], without
/// first cloning the Block so that it can be inlined at a callsite.
pub arguments: Vec<Entity<BlockArgument>>,
pub body: Vec<Entity<dyn Op>>,
}
impl Block {
pub fn new(owner: WeakEntity<dyn Op>, arguments: &[MirType]) -> Entity<Self> {
Entity::new_cyclic(|this| {
let arguments = arguments.iter().map(|arg| {
BlockArgument::new(this.clone(), arg.clone())
}).collect();
Self {
owner,
arguments,
body: vec![],
}
})
}
/// Callers must ensure that `op` has parent set to this block, and that it
/// was not already attached to some other block.
pub fn push(&mut self, op: Entity<dyn Op>) {
self.body.push(op);
}
pub fn owner(&self) -> Entity<dyn Op> {
WeakEntity::upgrade(&self.owner).expect("stale owner reference")
}
}
pub trait Value {
// NOTE: A value always has an owning op, but is internally stored as Weak.
// Here, ownership refers to the entity that defines the value, never an
// entity that uses a value.
fn owner(&self) -> Entity<dyn Op>;
fn ty(&self) -> MirType;
fn users(&self) -> &UseList;
fn users_mut(&mut self) -> &mut UseList;
fn is_used(&self) -> bool {
!self.users().is_empty()
}
fn remove_user(&mut self, user: &User) {
self.users_mut().remove(user);
}
fn add_user(&mut self, user: User) {
self.users_mut().insert(user);
}
}
/// Represents the set of users of some value definition
///
/// This information is used to trace from uses to defs, and to determine if
/// a given value is used. An operation without side effects that has no uses
/// of its result(s), can be considered dead and stripped from the program.
#[derive(Default, Clone)]
pub struct UseList {
users: Vec<User>,
}
impl UseList {
/// Indicates whether the containing value has any uses
pub fn is_empty(&self) -> bool {
self.users.is_empty()
}
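/// Remove a user from the list
pub fn remove(&mut self, user: &User) {
// `Vec::remove` takes an index, so filter by identity instead:
// same using op (compared by pointer) and same argument index.
self.users.retain(|u| !(u.index == user.index && WeakEntity::ptr_eq(&u.user, &user.user)));
}
/// Add a new user to the list, if that use is not already present
pub fn insert(&mut self, user: User) {
let present = self.users.iter().any(|u| u.index == user.index && WeakEntity::ptr_eq(&u.user, &user.user));
if !present {
self.users.push(user);
}
}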
pub fn iter(&self) -> impl Iterator<Item = &User> {
self.users.iter()
}
}
#[derive(Clone)]
pub struct User {
/// The using operation
pub user: WeakEntity<dyn Op>,
/// The index of the use in the operation's argument vector
pub index: usize,
}
/// The value representation for block parameters. These represent distinct
/// SSA values, so that transformations within a block can be done without
/// having to know how the block is reached. These are equivalent
/// to Phi nodes in SSA literature, but in a more intuitive form.
pub struct BlockArgument {
owner: WeakEntity<Block>,
ty: MirType,
users: UseList,
}
impl BlockArgument {
pub fn new(owner: WeakEntity<Block>, ty: MirType) -> Entity<Self> {
Entity::new(Self {
owner,
ty,
users: UseList::default(),
})
}
}
impl Value for BlockArgument {
fn owner(&self) -> Entity<dyn Op> {
let block = WeakEntity::upgrade(&self.owner).expect("stale owner reference");
block.owner()
}
fn ty(&self) -> MirType {
self.ty.clone()
}
fn users(&self) -> &UseList {
&self.users
}
fn users_mut(&mut self) -> &mut UseList {
&mut self.users
}
}
/// The value representation for operation results. Much like BlockArgument,
/// these define new SSA values.
pub struct OpResult {
owner: WeakEntity<dyn Op>,
ty: MirType,
users: UseList,
}
impl OpResult {
pub fn new(owner: WeakEntity<dyn Op>, ty: MirType) -> Entity<Self> {
// Wrap the value in an entity, as `BlockArgument::new` does
Entity::new(Self {
owner,
ty,
users: UseList::default(),
})
}
}
impl Value for OpResult {
fn owner(&self) -> Entity<dyn Op> {
WeakEntity::upgrade(&self.owner).expect("stale owner reference")
}
fn ty(&self) -> MirType {
self.ty.clone()
}
fn users(&self) -> &UseList {
&self.users
}
fn users_mut(&mut self) -> &mut UseList {
&mut self.users
}
}
/// An example of a custom value type
pub struct TraceAccessBinding {
owner: WeakEntity<dyn Op>,
users: UseList,
pub segment: Rc<TraceSegment>,
/// The offset to the first column of the segment which is bound by this binding
pub offset: usize,
/// The number of columns which are bound
pub size: usize,
}
impl Value for TraceAccessBinding {
fn owner(&self) -> Entity<dyn Op> {
WeakEntity::upgrade(&self.owner).expect("stale owner reference")
}
fn ty(&self) -> MirType {
// A binding over several columns is a vector; a single column is a felt
if self.size > 1 {
MirType::Vector(self.size)
} else {
MirType::Felt
}
}
fn users(&self) -> &UseList {
&self.users
}
fn users_mut(&mut self) -> &mut UseList {
&mut self.users
}
}
You can then define some convenience builder APIs:
pub struct FunctionBuilder {
function: Entity<Function>,
block: Entity<Block>,
}
impl From<Entity<Function>> for FunctionBuilder {
fn from(function: Entity<Function>) -> Self {
let block = function.borrow().body.clone();
Self {
function,
block,
}
}
}
impl FunctionBuilder {
pub fn new(symbol: Ident, signature: FunctionType) -> Self {
let function = Function::new(symbol, signature);
Self::from(function)
}
pub fn get_argument(&self, index: usize) -> Entity<dyn Value> {
let block = self.block.borrow();
// `arguments` is a field of `Block`, not a method
block.arguments[index].clone()
}
pub fn ins(&mut self) -> InstBuilder<'_> {
let owner = Entity::downgrade(&self.block);
InstBuilder {
owner,
block: self.block.borrow_mut(),
}
}
pub fn build(self) -> Entity<Function> {
self.function
}
}
pub struct InstBuilder<'f> {
owner: WeakEntity<Block>,
block: std::cell::RefMut<'f, Block>,
}
impl InstBuilder<'_> {
/// This function returns a reference to the op because it may or may not
/// produce a result, depending on what is being called.
pub fn call<I>(&mut self, callee: Entity<dyn Op>, args: I) -> Entity<Call>
where
I: IntoIterator<Item = Entity<dyn Value>>,
{
let call = Call::new(callee, args);
self.insert_at_end(call.clone());
call
}
/// This function returns a value reference, because it is a primitive op
/// that always produces a single result.
pub fn add(&mut self, lhs: Entity<dyn Value>, rhs: Entity<dyn Value>) -> Entity<dyn Value> {
assert_eq!(lhs.ty(), rhs.ty(), "mismatched value types for binary operator");
let add: Entity<dyn Op> = Add::new(lhs, rhs);
let result = add.borrow().result().unwrap();
self.insert_at_end(add);
result
}
/// Block terminators need not return anything, but are expected to always
/// be the last instruction in a block.
pub fn ret(&mut self, result: Option<Entity<dyn Value>>) {
self.insert_at_end(Ret::new(result));
}
fn insert_at_end(&mut self, op: Entity<dyn Op>) {
{
let mut op = op.borrow_mut();
op.set_parent(Some(self.owner.clone()));
}
self.block.push(op);
}
}
Those end up getting used something like this when constructing IR:
// Construct IR for a function defined as:
//
// fn adder(a: felt, b: felt) -> felt {
// a + b
// }
let mut fb = FunctionBuilder::new("adder".into(), FunctionType::new([MirType::Felt, MirType::Felt], MirType::Felt));
let a = fb.get_argument(0);
let b = fb.get_argument(1);
let c = fb.ins().add(a, b);
fb.ins().ret(Some(c));
let adder = fb.build();
A key aspect of working with this IR structure is that the connections between operations are represented completely in terms of SSA values, and they are never mutated, only created (defined) or replaced/destroyed. So let's say that you want to replace something like
I believe that will solve the issues you have encountered with handling mutation of the IR. More generally, the above structure is significantly simpler, as you need far fewer concepts (and corresponding structs/enums). The primary awkwardness here is around the fact that all of the borrow checking is pushed to runtime, so you have to take care to limit when/how long you borrow the underlying data for an entity, to avoid mutable aliasing (and thus a panic). In practice though, this isn't too difficult with the structure I've outlined above, as adding ops to a Block doesn't require holding a mutable reference to the containing Op. It could also be made more ergonomic if we could use nightly features, but this is still fairly easy to work with.
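The point about runtime borrow checking can be illustrated with a small sketch. It assumes the `Entity<T> = Rc<RefCell<T>>` alias from the comment above; the simplified `Block` (a list of op names) and the two helper functions are hypothetical and not part of the proposed API.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Entity<T> = Rc<RefCell<T>>;

// Hypothetical, simplified block: just a list of op names.
struct Block {
    body: Vec<String>,
}

/// Pushes an op while keeping the mutable borrow confined to an inner scope,
/// then reads the length back through a fresh shared borrow.
fn push_op(block: &Entity<Block>, name: &str) -> usize {
    {
        let mut b = block.borrow_mut();
        b.body.push(name.to_string());
    } // the mutable borrow is released here
    let len = block.borrow().body.len();
    len
}

/// Returns true if a second mutable borrow is rejected while one is live.
/// `try_borrow_mut` reports the conflict as an `Err` instead of panicking.
fn overlapping_borrow_is_rejected(block: &Entity<Block>) -> bool {
    let _guard = block.borrow_mut();
    let rejected = block.try_borrow_mut().is_err();
    rejected
}
```

Scoping each `borrow_mut()` tightly, as in `push_op`, is usually enough; `try_borrow_mut` is useful in places where a conflict is plausible and a panic would be unwelcome.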
* fix translate.rs not translating bodies
* expand fn and ev args, missing scoping
* expand all arguments in function and evaluators
* unpack vec in all cases
* lookup arguments in the access_map if not found in bindings
* stop unpacking call arguments, add Vector<Params> to bindings
* insert accessors to bound Vector<Param>
* fix Value Builder types
* insert links in enums
* added missing Child/Parent traits
* Remove fn edit(self) on builders
* Make translate_from_mir.rs compile
* fix double borrow
* Handle multiple parents
* remove some warns
* Adapt Visitor and Inlining
* Update inlining and visitor
* Fix translate of functions
* visitor with pre-scan
* Add unrolling2 + fmt
* Add comments to passes
* Add comment to translate.rs pass
* Update Inlining and Unrolling passes; still some debug todo
* Cargo fmt
* Add TODOs
* Builder derive macro
* fix Builder derive macro compilation
* Builder derive macro: fix Default recursion
* Builder derive macro: fix mutability on non-link, fix transition derive for Function
* Builder derive macro: handle Vec<BackLink> + impl Sub
* Builder derive macro: docs & fix + test >2 required fields case
* Builder derive macro: derive on all structs
* removed unused buggy api to remove_child
* mutability examples and potential solutions: one bug left
* double borrow bug remaining
* fix test_mutability_wrap_enum: missing clone
* Fix double borrow in Link::update
* fix singleton updates in final design
* cleanup intermediary versions
* fix module name
* Translate - Split function / ev parameter translation
* Fix for_node children, don't put None selector
* Fix inlining - Needs mutability
* Cleanup unrolling
* cargo fmt
* fix warns
* Improve comments
* Improve context setting in Unrolling
* refactor checkpoint: wrap structs in Op/Root enums
* Fix let translation
* Add prints to unrolling
* Add diagnostics to translate
* swap with ir + fix Builder enumwrapper checkpoint: does not compile
* Setup diagnostics for inlining and unrolling
* converter bug: return from local reference
* avoid double option in BackLink::to_link()
* wrap all structs in enum: checkpoint: compiles, 81 tests pass
* revert examples
* fix missing ;
* fix compilation: 43 tests pass
* Remove warns, fix translate_from_mir.rs
* Fix Inlining
* Fix inlining after git merge inconsistency
* Have parameter reference ref_node, and better handle evaluator params / args
* Update inlining2.rs
* Only inline Params that target the ref_node we aim
* Improve diags on Evaluator args and params mismatch
* Improve diags again
* Reworked visit_if to work with selectors, kept visit_if_old
* cargo clippy
* Clippy fix
* Clippy fix and remove prints
* cargo fmt
* cargo clippy + fmt
* Remove prints
* Builder: ignore fields that start with underscore
* singleton converters
* patch link
* api for shared mutability
* transform BackLink to Link when comparing and hashing singleton enum wrappers
* swap *obj.borrow_mut() = value to obj.set(value)
* use ptr as hashmap key to preserve 1 key per instance; restore PartialEq Invariant for HashMap key
* use pointer as HashMap keys
* Some fixes for translate + inlining
* fix BackLink fields' mutability in Builder derive macro
* cargo fmt
* fix Parameter.ref_node cyclic reference + restrict to Owner + compare via get_ptr. Parameter.ref_node used to store a Link to its parent, which caused PartialEq and Hash to loop. Replaced with a BackLink + restricted the field from Node -> Owner + expose inner get_ptr on Owner + comparison and hash via get_ptr
* Update translate, mod and inlining
* Fix nested evaluators
* fix Parameter PartialEq and Hash to work on disconnected identical graphs
* add debugging method to Op/Root + Link/BackLink
* filter stale node wrappers in visitor
* fix edgecase in `Link<Op>::set`
* manual pointer review checks to verify Op::set logic
* fix parameters
* Fix let translation
* Debug air-ir tests, begin codegen tests
* Set program name in Mir
* Fix mutability test
---------
Co-authored-by: Thybault Alabarbe <thybault.alabarbe@gmail.com>
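Several of the commits above switch HashMap keys and comparisons to pointer identity so that cyclic references cannot make `PartialEq` or `Hash` loop. A minimal sketch of that idea follows; `ByPtr`, `Node`, and `count_distinct` are hypothetical names, and the fork's actual `get_ptr`-based implementation may look different.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

// Hypothetical node type; in a real graph it could sit on a reference cycle.
struct Node {
    name: String,
}

/// Wrapper that compares and hashes by allocation address, never by contents,
/// so neither operation can recurse through a cyclic graph.
#[derive(Clone)]
struct ByPtr(Rc<RefCell<Node>>);

impl PartialEq for ByPtr {
    fn eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for ByPtr {}

impl Hash for ByPtr {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the address of the shared allocation.
        Rc::as_ptr(&self.0).hash(state);
    }
}

/// Counts distinct node instances, treating structurally equal but separately
/// allocated nodes as different keys.
fn count_distinct(nodes: &[Rc<RefCell<Node>>]) -> usize {
    let mut seen: HashMap<ByPtr, usize> = HashMap::new();
    for node in nodes {
        *seen.entry(ByPtr(node.clone())).or_insert(0) += 1;
    }
    seen.len()
}
```

A side effect of this choice is that a key stays valid after the node it points to is mutated, since the address does not change; the trade-off is that two structurally identical but disconnected graphs no longer compare equal.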
Hey @bitwalker,
Phi Nodes in Block
I'm not sure I understand the relationship with Phi nodes. I thought those reconciled Regardless, I understand their usage as replacement targets with bound values during inlining/unrolling, so it shouldn't be much of an issue in practice, I am mostly asking in case I missed other cases that would behave differently.
parent, owner, and use_list
In the current design, we use the field Given that, am I understanding the following correctly?
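As a reference point for the parent/owner discussion, here is a minimal sketch of a weak owner back-reference built with `Rc::new_cyclic`, in the spirit of the `Function`/`Block` sketch earlier in the thread. The struct fields and the `new_function`/`owner_name` helpers are simplified, hypothetical stand-ins, not the proposed API.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical, simplified function op that strongly owns its body block.
struct Function {
    name: String,
    body: Rc<RefCell<Block>>,
}

// The block refers back to its owner weakly, so no reference cycle keeps
// the function alive.
struct Block {
    owner: Weak<RefCell<Function>>,
    ops: Vec<String>,
}

/// Builds a function whose body block already knows its owner.
fn new_function(name: &str) -> Rc<RefCell<Function>> {
    Rc::new_cyclic(|this: &Weak<RefCell<Function>>| {
        let body = Rc::new(RefCell::new(Block {
            owner: this.clone(),
            ops: Vec::new(),
        }));
        RefCell::new(Function {
            name: name.to_string(),
            body,
        })
    })
}

/// Resolves the owner's name, or `None` if the owner has been dropped.
fn owner_name(block: &Rc<RefCell<Block>>) -> Option<String> {
    let owner = block.borrow().owner.upgrade()?;
    let name = owner.borrow().name.clone();
    Some(name)
}
```

Because the back-reference is weak, dropping the last strong handle to the function frees it even while a block handle survives, and the block then sees a stale owner rather than leaking memory.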
Future progressive migration
Checklist before merging
Our design is passing all existing tests correctly, but still needs some improvements. Our plan is to do the following, roughly in this order:
Checklist after merging (to do according to the priority of the other milestone)
Once merged upstream, we plan to:
Migrate in parallel
At this point we plan on working on the next steps, while migrating to your design progressively in the background.
Migration checklist:
I have updated previous checklists, marking items with
Define missing operation types: now under ir2/nodes/...
Variable
Parameter
Implement Spanned trait for nodes:
Helper traits are available (IsNode, NotNode, IsGraph, NotGraph, IsLeaf, NotLeaf)
And in reverse.
Match graph/mod.rs api, now under ir2/graph.rs
Pretty printer (for better SSA visualization): uncommitted - waiting for all nodes to match
insert_op_* -> new_*().add_child() or Add::new(parent, lhs, rhs)
Redefine visitor pattern with the new structure
Redefine remaining APIs referencing NodeIndex, replace them with Link<NodeType>
Convert passes to use ir2 and updated visitor pattern
Add validation to the builder pattern
Maybe isolate builder pattern into its own trait
The updated check-list was moved to this comment: #4 (comment)