Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oxidize commutative cancellation #13091

Merged
merged 11 commits into from
Sep 10, 2024

Conversation

sbrandhsn
Copy link
Contributor

@sbrandhsn sbrandhsn commented Sep 5, 2024

Summary

fixes #12270

Details and comments

This replaces the Python implementation of CommutativeCancellation with a Rust implementation:

  • As the Rust implementation does not use the PassManager to run its prerequisite CommutationAnalysis, the test case has been test_callback_with_pass_requires fixed. We might be able to remove that test case as the check is already covered in
  • The global phase was probably incorrectly set in the Python implementation in
    dag.global_phase = total_phase - new_op_phase
    , I changed this in the Rust implementation
  • Currently, control-flow operations are handled in Python - I'm looking into a Rust implementation for that too

Speedups (this branch vs Qiskit 1.2)

![image](
clifford_circuits
random_circuits
rotations_circuits)

Top to bottom Clifford circuits, random circuits, random circuits with rotation gates only

Up to 5.5x speedup for Clifford circuits where the commutation relation of two gates is always determined by a look-up in an a priori constructed commutation library. Up to 7.0 speedup for random circuits as generated by random_circuit and up to 8x speedup when restricting that random circuit to rotational gates only. Note this benchmark compares end-to-end performance as given by

            pm = PassManager([CommutativeCancellation(target=be.target)])
            start_ts = time.time()
            pm.run(qc)
            end_ts = time.time()

instead of only comparing a python implementation of CommutativeCancellation with underlying Rust code vs a rust implementation of CommutativeCancellation with the same underlying code.

@sbrandhsn sbrandhsn requested a review from a team as a code owner September 5, 2024 10:08
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core
  • @kevinhartman
  • @mtreinish

@coveralls
Copy link

coveralls commented Sep 5, 2024

Pull Request Test Coverage Report for Build 10797636548

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 170 of 190 (89.47%) changed or added relevant lines in 7 files are covered.
  • 127 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.04%) to 89.134%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/commutation_cancellation.rs 155 175 88.57%
Files with Coverage Reduction New Missed Lines %
qiskit/synthesis/one_qubit/one_qubit_decompose.py 1 70.67%
qiskit/transpiler/passes/synthesis/unitary_synthesis.py 2 88.26%
crates/qasm2/src/lex.rs 5 92.48%
crates/qasm2/src/parse.rs 6 97.61%
qiskit/synthesis/two_qubit/xx_decompose/decomposer.py 6 90.84%
crates/circuit/src/dag_node.rs 11 78.55%
qiskit/circuit/quantumcircuit.py 96 93.25%
Totals Coverage Status
Change from base Build 10774677144: -0.04%
Covered Lines: 73092
Relevant Lines: 82002

💛 - Coveralls

@sbrandhsn sbrandhsn added performance Rust This PR or issue is related to Rust code in the repository labels Sep 5, 2024
@sbrandhsn sbrandhsn added this to the 1.3 beta milestone Sep 5, 2024
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this, this has the potential to improve performance quite a bit, not only by speeding up the cancellation, but also by avoiding all the python overhead for the analysis. I've done a quick first pass over the code. I left a few inline comments, besides those the big thing I think we want to be careful for here is making sure we don't panic needlessly and instead return Python exceptions when there is a possible error condition. Right now the code seems a bit too likely to just call panic!() which won't be use actionable or catchable and also exits hard.

Comment on lines 83 to 149
(0..dag.num_qubits() as u32).for_each(|qubit| {
let wire = Qubit(qubit);
if let Some(wire_commutation_set) = commutation_set.get(&Wire::Qubit(wire)) {
wire_commutation_set
.iter()
.enumerate()
.for_each(|(com_set_idx, com_set)| {
// This ensures that we only have DAGOPNodes in the current com_set, yuck...
if let NodeType::Operation(_node0) = &dag.dag[*com_set.first().unwrap()] {
com_set.iter().for_each(|node| {
let op = match &dag.dag[*node] {
NodeType::Operation(instr) => instr,
_ => panic!("Unexpected type in commutation set."),
};
let num_qargs = dag.get_qargs(op.qubits).len();
// no support for cancellation of parameterized gates
if op
.params_view()
.iter()
.all(|p| !matches!(p, Param::ParameterExpression(_)))
{
let op_name = op.op.name().to_string();
if num_qargs == 1usize && _gates.contains(op_name.as_str()) {
single_q_cancellation_sets
.entry((op_name.clone(), wire, com_set_idx))
.or_insert_with(Vec::new)
.push(*node);
}

if num_qargs == 1usize && _z_rotations.contains(op_name.as_str()) {
single_q_cancellation_sets
.entry((Z_ROTATION.to_string(), wire, com_set_idx))
.or_insert_with(Vec::new)
.push(*node);
}
if num_qargs == 1usize && _x_rotations.contains(op_name.as_str()) {
single_q_cancellation_sets
.entry((X_ROTATION.to_string(), wire, com_set_idx))
.or_insert_with(Vec::new)
.push(*node);
}
// Don't deal with Y rotation, because Y rotation doesn't commute with
// CNOT, so it should be dealt with by optimized1qgate pass
if num_qargs == 2usize
&& dag.get_qargs(op.qubits).first().unwrap() == &wire
{
let second_qarg = dag.get_qargs(op.qubits)[1];
let q2_key = (
op_name,
wire,
second_qarg,
com_set_idx,
*node_indices
.get(&(*node, Wire::Qubit(second_qarg)))
.unwrap(),
);
two_q_cancellation_sets
.entry(q2_key)
.or_insert_with(Vec::new)
.push(*node);
}
}
})
}
})
}
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious and sketched out using a parallel iterator chain here to build the cancellation sets hashmap using the above suggestion on changing the types:

    let cancellation_sets: HashMap<CancellationSetKey, Vec<NodeIndex>> = (0..dag.num_qubits())
        .into_par_iter()
        .filter_map(|qubit| {
            let wire = Qubit(qubit as u32);
            if let Some(wire_commutation_set) = commutation_set.get(&Wire::Qubit(wire)) {
                Some(wire_commutation_set
                    .iter()
                    .enumerate()
                    .filter_map(|(com_set_index, com_set)| {
                        // This ensures that we only have DAGOPNodes in the current com_set, yuck...
                        if let NodeType::Operation(_node0) = &dag.dag[*com_set.first().unwrap()] {
                            Some(com_set.iter().filter_map(|node| {
                                let op = match &dag.dag[*node] {
                                    NodeType::Operation(instr) => instr,
                                    _ => panic!("Unexpected type in commutation set."),
                                };
                                let num_qargs = dag.get_qargs(op.qubits).len();
                                // no support for cancellation of parameterized gates
                                if op
                                    .params_view()
                                    .iter()
                                    .all(|p| !matches!(p, Param::ParameterExpression(_)))
                                {
                                    let op_name = op.op.name().to_string();
                                    let gate = op.op.standard_gate();
                                    if num_qargs == 1usize && GATES.contains(&gate) {
                                        let key = CancellationSetKey {
                                            gate: GateOrRotation::Gate(gate),
                                            qubits: smallvec![wire],
                                            com_set_index,
                                            second_index: None,
                                        };
                                        Some((key, *node))
                                    } else if num_qargs == 1usize && _z_rotations.contains(op_name.as_str()) {
                                        let key = CancellationSetKey {
                                            gate: GateOrRotation::Z_ROTATION,
                                            qubits: smallvec![wire],
                                            com_set_index,
                                            second_index: None,
                                        };
                                        Some((key, *node))
                                    } else if num_qargs == 1usize && _x_rotations.contains(op_name.as_str()) {
                                        let key = CancellationSetKey {
                                            gate: GateOrRotation::X_ROTATION,
                                            qubits: smallvec![wire],
                                            com_set_index,
                                            second_index: None,
                                        };
                                        Some((key, *node))
                                    }
                                    // Don't deal with Y rotation, because Y rotation doesn't commute with
                                    // CNOT, so it should be dealt with by optimized1qgate pass
                                    else if num_qargs == 2usize
                                        && dag.get_qargs(op.qubits).first().unwrap() == &wire
                                    {
                                        let second_qarg = dag.get_qargs(op.qubits)[1];
                                        let key = CancellationSetKey {
                                            gate: GateOrRotation::Gate(gate),
                                            qubits: smallvec![wire, second_qarg],
                                            com_set_index,
                                            second_index: Some(*node_indices.get(&(*node, Wire::Qubit(second_qarg))).unwrap()),
                                        };
                                        Some((key, *node))
                                    } else {
                                        None
                                    }
                                } else {
                                    None
                                }
                        }).collect::<Vec<(CancellationSetKey, NodeIndex)>>())
                    } else {
                        None
                    }
                })
                .flat_map(|entries| entries.into_iter()))
            } else {
                None
            }
        })
        .fold(|| HashMap::new(), |map, item| {
            for (key, value) in item {
                map.entry(key).and_modify(|indices: &mut Vec<NodeIndex>| indices.push(value)).or_insert_with(|| vec![value]);
            }
            map
        })
        .reduce(|| HashMap::new(), |outer_map, inner_map| {
            for (key, value) in inner_map.iter_mut() {
                outer_map.entry(*key).and_modify(|indices| indices.append(value)).or_insert_with(|| *value);
            }
            outer_map
        });

This doesn't fully compile yet because I haven't changed the logic below it to work with a single hashmap yet. But it at least gets passed the type checking (which probably means the borrow checker will hate it :D). I'm skeptical doing this in parallel actually helps performance here though because it results in several intermediate allocations that could be avoided. It was more just an exercise for me to see how it could be done. But this might be useful as a reference for other changes so I decided to post it even though it's not complete.

@mtreinish
Copy link
Member

Oh also can you please rebase this on main so the commit history is just the local changes. While the final diff is ok, it makes the commit message and authors list for this PR the sum of all the commits on the branch, which is much larger than the actual work being done here.

@sbrandhsn sbrandhsn force-pushed the oxidize-commutative-cancellation branch from 858ca8f to 8041ef7 Compare September 6, 2024 09:51
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick updates. I took another pass through the code after the last updates. This is looking a lot better now and much less error prone using an enum and rust types instead of string comparisons for everything. I'm still a bit concerned how many panic!() calls there are though, I feel like a fair number of them can be avoided though, and either eliminated with some logical changes or replaced with Python errors. I left some inline comments with suggestions on how to do this.

Comment on lines 242 to 243
total_phase += match definition.global_phase() {Param::Float(f) => f, _ => panic!("PackedInstruction with definition has no global phase set as floating point number")};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the logic is right now this should return a python error:

Suggested change
total_phase += match definition.global_phase() {Param::Float(f) => f, _ => panic!("PackedInstruction with definition has no global phase set as floating point number")};
}
total_phase += match definition.global_phase() {
Param::Float(f) => f,
_ => return Err(QiskitError::new_err(format!("PackedInstruction with definition has no global phase set as floating point number")))
};
}

Although realistically if you make my above suggestion about how to get the definition directly from StandardGate I think you can change this to just use unreachable!() since the generator code for definitions of standard gates never has a parameterized phase, it's always a float 0.0. Something like:

Suggested change
total_phase += match definition.global_phase() {Param::Float(f) => f, _ => panic!("PackedInstruction with definition has no global phase set as floating point number")};
}
let Param::Float(new_phase) = definition.global_phase() else { unreachable!() };
total_phase += new_phase

(although maybe set an error message saying standard gate phase is never parameterized)

would do the trick. But we can't make this change unless we're working with a StandardGate when we call definition().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something: one of Rz or P (whichever isn't exactly equal to u1) can have a parametric phase if the angle is parametric. GlobalPhaseGate definitely can (if the angle is parametric).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point, I thought there was one, but I scanned through the match statement yesterday and missed it on rzgate. But regardless this statement still holds since the angle is always a float at this point

@sbrandhsn sbrandhsn requested a review from mtreinish September 10, 2024 12:18
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this almost ready now, the current state of this PR is looking much better. Thanks for making all the updates. I have another round of inline comments, but most are pretty small nits and mechanical things to make the code a bit more concise.

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Copy link
Contributor Author

@sbrandhsn sbrandhsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! :-)

@sbrandhsn sbrandhsn requested a review from mtreinish September 10, 2024 16:44
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I glossed over the changes to dag_circuit.rs in my last review. We shouldn't change the interface to insert_1q_on_incoming_qubit unless we need to. Since nothing is using it anymore we should revert that piece. Also, I think we shouldn't make op_names public because that opens up the possibility of a transpiler pass corrupting the name counts accidentally by mutating it. We have the count_ops() method already, but we can also add a no-copy method to get an immutable reference to the inner mapping. These are so minor I'll push up a quick commit to fix both right now. Otherwise I think this is ready to merge.

Previously this PR was making the op_names field of the DAGCircuit
struct public so it was accessible to the new transpiler pass code.
However, this opened up the possibility of mutating the field by
mistake, instead this makes the field private again and adds a no-copy
method to get an immutable reference to the field. Additionally, there
were some interface changes made to one method in DAGCircuit that were
not needed anymore, which this reverts to minimize the diff.
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thanks for all the work on this! It's going to have a very positive impact on runtime performance for the transpiler at optimization levels 2 and 3.

@mtreinish mtreinish enabled auto-merge September 10, 2024 17:32
@mtreinish mtreinish added this pull request to the merge queue Sep 10, 2024
Merged via the queue into Qiskit:main with commit 49b8a5f Sep 10, 2024
15 checks passed
@raynelfss raynelfss added the Changelog: None Do not include in changelog label Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog performance Rust This PR or issue is related to Rust code in the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port CommutativeCancellation to Rust
6 participants