Update instructions for adding a SemIR instruction. #4348

Merged
merged 16 commits into from
Oct 10, 2024
2 changes: 1 addition & 1 deletion toolchain/check/context.cpp
@@ -1125,7 +1125,7 @@ class TypeCompleter {
auto BuildValueRepr(SemIR::TypeId type_id, SemIR::Inst inst) const
-> SemIR::ValueRepr {
// Use overload resolution to select the implementation, producing compile
-    // errors when BuildTypeForInst isn't defined for a given instruction.
+    // errors when BuildValueReprForInst isn't defined for a given instruction.
CARBON_KIND_SWITCH(inst) {
#define CARBON_SEM_IR_INST_KIND(Name) \
case CARBON_KIND(SemIR::Name typed_inst): { \
95 changes: 80 additions & 15 deletions toolchain/docs/adding_features.md
@@ -14,6 +14,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Parse](#parse)
- [Typed parse node metadata implementation](#typed-parse-node-metadata-implementation)
- [Check](#check)
- [Adding a new SemIR instruction](#adding-a-new-semir-instruction)
- [SemIR typed instruction metadata implementation](#semir-typed-instruction-metadata-implementation)
- [Lower](#lower)
- [Tests and debugging](#tests-and-debugging)
@@ -232,42 +233,106 @@ Note: this is broadly similar to
Each parse node kind requires adding a `Handle<kind>` function in a
`check/handle_*.cpp` file.

### Adding a new SemIR instruction

If the resulting SemIR needs a new instruction:

-- add a new kind to [sem_ir/inst_kind.def](/toolchain/sem_ir/inst_kind.def)
+- Add a new kind to [sem_ir/inst_kind.def](/toolchain/sem_ir/inst_kind.def).

- Add a `CARBON_SEM_IR_INST_KIND(NewInstKindName)` line in alphabetical
order.
-- a new struct definition to

+- Add a new struct definition to
[sem_ir/typed_insts.h](/toolchain/sem_ir/typed_insts.h), such as:

```cpp
 struct NewInstKindName {
-  static constexpr auto Kind = InstKind::NewInstKindName.Define(
-      // the name used in textual IR
-      "new_inst_kind_name"
-      // Optional: , TerminatorKind::KindOfTerminator
+  static constexpr auto Kind =
+      // `Parse::SomeId` should be one of:
+      // - A node ID from `parse/node_ids.h`,
+      //   specifying the kind of parse nodes for this instruction.
+      //   This could be a node kind from `parse/node_kind.def`
+      //   suffixed by `Id`, or one of the `Any`...`Id` alias
+      //   declarations that match multiple kinds of parse nodes.
+      // - `Parse::NodeId` if it can be any kind of parse node.
+      // - `Parse::InvalidNodeId` if no associated parse node.
+      InstKind::NewInstKindName.Define<Parse::SomeId>(
+          // The name used in textual IR:
+          {.ir_name = "new_inst_kind_name"}
+          // Other parameters have defaults.
       );

-  // Optional: omit if not associated with a parse node.
-  Parse::Node parse_node;
-
-  // Optional: omit if this sem_ir instruction does not produce a value.
+  // Optional: Include if this instruction produces a value used in
+  // an expression.
   TypeId type_id;

-  // 0-2 id fields, with types from sem_ir/ids.h or sem_ir/builtin_kind.h
-  // For example, fields would look like:
+  // 0-2 id fields, with types from sem_ir/ids.h or
+  // sem_ir/builtin_kind.h. For example, fields would look like:
   StringId name_id;
   InstId value_id;
 };
```

-Adding an instruction will also require a handler in the Lower step.
- [`sem_ir/inst_kind.h`](/toolchain/sem_ir/inst_kind.h) documents the
different options when defining a new instruction, as well as their
defaults; see `InstKind::DefinitionInfo`.
- If an instruction always produces a type:

- Set `.is_type = InstIsType::Always` in its `Kind` definition.
- When constructing instructions of this kind, pass
`SemIR::TypeId::TypeType` in as the value of the `type_id` field, as
in:

```cpp
SemIR::InstId inst_id = context.AddInst<SemIR::NewInstKindName>(
    node_id, {.type_id = SemIR::TypeId::TypeType, ...});
```

- Although most instructions have distinct types represented by
instructions like `ClassType`, we also have builtin types for cases
where types don't need to be distinct per-entity. This is rare, but it
is used, for example, when an expression implicitly uses a value as
part of SemIR evaluation or as part of desugaring. We have builtin
types for bound methods, namespaces, and witnesses, among others.
These are defined in
[`sem_ir/builtin_inst_kind.def`](/toolchain/sem_ir/builtin_inst_kind.def).
To get a type id for one of these builtin types, use something like
`context.GetBuiltinType(SemIR::BuiltinInstKind::WitnessType)`, as in:

```cpp
SemIR::TypeId witness_type_id =
    context.GetBuiltinType(SemIR::BuiltinInstKind::WitnessType);
SemIR::InstId inst_id = context.AddInst<SemIR::NewInstKindName>(
    node_id, {.type_id = witness_type_id, ...});
```

- Instructions without types may still be used as arguments to
instructions.

Once those are added, a rebuild will give errors showing what needs to be
updated. The updates needed can depend on whether the instruction produces a
type. Look to the comments on the functions named in those errors for
instructions on what is needed.
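
The rebuild errors come from a pattern visible in the `check/context.cpp` hunk of this PR: a macro expands one `case` per instruction kind, and each case relies on overload resolution over the typed instruction. The following is a minimal standalone sketch of that pattern, not Carbon's actual code. `DEMO_INST_KINDS`, `DescribeInst`, `DescribeDefault`, and both structs are hypothetical names invented for illustration, and the kind list is inlined as a macro rather than read from a `.def` file.

```cpp
#include <string>

// Hypothetical stand-in for a `.def` file: one entry per instruction kind.
#define DEMO_INST_KINDS(X) \
  X(IntLiteral)            \
  X(Return)

// Hypothetical typed instructions, one struct per kind.
struct IntLiteral {
  int value;
};
struct Return {};

// The kind enum is generated from the same list as everything else.
enum class Kind {
#define DEMO_KIND(Name) Name,
  DEMO_INST_KINDS(DEMO_KIND)
#undef DEMO_KIND
};

// One overload per typed instruction.
inline std::string DescribeInst(const IntLiteral& inst) {
  return "int_literal " + std::to_string(inst.value);
}
inline std::string DescribeInst(const Return& /*inst*/) { return "return"; }

// Expands to one `case` per kind. Each case calls `DescribeInst` on the
// typed struct, so a kind without a matching overload fails to compile.
inline std::string DescribeDefault(Kind kind) {
  switch (kind) {
#define DEMO_KIND(Name) \
  case Kind::Name:      \
    return DescribeInst(Name{});
    DEMO_INST_KINDS(DEMO_KIND)
#undef DEMO_KIND
  }
  return "<unknown>";
}
```

In this sketch, adding `X(Branch)` and a `struct Branch` without a `DescribeInst(const Branch&)` overload stops the build at the generated `case`, which is the kind of error the paragraph above tells you to follow.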

Instructions won't be given a name unless
[`InstNamer::CollectNamesInBlock`](/toolchain/sem_ir/inst_namer.cpp) is called on
the `InstBlockId` they are a member of. As of this writing,
`InstNamer::CollectNamesInBlock` should only be called once per `InstBlockId`.
To accomplish this, there should be one instruction kind that "owns" the
instruction block, and will have a case in `InstNamer::CollectNamesInBlock` that
visits the `InstBlockId`. That instruction kind will typically use
`FormatTrailingBlock` in `sem_ir/formatter.cpp` to list the instructions in
curly braces (`{`...`}`). Other instructions that reference that `InstBlockId`
will use the default rendering that has just the instruction names in parens
(`(`...`)`).
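
The ownership rule can be pictured with a small toy model. This is only a sketch under simplifying assumptions: `ToyInst`, `ToyNamer`, and `BlockId` are hypothetical names, blocks are plain lists of strings, and nothing here reflects the real `InstNamer` or formatter interfaces. It shows one owning instruction triggering naming exactly once, with the owner rendering the block in braces and other references rendering it in parens.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: every name here is hypothetical, not Carbon's real API.
using BlockId = int;

struct ToyInst {
  bool owns_block;   // True for the single kind that "owns" the block.
  BlockId block_id;  // The block this instruction refers to.
};

class ToyNamer {
 public:
  explicit ToyNamer(std::map<BlockId, std::vector<std::string>> blocks)
      : blocks_(std::move(blocks)) {}

  // Visits each instruction; only owners trigger naming of their block.
  void CollectNames(const std::vector<ToyInst>& insts) {
    for (const ToyInst& inst : insts) {
      if (inst.owns_block) {
        CollectNamesInBlock(inst.block_id);
      }
    }
  }

  // Owners list the block in braces; other references use parens.
  // Entries of a block that was never collected render as "<unnamed>".
  std::string Format(const ToyInst& inst) const {
    std::string out = inst.owns_block ? "{" : "(";
    const std::vector<std::string>& names = blocks_.at(inst.block_id);
    const bool named = named_.count(inst.block_id) > 0;
    for (std::size_t i = 0; i < names.size(); ++i) {
      if (i > 0) {
        out += ", ";
      }
      out += named ? "%" + names[i] : std::string("<unnamed>");
    }
    out += inst.owns_block ? "}" : ")";
    return out;
  }

 private:
  // Must run at most once per block; the assert models that rule.
  void CollectNamesInBlock(BlockId id) {
    const bool first_visit = named_.insert(id).second;
    assert(first_visit && "block named twice");
    (void)first_visit;
  }

  std::map<BlockId, std::vector<std::string>> blocks_;
  std::set<BlockId> named_;
};
```

If two instruction kinds both claimed ownership of the same block in this model, the assert would fire, which mirrors the once-per-`InstBlockId` constraint described above.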

Adding an instruction will generally also require a handler in the Lower step.

Most new instructions will automatically be formatted reasonably by the SemIR
-formatter.
+formatter. If not, then add a `FormatInst` overload to
[`sem_ir/formatter.cpp`](/toolchain/sem_ir/formatter.cpp). If only the arguments
need custom formatting, then a `FormatInstRHS` overload can be implemented
instead.
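
To illustrate the difference between the two customization points, here is a toy model of overload-based formatting. It is a sketch only: `ToyFormatter`, `PlainInst`, `CallInst`, and `BlockInst` are hypothetical, the output syntax is made up, and the real signatures in `sem_ir/formatter.cpp` may differ. The generic templates stand in for the default formatting, a non-template `FormatInstRHS` overload customizes only the arguments, and a non-template `FormatInst` overload replaces the whole rendering.

```cpp
#include <string>

// Toy model only: hypothetical names, not Carbon's formatter.
struct PlainInst {
  int arg0;
  int arg1;
};
struct CallInst {
  int arg0;  // Callee.
  int arg1;  // Argument.
};
struct BlockInst {
  int arg0;  // First entry.
  int arg1;  // Last entry.
};

struct ToyFormatter {
  // Default: `name` followed by the formatted arguments.
  template <typename InstT>
  std::string FormatInst(const std::string& name, const InstT& inst) {
    return name + " " + FormatInstRHS(inst);
  }

  // Whole-instruction override, preferred over the template for BlockInst.
  std::string FormatInst(const std::string& name, const BlockInst& inst) {
    return name + " {" + std::to_string(inst.arg0) + ".." +
           std::to_string(inst.arg1) + "}";
  }

  // Default argument formatting: a comma-separated list.
  template <typename InstT>
  std::string FormatInstRHS(const InstT& inst) {
    return std::to_string(inst.arg0) + ", " + std::to_string(inst.arg1);
  }

  // Arguments-only override, preferred over the template for CallInst.
  std::string FormatInstRHS(const CallInst& inst) {
    return "@" + std::to_string(inst.arg0) + "(" +
           std::to_string(inst.arg1) + ")";
  }
};
```

Because a non-template overload beats a template on an exact match, `CallInst` keeps the default `name` prefix and changes only its arguments, while `BlockInst` bypasses the default path entirely.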

If the resulting SemIR needs a new built-in, add it to
-[builtin_inst_kind.def](/toolchain/sem_ir/builtin_inst_kind.def).
+[`sem_ir/builtin_inst_kind.def`](/toolchain/sem_ir/builtin_inst_kind.def).

### SemIR typed instruction metadata implementation
