rework the README.md for rustc and add other readmes #44505
Changes from all commits
44e45d9
73a4e8d
76eac36
70db841
f130e7d
032fdef
38813cf
638958b
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
# Introduction to the HIR | ||
|
||
The HIR -- "High-level IR" -- is the primary IR used in most of | ||
rustc. It is a desugared version of the "abstract syntax tree" (AST) | ||
that is generated after parsing, macro expansion, and name resolution | ||
have completed. Many parts of HIR resemble Rust surface syntax quite | ||
closely, with the exception that some of Rust's expression forms have | ||
been desugared away (as an example, `for` loops are converted into a | ||
`loop` and do not appear in the HIR). | ||
|
||
This README covers the main concepts of the HIR. | ||
|
||
### Out-of-band storage and the `Crate` type | ||
|
||
The top-level data-structure in the HIR is the `Crate`, which stores | ||
the contents of the crate currently being compiled (we only ever | ||
construct HIR for the current crate). Whereas in the AST the crate | ||
data structure basically just contains the root module, the HIR | ||
`Crate` structure contains a number of maps and other things that | ||
serve to organize the content of the crate for easier access. | ||
|
||
For example, the contents of individual items (e.g., modules, | ||
functions, traits, impls, etc) in the HIR are not immediately | ||
accessible in the parents. So, for example, if we had a module item `foo` | ||
containing a function `bar()`: | ||
|
||
``` | ||
mod foo { | ||
fn bar() { } | ||
} | ||
``` | ||
|
||
Then in the HIR the representation of module `foo` (the `Mod` | ||
struct) would have only the **`ItemId`** `I` of `bar()`. To get the | ||
details of the function `bar()`, we would look up `I` in the | ||
`items` map. | ||
|
||
One nice result from this representation is that one can iterate | ||
over all items in the crate by iterating over the key-value pairs | ||
in these maps (without the need to trawl through the IR in total). | ||
There are similar maps for things like trait items and impl items, | ||
as well as "bodies" (explained below). | ||
|
||
The other reason to set up the representation this way is for better | ||
integration with incremental compilation. This way, if you gain access | ||
to a `&hir::Item` (e.g. for the mod `foo`), you do not immediately | ||
gain access to the contents of the function `bar()`. Instead, you only | ||
gain access to the **id** for `bar()`, and you must invoke some | ||
function to look up the contents of `bar()` given its id; this gives us | ||
a chance to observe that you accessed the data for `bar()` and record | ||
the dependency. | ||
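
To make the out-of-band idea concrete, here is a minimal, self-contained sketch. It is a toy model, not the compiler's actual data structures: the field names, the `accessed` log, and the `example_crate()` helper are assumptions made for illustration. A module holds only the ids of its children, and the one lookup function is the place where an access can be observed.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Hypothetical stand-in for the compiler's `ItemId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct ItemId(pub u32);

/// A module stores only the ids of its children, not their contents.
#[derive(Debug)]
pub struct Mod {
    pub item_ids: Vec<ItemId>,
}

#[derive(Debug)]
pub enum Item {
    Mod(Mod),
    Fn { name: &'static str },
}

/// Toy `Crate`: item contents live out-of-band in a map keyed by id.
pub struct Crate {
    pub items: BTreeMap<ItemId, Item>,
    /// Ids looked up so far; a stand-in for dependency tracking.
    pub accessed: RefCell<Vec<ItemId>>,
}

impl Crate {
    /// The only way to reach an item's contents, so every access is observed.
    pub fn item(&self, id: ItemId) -> &Item {
        self.accessed.borrow_mut().push(id);
        &self.items[&id]
    }
}

/// Builds the toy crate for `mod foo { fn bar() { } }`.
pub fn example_crate() -> Crate {
    let mut items = BTreeMap::new();
    items.insert(ItemId(0), Item::Mod(Mod { item_ids: vec![ItemId(1)] }));
    items.insert(ItemId(1), Item::Fn { name: "bar" });
    Crate { items, accessed: RefCell::new(Vec::new()) }
}
```

Iterating over `items` visits every item without walking the tree, and holding the `Mod` for `foo` gives access to `bar()` only through `Crate::item`, which records the lookup.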
|
||
### Identifiers in the HIR | ||
|
||
Most of the code that has to deal with things in HIR tends not to | ||
carry around references into the HIR, but rather to carry around | ||
*identifier numbers* (or just "ids"). Right now, you will find four | ||
sorts of identifiers in active use: | ||
|
||
- `DefId`, which primarily names "definitions" or top-level items. | ||
- You can think of a `DefId` as being shorthand for a very explicit | ||
and complete path, like `std::collections::HashMap`. However, | ||
these paths are able to name things that are not nameable in | ||
normal Rust (e.g., impls), and they also include extra information | ||
about the crate (such as its version number, as two versions of | ||
the same crate can co-exist). | ||
- A `DefId` really consists of two parts, a `CrateNum` (which | ||
identifies the crate) and a `DefIndex` (which indexes into a list | ||
of items that is maintained per crate). | ||
- `HirId`, which combines the index of a particular item with an | ||
offset within that item. | ||
- The key point of a `HirId` is that it is *relative* to some item (which is named | ||
via a `DefId`). | ||
- `BodyId`, which is an absolute identifier that refers to a specific | ||
body (definition of a function or constant) in the crate. It is currently | ||
effectively a "newtype'd" `NodeId`. | ||
- `NodeId`, which is an absolute id that identifies a single node in the HIR tree. | ||
- While these are still in common use, **they are being slowly phased out**. | ||
- Since they are absolute within the crate, adding a new node | ||
anywhere in the tree causes the node-ids of all subsequent code in | ||
the crate to change. This is terrible for incremental compilation, | ||
as you can perhaps imagine. | ||
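
The difference between absolute and item-relative ids can be shown with a small sketch. The types below are simplified, hypothetical stand-ins (the real `HirId` and `NodeId` carry more structure), and the two numbering functions are invented for this illustration.

```rust
/// Hypothetical absolute id: one counter for the whole crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(pub u32);

/// Hypothetical item-relative id: an owner plus an offset within it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HirId {
    pub owner: u32,
    pub local_id: u32,
}

/// Numbers every node of every item with one crate-wide counter.
/// `nodes_per_item[i]` is the number of nodes inside item `i`.
pub fn absolute_ids(nodes_per_item: &[u32]) -> Vec<Vec<NodeId>> {
    let mut next = 0;
    let mut all = Vec::new();
    for &count in nodes_per_item {
        let mut ids = Vec::new();
        for _ in 0..count {
            ids.push(NodeId(next));
            next += 1;
        }
        all.push(ids);
    }
    all
}

/// Numbers nodes relative to their owning item.
pub fn relative_ids(nodes_per_item: &[u32]) -> Vec<Vec<HirId>> {
    nodes_per_item
        .iter()
        .enumerate()
        .map(|(owner, &count)| {
            (0..count)
                .map(|local_id| HirId { owner: owner as u32, local_id })
                .collect()
        })
        .collect()
}
```

If the first item grows from two nodes to three, every absolute id in the second item shifts by one, while the relative ids of the second item stay the same. That stability is what makes relative ids friendlier to incremental compilation.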
|
||
### HIR Map | ||
|
||
Most of the time when you are working with the HIR, you will do so via | ||
the **HIR Map**, accessible in the tcx via `tcx.hir` (and defined in | ||
the `hir::map` module). The HIR map contains a number of methods to | ||
convert between ids of various kinds and to look up data associated | ||
with a HIR node. | ||
|
||
For example, if you have a `DefId`, and you would like to convert it | ||
to a `NodeId`, you can use `tcx.hir.as_local_node_id(def_id)`. This | ||
returns an `Option<NodeId>` -- this will be `None` if the def-id | ||
refers to something outside of the current crate (since then it has no | ||
HIR node), but otherwise returns `Some(n)` where `n` is the node-id of | ||
the definition. | ||
|
||
Similarly, you can use `tcx.hir.find(n)` to look up the node for a | ||
`NodeId`. This returns an `Option<Node<'tcx>>`, where `Node` is an enum | ||
defined in the map; by matching on this you can find out what sort of | ||
node the node-id referred to and also get a pointer to the data | ||
itself. Often, you know what sort of node `n` is -- e.g., if you know | ||
that `n` must be some HIR expression, you can do | ||
`tcx.hir.expect_expr(n)`, which will extract and return the | ||
`&hir::Expr`, panicking if `n` is not in fact an expression. | ||
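
The `find` / `expect_expr` pattern described above can be sketched with a toy map. Everything here is a simplified assumption: the real `Node` enum has many more variants, and the `insert` helper and the `Option`-returning `get_parent_node` signature are invented for the example.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

#[derive(Debug, PartialEq)]
pub struct Item(pub &'static str);

#[derive(Debug, PartialEq)]
pub struct Expr(pub &'static str);

/// Toy version of the `Node` enum: a tagged pointer to the underlying data.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Node<'hir> {
    Item(&'hir Item),
    Expr(&'hir Expr),
}

#[derive(Default)]
pub struct Map<'hir> {
    nodes: HashMap<NodeId, Node<'hir>>,
    parents: HashMap<NodeId, NodeId>,
}

impl<'hir> Map<'hir> {
    /// Hypothetical helper used only to build the toy map.
    pub fn insert(&mut self, id: NodeId, node: Node<'hir>, parent: Option<NodeId>) {
        self.nodes.insert(id, node);
        if let Some(parent) = parent {
            self.parents.insert(id, parent);
        }
    }

    /// Returns the node for `id`, or `None` if the id is unknown.
    pub fn find(&self, id: NodeId) -> Option<Node<'hir>> {
        self.nodes.get(&id).copied()
    }

    /// Extracts the expression, panicking if `id` is not an expression.
    pub fn expect_expr(&self, id: NodeId) -> &'hir Expr {
        match self.find(id) {
            Some(Node::Expr(expr)) => expr,
            other => panic!("expected an expression, found {:?}", other),
        }
    }

    /// Simplified: returns `None` for nodes without a recorded parent.
    pub fn get_parent_node(&self, id: NodeId) -> Option<NodeId> {
        self.parents.get(&id).copied()
    }
}
```

Callers that do not know the node kind match on the result of `find`; callers that do know it use the `expect_*` style helper and accept a panic on a mismatch.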
|
||
Finally, you can use the HIR map to find the parents of nodes, via | ||
calls like `tcx.hir.get_parent_node(n)`. | ||
|
||
### HIR Bodies | ||
|
||
A **body** represents some kind of executable code, such as the body | ||
of a function/closure or the definition of a constant. Bodies are | ||
associated with an **owner**, which is typically some kind of item | ||
(e.g., a `fn()` or `const`), but could also be a closure expression | ||
(e.g., `|x, y| x + y`). You can use the HIR map to find the body | ||
associated with a given def-id (`maybe_body_owned_by()`) or to find | ||
the owner of a body (`body_owner_def_id()`). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
The HIR map, accessible via `tcx.hir`, allows you to quickly navigate the | ||
HIR and convert between various forms of identifiers. See [the HIR README] for more information. | ||
|
||
[the HIR README]: ../README.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -413,6 +413,10 @@ pub struct WhereEqPredicate { | |
|
||
pub type CrateConfig = HirVec<P<MetaItem>>; | ||
|
||
/// The top-level data structure that stores the entire contents of | ||
/// the crate currently being compiled. | ||
/// | ||
Does this newline serve a purpose? |
||
/// For more details, see [the module-level README](README.md). | ||
#[derive(Clone, PartialEq, Eq, RustcEncodable, RustcDecodable, Debug)] | ||
pub struct Crate { | ||
pub module: Mod, | ||
|
@@ -927,7 +931,27 @@ pub struct BodyId { | |
pub node_id: NodeId, | ||
} | ||
|
||
/// The body of a function or constant value. | ||
/// The body of a function, closure, or constant value. In the case of | ||
/// a function, the body contains not only the function body itself | ||
/// (which is an expression), but also the argument patterns, since | ||
/// those are something that the caller doesn't really care about. | ||
/// | ||
/// # Examples | ||
/// | ||
/// ``` | ||
/// fn foo((x, y): (u32, u32)) -> u32 { | ||
/// x + y | ||
/// } | ||
/// ``` | ||
/// | ||
/// Here, the `Body` associated with `foo()` would contain: | ||
/// | ||
/// - an `arguments` array containing the `(x, y)` pattern | ||
/// - a `value` containing the `x + y` expression (maybe wrapped in a block) | ||
/// - `is_generator` would be false | ||
/// | ||
/// All bodies have an **owner**, which can be accessed via the HIR | ||
/// map using `body_owner_def_id()`. | ||
#[derive(Clone, PartialEq, Eq, RustcEncodable, RustcDecodable, Hash, Debug)] | ||
pub struct Body { | ||
pub arguments: HirVec<Arg>, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
# Types and the Type Context | ||
|
||
The `ty` module defines how the Rust compiler represents types | ||
internally. It also defines the *typing context* (`tcx` or `TyCtxt`), | ||
which is the central data structure in the compiler. | ||
|
||
## The tcx and how it uses lifetimes | ||
|
||
The `tcx` ("typing context") is the central data structure in the | ||
compiler. It is the context that you use to perform all manner of | ||
queries. The struct `TyCtxt` defines a reference to this shared context: | ||
|
||
```rust | ||
tcx: TyCtxt<'a, 'gcx, 'tcx> | ||
// -- ---- ---- | ||
// | | | | ||
// | | innermost arena lifetime (if any) | ||
// | "global arena" lifetime | ||
// lifetime of this reference | ||
``` | ||
|
||
As you can see, the `TyCtxt` type takes three lifetime parameters. | ||
These lifetimes are perhaps the most complex thing to understand about | ||
the tcx. During Rust compilation, we allocate most of our memory in | ||
**arenas**, which are basically pools of memory that get freed all at | ||
once. When you see a reference with a lifetime like `'tcx` or `'gcx`, | ||
you know that it refers to arena-allocated data (or data that lives as | ||
long as the arenas, anyhow). | ||
|
||
We use two distinct levels of arenas. The outer level is the "global | ||
arena". This arena lasts for the entire compilation: so anything you | ||
allocate in there is only freed once compilation is basically over | ||
(actually, when we shift to executing LLVM). | ||
|
||
To reduce peak memory usage, when we do type inference, we also use an | ||
inner level of arena. These arenas get thrown away once type inference | ||
is over. This is done because type inference generates a lot of | ||
"throw-away" types that are not particularly interesting after type | ||
inference completes, so keeping around those allocations would be | ||
wasteful. | ||
|
||
Often, we wish to write code that explicitly asserts that it is not | ||
taking place during inference. In that case, there is no "local" | ||
arena, and all the types that you can access are allocated in the | ||
global arena. To express this, the idea is to use the same lifetime | ||
for the `'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch | ||
confusing, we tend to use the name `'tcx` in such contexts. Here is an | ||
example: | ||
|
||
```rust | ||
fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) { | ||
// ---- ---- | ||
// Using the same lifetime here asserts | ||
// that the innermost arena accessible through | ||
// this reference *is* the global arena. | ||
} | ||
``` | ||
|
||
In contrast, if we want code that is usable during type inference, then we | ||
need to declare distinct `'gcx` and `'tcx` lifetime parameters: | ||
|
||
```rust | ||
fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) { | ||
// ---- ---- | ||
// Using different lifetimes here means that | ||
// the innermost arena *may* be distinct | ||
// from the global arena (but doesn't have to be). | ||
} | ||
``` | ||
|
||
### Allocating and working with types | ||
|
||
Rust types are represented using the `Ty<'tcx>` defined in the `ty` | ||
module (not to be confused with the `Ty` struct from [the HIR]). This | ||
is in fact a simple type alias for a reference with `'tcx` lifetime: | ||
|
||
```rust | ||
pub type Ty<'tcx> = &'tcx TyS<'tcx>; | ||
``` | ||
|
||
[the HIR]: ../hir/README.md | ||
|
||
You can basically ignore the `TyS` struct -- you will basically never | ||
access it explicitly. We always pass it by reference using the | ||
`Ty<'tcx>` alias -- the only exception I think is to define inherent | ||
methods on types. Instances of `TyS` are only ever allocated in one of | ||
the rustc arenas (never e.g. on the stack). | ||
|
||
One common operation on types is to **match** and see what kinds of | ||
types they are. This is done by doing `match ty.sty`, sort of like this: | ||
|
||
```rust | ||
fn test_type<'tcx>(ty: Ty<'tcx>) { | ||
match ty.sty { | ||
ty::TyArray(elem_ty, len) => { ... } | ||
... | ||
} | ||
} | ||
``` | ||
|
||
The `sty` field (the origin of this name is unclear to me; perhaps | ||
structural type?) is of type `TypeVariants<'tcx>`, which is an enum | ||
defining all of the different kinds of types in the compiler. | ||
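
As a runnable illustration of this matching style, here is a toy enum that stands in for `TypeVariants<'tcx>`. The variant set is a small, hypothetical subset, and `describe` is an invented helper; the real enum borrows its contents from the arenas rather than owning them.

```rust
/// Toy stand-in for `TypeVariants<'tcx>`; only a few illustrative variants.
#[derive(Debug, PartialEq)]
pub enum TypeVariants {
    TyBool,
    TyChar,
    TyArray(Box<TypeVariants>, u64),
    TyTuple(Vec<TypeVariants>),
}

/// Matches on the kind of type and recurses into its components.
pub fn describe(ty: &TypeVariants) -> String {
    match ty {
        TypeVariants::TyBool => "bool".to_string(),
        TypeVariants::TyChar => "char".to_string(),
        TypeVariants::TyArray(elem_ty, len) => {
            format!("[{}; {}]", describe(elem_ty), len)
        }
        TypeVariants::TyTuple(elems) => {
            let parts: Vec<String> = elems.iter().map(describe).collect();
            format!("({})", parts.join(", "))
        }
    }
}
```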
|
||
> NB: inspecting the `sty` field on types during type inference can be | ||
> risky, as there may be inference variables and other things to | ||
> consider, or sometimes types are not yet known that will become | ||
> known later. | ||
|
||
To allocate a new type, you can use the various `mk_` methods defined | ||
on the `tcx`. These have names that correspond mostly to the various kinds | ||
of type variants. For example: | ||
|
||
```rust | ||
let array_ty = tcx.mk_array(elem_ty, len * 2); | ||
``` | ||
|
||
These methods all return a `Ty<'tcx>` -- note that the lifetime you | ||
get back is the lifetime of the innermost arena that this `tcx` has | ||
access to. In fact, types are always canonicalized and interned (so we | ||
never allocate exactly the same type twice) and are always allocated | ||
in the outermost arena where they can be (so, if they do not contain | ||
any inference variables or other "temporary" types, they will be | ||
allocated in the global arena). However, the lifetime `'tcx` is always | ||
a safe approximation, so that is what you get back. | ||
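
Interning can be sketched without arenas by handing out small copyable handles. This is a toy under stated assumptions: `Ty` here is an index rather than a `&'tcx TyS<'tcx>`, there is a single interner rather than a global and a local arena, and `Interner`, `intern`, `kind`, and `count` are names invented for the example.

```rust
use std::collections::HashMap;

/// Toy handle for an interned type (an index, not an arena reference).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Ty(pub usize);

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum TyKind {
    Bool,
    Char,
    Array(Ty, u64),
}

#[derive(Default)]
pub struct Interner {
    kinds: Vec<TyKind>,
    lookup: HashMap<TyKind, Ty>,
}

impl Interner {
    /// Returns the existing handle if this exact type was seen before.
    pub fn intern(&mut self, kind: TyKind) -> Ty {
        if let Some(&ty) = self.lookup.get(&kind) {
            return ty;
        }
        let ty = Ty(self.kinds.len());
        self.kinds.push(kind.clone());
        self.lookup.insert(kind, ty);
        ty
    }

    /// Analogous in spirit to the `mk_` constructors on the `tcx`.
    pub fn mk_array(&mut self, elem: Ty, len: u64) -> Ty {
        self.intern(TyKind::Array(elem, len))
    }

    pub fn kind(&self, ty: Ty) -> &TyKind {
        &self.kinds[ty.0]
    }

    /// Number of distinct types allocated so far.
    pub fn count(&self) -> usize {
        self.kinds.len()
    }
}
```

Because identical types share one handle, comparing two handles is a cheap integer comparison. That is handle equality only; it does not decide whether two differently written types are equivalent.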
|
||
> NB. Because types are interned, it is possible to compare them for | ||
> equality efficiently using `==` -- however, this is almost never what | ||
> you want to do unless you happen to be hashing and looking for | ||
> duplicates. This is because often in Rust there are multiple ways to | ||
> represent the same type, particularly once inference is involved. If | ||
> you are going to be testing for type equality, you probably need to | ||
> start looking into the inference code to do it right. | ||
|
||
You can also find various common types in the tcx itself by accessing | ||
`tcx.types.bool`, `tcx.types.char`, etc (see `CommonTypes` for more). | ||
|
||
### Beyond types: Other kinds of arena-allocated data structures | ||
|
||
In addition to types, there are a number of other arena-allocated data | ||
structures that you can allocate, and which are found in this | ||
module. Here are a few examples: | ||
|
||
- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to | ||
specify the values to be substituted for generics (e.g., `HashMap<i32, u32>` | ||
would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`). | ||
- `TraitRef`, typically passed by value -- a **trait reference** | ||
consists of a reference to a trait along with its various type | ||
parameters (including `Self`), like `i32: Display` (here, the def-id | ||
would reference the `Display` trait, and the substs would contain | ||
`i32`). | ||
- `Predicate` defines something the trait system has to prove (see `traits` module). | ||
|
||
### Import conventions | ||
|
||
Although there is no hard and fast rule, the `ty` module tends to be used like so: | ||
|
||
```rust | ||
use ty::{self, Ty, TyCtxt}; | ||
``` | ||
|
||
In particular, since they are so common, the `Ty` and `TyCtxt` types | ||
are imported directly. Other types are often referenced with an | ||
explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules | ||
choose to import a larger or smaller set of names explicitly. |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,302 @@ | ||
# The Rust Compiler Query System | ||
|
||
The Compiler Query System is the key to our new demand-driven | ||
organization. The idea is pretty simple. You have various queries | ||
that compute things about the input -- for example, there is a query | ||
called `type_of(def_id)` that, given the def-id of some item, will | ||
compute the type of that item and return it to you. | ||
|
||
Query execution is **memoized** -- so the first time you invoke a | ||
query, it will go do the computation, but the next time, the result is | ||
returned from a hashtable. Moreover, query execution fits nicely into | ||
**incremental computation**; the idea is roughly that, when you do a | ||
query, the result **may** be returned to you by loading stored data | ||
from disk (but that's a separate topic we won't discuss further here). | ||
|
||
The overall vision is that, eventually, the entire compiler | ||
control-flow will be query driven. There will effectively be one | ||
top-level query ("compile") that will run compilation on a crate; this | ||
will in turn demand information about that crate, starting from the | ||
*end*. For example: | ||
|
||
- This "compile" query might demand to get a list of codegen-units | ||
(i.e., modules that need to be compiled by LLVM). | ||
- But computing the list of codegen-units would invoke some subquery | ||
that returns the list of all modules defined in the Rust source. | ||
- That query in turn would invoke something asking for the HIR. | ||
- This keeps going further and further back until we wind up doing the | ||
actual parsing. | ||
|
||
However, that vision is not fully realized. Still, big chunks of the | ||
compiler (for example, generating MIR) work exactly like this. | ||
|
||
### Invoking queries | ||
|
||
Invoking a query is simple. The tcx ("typing context") offers a method | ||
for each defined query. So, for example, to invoke the `type_of` | ||
query, you would just do this: | ||
|
||
```rust | ||
let ty = tcx.type_of(some_def_id); | ||
``` | ||
|
||
### Cycles between queries | ||
|
||
Currently, cycles during query execution should always result in a | ||
compilation error. Typically, they arise because of illegal programs | ||
that contain cyclic references they shouldn't (though sometimes they | ||
arise because of compiler bugs, in which case we need to factor our | ||
queries in a more fine-grained fashion to avoid them). | ||
|
||
However, it is nonetheless often useful to *recover* from a cycle | ||
(after reporting an error, say) and try to soldier on, so as to give a | ||
better user experience. In order to recover from a cycle, you don't | ||
get to use the nice method-call-style syntax. Instead, you invoke | ||
using the `try_get` method, which looks roughly like this: | ||
|
||
```rust | ||
use ty::maps::queries; | ||
... | ||
match queries::type_of::try_get(tcx, DUMMY_SP, self.did) { | ||
Ok(result) => { | ||
// no cycle occurred! You can use `result` | ||
} | ||
Err(err) => { | ||
// A cycle occurred! The error value `err` is a `DiagnosticBuilder`, | ||
// meaning essentially an "in-progress", not-yet-reported error message. | ||
// See below for more details on what to do here. | ||
} | ||
} | ||
``` | ||
|
||
So, if you get back an `Err` from `try_get`, then a cycle *did* occur. This means that | ||
you must ensure that a compiler error message is reported. You can do that in two ways: | ||
|
||
The simplest is to invoke `err.emit()`. This will emit the cycle error to the user. | ||
|
||
However, often cycles happen because of an illegal program, and you | ||
know at that point that an error either already has been reported or | ||
will be reported due to this cycle by some other bit of code. In that | ||
case, you can invoke `err.cancel()` to not emit any error. It is | ||
traditional to then invoke: | ||
|
||
``` | ||
tcx.sess.delay_span_bug(some_span, "some message") | ||
``` | ||
|
||
`delay_span_bug()` is a helper that says: we expect a compilation | ||
error to have happened or to happen in the future; so, if compilation | ||
ultimately succeeds, make an ICE with the message `"some | ||
message"`. This is basically just a precaution in case you are wrong. | ||
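
The shape of cycle detection and recovery can be sketched with a toy engine. This is not how `try_get` is implemented: the `Engine` type, its dependency table, and the plain `CycleError` value (instead of a `DiagnosticBuilder`) are all assumptions made to keep the example self-contained.

```rust
use std::cell::RefCell;

/// Toy cycle error: the chain of keys that led back to an active query.
#[derive(Debug, PartialEq)]
pub struct CycleError {
    pub stack: Vec<u32>,
}

/// Toy engine for one query: "how many items does `key` transitively reach
/// (counting itself)?". `deps[key]` lists the keys that `key` depends on.
pub struct Engine {
    active: RefCell<Vec<u32>>,
    deps: Vec<Vec<u32>>,
}

impl Engine {
    pub fn new(deps: Vec<Vec<u32>>) -> Self {
        Engine { active: RefCell::new(Vec::new()), deps }
    }

    pub fn try_get(&self, key: u32) -> Result<u32, CycleError> {
        // A key that is already being computed means we have a cycle.
        if self.active.borrow().contains(&key) {
            let mut stack = self.active.borrow().to_vec();
            stack.push(key);
            return Err(CycleError { stack });
        }
        self.active.borrow_mut().push(key);
        let mut total = 1;
        let mut outcome = Ok(());
        for &dep in &self.deps[key as usize] {
            match self.try_get(dep) {
                Ok(n) => total += n,
                Err(err) => {
                    outcome = Err(err);
                    break;
                }
            }
        }
        // Always unwind the active stack so later queries can still run.
        self.active.borrow_mut().pop();
        outcome.map(|()| total)
    }
}
```

A caller that receives `Err` decides what to do with it, which mirrors the choice between `err.emit()` and `err.cancel()` described above; the engine remains usable afterwards.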
|
||
### How the compiler executes a query | ||
|
||
So you may be wondering what happens when you invoke a query | ||
method. The answer is that, for each query, the compiler maintains a | ||
cache -- if your query has already been executed, then, the answer is | ||
simple: we clone the return value out of the cache and return it | ||
(therefore, you should try to ensure that the return types of queries | ||
are cheaply cloneable; insert an `Rc` if necessary). | ||
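
A minimal sketch of the cache-then-clone behavior follows. The `QueryCache` type, its `provider_calls` counter, and the `type_of` provider that returns a string are hypothetical; the real query maps are generated by macros and are keyed per query.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;
use std::hash::Hash;
use std::rc::Rc;

/// Toy memoizing wrapper around a single query provider.
pub struct QueryCache<K, V> {
    provider: fn(&K) -> V,
    results: RefCell<HashMap<K, V>>,
    provider_calls: Cell<usize>,
}

impl<K: Clone + Eq + Hash, V: Clone> QueryCache<K, V> {
    pub fn new(provider: fn(&K) -> V) -> Self {
        QueryCache {
            provider,
            results: RefCell::new(HashMap::new()),
            provider_calls: Cell::new(0),
        }
    }

    /// Returns a clone of the cached value, computing it on first use.
    pub fn get(&self, key: K) -> V {
        let cached = self.results.borrow().get(&key).cloned();
        if let Some(value) = cached {
            return value;
        }
        self.provider_calls.set(self.provider_calls.get() + 1);
        let value = (self.provider)(&key);
        self.results.borrow_mut().insert(key, value.clone());
        value
    }

    pub fn provider_calls(&self) -> usize {
        self.provider_calls.get()
    }
}

/// Hypothetical provider; returns an `Rc` so that cloning is cheap.
pub fn type_of(def_id: &u32) -> Rc<String> {
    Rc::new(format!("type of item {}", def_id))
}
```

Every `get` clones the stored value, so wrapping a large result in `Rc` turns that clone into a reference-count increment.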
|
||
#### Providers | ||
|
||
If, however, the query is *not* in the cache, then the compiler will | ||
try to find a suitable **provider**. A provider is a function that has | ||
been defined and linked into the compiler somewhere that contains the | ||
code to compute the result of the query. | ||
|
||
**Providers are defined per-crate.** The compiler maintains, | ||
internally, a table of providers for every crate, at least | ||
conceptually. Right now, there are really two sets: the providers for | ||
queries about the **local crate** (that is, the one being compiled) | ||
and providers for queries about **external crates** (that is, | ||
dependencies of the local crate). Note that what determines the crate | ||
that a query is targeting is not the *kind* of query, but the *key*. | ||
For example, when you invoke `tcx.type_of(def_id)`, that could be a | ||
local query or an external query, depending on what crate the `def_id` | ||
is referring to (see the `self::keys::Key` trait for more information | ||
on how that works). | ||
|
||
Providers always have the same signature: | ||
|
||
```rust | ||
fn provider<'cx, 'tcx>(tcx: TyCtxt<'cx, 'tcx, 'tcx>, | ||
key: QUERY_KEY) | ||
-> QUERY_RESULT | ||
{ | ||
... | ||
} | ||
``` | ||
|
||
Providers take two arguments: the `tcx` and the query key. Note also | ||
that they take the *global* tcx (i.e., they use the `'tcx` lifetime | ||
twice), rather than taking a tcx with some active inference context. | ||
They return the result of the query. | ||
|
||
#### How providers are set up | ||
|
||
When the tcx is created, it is given the providers by its creator using | ||
the `Providers` struct. This struct is generated by the macros here, but it | ||
is basically a big list of function pointers: | ||
|
||
```rust | ||
struct Providers { | ||
type_of: for<'cx, 'tcx> fn(TyCtxt<'cx, 'tcx, 'tcx>, DefId) -> Ty<'tcx>, | ||
... | ||
} | ||
``` | ||
|
||
At present, we have one copy of the struct for local crates, and one | ||
for external crates, though the plan is that we may eventually have | ||
one per crate. | ||
|
||
These `Providers` structs are ultimately created and populated by | ||
`librustc_driver`, but it does this by distributing the work | ||
throughout the other `rustc_*` crates. This is done by invoking | ||
various `provide` functions. These functions tend to look something | ||
like this: | ||
|
||
```rust | ||
pub fn provide(providers: &mut Providers) { | ||
*providers = Providers { | ||
type_of, | ||
..*providers | ||
}; | ||
} | ||
``` | ||
|
||
That is, they take an `&mut Providers` and mutate it in place. Usually | ||
we use the formulation above just because it looks nice, but you could | ||
as well do `providers.type_of = type_of`, which would be equivalent. | ||
(Here, `type_of` would be a top-level function, defined as we saw | ||
before.) So, if we wanted to add a provider for some other query, | ||
let's call it `fubar`, into the crate above, we might modify the `provide()` | ||
function like so: | ||
|
||
```rust | ||
pub fn provide(providers: &mut Providers) { | ||
*providers = Providers { | ||
type_of, | ||
fubar, | ||
..*providers | ||
}; | ||
} | ||
|
||
fn fubar<'cx, 'tcx>(tcx: TyCtxt<'cx, 'tcx, 'tcx>, key: DefId) -> Fubar<'tcx> { .. } | ||
``` | ||
|
||
NB. Most of the `rustc_*` crates only provide **local | ||
providers**. Almost all **extern providers** wind up going through the | ||
`rustc_metadata` crate, which loads the information from the crate | ||
metadata. But in some cases there are crates that provide queries for | ||
*both* local and external crates, in which case they define both a | ||
`provide` and a `provide_extern` function that `rustc_driver` can | ||
invoke. | ||
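
Putting the pieces together, the sketch below models two provider tables and dispatch on the crate of the key. It is a simplification under stated assumptions: the toy `DefId`, the `Tcx` struct, the string-valued `type_of`, and the panicking defaults are invented here, and the real providers also take the `tcx` as their first argument.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DefId {
    pub krate: u32,
    pub index: u32,
}

pub const LOCAL_CRATE: u32 = 0;

/// Toy `Providers`: one function pointer per query.
#[derive(Clone, Copy)]
pub struct Providers {
    pub type_of: fn(DefId) -> String,
    pub fubar: fn(DefId) -> u32,
}

fn missing_type_of(_: DefId) -> String {
    panic!("no provider for `type_of`")
}

fn missing_fubar(_: DefId) -> u32 {
    panic!("no provider for `fubar`")
}

impl Default for Providers {
    fn default() -> Self {
        Providers { type_of: missing_type_of, fubar: missing_fubar }
    }
}

fn type_of(def_id: DefId) -> String {
    format!("local type #{}", def_id.index)
}

fn fubar(def_id: DefId) -> u32 {
    def_id.index * 2
}

fn extern_type_of(def_id: DefId) -> String {
    format!("type #{} from metadata of crate {}", def_id.index, def_id.krate)
}

/// Registers the local providers, using both equivalent styles.
pub fn provide(providers: &mut Providers) {
    *providers = Providers { type_of, ..*providers };
    providers.fubar = fubar;
}

/// Registers providers that answer queries about other crates.
pub fn provide_extern(providers: &mut Providers) {
    providers.type_of = extern_type_of;
}

pub struct Tcx {
    pub local: Providers,
    pub external: Providers,
}

impl Tcx {
    /// The crate named by the key, not the kind of query, picks the table.
    pub fn type_of(&self, def_id: DefId) -> String {
        let providers = if def_id.krate == LOCAL_CRATE {
            &self.local
        } else {
            &self.external
        };
        (providers.type_of)(def_id)
    }
}
```

The struct-update form works because every field is a `Copy` function pointer, so `..*providers` copies the untouched entries out of the existing table.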
|
||
### Adding a new kind of query | ||
|
||
So suppose you want to add a new kind of query, how do you do so? | ||
Well, defining a query takes place in two steps: | ||
|
||
1. first, you have to specify the query name and arguments; and then, | ||
2. you have to supply query providers where needed. | ||
|
||
To specify the query name and arguments, you simply add an entry | ||
to the big macro invocation in `mod.rs`. This will probably have changed | ||
by the time you read this README, but at present it looks something | ||
like: | ||
|
||
``` | ||
define_maps! { <'tcx> | ||
/// Records the type of every item. | ||
[] fn type_of: TypeOfItem(DefId) -> Ty<'tcx>, | ||
... | ||
} | ||
``` | ||
|
||
Each line of the macro defines one query. The name is broken up like this: | ||
|
||
``` | ||
[] fn type_of: TypeOfItem(DefId) -> Ty<'tcx>, | ||
^^ ^^^^^^^ ^^^^^^^^^^ ^^^^^ ^^^^^^^^ | ||
| | | | | | ||
| | | | result type of query | ||
| | | query key type | ||
| | dep-node constructor | ||
| name of query | ||
query flags | ||
``` | ||
|
||
Let's go over them one by one: | ||
|
||
- **Query flags:** these are largely unused right now, but the intention | ||
is that we'll be able to customize various aspects of how the query is | ||
processed. | ||
- **Name of query:** the name of the query method | ||
(`tcx.type_of(..)`). Also used as the name of a struct | ||
(`ty::maps::queries::type_of`) that will be generated to represent | ||
this query. | ||
- **Dep-node constructor:** indicates the constructor function that | ||
connects this query to incremental compilation. Typically, this is a | ||
`DepNode` variant, which can be added by modifying the | ||
`define_dep_nodes!` macro invocation in | ||
`librustc/dep_graph/dep_node.rs`. | ||
- However, sometimes we use a custom function, in which case the | ||
name will be in snake case and the function will be defined at the | ||
bottom of the file. This is typically used when the query key is | ||
not a def-id, or just not the type that the dep-node expects. | ||
- **Query key type:** the type of the argument to this query. | ||
This type must implement the `ty::maps::keys::Key` trait, which | ||
defines (for example) how to map it to a crate, and so forth. | ||
- **Result type of query:** the type produced by this query. This type | ||
should (a) not use `RefCell` or other interior mutability and (b) be | ||
cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for | ||
non-trivial data types. | ||
- The one exception to those rules is the `ty::steal::Steal` type, | ||
which is used to cheaply modify MIR in place. See the definition | ||
of `Steal` for more details. New uses of `Steal` should **not** be | ||
added without alerting `@rust-lang/compiler`. | ||
|
||
So, to add a query: | ||
|
||
- Add an entry to `define_maps!` using the format above. | ||
- Possibly add a corresponding entry to the dep-node macro. | ||
- Link the provider by modifying the appropriate `provide` method; | ||
or add a new one if needed and ensure that `rustc_driver` is invoking it. | ||
|
||
#### Query structs and descriptions | ||
|
||
For each kind, the `define_maps` macro will generate a "query struct" | ||
named after the query. This struct is a kind of a place-holder | ||
describing the query. Each such struct implements the | ||
`self::config::QueryConfig` trait, which has associated types for the | ||
key/value of that particular query. Basically the code generated looks something | ||
like this: | ||
|
||
```rust | ||
// Dummy struct representing a particular kind of query: | ||
pub struct type_of<'tcx> { phantom: PhantomData<&'tcx ()> } | ||
|
||
impl<'tcx> QueryConfig for type_of<'tcx> { | ||
type Key = DefId; | ||
type Value = Ty<'tcx>; | ||
} | ||
``` | ||
|
||
There is an additional trait that you may wish to implement called | ||
`self::config::QueryDescription`. This trait is used during cycle | ||
errors to give a "human readable" name for the query, so that we can | ||
summarize what was happening when the cycle occurred. Implementing | ||
this trait is optional if the query key is `DefId`, but if you *don't* | ||
implement it, you get a pretty generic error ("processing `foo`..."). | ||
You can put new impls into the `config` module. They look something like this: | ||
|
||
```rust | ||
impl<'tcx> QueryDescription for queries::type_of<'tcx> { | ||
fn describe(tcx: TyCtxt, key: DefId) -> String { | ||
format!("computing the type of `{}`", tcx.item_path_str(key)) | ||
} | ||
} | ||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,162 @@ | ||
// Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT | ||
// file at the top-level directory of this distribution and at | ||
// http://rust-lang.org/COPYRIGHT. | ||
// | ||
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or | ||
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license | ||
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your | ||
// option. This file may not be copied, modified, or distributed | ||
// except according to those terms. | ||
|
||
//! Defines the set of legal keys that can be used in queries. | ||
use hir::def_id::{CrateNum, DefId, LOCAL_CRATE, DefIndex}; | ||
use mir::transform::{MirSuite, MirPassIndex}; | ||
use ty::{self, Ty, TyCtxt}; | ||
use ty::subst::Substs; | ||
use ty::fast_reject::SimplifiedType; | ||
|
||
use std::fmt::Debug; | ||
use std::hash::Hash; | ||
use syntax_pos::{Span, DUMMY_SP}; | ||
use syntax_pos::symbol::InternedString; | ||
|
||
/// The `Key` trait controls what types can legally be used as the key | ||
/// for a query. | ||
pub trait Key: Clone + Hash + Eq + Debug { | ||
/// Given an instance of this key, what crate is it referring to? | ||
/// This is used to find the provider. | ||
fn map_crate(&self) -> CrateNum; | ||
|
||
/// In the event that a cycle occurs, if no explicit span has been | ||
/// given for a query with key `self`, what span should we use? | ||
fn default_span(&self, tcx: TyCtxt) -> Span; | ||
} | ||
|
||
impl<'tcx> Key for ty::InstanceDef<'tcx> { | ||
fn map_crate(&self) -> CrateNum { | ||
LOCAL_CRATE | ||
} | ||
|
||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
tcx.def_span(self.def_id()) | ||
} | ||
} | ||
|
||
impl<'tcx> Key for ty::Instance<'tcx> { | ||
fn map_crate(&self) -> CrateNum { | ||
LOCAL_CRATE | ||
} | ||
|
||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
tcx.def_span(self.def_id()) | ||
} | ||
} | ||
|
||
impl Key for CrateNum { | ||
fn map_crate(&self) -> CrateNum { | ||
*self | ||
} | ||
fn default_span(&self, _: TyCtxt) -> Span { | ||
DUMMY_SP | ||
} | ||
} | ||
|
||
impl Key for DefIndex { | ||
fn map_crate(&self) -> CrateNum { | ||
LOCAL_CRATE | ||
} | ||
fn default_span(&self, _tcx: TyCtxt) -> Span { | ||
DUMMY_SP | ||
} | ||
} | ||
|
||
impl Key for DefId { | ||
fn map_crate(&self) -> CrateNum { | ||
self.krate | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
tcx.def_span(*self) | ||
} | ||
} | ||
|
||
impl Key for (DefId, DefId) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.0.krate | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.1.default_span(tcx) | ||
} | ||
} | ||
|
||
impl Key for (CrateNum, DefId) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.0 | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.1.default_span(tcx) | ||
} | ||
} | ||
|
||
impl Key for (DefId, SimplifiedType) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.0.krate | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.0.default_span(tcx) | ||
} | ||
} | ||
|
||
impl<'tcx> Key for (DefId, &'tcx Substs<'tcx>) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.0.krate | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.0.default_span(tcx) | ||
} | ||
} | ||
|
||
impl Key for (MirSuite, DefId) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.1.map_crate() | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.1.default_span(tcx) | ||
} | ||
} | ||
|
||
impl Key for (MirSuite, MirPassIndex, DefId) { | ||
fn map_crate(&self) -> CrateNum { | ||
self.2.map_crate() | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.2.default_span(tcx) | ||
} | ||
} | ||
|
||
impl<'tcx> Key for Ty<'tcx> { | ||
fn map_crate(&self) -> CrateNum { | ||
LOCAL_CRATE | ||
} | ||
fn default_span(&self, _: TyCtxt) -> Span { | ||
DUMMY_SP | ||
} | ||
} | ||
|
||
impl<'tcx, T: Key> Key for ty::ParamEnvAnd<'tcx, T> { | ||
fn map_crate(&self) -> CrateNum { | ||
self.value.map_crate() | ||
} | ||
fn default_span(&self, tcx: TyCtxt) -> Span { | ||
self.value.default_span(tcx) | ||
} | ||
} | ||
|
||
impl Key for InternedString { | ||
fn map_crate(&self) -> CrateNum { | ||
LOCAL_CRATE | ||
} | ||
fn default_span(&self, _tcx: TyCtxt) -> Span { | ||
DUMMY_SP | ||
} | ||
} |
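As a reading aid, here is a minimal, self-contained sketch of how `map_crate` could be used to pick a provider. It is a simplified model, not the compiler's code: `CrateNum`, `DefId`, and `LOCAL_CRATE` are stand-in definitions, the `default_span` method is omitted, and the `find_provider` helper and its string-valued provider table are hypothetical.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

// Stand-ins for the compiler's types (assumed simplifications).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct CrateNum(u32);

const LOCAL_CRATE: CrateNum = CrateNum(0);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId {
    krate: CrateNum,
    index: u32,
}

// Simplified version of the trait: only `map_crate` is modelled here.
trait Key: Clone + Hash + Eq + Debug {
    fn map_crate(&self) -> CrateNum;
}

impl Key for CrateNum {
    fn map_crate(&self) -> CrateNum {
        *self
    }
}

impl Key for DefId {
    fn map_crate(&self) -> CrateNum {
        self.krate
    }
}

impl Key for (CrateNum, DefId) {
    // Mirrors the tuple impl above: the crate comes from the first element.
    fn map_crate(&self) -> CrateNum {
        self.0
    }
}

// Hypothetical provider table: one named provider per crate.
fn find_provider<K: Key>(
    providers: &HashMap<CrateNum, &'static str>,
    key: &K,
) -> Option<&'static str> {
    providers.get(&key.map_crate()).copied()
}

fn main() {
    let mut providers = HashMap::new();
    providers.insert(LOCAL_CRATE, "local");
    providers.insert(CrateNum(1), "extern");

    let def = DefId { krate: CrateNum(1), index: 7 };
    println!("def index {} -> {:?}", def.index, find_provider(&providers, &def));
    println!("tuple key -> {:?}", find_provider(&providers, &(LOCAL_CRATE, def)));
}
```

The point of the sketch is only that every key type answers the same question ("which crate do you refer to?"), so the lookup code can stay generic over `K: Key`.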
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
// Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT | ||
// file at the top-level directory of this distribution and at | ||
// http://rust-lang.org/COPYRIGHT. | ||
// | ||
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or | ||
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license | ||
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your | ||
// option. This file may not be copied, modified, or distributed | ||
// except according to those terms. | ||
|
||
use ty::{self, Ty, TyCtxt}; | ||
|
||
use syntax::symbol::Symbol; | ||
|
||
pub(super) trait Value<'tcx>: Sized { | ||
fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> Self; | ||
} | ||
|
||
impl<'tcx, T> Value<'tcx> for T { | ||
default fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> T { | ||
tcx.sess.abort_if_errors(); | ||
bug!("Value::from_cycle_error called without errors"); | ||
} | ||
} | ||
|
||
impl<'tcx, T: Default> Value<'tcx> for T { | ||
default fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> T { | ||
T::default() | ||
} | ||
} | ||
|
||
impl<'tcx> Value<'tcx> for Ty<'tcx> { | ||
fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> Ty<'tcx> { | ||
tcx.types.err | ||
} | ||
} | ||
|
||
impl<'tcx> Value<'tcx> for ty::DtorckConstraint<'tcx> { | ||
fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> Self { | ||
Self::empty() | ||
} | ||
} | ||
|
||
impl<'tcx> Value<'tcx> for ty::SymbolName { | ||
fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> Self { | ||
ty::SymbolName { name: Symbol::intern("<error>").as_str() } | ||
} | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
NB: This crate is part of the Rust compiler. For an overview of the | ||
compiler as a whole, see | ||
[the README.md file found in `librustc`](../librustc/README.md). | ||
|
||
`librustc_back` contains some very low-level details that are | ||
specific to different LLVM targets and so forth. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
NB: This crate is part of the Rust compiler. For an overview of the | ||
compiler as a whole, see | ||
[the README.md file found in `librustc`](../librustc/README.md). | ||
|
||
The `driver` crate is effectively the "main" function for the Rust | ||
compiler. It orchestrates the compilation process and "knits together" | ||
the code from the other crates within rustc. This crate itself does | ||
not contain any of the "main logic" of the compiler (though it does | ||
have some code related to pretty printing or other minor compiler | ||
options). | ||
This definition of |
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,7 @@ | ||
See [librustc/README.md](../librustc/README.md). | ||
NB: This crate is part of the Rust compiler. For an overview of the | ||
compiler as a whole, see | ||
[the README.md file found in `librustc`](../librustc/README.md). | ||
|
||
The `trans` crate contains the code to convert from MIR into LLVM IR, | ||
and then from LLVM IR into machine code. In general it contains code | ||
that runs towards the end of the compilation process. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
NB: This crate is part of the Rust compiler. For an overview of the | ||
compiler as a whole, see | ||
[the README.md file found in `librustc`](../librustc/README.md). | ||
|
||
The `rustc_typeck` crate contains the source for "type collection" and | ||
"type checking", as well as a few other bits of related functionality. | ||
(It draws heavily on the [type inferencing][infer] and | ||
[trait solving][traits] code found in librustc.) | ||
|
||
[infer]: ../librustc/infer/README.md | ||
[traits]: ../librustc/traits/README.md | ||
|
||
## Type collection | ||
|
||
Type "collection" is the process of converting the types found in the | ||
HIR (`hir::Ty`), which represent the syntactic things that the user | ||
wrote, into the **internal representation** used by the compiler | ||
(`Ty<'tcx>`) -- we also do similar conversions for where-clauses and | ||
other bits of the function signature. | ||
|
||
To try and get a sense for the difference, consider this function: | ||
|
||
```rust | ||
struct Foo { } | ||
fn foo(x: Foo, y: self::Foo) { .. } | ||
// ^^^ ^^^^^^^^^ | ||
``` | ||
|
||
Those two parameters `x` and `y` each have the same type, but they | ||
will have distinct `hir::Ty` nodes. Those nodes will have different | ||
spans, and of course they encode the path somewhat differently. But | ||
once they are "collected" into `Ty<'tcx>` nodes, they will be | ||
represented by the exact same internal type. | ||
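
The idea can be modelled in a few lines. This is a rough sketch under stated assumptions, not rustc's data structures: `HirTy`, `Ty`, and `Interner` are hypothetical stand-ins, and "name resolution" is reduced to dropping leading `self` path segments.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for `hir::Ty`: what the user wrote, and where.
#[derive(Debug)]
struct HirTy {
    path: Vec<&'static str>, // e.g. ["self", "Foo"]
    span: (usize, usize),    // byte range in the source text
}

// Hypothetical stand-in for `Ty<'tcx>`: a handle to an interned type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Ty(usize);

#[derive(Default)]
struct Interner {
    types: HashMap<String, Ty>,
}

impl Interner {
    // "Collect" a syntactic type into its internal representation.
    fn collect(&mut self, hir_ty: &HirTy) -> Ty {
        // Crude stand-in for path resolution: ignore leading `self` segments.
        let resolved: Vec<&str> = hir_ty
            .path
            .iter()
            .copied()
            .skip_while(|seg| *seg == "self")
            .collect();
        let name = resolved.join("::");
        // Reuse the existing handle if this type was seen before.
        let next = Ty(self.types.len());
        *self.types.entry(name).or_insert(next)
    }
}

fn main() {
    let mut interner = Interner::default();
    let x = HirTy { path: vec!["Foo"], span: (25, 28) };
    let y = HirTy { path: vec!["self", "Foo"], span: (33, 42) };

    let tx = interner.collect(&x);
    let ty = interner.collect(&y);
    println!("spans differ: {}", x.span != y.span);
    println!("same internal type: {}", tx == ty);
}
```

The two `HirTy` values differ in both path and span, yet `collect` hands back the same `Ty` handle for each.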
|
||
Collection is defined as a bundle of queries (e.g., `type_of`) for | ||
computing information about the various functions, traits, and other | ||
items in the crate being compiled. Note that each of these queries is | ||
concerned with *interprocedural* things -- for example, for a function | ||
definition, collection will figure out the type and signature of the | ||
function, but it will not visit the *body* of the function in any way, | ||
nor examine type annotations on local variables (that's the job of | ||
type *checking*). | ||
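
To make the signature/body split concrete, here is a small sketch of a memoized, `type_of`-style query. Everything in it is an assumption made for illustration: the `Queries` struct, the `FnItem` layout, and the string-valued result are hypothetical, and the real query system lives on `TyCtxt` with different machinery.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

// Hypothetical item: the signature and the body are stored separately.
struct FnItem {
    signature: &'static str,
    // Intentionally never read by `type_of`; bodies belong to type checking.
    #[allow(dead_code)]
    body: &'static str,
}

struct Queries {
    items: HashMap<DefId, FnItem>,
    type_of_cache: HashMap<DefId, String>,
    computed: usize, // how many times the query actually ran
}

impl Queries {
    // Computes the type of an item from its signature alone and caches it.
    fn type_of(&mut self, def_id: DefId) -> String {
        if let Some(cached) = self.type_of_cache.get(&def_id) {
            return cached.clone();
        }
        let item = &self.items[&def_id];
        let ty = format!("fn{}", item.signature);
        self.computed += 1;
        self.type_of_cache.insert(def_id, ty.clone());
        ty
    }
}

fn main() {
    let mut items = HashMap::new();
    items.insert(
        DefId(0),
        FnItem { signature: "(Foo, Foo)", body: "{ /* not inspected */ }" },
    );
    let mut queries = Queries { items, type_of_cache: HashMap::new(), computed: 0 };

    println!("{}", queries.type_of(DefId(0)));
    println!("{}", queries.type_of(DefId(0))); // served from the cache
    println!("query ran {} time(s)", queries.computed);
}
```

Because the query never touches `body`, editing a function body in this model would not change the cached result.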
|
||
For more details, see the `collect` module. | ||
|
||
## Type checking | ||
|
||
TODO |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
NB: This crate is part of the Rust compiler. For an overview of the | ||
compiler as a whole, see | ||
[the README.md file found in `librustc`](../librustc/README.md). | ||
|
||
The `syntax` crate contains those things concerned purely with syntax | ||
– that is, the AST ("abstract syntax tree"), parser, pretty-printer, | ||
lexer, macro expander, and utilities for traversing ASTs. |
typo: find find