rework the README.md for rustc and add other readmes #44505

Merged
merged 8 commits · Sep 20, 2017
341 changes: 185 additions & 156 deletions src/librustc/README.md

Large diffs are not rendered by default.

119 changes: 119 additions & 0 deletions src/librustc/hir/README.md
@@ -0,0 +1,119 @@
# Introduction to the HIR

The HIR -- "High-level IR" -- is the primary IR used in most of
rustc. It is a desugared version of the "abstract syntax tree" (AST)
that is generated after parsing, macro expansion, and name resolution
have completed. Many parts of HIR resemble Rust surface syntax quite
closely, with the exception that some of Rust's expression forms have
been desugared away (as an example, `for` loops are converted into a
`loop` and do not appear in the HIR).

This README covers the main concepts of the HIR.

### Out-of-band storage and the `Crate` type

The top-level data-structure in the HIR is the `Crate`, which stores
the contents of the crate currently being compiled (we only ever
construct HIR for the current crate). Whereas in the AST the crate
data structure basically just contains the root module, the HIR
`Crate` structure contains a number of maps and other things that
serve to organize the content of the crate for easier access.

For example, the contents of individual items (e.g., modules,
functions, traits, impls, etc) in the HIR are not immediately
accessible in the parents. So, for example, if we had a module item `foo`
containing a function `bar()`:

```
mod foo {
    fn bar() { }
}
```

Then in the HIR the representation of module `foo` (the `Mod`
struct) would have only the **`ItemId`** `I` of `bar()`. To get the
details of the function `bar()`, we would look up `I` in the
`items` map.

One nice result from this representation is that one can iterate
over all items in the crate by iterating over the key-value pairs
in these maps (without the need to trawl through the IR in total).
There are similar maps for things like trait items and impl items,
as well as "bodies" (explained below).
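
For instance, a sketch of both operations -- iterating over every item
in the crate and resolving a single `ItemId` -- might look like this
(the field names are assumed from the description above, not copied
from the actual definitions):

```rust
// Sketch only: assumes `krate: &hir::Crate` is in scope and that the
// out-of-band storage is a map from node-ids to items, as described above.
fn walk_crate(krate: &hir::Crate) {
    // Iterate over *all* items without traversing the module tree.
    for (id, item) in &krate.items {
        println!("item {:?} is named {:?}", id, item.name);
    }
}

// Sketch only: resolve the `ItemId` stored in `foo`'s `Mod` to the full item.
fn lookup_bar<'hir>(krate: &'hir hir::Crate, bar_id: hir::ItemId) -> &'hir hir::Item {
    &krate.items[&bar_id.id]
}
```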

The other reason to set up the representation this way is for better
integration with incremental compilation. This way, if you gain access
to a `&hir::Item` (e.g. for the mod `foo`), you do not immediately
gain access to the contents of the function `bar()`. Instead, you only
gain access to the **id** for `bar()`, and you must invoke some
function to look up the contents of `bar()` given its id; this gives us
a chance to observe that you accessed the data for `bar()` and record
the dependency.

### Identifiers in the HIR

Most of the code that has to deal with things in HIR tends not to
carry around references into the HIR, but rather to carry around
*identifier numbers* (or just "ids"). Right now, you will find four
sorts of identifiers in active use (a simplified sketch of their
structure follows this list):

- `DefId`, which primarily names "definitions" or top-level items.
  - You can think of a `DefId` as being shorthand for a very explicit
    and complete path, like `std::collections::HashMap`. However,
    these paths are able to name things that are not nameable in
    normal Rust (e.g., impls), and they also include extra information
    about the crate (such as its version number, as two versions of
    the same crate can co-exist).
  - A `DefId` really consists of two parts, a `CrateNum` (which
    identifies the crate) and a `DefIndex` (which indexes into a list
    of items that is maintained per crate).
- `HirId`, which combines the index of a particular item with an
  offset within that item.
  - The key point of a `HirId` is that it is *relative* to some item
    (which is named via a `DefId`).
- `BodyId`, an absolute identifier that refers to a specific
  body (definition of a function or constant) in the crate. It is currently
  effectively a "newtype'd" `NodeId`.
- `NodeId`, which is an absolute id that identifies a single node in the HIR tree.
  - While these are still in common use, **they are being slowly phased out**.
  - Since they are absolute within the crate, adding a new node
    anywhere in the tree causes the node-ids of all subsequent code in
    the crate to change. This is terrible for incremental compilation,
    as you can perhaps imagine.
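
As a rough mental model, the identifiers above can be pictured like
this (a simplified sketch, not the actual rustc definitions):

```rust
// Simplified sketch of the identifier kinds described above; the real
// definitions live in `hir::def_id` and `hir` and differ in detail.
struct DefId {
    krate: CrateNum,   // which crate the definition lives in
    index: DefIndex,   // index into that crate's per-crate list of definitions
}

struct HirId {
    owner: DefIndex,        // the item this node belongs to
    local_id: ItemLocalId,  // offset of the node *within* that item
}

struct BodyId {
    node_id: NodeId,   // effectively a "newtype'd" `NodeId`, as noted above
}
```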

### HIR Map

Most of the time when you are working with the HIR, you will do so via
the **HIR Map**, accessible in the tcx via `tcx.hir` (and defined in
the `hir::map` module). The HIR map contains a number of methods to
convert between ids of various kinds and to look up data associated
with a HIR node.

For example, if you have a `DefId`, and you would like to convert it
to a `NodeId`, you can use `tcx.hir.as_local_node_id(def_id)`. This
returns an `Option<NodeId>` -- this will be `None` if the def-id
refers to something outside of the current crate (since then it has no
HIR node), but otherwise returns `Some(n)` where `n` is the node-id of
the definition.
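
In code, that conversion typically looks something like this sketch
(assuming a `tcx` and a `def_id` in scope):

```rust
// Sketch: map a `DefId` back to its HIR node-id, if the definition is local.
if let Some(node_id) = tcx.hir.as_local_node_id(def_id) {
    // `node_id` identifies the HIR node for `def_id` in the current crate.
} else {
    // `def_id` refers to something from another crate; it has no HIR here.
}
```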

Similarly, you can use `tcx.hir.find(n)` to look up the node for a
`NodeId`. This returns an `Option<Node<'tcx>>`, where `Node` is an enum
defined in the map; by matching on this you can find out what sort of
node the node-id referred to and also get a pointer to the data
itself. Often, you know what sort of node `n` is -- e.g., if you know
that `n` must be some HIR expression, you can do
`tcx.hir.expect_expr(n)`, which will extract and return the
`&hir::Expr`, panicking if `n` is not in fact an expression.
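
A sketch of the general lookup (the `Node` variant name shown here is
an assumption, not checked against the actual enum):

```rust
// Sketch: inspect what kind of node a `NodeId` refers to.
match tcx.hir.find(n) {
    Some(hir::map::Node::NodeExpr(expr)) => {
        // `n` was an expression; `expr` is the `&hir::Expr`.
    }
    Some(_other) => {
        // `n` referred to some other kind of node (item, pattern, ...).
    }
    None => {
        // `n` is not present in the HIR map at all.
    }
}
```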

Finally, you can use the HIR map to find the parents of nodes, via
calls like `tcx.hir.get_parent_node(n)`.

### HIR Bodies

A **body** represents some kind of executable code, such as the body
of a function/closure or the definition of a constant. Bodies are
associated with an **owner**, which is typically some kind of item
(e.g., a `fn()` or `const`), but could also be a closure expression
(e.g., `|x, y| x + y`). You can use the HIR map to find the body
associated with a given def-id (`maybe_body_owned_by()`) or to find
the owner of a body (`body_owner_def_id()`).

Contributor comment: typo: find find
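
A sketch of going from a definition to its body and back (the exact
argument and id types these methods take are assumed here):

```rust
// Sketch: find the body owned by a definition, then recover its owner.
if let Some(body_id) = tcx.hir.maybe_body_owned_by(id) {
    let body = tcx.hir.body(body_id);                  // the `&hir::Body` itself
    let owner_def_id = tcx.hir.body_owner_def_id(body_id);
    // `owner_def_id` names the fn/const/closure that owns `body`.
}
```
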
4 changes: 4 additions & 0 deletions src/librustc/hir/map/README.md
@@ -0,0 +1,4 @@
The HIR map, accessible via `tcx.hir`, allows you to quickly navigate the
HIR and convert between various forms of identifiers. See [the HIR README] for more information.

[the HIR README]: ../README.md
26 changes: 25 additions & 1 deletion src/librustc/hir/mod.rs
@@ -413,6 +413,10 @@ pub struct WhereEqPredicate {

pub type CrateConfig = HirVec<P<MetaItem>>;

/// The top-level data structure that stores the entire contents of
/// the crate currently being compiled.
///
Contributor comment: Does this newline serve a purpose?

/// For more details, see [the module-level README](README.md).
#[derive(Clone, PartialEq, Eq, RustcEncodable, RustcDecodable, Debug)]
pub struct Crate {
pub module: Mod,
@@ -927,7 +931,27 @@ pub struct BodyId {
pub node_id: NodeId,
}

/// The body of a function or constant value.
/// The body of a function, closure, or constant value. In the case of
/// a function, the body contains not only the function body itself
/// (which is an expression), but also the argument patterns, since
/// those are something that the caller doesn't really care about.
///
/// # Examples
///
/// ```
/// fn foo((x, y): (u32, u32)) -> u32 {
///     x + y
/// }
/// ```
///
/// Here, the `Body` associated with `foo()` would contain:
///
/// - an `arguments` array containing the `(x, y)` pattern
/// - a `value` containing the `x + y` expression (maybe wrapped in a block)
/// - `is_generator` would be false
///
/// All bodies have an **owner**, which can be accessed via the HIR
/// map using `body_owner_def_id()`.
#[derive(Clone, PartialEq, Eq, RustcEncodable, RustcDecodable, Hash, Debug)]
pub struct Body {
pub arguments: HirVec<Arg>,
23 changes: 22 additions & 1 deletion src/librustc/lib.rs
@@ -8,7 +8,28 @@
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! The Rust compiler.
//! The "main crate" of the Rust compiler. This crate contains common
//! type definitions that are used by the other crates in the rustc
//! "family". Here are some prominent examples (note that each of these
//! modules has its own README with further details):
//!
//! - **HIR.** The "high-level (H) intermediate representation (IR)" is
//! defined in the `hir` module.
//! - **MIR.** The "mid-level (M) intermediate representation (IR)" is
//! defined in the `mir` module. This module contains only the
//! *definition* of the MIR; the passes that transform and operate
//! on MIR are found in the `librustc_mir` crate.
//! - **Types.** The internal representation of types used in rustc is
//! defined in the `ty` module. This includes the **type context**
//! (or `tcx`), which is the central context during most of
//! compilation, containing the interners and other things.
//! - **Traits.** Trait resolution is implemented in the `traits` module.
//! - **Type inference.** The type inference code can be found in the `infer` module;
//! this code handles low-level equality and subtyping operations. The
//! type check pass in the compiler is found in the `librustc_typeck` crate.
//!
//! For a deeper explanation of how the compiler works and is
//! organized, see the README.md file in this directory.
//!
//! # Note
//!
165 changes: 165 additions & 0 deletions src/librustc/ty/README.md
@@ -0,0 +1,165 @@
# Types and the Type Context

The `ty` module defines how the Rust compiler represents types
internally. It also defines the *typing context* (`tcx` or `TyCtxt`),
which is the central data structure in the compiler.

## The tcx and how it uses lifetimes

The `tcx` ("typing context") is the central data structure in the
compiler. It is the context that you use to perform all manner of
queries. The struct `TyCtxt` defines a reference to this shared context:

```rust
tcx: TyCtxt<'a, 'gcx, 'tcx>
// -- ---- ----
// | | |
// | | innermost arena lifetime (if any)
// | "global arena" lifetime
// lifetime of this reference
```

As you can see, the `TyCtxt` type takes three lifetime parameters.
These lifetimes are perhaps the most complex thing to understand about
the tcx. During Rust compilation, we allocate most of our memory in
**arenas**, which are basically pools of memory that get freed all at
once. When you see a reference with a lifetime like `'tcx` or `'gcx`,
you know that it refers to arena-allocated data (or data that lives as
long as the arenas, anyhow).

We use two distinct levels of arenas. The outer level is the "global
arena". This arena lasts for the entire compilation: so anything you
allocate in there is only freed once compilation is basically over
(actually, when we shift to executing LLVM).

To reduce peak memory usage, when we do type inference, we also use an
inner level of arena. These arenas get thrown away once type inference
is over. This is done because type inference generates a lot of
"throw-away" types that are not particularly interesting after type
inference completes, so keeping around those allocations would be
wasteful.

Often, we wish to write code that explicitly asserts that it is not
taking place during inference. In that case, there is no "local"
arena, and all the types that you can access are allocated in the
global arena. To express this, the idea is to use the same lifetime
for the `'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch
confusing, we tend to use the name `'tcx` in such contexts. Here is an
example:

```rust
fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) {
    //                                           ----  ----
    //                                           Using the same lifetime here asserts
    //                                           that the innermost arena accessible through
    //                                           this reference *is* the global arena.
}
```

In contrast, if we want code that is usable during type inference, then we
need to declare distinct `'gcx` and `'tcx` lifetime parameters:

```rust
fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) {
    //                                                 ----  ----
    //                                                 Using different lifetimes here means that
    //                                                 the innermost arena *may* be distinct
    //                                                 from the global arena (but doesn't have to be).
}
```

### Allocating and working with types

Rust types are represented using the `Ty<'tcx>` defined in the `ty`
module (not to be confused with the `Ty` struct from [the HIR]). This
is in fact a simple type alias for a reference with `'tcx` lifetime:

```rust
pub type Ty<'tcx> = &'tcx TyS<'tcx>;
```

[the HIR]: ../hir/README.md

You can basically ignore the `TyS` struct -- you will almost never
access it explicitly. We always pass it by reference using the
`Ty<'tcx>` alias -- the only exception I think is to define inherent
methods on types. Instances of `TyS` are only ever allocated in one of
the rustc arenas (never e.g. on the stack).

One common operation on types is to **match** and see what kinds of
types they are. This is done by matching on `ty.sty`, sort of like this:

```rust
fn test_type<'tcx>(ty: Ty<'tcx>) {
    match ty.sty {
        ty::TyArray(elem_ty, len) => { ... }
        ...
    }
}
```

The `sty` field (the origin of this name is unclear to me; perhaps
structural type?) is of type `TypeVariants<'tcx>`, which is an enum
defining all of the different kinds of types in the compiler.

> NB: inspecting the `sty` field on types during type inference can be
> risky, as there may be inference variables and other things to
> consider, or sometimes types are not yet known and will only become
> known later.

To allocate a new type, you can use the various `mk_` methods defined
on the `tcx`. These have names that correspond mostly to the various kinds
of type variants. For example:

```rust
let array_ty = tcx.mk_array(elem_ty, len * 2);
```

These methods all return a `Ty<'tcx>` -- note that the lifetime you
get back is the lifetime of the innermost arena that this `tcx` has
access to. In fact, types are always canonicalized and interned (so we
never allocate exactly the same type twice) and are always allocated
in the outermost arena where they can be (so, if they do not contain
any inference variables or other "temporary" types, they will be
allocated in the global arena). However, the lifetime `'tcx` is always
a safe approximation, so that is what you get back.

> NB. Because types are interned, it is possible to compare them for
> equality efficiently using `==` -- however, this is almost never what
> you want to do unless you happen to be hashing and looking for
> duplicates. This is because often in Rust there are multiple ways to
> represent the same type, particularly once inference is involved. If
> you are going to be testing for type equality, you probably need to
> start looking into the inference code to do it right.

You can also find various common types in the tcx itself by accessing
`tcx.types.bool`, `tcx.types.char`, etc (see `CommonTypes` for more).
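
For example, combining a common type with one of the `mk_` constructors
(a small sketch assuming a `tcx` in scope):

```rust
// Sketch: build the type `[bool; 4]` from a common type and `mk_array`.
let bool_ty = tcx.types.bool;
let array_ty = tcx.mk_array(bool_ty, 4);

// Because types are interned, building the "same" type again yields the
// same allocation, so `==` here is a cheap pointer comparison (though, as
// noted above, raw equality is rarely what you want semantically).
assert!(array_ty == tcx.mk_array(tcx.types.bool, 4));
```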

### Beyond types: Other kinds of arena-allocated data structures

In addition to types, there are a number of other arena-allocated data
structures that you can allocate, and which are found in this
module. Here are a few examples:

- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to
  specify the values to be substituted for generics (e.g., `HashMap<i32, u32>`
  would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`).
- `TraitRef`, typically passed by value -- a **trait reference**
  consists of a reference to a trait along with its various type
  parameters (including `Self`), like `i32: Display` (here, the def-id
  would reference the `Display` trait, and the substs would contain
  `i32`).
- `Predicate` defines something the trait system has to prove (see `traits` module).

### Import conventions

Although there is no hard and fast rule, the `ty` module tends to be used like so:

```rust
use ty::{self, Ty, TyCtxt};
```

In particular, since they are so common, the `Ty` and `TyCtxt` types
are imported directly. Other types are often referenced with an
explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules
choose to import a larger or smaller set of names explicitly.
7 changes: 4 additions & 3 deletions src/librustc/ty/context.rs
@@ -793,9 +793,10 @@ impl<'tcx> CommonTypes<'tcx> {
}
}

/// The data structure to keep track of all the information that typechecker
/// generates so that so that it can be reused and doesn't have to be redone
/// later on.
/// The central data structure of the compiler. It stores references
/// to the various **arenas** and also houses the results of the
/// various **compiler queries** that have been performed. See [the
/// README](README.md) for more details.
#[derive(Copy, Clone)]
pub struct TyCtxt<'a, 'gcx: 'a+'tcx, 'tcx: 'a> {
gcx: &'a GlobalCtxt<'gcx>,
1,551 changes: 0 additions & 1,551 deletions src/librustc/ty/maps.rs

This file was deleted.

302 changes: 302 additions & 0 deletions src/librustc/ty/maps/README.md
@@ -0,0 +1,302 @@
# The Rust Compiler Query System

The Compiler Query System is the key to our new demand-driven
organization. The idea is pretty simple. You have various queries
that compute things about the input -- for example, there is a query
called `type_of(def_id)` that, given the def-id of some item, will
compute the type of that item and return it to you.

Query execution is **memoized** -- so the first time you invoke a
query, it will go do the computation, but the next time, the result is
returned from a hashtable. Moreover, query execution fits nicely into
**incremental computation**; the idea is roughly that, when you do a
query, the result **may** be returned to you by loading stored data
from disk (but that's a separate topic we won't discuss further here).
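
As a self-contained illustration of the memoization idea (a toy model,
not rustc's actual query machinery):

```rust
use std::collections::HashMap;

// Toy model of a memoized query: the first call runs the "provider",
// later calls with the same key are served straight from the cache.
struct Queries {
    type_of_cache: HashMap<u32, String>, // toy "def-id" -> toy "type"
}

impl Queries {
    fn type_of(&mut self, def_id: u32) -> String {
        if let Some(cached) = self.type_of_cache.get(&def_id) {
            return cached.clone(); // cache hit: no recomputation
        }
        // Stand-in for the real provider function.
        let result = format!("type_of({})", def_id);
        self.type_of_cache.insert(def_id, result.clone());
        result
    }
}

fn main() {
    let mut queries = Queries { type_of_cache: HashMap::new() };
    let first = queries.type_of(42);  // computed
    let second = queries.type_of(42); // returned from the cache
    assert_eq!(first, second);
}
```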

The overall vision is that, eventually, the entire compiler
control-flow will be query driven. There will effectively be one
top-level query ("compile") that will run compilation on a crate; this
will in turn demand information about that crate, starting from the
*end*. For example:

- This "compile" query might demand to get a list of codegen-units
(i.e., modules that need to be compiled by LLVM).
- But computing the list of codegen-units would invoke some subquery
that returns the list of all modules defined in the Rust source.
- That query in turn would invoke something asking for the HIR.
- This keeps going further and further back until we wind up doing the
actual parsing.

However, that vision is not fully realized. Still, big chunks of the
compiler (for example, generating MIR) work exactly like this.

### Invoking queries

Invoking a query is simple. The tcx ("type context") offers a method
for each defined query. So, for example, to invoke the `type_of`
query, you would just do this:

```rust
let ty = tcx.type_of(some_def_id);
```

### Cycles between queries

Currently, cycles during query execution should always result in a
compilation error. Typically, they arise because of illegal programs
that contain cyclic references they shouldn't (though sometimes they
arise because of compiler bugs, in which case we need to factor our
queries in a more fine-grained fashion to avoid them).

However, it is nonetheless often useful to *recover* from a cycle
(after reporting an error, say) and try to soldier on, so as to give a
better user experience. In order to recover from a cycle, you don't
get to use the nice method-call-style syntax. Instead, you invoke
using the `try_get` method, which looks roughly like this:

```rust
use ty::maps::queries;
...
match queries::type_of::try_get(tcx, DUMMY_SP, self.did) {
    Ok(result) => {
        // no cycle occurred! You can use `result`
    }
    Err(err) => {
        // A cycle occurred! The error value `err` is a `DiagnosticBuilder`,
        // meaning essentially an "in-progress", not-yet-reported error message.
        // See below for more details on what to do here.
    }
}
```

So, if you get back an `Err` from `try_get`, then a cycle *did* occur. This means that
you must ensure that a compiler error message is reported. You can do that in two ways:

The simplest is to invoke `err.emit()`. This will emit the cycle error to the user.

However, often cycles happen because of an illegal program, and you
know at that point that an error either already has been reported or
will be reported due to this cycle by some other bit of code. In that
case, you can invoke `err.cancel()` to not emit any error. It is
traditional to then invoke:

```
tcx.sess.delay_span_bug(some_span, "some message")
```

`delay_span_bug()` is a helper that says: we expect a compilation
error to have happened or to happen in the future; so, if compilation
ultimately succeeds, make an ICE with the message `"some
message"`. This is basically just a precaution in case you are wrong.

### How the compiler executes a query

So you may be wondering what happens when you invoke a query
method. The answer is that, for each query, the compiler maintains a
cache -- if your query has already been executed, then the answer is
simple: we clone the return value out of the cache and return it
(therefore, you should try to ensure that the return types of queries
are cheaply cloneable; insert an `Rc` if necessary).

#### Providers

If, however, the query is *not* in the cache, then the compiler will
try to find a suitable **provider**. A provider is a function that has
been defined and linked into the compiler somewhere that contains the
code to compute the result of the query.

**Providers are defined per-crate.** The compiler maintains,
internally, a table of providers for every crate, at least
conceptually. Right now, there are really two sets: the providers for
queries about the **local crate** (that is, the one being compiled)
and providers for queries about **external crates** (that is,
dependencies of the local crate). Note that what determines the crate
that a query is targeting is not the *kind* of query, but the *key*.
For example, when you invoke `tcx.type_of(def_id)`, that could be a
local query or an external query, depending on what crate the `def_id`
is referring to (see the `self::keys::Key` trait for more information
on how that works).

Providers always have the same signature:

```rust
fn provider<'cx, 'tcx>(tcx: TyCtxt<'cx, 'tcx, 'tcx>,
                       key: QUERY_KEY)
                       -> QUERY_RESULT
{
    ...
}
```

Providers take two arguments: the `tcx` and the query key. Note also
that they take the *global* tcx (i.e., they use the `'tcx` lifetime
twice), rather than taking a tcx with some active inference context.
They return the result of the query.

#### How providers are set up

When the tcx is created, it is given the providers by its creator using
the `Providers` struct. This struct is generated by the macros here, but it
is basically a big list of function pointers:

```rust
struct Providers {
    type_of: for<'cx, 'tcx> fn(TyCtxt<'cx, 'tcx, 'tcx>, DefId) -> Ty<'tcx>,
    ...
}
```

At present, we have one copy of the struct for local crates, and one
for external crates, though the plan is that we may eventually have
one per crate.

These `Provider` structs are ultimately created and populated by
`librustc_driver`, but it does this by distributing the work
throughout the other `rustc_*` crates. This is done by invoking
various `provide` functions. These functions tend to look something
like this:

```rust
pub fn provide(providers: &mut Providers) {
    *providers = Providers {
        type_of,
        ..*providers
    };
}
```

That is, they take an `&mut Providers` and mutate it in place. Usually
we use the formulation above just because it looks nice, but you could
as well do `providers.type_of = type_of`, which would be equivalent.
(Here, `type_of` would be a top-level function, defined as we saw
before.) So, if we wanted to add a provider for some other query,
let's call it `fubar`, into the crate above, we might modify the `provide()`
function like so:

```rust
pub fn provide(providers: &mut Providers) {
    *providers = Providers {
        type_of,
        fubar,
        ..*providers
    };
}

fn fubar<'cx, 'tcx>(tcx: TyCtxt<'cx, 'tcx, 'tcx>, key: DefId) -> Fubar<'tcx> { .. }
```

NB. Most of the `rustc_*` crates only provide **local
providers**. Almost all **extern providers** wind up going through the
`rustc_metadata` crate, which loads the information from the crate
metadata. But in some cases there are crates that provide queries for
*both* local and external crates, in which case they define both a
`provide` and a `provide_extern` function that `rustc_driver` can
invoke.
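
A sketch of what such a crate's wiring might look like (the provider
names `type_of` and `extern_type_of` here are placeholders):

```rust
// Hypothetical sketch of a crate supplying providers for both local and
// external crates, following the `provide`/`provide_extern` split above.
pub fn provide(providers: &mut Providers) {
    *providers = Providers {
        type_of,            // handles def-ids from the local crate
        ..*providers
    };
}

pub fn provide_extern(providers: &mut Providers) {
    *providers = Providers {
        type_of: extern_type_of,  // handles def-ids from dependencies
        ..*providers
    };
}
```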

### Adding a new kind of query

So suppose you want to add a new kind of query, how do you do so?
Well, defining a query takes place in two steps:

1. first, you have to specify the query name and arguments; and then,
2. you have to supply query providers where needed.

To specify the query name and arguments, you simply add an entry
to the big macro invocation in `mod.rs`. This will probably have changed
by the time you read this README, but at present it looks something
like:

```
define_maps! { <'tcx>
    /// Records the type of every item.
    [] fn type_of: TypeOfItem(DefId) -> Ty<'tcx>,
    ...
}
```

Each line of the macro defines one query. The name is broken up like this:

```
[] fn type_of: TypeOfItem(DefId) -> Ty<'tcx>,
^^    ^^^^^^^  ^^^^^^^^^^ ^^^^^     ^^^^^^^^
|     |        |          |         |
|     |        |          |         result type of query
|     |        |          query key type
|     |        dep-node constructor
|     name of query
query flags
```

Let's go over them one by one:

- **Query flags:** these are largely unused right now, but the intention
is that we'll be able to customize various aspects of how the query is
processed.
- **Name of query:** the name of the query method
(`tcx.type_of(..)`). Also used as the name of a struct
(`ty::maps::queries::type_of`) that will be generated to represent
this query.
- **Dep-node constructor:** indicates the constructor function that
connects this query to incremental compilation. Typically, this is a
`DepNode` variant, which can be added by modifying the
`define_dep_nodes!` macro invocation in
`librustc/dep_graph/dep_node.rs`.
  - However, sometimes we use a custom function, in which case the
    name will be in snake case and the function will be defined at the
    bottom of the file. This is typically used when the query key is
    not a def-id, or just not the type that the dep-node expects.
- **Query key type:** the type of the argument to this query.
This type must implement the `ty::maps::keys::Key` trait, which
defines (for example) how to map it to a crate, and so forth.
- **Result type of query:** the type produced by this query. This type
should (a) not use `RefCell` or other interior mutability and (b) be
cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for
non-trivial data types.
  - The one exception to those rules is the `ty::steal::Steal` type,
    which is used to cheaply modify MIR in place. See the definition
    of `Steal` for more details. New uses of `Steal` should **not** be
    added without alerting `@rust-lang/compiler`.

So, to add a query:

- Add an entry to `define_maps!` using the format above.
- Possibly add a corresponding entry to the dep-node macro.
- Link the provider by modifying the appropriate `provide` method;
or add a new one if needed and ensure that `rustc_driver` is invoking it.
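
Putting those steps together for a hypothetical `fubar` query (the
names `Fubar` and `FubarResult` are placeholders for illustration):

```rust
// 1. Declare the query in the `define_maps!` invocation in `mod.rs`
//    (and, if needed, add a `Fubar` variant to `define_dep_nodes!`):
//
//        [] fn fubar: Fubar(DefId) -> FubarResult<'tcx>,

// 2. Write a provider for it somewhere in a `rustc_*` crate:
fn fubar<'cx, 'tcx>(tcx: TyCtxt<'cx, 'tcx, 'tcx>, key: DefId) -> FubarResult<'tcx> {
    // ... compute the result for `key` ...
    unimplemented!()
}

// 3. Link the provider up in that crate's `provide` function:
pub fn provide(providers: &mut Providers) {
    *providers = Providers {
        fubar,
        ..*providers
    };
}
```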

#### Query structs and descriptions

For each kind, the `define_maps` macro will generate a "query struct"
named after the query. This struct is a kind of place-holder
describing the query. Each such struct implements the
`self::config::QueryConfig` trait, which has associated types for the
key/value of that particular query. Basically the code generated looks something
like this:

```rust
// Dummy struct representing a particular kind of query:
pub struct type_of<'tcx> { phantom: PhantomData<&'tcx ()> }

impl<'tcx> QueryConfig for type_of<'tcx> {
    type Key = DefId;
    type Value = Ty<'tcx>;
}
```

There is an additional trait that you may wish to implement called
`self::config::QueryDescription`. This trait is used during cycle
errors to give a "human readable" name for the query, so that we can
summarize what was happening when the cycle occurred. Implementing
this trait is optional if the query key is `DefId`, but if you *don't*
implement it, you get a pretty generic error ("processing `foo`...").
You can put new impls into the `config` module. They look something like this:

```rust
impl<'tcx> QueryDescription for queries::type_of<'tcx> {
    fn describe(tcx: TyCtxt, key: DefId) -> String {
        format!("computing the type of `{}`", tcx.item_path_str(key))
    }
}
```

492 changes: 492 additions & 0 deletions src/librustc/ty/maps/config.rs

Large diffs are not rendered by default.

162 changes: 162 additions & 0 deletions src/librustc/ty/maps/keys.rs
@@ -0,0 +1,162 @@
// Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! Defines the set of legal keys that can be used in queries.
use hir::def_id::{CrateNum, DefId, LOCAL_CRATE, DefIndex};
use mir::transform::{MirSuite, MirPassIndex};
use ty::{self, Ty, TyCtxt};
use ty::subst::Substs;
use ty::fast_reject::SimplifiedType;

use std::fmt::Debug;
use std::hash::Hash;
use syntax_pos::{Span, DUMMY_SP};
use syntax_pos::symbol::InternedString;

/// The `Key` trait controls what types can legally be used as the key
/// for a query.
pub trait Key: Clone + Hash + Eq + Debug {
    /// Given an instance of this key, what crate is it referring to?
    /// This is used to find the provider.
    fn map_crate(&self) -> CrateNum;

    /// In the event that a cycle occurs, if no explicit span has been
    /// given for a query with key `self`, what span should we use?
    fn default_span(&self, tcx: TyCtxt) -> Span;
}

impl<'tcx> Key for ty::InstanceDef<'tcx> {
    fn map_crate(&self) -> CrateNum {
        LOCAL_CRATE
    }

    fn default_span(&self, tcx: TyCtxt) -> Span {
        tcx.def_span(self.def_id())
    }
}

impl<'tcx> Key for ty::Instance<'tcx> {
    fn map_crate(&self) -> CrateNum {
        LOCAL_CRATE
    }

    fn default_span(&self, tcx: TyCtxt) -> Span {
        tcx.def_span(self.def_id())
    }
}

impl Key for CrateNum {
    fn map_crate(&self) -> CrateNum {
        *self
    }
    fn default_span(&self, _: TyCtxt) -> Span {
        DUMMY_SP
    }
}

impl Key for DefIndex {
    fn map_crate(&self) -> CrateNum {
        LOCAL_CRATE
    }
    fn default_span(&self, _tcx: TyCtxt) -> Span {
        DUMMY_SP
    }
}

impl Key for DefId {
    fn map_crate(&self) -> CrateNum {
        self.krate
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        tcx.def_span(*self)
    }
}

impl Key for (DefId, DefId) {
    fn map_crate(&self) -> CrateNum {
        self.0.krate
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.1.default_span(tcx)
    }
}

impl Key for (CrateNum, DefId) {
    fn map_crate(&self) -> CrateNum {
        self.0
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.1.default_span(tcx)
    }
}

impl Key for (DefId, SimplifiedType) {
    fn map_crate(&self) -> CrateNum {
        self.0.krate
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.0.default_span(tcx)
    }
}

impl<'tcx> Key for (DefId, &'tcx Substs<'tcx>) {
    fn map_crate(&self) -> CrateNum {
        self.0.krate
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.0.default_span(tcx)
    }
}

impl Key for (MirSuite, DefId) {
    fn map_crate(&self) -> CrateNum {
        self.1.map_crate()
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.1.default_span(tcx)
    }
}

impl Key for (MirSuite, MirPassIndex, DefId) {
    fn map_crate(&self) -> CrateNum {
        self.2.map_crate()
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.2.default_span(tcx)
    }
}

impl<'tcx> Key for Ty<'tcx> {
    fn map_crate(&self) -> CrateNum {
        LOCAL_CRATE
    }
    fn default_span(&self, _: TyCtxt) -> Span {
        DUMMY_SP
    }
}

impl<'tcx, T: Key> Key for ty::ParamEnvAnd<'tcx, T> {
    fn map_crate(&self) -> CrateNum {
        self.value.map_crate()
    }
    fn default_span(&self, tcx: TyCtxt) -> Span {
        self.value.default_span(tcx)
    }
}

impl Key for InternedString {
    fn map_crate(&self) -> CrateNum {
        LOCAL_CRATE
    }
    fn default_span(&self, _tcx: TyCtxt) -> Span {
        DUMMY_SP
    }
}
453 changes: 453 additions & 0 deletions src/librustc/ty/maps/mod.rs

Large diffs are not rendered by default.

494 changes: 494 additions & 0 deletions src/librustc/ty/maps/plumbing.rs

Large diffs are not rendered by default.

49 changes: 49 additions & 0 deletions src/librustc/ty/maps/values.rs
@@ -0,0 +1,49 @@
// Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

use ty::{self, Ty, TyCtxt};

use syntax::symbol::Symbol;

pub(super) trait Value<'tcx>: Sized {
    fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> Self;
}

impl<'tcx, T> Value<'tcx> for T {
    default fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> T {
        tcx.sess.abort_if_errors();
        bug!("Value::from_cycle_error called without errors");
    }
}

impl<'tcx, T: Default> Value<'tcx> for T {
    default fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> T {
        T::default()
    }
}

impl<'tcx> Value<'tcx> for Ty<'tcx> {
    fn from_cycle_error<'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>) -> Ty<'tcx> {
        tcx.types.err
    }
}

impl<'tcx> Value<'tcx> for ty::DtorckConstraint<'tcx> {
    fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> Self {
        Self::empty()
    }
}

impl<'tcx> Value<'tcx> for ty::SymbolName {
    fn from_cycle_error<'a>(_: TyCtxt<'a, 'tcx, 'tcx>) -> Self {
        ty::SymbolName { name: Symbol::intern("<error>").as_str() }
    }
}

6 changes: 6 additions & 0 deletions src/librustc_back/README.md
@@ -0,0 +1,6 @@
NB: This crate is part of the Rust compiler. For an overview of the
compiler as a whole, see
[the README.md file found in `librustc`](../librustc/README.md).

`librustc_back` contains some very low-level details that are
specific to different LLVM targets and so forth.
12 changes: 12 additions & 0 deletions src/librustc_driver/README.md
@@ -0,0 +1,12 @@
NB: This crate is part of the Rust compiler. For an overview of the
compiler as a whole, see
[the README.md file found in `librustc`](../librustc/README.md).

The `driver` crate is effectively the "main" function for the Rust
compiler. It orchestrates the compilation process and "knits together"
the code from the other crates within rustc. This crate itself does
not contain any of the "main logic" of the compiler (though it does
have some code related to pretty printing or other minor compiler
options).

Member comment (@qmx, Sep 12, 2017): This definition of driver as being the "main" function for the compiler is really nice. What about giving more emphasis to it on the top-level README file?

8 changes: 7 additions & 1 deletion src/librustc_trans/README.md
@@ -1 +1,7 @@
See [librustc/README.md](../librustc/README.md).
NB: This crate is part of the Rust compiler. For an overview of the
compiler as a whole, see
[the README.md file found in `librustc`](../librustc/README.md).

The `trans` crate contains the code to convert from MIR into LLVM IR,
and then from LLVM IR into machine code. In general it contains code
that runs towards the end of the compilation process.
48 changes: 48 additions & 0 deletions src/librustc_typeck/README.md
@@ -0,0 +1,48 @@
NB: This crate is part of the Rust compiler. For an overview of the
compiler as a whole, see
[the README.md file found in `librustc`](../librustc/README.md).

The `rustc_typeck` crate contains the source for "type collection" and
"type checking", as well as a few other bits of related functionality.
(It draws heavily on the [type inferencing][infer] and
[trait solving][traits] code found in librustc.)

[infer]: ../librustc/infer/README.md
[traits]: ../librustc/traits/README.md

## Type collection

Type "collection" is the process of converting the types found in the
HIR (`hir::Ty`), which represent the syntactic things that the user
wrote, into the **internal representation** used by the compiler
(`Ty<'tcx>`) -- we also do similar conversions for where-clauses and
other bits of the function signature.

To try and get a sense for the difference, consider this function:

```rust
struct Foo { }
fn foo(x: Foo, y: self::Foo) { .. }
//        ^^^     ^^^^^^^^^
```

Those two parameters `x` and `y` each have the same type, but they
will have distinct `hir::Ty` nodes. Those nodes will have different
spans, and of course they encode the path somewhat differently. But
once they are "collected" into `Ty<'tcx>` nodes, they will be
represented by the exact same internal type.

Collection is defined as a bundle of queries (e.g., `type_of`) for
computing information about the various functions, traits, and other
items in the crate being compiled. Note that each of these queries is
concerned with *interprocedural* things -- for example, for a function
definition, collection will figure out the type and signature of the
function, but it will not visit the *body* of the function in any way,
nor examine type annotations on local variables (that's the job of
type *checking*).

For more details, see the `collect` module.
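
For example, a sketch of asking collection for facts about the `foo`
item above (`type_of` and `generics_of` are collection queries named
elsewhere in these docs; `fn_sig` and `foo_def_id` are assumptions for
illustration):

```rust
// Sketch: collection results are demanded through queries on the tcx.
let fn_ty = tcx.type_of(foo_def_id);        // the type of `foo` as an item
let generics = tcx.generics_of(foo_def_id); // its generic parameters
let sig = tcx.fn_sig(foo_def_id);           // its signature
```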

## Type checking

TODO
59 changes: 15 additions & 44 deletions src/librustc_typeck/collect.rs
@@ -8,50 +8,21 @@
// option. This file may not be copied, modified, or distributed
// except according to those terms.

/*
# Collect phase

The collect phase of type check has the job of visiting all items,
determining their type, and writing that type into the `tcx.types`
table. Despite its name, this table does not really operate as a
*cache*, at least not for the types of items defined within the
current crate: we assume that after the collect phase, the types of
all local items will be present in the table.

Unlike most of the types that are present in Rust, the types computed
for each item are in fact type schemes. This means that they are
generic types that may have type parameters. TypeSchemes are
represented by a pair of `Generics` and `Ty`. Type
parameters themselves are represented as `ty_param()` instances.

The phasing of type conversion is somewhat complicated. There is no
clear set of phases we can enforce (e.g., converting traits first,
then types, or something like that) because the user can introduce
arbitrary interdependencies. So instead we generally convert things
lazilly and on demand, and include logic that checks for cycles.
Demand is driven by calls to `AstConv::get_item_type_scheme` or
`AstConv::trait_def`.

Currently, we "convert" types and traits in two phases (note that
conversion only affects the types of items / enum variants / methods;
it does not e.g. compute the types of individual expressions):

0. Intrinsics
1. Trait/Type definitions

Conversion itself is done by simply walking each of the items in turn
and invoking an appropriate function (e.g., `trait_def_of_item` or
`convert_item`). However, it is possible that while converting an
item, we may need to compute the *type scheme* or *trait definition*
for other items.

There are some shortcomings in this design:

- Because the item generics include defaults, cycles through type
  parameter defaults are illegal even if those defaults are never
  employed. This is not necessarily a bug.
*/
//! "Collection" is the process of determining the type and other external
//! details of each item in Rust. Collection is specifically concerned
//! with *interprocedural* things -- for example, for a function
//! definition, collection will figure out the type and signature of the
//! function, but it will not visit the *body* of the function in any way,
//! nor examine type annotations on local variables (that's the job of
//! type *checking*).
//!
//! Collecting is ultimately defined by a bundle of queries that
//! inquire after various facts about the items in the crate (e.g.,
//! `type_of`, `generics_of`, `predicates_of`, etc). See the `provide` function
//! for the full set.
//!
//! At present, however, we do run collection across all items in the
//! crate as a kind of pass. This should eventually be factored away.
use astconv::{AstConv, Bounds};
use lint;
7 changes: 7 additions & 0 deletions src/libsyntax/README.md
@@ -0,0 +1,7 @@
NB: This crate is part of the Rust compiler. For an overview of the
compiler as a whole, see
[the README.md file found in `librustc`](../librustc/README.md).

The `syntax` crate contains those things concerned purely with syntax
– that is, the AST ("abstract syntax tree"), parser, pretty-printer,
lexer, macro expander, and utilities for traversing ASTs.