Skip to content

Commit

Permalink
Make FileId non-zero
Browse files Browse the repository at this point in the history
When porting from `codespan` 0.3 Arret I needed to track the `FileId`
separately from the `Span` now that the file isn't implied in the span
itself. Having a simple struct like this works for 90% of the cases:

```rust
struct Span {
  file_id: codespan::FileId,
  codespan_span: codespan::Span,
}
```

However, this requires that every time we parse any source we require a
`FileId` which requires a `codespan::Files` to create. This is overkill
in a couple of cases:

1. Unit tests that just need to parse simple Arret strings as a
   shorthand for building our AST

2. Syntax highlighting in the REPL

The compromise in etaoins/arret#160 is to use `Option<FileId>`. I'm not
100% happy with that as the ultimate solution but it was the easiest
path to take during porting. However, this makes the size of the above
struct 13 bytes which will typically be rounded up to 16 bytes for
alignment and padding.

We can use `std::num::NonZeroU32` instead and offset all of our indexes
by 1. This allows the Rust compiler to use 0 to represent the `None`
case and bring the size down to 12 bytes. Because the struct only needs
an alignment of 4 bytes this is more likely to stay as 12 bytes in most
contexts.

A nicer solution would be if Rust had e.g. `NonMaxU32` but it appears
the Rust team instead wants to use const generics so arbitrary niche
values can be specified.
  • Loading branch information
etaoins committed Jan 4, 2020
1 parent 8d94c0c commit edd0e94
Showing 1 changed file with 21 additions and 4 deletions.
25 changes: 21 additions & 4 deletions codespan/src/file.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
#[cfg(feature = "serialization")]
use serde::{Deserialize, Serialize};
use std::num::NonZeroU32;
use std::{error, fmt};

use crate::{ByteIndex, ColumnIndex, LineIndex, LineOffset, Location, RawIndex, Span};
Expand Down Expand Up @@ -67,7 +68,23 @@ impl fmt::Display for SpanOutOfBoundsError {
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[cfg_attr(feature = "serialization", derive(Deserialize, Serialize))]
#[cfg_attr(feature = "memory_usage", derive(heapsize_derive::HeapSizeOf))]
pub struct FileId(u32);
pub struct FileId(NonZeroU32);

impl FileId {
/// Offset of our `FileId`'s numeric value to an index on `Files::files`.
///
/// This is to ensure the first `FileId` is non-zero for memory layout optimisations (e.g.
/// `Option<FileId>` is 4 bytes)
const OFFSET: u32 = 1;

fn new(index: usize) -> FileId {
FileId(NonZeroU32::new(index as u32 + Self::OFFSET).unwrap())
}

fn get(self) -> usize {
(self.0.get() - Self::OFFSET) as usize
}
}

/// A database of source files.
#[derive(Clone, Debug, Default)]
Expand All @@ -84,7 +101,7 @@ impl Files {
/// Add a file to the database, returning the handle that can be used to
/// refer to it again.
pub fn add(&mut self, name: impl Into<String>, source: impl Into<String>) -> FileId {
let file_id = FileId(self.files.len() as u32);
let file_id = FileId::new(self.files.len());
self.files.push(File::new(name.into(), source.into()));
file_id
}
Expand All @@ -100,13 +117,13 @@ impl Files {
/// Get a the source file using the file id.
// FIXME: return an option or result?
fn get(&self, file_id: FileId) -> &File {
&self.files[file_id.0 as usize]
&self.files[file_id.get()]
}

/// Get a the source file using the file id.
// FIXME: return an option or result?
fn get_mut(&mut self, file_id: FileId) -> &mut File {
&mut self.files[file_id.0 as usize]
&mut self.files[file_id.get()]
}

/// Get the name of the source file.
Expand Down

0 comments on commit edd0e94

Please sign in to comment.