Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of hard tab indentation. #290

Merged
merged 5 commits into from
Sep 20, 2015

Conversation

SiegeLord
Copy link
Contributor

This is a draft, as perhaps not all implementation choices I made are ideal. I also wasn't sure what to do with tests.

The general idea of this change is to switch from using a plain usize to track indentation to a special Indent type which tracks block indentation and alignment (for visual alignment) separately. Adding the hard tab mode on top of that is then trivial. This increased type safety should prevent bugs where width is passed instead of offset and vice versa (e.g. see a possible bug in src/items.rs:61-62).

I've tested the patch and it seems to work fine in most cases. It doesn't interact well with visual indentation for function arguments + closures, but I don't think there's expectation that every possible indentation configuration option should result in a sensible style.

@nrc
Copy link
Member

nrc commented Sep 8, 2015

This comment might be useful re tests - https://github.com/nrc/rustfmt/pull/260#issuecomment-138117901

@nrc
Copy link
Member

nrc commented Sep 8, 2015

Could you explain what you mean by 'hard tab' please? I'm not familiar with the term

@nrc
Copy link
Member

nrc commented Sep 8, 2015

I think the implementation strategy is Ok, but it is hard to tell without knowing what exactly you are trying to accomplish. To some extent we already separate the indent for blocks from the indent for overlfows (see the two fields in RewriteContext) and it seems like that should be unified with your work, though I'm not sure exactly how. Abstracting over the indent seems like a good idea in general.

@SiegeLord
Copy link
Contributor Author

hard tabs mean using the tab character (\t) while soft tabs mean using a set number of spaces (e.g. 4 as per Rust style guide). I don't actually know where this terminology comes from, hah.

@fbergroth
Copy link

Should rustfmt have options that goes against the style guide?

Use 4 spaces for indentation, not tabs.

@SiegeLord
Copy link
Contributor Author

Should rustfmt have options that goes against the style guide?

To the extent that rustfmt already has options that enable going against the style guide (e.g. you can set your soft tab size to 2 spaces if you like), one more option doesn't seem like it is going to break things. This option in particular is pretty easy to implement (most of the changes in this PR is a refactoring to make it possible and are useful for other reasons as I've explained, the bit that changes spaces to tabs is about 10 lines long).

@nrc
Copy link
Member

nrc commented Sep 10, 2015

Yeah, I'm happy to have options which are not allowed by the style guide, that might change in the long run, but it's fine for the foreseeable future.

@nrc
Copy link
Member

nrc commented Sep 10, 2015

@SiegeLord so, I think this (supporting hard tabs) would be a good thing to add to rustfmt, but I'd like to separate some of the things going on in this PR. As I understand it, the Indent refactoring is not essential for supporting hard tabs, but it makes it easier because you only have to reason about it in one place, rather than in every place we use an indent. However, (and I think this might only show up when you rebase) we already have a notion of overflow vs block indent, which I think is the same notion as your alignment vs block. So, I think a rebase would clear up how those things interact.

An easy way to make this work might be to just modify the make_indent function. If that works I'd like to do that and add the hard tabs config option, and then look at the Indent refactoring separately. In particular, I'm not sure how many places use just a block indent, if that's not many places, then using an Indent probably makes sense. Do you think modifying make_indent as a first attempt will work? Does this seem like a good way forward?

@SiegeLord
Copy link
Contributor Author

There's not enough information in make_indent itself to indent things correctly, e.g. consider this:

mod test {
    fn test(a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32,
            a: i32) {
    }
}

The indents before each argument with this PR are a single hard-tab character (because we're in a mod) followed by 8 spaces (because we're using visual function indent style). If I implemented make_indent by just replacing all the spaces by indent / tab_spaces then that'd not be formatted correctly (at least, that's the only thing I can imagine doing inside make_indent if not tracking the two indent styles separately).

I noticed the overflow_indent when I was implementing this and it seemed like the same thing, but it's only used when formatting let assignments, as far as I can tell, and in that place it's actually used with context.config.tab_spaces which means that it actually is a tab indent and not a visual indent (probably a bug?).

Basically, the implementation strategy I went by was whenever I saw tab_spaces used I interpreted that as a block_indent, and for everything else I used alignment. This is consistent with the visual vs tabbed indent styles as I understood them.

If the overflow_context was used in all places where there was an overflow, and it was used consistently, then yes, that would obviate the need for the Indent struct, I think. The one way I can imagine this is that you'd change the signature of Rewrite::rewrite to not contain a separate indent argument, but instead use the two indent fields inside RewriteContext. I suppose I could do that instead... it'd be a bit of work :P.

@nrc
Copy link
Member

nrc commented Sep 13, 2015

In the example you give, I'm not 100% clear what you mean by incorrect formatting - are you saying that the arg lines would be started with 3 tab spaces, and you'd expect 2?

AIUI, there are only problems where we are using visual indenting. From your earlier comment it sounds like you don't expect hard tabs and visual indenting to work together. If using hard tabs overrode any visual indent options (i.e., hard tabs => all indents are block indents), would it then be possible to just modify make_indent? Is this an acceptable outcome, or do you expect some visual indenting when working with hard tabs (sorry, I've never used hard tabs, so I'm not really sure what is expected).

Hmm, so it seems that Indent::Alignment and overflow_indent are doing different things. I think your Alignment is strictly used for visual indent, whereas overflow_indent is for the indent used for multiple line expressions (which may be visual or block sized), as opposed to the indent due to blocks. Depending on which options are set overflow_indent could be either a multiple of tab_spaces, or not.

'overflow_indent' is a pretty new concept and it is not widely used. It should be used in more places, I think.

So, it sounds like the two concepts are not easily unifiable. I must admit that having these two similar, but different concepts is unappealing. On the other hand, I do want to be able to support hard tabs, and your implementation seems like a good way to do so.

@marcusklaas, @cassiersg do you have an opinion about any of this?

@SiegeLord
Copy link
Contributor Author

Hard tabs and visual alignment in principle can work everywhere except in an expression context with blocks (e.g. closures). E.g. in the above example, I think the hard tab + visual alignment should produce this (I use ---> to signify tab character, and . to signify a space):

mod c {
--->fn foo(a: i32,
--->.......a: i32) {}
}

That's perfectly reasonable, and something I think I'd personally use.

For hard tab + Block style, it'd look like this (I think it's reasonable as well):

mod c {
--->fn foo(
--->--->a: i32,
--->--->a: i32
--->) {}
}

Actually, the only place I notice it breaking down is with list formatting (I erroneously thought it was with all expression formating). E.g. with spaces this is what you typically get with function calls + closures:

fn foo() {
....foobar(a,
...........|abc| {
...............return 0;
...........});
}

With hard tabs the current algorithm produces this:

fn foo() {
--->foobar(a,
--->.......|abc| {
--->--->.......return 0;
--->.......});
}

which is completely ridiculous looking (but in my opinion the only correct way to do it). It occurs to me that this isn't actually configurable right now, so you always get a visual alignment with vertical lists... but in principle if there was a tabbed variant of this, it'd look just fine with hard tabs as well as spaces.

@marcusklaas
Copy link
Contributor

The tabs for block indentation, spaces for alignment is a very valid style in my view. In some ways, it is actually ideal. It enables one to change the tab-size in editor without breaking formatting, not counting going over the column limit. It's just a very hard style to do by hand, so if rustfmt can help out here, it'd be sweet.

My thoughts are mostly with the implementation side of things. We should keep the mental model as clean as simple as possible. The distinction between block indentation and visual indentation is very natural and one we should definitely keep, if not expand. We sometimes distinguish between the two by keeping a block_indent in RewriteContext and FmtVisitor, but often times disregard the difference. We should, however, make this distinction in exactly one place. Ideally, we'd just keep track of a single Ident, without having an additional block_indent and overflow_indent in our contexts. I don't know if that's possible, but we should think about this thoroughly and pick the most natural fit.

@cassiersg
Copy link
Contributor

I agree with @marcusklaas for tabs indentation.

On the implementation side, I think we ideal would be to have only Indent. The only drawback I see is a loss of semantic information: block indentation and overflow indentation (eg. line broken after let x =) would be mixed in a single variable. I think it wouldn't cause any problem currently, but it may if we decide one day (in some extreme width shortage) to break the line and go back to block indentation (but this would be poor style).
Anyway, if we need that, we could add a field in Indent.

@SiegeLord
Copy link
Contributor Author

So is there something I need to change then, or is it all good and I can spend some time writing tests?

@nrc
Copy link
Member

nrc commented Sep 17, 2015

@SiegeLord sorry, I've been really slow on this, I must admit that that is due in part to some uncertainty about it, but that's a poor excuse.

My only question is about using Indent for the overflow indent - it seems weird that we might have tabs and spaces for both the block and overflow indent, but I guess it doesn't cause errors?

And a concrete comment: could you add Indent::empty (or no_indent or something) so you don't need Indent::new(0, 0)?

Otherwise, I think this looks good. I'd like to see a bunch of tests, otherwise I think it is ready to land.

@SiegeLord SiegeLord force-pushed the tabs branch 2 times, most recently from ebc645c to 4916264 Compare September 19, 2015 17:15
@SiegeLord
Copy link
Contributor Author

My only question is about using Indent for the overflow indent - it seems weird that we might have tabs and spaces for both the block and overflow indent, but I guess it doesn't cause errors?

The only usage of overflow indent today actually uses config.tab_spaces for it's indent width, which implies to me that it should be formatted using tabs. I imagine future uses of it might want it to be all spaces. Either way, it won't cause errors.

I've now rebased this change and made a few changes on top of it:

  • I've removed make_indent and called Indent::to_string directly, as it seems more clear that way.
  • I've added Indent::none() as was suggested.
  • I've modified visit_expr to use the visitor's block_indent rather than inferring the indent from the buffer's offset. This seemed not to affect tests, but made the hard tab output a lot better.
  • I've added a test file. I wasn't sure how crazy you wanted me to go with testing the corner cases, so I just made some educated guesses of what are some interesting indentation situations from other tests and plopped them into a single file. They were good enough to point out to me the issue with visit_expr so they weren't completely pointless.

@cassiersg
Copy link
Contributor

Needs a rebase to get tests working (see #332).

@@ -20,15 +20,15 @@ impl<'a> FmtVisitor<'a> {
self.format_missing_inner(end, |this, last_snippet, _| this.buffer.push_str(last_snippet))
}

pub fn format_missing_with_indent(&mut self, end: BytePos) {
pub fn format_missing_with_indent(&mut self, end: BytePos, config: &Config) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should already have access to the configuration through self.config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks! Can't believe I did that.

@marcusklaas
Copy link
Contributor

This looks super good! Final question from me: could we easily merge the block_indent from RewriteContext into the Indent we pass to instances of Rewrite?

@SiegeLord
Copy link
Contributor Author

By merge do you mean remove it from RewriteContext entirely and propagate it via the 'offset' argument instead? I can try it.

@marcusklaas
Copy link
Contributor

Yes, that'd be nice! But if it's not easily done, we can just merge this as-is, I think.

@SiegeLord
Copy link
Contributor Author

So I looked into this, and it turned out to be a bit more complicated than I initially thought it might be. While it's easy to make something compile, there are many spots in the code where the context's block_indent goes out of sync from the offset parameter. E.g. consider formatting this:

match x {
    Some(x) if x > 0 ||
            x < 0 || { // offset contains one block indent and some alignment spaces, block_indent is just the block indent
        x; // only block_indent has the information to align this
    }
}

Sure, you can drop the alignment when you indent the contents of that block, but then that breaks some other pieces of code:

aaaaaaaaaaaaaaaa.map(|x| {
                        x; // now x needs to be aligned with both tabs and spaces
                    })

I'm sure it can all be fixed, but I think it'd take the greater part of the day, I think.

marcusklaas added a commit that referenced this pull request Sep 20, 2015
Initial implementation of hard tab indentation.
@marcusklaas marcusklaas merged commit ce2c4f6 into rust-lang:master Sep 20, 2015
@marcusklaas
Copy link
Contributor

Thanks mega for this, @SiegeLord!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants