Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(GH-824)(GH-812) Improve code folding speed #825

Merged
merged 5 commits into from
Jan 15, 2019

Conversation

glennsarti
Copy link
Contributor

Previously there were no tests for PowerShell classes. This commit adds a simple
test for this scenario to ensure future changes do not break folding.

Previously the folding provider created many intermediate arrays and lists and
required post-processing. This commit changes the behaviour to use an
accumlator patter with an extended Dictionary class. This new class adds a
SafeAdd method to add FoldingRanges, which then has the logic to determine if
the range should indeed be added, for example, passing nulls or pre-existing
larger ranges.

By passing around this list using ByReference we can avoid creating many objects
which are just then thrown away.

Previously each token type detection was separated into discrete blocks
to make reading the code easier. However this meant there were many
enumerations of the Tokens array as well as passing around intermediate
arrays/lists. This commit takes the individual methods and overlaps them
to reduce the number of enumerations and regular expression matching to
a minimum.

Previously the code folding was not tested against DSC configuration scripts.
This commit adds tests for a sample DSC script to ensure the folding occurs at the correct
places

Fixes #824
Fixes #812

@glennsarti glennsarti requested a review from rjmholt as a code owner December 14, 2018 13:59
@glennsarti
Copy link
Contributor Author

@TylerLeonhardt @rjmholt These are the code folding improvements from the AST work I did, but now are Token processing only.

Managed to shave nearly 50% runtime and probably reduced objects allocated significantly.

Raw data;
Given a script with 4096 foldable regions;

Before

Duration: 230.5017  <--- Ignored
Duration: 32.9991
Duration: 40.5561
Duration: 35.4358
Duration: 38.018
Duration: 36.9978
Duration: 34.0487
Duration: 31.9936
Duration: 34.9428
Duration: 41.0429
Duration: 36.0661
Duration: 34.0165
Duration: 37.0365
Duration: 34.026
Duration: 33.9919
Duration: 36.0026
Duration: 34.585
Duration: 39.5611
Duration: 34.5185
Duration: 34.5172
Avg Duration: 35.8082210526316

After

Duration: 132.0037  <--- Ignored
Duration: 15.9972
Duration: 15.963
Duration: 22.0686
Duration: 15.9767
Duration: 14.9559
Duration: 30.5756
Duration: 18.0449
Duration: 18.9995
Duration: 32.9537
Duration: 16.0499
Duration: 17.0227
Duration: 20.4783
Duration: 16.9985
Duration: 14.0394
Duration: 21.9654
Duration: 17.5853
Duration: 14.5099
Duration: 15.039
Duration: 26.9451
Avg Duration: 19.2720315789474

Copy link
Member

@TylerLeonhardt TylerLeonhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good - and those perf numbers 👌👌👌 awesome. Just had a couple comments but this is looking good.

@glennsarti
Copy link
Contributor Author

@TylerLeonhardt Comments addressed

Copy link
Member

@TylerLeonhardt TylerLeonhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - the perf numbers are exciting 😃

@glennsarti
Copy link
Contributor Author

@TylerLeonhardt Tag you're it. Typo fixed.

@TylerLeonhardt
Copy link
Member

I'm good. Let's wait on 1 more 👍

foreach (FoldingReference fold in TokenOperations.FoldableRegions(
editorSession.Workspace.GetFile(documentUri).ScriptTokens,
this.currentSettings.CodeFolding.ShowLastLine))
script.ScriptTokens).ToArray())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you assign TokenOperations.FoldableRegions(script.ScriptTokens).ToArray() to an intermediate variable (foldableRegions?) for better debuggability and so this foreach doesn't have to wrap to a second line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW is it really necessary to create/copy a new array here - which I see the ToArray method below is doing? The Dictionary.Values property will return an IEnumerable<FoldingReference> which is all you need to iterate (foreach) over it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are serialization errors when passing IEnumberable<FoldingReference> because the LSP needs FoldingRange[] (

)

And by errors I mean the serialization does not conform to the LSP specification. Which is why I do an object conversion. I tried to keep the number of array creations/copies to a minimum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you assign TokenOperations.FoldableRegions(script.ScriptTokens).ToArray() to an intermediate variable (foldableRegions?) for better debuggability and so this foreach doesn't have to wrap to a second line?

I can. I was trying to avoid creating array copies.

Copy link
Contributor

@rkeithhill rkeithhill Dec 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using intermediates variables won't cause array copies.

There are serialization errors when passing IEnumberable

In this case (foreach case) you aren't really using the array but each element in the array. Therefore IEnumerable<FoldingReference> as the target of the foreach will yield individual FoldingReference objects that the foreach loop operates on. Now, you may use ToArray() later - when sending results back to the LSP client. In that case, sure, convert it to an array. But here, in this foreach loop, I still don't think it is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Went with
foreach (KeyValuePair<int, FoldingReference> entry in TokenOperations.FoldableRegions(script.ScriptTokens))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When moving from inheritance to composition this has now changed to;
foreach (FoldingReference fold in TokenOperations.FoldableReferences(script.ScriptTokens).References)

public void SafeAdd(FoldingReference item)
{
if (item == null) { return; }
FoldingReference currentItem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insert blank line here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW you can see these various recommendations (rules) here SA1505 - SA1517. At some point, I'd really like to add this analyzer pkg to PSES and turn on these style rules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I've been nagging @TylerLeonhardt and @rjmholt on slack about that. 😉 I'd love to see some linting tools, at the very least, I can run locally (and run in CI)

@glennsarti
Copy link
Contributor Author

@rkeithhill Comments addressed.

Copy link
Contributor

@rkeithhill rkeithhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for making those changes!

/// A class that holds a list of FoldingReferences and ensures that when adding a reference that the
/// folding rules are obeyed, e.g. Only one fold per start line
/// </summary>
public class FoldingReferenceList : Dictionary<int, FoldingReference>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this doesn't require the all the dictionary methods, I would opt for composition over inheritance, so that no methods are exposed that would allow an invalid state in the list.

Copy link
Contributor Author

@glennsarti glennsarti Dec 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Also took the opportunity to rename TokenOperations.FoldableRegions to TokenOperations.FoldableReferences as it returns a FoldingReferenceList type.

{
if (currentItem.CompareTo(item) == 1) { this[item.StartLine] = item; }
}
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use return rather than else here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns void so the return doesn't seem to be needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think he meant:

if (references.TryGetValue(item.StartLine, out FoldingReference currentItem))
{
    if (currentItem.CompareTo(item) == 1) { references[item.StartLine] = item; }
    return;
}

references[item.StartLine] = item;

tokenCommentRegionStack.Push(token);
continue;
}
else if (s_endRegionTextRegex.IsMatch(token.Text))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for else given continue above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@glennsarti glennsarti force-pushed the make-folding-bettere branch from 0983866 to a3f4495 Compare January 7, 2019 07:52
@glennsarti
Copy link
Contributor Author

Rebased due to conflict

@glennsarti glennsarti force-pushed the make-folding-bettere branch from a3f4495 to 5899940 Compare January 7, 2019 08:03
@TylerLeonhardt
Copy link
Member

@glennsarti can you deal with the last conflict?

@rjmholt can you look at this when you can? I know you're on your way back to Seattle so no rush

@glennsarti
Copy link
Contributor Author

Stop merging other stuff :-)

Previously there were no tests for PowerShell classes.  This commit adds a simple
test for this scenario to ensure future changes do not break folding.
@glennsarti glennsarti force-pushed the make-folding-bettere branch from 5899940 to fd89b7a Compare January 8, 2019 05:29
@glennsarti
Copy link
Contributor Author

Rebased, waiting on CI

@rjmholt rjmholt self-assigned this Jan 10, 2019
@TylerLeonhardt
Copy link
Member

ping @glennsarti about CI

…o it's own class

Previously the folding provider created many intermediate arrays and lists and
required post-processing.  This commit changes the behaviour to use an
accumlator patter with an extended Dictionary class.  This new class adds a
`SafeAdd` method to add FoldingRanges, which then has the logic to determine if
the range should indeed be added, for example, passing nulls or pre-existing
larger ranges.

By passing around this list using ByReference we can avoid creating many objects
which are just then thrown away.

This commit also moves the ShowLastLine code from the FoldingProvider into the
Language Server.  This reduces the number of array enumerations to one.
@glennsarti glennsarti force-pushed the make-folding-bettere branch from fd89b7a to 0da7198 Compare January 12, 2019 07:34
Previously each token type detection was separated into discrete blocks
to make reading the code easier. However this meant there were many
enumerations of the Tokens array as well as passing around intermediate
arrays/lists.  This commit takes the individual methods and overlaps them
to reduce the number of enumerations and regular expression matching to
a minimum.

Note that there are considerable code comments here due to the code now
being more complex on initial review.
Previously the code folding was not tested against DSC configuration scripts.
This commit adds tests for a sample DSC script to ensure the folding occurs at the correct
places
Previously the folder would search for the region markers without case sensitivity.
This commit modifies the regular expressions to be more strict on the what
is a region marker and adds a negative test to ensure that regions that are not
cased correctly are not folded .
@glennsarti glennsarti force-pushed the make-folding-bettere branch from 0da7198 to 4ce90d9 Compare January 12, 2019 07:50
@glennsarti
Copy link
Contributor Author

@TylerLeonhardt Yeah whoops, broke after the rebase. Also fixed up some codacy violations.

@TylerLeonhardt TylerLeonhardt merged commit 81b3cee into PowerShell:master Jan 15, 2019
@glennsarti glennsarti deleted the make-folding-bettere branch January 21, 2019 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants