Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove re-analysis check for non-scripts in VS - Fixed a cross-project reference regression #11228

Merged
merged 37 commits into from
Mar 24, 2021

Conversation

TIHan
Copy link
Contributor

@TIHan TIHan commented Mar 12, 2021

There is a piece of code that reacts to our background compiler's file checked changes to tell Roslyn to re-analyze a file. This bit of code can changed to only do the behavior on script files as Roslyn does the right thing in re-evaluating files whose dependents have changed.

Also, in our current dev16.10 preview, we have a regression where cross-project referencing only "sometimes" work in reacting to changes on dependent projects. This is most likely caused by the changes we have made in removing the reliance on the reactor thread in FCS. This PR fixes this to an extent. There remains one problem where you modify a file and then wait a few seconds before saving, other files won't react to the changes when the file is saved; when a file is saved in Roslyn workspaces, an event does not occur - they only occur when you change the file/source text in the editor itself, or if it changed on-disk outside of VS. When we start removing the dependence on the on-disk file system in FCS, we will be able to fully fix this issue.

The fix for part of the regression is removing the ability to get stale data in what otherwise looks like an up-to-date call. Instead, it will either get the data if the data is still valid by the timestamps on disk, or we use the cacheStamp.

@TIHan
Copy link
Contributor Author

TIHan commented Mar 12, 2021

@dsyme - will need you to look at this because this is based on the conversation we had regarding the explicit re-analysis call.

@TIHan TIHan requested a review from dsyme March 12, 2021 21:44
@cartermp
Copy link
Contributor

cartermp commented Mar 17, 2021

@dsyme I think we need this in. Current dogfood builds of 16.10 constantly peg the CPU. In my case, my CPU is constantly at 15% or higher CPU even if I'm doing nothing and I've only had one file open.

I inspected a trace and found that there is just an insane amount of re-typechecking going on. Something has clearly regressed there:

image

That's over 17 thousand calls to TcExpr and its children in 39 seconds after leaving a single file in FSharp.Editor open. Clearly something is horribly wrong.

@TIHan's changes fix the issue for me and critically, in-memory cross-project references and all associated tooling all appears to still work.

@@ -444,18 +444,17 @@ type BackgroundCompiler(legacyReferenceResolver, projectCacheSize, keepAssemblyC
}

member _.GetCachedCheckFileResult(builder: IncrementalBuilder, filename, sourceText: ISourceText, options) =
// This is to check if the check results are still valid by their timestamp.
// A return of None means the file needs to be re-evaluated as a dependent has changed; therefore, do not get the cached result.
match builder.TryGetCheckResultsForFileInProject filename with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will and I talked this through - this is basically sound and represents a fix, but two problems:

  1. Better if this is just a fix to the existing AreCheckResultsBeforeFileInProjectReady - we only care about the BeforeFileInProject state - this will be a marginal perf improvement hitting the cache more often.

  2. This requires timestamp checks on potentially 1000s of files in large solutions. Currently we do this when caching fails, but never in the process of looking up the cache (which happens very very often, even for basic editing like quick info, when the solution is not changing at all).

For (2) an alternative is to pass in the "GetDependentVersion" from the Roslyn project to act as a first quick positive test for the validity of the cache:

            let cacheEntry = lookup cache
             if (project1.GetDependentVersion == cacheEntry.SavedProjectDependentVersion) 
            THEN use cacheEntry
            ELSE 
               IF AreCheckResultsBeforeFileInProjectReady... &&  sourceText.GetHashCode() = cacheEntry.SavedHashCode THEN use cacheEntry
               ELSE
                   recompute

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsyme What would the alternative look like for non-Roslyn scenarios? Can this check be a parameter lambda passed to the checker?

Copy link
Member

@auduchinok auduchinok Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsyme I've checked and we can compute file/assembly dependencies timestamp for a project efficiently. @TIHan would it be possible to have a parameter with type like FSharpProjectOptions -> DateTime?

Copy link
Contributor

@dsyme dsyme Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the request is here. You can pass in a hash of the state of the world using the changes in this PR if you wish. If not you will get the old behaviour

So I think what's in is ok. We don't want to complicate things with extra configurable function paramaters

@dsyme
Copy link
Contributor

dsyme commented Mar 17, 2021

Will and I had a look

  1. Testing: We've had many problems in this area in the past and need a solid manual test matrix that we can each replicate
  2. Scripts: Likely problems with script files (see below)
  3. Perf: Likely problem with too many file stamp time checks when hitting the CheckFileInProject cache, see this comment

Testing

We need a very careful manual test matrix for this (ideally documented under tests/walkthroughs)

  1. script files (with #load and #r - editing, creating, deleting these)

    • open script file with #load "b.fs", wait for no errors, make change in b.fs with external tool, see if errors appear when switching back to a.fs. Similarly other changes like deleting b.fs (causing errors), creating b.fs (clearing errors)
  2. in-project files

    • external change to file in project
    • external change to file in referenced project (with in-memory cross-project references)
    • external build/create/delete of DLLs for referenced projects (without in-memory cross-project references)
    • external build/create/delete of directly referenced DLLs (with/without in-memory cross-project references)
    • changes to signature files
    • changes to implementation files

Scripts and Reanalyze() triggers

  • A Reanalyze call (either by Roslyn or us) is needed when the "before file" logical state of a file changes

  • Our understanding is that Roslyn correctly makes these calls for all in-project files. Our test matrix needs to confirm this.

    • Roslyn has a "SolutionCrawler"

    • Our understanding is that Roslyn keeps file-watchers for metadata references and files-in-solution for actual projects. We believe these are adequate to trigger reanalysis events within projects

  • Our understanding is that Roslyn doesn't make these calls for scripts. How can it? The script is in the "Misc files project" that knows nothing about dependencies.

  • Typical Scenario: Script references DLLs built by a project. Has red squigglies. Start build of solution from command line.
    Go back to script. Expect red squigglies to disappear (because a Reanalyze happens)

  • Some ideas for how to arrange for these events to be triggered ourself:

    1. The current solution is the BeforeFileChecked event being removed here. This relies on something happening in the script (e.g. quick info)
      to trigger ImplicitBackgroundProjectBuild work, which then does the timestamp check, raises the event and
      the reanalysis happens.

    2. A background async loop in FSharp.Editor of our own that continually recomputes the scriptProject.BeforeFileStateToken() and if
      it has changed then Reanalyze; OR

    3. Somehow set up a Roslyn miscellaneous project that accurately represents the dependencies of the script project
      Then Roslyn would schedule the Reanalyze for us when the FileNotify on the DLL changes.

    4. On each user action (focus on document, quick info etc) spawn a single background async check as in (2)

Will will probably go for option (1).

@dsyme
Copy link
Contributor

dsyme commented Mar 17, 2021

@cartermp Will and I discussed the perf things you've been seeing

  1. Opening a file in FSharp.Editor will result in a typecheck of the entire compiler. This could easily be 17K calls. We're not too worried by this if it was a perf measurement from startup (first open of file) or similar

  2. We're most concerned about the case where you saw ongoing CPU usage even after leaving VS to acquiesce.

  3. Could you write us a step-by-step repro please?

@cartermp
Copy link
Contributor

The repro on my machine is:

  1. Open VisualFSharp.sln
  2. Open DocumentDiagnosticAnalyzer.fs
  3. Wait for colors
  4. Keep waiting

I consistently observe 15% CPU at all times with this repro and latest dogfood build of 16.10. If I use 16.9 or if I have @TIHan's fix, I do not observe the constant CPU usage.

@dsyme
Copy link
Contributor

dsyme commented Mar 17, 2021

I consistently observe 15% CPU at all times with this repro and latest dogfood build of 16.10. If I use 16.9 or if I have @TIHan's fix, I do not observe the constant CPU usage.

OK cool thanks. Odd as nothing in this change should specifically fix this AFAICS....

Is that dogfood the internal preview feed?

@cartermp
Copy link
Contributor

Yep, internal feed.

…Editor to only call the FSharpChecker extensions for checking a file.
@TIHan TIHan changed the title Remove re-analysis check in VS - Fixed a cross-project reference regression Remove re-analysis check for non-scripts in VS - Fixed a cross-project reference regression Mar 17, 2021
@cartermp
Copy link
Contributor

I think that we should also check against issues like this: #6646

To make sure we haven't regressed.

let cacheStamp = defaultArg cacheStamp (sourceText.GetHashCode() |> int64)

// Check the cache. We can only use cached results when there is no work to do to bring the background builder up-to-date
let cachedResults = parseCacheLock.AcquireLock (fun ltok -> checkFileInProjectCache.TryGet(ltok, (filename, cacheStamp, options)))
Copy link
Member

@auduchinok auduchinok Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TIHan This check doesn't work properly for me: changes in prior files don't make subsequent files get reanalyzed when cacheStamp is not used. Consider the following repro steps for project with two files:

  • Parse and check File2.fs (File1.fs is checked as a dependency, results for File1 and File2 get cached)
  • Change File1.fs and parse and check it
  • Parse and check File2.fs again, it'll have the same stamp (since its source hasn't changed), so it will use the old results

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AreCheckResultsBeforeFileInProjectReady should be returning false in your case, perhaps I'll force clear the cache since it's technically invalid at that point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we able to use CheckFileInProjectAllowingStaleCachedResults for invalid results? It might be OK for some features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fair, but I don't know if FCS should dictate when to use stale cached results as each editor will be different.

Copy link
Contributor Author

@TIHan TIHan Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckFileInProjectAllowingStaleCachedResults looks like it's considered Obsolete and encourages us to use CheckFileInProject. I think we should consider not making it Obsolete and just have a separate call to get cached stale results.

@TIHan TIHan closed this Mar 22, 2021
@TIHan TIHan reopened this Mar 22, 2021
// If a cache stamp is not provided, we need to do a full-up-to-date check on the timestamps for the dependencies.
let recheckDeps = cacheStamp.IsNone
let cacheStamp = defaultArg cacheStamp (sourceText.GetHashCode() |> int64)
if recheckDeps && not (builder.AreCheckResultsBeforeFileInProjectReady filename) then
Copy link
Contributor

@dsyme dsyme Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't yet convince me. If cacheStamp.IsNone I now believe we should basically do as we did before.

  1. We must avoid calling AreCheckResultsBeforeFileInProjectReady before the lookup since it is an expensive operation.

  2. We should look up the cache using sourceText.GetHashCode() and re-check validity of the entry after a successful lookup, just as we did before. If invalid we can remove the entry

  3. The previous code's use of GetCheckResultsBeforeFileInProjectEvenIfStale used tcPrior.TimeStamp = priorTimeStamp as an extra way of avoiding a AreCheckResultsBeforeFileInProjectReady call. That should stay in, it is a correct optimization. If the stamps don't match then there's no chance AreCheckResultsBeforeFileInProjectReady will succeed.

That is, for the case where cacheStamp.IsNone I'm now convinced the previous code was both correct and efficient (given that AreCheckResultsBeforeFileInProjectReady is fixed as part of this PR) - so the basic structure of what was there should stay in place. Adding the cache removal makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Now that AreCheckResultsBeforeFileInProjectReady is fixed, we should keep the previous behavior when the cacheStamp is None.

@TIHan
Copy link
Contributor Author

TIHan commented Mar 24, 2021

This is ready.

@dsyme dsyme merged commit 36b286b into dotnet:main Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants