Skip to content
This repository has been archived by the owner on Apr 14, 2022. It is now read-only.

Allows to use .pyi file annotations supplement to .py files in analysis core #331

Closed

Conversation

lostmsu
Copy link
Contributor

@lostmsu lostmsu commented Nov 1, 2018

PythonAnalyzer.AddModuleAnnotations allows to add a special kind of module, that is external type annotation. In this change, internally those are tracked as ordinary modules with a special name suffix __pyi__ (not .__pyi__! That is required for relative imports to work).

These modules are analyzed just like any other module.

For every normal module DDG creates a reference to corresponding .pyi, if one exists, to create a dependency.

Now that both are analyzed, FunctionAnalysisUnit needs to find its counterpart from supplemental annotations. For this purpose, AnalysisUnit implements GetExternalAnnotationAnalysisUnit. For module, it looks for the annotations module in ProjectState. For class and function, unit simply descends into the scope by class/function name.

Once FunctionAnalysisUnit has it annotations counterpart, it can read analyzed annotations from it, if the local annotations are missing.

This change only applies to the core analysis engine. Language server will need to be updated separately to take advantage of it.

@lostmsu
Copy link
Contributor Author

lostmsu commented Nov 1, 2018

Should resolve #296.

@lostmsu
Copy link
Contributor Author

lostmsu commented Nov 1, 2018

This change only works for scenarios, where the annotations module does not have any internal references.

From the analysis standpoint, class A defined in .py file, and then annotated in .pyi files are two different classes, so if some function foo in A is annotated to return type A in .pyi, then analysis will not equate it to returning A from .py.

if (!ProjectEntry.ProjectState.Modules.TryImport(analysisModuleName, out var analysisModuleReference))
return null;

return analysisModuleReference.AnalysisModule?.AnalysisUnit;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imported non-user files only get through AST analysis, do you actually get analysis unit here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand this comment.

PythonAnalyzer.AddModuleAnnotations adds annotations module to its Modules section. Just like for normal modules, you'd have to call .Analyze on the IPythonProjectEntry that is returned by AddModuleAnnotations.

Is there some other kind of analysis, that I am missing?

}

moduleName += AnnotationsModuleSuffix;
var entry = new ProjectEntry(this, moduleName, filePath, documentUri, cookie);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the body of the method is similar to AddModule:

public IPythonProjectEntry AddModuleAnnotations(string moduleName, string filePath, Uri documentUri = null, IAnalysisCookie cookie = null) {
    Check.ArgumentNotNull(nameof(moduleName), moduleName);
    return AddModule(moduleName + AnnotationsModuleSuffix, filePath, documentUri, cookie);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

/// <summary>
/// Returns the only element of sequence, or the default value of <typeparamref name="T"/>, if sequence has &lt;&gt; 1 elements.
/// </summary>
public static T OnlyOneOrDefault<T>(this IEnumerable<T> source) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is Enumerable.SingleOrDefault<T> method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has a different semantics. SingleOrDefault throws if there's more than 1 element. This one returns default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning here is one would use OnlyOneOrDefault when they want to make a decision based on unambiguous analysis result, if there is no expectation of lack of ambiguity.

@@ -46,7 +46,7 @@ internal sealed class ProjectEntry : IPythonProjectEntry, IAggregateableProjectE
private readonly HashSet<AggregateProjectEntry> _aggregates = new HashSet<AggregateProjectEntry>();

private TaskCompletionSource<IModuleAnalysis> _analysisTcs = new TaskCompletionSource<IModuleAnalysis>();
private AnalysisUnit _unit;
internal AnalysisUnit _unit;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it to be internal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in ModuleInfo.AnalysisUnit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With some exceptions, fields should be private, they should be wrapped with properties or access methods. But in this case, it is not safe to provide AnalysisUnit from ProjectEntry for public access. It may be changed while analysis is in progress and that would create unpredictable results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add a AnalysisUnit to ModuleInfo in a similar way it exists in ClassInfo. Then ProjectEntry will have control over when to change it. However, if you are talking about potentially multithreaded analysis, they can still be out of sync.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Analysis is single threaded, but it is in parallel with parsing. Unfortunately, you can't do it like with ClassInfo, because there is only one instance of ModuleInfo per ProjectEntry - it isn't recreated when new analysis arrives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for ClassInfos to be recreated and and ModuleInfos not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like _unit reassignment happens just before clearing ModuleInfo contents anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for ClassInfo to be recreated and ModuleInfo not?

ModuleReference (https://github.com/Microsoft/python-language-server/blob/2935897bfd8244dbe4296101a80021400ff3c88e/src/Analysis/Engine/Impl/ModuleReference.cs#L37) and some other parts of module dependency analysis work through instance references.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open for suggestions. Actually, at this moment I don't understand how can parsing be completed in parallel to analysis without race conditions. Here's the relevant code excerpt from ProjectEntry.Parse:

_unit = new AnalysisUnit(tree, MyScope.Scope);
AnalysisLog.NewUnit(_unit);

MyScope.Scope.Children = new List<InterpreterScope>();
MyScope.Scope.ClearNodeScopes();
MyScope.Scope.ClearNodeValues();
MyScope.Scope.ClearLinkedVariables();
MyScope.Scope.ClearVariables();
MyScope.ClearReferencedModules();
MyScope.ClearUnresolvedModules();
_unit.State.ClearDiagnostics(this);

There does not seem to be anything, that would prevent modification of MyScope.* if there might be an analysis of it going on at the same time.

Copy link

@MikhailArkhipov MikhailArkhipov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like change conflicts with Alex' work in progress, so let's wait until then

@AlexanderSher
Copy link
Contributor

@lostmsu , can you create a minimal repro project (or provide content of the files) for the issue that you want to fix with this PR? I'm fixing *.pyi handling inside #361, will be useful for me.

@lostmsu
Copy link
Contributor Author

lostmsu commented Nov 24, 2018

@AlexanderSher
the scenario this change addresses is: suppose you open a directory in VS Code with nothing but the following two files in it:
alongside_annotated.py

class Annotated:
    def foo(self, bar):
        bar

and alongside_annotated.pyi

class Annotated:
    def foo(self, bar: int): ...

The goal is to have VS Code tell you, that bar in alongside_annotated.py is int when you hover over it, and get appropriate completions.

This change only affects analysis part: it enables one to add .pyi files to PythonAnalyzer by calling PythonAnalyzer.AddModuleAnnotations. These are then used for function parameters and return types.

@lostmsu
Copy link
Contributor Author

lostmsu commented Nov 27, 2018

@AlexanderSher did that work for you? Does your work cover this use case?

@MikhailArkhipov
Copy link

We are doing major rework of the analysis engine and #546 supports them. DDG analysis is no longer actively supported.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants