Merge pull request #718 from google/google_sync
Google sync
Solumin authored Oct 19, 2020
2 parents 1c4ac1a + e34b2d1 commit a4d56ef
Showing 26 changed files with 599 additions and 114 deletions.
2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -12,6 +12,8 @@ include(PyTypeUtils)
set(PYTYPE_OUT_BIN_DIR ${PROJECT_BINARY_DIR}/bin)
file(MAKE_DIRECTORY ${PYTYPE_OUT_BIN_DIR})

add_subdirectory(pybind11)

add_subdirectory(pytype)

# Add the "googletest" directory at the end as it defines its own CMake rules
2 changes: 1 addition & 1 deletion README.md
@@ -89,7 +89,7 @@ merge-pyi -i <filepath>.py .pytype/pyi/<filename>.pyi

## Requirements

You need a Python 3.5-3.8 interpreter to run pytype, as well as an
You need a Python 3.6-3.8 interpreter to run pytype, as well as an
interpreter in `$PATH` for the Python version of the code you're analyzing
(supported: 2.7, 3.5-3.8).

4 changes: 3 additions & 1 deletion docs/_layouts/dev_guide.html
@@ -40,10 +40,12 @@ <h1><a href="{{ "/" | absolute_url }}">{{ site.title | default: site.github.repo
&bull; <a href="{{ "developers/directives.html" | relative_url }}">Directives and annotations</a>
&bull; <a href="{{ "developers/special_builtins.html" | relative_url }}">Special Builtins</a>
<br />
&bull; <a href="{{ "developers/overlays.md" | relative_url }}">Overlays</a>
&bull; <a href="{{ "developers/overlays.html" | relative_url }}">Overlays</a>
<br />
&bull; <a href="{{ "developers/typegraph.html" | relative_url }}">Typegraph internals</a>
<br />
&bull; <a href="{{ "developers/annotations.html" | relative_url }}">Type annotations</a>
<br />
&bull; <a href="{{ "developers/type_stubs.html" | relative_url }}">Type stubs</a>
</p>

193 changes: 193 additions & 0 deletions docs/developers/annotations.md
@@ -0,0 +1,193 @@
# Type Annotations

<!--ts-->
* [Type Annotations](#type-annotations)
* [Introduction](#introduction)
* [Annotations dictionary](#annotations-dictionary)
* [Converting variable annotations to types](#converting-variable-annotations-to-types)
* [Forward references](#forward-references)
* [Complex annotations](#complex-annotations)
* [Conversion to abstract types](#conversion-to-abstract-types)
* [Tracking local operations](#tracking-local-operations)

<!-- Added by: mdemello, at: 2020-10-12T13:44-07:00 -->

<!--te-->

## Introduction

In [PEP484](https://www.python.org/dev/peps/pep-0484/), python added syntactic
support for type annotations (also referred to as "type hints"). These are
not enforced or applied by the python interpreter, but are instead intended as a
combination of documentation and assertions that can be checked by third-party
tools like pytype. [This blog
post](http://veekaybee.github.io/2019/07/08/python-type-hints/) is a good quick
overview of how type hints fit into the python ecosystem in general.

A significant difference between annotations and type comments is that
annotations are parsed and compiled by the interpreter, even if they have no
semantic meaning in the runtime code. From pytype's point of view, this means
that we can process them as part of the regular bytecode VM (by contrast,
type comments need a [separate system](directives) to parse and integrate them
into the main code). For example, the following code:

```
class A: pass
x: A
```

compiles to

```
SETUP_ANNOTATIONS
... class A definition ...
LOAD_NAME 0 (A)
STORE_ANNOTATION 1 (x)
```

## Annotations dictionary

Python's `SETUP_ANNOTATIONS` and `STORE_ANNOTATION` opcodes respectively create
and populate an `__annotations__` dict in `locals` (for variables in functions)
or in `__dict__` (for annotated class members). Pytype similarly creates a
corresponding dictionary, `abstract.AnnotationsDict`, which it stores in the
equivalent locals or class member dictionary.
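
As a rough runtime illustration of the dictionary that pytype mirrors (this is
plain CPython behaviour, not pytype code, and the class names `A` and `B` are
made up for the example), annotated class members end up in the class's
`__annotations__`:

```python
class A:
  pass


class B:
  x: A        # annotation only; no class attribute `x` is created
  y: int = 0  # annotation plus an ordinary assignment


# Annotated members are recorded by name, in declaration order.
annotations = B.__annotations__
print(list(annotations))
```

Note that `x` appears in the annotations dict even though `B.x` does not exist,
which is why the annotations need to be tracked separately from the class
members themselves.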

The annotations dict is updated via the `vm._update_annotations_dict()` method,
which is called from two entry points:

* `vm._record_local()` records a type annotation on a local variable. The
AnnotationsDict is retrieved via `self.current_annotated_locals`, which
gets the AnnotationsDict for the current frame.

* `vm._apply_annotation()` is called with an explicit AnnotationsDict, which, in
turn, is either the `current_annotated_locals` or the annotations dict for a
class object, retrieved via
```
annotations_dict = abstract_utils.get_annotations_dict(cls.members)
```

A class's AnnotationsDict is also updated directly in `byte_STORE_ATTRIBUTE`,
handling the case where we have an annotation on an attribute assignment that
has not already been recorded as a class-level attribute annotation.


## Converting variable annotations to types

As a first step, type annotations on a variable are converted to pytype's
abstract types, and then stored as the type of that variable in much the same
way assignments are. Specifically, `x = Foo()` and `x: Foo` should both lead to
the same internal type being retrieved for `x` when it is referred to later in
the code.

### Forward references

Python currently supports two kinds of annotation,

```
x: Foo
```

where `Foo` is treated as a symbol that is looked up in the current namespace,
and then stored under `x` in the `__annotations__` dictionary, and

```
x: 'Foo'
```

where `Foo` is simply stored as a string. The latter case is useful because it
lets us annotate variables with types that have not been defined yet;
annotations of this type are variously referred to as "string annotations",
"forward references" or "late annotations".

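The difference between the two forms can be observed at runtime. The sketch
below is plain Python, not pytype code; the class `Node` is invented for the
example, and passing the namespaces explicitly to `typing.get_type_hints` is a
choice made here so the snippet does not depend on where it is executed:

```python
import typing


class Node:
  value: int    # immediate annotation: refers to the existing class `int`
  next: 'Node'  # late annotation: `Node` is not defined yet at this point


# The late annotation is stored as a plain string.
raw = Node.__annotations__

# Once `Node` exists, the string can be resolved to the class it names.
resolved = typing.get_type_hints(Node, globals(), {'Node': Node})
```

Here `raw['next']` is the string `'Node'`, while `resolved['next']` is the
class itself; a type checker has to perform a comparable resolution step.
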
### Complex annotations

While an annotation like `x: Foo` corresponds directly to the runtime type
`class Foo`, in general the type annotation system supports more complex types
that do not correspond directly to a runtime python type.

Some examples:

* Parametrised types, e.g. `List[int]` is the type of lists of integers, and
`Dict[K, V]` is the (generic) type of dictionaries whose keys and values have
types K and V respectively.
* Union types, e.g. `Union[int, str]` is the type of variables that could
contain either an `int` or a `str` for the purposes of static type analysis.
At runtime, they will contain a single concrete type.
* Optional types are a special subcase of unions; `Optional[T] = Union[T,
None]`.

NOTE: Technically, these types *do* correspond to runtime classes defined in
[typing.py](https://github.com/python/typing/blob/master/src/typing.py), but
that is just an implementation detail to avoid compiler errors when using them.
They are meant to be used by type checkers, not by python code.

Python's general syntax for complex annotations is

```
Base[param1, param2, ...]
```

where the base type `Base` is a python class subclassing `typing.Generic`, and
the `param`s are types (possibly parametrised themselves) or lists of types.
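
The `Base[param1, param2, ...]` structure can be inspected at runtime with the
standard `typing` helpers. This is only an illustration of the syntax, not of
how pytype represents these types, and it assumes Python 3.8 or later for
`typing.get_origin` and `typing.get_args`:

```python
import typing
from typing import Dict, List, Optional, Union

# `Optional[T]` is shorthand for `Union[T, None]`.
assert Optional[int] == Union[int, None]

# A parametrised annotation splits into a base and its parameters, and the
# parameters may themselves be parametrised.
nested = Dict[str, List[int]]
base = typing.get_origin(nested)    # the runtime class `dict`
params = typing.get_args(nested)    # (str, List[int])
inner = typing.get_args(params[1])  # (int,)
```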

### Conversion to abstract types

The main annotation processing code lives in the
`annotations_util.AnnotationsUtil` class (instantiated as a member of the VM).
This code has several entry points, for various annotation contexts, but the
bulk of the conversion work is done in the internal method
`_process_one_annotation()`.

Unprocessed annotations are represented as `abstract.AnnotationClass` (including the
derived class `abstract.AnnotationContainer`) for immediate annotations, and
`abstract.LateAnnotation` for late annotations. There is also a mixin class,
`mixin.NestedAnnotation`, which has some common code for dealing with inner
types (the types within the `[]` that the base type is parametrised on).

NOTE: The two types can be mixed; an immediate annotation can be parametrised
with a late annotation, e.g. `x: List['A']`, which will eventually be converted
to `x: List[A]` when we can resolve the name `'A'`.

`_process_one_annotation()` is essentially a large switch statement dealing with
various kinds of annotations, and calling itself recursively to deal with nested
annotations. The return value of `_process_one_annotation` is an
`abstract.*` object that can be applied as the python type of a variable.
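
The recursive shape of this processing can be sketched with a toy function.
This is emphatically not pytype's `_process_one_annotation()`: the function
`describe` and its string output are invented for illustration, it works on
runtime `typing` objects rather than pytype's abstract values, and it assumes
Python 3.8 or later.

```python
import typing


def describe(annotation):
  """Toy recursive walk over an annotation; returns a descriptive string."""
  # Late annotations: a bare string, or a string nested inside a container.
  if isinstance(annotation, str):
    return f"Late({annotation})"
  if isinstance(annotation, typing.ForwardRef):
    return f"Late({annotation.__forward_arg__})"
  origin = typing.get_origin(annotation)
  if origin is None:
    # A plain class such as `int`: nothing to recurse into.
    return getattr(annotation, "__name__", repr(annotation))
  # A parametrised type: recurse into each inner type.
  args = ", ".join(describe(arg) for arg in typing.get_args(annotation))
  name = "Union" if origin is typing.Union else origin.__name__
  return f"{name}[{args}]"


print(describe(typing.Dict[str, typing.List[int]]))
print(describe(typing.List["A"]))
```

Each branch handles one kind of annotation, and nested annotations are handled
by the function calling itself on the inner types, which is the same overall
structure the paragraph above describes.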

The various public methods in `AnnotationsUtil` cover different contexts in
which we can encounter variable annotations while processing bytecode; search
for `self.annotations_util` in `vm.py` to see where each one is used.

## Tracking local operations

There is a class of python code that does read type annotations at runtime, for
metaprogramming reasons. The commonest example is `dataclasses` in the standard
library (from python 3.7 onwards); for example the following will generate a
class with an appropriate `__init__` function:

```
@dataclasses.dataclass
class A:
  x: int
  y: str
```

Pytype has some custom [overlay](overlays) code to replicate the effects of
this metaprogramming, but it needs an explicit record of variable annotations,
possibly in the order in which they appear in the code, to handle the general
case. This is distinct from the regular use of annotations to assign types to
variables, and the information we need is not preserved by the regular pytype
type tracking machinery.
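
To see why the order of annotations matters, here is what the standard library
does with them at runtime. This is ordinary `dataclasses` behaviour, shown only
as a sketch of the effect an overlay has to reproduce; the class `A` matches
the example above:

```python
import dataclasses


@dataclasses.dataclass
class A:
  x: int
  y: str


# The generated __init__ takes its parameters in annotation order.
a = A(1, "one")

# The recorded fields preserve both the order and the annotated types.
field_names = [f.name for f in dataclasses.fields(A)]
field_types = [f.type for f in dataclasses.fields(A)]
```

Swapping the two annotated lines in the class body would swap the positional
parameters of `__init__`, so a record that only maps names to types is not
enough.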

To support this use case, we have a separate record of all assignments and
annotations to local variables, stored in a `vm.local_ops` dictionary and
indexed by the current frame. See `vm._record_local()` for how this dictionary
is updated, and `get_class_locals()` in `overlays/classgen.py` for an example
of how it is used along with `vm.annotated_locals` to recover a class's variable
annotations.

[directives]: directives.md
[overlays]: overlays.md
7 changes: 1 addition & 6 deletions docs/developers/attributes.md
@@ -10,15 +10,10 @@
* [get_special_attribute](#get_special_attribute)
* [valself](#valself)

<!-- Added by: rechen, at: 2020-10-03T02:13-07:00 -->
<!-- Added by: rechen, at: 2020-10-12T17:27-07:00 -->

<!--te-->

<!-- TODO(rechen):
* Do testing to verify that the `valself` section is accurate; it is based on
the docstring of `get_attribute`.
-->

## Introduction

The [attribute] module handles getting and setting attributes on abstract
2 changes: 1 addition & 1 deletion docs/index.md
@@ -86,7 +86,7 @@ merge-pyi -i <filepath>.py .pytype/pyi/<filename>.pyi

## Requirements

You need a Python 3.5-3.8 interpreter to run pytype, as well as an
You need a Python 3.6-3.8 interpreter to run pytype, as well as an
interpreter in `$PATH` for the Python version of the code you're analyzing
(supported: 2.7, 3.5-3.8).

12 changes: 12 additions & 0 deletions pytype/CMakeLists.txt
@@ -660,6 +660,18 @@ py_test(
    .pytd
)

py_test(
  NAME
    typegraph_metrics_test
  SRCS
    typegraph_metrics_test.py
  DEPS
    .config
    .libvm
    pytype.typegraph.cfg
    pytype.tests.test_base
)

add_subdirectory(overlays)
add_subdirectory(pyc)
add_subdirectory(pyi)
99 changes: 99 additions & 0 deletions pytype/attribute_test.py
@@ -11,6 +11,105 @@
import unittest


def _get_origins(binding):
  """Gets all the bindings in the given binding's origins."""
  bindings = set()
  for origin in binding.origins:
    for source_set in origin.source_sets:
      bindings |= source_set
  return bindings


class ValselfTest(test_base.UnitTest):
  """Tests for get_attribute's `valself` parameter."""

  def setUp(self):
    super().setUp()
    options = config.Options.create(python_version=self.python_version)
    self.vm = vm.VirtualMachine(
        errors.ErrorLog(), options, load_pytd.Loader(None, self.python_version))
    self.node = self.vm.root_cfg_node
    self.attribute_handler = self.vm.attribute_handler

  def test_instance_no_valself(self):
    instance = abstract.Instance(self.vm.convert.int_type, self.vm)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, instance, "real")
    attr_binding, = attr_var.bindings
    self.assertEqual(attr_binding.data.cls, self.vm.convert.int_type)
    # Since `valself` was not passed to get_attribute, a binding to
    # `instance` is not among the attribute's origins.
    self.assertNotIn(instance, [o.data for o in _get_origins(attr_binding)])

  def test_instance_with_valself(self):
    instance = abstract.Instance(self.vm.convert.int_type, self.vm)
    valself = instance.to_binding(self.node)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, instance, "real", valself)
    attr_binding, = attr_var.bindings
    self.assertEqual(attr_binding.data.cls, self.vm.convert.int_type)
    # Since `valself` was passed to get_attribute, it is added to the
    # attribute's origins.
    self.assertIn(valself, _get_origins(attr_binding))

  def test_class_no_valself(self):
    meta_members = {"x": self.vm.convert.none.to_variable(self.node)}
    meta = abstract.InterpreterClass("M", [], meta_members, None, self.vm)
    cls = abstract.InterpreterClass("X", [], {}, meta, self.vm)
    _, attr_var = self.attribute_handler.get_attribute(self.node, cls, "x")
    # Since `valself` was not passed to get_attribute, we do not look at the
    # metaclass, so M.x is not returned.
    self.assertIsNone(attr_var)

  def test_class_with_instance_valself(self):
    meta_members = {"x": self.vm.convert.none.to_variable(self.node)}
    meta = abstract.InterpreterClass("M", [], meta_members, None, self.vm)
    cls = abstract.InterpreterClass("X", [], {}, meta, self.vm)
    valself = abstract.Instance(cls, self.vm).to_binding(self.node)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, cls, "x", valself)
    # Since `valself` is an instance of X, we do not look at the metaclass, so
    # M.x is not returned.
    self.assertIsNone(attr_var)

  def test_class_with_class_valself(self):
    meta_members = {"x": self.vm.convert.none.to_variable(self.node)}
    meta = abstract.InterpreterClass("M", [], meta_members, None, self.vm)
    cls = abstract.InterpreterClass("X", [], {}, meta, self.vm)
    valself = cls.to_binding(self.node)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, cls, "x", valself)
    # Since `valself` is X itself, we look at the metaclass and return M.x.
    self.assertEqual(attr_var.data, [self.vm.convert.none])

  def test_getitem_no_valself(self):
    cls = abstract.InterpreterClass("X", [], {}, None, self.vm)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, cls, "__getitem__")
    attr, = attr_var.data
    # Since we looked up __getitem__ on a class without passing in `valself`,
    # the class is treated as an annotation.
    self.assertIs(attr.func.__func__, abstract.AnnotationClass.getitem_slot)

  def test_getitem_with_instance_valself(self):
    cls = abstract.InterpreterClass("X", [], {}, None, self.vm)
    valself = abstract.Instance(cls, self.vm).to_binding(self.node)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, cls, "__getitem__", valself)
    # Since we passed in `valself` for this lookup of __getitem__ on a class,
    # it is treated as a normal lookup; X.__getitem__ does not exist.
    self.assertIsNone(attr_var)

  def test_getitem_with_class_valself(self):
    cls = abstract.InterpreterClass("X", [], {}, None, self.vm)
    valself = cls.to_binding(self.node)
    _, attr_var = self.attribute_handler.get_attribute(
        self.node, cls, "__getitem__", valself)
    # Since we passed in `valself` for this lookup of __getitem__ on a class,
    # it is treated as a normal lookup; X.__getitem__ does not exist.
    self.assertIsNone(attr_var)


class AttributeTest(test_base.UnitTest):

  def setUp(self):
9 changes: 8 additions & 1 deletion pytype/matcher.py
@@ -418,6 +418,9 @@ def _match_type_against_type(self, left, other_type, subst, node, view):
      elif _is_callback_protocol(other_type):
        return self._match_type_against_callback_protocol(
            left, other_type, subst, node, view)
      elif left.cls:
        return self._match_type_against_type(
            abstract.Instance(left.cls, self.vm), other_type, subst, node, view)
      else:
        return None
    elif isinstance(left, dataclass_overlay.FieldInstance) and left.default:
@@ -961,7 +964,11 @@ def _get_concrete_values_and_classes(self, var):
  def _enforce_single_type(self, var, node):
    """Enforce that the variable contains only one concrete type."""
    concrete_values, classes = self._get_concrete_values_and_classes(var)
    if len(set(classes)) > 1:
    class_names = {c.full_name for c in classes}
    for compat_name, name in _COMPATIBLE_BUILTINS:
      if {compat_name, name} <= class_names:
        class_names.remove(compat_name)
    if len(class_names) > 1:
      # We require all occurrences to be of the same type, no subtyping allowed.
      return None
    if concrete_values and len(concrete_values) < len(var.data):