PEP 647 – User-Defined Type Guards
- PEP
- 647
- Title
- User-Defined Type Guards
- Author
- Eric Traut <erictr at microsoft.com>
- Sponsor
- Guido van Rossum <guido at python.org>
- Discussions-To
- Typing-Sig <typing-sig at python.org>
- Status
- Accepted
- Type
- Standards Track
- Created
- 07-Oct-2020
- Python-Version
- 3.10
- Post-History
- 28-Dec-2020, 9-Apr-2021
- Resolution
- https://mail.python.org/archives/list/python-dev@python.org/thread/2ME6F6YUVKHOQYKSHTVQQU5WD4CVAZU4/
Contents
Abstract
This PEP specifies a way for programs to influence conditional type narrowing employed by a type checker based on runtime checks.
Motivation
Static type checkers commonly employ a technique called “type narrowing” to
determine a more precise type of an expression within a program’s code flow.
When type narrowing is applied within a block of code based on a conditional
code flow statement (such as if
and while
statements), the conditional
expression is sometimes referred to as a “type guard”. Python type checkers
typically support various forms of type guards expressions.
def func(val: Optional[str]):
# "is None" type guard
if val is not None:
# Type of val is narrowed to str
...
else:
# Type of val is narrowed to None
...
def func(val: Optional[str]):
# Truthy type guard
if val:
# Type of val is narrowed to str
...
else:
# Type of val remains Optional[str]
...
def func(val: Union[str, float]):
# "isinstance" type guard
if isinstance(val, str):
# Type of val is narrowed to str
...
else:
# Type of val is narrowed to float
...
def func(val: Literal[1, 2]):
# Comparison type guard
if val == 1:
# Type of val is narrowed to Literal[1]
...
else:
# Type of val is narrowed to Literal[2]
...
There are cases where type narrowing cannot be applied based on static information only. Consider the following example:
def is_str_list(val: List[object]) -> bool:
"""Determines whether all objects in the list are strings"""
return all(isinstance(x, str) for x in val)
def func1(val: List[object]):
if is_str_list(val):
print(" ".join(val)) # Error: invalid type
This code is correct, but a type checker will report a type error because
the value val
passed to the join
method is understood to be of type
List[object]
. The type checker does not have enough information to
statically verify that the type of val
is List[str]
at this point.
This PEP introduces a way for a function like is_str_list
to be defined as
a “user-defined type guard”. This allows code to extend the type guards that
are supported by type checkers.
Using this new mechanism, the is_str_list
function in the above example
would be modified slightly. Its return type would be changed from bool
to TypeGuard[List[str]]
. This promises not merely that the return value
is boolean, but that a true indicates the input to the function was of the
specified type.
from typing import TypeGuard
def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
"""Determines whether all objects in the list are strings"""
return all(isinstance(x, str) for x in val)
User-defined type guards can also be used to determine whether a dictionary conforms to the type requirements of a TypedDict.
class Person(TypedDict):
name: str
age: int
def is_person(val: dict) -> "TypeGuard[Person]":
try:
return isinstance(val["name"], str) and isinstance(val["age"], int)
except KeyError:
return False
def print_age(val: dict):
if is_person(val):
print(f"Age: {val['age']}")
else:
print("Not a person!")
Specification
TypeGuard Type
This PEP introduces the symbol TypeGuard
exported from the typing
module. TypeGuard
is a special form that accepts a single type argument.
It is used to annotate the return type of a user-defined type guard function.
Return statements within a type guard function should return bool values,
and type checkers should verify that all return paths return a bool.
In all other respects, TypeGuard is a distinct type from bool. It is not a
subtype of bool. Therefore, Callable[..., TypeGuard[int]]
is not assignable
to Callable[..., bool]
.
When TypeGuard
is used to annotate the return type of a function or
method that accepts at least one parameter, that function or method is
treated by type checkers as a user-defined type guard. The type argument
provided for TypeGuard
indicates the type that has been validated by
the function.
User-defined type guards can be generic functions, as shown in this example:
_T = TypeVar("_T")
def is_two_element_tuple(val: Tuple[_T, ...]) -> TypeGuard[Tuple[_T, _T]]:
return len(val) == 2
def func(names: Tuple[str, ...]):
if is_two_element_tuple(names):
reveal_type(names) # Tuple[str, str]
else:
reveal_type(names) # Tuple[str, ...]
Type checkers should assume that type narrowing should be applied to the expression that is passed as the first positional argument to a user-defined type guard. If the type guard function accepts more than one argument, no type narrowing is applied to those additional argument expressions.
If a type guard function is implemented as an instance method or class method, the first positional argument maps to the second parameter (after “self” or “cls”).
Here are some examples of user-defined type guard functions that accept more than one argument:
def is_str_list(val: List[object], allow_empty: bool) -> TypeGuard[List[str]]:
if len(val) == 0:
return allow_empty
return all(isinstance(x, str) for x in val)
_T = TypeVar("_T")
def is_set_of(val: Set[Any], type: Type[_T]) -> TypeGuard[Set[_T]]:
return all(isinstance(x, type) for x in val)
The return type of a user-defined type guard function will normally refer to
a type that is strictly “narrower” than the type of the first argument (that
is, it’s a more specific type that can be assigned to the more general type).
However, it is not required that the return type be strictly narrower. This
allows for cases like the example above where List[str]
is not assignable
to List[object]
.
When a conditional statement includes a call to a user-defined type guard function, and that function returns true, the expression passed as the first positional argument to the type guard function should be assumed by a static type checker to take on the type specified in the TypeGuard return type, unless and until it is further narrowed within the conditional code block.
Some built-in type guards provide narrowing for both positive and negative
tests (in both the if
and else
clauses). For example, consider the
type guard for an expression of the form x is None
. If x
has a type that
is a union of None and some other type, it will be narrowed to None
in the
positive case and the other type in the negative case. User-defined type
guards apply narrowing only in the positive case (the if
clause). The type
is not narrowed in the negative case.
OneOrTwoStrs = Union[Tuple[str], Tuple[str, str]]
def func(val: OneOrTwoStrs):
if is_two_element_tuple(val):
reveal_type(val) # Tuple[str, str]
...
else:
reveal_type(val) # OneOrTwoStrs
...
if not is_two_element_tuple(val):
reveal_type(val) # OneOrTwoStrs
...
else:
reveal_type(val) # Tuple[str, str]
...
Backwards Compatibility
Existing code that does not use this new functionality will be unaffected.
Notably, code which uses annotations in a manner incompatible with the stdlib typing library should simply not import TypeGuard.
Reference Implementation
The Pyright type checker supports the behavior described in this PEP.
Rejected Ideas
Decorator Syntax
The use of a decorator was considered for defining type guards.
@type_guard(List[str])
def is_str_list(val: List[object]) -> bool: ...
The decorator approach is inferior because it requires runtime evaluation of the type, precluding forward references. The proposed approach was also deemed to be easier to understand and simpler to implement.
Enforcing Strict Narrowing
Strict type narrowing enforcement (requiring that the type specified
in the TypeGuard type argument is a narrower form of the type specified
for the first parameter) was considered, but this eliminates valuable
use cases for this functionality. For instance, the is_str_list
example
above would be considered invalid because List[str]
is not a subtype of
List[object]
because of invariance rules.
One variation that was considered was to require a strict narrowing requirement by default but allow the type guard function to specify some flag to indicate that it is not following this requirement. This was rejected because it was deemed cumbersome and unnecessary.
Another consideration was to define some less-strict check that ensures that there is some overlap between the value type and the narrowed type specified in the TypeGuard. The problem with this proposal is that the rules for type compatibility are already very complex when considering unions, protocols, type variables, generics, etc. Defining a variant of these rules that relaxes some of these constraints just for the purpose of this feature would require that we articulate all of the subtle ways in which the rules differ and under what specific circumstances the constrains are relaxed. For this reason, it was decided to omit all checks.
It was noted that without enforcing strict narrowing, it would be possible to break type safety. A poorly-written type guard function could produce unsafe or even nonsensical results. For example:
def f(value: int) -> TypeGuard[str]:
return True
However, there are many ways a determined or uninformed developer can subvert
type safety – most commonly by using cast
or Any
. If a Python
developer takes the time to learn about and implement user-defined
type guards within their code, it is safe to assume that they are interested
in type safety and will not write their type guard functions in a way that will
undermine type safety or produce nonsensical results.
Conditionally Applying TypeGuard Type
It was suggested that the expression passed as the first argument to a type
guard function should retain its existing type if the type of the expression was
a proper subtype of the type specified in the TypeGuard return type.
For example, if the type guard function is def f(value: object) ->
TypeGuard[float]
and the expression passed to this function is of type
int
, it would retain the int
type rather than take on the
float
type indicated by the TypeGuard return type. This proposal was
rejected because it added complexity, inconsistency, and opened up additional
questions about the proper behavior if the type of the expression was of
composite types like unions or type variables with multiple constraints. It was
decided that the added complexity and inconsistency was not justified given
that it would provide little or no added value.
Narrowing of Arbitrary Parameters
TypeScript’s formulation of user-defined type guards allows for any input parameter to be used as the value tested for narrowing. The TypeScript language authors could not recall any real-world examples in TypeScript where the parameter being tested was not the first parameter. For this reason, it was decided unnecessary to burden the Python implementation of user-defined type guards with additional complexity to support a contrived use case. If such use cases are identified in the future, there are ways the TypeGuard mechanism could be extended. This could involve the use of keyword indexing, as proposed in PEP 637.
Narrowing of Implicit “self” and “cls” Parameters
The proposal states that the first positional argument is assumed to be the
value that is tested for narrowing. If the type guard function is implemented
as an instance or class method, an implicit self
or cls
argument will
also be passed to the function. A concern was raised that there may be
cases where it is desired to apply the narrowing logic on self
and cls
.
This is an unusual use case, and accommodating it would significantly
complicate the implementation of user-defined type guards. It was therefore
decided that no special provision would be made for it. If narrowing
of self
or cls
is required, the value can be passed as an explicit
argument to a type guard function.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/master/pep-0647.rst
Last modified: 2021-07-14 18:01:22 GMT