Type hints

inside the snake pit

by Bernat Gabor / @gjbernat / Bloomberg
https://gaborbernat.github.io/type-hint-recr-2022

Who am I?

  • software engineer at Bloomberg (data ingestion and quality control)
  • OSS contributor github/gaborbernat, bernat.tech
  • member of the Python Packaging Authority
  • 😎 parent to two Yorkshire Terriers
Silky

Copyright of the above image owned by Bernat Gabor

Maintainer (author *) of Python OSS projects

Type hints in Python?

interested

© Creative Commons CC0 1.0

Type hinting

  • foundations laid down in PEP-484
  • implemented in the language (along with the reference type checker, mypy) by
    • Guido van Rossum - BDFL
    • Jukka Lehtosalo - mypy
    • Ivan Levkivskyi
    • Łukasz Langa
  • added to Python (as the typing module) with version 3.5 - September 13th, 2015
                      
                      from typing import List
                      
                    
  • received many improvements ever since

why do it?

© Creative Commons CC0 1.0

easier to

  • understand (and use) the source code
  • maintain
  • debug
                
                 def send_request(request_data: Any,
                                  headers: dict[str, str] | None,
                                  user_id: UserId | None = None,
                                  as_json: bool = True):
                     """this is a great sender"""
                
            

Easier refactoring

  • find usage of types
  • discover call hierarchy
  • better detection of objects and lifecycle flow

accurate type information for IDE auto complete

lint checks - find bugs with no tests

improve documentation

data validation - pydantic

  • a cleaner syntax to specify type requirements

                  from datetime import datetime

                  from pydantic import BaseModel, ValidationError

                  class User(BaseModel):
                      id: int
                      signup_ts: datetime | None = None
                      friends: list[int] = []


                  external_data = {
                      "id": "123",
                      "signup_ts": "broken",
                      "friends": [1, 2, "not a number"],
                  }
                  try:
                      user = User(**external_data)
                  except ValidationError as e:
                      print(e.json())
                        
                    

detect vulnerable code - security

Facebook - pyre

                    
                      # Avoid passing user input to system calls (safety check)
                      import subprocess
                      import sys

                      def call(cmd: list[str]) -> None:
                          subprocess.check_call(cmd)

                      def ensure_safe(args: list[str]) -> list[str]:
                          assert args == [sys.executable, "--version"]
                          return args

                      user_input: list[str] = input().split(" ")
                      call(user_input)  # do not allow this
                    
                

detect vulnerable code - security

Facebook - pyre

                    
                      # Avoid passing user input to system calls (safety check)
                      import subprocess
                      import sys

                      def call(cmd: list[str]) -> None:
                          subprocess.check_call(cmd)

                      def ensure_safe(args: list[str]) -> list[str]:
                          assert args == [sys.executable, "--version"]
                          return args

                      user_input: list[str] = input().split(" ")
                      call(ensure_safe(user_input))  # this is now safe
                    
                

detect vulnerable code - security

Facebook - pyre

                    
                      # Avoid passing user input to system calls (safety check)
                      import subprocess
                      import sys
                      from typing import NewType, cast

                      SafeStr = NewType("SafeStr", str)

                      def call(cmd: list[SafeStr]) -> None:
                          subprocess.check_call(cmd)

                      def ensure_safe(args: list[str]) -> list[SafeStr]:
                          assert args == [sys.executable, "--version"]
                          return [cast(SafeStr, s) for s in args]

                      user_input: list[str] = input().split(" ")
                      call(ensure_safe(user_input))  # this is now safe
                    
                

what it is not

essentially treated as comments during script evaluation

  • no runtime type inference
  • no performance improvement
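
A minimal sketch of what "treated as comments" means in practice: the interpreter stores the hints but never checks them. The function name `double` is hypothetical, made up for this illustration.

```python
def double(value: int) -> int:
    """Annotated for integers, but nothing enforces that at runtime."""
    return value * 2


# The interpreter accepts a str here; only a type checker would complain.
result = double("ab")  # type: ignore[arg-type]
print(result)  # abab

# The hints are merely stored as metadata on the function object.
print(double.__annotations__)
```

Running a type checker over the same file is what would surface the mismatch; the program itself runs to completion.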

mypyc - speed up for free (mostly)

  • compiles type hinted code into a C extension (does not support the full syntax yet)
  • 4-20x performance improvement (thanks to avoiding hash table lookups)
  • projects using it: mypy, black

Improve the developer experience, not performance

© Creative Commons CC0 License

Improve the developer experience, and performance

© Creative Commons CC0 License

what kind of?

© Creative Commons CC0 1.0

Gradual typing

  • no type hints specified ⇒ code dynamically typed
  • but we can type hint
    • function parameters
    • function return values
    • variables
  • only type hinted elements are type checked
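
A small sketch of the three places a hint can go, next to an unhinted function that stays fully dynamic. The names `untyped` and `typed` are hypothetical; `typing.get_type_hints` is used only to show what information is recorded.

```python
from typing import get_type_hints


def untyped(a, b):
    # No hints: a type checker treats parameters and result as dynamic.
    return a + b


def typed(a: int, b: int) -> int:  # parameter and return value hints
    total: int = a + b  # variable hint
    return total


print(get_type_hints(untyped))  # {}
print(get_type_hints(typed))
print(untyped("a", "b"), typed(1, 2))
```

Both functions can live in the same module, which is what makes it possible to add hints one function at a time.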

gradual typing

  • running a type checker (linter) reports errors (if the called code is type hinted):
                
                    # tests/test_magic_field.py
                    f = MagicField(1, MagicType.DEFAULT)
                    f.names()
            
                
                bernat@uvm ~/python-magic (master●)$ mypy --ignore-missing-imports tests/test_magic_field.py
                tests/test_magic_field.py:21: error: Argument 1 to "MagicField" has incompatible type "int";
                    expected "Union[str, bytes]"
                tests/test_magic_field.py:22: error: "MagicField" has no attribute "names"; maybe "name" or "_name"?
                
            

How to add it

© Creative Commons CC0 License

type annotations

syntax based on

type annotations

for example

                    
                    def greeting(name: str) -> str:  # function annotation
                        value: str = "Hello"  # variable annotation
                        return f"{value} {name}"
                    
                

type annotations

                
            class A:
                def __init__(self) -> None:
                    self.elements: list[int] = []

                def add(self, element: int) -> None:
                    self.elements.append(element)
                    
            
  • the canonical and clean way
  • packaging solved
  • requires Python 3.6+ for variable annotations (no Python 2, or <=3.5)
  • requires importing all type dependencies (runtime penalty?)
  • the interpreter evaluates type hints when the definitions are executed - time consuming?
    PEP-563 ~ postponed evaluation of annotations - Python 3.7+
    but it can cause issues for tools (e.g. pydantic) that need the type information at runtime
                    
                            from __future__ import annotations
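
A sketch of why postponed evaluation matters to runtime tools. The annotations are written as strings by hand here, which mimics how PEP-563 stores them, so the snippet does not depend on being the first statement of a module. The function `area` is hypothetical.

```python
from typing import get_type_hints


def area(width: "float", height: "float") -> "float":
    return width * height


# The raw annotations stay unevaluated strings ...
print(area.__annotations__)

# ... so tools that need real types have to resolve them on demand.
print(get_type_hints(area))
```

A library that reads `__annotations__` directly sees only text, which is the kind of issue the last bullet refers to.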
                    
                

type comments

                
            from typing import List

            class A:
                def __init__(self):
                    # type: () -> None
                    self.elements = []  # type: List[int]

                def add(self, element):
                    # type: (int) -> None
                    self.elements.append(element)
                    
            
  • was mostly used for supporting both Python 2 and 3 - migrate via com2ann
  • works under any Python version
  • type information is kept locally
  • packaging solved
  • kinda ugly, lots of noise beside logic
  • unused imports
  • sometimes confuses linters, requiring noqa/pylint comments

interface/stub files

                
                # a.py
                class A:
                  def __init__(self) -> None:
                      self.elements = []

                  def add(self, element):
                      self.elements.append(element)
                    
            
                
                # a.pyi alongside a.py - ellipsis for body stub
                class A:
                  elements: list[int] = ...
                  def __init__(self) -> None: ...
                  def add(self, element: int) -> None: ...
                    
            
  • works under any Python version
  • no original source code change required
  • can use latest Python features
  • no conflicts with other linter tools
  • well tested with typeshed

bonus: docstrings

                
                class A:
                    def __init__(self):
                        self.elements = []

                    def add(self, element):
                        """
                        :param int element: the element to add
                        :rtype: None
                        """
                        self.elements.append(element)
                    
            
  • works under any Python version
  • does not clash with other linters (unless it's a doc linter)
  • no standard way to specify complex cases (union)
  • tool dependent support
  • requires changing the documentation
  • does not play well with type hinted code

what to add?

© Creative Commons License

nominal types

  • all built in Python types (e.g., int, float, type, object, etc.)
  • generic containers (a few examples only)
                        
                       t: tuple[int, float] = 0, 1.2
                       d: dict[str, int] = {"a": 1, "b": 2}
                       m: MutableMapping[str, int] = {"a": 1, "b": 2}
                       l: list[int] = [1, 2, 3]
                       i: Iterable[str] = ["1", "2", "3"]
                        
                        
  • alias types
                        
                        Vector = list[float]  # implicit
                        Vector: TypeAlias = list[float]  # explicit
                        
                        
  • distinct types
                        
                       UserId = NewType("UserId", int)
                       some_id = UserId(524313)
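
A short sketch of how a distinct type behaves: a checker treats `UserId` as separate from `int`, while at runtime calling it simply returns the value it was given. The helper `load_user` is hypothetical.

```python
from typing import NewType

UserId = NewType("UserId", int)


def load_user(user_id: UserId) -> str:
    # A type checker rejects a plain int here; a UserId is required.
    return f"user-{user_id}"


some_id = UserId(524313)
print(type(some_id))  # <class 'int'>
print(load_user(some_id))  # user-524313
```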
                        
                        

nominal types

  • NamedTuple
                        
                       class Employee(NamedTuple):
                            name: str
                            id: int
                        
                        
  • composer - one of
                        
                        Union[None, int, str]
                        None | int | str
                        
                        
  • composer - optional
                        
                        float | None  # Optional[float]
                        
                        

nominal types

  • callable - functions
                        
                       # Callable[[Arg1Type, Arg2Type], ReturnType]
                       def feeder(get_next_item: Callable[[], str]) -> None:
                           ...
                        
                        
  • generics - TypeVar
                        
                       T = TypeVar("T")
    
                       class Magic(Generic[T]):
                           def __init__(self, value: T) -> None:
                               self.value: T = value
    
                       def square_values(vars: Iterable[Magic[int]]) -> None:
                           for v in vars:
                               v.value = v.value * v.value
                        
                        
  • Any - disable type check
                        
                       def foo(item: Any) -> int:
                           return item.bar()
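
A runnable sketch of the generics bullet, assuming the same `Magic` container. The function `unwrap_all` is hypothetical; it shows how one TypeVar ties the input type to the output type.

```python
from typing import Generic, Iterable, List, TypeVar

T = TypeVar("T")


class Magic(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value: T = value


def unwrap_all(items: Iterable[Magic[T]]) -> List[T]:
    # T links input and output: Magic[int] in means List[int] out.
    return [item.value for item in items]


numbers = unwrap_all([Magic(1), Magic(2)])
print(numbers)  # [1, 2]
```

A checker infers `List[int]` for `numbers` here without any extra annotation at the call site.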
                        

PEP-544 - protocols

nominal (main) vs structural typing (support)

                

                    KEY = TypeVar("KEY", contravariant=True)

                    class MagicGetter(Protocol[KEY], Sized):
                        def __getitem__(self, item: KEY) -> int: ...

                    def func_int(param: MagicGetter[int]) -> int:
                        return param[1] * 2

                    def func_str(param: MagicGetter[str]) -> str:
                        return f"{param['a']}"
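
A self-contained sketch of structural typing, kept simpler than the generic protocol on the slide. `SupportsClose`, `Resource` and `shutdown` are hypothetical names; `runtime_checkable` is added only so the match can also be shown with `isinstance`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    # Does not inherit from SupportsClose, yet matches its structure.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(item: SupportsClose) -> None:
    item.close()


resource = Resource()
shutdown(resource)
print(resource.closed)  # True
print(isinstance(resource, SupportsClose))  # True
```

With nominal typing `Resource` would have to subclass the interface; with a protocol, having the right methods is enough.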

                    
            

Gotchas

© Creative Commons CC 2.0 License

Gotcha #1 - multiple return type

                
                def magic(i: str | int) ->  str | int:
                    return i * 2
                    
            
                
                def other_func() -> int:
                    result = magic(2)
                    assert isinstance(result, int)
                    return result
                    
            

Gotcha #1 - multiple return type

                
                def magic(i: str | int) -> Any:
                    return i * 2
                    
            
                
                def other_func() -> int:
                    result = magic(2)

                    return result
                    
            

Gotcha #1 - multiple return type

                
                from typing import overload

                @overload
                def magic(i: int) -> int:  # pylint: disable=function-redefined
                    ...

                @overload
                def magic(i: str) -> str:  # pylint: disable=function-redefined
                    ...

                def magic(i:  str | int) ->  str | int:
                    return i * 2
                    
            
                
                def other_func() -> int:
                    result = magic(2)
                    return result
                    
            

Gotcha #2 - contravariant argument

                
                from abc import ABCMeta, abstractmethod

                class A(metaclass=ABCMeta):
                    @abstractmethod
                    def func(self, key: str | int) -> str:
                        raise NotImplementedError

                class B(A):
                    def func(self, key: int) -> str:
                        return str(key)

                class C(A):
                    def func(self, key: str) -> str:
                        return key
                    
            
                
                test.py:12: error: Argument 1 of "func" incompatible with supertype "A"
                test.py:17: error: Argument 1 of "func" incompatible with supertype "A"
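
A sketch of why the checker complains: code written against the base class may pass any value the base promises to accept, and a narrowed override can then fail at runtime. `Base`, `Narrowed` and `use` are hypothetical names.

```python
from typing import List, Union


class Base:
    def func(self, key: Union[str, int]) -> str:
        return str(key)


class Narrowed(Base):
    # A checker flags this override: it accepts less than Base.func promises.
    def func(self, key: str) -> str:  # type: ignore[override]
        return key.upper()


def use(items: List[Base]) -> List[str]:
    # Valid against Base, since an int is an allowed key.
    return [item.func(1) for item in items]


try:
    use([Base(), Narrowed()])
except AttributeError as error:
    print(f"runtime failure: {error}")
```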
                    
            

Gotcha #2 - contravariant argument

specialization can handle more

                
                from abc import ABCMeta, abstractmethod

                class A(metaclass=ABCMeta):
                    @abstractmethod
                    def func(self, key: str | int) -> str:
                        raise NotImplementedError

                class B(A):
                    def func(self, key: str | int | bool) -> str:
                        return str(key)

                class C(A):
                    def func(self, key: str | int | list[int]) -> str:
                        return str(key)
                    
            

Gotcha #3 - compatibility

                
                class A:
                    @classmethod
                    def magic(cls, a: int) -> "A":
                        return cls()

                class B(A):
                    @classmethod
                    def magic(cls, a: int, b: bool) -> "B":
                        return cls()
                    
            
                
                elements: list[type[A]] = [A, B]
                print([e.magic(1) for e in elements])
                    
            
                
                    TypeError: B.magic() missing 1 required positional argument: 'b'
                  
            
                
                    test.py:9: error: Signature of "magic" incompatible with supertype "A"
                  
            

Gotcha #3 - compatibility

                
                class A:
                    @classmethod
                    def magic(cls, a: int) -> "A":
                        return cls()

                class B(A):
                    @classmethod
                    def magic(cls, a: int, b: bool = False) -> "B":
                        return cls()
                    
            
                
                elements: list[type[A]] = [A, B]
                print([e.magic(1) for e in elements])
                    
            

Gotcha #3 - compatibility

                
                class A:
                    def __init__(self, a: int) -> None:
                        pass

                class B(A):
                    def __init__(self, a: int, b: bool) -> None:
                        super().__init__(a)
                    
            
                
                elements: list[type[A]]= [A, B]
                print([e(1) for e in elements])
                    
            
                
                      TypeError: B.__init__() missing 1 required positional argument: 'b'
                    
            
                
                    
            
  • mypy does not report this: incompatible __init__ and __new__ overrides are too common to prohibit

when you hit the wall

  • take out the bigger hammer
  • use reveal_type to see inferred type (or reveal_locals)
  • use cast to force a given type:
                    
                    from typing import cast
    
                    a = [4]
                    reveal_type(a)         # -> error: Revealed type is 'builtins.list[builtins.int*]'
    
                    b = cast(list[str], a) # passes fine (no runtime check)
                    reveal_type(b)         # -> error: Revealed type is 'builtins.list[builtins.str]'
    
                    reveal_locals()
                    # note: Revealed local types are:
                    # note:     a: builtins.list[builtins.int*]
                    # note:     b: builtins.list[builtins.str]
                    
                
  • use the # type: ignore comment to ignore an error:
                    
                    x = confusing_function() # type: ignore # see mypy/issues/1167
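
A brief sketch of what "no runtime check" means for cast: the call returns the very same object, so a wrong cast silently hides a real mismatch.

```python
from typing import cast

a = [4]
b = cast(list[str], a)  # tells the checker "trust me"; nothing is converted

print(b is a)  # True
print(type(b[0]))  # <class 'int'>
```

This is why cast is the bigger hammer: the checker now believes `b` holds strings even though it does not.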
                        
                

The ever-evolving snake

PEP-586 Literal types - added in 3.8

                from typing import IO, Literal, overload

                @overload
                def open(fn: str, mode: Literal["r", "w"]) -> IO[str]: pass
                @overload
                def open(fn: str, mode: Literal["rb", "wb"]) -> IO[bytes]: pass
                def open(fn: str, mode: str):
                    ...

                with open("/etc/passwd", "r") as ft:
                    lines: str = ft.read()

                with open("/bin/sh", "rb") as fb:
                    data: bytes = fb.read()
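
A smaller runnable sketch of Literal, assuming a hypothetical `describe` function. A checker rejects values outside the listed literals; `typing.get_args` exposes the allowed values in case a manual runtime check is wanted as well.

```python
from typing import Literal, get_args

Mode = Literal["r", "w"]


def describe(mode: Mode) -> str:
    # A checker rejects describe("x"); at runtime we validate by hand.
    if mode not in get_args(Mode):
        raise ValueError(f"unsupported mode: {mode!r}")
    return "read" if mode == "r" else "write"


print(describe("r"))  # read
print(get_args(Mode))  # ('r', 'w')
```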
                
            

PEP-589 Typed dictionaries - added in 3.8

                
from typing import TypedDict


class Movie(TypedDict, total=True):
    name: str
    year: int

movie: Movie = {
    "name": "Blade Runner",
    "year": 1982,
}
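
A sketch of the total flag, assuming a hypothetical `PartialMovie` variant of the class above. With total=True every key is required; with total=False a checker accepts a dictionary that omits keys. Either way the value is a plain dict at runtime.

```python
from typing import TypedDict


class PartialMovie(TypedDict, total=False):
    name: str
    year: int


# With total=False a checker accepts a dict that omits keys.
draft: PartialMovie = {"name": "Blade Runner"}

print(type(draft))  # <class 'dict'>
print(PartialMovie.__total__)  # False
```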
                
            

PEP-591 Final qualifier - added in 3.8

                

from typing import Final

RATE: Final[int] = 3000
RATE = 2500  # ERROR -- cannot set Final variable

# Final class variable
class User:
    DEFAULT_ID: Final[int] = 0
class BusinessUser(User):
    DEFAULT_ID = 1  # ERROR -- cannot set inherited Final class variable
                
            

PEP-585 less camel case and imports - added in 3.9

                # python 3.5 or later
from typing import Dict, List

a: List[int] = [1, 2]
b : Dict[int, str] = {}
            

PEP-585 less camel case and imports - added in 3.9

                # python 3.9 or later


a: list[int] = [1, 2]
b : dict[int, str] = {}
            

PEP-585 less camel case and imports - added in 3.9

                # python 3.7 or later
from __future__ import annotations

a: list[int] = [1, 2]
b : dict[int, str] = {}
            

PEP-604 shorter syntax for Union - added in 3.10

                # python3.5 or later
from typing import Optional, Union

a : Union[str, int] = "1"
b : Optional[int] = 2
            

PEP-604 shorter syntax for Union - added in 3.10

                # python3.10 or later


a : str | int = "1"
b : int | None = 2
            

PEP-604 shorter syntax for Union

                # python3.7 or later
from __future__ import annotations

a : str | int = "1"
b : int | None = 2
            

Other accepted typing PEPs

Merge docstring and typing?

how to document types?

  • PEP-257 defines how to document Python code
  • also supports adding type information for:
    • variable
    • parameter
                    
                    class A:
                        def __init__(self):
                            self.elements = []

                        def add(self, element):
                            """
                            :param int element: the element to add
                            :rtype: None
                            """
                            self.elements.append(element)
                        
                

sphinx-autodoc-typehints

  • avoid type duplication between docstring and type hinting
  • during documentation generation, reads the type hints and inserts them into the docstring

how to use it

  • install the library
                    
                        pip install "sphinx-autodoc-typehints>=1.17"
                        
                
  • enable it inside conf.py
                    
                        # conf.py
                        extensions = ["sphinx_autodoc_typehints"]
                        
                
  • generate the documentation as usual

outcome - RookieGameDevs/revived

conclusion

when to use it?

  • think of them as unit tests to ensure type correctness
  • so use them whenever you would write unit tests
  • but remember, they can do much more
    • checked docstring typing
    • runtime type validation
    • etc.
  • but prepare for some discomfort and challenges sometimes

thank you

© Creative Commons Attribution-Share Alike 2.5 Generic

https://www.bernat.tech/the-state-of-type-hints-in-python/

we're hiring

https://bloomberg.com/engineering

see TechAtBloomberg.com / @TechAtBloomberg

Bloomberg

© 2022 Bloomberg Finance L.P.
All rights reserved.