Python Enhancement Proposals

PEP 671 – Syntax for late-bound function argument defaults

PEP
671
Title
Syntax for late-bound function argument defaults
Author
Chris Angelico <rosuav at gmail.com>
Status
Draft
Type
Standards Track
Created
24-Oct-2021
Python-Version
3.11
Post-History
24-Oct-2021

Contents

Abstract

Function parameters can have default values which are calculated during function definition and saved. This proposal introduces a new form of argument default, defined by an expression to be evaluated at function call time.

Motivation

Optional function arguments, if omitted, often have some sort of logical default value. When this value depends on other arguments, or needs to be reevaluated each function call, there is currently no clean way to state this in the function header.

Currently-legal idioms for this include:

# Very common: Use None and replace it in the function
def bisect_right(a, x, lo=0, hi=None, *, key=None):
    if hi is None:
        hi = len(a)

# Also well known: Use a unique custom sentinel object
_USE_GLOBAL_DEFAULT = object()
def connect(timeout=_USE_GLOBAL_DEFAULT):
    if timeout is _USE_GLOBAL_DEFAULT:
        timeout = default_timeout

# Unusual: Accept star-args and then validate
def add_item(item, *optional_target):
    if not optional_target:
        target = []
    else:
        target = optional_target[0]

In each form, help(function) fails to show the true default value. Each one has additional problems, too; using None is only valid if None is not itself a plausible function parameter, the custom sentinel requires a global constant; and use of star-args implies that more than one argument could be given.

Specification

Function default arguments can be defined using the new => notation:

def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
def connect(timeout=>default_timeout):
def add_item(item, target=>[]):
def format_time(fmt, time_t=>time.time()):

The expression is saved in its source code form for the purpose of inspection, and bytecode to evaluate it is prepended to the function’s body.

Notably, the expression is evaluated in the function’s run-time scope, NOT the scope in which the function was defined (as are early-bound defaults). This allows the expression to refer to other arguments.

Multiple late-bound arguments are evaluated from left to right, and can refer to previously-defined values. Order is defined by the function, regardless of the order in which keyword arguments may be passed. Using names of later arguments should not be relied upon, and while this MAY work in some Python implementations, it should be considered dubious:

def prevref(word="foo", a=>len(word), b=>a//2): # Valid
def selfref(spam=>spam): # Highly likely to give an error
def spaminate(sausage=>eggs + 1, eggs=>sausage - 1): # Confusing, may fail
def frob(n=>len(items), items=[]): # May fail, may succeed

Choice of spelling

Our chief syntax proposal is name=>expression – our two syntax proposals … ahem. Amongst our potential syntaxes are:

# Preferred options: adorn the equals sign (approximate preference order)
def bisect(a, hi=>len(a)):
def bisect(a, hi=:len(a)):
def bisect(a, hi:=len(a)):
def bisect(a, hi?=len(a)):
def bisect(a, hi!=len(a)):
def bisect(a, hi=\len(a)):
def bisect(a, hi=@len(a)):
# Less preferred option: adorn the variable name
def bisect(a, @hi=len(a)):
# Less preferred option: adorn the expression
def bisect(a, hi=`len(a)`):

Since default arguments behave largely the same whether they’re early or late bound, the preferred syntax is very similar to the existing early-bind syntax. The alternatives offer little advantage over the preferred one.

How to Teach This

Early-bound default arguments should always be taught first, as they are the simpler and more efficient way to evaluate arguments. Building on them, late bound arguments are broadly equivalent to code at the top of the function:

def add_item(item, target=>[]):

# Equivalent pseudocode:
def add_item(item, target=<OPTIONAL>):
    if target was omitted: target = []

Interaction with other open PEPs

PEP 661 attempts to solve one of the same problems as this does. It seeks to improve the documentation of sentinel values in default arguments, where this proposal seeks to remove the need for sentinels in many common cases. PEP 661 is able to improve documentation in arbitrarily complicated functions (it cites traceback.print_exception as its primary motivation, which has two arguments which must both-or-neither be specified); on the other hand, many of the common cases would no longer need sentinels if the true default could be defined by the function. Additionally, dedicated sentinel objects can be used as dictionary lookup keys, where PEP 671 does not apply.

Open Issues

  • Annotations go before the default, so in all syntax options, it must be unambiguous (both to the human and the parser) whether this is an annotation, a default, or both. The worst offender is the := notation, as :int= would be a valid annotation and early-bound default.

Implementation details

The following relates to the reference implementation, and is not necessarily part of the specification.

Argument defaults (positional or keyword) have both their values, as already retained, and an extra piece of information. For positional arguments, the extras are stored in a tuple in __defaults_extra__, and for keyword-only, a dict in __kwdefaults_extra__. If this attribute is None, it is equivalent to having None for every argument default.

For each parameter with a late-bound default, the special value Ellipsis is stored as the value placeholder, and the corresponding extra information needs to be queried. If it is None, then the default is indeed the value Ellipsis; otherwise, it is a descriptive string and the true value is calculated as the function begins.

When a parameter with a late-bound default is omitted, the function will begin with the parameter unbound. The function begins by testing for each parameter with a late-bound default, and if unbound, evaluates the original expression.

Out-of-order variable references are permitted as long as the referent has a value from an argument or early-bound default.

Costs

When no late-bound argument defaults are used, the following costs should be all that are incurred:

  • Function objects require two additional pointers, which will be NULL
  • Compiling code and constructing functions have additional flag checks
  • Using Ellipsis as a default value will require run-time verification to see if late-bound defaults exist.

These costs are expected to be minimal (on 64-bit Linux, this increases all function objects from 152 bytes to 168), with virtually no run-time cost when late-bound defaults are not used.

Backward incompatibility

Where late-bound defaults are not used, behaviour should be identical. Care should be taken if Ellipsis is found, as it may not represent itself, but beyond that, tools should see existing code unchanged.

References

https://github.com/rosuav/cpython/tree/pep-671

Source: https://github.com/python/peps/blob/master/pep-0671.rst

Last modified: 2021-11-10 07:23:03 GMT