top | item 40502935

JulianChastain|1 year ago

Yes, I believe this is due to how Python lets you dynamically redefine functions, even to violate lexical scoping. So in the author's example:

  def f(x):
      if x > 0:
          return f(x-1)
      return 0
  # Here is where a compiler might assume that f(x) is tail recursive, and so do TCE
  g = f
  def f(x):
      return x
  g(5)

So under Python's defined semantics, g(5) returns 4: it calls the original f, but the first recursive call goes to the new f. If f were TCE'd, it would return 0, since the actual recursion would be eliminated and it wouldn't matter that you've reassigned f to a new function.
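For the skeptical, the example above runs as-is; here it is as a self-contained script confirming the late-binding behavior:

```python
def f(x):
    if x > 0:
        return f(x - 1)  # the name `f` is resolved at call time, not definition time
    return 0

g = f  # keep a handle on the original function object

def f(x):  # rebind the global name `f`
    return x

# g is the original function, but its body's call to `f` now finds the
# replacement, so only one "recursive" step happens.
print(g(5))  # 4
```

If CPython froze the binding at definition time, this would print 0 instead.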

OskarS|1 year ago

I find that argument very unpersuasive: the optimization is tail CALL elimination, not tail RECURSION elimination, and the call to f(x-1) is certainly a tail call. I see no reason you couldn't optimize away that stack frame. "Tail recursion" is a functional-programming technique that depends crucially on tail call elimination, but the tail call itself absolutely does not have to be recursive (it just has to be in tail position); it merely happens to be recursive in a tail-recursive algorithm.
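To illustrate the distinction, here's a hypothetical pair of functions (names made up for the example) where the tail call isn't recursive at all — exactly the kind of frame TCE could discard:

```python
def process(x):
    return x * 2

def validate(x):
    if x < 0:
        raise ValueError("negative input")
    # Not recursion, but still a tail call: this is the last thing
    # validate does, so its frame could be dropped before process runs.
    return process(x)

print(validate(21))  # 42
```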

Incidentally: I don't believe that is a violation of lexical scoping. f is scoped globally; inside its own body f is not defined, so f(x-1) refers to the globally scoped f. When it's reassigned, you reassign the global f. Nothing here violates lexical scoping, since f is never scoped anywhere other than globally (note that the blog post never claims it is, either).

I don't disagree with Guido's larger point, though: Python probably shouldn't use TCO. The debugging/stack-trace point is very true, and it's just not Pythonic to write algorithms in TCO style. You just don't need to, in Python.

wk_end|1 year ago

It’s somewhat unclear, but in this case Guido is responding directly to a particular - broken - attempt to add TCO for self-tail calls to CPython in the fashion described. I assume, having not read the linked article, that the author must have done that as a quick hack - because CPython’s bytecode allows arbitrary jumping within a function body but doesn’t have the means to express a true tail call. I guess.

He goes on, later in the post, to describe how he’d go about adding tail calls - although this treatment is also confused.

JulianChastain|1 year ago

Well I mean the reason you can't optimize away the stack frame is that at any point during runtime f can be redefined to be a different function. In the above example, do you agree that any tail call elimination would result in the wrong evaluation of `g(5) = 0` instead of the correct `g(5) = 4`?
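To make that concrete, here's a sketch (my naming) of what the original f would effectively compute if a compiler rewrote the self-call into a jump at definition time, freezing the binding of f:

```python
def f_naively_tce(x):
    # What f becomes if the self-call f(x-1) is turned into a jump
    # back to the top of the function, ignoring later rebinding of `f`.
    while x > 0:
        x = x - 1
    return 0

print(f_naively_tce(5))  # 0, not the 4 that CPython's semantics require
```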

And your understanding is that Python doesn't permit TCE because functions are globally scoped with indefinite extent?

Jtsummers|1 year ago

What amuses me is that he goes on to explain how you'd retain the existing Python behavior and still do a tail call optimization. You have to retain the late binding by not converting the call-return into a jump but instead into a special new call_return which still performs the lookup (to satisfy late binding) and can reuse the stack frame instead of generating a new one.
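You can approximate that idea in pure Python with a trampoline — not the bytecode-level call_return described above, but it shows that late binding and a bounded stack can coexist (function names here are invented for the sketch):

```python
def countdown(x):
    if x > 0:
        # Return a thunk instead of calling directly; `countdown` is
        # looked up again only when the thunk runs, so rebinding the
        # name mid-run is still observed (late binding is preserved).
        return lambda: countdown(x - 1)
    return 0

def trampoline(result):
    # A constant-depth loop stands in for the reused stack frame.
    while callable(result):
        result = result()
    return result

print(trampoline(countdown(10 ** 5)))  # 0, with no RecursionError
```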

This would work for every instance of call-return pairs except in try statements and similar (anything that has whatever Python calls unwind-protect), since in those cases the except/finally/else of a try is code that may be executed (or, for finally, is always executed) after the call but before the return.

So it really does come down to a desire to prevent the use of deep call stacks (auto-recursion is just one case that would have been optimized here) rather than a true technical limitation. This could even have been brought in incrementally over the years and provided as an option.

bhickey|1 year ago

> dynamically redefine functions even to violate lexical scoping

When would someone want to do this?

JulianChastain|1 year ago

Let's say you had a library function foo, which takes no arguments, does an expensive computation, and then prints a single result to the console. You need the function to instead return the value, so that you can do more computation on it. You could write a wrapper function that will call foo, but replace the print function with one that records the printed result.

For example:

  def foo():
      print('usually this value is inaccessible from "python land"')
  def extract_printed_values(bar):
      global print
      # reading print here falls through to the builtin, since no global exists yet
      old_print, returnv = print, None
      def new_print(x):
          nonlocal returnv
          # record the value, then print it as usual
          old_print(returnv := x)
      print = new_print  # shadow the builtin with the recording version
      bar()
      print = old_print  # restore normal printing
      return returnv
  fooval = extract_printed_values(foo)

fooval will then be equal to whatever value foo printed to the console; otherwise foo behaves exactly as normal (assuming it only printed a single value), and print even behaves normally afterwards.
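One caveat with that sketch: if bar() raises, print stays patched. A variant (hypothetical names, my own) that writes through the builtins module — where print actually lives — and restores it in a finally block avoids that:

```python
import builtins

def extract_printed_values_safely(bar):
    old_print, captured = builtins.print, None
    def recording_print(x):
        nonlocal captured
        captured = x       # record the value
        old_print(x)       # then print it as usual
    builtins.print = recording_print
    try:
        bar()
    finally:
        builtins.print = old_print  # restored even if bar() raises
    return captured
```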