Tail call optimization? #288
-
Does Codon support tail recursion optimization? If not, is this a feature planned for the future? Old-fashioned Python does not perform this optimization, which discourages educators and students from exploring recursion once they see how something like a naive version of Fibonacci fares in traditional Python. If Codon does this correctly, I will be eager to show it to my students.
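For reference, a minimal sketch of the behavior described above, assuming CPython's default recursion limit of roughly 1000 frames (the `countdown` function is purely illustrative):

```python
def countdown(n: int) -> int:
    # The recursive call is in tail position, but CPython still pushes
    # a new stack frame for every call instead of reusing the current one.
    return n if n == 0 else countdown(n - 1)

print(countdown(100_000))  # raises RecursionError under CPython's default limit
```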
Replies: 3 comments
-
Hi @mlmiller751 -- yes, Codon will do this through LLVM, which has tail call optimization as part of its standard optimization passes. For example:

```python
def factorial(n: int):
    return 1 if n <= 1 else n*factorial(n - 1)
```

compiles to:

```llvm
; Function Attrs: nofree nosync nounwind readnone
define i64 @factorial(i64 %0) local_unnamed_addr #1 personality ptr @seq_personality {
entry:
%tmp.i5 = icmp sgt i64 %0, 1
br i1 %tmp.i5, label %ternary.false.preheader, label %ternary.exit
ternary.false.preheader: ; preds = %entry
%1 = add nsw i64 %0, -1
%min.iters.check = icmp eq i64 %0, 2
br i1 %min.iters.check, label %ternary.false.preheader17, label %vector.ph
vector.ph: ; preds = %ternary.false.preheader
%n.vec = and i64 %1, -2
br label %vector.body
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%vec.phi = phi i64 [ 1, %vector.ph ], [ %2, %vector.body ]
%vec.phi8 = phi i64 [ 1, %vector.ph ], [ %3, %vector.body ]
%offset.idx = sub i64 %0, %index
%induction9 = add i64 %offset.idx, -1
%2 = mul i64 %offset.idx, %vec.phi
%3 = mul i64 %induction9, %vec.phi8
%index.next = add nuw i64 %index, 2
%4 = icmp eq i64 %index.next, %n.vec
br i1 %4, label %middle.block, label %vector.body, !llvm.loop !2
middle.block: ; preds = %vector.body
%ind.end = sub nsw i64 %0, %n.vec
%bin.rdx = mul i64 %3, %2
%cmp.n = icmp eq i64 %1, %n.vec
br i1 %cmp.n, label %ternary.exit, label %ternary.false.preheader17
ternary.false.preheader17: ; preds = %ternary.false.preheader, %middle.block
%.tr7.ph = phi i64 [ %ind.end, %middle.block ], [ 2, %ternary.false.preheader ]
%accumulator.tr6.ph = phi i64 [ %bin.rdx, %middle.block ], [ 1, %ternary.false.preheader ]
br label %ternary.false
ternary.false: ; preds = %ternary.false.preheader17, %ternary.false
%.tr7 = phi i64 [ %tmp.i3, %ternary.false ], [ %.tr7.ph, %ternary.false.preheader17 ]
%accumulator.tr6 = phi i64 [ %tmp.i4, %ternary.false ], [ %accumulator.tr6.ph, %ternary.false.preheader17 ]
%tmp.i3 = add nsw i64 %.tr7, -1
%tmp.i4 = mul i64 %accumulator.tr6, %.tr7
%tmp.i = icmp ugt i64 %.tr7, 2
br i1 %tmp.i, label %ternary.false, label %ternary.exit, !llvm.loop !4
ternary.exit: ; preds = %ternary.false, %middle.block, %entry
%accumulator.tr.lcssa = phi i64 [ 1, %entry ], [ %bin.rdx, %middle.block ], [ %tmp.i4, %ternary.false ]
ret i64 %accumulator.tr.lcssa
}
```

i.e. there's no recursive call to `factorial`.
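To spell out what happened here: `n*factorial(n - 1)` is not syntactically a tail call (the multiplication runs after the recursive call returns), but LLVM's tail-call elimination pass can introduce an accumulator (the `%accumulator.tr6` value in the IR above) and then rewrite the recursion as the loop shown. A hand-written equivalent of what the optimizer effectively produces might look like the following sketch; the `factorial_acc` name and the `acc` parameter are illustrative, not anything Codon generates:

```python
def factorial_acc(n: int, acc: int = 1) -> int:
    # The recursive call is now in tail position: the multiplication
    # happens before the call, so no work is left after it returns.
    return acc if n <= 1 else factorial_acc(n - 1, acc * n)
```

If your Codon version supports emitting LLVM IR (the docs describe a `-llvm` flag for `codon build`), you should be able to reproduce output like the above with something along the lines of `codon build -release -llvm factorial.py`.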
-
Awesome! Thank you!
-
My understanding is that LLVM's support for TCO is somewhat limited, and is generally referred to as "sibling call optimization" since it only supports a strict subset of all possible TCO use cases.
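To illustrate the kind of limitation being referred to (a hedged sketch, with hypothetical `is_even`/`is_odd` functions): LLVM's tailcallelim pass rewrites self-recursive tail calls into loops within a single function, but a tail call to a different function (a "sibling" call) only becomes a jump when the backend's conditions are met (matching calling conventions, compatible signatures, etc.), so deep mutual recursion is not guaranteed to run in constant stack space:

```python
def is_even(n: int) -> bool:
    # Tail call to a *different* function: this cannot be rewritten into
    # a loop within is_even alone, so it depends on backend sibling-call
    # optimization rather than tailcallelim.
    return True if n == 0 else is_odd(n - 1)

def is_odd(n: int) -> bool:
    return False if n == 0 else is_even(n - 1)

print(is_even(10))
```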