Tail Recursion (C)

What It Is

This is actually pretty logical if you think about it for a second, but I never really did. As we all know, function calls are pricy because they require context-switches. A function that uses recursion is inheritly quite limited because the stack will quickly run out of space to store all the function calls. Enter tail recursion: when the last operation in a function is another function call, there is no need to store the current context. Suppose function a calls function b, which in turn only calls function c. Instead of making c return to b when it’s done, it might just as well return directly to a. During recursion this can save a lot of time if the function calls itself very often, even making it at all possible to recurse a nearly infinite amount of times (at least you will not be bound by the stack size anymore).

So, if you write a time-critical function that uses recursion, try to make the recurring call the last thing the function does. Compiler optimization should take care of the rest.

Example: recurse.c

#include <limits.h>
#include <stdio.h>

volatile int sink = INT_MAX;

void recurse(void) {
    if (--sink) recurse();
}

int main() {
    recurse();
    printf("Success\n");
    return 0;
}
$ gcc recurse.c -o recurse_normal
$ gcc recurse.c -O3 -o recurse_optimized
$ ./recurse_normal
zsh: segmentation fault ./recurse_normal
$ ./recurse_optimized
Success

You can see this in the assembly output (GCC’s -S flag):

recurse:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L6      @@ r3 := &sink
.L2:
        ldr     r2, [r3, #0] @@ r2 := *r3
        sub     r2, r2, #1   @@ r2 := r2 - 1
        str     r2, [r3, #0] @@ *r3 := r2
        ldr     r2, [r3, #0] @@ r2 := *r3
        cmp     r2, #0       @@ r2 = 0 ?
        bne     .L2          @@ no: goto .L2
        bx      lr           @@ yes: goto *lr

Single-@ comments are GCC’s, double-@ are mine. As you can see, GCC explicitly mentions that it enabled tail recursion (“link register save eliminated”). The critical point here is the next-to-last line. There, instead of doing a full branch-load instruction (bl) which saves the current instruction pointer in the return register, a simple branch instruction is made that saves nothing. This is what you would expect from a normal loop, and it explains is why there is no stack overflow.



Or leave a comment below: