llvm2scratch - C to Scratch compiler

-Zn

I found another method for branching which doesn't require all downstream functions that call a function with a loop to return to an address and is slightly faster by unwinding the stack to a ‘jump table’ when resetting the call stack.

define caller
callee
repeat until <(Branch ID) = (0)>
if <(Branch ID) = (1)> then // binary search

  branch 1

else
  ... // other branches
end
end
... // code after func call

define callee
increment counter :: control
branch 1

define branch 1
increment counter :: control
if <(counter :: control reporter) > [1000000]> then
clear counter :: control
set [Branch ID v] to [1]
stop [this script v]
end
branch 1

I probably won't implement this tho because I'm fine with the current solution (which uses a broadcast to reset the stack) and this method has some quirks that make it annoying to implement. Another issue is that it requires functions both upstream and downstream of a loop to have duplicate return to address code (because the counter can only be used once) and use the branch ID system, or use a slower branching method.

-Zn

MonkeyBean2 wrote:
It looks like you could do a bit more optimization though, as the example code sets variables that are only used once directly after (so you could remove that and collapse it into one expression).

Spent a few days implementing this and it works!

Code:

#include <stdio.h>
 
static int a = 7;
 
int add_one(int num) {
  return num + 1;
}
 
int test_branch(int num) {
  int a = 3;
 
  if (num != 1) a = 50;
 
  return a + num;
}
 
int main(void) {
  puts("hello world");
  a += 2;
  a -= 4;
  a *= 2;
  a /= -3;
  a = -340;
  a %= -60;
 
  a = 31;
  a <<= a;
  a >>= 3;
 
  unsigned int b = 3204;
  b >>= 2;
  b ^= 113;
  b |= 1546;
  b &= 393;
 
  a = add_one(a);
 
  int c = test_branch(2);
 
  for (unsigned char d = 65; d != 70; d++) {
    putchar(d);
  }
 
  unsigned int e = 46;
  switch (e) {
    case 0:
      puts("0");
      break;
    case 1:
      puts("1");
      break;
    case 20:
      puts("20");
      break;
    case 21:
      puts("21");
      break;
    default:
      puts("default");
      break;
  }
 
  return 0;
}

Resultant scratch code (optimized and unoptimized versions):
https://scratch.mit.edu/projects/1208872099/

Essentially, I work out which variables are modified where and only elide them if a dependency is not changed (+ a LOT of edge cases). I also work out the computational cost to calculate the value and how many times it is used to work out if it's actually less expensive to do so (e.g. a complex value or used multiple times in a loop).

bobbybee

Super cool. I think you want to call Mem2Reg (?) on the input LLVM code to elide most of the stack garbage, that should speed up a lot too.

-Zn

UPDATE: recursion and stack deallocation are now supported

For stack deallocation, I try to work out if a function allocates a known amount overall by finding unavoidable branches (nodes). If it is known, I increase the stack size variable at the start of function by the same amount I decrease it on return. If the amount is unknown, I store the previous stack size and restore it on return. If it doesn't call any functions which allocate on the stack, then I don't increase the stack size and treat the offsets from the stack differently (the stack size is only really needed to indicate the offset to the next function). I also gave the stack size variable some special optimization properties because every function call will return the stack size variable back to it's original value (expect branches but I never run any code after a branch).

With recursion, I either use custom block parameters or a list (similar to the regular stack) to store local variables in functions to be accessed after the function call.

Example:
https://scratch.mit.edu/projects/1211169662/

Source Code:

#include <stdio.h>
 
int factorial_recurse(int n) {
  if (n == 1) return 1;
  return factorial_recurse(n - 1) * n;
}
 
int sum_to_one_digit(unsigned int n) {
  unsigned int sum = 0;
 
  while (n > 0) {
    sum += n % 10;
    n /= 10;
  }
 
  if (sum >= 10) return sum_to_one_digit(sum);
  return sum;
}
 
int main(void) {
  puts("hello world");
 
  int f = factorial_recurse(10);
  int g = sum_to_one_digit(473);
 
  return 0;
}

-Zn

UPDATE (been a while lol):

I rewrote the code to my own wrapper to a different library (because the old one couldn't provide enough information)
I added support for some instructions used in optimizations (phi, select, etc)
Most recently I now support the getelementptr instructions, allowing some array support.

Example:
https://scratch.mit.edu/projects/1226122280/

Source:

#include <stdio.h>
 
static char str[] = "hello world";
 
void numberize(char* str) {
  for (int i = 0; str[i] != '\0'; i++) {
    switch (str[i]) {
      case 'a':
        str[i] = '4';
        break;
      case 'e':
        str[i] = '3';
        break;
      case 'l':
        str[i] = '1';
        break;
      case 'o':
        str[i] = '0';
        break;
    }
  }
}
 
int main(void) {
  numberize(str);
  puts(str);
 
  return 0;
}

Resultant IR (also scratch supports llvm ir highlighting what):

@str = internal global [12 x i8] c"hello world\00"
 
define void @numberize(i8* %0) {
  br label %2
2:
  %3 = phi i32 [ 0, %1 ], [ %16, %15 ]
  %4 = getelementptr inbounds i8, i8* %0, i32 %3
  %5 = load i8, i8* %4
  %6 = icmp eq i8 %5, 0
  br i1 %6, label %7, label %8
7: 
  ret void
8:    
  %9 = zext i8 %5 to i32
  switch i32 %9, label %15 [
    i32 97, label %13
    i32 101, label %10
    i32 108, label %11
    i32 111, label %12
  ]
10:
  br label %13
 
11: 
  br label %13
 
12:
  br label %13
 
13:
  %14 = phi i8 [ 51, %10 ], [ 49, %11 ], [ 48, %12 ], [ 52, %8 ]
  store i8 %14, i8* %4
  ; branch to %15
 
  ; label %15
  %16 = add i32 %3, 1
  ; branch to %2
}
 
define i32 @main() {
  call void @numberize(i8* getelementptr ([12 x i8], [12 x i8]* @str, i32 0, i32 0))
  %1 = call i32 @puts(i8* getelementptr ([12 x i8], [12 x i8]* @str, i32 0, i32 0))
  ret i32 0
}

Last edited by -Zn (Oct. 7, 2025 18:55:09)

FurryR

I noticed that variables in LLVM IR are immutable. In Scratch this would create A LOT OF variables. Please consider reuse variables to create a (at least) maintainable project.

-Zn

FurryR wrote:
I noticed that variables in LLVM IR are immutable. In Scratch this would create A LOT OF variables. Please consider reuse variables to create a (at least) maintainable project.

All the variables are local to the sprite so it shouldn't affect other sprites and writing/calling code inside the sprite containing the code is not preferable anyway because you'd have to modify the sprite each time you modified the source code.

If you did want to interact with the functions I'd probably add some kind of system where you call functions via a broadcast defined in some C code or similar (and a C api for scratch blocks).

That said, I am planning to find which functions can never be ran at the same time and re-use identifiers between them.

-Zn

UPDATE:

Full getelementptr support, meaning structs and arrays are now complete (except in aggregate form)
Floating point numbers and arithmetic are also finished
All comparison types are now supported
All conversion instructions are supported
The memcpy intrinsic is supported

Since most instructions and types are supported, next I'll focus on

Aggregate struct/array types
>48 bit integer arithmetic

EXAMPLE (pi calculator):

https://scratch.mit.edu/projects/1233764273/

#include "sb3api.h"
 
double pi = 0.0;
double term;
int sign = 1;
unsigned int i = 0U;
 
void __attribute__((noinline)) calc_pi_step(void) {
  term = 1.0 / (2.0 * i + 1.0);
  pi += sign * term;
  sign = -sign;
  ++i;
}
 
int main(void) {
  const unsigned int print_interval = 10000U;
 
  for (;;) {
    calc_pi_step();
 
    if (i % print_interval == 0U) {
      SB3_say_dbl(100000000000 * 4.0 * pi);
    }
  }
}

Discuss Scratch