
Saturday 19 March 2011

Parrot-on-Parrot or "Crazy Idea For JIT Prototype"

Aloha!


This is just a braindump of an idea for prototyping a JIT in Parrot.

As I described earlier, we now have properly parsed ops and half-baked LLVM bindings. Based on these, I can implement a JIT prototype for Parrot.



Execution of ops


A little bit of theory about how ops are executed inside Parrot.

In a nutshell, execution can be expressed in the following pseudo-code:

int main() {
    /* Create interp */
    Parrot_Interp *interp = Parrot_interp_create();

    /* Load bytecode */
    PackFile *pf = Parrot_pbc_load_bytecode(interp, "file.pbc");

    /* Setup ProgramCounter */
    opcode_t *pc = find_main_sub();

    /* Execute ops */
    while (pc) {
        pc = interp->code->op_func_table[*pc](pc, interp);
    }

    return 0;
}

Every single op is a function similar to:

opcode_t *
Parrot_add_i_i_ic(opcode_t *pc, Parrot_Interp *interp) {
    IREG(1) = IREG(2) + ICONST(3);
    return pc + 4; /* 'goto' next op. 4 is size of "add" */
}

IREG and ICONST (and the similar macros for S, N and P) access INTVAL registers and constants. The number in parentheses is the offset into the bytecode from the start of the op. For registers, the bytecode stores the register number. For constants, it stores the constant's index. The exception is an INTVAL constant, where the bytecode stores the constant itself.

For example, if for add_i_i_ic we have the bytecode 0x01, 0x07, 0x03, 0x2a, then it is decoded as:
  • 0x01 — identifier of opcode. Offset inside interp->code->op_func_table
  • 0x07 — store result into register number 7.
  • 0x03 — first argument in register 3.
  • 0x2a — second argument is INTVAL constant 42.
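
To make the macro expansion concrete, here is a minimal self-contained sketch in C that runs the example bytecode through one op. The ToyInterp struct, the register file size, the toy_ names and the macro definitions are all assumptions made for illustration; Parrot's real macros and interpreter layout are more involved.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;    /* toy stand-in for Parrot's INTVAL */
typedef long opcode_t;  /* toy stand-in: one bytecode cell  */

/* Hypothetical interpreter with only an integer register file. */
typedef struct ToyInterp {
    INTVAL int_regs[16];
} ToyInterp;

/* Assumed expansions: the operand at offset i is either a register
   number (IREG) or the INTVAL constant itself (ICONST). */
#define IREG(i)   (interp->int_regs[pc[i]])
#define ICONST(i) ((INTVAL)pc[i])

static opcode_t *
toy_add_i_i_ic(opcode_t *pc, ToyInterp *interp) {
    IREG(1) = IREG(2) + ICONST(3);
    return pc + 4; /* 'goto' next op */
}

/* Runs the example bytecode once with register 3 preset to r3
   and returns the value left in register 7. */
static INTVAL
toy_run_example(INTVAL r3) {
    opcode_t   code[] = { 0x01, 0x07, 0x03, 0x2a };
    ToyInterp  interp = { { 0 } };
    opcode_t  *next;

    interp.int_regs[3] = r3;
    next = toy_add_i_i_ic(code, &interp);
    assert(next == code + 4); /* the op hands back the next PC */
    return interp.int_regs[7];
}
```

With register 3 holding 8, toy_run_example(8) leaves 8 + 42 = 50 in register 7.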

Because of these conventions and the CPS nature of the Parrot VM, the main execution loop is quite simple: just execute ops while the PC is not null.

There is only one small problem with it: all of this is low-level C stuff which is not directly available to HLLs implemented on top of Parrot. And I really, really don't want to use C as a prototyping language.

As usual, there is good news. We can manipulate PBC files using the Packfile* PMCs, including create/update/store and, most importantly, load. And based on this I came up with...
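
The loop can be tried out on a toy op table. The sketch below is an assumption-laden miniature, not Parrot code: ToyInterp, the three-entry table and the toy_if_i_ic branch op (loosely modeled on a conditional branch with an offset relative to the start of the op) are made up for illustration. It shows why the loop needs no special cases: a branch is just an op that returns a different PC, and "end" is an op that returns NULL.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;
typedef long opcode_t;

struct ToyInterp;
typedef opcode_t *(*op_func_t)(opcode_t *pc, struct ToyInterp *interp);

/* Hypothetical interpreter: integer registers plus the op table. */
typedef struct ToyInterp {
    INTVAL           int_regs[16];
    const op_func_t *op_func_table;
} ToyInterp;

#define IREG(i)   (interp->int_regs[pc[i]])
#define ICONST(i) ((INTVAL)pc[i])

/* end: returning NULL stops the main loop. */
static opcode_t *toy_end(opcode_t *pc, ToyInterp *interp) {
    (void)pc; (void)interp;
    return NULL;
}

static opcode_t *toy_add_i_i_ic(opcode_t *pc, ToyInterp *interp) {
    IREG(1) = IREG(2) + ICONST(3);
    return pc + 4;
}

/* Toy branch: jump by a relative offset while the register is non-zero. */
static opcode_t *toy_if_i_ic(opcode_t *pc, ToyInterp *interp) {
    return IREG(1) != 0 ? pc + ICONST(2) : pc + 3;
}

static const op_func_t toy_ops[] = { toy_end, toy_add_i_i_ic, toy_if_i_ic };

/* Counts register 0 down from start to zero, incrementing register 1
   on every pass, and returns register 1. Requires start >= 1. */
static INTVAL toy_run(INTVAL start) {
    opcode_t code[] = {
        1, 1, 1, 1,   /*  0: add I1, I1, 1    */
        1, 0, 0, -1,  /*  4: add I0, I0, -1   */
        2, 0, -8,     /*  8: if I0 goto 0     */
        0             /* 11: end              */
    };
    ToyInterp interp = { { 0 }, toy_ops };
    opcode_t *pc = code;

    assert(start >= 1);
    interp.int_regs[0] = start;
    while (pc) {
        pc = interp.op_func_table[*pc](pc, &interp);
    }
    return interp.int_regs[1];
}
```

Calling toy_run(3) goes around the loop three times and returns 3.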

Crazy Idea


Let's emulate a runcore in NQP. Or PIR. Or Winxed. Or whatever.

What I need is:
  • Ability to load and introspect bytecode files: "Packfile* PMCs". Tick.
  • Semantically parsed ops, to generate something different from the existing C ops at runtime: "opsc revamp". Tick.
  • Some way of generating native code at runtime: "LLVM bindings". Tick.
Looks like all the prerequisites are met. Of course, there are some missing parts. For example, opsc doesn't parse C macros (yet). And the LLVM bindings aren't finished. And the Packfile PMCs aren't well tested and will probably change in the future. And the phase of the Moon is totally wrong. Yada-yada-yada. Based on this I decided to do nothing and wait forever until everything is finished and a bright future is just around the corner. Thinking... Thinking... Thinking... Adding -Ofun to the prerequisites... Hey! Who cares? I can start prototyping right now and fix everything else as I go.

Prototyping approach

  1. Create a small subset of ops for jitting purposes. The two main reasons for this are:
    • handling of C macros in opsc isn't implemented;
    • opsc is really slow: parsing the "core" ops takes about a minute on my box.
  2. Extend opsc to emit LLVM bitcode from a parsed op body and PBC. The PBC is required to "expand" the register and constant access macros. And this is much closer to a real JIT.
  3. Implement something like parrot.nqp which will:
    • Load a pre-generated PBC file.
    • Parse jitted.ops.
    • Emulate the runcore main loop, "JITting" every op one by one.
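
The role of the PBC in step 2 can be illustrated without LLVM. The sketch below is only a stand-in under stated assumptions: instead of emitting bitcode, it "compiles" each op into a small struct whose operands are already resolved from the bytecode, which is the same expansion of IREG and ICONST that a real JIT would bake into generated code. ToyInterp, ToyCompiled and the toy_ functions are hypothetical names, and only add_i_i_ic is supported.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;
typedef long opcode_t;

typedef struct ToyInterp {
    INTVAL int_regs[16];
} ToyInterp;

/* One "compiled" add_i_i_ic: operands resolved ahead of execution. */
typedef struct ToyCompiled {
    INTVAL *dst; /* IREG(1) expanded to a register address */
    INTVAL *src; /* IREG(2) expanded to a register address */
    INTVAL  imm; /* ICONST(3) expanded to its value        */
} ToyCompiled;

/* "JIT" step: read the operands from the bytecode once. */
static ToyCompiled
toy_compile_add(const opcode_t *pc, ToyInterp *interp) {
    ToyCompiled c;
    c.dst = &interp->int_regs[pc[1]];
    c.src = &interp->int_regs[pc[2]];
    c.imm = (INTVAL)pc[3];
    return c;
}

/* Execution step: no bytecode access left. */
static void
toy_exec_add(const ToyCompiled *c) {
    *c->dst = *c->src + c->imm;
}

/* Emulated runcore: compile and run every op one by one.
   Opcode 0 is "end", opcode 1 is add_i_i_ic (toy numbering). */
static INTVAL
toy_jit_run(INTVAL r3) {
    const opcode_t  code[] = { 1, 7, 3, 0x2a,  1, 7, 7, 1,  0 };
    ToyInterp       interp = { { 0 } };
    const opcode_t *pc     = code;
    ToyCompiled     c;

    interp.int_regs[3] = r3;
    while (*pc != 0) {
        assert(*pc == 1); /* the jitted subset has a single op */
        c = toy_compile_add(pc, &interp);
        toy_exec_add(&c);
        pc += 4;
    }
    return interp.int_regs[7];
}
```

For r3 = 8 the first op stores 8 + 42 = 50 in register 7 and the second adds 1, so toy_jit_run(8) returns 51. A real prototype would replace ToyCompiled with code generated through the LLVM bindings.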
After this prototype is done I can extend it to:
  • JIT a whole Sub.
  • Handle local gotos.
  • Handle invokecc and other invocations.

But first things first: am I crazy enough for this idea to actually work?