
Thursday, 17 March 2011

Parrot ops revamp in 3.2 (part 2)

Hi there.

In the previous post I briefly described Parrot's ops. For the past 10 years we "parsed" ops as just chunks of text, with almost no semantics behind them. And this approach worked for those 10 years. But life changed, and now we need more than this.



A couple of years ago we ripped the previous JIT implementation out of Parrot. The main reason for this move was that the JIT was a totally unsupported kludge. For each opcode we had to maintain a separate "JITTable definition", which is a hard job considering the limited number of people working on Parrot. And maintaining 2 different definitions for every single op is less than awesome. Long story short: the old JIT is dead.

And we (Parrot's developers) started discussing the next steps. A JIT is pretty much a "must have" in the modern VM world. Many ideas were discussed. Most of them were declared "total crap". Some had interesting bits. And this is how Lorito was born.

We started from the CISC-to-RISC idea: a minimal set of "RISC" ops that is sufficient for implementing all the other "CISC" ops. We named them L1 ops. After a few iterations we created a kind of "research project" and named it Lorito.
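To make the CISC-to-RISC idea a bit more concrete, here is a minimal sketch in Python of how one "big" op could be expanded into a handful of tiny ones. The micro-op names (copy, jump_if_nonzero, throw, mod) and the tuple encoding are invented for this illustration; they are not the real L1 or Lorito op set. The example mirrors the cmod op from src/dynoplibs/math.ops.

```python
# Hypothetical micro-ops, invented for illustration only.
# They are NOT the real L1/Lorito ops.

def run(program, regs):
    """Interpret a list of (op, *operands) tuples over a register dict."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "copy":                 # dst = src
            regs[args[0]] = regs[args[1]]
        elif op == "mod":                # dst = a % b
            # Python's % differs from C's for negative operands;
            # that difference is ignored in this sketch.
            regs[args[0]] = regs[args[1]] % regs[args[2]]
        elif op == "jump_if_nonzero":    # branch to an absolute index
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "throw":              # stand-in for raising an exception
            raise ZeroDivisionError(args[0])
        else:
            raise ValueError(f"unknown micro-op: {op}")
        pc += 1
    return regs

# The "CISC" op cmod(out INT, in INT, in INT) written as micro-ops.
CMOD = [
    ("copy", "den", "$3"),           # const INTVAL den = $3;
    ("jump_if_nonzero", "den", 3),   # skip the throw when den != 0
    ("throw", "Divide by zero"),     # the exception branch
    ("mod", "$1", "$2", "den"),      # $1 = $2 % den;
]

if __name__ == "__main__":
    print(run(CMOD, {"$1": 0, "$2": 7, "$3": 3})["$1"])
```

The point of the sketch is only that each micro-op does one trivial thing, so something like a JIT needs to understand a few primitives instead of every op body.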

One of the big tasks for Lorito (apart from many others) is the ability to translate the current op definitions into L1 ops. To do that we either have to create a new, small "ops" language or parse the existing one. Both approaches have pros and cons. For example, a new language can be very simple, easy to parse, etc. But it would require rewriting all existing ops in one big step. And I'm only talking about the core ops in Parrot. All HLLs would have to rewrite their dynops at the same time, which is quite a big effort. So I decided to try (just try) to implement a semi-complete C parser.

About a year ago cotto and I reimplemented the previous incarnation of the Ops Compiler in NQP. One of the challenges was bootstrapping. It is a chicken-and-egg problem: opsc is implemented in NQP, which is run by the Parrot VM, which requires compiled ops, which are compiled by opsc. This problem was solved by committing the C code generated from the ops into the source repository.

So I had all the basic building blocks to start parsing ops properly: NQP with its awesome Perl 6 grammars, opsc for parsing the basic structure of ops, bootstrapping, etc.

A few weeks later it was done. Sometimes it was actually fun. Sometimes I was totally bored. A "C grammar" isn't the most interesting thing in the world. After implementing full semantic parsing I implemented emitting C from the AST. And then a pretty-printer. And then I fixed a few bugs within the parser, just because the pretty-printer clearly showed that parsing was sometimes wrong.
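Here is a small sketch of why emitting C back from the AST is such a good parser test. It is written in Python rather than NQP, and the node classes (Reg, Name, BinOp) are hypothetical stand-ins, not the PAST classes that opsc really uses. The emitter parenthesizes every binary operation, so a mis-parsed expression becomes visible as soon as it is printed.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical node types for illustration; opsc itself builds PAST nodes.

@dataclass
class Reg:
    """An op argument such as $1."""
    num: int

@dataclass
class Name:
    """A plain C identifier such as den."""
    ident: str

@dataclass
class BinOp:
    """A binary operation: left <op> right."""
    op: str
    left: "Node"
    right: "Node"

Node = Union[Reg, Name, BinOp]

def emit(node: Node) -> str:
    """Emit C-like text, fully parenthesized so tree shape is visible."""
    if isinstance(node, Reg):
        return f"${node.num}"
    if isinstance(node, Name):
        return node.ident
    if isinstance(node, BinOp):
        return f"({emit(node.left)} {node.op} {emit(node.right)})"
    raise TypeError(f"unexpected node: {node!r}")

# The tree a correct parser should build for: $1 = $2 % den;
good = BinOp("=", Reg(1), BinOp("%", Reg(2), Name("den")))

# The tree a parser with broken precedence might build instead.
bad = BinOp("%", BinOp("=", Reg(1), Reg(2)), Name("den"))

if __name__ == "__main__":
    print(emit(good))
    print(emit(bad))
```

Both trees come from the same source text, but the printed forms differ in where the parentheses land, and that is exactly the kind of mistake that is hard to see in a raw tree dump and easy to see in emitted code.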

And now we have a clear road ahead for implementing the emitting of L1 ops (since renamed to M0) from the existing ops.

And just for fun, some examples. We parse this op (which happens to be in src/dynoplibs/math.ops)

inline op cmod(out INT, in INT, in INT) :base_core {
    const INTVAL den = $3;
    if ($3 == 0) {
        opcode_t * const handler = Parrot_ex_throw_from_op_args(interp, expr NEXT(),
            EXCEPTION_DIV_BY_ZERO,
            "Divide by zero");
        goto ADDRESS(handler);
    }
    $1 = $2 % den;
}


with "./parrot runtime/parrot/library/opsc.pbc --target=past m.ops". "--target=past" emits PAST, the Parrot Abstract Syntax Tree:

"past" => PMC 'PAST;Stmts'  {
    <ops> => PMC 'PAST;Stmts'  {
        <pos> => 0
        [0] => PMC 'Ops;Op'  {
            <arg_types> => ResizablePMCArray (size:3) [
                "i",
                "i",
                "i"
            ]
            <args> => ResizablePMCArray (size:3) [
                PMC 'PAST;Var'  {
                    <direction> => "out"
                    <isdecl> => 1
                    <pos> => 15
                    <source> => \past
                    <type> => "INT"
                },
            ...
            [0] => PMC 'PAST;Block'  {
                <pos> => 51
                [0] => PMC 'PAST;Var'  {
                    <isdecl> => 1
                    <name> => "den"
                    <pointer> => ""
                    <pos> => 70
                    <vivibase> => "const INTVAL "
                    <viviself> => PMC 'PAST;Var'  {
                        <name> => 3
                        <pos> => 76
                        <scope> => "register"
                    }
                }
                [1] => PMC 'PAST;Op'  {
                    <pasttype> => "if"
                    [0] => PMC 'PAST;Op'  {
                        <name> => "&infix:<==>"
                        <pirop> => "=="
                        <pos> => 91
                        [0] => PMC 'PAST;Var'  {
                            <name> => 3
                            <pos> => 88
                            <scope> => "register"
                        }
                        [1] => PMC 'PAST;Val'  {
                            <returns> => "int"
                            <value> => "0"
                        }
                    }
                    [1] => PMC 'PAST;Block'  {
                        ...
                    }
                }
                [2] => PMC 'PAST;Op'  {
                    <name> => "&infix:<=>"
                    <pirop> => "="
                    [0] => PMC 'PAST;Var'  {
                        <name> => 1
                        <scope> => "register"
                    }
                    [1] => PMC 'PAST;Op'  {
                        <name> => "&infix:<%>"
                        <pirop> => "%"
                        [0] => PMC 'PAST;Var'  {
                            <name> => 2
                            <scope> => "register"
                        }
                        [1] => PMC 'PAST;Var'  {
                            <name> => "den"
                        }
                    }
                }
                [3] => PMC 'PAST;Op'  {
                    <name> => "goto_offset"
                    <pasttype> => "macro"
                    [0] => PMC 'PAST;Val'  {
                        <returns> => "int"
                        <value> => 4
                    }
                }
            }
        }
    }
}


(I omitted a few bits for "TL;DR" purposes)