Hi there!
Preamble
A few days ago I posted a note about LLVM bindings in Parrot. The work lives in my experimental branch opsc_llvm. When I started this branch it was purely my own playground for LLVM/opsc/JIT/etc., and I had no idea what I could get out of it, especially because I had no previous experience with either LLVM or Parrot's NCI (Native Call Interface). But now I am close to something really useful: Fully Functional LLVM Bindings.
<assumption>You know how to check out a particular branch in Parrot's git repository. All files referenced below are from this branch.</assumption>
LLVM quick intro
LLVM is really cool. It stands for Low Level Virtual Machine and provides a lot of functionality. I was most interested in runtime code generation and optimizations, simply because I had never tried to generate native code at runtime before. And "High Level" optimizations are my old love. Implementing JIT in Parrot is a "nice to have" side effect :)
I will not say more about LLVM features here. You can go to llvm.org and read plenty of documentation. But I highly recommend reading the Kaleidoscope Tutorial to understand the basic principles of LLVM usage/embedding.
For embedding, LLVM provides two sets of APIs: C++ (kind of obvious, because LLVM is implemented in C++) and C (which lags behind the C++ API). Because Parrot's implementation language is plain-old-not-so-good C, I chose the C API. Unfortunately I wasn't able to find any good docs for the C API, so my main source of truth was Core.h and the other header files. It's not as bad as you might expect, because the C API stays really close to the C++ API, modulo C limitations.
After a few days of pure play with LLVM (imagine a six-year-old boy who got a shiny new RC airplane model) I came up with the following architecture/design/bestpractice/younameit.
LLVM Bindings
<warning>Hardcore technical stuff with a lot of jargon</warning>
Everything is located in the runtime/parrot/library directory. The mapping of classes to filenames follows Perl conventions. E.g. LLVM::Builder is defined in runtime/parrot/library/LLVM/Builder.pm.
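The class-to-file mapping can be written down in a couple of lines. This is only an illustrative sketch in Python, not code from the branch; `class_to_path` and `LIBRARY_ROOT` are made-up names, and the only facts taken from the post are the library directory and the `LLVM::Builder` example.

```python
# Library root named in the post; class files live beneath it.
LIBRARY_ROOT = "runtime/parrot/library"


def class_to_path(class_name):
    """Map a Perl-style class name (hypothetical helper) to its .pm file.

    'LLVM::Builder' becomes '<root>/LLVM/Builder.pm': each '::' turns into
    a directory separator and '.pm' is appended to the last component.
    """
    parts = class_name.split("::")
    return LIBRARY_ROOT + "/" + "/".join(parts) + ".pm"


print(class_to_path("LLVM::Builder"))
# prints: runtime/parrot/library/LLVM/Builder.pm
```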
The basic skeleton of the LLVM bindings consists of:
- LLVM.pm: main LLVM-to-NCI loader. Provides a "nice" wrapper to call LLVM functions.
- LLVM::Opaque: base class for any object returned from the LLVM APIs. In a nutshell, everything from LLVM is represented by some kind of opaque pointer.
- LLVM::Value: base class for values (including Function, Constant, etc).
- LLVM::Function: "proper" OO binding for llvm::Function.
- LLVM::Module: same for llvm::Module.
- LLVM::Builder: same for ...
- LLVM::BasicBlock: same ...
- LLVM::Constant: same ... Hang on. It's just a bunch of static functions to generate LLVM constants!
A couple of conventions:
- Use the .create method instead of .new for creating new LLVM::Foo objects. Mostly because I found some weird shenanigans with .new and inheritance in NQP, and I'm too lazy to fix them.
- Every single object returned from LLVM should be wrapped in an LLVM::Opaque object, e.g. LLVM::Value.
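To make the two conventions concrete, here is a rough sketch of the wrapping pattern. It is written in Python rather than NQP, and nothing in it is taken from the branch: `FakeLLVM` is a stand-in that hands out integers where the real C API would return opaque pointers, and the method names (`wrap`, `add_function`, `module_create`) are assumptions chosen for illustration. The class names only mirror the list above.

```python
import itertools


class FakeLLVM:
    """Hypothetical stand-in for the native library: returns integer 'pointers'."""

    def __init__(self):
        self._handles = itertools.count(1)

    def module_create(self, name):
        return next(self._handles)

    def add_function(self, module_ptr, name):
        return next(self._handles)


class Opaque:
    """Base wrapper: holds exactly one raw handle returned by the library."""

    def __init__(self):
        self.ptr = None  # raw opaque pointer, filled in by wrap()

    @classmethod
    def wrap(cls, ptr):
        """Wrap a raw handle in an instance of the class it is called on."""
        obj = cls()
        obj.ptr = ptr
        return obj


class Value(Opaque):
    """Base class for values (functions, constants, ...)."""


class Function(Value):
    """Wrapper for a function handle."""


class Module(Opaque):
    @classmethod
    def create(cls, api, name):
        # Factory used instead of the constructor, mirroring the .create convention.
        return cls.wrap(api.module_create(name))

    def add_function(self, api, name):
        # The raw handle never escapes: it is wrapped before being returned.
        return Function.wrap(api.add_function(self.ptr, name))


api = FakeLLVM()
module = Module.create(api, "demo")
func = module.add_function(api, "main")
print(type(func).__name__, isinstance(func, Opaque))
```

The point of the sketch is that callers only ever see wrapper objects, so methods can later be added to `Value` or `Function` without touching call sites.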
Check out the opsc_llvm branch and look in runtime/parrot/library/LLVM and t/library/llvm for more stuff.
Things to do before declaring victory
To declare the LLVM bindings "finished" (or better, "mostly useful") and merge the branch back to trunk, a few things need to happen:
- Wrap more functions related to LLVM::Builder. It's mostly one line per function.
- Same for constant creation. Check LLVM/Constant.pm. It's almost empty.
- Type creation/usage/etc. Everything inside LLVM/Type.pm.
- Finishing LLVM::Builder. The LLVMBuildFoo functions from the C API should be wrapped and exposed.
- "Navigational" methods for BasicBlock/Function/etc. Think of .next/.prev/.first/.last.
- "LLVM Memory Management". LLVMBufferPtr wrapped into a Parrot PMC. PtrBuf looks like the obvious choice, but it's "bleeding edge" functionality and I have no idea how it should work.
- "LLVM BitReader/BitWriter". The ability to read bitcode from disk is kind of crucial for a JIT implementation.
- (Most annoying thing) Implement proper loading of libLLVM.so in the LLVM bindings. Currently it's hardcoded to libLLVM-2.7.so, which is bad and not acceptable for merging the branch to master.
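For the last item, one possible shape of "proper loading" is to try a list of candidate library names instead of a single hardcoded one. The sketch below shows that idea in Python with ctypes, which plays roughly the role NCI plays in Parrot. It is an assumption about how the problem could be approached, not what the bindings do: `candidate_names` and `load_first` are hypothetical helpers, and the version list is just the 2.7 mentioned above.

```python
import ctypes
import ctypes.util


def candidate_names(versions=("2.7",)):
    """Yield library names to try, most generic first (hypothetical helper)."""
    found = ctypes.util.find_library("LLVM")  # platform lookup, may return None
    if found:
        yield found
    yield "libLLVM.so"
    for version in versions:
        yield "libLLVM-" + version + ".so"


def load_first(names, loader=ctypes.CDLL):
    """Return (name, handle) for the first name that loads.

    Raises OSError listing every failed attempt if none of them load.
    The loader is a parameter so the search logic can be exercised
    without a real LLVM installation.
    """
    errors = []
    for name in names:
        try:
            return name, loader(name)
        except OSError as exc:
            errors.append(name + ": " + str(exc))
    raise OSError("no LLVM library could be loaded:\n" + "\n".join(errors))
```

A caller would use `load_first(candidate_names())` and report the collected errors if nothing loads. Whether the unversioned name or a specific version should win is a policy decision this sketch does not try to settle.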
So, if anyone (including Parrot's committers :) wants to help with this, you are welcome :)