Preamble
A few days ago I posted a note about LLVM bindings in Parrot. It lives in my experimental branch opsc_llvm. When I started this branch it was purely my own playground for LLVM/opsc/JIT/etc. I had no idea what I could get out of it, especially because I had no previous experience with either LLVM or Parrot's NCI (Native Call Interface). But now I have almost got something really useful - Fully Functional LLVM Bindings.
<assumption>You know how to check out a particular branch in Parrot's git repository. All files referenced below are from this branch.</assumption>
LLVM quick intro
LLVM is really cool. It stands for Low Level Virtual Machine and provides a lot of functionality. I was most interested in runtime code generation and optimizations, simply because I had never tried to generate native code at runtime before. And "High Level" optimizations are my old love. Implementing JIT in Parrot is a "nice to have side-effect" :)
I won't describe LLVM's features any further. You can just go to llvm.org and read a lot of documents. But I highly recommend reading the Kaleidoscope Tutorial to understand the basic principles of LLVM usage/embedding.
For embedding purposes LLVM provides 2 sets of APIs: C++ (which is kind of obvious because LLVM is implemented in C++) and C (which lags behind the C++ API). Because Parrot's implementation language is pure-old-not-so-good-C I chose the C API. Unfortunately I wasn't able to find any good docs for the C API, so my main source of truth was Core.h and other header files. It's not as bad as you might expect, because it's really close to the C++ API, modulo C limitations.
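To make "close to the C++ API, modulo C limitations" a bit more concrete, here is a minimal sketch of the shape Core.h gives you: objects become opaque pointer typedefs, and constructors, destructors and methods become flat, prefixed functions. It is written with Python's ctypes purely as an illustration (an assumption of this sketch, not the NQP/NCI code from the branch), and it only declares types and prototypes, so it runs without LLVM installed.

```python
import ctypes


class LLVMOpaqueModule(ctypes.Structure):
    """Incomplete type: no fields are declared, only pointers are used."""


class LLVMOpaqueValue(ctypes.Structure):
    """Incomplete type standing behind every value handle."""


# Core.h: typedef struct LLVMOpaqueModule *LLVMModuleRef;
LLVMModuleRef = ctypes.POINTER(LLVMOpaqueModule)
# Core.h: typedef struct LLVMOpaqueValue *LLVMValueRef;
LLVMValueRef = ctypes.POINTER(LLVMOpaqueValue)

# Prototype for: LLVMModuleRef LLVMModuleCreateWithName(const char *ModuleID);
# The C++ constructor becomes a flat, prefixed function returning a handle.
ModuleCreateWithName = ctypes.CFUNCTYPE(LLVMModuleRef, ctypes.c_char_p)

# Prototype for: void LLVMDisposeModule(LLVMModuleRef M);
# The C++ destructor becomes a function that takes the handle explicitly.
DisposeModule = ctypes.CFUNCTYPE(None, LLVMModuleRef)
```

The caller never sees inside a module or a value. It only passes the handle back to the next function, which is why the bindings end up revolving around opaque pointers.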
After a few days of pure play with LLVM (imagine a 6-year-old boy who got a shiny new RC airplane model) I came up with the following architecture/design/bestpractice/younameit.
<warning>Hardcore technical stuff with a lot of jargon</warning>
All things are located in the runtime/parrot/library directory. The mapping of classes to filenames follows Perl conventions. E.g. LLVM::Builder is defined in
The basic skeleton of the LLVM bindings consists of:
LLVM.pm - main LLVM-to-NCI loader. Provides a "nice" wrapper to call LLVM functions.
LLVM::Opaque - base class for any object returned from the LLVM APIs. In a nutshell, everything from LLVM is represented by some kind of opaque pointer.
LLVM::Value - base class for values (including
LLVM::Function - "proper" OO binding for llvm::Function.
LLVM::Module - same for llvm::Module.
LLVM::Builder - same for ...
LLVM::BasicBlock - same ...
LLVM::Constant - same ... Hang on. It's just a bunch of static functions to generate LLVM constants!
- Use of method .new for creating new LLVM::Foo objects. Mostly because I found some weird shenanigans with .new and inheritance in NQP. And I'm too lazy to fix it.
- Every single object returned from LLVM should be wrapped into
opsc_llvm branch and look in t/library/llvm for more stuff.
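The wrapping convention described above (every raw pointer coming back from LLVM is wrapped in an object derived from a common base class) can be sketched roughly like this. This is a Python analogue, not the NQP code from the branch: the class names mirror LLVM::Opaque, LLVM::Value, LLVM::Function and LLVM::Module, but the methods wrap, unwrap and add_function, the Function-inherits-Value relationship, and the fake_native_call stand-in are all assumptions made for illustration.

```python
import itertools

_next_handle = itertools.count(1)


def fake_native_call(*args):
    """Stand-in for a native LLVM call; returns a bare handle (an int here)."""
    return next(_next_handle)


class Opaque:
    """Base wrapper: holds exactly one raw handle from the native layer."""

    def __init__(self, ptr):
        if not ptr:
            raise ValueError("native call returned a null handle")
        self.ptr = ptr

    @classmethod
    def wrap(cls, ptr):
        # Turn a raw handle into an instance of whichever subclass is named.
        return cls(ptr)

    def unwrap(self):
        # Give the raw handle back when calling into the native layer.
        return self.ptr


class Value(Opaque):
    """Base class for everything LLVM treats as a value."""


class Function(Value):
    """llvm::Function is a value in LLVM, so it derives from Value here."""


class Module(Opaque):
    def add_function(self, name):
        # Hypothetical method: the raw result never escapes unwrapped.
        raw = fake_native_call(self.unwrap(), name)
        return Function.wrap(raw)
```

With this shape, user code only ever touches wrapper objects, and the raw handle is pulled out with unwrap at the single point where a native function is called.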
Things to do before declaring victory
To declare the LLVM bindings "finished" (or better, "mostly useful") and merge the branch back to trunk, a few things need to happen:
- Wrap more functions related to LLVM::Builder. It's mostly one-line-per-function.
- Same for constants creation. Check LLVM/Constant.pm. It's almost empty.
- Types creation/usage/etc. Everything inside
- Finishing of
LLVMBuildFoo functions from the C API should be wrapped and exposed.
- "Navigational" methods for BasicBlock/Function/etc. Think of .next/.prev/.first/.last.
- "LLVM Memory Management". LLVMBufferPtr wrapped into a Parrot PMC. PtrBuf looks like the obvious choice, but it's "bleeding edge" functionality and I have no idea how it should work.
- "LLVM BitReader/BitWriter". Ability to read bitcode from disk is kind of crucial for JIT implementation.
- (Most annoying thing) Implement proper loading of libLLVM.so in the LLVM bindings. Currently it's hardcoded to libLLVM-2.7.so, which is bad and not acceptable for merging the branch to master.
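One possible way out of the hardcoded library name in the last item is to try a list of candidate names and take the first one that loads. The sketch below shows only that idea, in Python rather than NQP. The helper names, the version list and the fake_loader stand-in are all hypothetical; in real code the loader argument would be whatever actually opens a shared library (ctypes.CDLL in Python, or Parrot's own library loading in the bindings).

```python
def candidate_names(versions):
    """Unversioned name first, then one versioned name per known release."""
    return ["libLLVM.so"] + ["libLLVM-%s.so" % v for v in versions]


def load_first(names, loader):
    """Return (name, handle) for the first name the loader accepts."""
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            continue  # this name is not installed, try the next one
    raise OSError("none of these libraries could be loaded: " + ", ".join(names))


def fake_loader(name):
    """Stand-in loader that pretends only libLLVM-2.7.so is installed."""
    if name != "libLLVM-2.7.so":
        raise OSError("cannot open " + name)
    return "handle-for-" + name
```

Calling load_first(candidate_names(["2.8", "2.7"]), fake_loader) walks the list in order and stops at libLLVM-2.7.so. Swapping fake_loader for a real loader keeps the search logic unchanged.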
So, if anyone (including Parrot's committers :) wants to help with this, you are welcome :)