Orc's External Value API#

Orc needs a powerful API for accessing external objects, sites, and values. The existing API does not provide the flexibility and performance that the new implementations need.

The Orc 2.x API#

The API provided in Orc 2.x dispatches each call on runtime values. A cake-pattern stack of traits tries various approaches to make a call each time a call is needed. All calls go through the same invoke method on the cake, and all responses return through a defined handle interface.


Advantages:

  • Very simple in terms of implementation and execution.


Disadvantages:

  • Extremely slow, due to a huge number of megamorphic call sites and because much of the dispatch has to be repeated at every call.
  • Does not enable static/JIT optimization, since dispatch has to happen at the same time as the call itself.
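The value-based dispatch described above can be pictured with a minimal Java sketch. The names Handle, Site, and callOnce are hypothetical stand-ins invented for this illustration and are not the actual Orc 2.x types; the point is only that the whole chain of checks runs again on every call.

```java
import java.util.function.Function;

final class ValueDispatchSketch {
    /** Hypothetical stand-in for the handle interface that receives responses. */
    interface Handle {
        void publish(Object value);
        void halt();
    }

    /** Hypothetical stand-in for an Orc site. */
    interface Site {
        Object call(Object[] args);
    }

    /** Single entry point: every call re-inspects the runtime values to pick an approach. */
    static void invoke(Handle handle, Object callee, Object[] args) {
        if (callee instanceof Site) {
            handle.publish(((Site) callee).call(args));
        } else if (callee instanceof Function && args.length == 1) {
            @SuppressWarnings("unchecked")
            Function<Object, Object> f = (Function<Object, Object>) callee;
            handle.publish(f.apply(args[0]));
        } else {
            // No approach applies to this callee.
            handle.halt();
        }
    }

    /** Helper for experiments: returns the published value, or null if the call halted. */
    static Object callOnce(Object callee, Object... args) {
        final Object[] result = new Object[1];
        invoke(new Handle() {
            public void publish(Object value) { result[0] = value; }
            public void halt() { }
        }, callee, args);
        return result[0];
    }
}
```

Because the instanceof chain sits inside invoke, a JIT compiler sees one shared call path for all callees, which is where the megamorphic call sites come from.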

Field Support#

The Orc 2.x site API does not have any specific support for fields. However, it could be added fairly simply by adding another method, getField, similar to invoke but implementing field extraction semantics.

Accessor API (Proposed)#

The accessor API is based on class-based dispatch which is decoupled from the call itself. The API would provide a getInvoker method which takes the callee class and argument classes. The argument classes are required because Java resolves overloads statically from the argument types (static multiple dispatch), which becomes dynamic multiple dispatch in a partially dynamic language like Orc. getInvoker returns an Invoker object with a single method invoke(handle, callee, args...) which executes the call. The invoker can be reused on any callee and args with the same classes.

For example, if types are known at compile-time we might be able to generate this:

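// Resolved once from the classes alone; no values are needed yet.
static final Invoker inv = runtime.getInvoker(Adder.class, BigInt.class, BigInt.class);

void run() {
  // Reuse the invoker on any callee and arguments of those classes.
  inv.invoke(handle, adder, one, two);
}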
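A minimal Java sketch of how getInvoker might separate class-based dispatch from value-based execution follows. Everything here is an assumption made for illustration: the Handle and Adder types, the cache keyed by class lists, and the use of java.math.BigInteger in place of BigInt are not taken from the Orc code base.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AccessorSketch {
    /** Hypothetical stand-in for the handle that receives results. */
    interface Handle {
        void publish(Object value);
    }

    /** Executes calls for one fixed combination of callee and argument classes. */
    interface Invoker {
        void invoke(Handle handle, Object callee, Object... args);
    }

    /** Hypothetical site that adds two integers. */
    static final class Adder { }

    private static final Map<List<Class<?>>, Invoker> CACHE = new ConcurrentHashMap<>();

    /** Class-based dispatch: resolution runs once per class combination, not once per call. */
    static Invoker getInvoker(Class<?> calleeClass, Class<?>... argClasses) {
        List<Class<?>> key = new ArrayList<>();
        key.add(calleeClass);
        key.addAll(Arrays.asList(argClasses));
        return CACHE.computeIfAbsent(key, k -> resolve(calleeClass, argClasses));
    }

    private static Invoker resolve(Class<?> calleeClass, Class<?>[] argClasses) {
        boolean addsBigIntegers = calleeClass == Adder.class
                && argClasses.length == 2
                && argClasses[0] == BigInteger.class
                && argClasses[1] == BigInteger.class;
        if (addsBigIntegers) {
            // Value-based execution only: the casts are safe because the classes were checked above.
            return (handle, callee, args) ->
                    handle.publish(((BigInteger) args[0]).add((BigInteger) args[1]));
        }
        // Unknown classes: a fuller implementation could return an invoker wrapping the old API here.
        return (handle, callee, args) -> {
            throw new UnsupportedOperationException("no invoker for " + calleeClass.getName());
        };
    }
}
```

In this sketch the overload decision lives entirely in resolve, so the returned invoker contains no dispatch logic of its own.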


Advantages:

  • Can be implemented as a wrapper over the old API by returning an invoker which implements the old API when an unknown class is encountered.
  • The Invoker class can be a custom generated class for the specific set of argument types so the invocation can be very fast.
  • The Java overloading and dispatch rules are entirely type based, so they would only need to be run at getInvoker time.
  • Polymorphic methods, like +, could dispatch to type-specific implementations at getInvoker time.
  • Compiled code (or even the interpreter) can cache or pre-generate invokers and use them if the types at the call site are as expected or as before.
  • With full compile-time type information for the Orc program, all of the invokers could be constructed statically (or at program start) without any need for runtime types. With support from the sites the invokers could even be eliminated and invoke calls replaced with inline dispatch logic.


Disadvantages:

  • Sites supporting this interface would be harder to write, since they need to separate class-based dispatch from value-based execution.
  • This approach cannot be ported to targets which do not support strong runtime type information (JavaScript) if the input Orc program lacks types.
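The call-site caching mentioned in the advantages (reuse an invoker while the types at the call site are as before) could be sketched as follows. This is an illustrative assumption, not Orc's implementation: InvokerSource stands in for the runtime's getInvoker, and the invoker returns its result directly instead of publishing to a handle to keep the example short.

```java
final class CachedCallSiteSketch {
    /** Simplified invoker that returns its result instead of publishing to a handle. */
    interface Invoker {
        Object invoke(Object callee, Object... args);
    }

    /** Hypothetical stand-in for the runtime's class-based dispatch. */
    interface InvokerSource {
        Invoker getInvoker(Class<?> calleeClass, Class<?>[] argClasses);
    }

    /** One instance per call site in the Orc program; this sketch is not thread-safe. */
    static final class CallSite {
        private final InvokerSource source;
        private Class<?> cachedCallee;
        private Class<?>[] cachedArgs;
        private Invoker cachedInvoker;
        int misses = 0; // counts how often dispatch had to be redone

        CallSite(InvokerSource source) {
            this.source = source;
        }

        /** Assumes callee and all args are non-null. */
        Object call(Object callee, Object... args) {
            if (cachedInvoker == null || !matches(callee, args)) {
                misses++;
                Class<?>[] argClasses = new Class<?>[args.length];
                for (int i = 0; i < args.length; i++) {
                    argClasses[i] = args[i].getClass();
                }
                cachedInvoker = source.getInvoker(callee.getClass(), argClasses);
                cachedCallee = callee.getClass();
                cachedArgs = argClasses;
            }
            // Fast path: only cheap class comparisons happened before this call.
            return cachedInvoker.invoke(callee, args);
        }

        private boolean matches(Object callee, Object[] args) {
            if (callee.getClass() != cachedCallee || args.length != cachedArgs.length) {
                return false;
            }
            for (int i = 0; i < args.length; i++) {
                if (args[i].getClass() != cachedArgs[i]) {
                    return false;
                }
            }
            return true;
        }
    }
}
```

This cache remembers only the most recent class combination; a site that alternates between two combinations would miss on every call, which is one motivation for a polymorphic cache.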

Field Support#

Fields can be implemented using a getFieldAccessor method similar to getInvoker. getFieldAccessor takes a class and a field name and returns a FieldAccessor with a single get(object) method. get returns the value of the selected field in the given object. get can be applied to any object of any subtype of the class passed to getFieldAccessor.

The FieldAccessor returned from getFieldAccessor can be generated at runtime to provide the best performance by "statically" referencing the field name and object type.
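As a rough illustration, getFieldAccessor could be sketched in Java with plain reflection. This is a simpler stand-in for the runtime-generated accessor classes described above, and the Point and ColoredPoint classes are hypothetical examples; only public fields are handled.

```java
import java.lang.reflect.Field;

final class FieldAccessorSketch {
    /** Reads one fixed field from any object of the class it was created for, or a subtype. */
    interface FieldAccessor {
        Object get(Object target);
    }

    /** Resolves the field once; the returned accessor only performs the read. */
    static FieldAccessor getFieldAccessor(Class<?> cls, String fieldName) {
        final Field field;
        try {
            field = cls.getField(fieldName); // public fields, including inherited ones
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(cls.getName() + " has no public field " + fieldName, e);
        }
        return target -> {
            try {
                return field.get(target);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    /** Hypothetical example classes. */
    public static class Point {
        public final int x;
        public final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    public static final class ColoredPoint extends Point {
        public final String color;

        ColoredPoint(int x, int y, String color) {
            super(x, y);
            this.color = color;
        }
    }
}
```

An accessor obtained for Point and "x" can be applied to a ColoredPoint as well, matching the subtype rule stated above. Primitive field values come back boxed.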

Further Optimization#

For even more performance, the FieldAccessor and Invoker classes could be replaced by Java method handles. This would allow call sites in the Orc program to maintain a polymorphic inline cache for calls across multiple different external invocation or field lookup schemes (using invokedynamic). This approach would make Orc calls into Java code perform almost exactly the same as Java-to-Java calls (4-10ns depending on how polymorphic the site is).

This optimization does not require generating the whole program as bytecode, so the Orc compiler can still target Java source. Instead, the invokedynamic code can be generated at runtime by generating small subclasses of a known superclass. Each subclass would contain a single invokedynamic instruction which invokes an appropriate bootstrap method to get and cache the appropriate Invoker/FieldAccessor. The HotSpot compiler can inline the generated method (complete with its invokedynamic instruction) into the calling class written in Java, as long as that call site is monomorphic.
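The caching behaviour that a bootstrap method would set up can be approximated without bytecode generation by driving a java.lang.invoke.MutableCallSite directly. The sketch below is an assumption-laden illustration, not Orc code: CachingSite, its fallback method, and the restriction to public zero-argument methods are all invented for the example. Each miss resolves the method for the receiver's class and chains a new class guard in front of the previous target, forming a simple polymorphic inline cache.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class InlineCacheSketch {
    /** Call site of type (Object)Object that calls a public zero-argument method by name. */
    static final class CachingSite extends MutableCallSite {
        private static final MethodHandle FALLBACK;
        private static final MethodHandle CLASS_CHECK;

        static {
            try {
                MethodHandles.Lookup lookup = MethodHandles.lookup();
                FALLBACK = lookup.findVirtual(CachingSite.class, "fallback",
                        MethodType.methodType(Object.class, Object.class));
                CLASS_CHECK = lookup.findStatic(CachingSite.class, "hasClass",
                        MethodType.methodType(boolean.class, Class.class, Object.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private final String methodName;
        int relinks = 0; // counts cache misses

        CachingSite(String methodName) {
            super(MethodType.methodType(Object.class, Object.class));
            this.methodName = methodName;
            setTarget(FALLBACK.bindTo(this)); // empty cache: every call misses
        }

        static boolean hasClass(Class<?> expected, Object receiver) {
            return receiver.getClass() == expected;
        }

        /** Slow path: resolve for this receiver class, then chain a guard in front of the old target. */
        Object fallback(Object receiver) throws Throwable {
            relinks++;
            Class<?> cls = receiver.getClass();
            MethodHandle resolved = MethodHandles.lookup()
                    .unreflect(cls.getMethod(methodName))
                    .asType(type()); // adapt to (Object)Object, boxing primitive results
            // A production cache would cap the chain depth; this sketch does not.
            setTarget(MethodHandles.guardWithTest(CLASS_CHECK.bindTo(cls), resolved, getTarget()));
            return resolved.invoke(receiver);
        }

        /** Calls through the site the way an invokedynamic instruction linked to it would. */
        Object call(Object receiver) {
            try {
                return dynamicInvoker().invoke(receiver);
            } catch (RuntimeException | Error e) {
                throw e;
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }
}
```

For instance, a CachingSite created for "isEmpty" relinks once for String receivers and once more for ArrayList receivers; after that, both kinds of call pass through the guards without reaching fallback again.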

Benchmarks

Some simple microbenchmarks show that, using invokedynamic, dynamic call sites perform almost identically to normal Java method calls (probably with a slight cost in additional compiled code). Both cost about 4ns for monomorphic calls and ~10ns for polymorphic calls (5-6 possible targets). This outperforms the other options:

  • Looking up the target on every call: ~200ns monomorphic / ~480ns polymorphic (~40x slower)
  • Caching the reflected method and calling it each time: 13ns monomorphic / 16ns polymorphic (~2x slower) (Scala structural type method calls)
  • Caching a method handle and calling it each time: 10ns monomorphic / polymorphic not tested (1.6x slower)

In addition, the invokedynamic implementation scales somewhat better to larger numbers of threads. For example, monomorphic calls in 6 threads take ~8ns with invokedynamic, 52ns (6.5x) with cached method reflection, and 20ns (2.5x) with cached method handles.
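For orientation, the three slower strategies compared above might look like this in Java. The sketch only shows what each strategy does per call, using String.length as an arbitrary example target; it does not reproduce the original benchmark code, and it contains no timing logic, since meaningful measurements would need a proper harness such as JMH.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class CallStrategiesSketch {
    private static final Method LENGTH_METHOD;
    private static final MethodHandle LENGTH_HANDLE;

    static {
        try {
            LENGTH_METHOD = String.class.getMethod("length");
            LENGTH_HANDLE = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Strategy 1: look up the target on every call, then invoke it reflectively. */
    static int lookupEveryCall(String s) {
        try {
            return (Integer) s.getClass().getMethod("length").invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Strategy 2: reuse a cached reflected Method. */
    static int cachedReflectedMethod(String s) {
        try {
            return (Integer) LENGTH_METHOD.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Strategy 3: reuse a cached MethodHandle with an exact-signature call. */
    static int cachedMethodHandle(String s) {
        try {
            return (int) LENGTH_HANDLE.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

All three return the same value for the same input; they differ only in how much lookup and adaptation work is repeated on each call.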

The factor differences are significant, but the total time spent on these invocations is likely to be small at first. However, once other optimizations are in place, site invocation and field access may account for a larger share of run time, so it is important to plan for these optimizations.

Static Metadata#

Compile-time versions of the Invoker and FieldAccessor classes, together with a parallel compile-time site-loading API, provide the compiler with site metadata at various levels.

The compile-time and run-time APIs should be distinct because the runtime should not pay the time and memory cost of generating metadata that is not needed. It may be useful to have some metadata (such as non-blocking) on invokers at runtime.

« This page (revision-8) was last changed on 12-Mar-2017 20:56 by Arthur Peters