Java Reflection, 1000x Faster

[+] btown|8 years ago|reply

> The simple approach is to simply add a children() method to the interface and implement it in every kind of node. Very tedious, and boilerplaty as hell.

The best approach without boilerplate would be to generate this at compile time, not runtime. You could use an approach like in http://hannesdorfmann.com/annotation-processing/annotationpr... to create at compile-time, using reflection, a ChildWalker class that accesses the relevant properties for every class annotated with @Node. At runtime, the JIT should get this down to a vtable lookup and field lookups - just about as optimized as you can get.

EDIT: Always remember https://xkcd.com/1205/ - if it would take 5 seconds to write each children() method, and you write no more than 5 Node classes per day (amortized), then you shouldn't spend more than 12 hours writing the code for this blog post ;) Unless, of course, that was the point all along!

[+] divs1210|8 years ago|reply

Seeing all these comments about code generation at compile time makes me wonder if there's an overlap between people who use/write these kinds of tools for complex languages and those who post 'what's so great about (lisp) macros?'.

eg: JSX users.

[+] Cieplak|8 years ago|reply

Although not a realistic option for everyone, I find c++ templates to be simpler than java annotation processors for type-safe compile-time code gen.

[+] needusername|8 years ago|reply

An annotation processor does not seem to be a good fit here. Annotation processors can only generate new classes, not modify existing ones. So you would have to make all classes abstract and then generate non-abstract ones that have a #children() method and refer to the generated classes everywhere in your code.

[+] specialist|8 years ago|reply

Developing, debugging, and using an annotation processor is the opposite of productive. Not recommended.

[+] munificent|8 years ago|reply

Or, heck, just hack together a little script to generate the code and hook it into your build system. You don't need anything as fancy as an annotation processor. A handful of lines of Python could accomplish just as much.
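
To make the idea concrete, here is a minimal sketch of such a generator, written in Java rather than Python so it can inspect already-compiled node classes in a separate build step. The Node interface, IfNode class, and the "every zero-argument getter returning a Node is a child" rule are assumptions for illustration, not the blog author's actual code.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ChildrenGenerator {
    // Hypothetical node types, standing in for the real AST classes.
    interface Node {}

    static class IfNode implements Node {
        public Node getCondition() { return null; }
        public Node getThenBranch() { return null; }
        public Node getElseBranch() { return null; }
        public String getLabel() { return "if"; } // not a Node, so not a child
    }

    // Emits the source text of a children() method for one node class.
    static String generateChildrenMethod(Class<?> type) {
        List<String> calls = new ArrayList<>();
        for (Method m : type.getMethods()) {
            boolean isGetter = m.getName().startsWith("get") && m.getParameterCount() == 0;
            if (isGetter && Node.class.isAssignableFrom(m.getReturnType())) {
                calls.add(m.getName() + "()");
            }
        }
        // getMethods() returns methods in no particular order; sort for stable output.
        Collections.sort(calls);
        return "public java.util.List<Node> children() {\n"
             + "    return java.util.Arrays.asList(" + String.join(", ", calls) + ");\n"
             + "}\n";
    }

    public static void main(String[] args) {
        System.out.print(generateChildrenMethod(IfNode.class));
    }
}
```

The generated text would still have to be written into a source file by the build, which is the part an annotation processor normally handles for you.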

[+] norswap|8 years ago|reply

Indeed, code generation was going to be my permanent solution, until this turned out fast enough not to bother!

Seems like I'm doing fine by xkcd, at least if you remove the time to make a simpler version for the blog post =) And the learning has some value too anyhow.

[+] jtmarmon|8 years ago|reply

Seems to me this is approaching this totally the wrong way. First off, the current implementation is just looping through all the methods of the implementing class and the first one that matches a Node or [Node] return type is supposedly its children? That seems extremely brittle.

The fact that you have 100 classes implementing this is a good sign you should turn the design on its head. Why not just have a single tree structure built using generics?

edit: also I would point out that you're not really making reflection any faster, just your code

[+] dragandj|8 years ago|reply

On top of that, he acknowledges that there is a much, much simpler solution, known from the dawn of Java, and this is to have a polymorphic children() method.

But that would (gasp!) require adding those implementations. Which is tedious (it is not with a bit of inheritance or some other technique but ok). OK, so, it requires a bit of work. But those methods would take a few (or a dozen) nanoseconds to dispatch; you can't go faster than that. What's the alternative? To build your brittle infrastructure? Hmmm, I must be missing something...

[+] flukus|8 years ago|reply

> The fact that you have 100 classes implementing this is a good sign you should turn the design on its head. Why not just have a single tree structure built using generics?

Does anyone know why gui toolkits never do this? Every one seems to follow an OO model where each widget owns its own children. Even in GTK they hacked in an OO system to do it instead of having a separate tree.

[+] norswap|8 years ago|reply

I'm not seeing how I can use some generic tree class without adding code for each Node subclass, which is exactly what I'm trying to avoid.

At the least, I'd have to populate the children list of that class (which is more or less what children() would have done). The identity of each child (what it represents), such as "condition of a while statement" or "body of the else part of an if statement" must be preserved.

Also, the issue with methods really isn't, because these classes are really POJOs. Getters and methods inherited from Object are pretty much all they have, so I'm not worried about accidentally capturing a method that is not a getter.

And well, this makes reflection calls faster by replacing them (probably) with the equivalent plain code.

[+] hota_mazi|8 years ago|reply

If you are using reflection and encountering performance problems, the correct approach is pretty much always to write an annotation processor and generate that code instead.

[+] avodonosov|8 years ago|reply

Or generate code without using an annotation processor.

[+] kodablah|8 years ago|reply

Didn't look too deep, but this might be due to the fact that call sites are memoized for invoke calls as part of the invokedynamic improvements (but really MethodHandle::invoke and MethodHandle::invokeExact are where the magic is as is mentioned by the JVM spec IIRC).

Regardless, MethodHandle invocations are always going to be faster than reflection Method invocations. A primary reason is that the parameters for invoke and invokeExact are not subject to the same boxing and runtime-validation requirements as reflection.
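
A minimal sketch of the two call styles side by side, assuming a hypothetical WhileNode POJO with one getter. With Method.invoke the receiver and arguments travel as Object and are checked on each call; with a MethodHandle the access check happens once at lookup, and invokeExact requires the call site's static types to match the handle's type.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class HandleVsReflection {
    // Hypothetical POJO node with a single getter, for illustration only.
    static class WhileNode {
        private final String condition;
        WhileNode(String condition) { this.condition = condition; }
        public String getCondition() { return condition; }
    }

    // Classic reflection: the result comes back as Object.
    static Object viaReflection(WhileNode node) {
        try {
            Method m = WhileNode.class.getMethod("getCondition");
            return m.invoke(node);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // MethodHandle: the handle has type (WhileNode)String and is called with exactly that type.
    static String viaMethodHandle(WhileNode node) {
        try {
            MethodHandle h = MethodHandles.lookup().findVirtual(
                    WhileNode.class, "getCondition", MethodType.methodType(String.class));
            return (String) h.invokeExact(node);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        WhileNode node = new WhileNode("x < 10");
        System.out.println(viaReflection(node));
        System.out.println(viaMethodHandle(node));
    }
}
```

In real code the Method or MethodHandle would be looked up once and kept in a static final field; the lookup is repeated here only to keep each method self-contained.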

[+] JackFr|8 years ago|reply

> I had an interface (representing a tree node)

No you didn't. An interface representing a tree node would have a children() method.

[+] yipopov|8 years ago|reply

Why? I see why it might be a good idea depending on what problem you are trying to solve, but I don't see why anything has to have any references to its children in order to be a tree node.

[+] xenadu02|8 years ago|reply

I assume the JIT isn't optimizing until the functions execute a certain number of times. Given the speed difference maybe it isn't even natively compiling at first and using the interpreter.

I don't know about Java but in .NET there's a call you can make to force the dynamically generated lambda expression to be compiled immediately.

Of course if you're being really adventurous you can use IL emit to build a function up from IL opcodes.

[+] ShabbyDoo|8 years ago|reply

Yes. The author didn't note his testing methodology. JMH (Java Microbenchmark Harness) is the gold standard for executing Java benchmarking tests:

http://openjdk.java.net/projects/code-tools/jmh/

It takes care of details like JVM warm-ups, ensuring sufficient invocations for JIT compilation to have occurred, etc.

[+] rhpistole|8 years ago|reply

That's the kind of solution I've used before when performing reflection in a tight loop. The important thing there is to cache the result of the compiled expression or the dynamic IL code because the compile time/emit is slower than a single use of reflection.
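
The same caching idea can be sketched in Java, assuming the expensive step is the reflective scan for getters and that one MethodHandle per getter is worth keeping. ClassValue is the JDK's per-class cache; the IfNode class and the lookups counter are hypothetical and exist only to show that the scan runs once.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GetterCache {
    // Hypothetical POJO node, for illustration only.
    static class IfNode {
        public String getCondition() { return "x > 0"; }
        public String getThenBranch() { return "return x"; }
    }

    static int lookups = 0; // counts how often the slow reflective scan runs

    // ClassValue computes the handles once per class and reuses them afterwards.
    private static final ClassValue<List<MethodHandle>> GETTERS =
            new ClassValue<List<MethodHandle>>() {
        @Override
        protected List<MethodHandle> computeValue(Class<?> type) {
            lookups++;
            List<Method> getters = new ArrayList<>();
            for (Method m : type.getMethods()) {
                if (m.getName().startsWith("get") && m.getParameterCount() == 0
                        && m.getDeclaringClass() != Object.class) {
                    getters.add(m);
                }
            }
            // getMethods() order is unspecified; sort by name for stable results.
            getters.sort(Comparator.comparing(Method::getName));
            List<MethodHandle> handles = new ArrayList<>();
            try {
                for (Method m : getters) {
                    handles.add(MethodHandles.lookup().unreflect(m));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            return handles;
        }
    };

    // Reads every getter of the node through the cached handles.
    static List<Object> values(Object node) {
        List<Object> out = new ArrayList<>();
        try {
            for (MethodHandle h : GETTERS.get(node.getClass())) {
                out.add(h.invoke(node));
            }
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(values(new IfNode()));
        System.out.println(values(new IfNode()));
        System.out.println("scans: " + lookups);
    }
}
```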

[+] joemag|8 years ago|reply

In the past, I've used Introspector API [1] for property reflection. It's part of the JDK, and provides a lot of the caching this article is talking about. Doubt this would provide the same level of improvement as adding the direct children() method, or even ASM based approaches, but it's certainly faster than naked reflection.
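
A rough sketch of what that looks like, assuming a hypothetical public IfNode bean (Introspector only considers public classes and methods). Passing Object.class as the stop class leaves out the "class" property inherited from Object.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class IntrospectorDemo {
    // Hypothetical bean, for illustration only.
    public static class IfNode {
        private final String condition;
        private final String thenBranch;
        public IfNode(String condition, String thenBranch) {
            this.condition = condition;
            this.thenBranch = thenBranch;
        }
        public String getCondition() { return condition; }
        public String getThenBranch() { return thenBranch; }
    }

    // Reads every readable bean property into a name -> value map.
    static Map<String, Object> readProperties(Object bean) {
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            Map<String, Object> values = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method getter = pd.getReadMethod();
                if (getter != null) {
                    values.put(pd.getName(), getter.invoke(bean));
                }
            }
            return values;
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readProperties(new IfNode("x > 0", "return x")));
    }
}
```

The getters are still called through Method.invoke here, so this saves the discovery cost on repeated use, not the per-call cost.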

[1] https://docs.oracle.com/javase/7/docs/api/java/beans/Introsp...

[+] solomatov|8 years ago|reply

There's another, and IMO, a better way to make Java reflection faster. It's to disable checks on reflection calls. Just call setAccessible(true), and you will notice substantial improvement.
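
For reference, a minimal sketch of the call, assuming a hypothetical WhileNode class with a private field. For private members setAccessible(true) is needed to read them at all; for public members it is optional and only suppresses the access check on later calls.

```java
import java.lang.reflect.Field;

class AccessibleDemo {
    // Hypothetical node with a private field, for illustration only.
    static class WhileNode {
        private final String condition;
        WhileNode(String condition) { this.condition = condition; }
    }

    // Reads a declared field by name, bypassing Java language access checks.
    static Object readField(Object target, String name) {
        try {
            Field f = target.getClass().getDeclaredField(name);
            f.setAccessible(true); // later get() calls on this Field skip the access check
            return f.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readField(new WhileNode("x < 10"), "condition"));
    }
}
```

The flag lives on the Field (or Method) object, so the benefit only shows up if that object is kept and reused rather than looked up again on every call.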

[+] dmux|8 years ago|reply

Isn't setAccessible required before you can interact with fields?

[+] jtmarmon|8 years ago|reply

he does that

[+] agentgt|8 years ago|reply

What I do for these situations is do the reflection as a unit test to confirm that the hand written children() is correct.

When the unit test fails for a node type, it prints what children() should be and you just copy and paste that code in.

This is also nice for the case where nodes have some sort of special children(). You can make the unit test ignore those.
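
A sketch of the check itself, without a test framework so it stays self-contained. The Node interface, the node classes, and the rule "every zero-argument getter returning a Node is a child" are assumptions for illustration. The comparison uses sets because reflection returns getters in no particular order.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class ChildrenCheck {
    interface Node { List<Node> children(); }

    static class Leaf implements Node {
        public List<Node> children() { return new ArrayList<>(); }
    }

    static class WhileNode implements Node {
        private final Node condition = new Leaf();
        private final Node body = new Leaf();
        public Node getCondition() { return condition; }
        public Node getBody() { return body; }
        public List<Node> children() { return Arrays.asList(condition, body); }
    }

    // Deliberately wrong: the hand-written children() forgets the else branch.
    static class BrokenIfNode implements Node {
        private final Node condition = new Leaf();
        private final Node thenBranch = new Leaf();
        private final Node elseBranch = new Leaf();
        public Node getCondition() { return condition; }
        public Node getThenBranch() { return thenBranch; }
        public Node getElseBranch() { return elseBranch; }
        public List<Node> children() { return Arrays.asList(condition, thenBranch); }
    }

    // Slow but boilerplate-free: collect the result of every Node-returning getter.
    static List<Node> reflectedChildren(Node node) {
        List<Node> found = new ArrayList<>();
        try {
            for (Method m : node.getClass().getMethods()) {
                if (m.getName().startsWith("get") && m.getParameterCount() == 0
                        && Node.class.isAssignableFrom(m.getReturnType())) {
                    found.add((Node) m.invoke(node));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    // True when the hand-written children() covers exactly the reflected children.
    static boolean childrenMatchGetters(Node node) {
        return new HashSet<>(reflectedChildren(node)).equals(new HashSet<>(node.children()));
    }

    public static void main(String[] args) {
        System.out.println(childrenMatchGetters(new WhileNode()));
        System.out.println(childrenMatchGetters(new BrokenIfNode()));
    }
}
```

A real test would loop over all node classes and skip the ones with intentionally special children().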

[+] fulafel|8 years ago|reply

I wonder if this could be used by JVM languages that sometimes fall back to reflection, like Clojure. Transparently profile the reflection calls and recompile the hot ones?

[+] divs1210|8 years ago|reply

Clojure provides a var called `*warn-on-reflection*` that can be set to true during dev, and all (or a vast majority of) reflective calls can be optimized away by adding type hints wherever it warns. This is to say that Clojure devs get rid of reflection instead of trying to make it faster.

[+] agumonkey|8 years ago|reply

I wonder if the clojure devs use these techniques. IIRC clojure relies a lot on reflection.

[+] divs1210|8 years ago|reply

Clojure devs get rid of reflection by adding type hints where required. The compiler has a flag to help with this, called `*warn-on-reflection*`.

[+] DonHopkins|8 years ago|reply

C# reflection is also slow, and even slower in Unity3D with the IL2CPP ahead-of-time compiler, where there is no just-in-time compiler or System.Reflection.Emit to dynamically generate code at runtime (which is impossible and prohibited on iOS, for example).

Here are the limitations of the Mono runtime:

https://developer.xamarin.com/guides/ios/advanced_topics/lim...

However, there are ways of compiling the little snippets of the code you need ahead of time, then plugging them together with delegates at runtime, which avoids much of the overhead.

Much of the cost is the overhead of converting from unknown types to specific types. So you can pre-compile loosely typed adaptor thunks that take generic "object" arguments, convert them to the required type, and directly call strongly typed delegates (which you can plug into the adaptor, so adaptors can be used for any number of compatible delegates).

Check out how the following code uses FastInvoke, ToOpenDelegate, Delegate.CreateDelegate, Expression.GetActionType, and Expression.GetFuncType.

It may be that some of these general techniques could be applied to Java.
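
One possible Java counterpart to Delegate.CreateDelegate, offered as a sketch rather than a port of the code linked here: LambdaMetafactory can turn a method found at runtime into an ordinary Function object, so later calls go through a normal interface call instead of Method.invoke. The WhileNode class is hypothetical. Unlike the AOT case described above, this does generate a class at runtime, so it only fits a regular JVM.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class DelegateDemo {
    // Hypothetical POJO node, for illustration only.
    static class WhileNode {
        private final String condition;
        WhileNode(String condition) { this.condition = condition; }
        public String getCondition() { return condition; }
    }

    // Builds a strongly typed "delegate" for getCondition(), looked up by name.
    @SuppressWarnings("unchecked")
    static Function<WhileNode, String> conditionGetter() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle getter = lookup.findVirtual(
                    WhileNode.class, "getCondition", MethodType.methodType(String.class));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "apply",                                            // Function's single method
                    MethodType.methodType(Function.class),              // factory takes no captured args
                    MethodType.methodType(Object.class, Object.class),  // erased signature of apply
                    getter,                                             // the method to call
                    MethodType.methodType(String.class, WhileNode.class)); // specialized signature
            return (Function<WhileNode, String>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Function<WhileNode, String> getCondition = conditionGetter();
        System.out.println(getCondition.apply(new WhileNode("x < 10")));
    }
}
```

As with the .NET version, the Function should be created once and cached, since building it costs far more than a single reflective call.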

Faster Invoke for reflected property access and method invocation with AOT compilation:

https://web-beta.archive.org/web/20120826100615/http://whydo...

The bane of the iOS programmer's life, when working with reflection in Mono, is that you can't go around making up new generic types to ensure that your reflected properties and methods get called at decent speed. This is because Mono on iOS is fully Ahead Of Time compiled and simply can't make up new stuff as you go along. That, coupled with the dire performance of Invoke when using reflected properties, led me to construct a helper class.

This works by registering a series of method signatures with the compiler, so that they are available to code running on the device. In my tests property access was 4.5x faster and method access with one parameter was 2.4x faster. Not earth shattering but every little helps. If you knew what you wanted ahead of time, then you could probably do a lot better. See here for info.

You have to register signatures inside each class, I'm afraid. Nothing I can do about that.

So to register a signature you use:

    static MyClass()
    {
         //All methods returning string can be accelerated
         DelegateSupport.RegisterFunctionType<MyClass, string>();         
         //All methods returning string and taking an int can be accelerated
         DelegateSupport.RegisterFunctionType<MyClass, int, string>();    
         //All methods returning void and taking a bool can be accelerated
         DelegateSupport.RegisterActionType<MyClass, bool>();             
    }

Then when you have a MethodInfo you use the extension method FastInvoke(object target, params object[] parameters) to call it. FastInvoke will default to using normal Invoke if you haven't accelerated a particular type.

myObject.GetType().GetProperty("SomeProperty").GetGetMethod().FastInvoke(myObject);

myObject.GetType().GetMethod("SomeMethod").FastInvoke(myObject, 1, 2);

You can download the source code for FastInvoke from here.

https://web-beta.archive.org/web/20120826100615/http://www.w...

Newer version from unityserializer-ng:

https://gitgud.io/TheSniperFan/unityserializer-ng/blob/maste...

55 comments