After further profiling of XStream, another hotspot has emerged: the writeField method of Sun14ReflectionProvider.java. Calls to Unsafe.objectFieldOffset() seem to account for another ~20% of execution time, so a simple cache of this value should greatly improve performance.
I cannot reproduce your results. I normally use test objects that require a lot of introspection, so that the ReflectionConverter code is exercised significantly, but Unsafe.objectFieldOffset() is nowhere near the top (for me it does not even appear in YourKit's hotspots). I am using Sun JDK 6 (32-bit) on Linux with YourKit 8.0.2 (same result after upgrading to YourKit 8.0.5). Looking at AbstractReflectionConverter.doUnmarshal alone, the calls to HashMap.get() already take 19% of the method's time, while the calls to Sun14ReflectionProvider.writeField() take only about 3% of it.
Consequently, we are starting to talk about very specific scenarios here, and I am a little hesitant to add performance optimizations for such cases.
Thanks for looking into this. It may be a difference in JVMs: I'm running JDK 6 (64-bit) on Mac OS X. I took a quick look at Sun's own implementation of Field.set, and it caches the field offset as well, so I suspect this operation can be expensive under some circumstances. (If you want to look yourself, see sun.reflect.UnsafeFieldAccessorImpl, which in turn is held as a singleton in java.lang.reflect.Field.) I'll run my code through the profiler again, though, to verify my results.
So I ran my de-serialization through the profiler again, and here's what I'm seeing:
Time spent in fromXml: 92000ms
Time spent in Unsafe.objectFieldOffset: 31000ms
So nearly a third of the time is spent retrieving the field offset. If I apply the attached patch, the time for offset retrieval drops to ~1000ms (including cache hits). I am guessing the differences we are seeing are due either to the 64-bit mode I'm running under or to something in the underlying OS.
Also, is the set of test objects you mentioned above included in the XStream source? If so, where can I find them? I'll try running them through the profiler on my machine to see if I can replicate the result on the test suite as well.
Applied in HEAD, with the small difference of using an instance member for the cache to avoid classloader problems, and with the background knowledge that even different provider instances will normally be used for distinct types. Thanks, Keith.
Here's a patch implementing a simple cache of Field to field offsets.
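For readers without access to the attachment, here is a minimal sketch of what a cache from Field to field offset might look like. It is not the attached patch and not the code that went into HEAD. The class name FieldOffsetCache, its methods, the lookup counter, and the Point test class are all hypothetical and exist only for illustration. It assumes sun.misc.Unsafe can be obtained reflectively through its private "theUnsafe" field, which newer JDKs allow but warn about. The cache is held as an instance member rather than a static one, in line with the classloader concern raised above.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import sun.misc.Unsafe;

// Hypothetical illustration only; not the actual XStream implementation.
public class FieldOffsetCache {
    private final Unsafe unsafe;

    // Instance member (not static), so cached Field keys do not outlive
    // the object that owns the cache.
    private final Map<Field, Long> offsets = new HashMap<Field, Long>();

    // Counts real Unsafe.objectFieldOffset() calls, for demonstration.
    private int lookups;

    public FieldOffsetCache() throws Exception {
        // Assumption: Unsafe is reachable through its private static field.
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        unsafe = (Unsafe) theUnsafe.get(null);
    }

    // Returns the offset of the field, asking Unsafe only on a cache miss.
    public synchronized long offsetOf(Field field) {
        Long cached = offsets.get(field);
        if (cached == null) {
            cached = Long.valueOf(unsafe.objectFieldOffset(field));
            offsets.put(field, cached);
            lookups++;
        }
        return cached.longValue();
    }

    // Writes an int field directly, using the cached offset.
    public void writeInt(Object target, Field field, int value) {
        unsafe.putInt(target, offsetOf(field), value);
    }

    public synchronized int lookupCount() {
        return lookups;
    }

    // Hypothetical test type.
    public static class Point {
        public int x;
        public int y;
    }

    public static void main(String[] args) throws Exception {
        FieldOffsetCache cache = new FieldOffsetCache();
        Point p = new Point();
        Field x = Point.class.getDeclaredField("x");
        for (int i = 0; i < 1000; i++) {
            cache.writeInt(p, x, i);
        }
        System.out.println("x = " + p.x + ", real offset lookups = " + cache.lookupCount());
    }
}
```

Because Field implements equals() and hashCode() based on the declaring class, name, and type, two Field objects obtained separately for the same field hit the same cache entry. Whether the cache pays off will depend on how expensive objectFieldOffset() is on a given JVM, which is exactly what the profiles above disagree on.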