Originally asked by Robert Stupp at hotspot-dev:
  http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-February/017084.html

"Native.malloc() is up to 3 times faster than Unsafe.allocateMemory()", why?

Let me outline how you can track down the reason for a performance
difference like this. I haven't bothered to run the original benchmark, and
instead hacked together a quick targeted one:

    @State(Scope.Benchmark)
    public class AllocateBench {
        @Param("1")
        private int size;

        @Benchmark
        public long testMethod() {
            // U is the sun.misc.Unsafe instance (acquisition not shown)
            long addr = U.allocateMemory(size);
            U.freeMemory(addr);
            return addr; // returning the value lets JMH consume it
        }
    }

Running on i7-4790K 4.0 GHz, Linux x86_64, JDK 8u40 EA. It makes sense to
look at small allocations to see the infrastructure overhead.

Now, if you run this benchmark under Solaris Studio Performance Analyzer,
which can profile both the Java and the native parts, you will see this
call tree:

 237.436 (54%) ....AllocateBench_testMethod.avgt_jmhStub(
  +- 125.398 (28%) sun.misc.Unsafe.allocateMemory(long)
  |  +- 114.080 (26%) Unsafe_AllocateMemory
  |     +- 78.615 (18%) os::malloc(unsigned long,MemoryType)
  |     |  +- 65.176 (15%) os::malloc(unsigned...)
  |     |  |  +- 53.507 (12%) malloc
  |     |  |  +- 21.575 (5%) @0x7decf ()
  |     |  +- 4.163 (1%) MallocTracker::record_malloc(void*,unsigned long,MemoryType,const NativeCallStack&,NMT_TrackingLevel)
  |     +- 9.787 (2%) HandleMarkCleaner::~HandleMarkCleaner()
  |     +- 2.522 (1%) JavaThread::thread_from_jni_environment(JNIEnv_*)
  +- 97.939 (22%) sun.misc.Unsafe.freeMemory(long)
  |  +- 83.508 (19%) Unsafe_FreeMemory
  |     +- 33.043 (7%) @0x7decf ()
  |     +- 11.508 (3%) ThreadStateTransition::trans_from_native(JavaThreadState)
  |     +- 9.417 (2%) os::free(void*,MemoryType)
  |     |  +- 5.674 (1%) MallocTracker::record_free(void*)
  |     +- 7.555 (2%) free
  |     +- 5.894 (1%) HandleMarkCleaner::~HandleMarkCleaner()
  |     +- 2.892 (1%) JavaThread::thread_from_jni_environment(JNIEnv_*)

So, Unsafe.allocateMemory as a whole consumed around 125 seconds, while
only 53 seconds of that are spent in malloc itself.
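The benchmark above uses a field named U without showing where it comes
from. As a rough, JMH-free sketch of the same allocate/free pair, assuming
U is the sun.misc.Unsafe singleton obtained reflectively through its
private "theUnsafe" field (the class and method names below are made up
for illustration), one could write something like the following. It is a
crude wall-clock loop, not a replacement for the JMH harness, and it makes
no claim about the numbers it will print:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeAllocSketch {
    // Assumption: the Unsafe singleton is reachable via the private
    // static field "theUnsafe" of sun.misc.Unsafe.
    static Unsafe unsafe() throws ReflectiveOperationException {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    // Allocates and immediately frees `size` bytes, `iterations` times.
    // Returns the average wall-clock nanoseconds per allocate/free pair.
    static double avgNanosPerPair(Unsafe u, long size, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long addr = u.allocateMemory(size);
            sink += addr; // use the address so the loop has a visible result
            u.freeMemory(addr);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println("unlikely"); // keeps `sink` observable
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) throws Exception {
        Unsafe u = unsafe();
        avgNanosPerPair(u, 1, 1_000_000); // crude warmup pass
        double ns = avgNanosPerPair(u, 1, 10_000_000);
        System.out.printf("avg ns per allocate/free pair: %.1f%n", ns);
    }
}
```

Newer JDKs may warn about the use of sun.misc.Unsafe memory methods; the
sketch targets the JDK 8 era setup described here.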
The call tree in Solaris Studio can have multiple roots, so it is
sometimes easier to get the "aggregated" stack by following the
interesting function in the Callers->Callees view. This is the aggregated
"stack trace"; the numbers are the inclusive time spent in each method:

 395.827 org.openjdk.generated.AllocateBench_testMethod.testMethod_avgt_jmhStub
   221.125 sun.misc.Unsafe.allocateMemory(long)
     209.216 Unsafe_AllocateMemory
       144.781 os::malloc(unsigned long,MemoryType)
         119.534 os::malloc(unsigned long,MemoryType,const NativeCallStack&)
           98.289 malloc
         7.615 MallocTracker::record_malloc(void*,unsigned long,MemoryType,const NativeCallStack&,NMT_TrackingLevel)
       17.622 HandleMarkCleaner::~HandleMarkCleaner()
       4.753 JavaThread::thread_from_jni_environment(JNIEnv_*)
       ...
   173.681 sun.misc.Unsafe.freeMemory(long)
     ...

So, the time under each parent branch is distributed among the callees and
the parent method itself. Let's see where the time is spent.

 1. The JMH stub divides its time between allocateMemory and freeMemory;
    almost nothing is left for the stub itself. This is good: it tells us
    the benchmarking infrastructure is not getting in the way.

 2. Unsafe.allocateMemory wastes 20 seconds before calling
    Unsafe_AllocateMemory. If you look into the disassembly for the
    generated code [1], you will see the preparations for the native call.

 3. Unsafe_AllocateMemory wastes 44 seconds before calling os::malloc,
    HandleMarkCleaner and others. If you look into the disassembly for this
    stub [2], you will see a significant amount of time spent doing the
    actual JNI transition. In the source code, this is hidden behind the
    UNSAFE_ENTRY macro in unsafe.cpp:

      UNSAFE_ENTRY(jlong, Unsafe_AllocateMemory(JNIEnv *env, jobject unsafe, jlong size))

 4. os::malloc(unsigned long,MemoryType) wastes 17 seconds before calling
    the overloaded version of itself. The disassembly [3] seems to show the
    inlined body of the CALLER_PC macro, which does a few "MemTracker"
    checks.

 5. os::malloc(unsigned long,MemoryType,const NativeCallStack&) wastes
    another 20 seconds before finally reaching malloc. The disassembly [4]
    seems to show the inlined body of MallocTracker::record_malloc before
    the call into the actual glibc malloc.

 6. HandleMarkCleaner and thread_from_jni_environment waste another 23
    seconds themselves. The disassemblies for them [5, 6] are trivial, and
    it is not obvious whether we can optimize them.

======
BOTTOM-LINE:

The overheads of Unsafe.allocateMemory seem to lie in handling the actual
JNI transition, doing the VM housekeeping, and paying the dues for NMT
support. A version that avoids these costs would see a performance boost.
Back-of-the-envelope calculation: saving (20+44+17+20+23) = 124 seconds out
of the 221 seconds for allocateMemory gives a speedup of
221/(221-124) = 2.28x.

=====
COLLATERALS:

-----------------
[1] sun.misc.Unsafe.allocateMemory disassembly:
--------------------
     0.
     0. [?] 0: movl 8(%rsi),%r10d
     0. [?] 4: shlq $3,%r10
     0. [?] 8: cmpq %r10,%rax
     0. [?] b: je .+0xd [ 0x18 ]
     0. [?] 11: jmp .-0x180171 [ 0xffffffffffe7fea0 ]
     0. [?] 16: nop
  0.290 [?] 18: movl %eax,-0x14000(%rsp)
  1.681 [?] 1f: pushq %rbp
  0.520 [?] 20: movq %rsp,%rbp
  0.340 [?] 23: subq $0x40,%rsp
  0.510 [?] 27: movq %rsi,(%rsp)
  0.460 [?] 2b: cmpq $0,%rsi
  0.060 [?] 2f: leaq (%rsp),%rsi
  0.200 [?] 33: cmovq.e (%rsp),%rsi
  1.811 [?] 38: movq $0x7f55e51c5fb8,%r10
  0.050 [?] 42: movq %r10,0x1e0(%r15)
  0.320 [?] 49: movq %rsp,0x1d8(%r15)
  0.540 [?] 50: cmpb $0,0x166b950b(%rip)
  0.450 [?] 57: je .+0x3a [ 0x91 ]
     0. [?] 5d: pushq %rsi
     0. [?] 5e: pushq %rdx
     0. [?] 5f: movq $0x7f55c68cce70,%rsi
     0. [?] 69: movq %r15,%rdi
     0. [?] 6c: testl $0xf,%esp
     0. [?] 72: je .+0x18 [ 0x8a ]
     0. [?] 78: subq $8,%rsp
     0. [?] 7c: call .+0x160d60d4 [ 0x160d6150 ]
     0. [?] 81: addq $8,%rsp
     0. [?] 85: jmp .+0xa [ 0x8f ]
     0. [?] 8a: call .+0x160d60c6 [ 0x160d6150 ]
     0. [?] 8f: popq %rdx
     0. [?] 90: popq %rsi
  0.210 [?] 91: leaq 0x1f8(%r15),%rdi
  0.510 [?] 98: movl $4,0x270(%r15)
209.637 [?] a3: call .+0x161aaa3d [ 0x161aaae0 ] // Call into Unsafe_AllocateMemory
  0.030 [?] a8: vzeroupper
  1.231 [?] ab: movl $5,0x270(%r15)
  0.120 [?] b6: movl %r15d,%ecx
  0.781 [?] b9: shrl $4,%ecx
  0.030 [?] bc: andl $0xffc,%ecx
     0. [?] c2: movq $0x7f55ff457000,%r10
  0.100 [?] cc: movl %ecx,(%r10,%rcx)
  1.321 [?] d0: cmpl $0,0x166c8a26(%rip)
  0.030 [?] da: jne .+0x14 [ 0xee ]
  0.080 [?] e0: cmpl $0,0x30(%r15)
  0.981 [?] e8: je .+0x27 [ 0x10f ]
     0. [?] ee: movq %rax,-8(%rbp)
     0. [?] f2: movq %r15,%rdi
     0. [?] f5: movq %rsp,%r12
     0. [?] f8: subq $0,%rsp
     0. [?] fc: andq $-0x10,%rsp
     0. [?] 100: call .+0x16184930 [ 0x16184a30 ]
     0. [?] 105: movq %r12,%rsp
     0. [?] 108: xorq %r12,%r12
     0. [?] 10b: movq -8(%rbp),%rax
  0.040 [?] 10f: movl $8,0x270(%r15)
  0.200 [?] 11a: cmpl $1,0x29c(%r15)
  0.540 [?] 125: je .+0x90 [ 0x1b5 ]
  1.091 [?] 12b: cmpb $0,0x166b9430(%rip)
  0.400 [?] 132: je .+0x3e [ 0x170 ]
     0. [?] 138: movq %rax,-8(%rbp)
     0. [?] 13c: movq $0x7f55c68cce70,%rsi
     0. [?] 146: movq %r15,%rdi
     0. [?] 149: testl $0xf,%esp
     0. [?] 14f: je .+0x18 [ 0x167 ]
     0. [?] 155: subq $8,%rsp
     0. [?] 159: call .+0x160d5fe7 [ 0x160d6140 ]
     0. [?] 15e: addq $8,%rsp
     0. [?] 162: jmp .+0xa [ 0x16c ]
     0. [?] 167: call .+0x160d5fd9 [ 0x160d6140 ]
     0. [?] 16c: movq -8(%rbp),%rax
  0.320 [?] 170: movq $0,%r10
  0.781 [?] 17a: movq %r10,0x1d8(%r15)
  0.460 [?] 181: movq $0,%r10
  0.020 [?] 18b: movq %r10,0x1e0(%r15)
  0.470 [?] 192: movq 0x38(%r15),%rcx
  0.871 [?] 196: movl $0,0x100(%rcx)
  1.451 [?] 1a0: leave
  0.881 [?] 1a1: cmpq $0,8(%r15)
  0.510 [?] 1a9: jne .+7 [ 0x1b0 ]
  0.030 [?] 1af: ret
     0. [?] 1b0: jmp .-0x1c5a50 [ 0xffffffffffe3a760 ]
     0. [?] 1b5: movq %rax,-8(%rbp)
     0. [?] 1b9: movq %rsp,%r12
     0. [?] 1bc: subq $0,%rsp
     0. [?] 1c0: andq $-0x10,%rsp
     0. [?] 1c4: call .+0x160d5f5c [ 0x160d6120 ]
     0. [?] 1c9: movq %r12,%rsp
     0. [?] 1cc: xorq %r12,%r12
     0. [?] 1cf: movq -8(%rbp),%rax
     0. [?] 1d3: jmp .-0xa8 [ 0x12b ]

-----------------
[2] Unsafe_AllocateMemory disassembly
-----------------------
  0.530 [?] a7ea60: pushq %rbp
  1.041 [?] a7ea61: movq %rsp,%rbp
  0.310 [?] a7ea64: movq %rbx,-0x20(%rbp)
  1.401 [?] a7ea68: movq %r12,-0x18(%rbp)
  0.741 [?] a7ea6c: movq %rdx,%r12
  0.030 [?] a7ea6f: movq %r14,-8(%rbp)
  0.360 [?] a7ea73: movq %r13,-0x10(%rbp)
  0.480 [?] a7ea77: subq $0x30,%rsp
  5.104 [?] a7ea7b: call JavaThread::thread_from_jni_environment(JNIEnv_*) [ 0x6e2950, .-0x39c12b ]
  0.490 [?] a7ea80: movq 0x4d03c1(%rip),%r14
  0.630 [?] a7ea87: movl $5,0x270(%rax)
  0.450 [?] a7ea91: movq %rax,%rbx
  0.350 [?] a7ea94: movl (%r14),%edx
  1.111 [?] a7ea97: cmpl $1,%edx
     0. [?] a7ea9a: je .+0xfe [ 0xa7eb98 ]
  0.260 [?] a7eaa0: movq 0x4d1b21(%rip),%rax
  0.570 [?] a7eaa7: cmpb $0,(%rax)
  2.692 [?] a7eaaa: jne .+0xd6 [ 0xa7eb80 ]
  0.600 [?] a7eab0: movq 0x4d3ac9(%rip),%rax
  0.090 [?] a7eab7: movq %rbx,%rdx
  0.090 [?] a7eaba: shrq $4,%rdx
  0.470 [?] a7eabe: movl (%rax),%eax
  0.560 [?] a7eac0: andl %edx,%eax
  0.160 [?] a7eac2: movq 0x4cf417(%rip),%rdx
  0.680 [?] a7eac9: addq (%rdx),%rax
  2.101 [?] a7eacc: movl $1,(%rax)
  4.853 [?] a7ead2: movq 0x4cdfff(%rip),%r13
  0.010 [?] a7ead9: movl 0(%r13),%eax
  0.440 [?] a7eadd: testl %eax,%eax
     0. [?] a7eadf: jne .+0xc [ 0xa7eaeb ]
  0.130 [?] a7eae1: movl 0x30(%rbx),%eax
  0.751 [?] a7eae4: testl $0x30000000,%eax
     0. [?] a7eae9: je .+0xa [ 0xa7eaf3 ]
     0. [?] a7eaeb: movq %rbx,%rdi
     0. [?] a7eaee: call JavaThread::check_safepoint_and_suspend_for_native_trans(JavaThread*) [ 0xa581e0, .-0x2690e ]
  0.030 [?] a7eaf3: cmpq $0,%r12
  0.050 [?] a7eaf7: movl $6,0x270(%rbx)
  0.210 [?] a7eb01: movq %rbx,-0x30(%rbp)
  1.001 [?] a7eb05: jl .+0xf3 [ 0xa7ebf8 ]
  0.010 [?] a7eb0b: jne .+0x11d [ 0xa7ec28 ]
     0. [?] a7eb11: xorl %r12d,%r12d
  0.901 [?] a7eb14: leaq -0x30(%rbp),%rdi
 17.702 [?] a7eb18: call HandleMarkCleaner::~HandleMarkCleaner() [ 0x3dbc60, .-0x6a2eb8 ]
  0.400 [?] a7eb1d: movl $7,0x270(%rbx)
  0.801 [?] a7eb27: movl (%r14),%edx
  0.320 [?] a7eb2a: cmpl $1,%edx
     0. [?] a7eb2d: je .+0x83 [ 0xa7ebb0 ]
     0. [?] a7eb33: movq 0x4d1a8e(%rip),%rax
  0.560 [?] a7eb3a: cmpb $0,(%rax)
  3.542 [?] a7eb3d: je .+0x8f [ 0xa7ebcc ]
     0. [?] a7eb43: subl $1,%edx
     0. [?] a7eb46: je .+0x132 [ 0xa7ec78 ]
     0. [?] a7eb4c: lock addl $0,(%rsp)
     0. [?] a7eb51: movl 0(%r13),%eax
  0.050 [?] a7eb55: testl %eax,%eax
     0. [?] a7eb57: je .+0xa [ 0xa7eb61 ]
     0. [?] a7eb59: movq %rbx,%rdi
     0. [?] a7eb5c: call SafepointSynchronize::block(JavaThread*) [ 0x9a3ae0, .-0xdb07c ]
  0.680 [?] a7eb61: movl $4,0x270(%rbx)
  0.010 [?] a7eb6b: movq %r12,%rax
     0. [?] a7eb6e: movq -0x20(%rbp),%rbx
  0.030 [?] a7eb72: movq -0x18(%rbp),%r12
  1.151 [?] a7eb76: movq -0x10(%rbp),%r13
  0.020 [?] a7eb7a: movq -8(%rbp),%r14
     0. [?] a7eb7e: leave
  0.911 [?] a7eb7f: ret
     0. [?] a7eb80: subl $1,%edx
     0. [?] a7eb83: je .+0x10d [ 0xa7ec90 ]
     0. [?] a7eb89: lock addl $0,(%rsp)
     0. [?] a7eb8e: jmp .-0xbc [ 0xa7ead2 ]
     0. [?] a7eb93: nop 0(%rax,%rax)
     0. [?] a7eb98: movq 0x4d4b69(%rip),%rax
     0. [?] a7eb9f: cmpb $0,(%rax)
     0. [?] a7eba2: je .-0xd0 [ 0xa7ead2 ]
     0. [?] a7eba8: jmp .-0x108 [ 0xa7eaa0 ]
     0. [?] a7ebad: nop (%rax)
     0. [?] a7ebb0: movq 0x4d4b51(%rip),%rax
     0. [?] a7ebb7: cmpb $0,(%rax)
     0. [?] a7ebba: je .-0x69 [ 0xa7eb51 ]
     0. [?] a7ebbc: movq 0x4d1a05(%rip),%rax
     0. [?] a7ebc3: cmpb $0,(%rax)
     0. [?] a7ebc6: jne .-0x83 [ 0xa7eb43 ]
  0.370 [?] a7ebcc: movq 0x4d39ad(%rip),%rax
     0. [?] a7ebd3: movq %rbx,%rdx
  0.220 [?] a7ebd6: shrq $4,%rdx
  0.460 [?] a7ebda: movl (%rax),%eax
  0.530 [?] a7ebdc: andl %edx,%eax
  0.030 [?] a7ebde: movq 0x4cf2fb(%rip),%rdx
  0.320 [?] a7ebe5: addq (%rdx),%rax
  1.431 [?] a7ebe8: movl $1,(%rax)
  2.572 [?] a7ebee: jmp .-0x9d [ 0xa7eb51 ]
     0. [?] a7ebf3: nop 0(%rax,%rax)
     0. [?] a7ebf8: movq 0x4cf2e9(%rip),%rax
     0. [?] a7ebff: leaq 0xc0dd2(%rip),%rsi
     0. [?] a7ec06: xorl %r8d,%r8d
     0. [?] a7ec09: movl $0x252,%edx
     0. [?] a7ec0e: movq %rbx,%rdi
     0. [?] a7ec11: xorl %r12d,%r12d
     0. [?] a7ec14: movq 0x338(%rax),%rcx
     0. [?] a7ec1b: call Exceptions::_throw_msg(Thread*,const char*,int,Symbol*,const char*) [ 0x575ba0, .-0x50907b ]
     0. [?] a7ec20: jmp .-0x10c [ 0xa7eb14 ]
     0. [?] a7ec25: nop (%rax)
  0.070 [?] a7ec28: leaq 7(%r12),%rdi
  0.190 [?] a7ec2d: movl $7,%esi
  0.670 [?] a7ec32: andq $-8,%rdi
144.821 [?] a7ec36: call os::malloc(unsigned long,MemoryType) [ 0x906a20, .-0x178216 ]
  1.591 [?] a7ec3b: testq %rax,%rax
  0.040 [?] a7ec3e: movq %rax,%r12
  0.030 [?] a7ec41: jne .-0x12d [ 0xa7eb14 ]
     0. [?] a7ec47: movq 0x4cf29a(%rip),%rax
     0. [?] a7ec4e: leaq 0xc0d83(%rip),%rsi
     0. [?] a7ec55: xorl %r8d,%r8d
     0. [?] a7ec58: movl $0x25a,%edx
     0. [?] a7ec5d: movq %rbx,%rdi
     0. [?] a7ec60: movq 0x448(%rax),%rcx
     0. [?] a7ec67: call Exceptions::_throw_msg(Thread*,const char*,int,Symbol*,const char*) [ 0x575ba0, .-0x5090c7 ]
     0. [?] a7ec6c: jmp .-0x158 [ 0xa7eb14 ]
     0. [?] a7ec71: nop 0(%rax)
     0. [?] a7ec78: movq 0x4d4a89(%rip),%rax
     0. [?] a7ec7f: cmpb $0,(%rax)
     0. [?] a7ec82: je .-0x131 [ 0xa7eb51 ]
     0. [?] a7ec88: jmp .-0x13c [ 0xa7eb4c ]
     0. [?] a7ec8d: nop (%rax)
     0. [?] a7ec90: movq 0x4d4a71(%rip),%rax
     0. [?] a7ec97: cmpb $0,(%rax)
     0. [?] a7ec9a: je .-0x1c8 [ 0xa7ead2 ]
     0. [?] a7eca0: jmp .-0x117 [ 0xa7eb89 ]

-----------------
[3] os::malloc(unsigned long,MemoryType) disassembly
-----------------------
     0.
  0.350 [?] 906a20: pushq %rbp
  1.281 [?] 906a21: movq %rsp,%rbp
  0.070 [?] 906a24: movq %rbx,-0x18(%rbp)
  1.001 [?] 906a28: movq %r12,-0x10(%rbp)
  1.181 [?] 906a2c: movl %esi,%r12d
     0. [?] 906a2f: movq %r13,-8(%rbp)
  0.370 [?] 906a33: subq $0x50,%rsp
  0.180 [?] 906a37: movq 0x6476f2(%rip),%rbx
  0.891 [?] 906a3e: movq %rdi,%r13
     0. [?] 906a41: movl (%rbx),%eax
  1.041 [?] 906a43: cmpl $0xff,%eax
     0. [?] 906a48: je .+0x88 [ 0x906ad0 ]
  0.500 [?] 906a4e: movl (%rbx),%eax
  0.811 [?] 906a50: cmpl $3,%eax
     0. [?] 906a53: je .+0x55 [ 0x906aa8 ]
  0.310 [?] 906a55: movq 0x64ac14(%rip),%rdx
  0.240 [?] 906a5c: leaq -0x50(%rbp),%rbx
  0.230 [?] 906a60: movq (%rdx),%rax
  0.891 [?] 906a63: movq %rax,-0x50(%rbp)
  0.560 [?] 906a67: movq 8(%rdx),%rax
  0.230 [?] 906a6b: movq %rax,-0x48(%rbp)
  0.600 [?] 906a6f: movq 0x10(%rdx),%rax
  0.761 [?] 906a73: movq %rax,-0x40(%rbp)
  0.570 [?] 906a77: movq 0x18(%rdx),%rax
  0.200 [?] 906a7b: movq %rax,-0x38(%rbp)
  0.791 [?] 906a7f: movl 0x20(%rdx),%eax
  0.430 [?] 906a82: movl %eax,-0x30(%rbp)
  0.550 [?] 906a85: movq %rbx,%rdx
  0.080 [?] 906a88: movl %r12d,%esi
  0.200 [?] 906a8b: movq %r13,%rdi
127.399 [?] 906a8e: call os::malloc(unsigned long,MemoryType,const NativeCallStack&) [ 0x906750, .-0x33e ]
  0.600 [?] 906a93: movq -0x18(%rbp),%rbx
  1.071 [?] 906a97: movq -0x10(%rbp),%r12
  0.040 [?] 906a9b: movq -8(%rbp),%r13
  0.931 [?] 906a9f: leave
  0.420 [?] 906aa0: ret
     0. [?] 906aa1: nop 0(%rax)
     0. [?] 906aa8: movq 0x64c2d9(%rip),%rax
     0. [?] 906aaf: movzbl (%rax),%eax
     0. [?] 906ab2: testb %al,%al
     0. [?] 906ab4: je .-0x5f [ 0x906a55 ]
     0. [?] 906ab6: leaq -0x50(%rbp),%rbx
     0. [?] 906aba: movl $1,%edx
     0. [?] 906abf: movl $1,%esi
     0. [?] 906ac4: movq %rbx,%rdi
     0. [?] 906ac7: call NativeCallStack::NativeCallStack(int,bool) [ 0x8cc760, .-0x3a367 ]
     0. [?] 906acc: jmp .-0x47 [ 0x906a85 ]
     0. [?] 906ace: nop
     0. [?] 906ad0: call MemTracker::init_tracking_level() [ 0x8731a0, .-0x93930 ]
     0. [?] 906ad5: movl %eax,(%rbx)
     0. [?] 906ad7: movq 0x6488aa(%rip),%rax
     0. [?] 906ade: movl (%rbx),%edx
     0. [?] 906ae0: movl %edx,(%rax)
     0. [?] 906ae2: jmp .-0x94 [ 0x906a4e ]

-----------------
[4] os::malloc(unsigned long,MemoryType,const NativeCallStack&) disassembly
-----------------------
     0.
  0.731 [?] 906750: pushq %rbp
  0.871 [?] 906751: movl $1,%eax
  0.190 [?] 906756: movq %rsp,%rbp
  0.460 [?] 906759: movq %rbx,-0x28(%rbp)
  1.741 [?] 90675d: movq %r12,-0x20(%rbp)
  0.771 [?] 906761: movq %rdi,%r12
  0.150 [?] 906764: movq %r15,-8(%rbp)
  0.530 [?] 906768: movq %r13,-0x18(%rbp)
  0.710 [?] 90676c: movq %rdx,%r15
  0.250 [?] 90676f: movq %r14,-0x10(%rbp)
  0.721 [?] 906773: subq $0x30,%rsp
  0.140 [?] 906777: movq 0x6479b2(%rip),%rbx
  0.580 [?] 90677e: testq %rdi,%rdi
  0.190 [?] 906781: movl %esi,-0x2c(%rbp)
  0.710 [?] 906784: cmovq.e %rax,%r12
  0.420 [?] 906788: movl (%rbx),%eax
  0.460 [?] 90678a: cmpl $0xff,%eax
  0.010 [?] 90678f: je .+0xe1 [ 0x906870 ]
  0.670 [?] 906795: movl (%rbx),%r14d
  0.100 [?] 906798: cmpl $1,%r14d
  0.350 [?] 90679c: sbbq %rax,%rax
  0.791 [?] 90679f: notq %rax
  0.650 [?] 9067a2: andl $0x10,%eax
  0.610 [?] 9067a5: leaq (%rax,%r12),%rdi
  0.630 [?] 9067a9: movq 0x64c288(%rip),%rax
  0.320 [?] 9067b0: movq (%rax),%rdx
  1.141 [?] 9067b3: testq %rdx,%rdx
     0. [?] 9067b6: je .+0x7a [ 0x906830 ]
     0. [?] 9067b8: movl 0x691dd2(%rip),%eax
     0. [?] 9067be: movq %rdi,%r13
     0. [?] 9067c1: xorl %ebx,%ebx
     0. [?] 9067c3: shrq $3,%r13
     0. [?] 9067c7: leaq 0(%r13,%rax),%rax
     0. [?] 9067cc: cmpq %rax,%rdx
     0. [?] 9067cf: jae .+0x31 [ 0x906800 ]
  0.050 [?] 9067d1: movl %r14d,%r8d
     0. [?] 9067d4: movq %r15,%rcx
     0. [?] 9067d7: movl -0x2c(%rbp),%edx
  1.261 [?] 9067da: movq %r12,%rsi
  0.010 [?] 9067dd: movq %rbx,%rdi
     0. [?] 9067e0: movq -0x20(%rbp),%r12
  0.060 [?] 9067e4: movq -0x28(%rbp),%rbx
  0.851 [?] 9067e8: movq -0x18(%rbp),%r13
  0.010 [?] 9067ec: movq -0x10(%rbp),%r14
  0.210 [?] 9067f0: movq -8(%rbp),%r15
  0.030 [?] 9067f4: leave
  1.311 [?] 9067f5: jmp MallocTracker::record_malloc(void*,unsigned long,MemoryType,const NativeCallStack&,NMT_TrackingLevel) [ 0x851990, .-0xb4e65 ]
     0. [?] 9067fa: nop 0(%rax,%rax)
     0. [?] 906800: call .-0x6eab40 [ 0x21bcc0 ]
     0. [?] 906805: testq %rax,%rax
     0. [?] 906808: movq %rax,%rbx
     0. [?] 90680b: je .-0x3a [ 0x9067d1 ]
     0. [?] 90680d: cmpl $1,0x691d0c(%rip)
     0. [?] 906814: movl $1,%ecx
     0. [?] 906819: je .+0x77 [ 0x906890 ]
     0. [?] 90681b: movl %r13d,%edx
     0. [?] 90681e: leaq 0x691d6b(%rip),%rax
     0. [?] 906825: cmpl $0,%ecx
     0. [?] 906828: je .+3 [ 0x90682b ]
     0. [?] 90682a: lock xaddl %edx,(%rax)
     0. [?] 90682e: jmp .+0xa [ 0x906838 ]
 99.990 [?] 906830: call .-0x6eab70 [ 0x21bcc0 ]
  0.040 [?] 906835: movq %rax,%rbx
  0.811 [?] 906838: cmpq $-1,%rbx
     0. [?] 90683c: jne .-0x6b [ 0x9067d1 ]
     0. [?] 90683e: movq 0x64a00b(%rip),%rax
     0. [?] 906845: leaq 0x22726c(%rip),%rsi
     0. [?] 90684c: movq %rbx,%rcx
     0. [?] 90684f: movq %r12,%rdx
     0. [?] 906852: movq (%rax),%rdi
     0. [?] 906855: xorl %eax,%eax
     0. [?] 906857: call outputStream::print_cr(const char*,...) [ 0x91a1e0, .+0x13989 ]
     0. [?] 90685c: call os::breakpoint() [ 0x908810, .+0x1fb4 ]
     0. [?] 906861: jmp .-0x90 [ 0x9067d1 ]
     0. [?] 906866: nop %cs:0(%rax,%rax)
     0. [?] 906870: call MemTracker::init_tracking_level() [ 0x8731a0, .-0x936d0 ]
     0. [?] 906875: movl %eax,(%rbx)
     0. [?] 906877: movq 0x648b0a(%rip),%rax
     0. [?] 90687e: movl (%rbx),%edx
     0. [?] 906880: movl %edx,(%rax)
     0. [?] 906882: jmp .-0xed [ 0x906795 ]
     0. [?] 906887: nop 0(%rax,%rax)
     0. [?] 906890: movq 0x64ce71(%rip),%rax
     0. [?] 906897: movzbl (%rax),%ecx
     0. [?] 90689a: jmp .-0x7f [ 0x90681b ]

-----------------
[5] HandleMarkCleaner::~HandleMarkCleaner()
-----------------------
     0.
  2.742 [?] 3dbc60: pushq %rbp
  0.480 [?] 3dbc61: movq %rsp,%rbp
  0.180 [?] 3dbc64: movq %rbx,-0x10(%rbp)
  3.422 [?] 3dbc68: movq %r12,-8(%rbp)
  0.741 [?] 3dbc6c: subq $0x10,%rsp
     0. [?] 3dbc70: movq (%rdi),%rax
  0.260 [?] 3dbc73: movq 0x48(%rax),%rbx
  2.842 [?] 3dbc77: movq 0x10(%rbx),%rax
  3.002 [?] 3dbc7b: movq 8(%rbx),%r12
  0.180 [?] 3dbc7f: cmpq $0,(%rax)
  6.845 [?] 3dbc83: je .+0x1b [ 0x3dbc9e ]
     0. [?] 3dbc85: movq 0x28(%rbx),%rsi
     0. [?] 3dbc89: movq %r12,%rdi
     0. [?] 3dbc8c: call Arena::set_size_in_bytes(unsigned long) [ 0x2db920, .-0x10036c ]
     0. [?] 3dbc91: movq 0x10(%rbx),%rdi
     0. [?] 3dbc95: call Chunk::next_chop() [ 0x2dccf0, .-0xfefa5 ]
     0. [?] 3dbc9a: movq 0x10(%rbx),%rax
  1.101 [?] 3dbc9e: movq %rax,0x10(%r12)
  1.161 [?] 3dbca3: movq 0x18(%rbx),%rax
  0.010 [?] 3dbca7: movq %rax,0x18(%r12)
  1.081 [?] 3dbcac: movq 0x20(%rbx),%rax
  0.751 [?] 3dbcb0: movq %rax,0x20(%r12)
  1.001 [?] 3dbcb5: movq (%rsp),%rbx
  0.170 [?] 3dbcb9: movq 8(%rsp),%r12
  0.660 [?] 3dbcbe: leave
  1.311 [?] 3dbcbf: ret

-----------------
[6] JavaThread::thread_from_jni_environment(JNIEnv_*)
-----------------------
     0.
  2.162 [?] 6e2950: leaq -0x1f8(%rdi),%rdx
  0.320 [?] 6e2957: pushq %rbp
  1.581 [?] 6e2958: movl 0x288(%rdx),%eax
  1.661 [?] 6e295e: movq %rsp,%rbp
  0.440 [?] 6e2961: cmpl $0xdeab,%eax
     0. [?] 6e2966: je .+0x19 [ 0x6e297f ]
     0. [?] 6e2968: movl 0x288(%rdx),%eax
     0. [?] 6e296e: cmpl $0xdeac,%eax
     0. [?] 6e2973: je .+0xc [ 0x6e297f ]
     0. [?] 6e2975: movq %rdx,%rdi
     0. [?] 6e2978: call JavaThread::block_if_vm_exited() [ 0xa50e40, .+0x36e4c8 ]
     0. [?] 6e297d: xorl %edx,%edx
  0.791 [?] 6e297f: movq %rdx,%rax
  0.941 [?] 6e2982: leave
  2.252 [?] 6e2983: ret
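
As a footnote to the breakdown in the main text: the per-method overhead
and the bottom-line estimate are plain arithmetic over the profiler
numbers, which can be sketched as below. The class and method names are
made up for illustration. The self-time example assumes, following the
first call tree, that MallocTracker::record_malloc is accounted as a
callee of os::malloc(unsigned long,MemoryType); the overhead figures fed
into the speedup are the rounded ones from the numbered list, not
something recomputed here.

```java
public class OverheadEstimate {
    // Self (exclusive) time of a frame: its inclusive time minus the
    // inclusive time of each of its direct callees.
    static double selfTime(double inclusive, double... callees) {
        double self = inclusive;
        for (double c : callees) {
            self -= c;
        }
        return self;
    }

    // Speedup obtained if `saved` seconds could be removed from `total`.
    static double speedup(double total, double saved) {
        return total / (total - saved);
    }

    public static void main(String[] args) {
        // os::malloc(unsigned long,MemoryType) from the aggregated stack:
        // inclusive 144.781, callees 119.534 (overload) and 7.615 (record_malloc).
        double osMallocSelf = selfTime(144.781, 119.534, 7.615);
        System.out.printf("os::malloc self time: %.3f s%n", osMallocSelf);

        // Rounded overheads from items 2-6 of the breakdown.
        double saved = 20 + 44 + 17 + 20 + 23;
        System.out.printf("saved: %.0f s, speedup: %.2fx%n",
                saved, speedup(221, saved));
    }
}
```

The same selfTime helper applies to any other frame in the aggregated
stack, as long as its direct callees are listed.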