# Detecting actual CPU count: 16 detected # JMH version: 1.21 # VM version: JDK 12-internal, OpenJDK 64-Bit Server VM, 12-internal+0-adhoc.shade.shenandoah-jdk # VM invoker: /home/shade/trunks/shenandoah-jdk/build/linux-x86_64-normal-server-release/images/jdk/bin/java # VM options: -XX:+UseShenandoahGC -Xmx10g -XX:ShenandoahGCHeuristics=static -XX:ShenandoahFreeThreshold=90 -Xlog:gc -XX:ConcGCThreads=1 -XX:-TieredCompilation -XX:-ExplicitGCInvokesConcurrent # Warmup: 5 iterations, 5 s each # Measurement: 5 iterations, 5 s each # Timeout: 10 min per iteration # Threads: 16 threads, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: org.openjdk.gcbench.wip.BarriersSATBSlowpath.test # Parameters: (size = 40000000) # Run progress: 0.00% complete, ETA 00:00:50 # Fork: 1 of 1 # Preparing profilers: LinuxPerfAsmProfiler # Profilers consume stdout and stderr from target VM, use -v EXTRA to copy to console # Warmup Iteration 1: 13.661 ±(99.9%) 0.622 ns/op # Warmup Iteration 2: 14.306 ±(99.9%) 1.088 ns/op # Warmup Iteration 3: 12.430 ±(99.9%) 0.697 ns/op # Warmup Iteration 4: 12.344 ±(99.9%) 0.641 ns/op # Warmup Iteration 5: 12.197 ±(99.9%) 0.500 ns/op Iteration 1: 12.250 ±(99.9%) 0.493 ns/op Iteration 2: 12.326 ±(99.9%) 1.006 ns/op Iteration 3: 12.235 ±(99.9%) 0.405 ns/op Iteration 4: 12.294 ±(99.9%) 0.797 ns/op Iteration 5: 12.284 ±(99.9%) 0.584 ns/op # Processing profiler results: LinuxPerfAsmProfiler Result "org.openjdk.gcbench.wip.BarriersSATBSlowpath.test": 12.278 ±(99.9%) 0.139 ns/op [Average] (min, avg, max) = (12.235, 12.278, 12.326), stdev = 0.036 CI (99.9%): [12.138, 12.417] (assumes normal distribution) Secondary result "org.openjdk.gcbench.wip.BarriersSATBSlowpath.test:·asm": PrintAssembly processed: 55700 total address lines. Perf output processed (skipped 34.341 seconds): Column 1: cycles (399145 events) Hottest code regions (>10.00% "cycles" events): ....[Hottest Region 1].............................................................................. libjvm.so, _ZN13SATBMarkQueue6filterEv (199 bytes) <no assembly is recorded, native region> .................................................................................................... 41.26% <total for region 1> ....[Hottest Region 2].............................................................................. c2, org.openjdk.gcbench.wip.BarriersSATBSlowpath::test, version 130 (136 bytes) # parm0: rdx:rdx = 'org/openjdk/gcbench/wip/BarriersSATBSlowpath$Holder' # [sp+0x20] (sp of caller) 0x00007fd204a341a0: mov 0x8(%rsi),%r10d 0x00007fd204a341a4: movabs $0x800000000,%r12 0x00007fd204a341ae: add %r12,%r10 0x00007fd204a341b1: xor %r12,%r12 0x00007fd204a341b4: cmp %r10,%rax 0x00007fd204a341b7: jne 0x00007fd2049c4080 ; {runtime_call ic_miss_stub} 0x00007fd204a341bd: data16 xchg %ax,%ax [Verified Entry Point] 2.11% 0x00007fd204a341c0: mov %eax,-0x14000(%rsp) 0.16% 0x00007fd204a341c7: push %rbp 1.63% 0x00007fd204a341c8: sub $0x10,%rsp 0.47% 0x00007fd204a341cc: mov -0x8(%rsi),%r10 0.18% 0x00007fd204a341d0: mov 0x14(%r10),%ebp 1.93% 0x00007fd204a341d4: testb $0x1,0x20(%r15) 0.47% ╭ 0x00007fd204a341d9: jne 0x00007fd204a3420a 0.03% │ ↗ 0x00007fd204a341db: mov %rdx,%rbx 0.12% │ │ 0x00007fd204a341de: test %rdx,%rdx │╭ │ 0x00007fd204a341e1: je 0x00007fd204a3424a 1.68% ││ │ 0x00007fd204a341e3: test %ebp,%ebp ││╭ │ 0x00007fd204a341e5: je 0x00007fd204a3421a 0.45% │││ │ 0x00007fd204a341e7: mov -0x8(%r12,%rbp,8),%rbp 3.96% │││ │↗ 0x00007fd204a341ec: testb $0x2,0x20(%r15) 0.08% │││╭ ││ 0x00007fd204a341f1: jne 0x00007fd204a3421e 0.02% ││││ ││↗↗ 0x00007fd204a341f3: mov %rbp,%r10 1.91% ││││ ││││ 0x00007fd204a341f6: shr $0x3,%r10 0.11% ││││ ││││ 0x00007fd204a341fa: mov %r10d,0xc(%rbx) ;*synchronization entry ││││ ││││ ; - org.openjdk.gcbench.wip.BarriersSATBSlowpath::test@-1 (line 86) 0.40% ││││ ││││ 0x00007fd204a341fe: add $0x10,%rsp 0.02% ││││ ││││ 0x00007fd204a34202: pop %rbp 2.05% ││││ ││││ 0x00007fd204a34203: test %eax,0x136b5df7(%rip) # 0x00007fd2180ea000 ││││ ││││ ; {poll_return} 0.29% ││││ ││││ 0x00007fd204a34209: retq ↘│││ ││││ 0x00007fd204a3420a: testb $0x14,0x20(%r15) │││╭ ││││ 0x00007fd204a3420f: jne 0x00007fd204a34256 ││││╭││││ 0x00007fd204a34211: mov -0x8(%rdx),%r8 ; implicit exception: dispatches to 0x00007fd204a3424a │││││││││ 0x00007fd204a34215: mov %r8,%rdx │││││╰│││ 0x00007fd204a34218: jmp 0x00007fd204a341db │↘│││ │││ 0x00007fd204a3421a: xor %ebp,%ebp │ │││ ╰││ 0x00007fd204a3421c: jmp 0x00007fd204a341ec 0.34% │ ↘││ ││ 0x00007fd204a3421e: mov 0xc(%rdx),%r11d 0.29% │ ││ ││ 0x00007fd204a34222: test %r11d,%r11d │ ││ ╰│ 0x00007fd204a34225: je 0x00007fd204a341f3 1.65% │ ││ │ 0x00007fd204a34227: mov 0x40(%r15),%r10 0.45% │ ││ │ 0x00007fd204a3422b: mov %r11,%rdi 0.32% │ ││ │ 0x00007fd204a3422e: shl $0x3,%rdi 0.27% │ ││ │ 0x00007fd204a34232: test %r10,%r10 │ ││ │ 0x00007fd204a34235: je 0x00007fd204a34287 1.67% │ ││ │ 0x00007fd204a34237: mov 0x50(%r15),%r11 0.12% │ ││ │ 0x00007fd204a3423b: mov %rdi,-0x8(%r11,%r10,1) 7.16% │ ││ │ 0x00007fd204a34240: add $0xfffffffffffffff8,%r10 0.10% │ ││ │ 0x00007fd204a34244: mov %r10,0x40(%r15) ;*putfield dst {reexecute=0 rethrow=0 return_oop=0} │ ││ │ ; - org.openjdk.gcbench.wip.BarriersSATBSlowpath::test@5 (line 86) 0.22% │ ││ ╰ 0x00007fd204a34248: jmp 0x00007fd204a341f3 ↘ │↘ ↗ 0x00007fd204a3424a: mov $0xfffffff6,%esi │ │ 0x00007fd204a3424f: callq 0x00007fd2049a3d80 ; ImmutableOopMap{rbp=NarrowOop } │ │ ;*putfield dst {reexecute=1 rethrow=0 return_oop=0} │ │ ; - org.openjdk.gcbench.wip.BarriersSATBSlowpath::test@5 (line 86) │ │ ; {runtime_call UncommonTrapBlob} │ │ 0x00007fd204a34254: ud2 ↘ ╰ 0x00007fd204a34256: mov -0x8(%rdx),%r8 ; implicit exception: dispatches to 0x00007fd204a3424a 0x00007fd204a3425a: mov %r8,%r10 0x00007fd204a3425d: shr $0x16,%r10 .................................................................................................... 30.66% <total for region 2> ....[Hottest Region 3].............................................................................. c2, org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub, version 137 (47 bytes) 0x00007fd204a2e898: mov -0x8(%rbp),%r8 ; implicit exception: dispatches to 0x00007fd204a2e9aa 0x00007fd204a2e89c: movzbl 0x94(%r8),%r10d ;*getfield isDone {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@27 (line 239) 0x00007fd204a2e8a4: test %r10d,%r10d 0x00007fd204a2e8a7: jne 0x00007fd204a2e93c ;*ifeq {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@30 (line 239) 0x00007fd204a2e8ad: mov $0x1,%ebp 0x00007fd204a2e8b2: data16 nopw 0x0(%rax,%rax,1) 0x00007fd204a2e8bc: data16 data16 xchg %ax,%ax ;*aload_1 {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@33 (line 240) 1.66% ↗ 0x00007fd204a2e8c0: mov %r8,0x28(%rsp) 0.49% │ 0x00007fd204a2e8c5: mov 0x38(%rsp),%rsi 0.01% │ 0x00007fd204a2e8ca: mov 0x30(%rsp),%rdx 0.14% │ 0x00007fd204a2e8cf: callq 0x00007fd2049c4500 ; ImmutableOopMap{[32]=Oop [40]=Oop [48]=Oop [56]=Oop } │ ;*invokevirtual test {reexecute=0 rethrow=0 return_oop=0} │ ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@17 (line 237) │ ; {optimized virtual_call} 11.14% │ 0x00007fd204a2e8d4: mov 0x28(%rsp),%r8 0.44% │ 0x00007fd204a2e8d9: mov -0x8(%r8),%r10 0.89% │ 0x00007fd204a2e8dd: movzbl 0x94(%r10),%r11d ;*getfield isDone {reexecute=0 rethrow=0 return_oop=0} │ ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@27 (line 239) 6.31% │ 0x00007fd204a2e8e5: add $0x1,%rbp ; ImmutableOopMap{r10=Oop r8=Oop [32]=Oop [48]=Oop [56]=Oop } │ ;*ifeq {reexecute=1 rethrow=0 return_oop=0} │ ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@30 (line 239) 0.00% │ 0x00007fd204a2e8e9: test %eax,0x136bb711(%rip) # 0x00007fd2180ea000 │ ; {poll} 0.14% │ 0x00007fd204a2e8ef: test %r11d,%r11d ╰ 0x00007fd204a2e8f2: je 0x00007fd204a2e8c0 ;*aload_1 {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@33 (line 240) 0x00007fd204a2e8f4: movabs $0x7fd21679bad0,%r10 0x00007fd204a2e8fe: callq *%r10 ;*invokestatic nanoTime {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@34 (line 240) 0x00007fd204a2e901: mov 0x20(%rsp),%r10 0x00007fd204a2e906: mov %rbp,0x18(%r10) ;*putfield measuredOps {reexecute=0 rethrow=0 return_oop=0} ; - org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub@49 (line 242) 0x00007fd204a2e90a: mov %rax,0x30(%r10) ;*putfield stopTime {reexecute=0 rethrow=0 return_oop=0} .................................................................................................... 21.23% <total for region 3> ....[Hottest Regions]............................................................................... 41.26% libjvm.so _ZN13SATBMarkQueue6filterEv (199 bytes) 30.66% c2 org.openjdk.gcbench.wip.BarriersSATBSlowpath::test, version 130 (136 bytes) 21.23% c2 org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub, version 137 (47 bytes) 3.23% libjvm.so _ZN21OopOopIterateDispatchI25ShenandoahMarkRefsClosureE5Table15oop_oop_iterateI13InstanceKlassjEEvPS0_P7oopDescP5Klass (379 bytes) 0.92% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (259 bytes) 0.68% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (144 bytes) 0.66% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (81 bytes) 0.35% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (23 bytes) 0.17% libjvm.so _ZN13SATBMarkQueue21should_enqueue_bufferEv (159 bytes) 0.14% libjvm.so _ZNK5Klass15start_of_vtableEv (12 bytes) 0.08% libjvm.so _ZN21OopOopIterateDispatchI25ShenandoahMarkRefsClosureE5Table15oop_oop_iterateI13InstanceKlassjEEvPS0_P7oopDescP5Klass (30 bytes) 0.08% libjvm.so _ZN13SATBMarkQueue6filterEv (22 bytes) 0.06% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (6 bytes) 0.05% libjvm.so _ZN17ShenandoahRuntime25write_ref_field_pre_entryEP7oopDescP10JavaThread (35 bytes) 0.04% libjvm.so _ZN8PtrQueue20enqueue_known_activeEPv (63 bytes) 0.04% [kernel.kallsyms] [unknown] (6 bytes) 0.04% c2 org.openjdk.gcbench.wip.BarriersSATBSlowpath::test, version 130 (16 bytes) 0.03% c2 org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_AverageTime, version 135 (44 bytes) 0.03% c2 org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_AverageTime, version 135 (44 bytes) 0.03% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator (3 bytes) 0.21% <...other 254 warm regions...> .................................................................................................... 100.00% <totals> ....[Hottest Methods (after inlining)].............................................................. 41.37% libjvm.so _ZN13SATBMarkQueue6filterEv 30.70% c2 org.openjdk.gcbench.wip.BarriersSATBSlowpath::test, version 130 21.23% c2 org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_avgt_jmhStub, version 137 3.31% libjvm.so _ZN21OopOopIterateDispatchI25ShenandoahMarkRefsClosureE5Table15oop_oop_iterateI13InstanceKlassjEEvPS0_P7oopDescP5Klass 2.71% libjvm.so _ZN24ShenandoahConcurrentMark14mark_loop_workI25ShenandoahMarkRefsClosureLb1EEEvPT_PtjP22ParallelTaskTerminator 0.17% libjvm.so _ZN13SATBMarkQueue21should_enqueue_bufferEv 0.17% [kernel.kallsyms] [unknown] 0.14% libjvm.so _ZNK5Klass15start_of_vtableEv 0.06% c2 org.openjdk.gcbench.wip.generated.BarriersSATBSlowpath_test_jmhTest::test_AverageTime, version 135 0.06% libjvm.so _ZN8PtrQueue20enqueue_known_activeEPv 0.05% libjvm.so _ZN17ShenandoahRuntime25write_ref_field_pre_entryEP7oopDescP10JavaThread 0.02% ld-2.27.so __tls_get_addr 0.00% libc-2.27.so __memset_avx2_erms 0.00% libjvm.so _ZN15objArrayOopDesc17oop_iterate_rangeI25ShenandoahMarkRefsClosureEEvPT_ii 0.00% libjvm.so __tls_get_addr@plt 0.00% interpreter method entry point (kind = zerolocals) 0.00% libjvm.so _ZN27ShenandoahSATBBufferClosure9do_bufferEPPvm 0.00% libjvm.so _ZN6BitMap17clear_large_rangeEmm 0.00% libjvm.so _ZN17ThreadsListHandleC2EP6Thread 0.00% interpreter invokevirtual 182 invokevirtual 0.01% <...other 27 warm methods...> .................................................................................................... 100.00% <totals> ....[Distribution by Source]........................................................................ 51.99% c2 47.82% libjvm.so 0.17% [kernel.kallsyms] 0.02% ld-2.27.so 0.00% libc-2.27.so 0.00% interpreter 0.00% [vdso] .................................................................................................... 100.00% <totals> # Run complete. Total time: 00:01:02 REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell. Benchmark (size) Mode Cnt Score Error Units BarriersSATBSlowpath.test 40000000 avgt 5 12.278 ± 0.139 ns/op BarriersSATBSlowpath.test:·asm 40000000 avgt NaN ---