rev 56634 : imported patch 8230876.patch
rev 56635 : v2.00 -> v2.05 (CR5/v2.05/8-for-jdk13) patches combined into one; merge with 8229212.patch; merge with jdk-14+11; merge with 8230184.patch; merge with 8230876.patch; merge with jdk-14+15; merge with jdk-14+18.
rev 56637 : Add OM_CACHE_LINE_SIZE so that ObjectMonitor cache line sizes can be experimented with independently of DEFAULT_CACHE_LINE_SIZE; for SPARC and X64 configs that use 128 for DEFAULT_CACHE_LINE_SIZE, we are experimenting with 64; move _previous_owner_tid and _allocation_state fields to share the cache line with ObjectMonitor::_header; put ObjectMonitor::_ref_count on its own cache line after _owner; add 'int* count_p' parameter to deflate_monitor_list() and deflate_monitor_list_using_JT() and push counter updates down to where the ObjectMonitors are actually removed from the in-use lists; monitors_iterate() async deflation check should use negative ref_count; add 'JavaThread* target' param to deflate_per_thread_idle_monitors_using_JT() and add deflate_common_idle_monitors_using_JT() to make it clear which JavaThread* is the target of the work and which is the calling JavaThread* (self); g_free_list, g_om_in_use_list and g_om_in_use_count are now static to synchronizer.cpp (reduce scope); add more diagnostic info to some assert()'s; minor code cleanups and code motion; save_om_ptr() should detect a race with a deflating thread that is bailing out and cause a retry when the ref_count field is not positive; merge with jdk-14+11; add special GC support for TestHumongousClassLoader.java; merge with 8230184.patch; merge with jdk-14+14; merge with jdk-14+18.
rev 56638 : Merge the remainder of the lock-free monitor list changes from v2.06 with v2.06a and v2.06b after running the changes through the edit scripts; merge pieces from dcubed.monitor_deflate_conc.v2.06d in dcubed.monitor_deflate_conc.v2.06[ac]; merge pieces from dcubed.monitor_deflate_conc.v2.06e into dcubed.monitor_deflate_conc.v2.06c; merge with jdk-14+11; test workaround for test/jdk/tools/jlink/multireleasejar/JLinkMultiReleaseJarTest.java should not be needed anymore; merge with jdk-14+18.
rev 56639 : loosen a couple more counter checks due to races observed in testing; simplify om_release() extraction of mid since list head or cur_mid_in_use is marked; simplify deflate_monitor_list() extraction of mid since there are no parallel deleters due to the safepoint; simplify deflate_monitor_list_using_JT() extraction of mid since list head or cur_mid_in_use is marked; prepend_block_to_lists() - simplify based on David H's comments; does not need load_acquire() or release_store() because of the cmpxchg(); prepend_to_common() - simplify to use mark_next_loop() for m and use mark_list_head() and release_store() for the non-empty list case; add more debugging for "Non-balanced monitor enter/exit" failure mode; fix race in inflate() in the "CASE: neutral" code path; install_displaced_markword_in_object() does not need to clear the header field since that is handled when the ObjectMonitor is moved from the global free list; LSuccess should clear boxReg to set ICC.ZF=1 to avoid depending on existing boxReg contents; update fast_unlock() to detect when object no longer refers to the same ObjectMonitor and take fast path exit instead; clarify fast_lock() code where we detect when object no longer refers to the same ObjectMonitor; add/update comments for movptr() calls where we move a literal into an Address; remove set_owner(); refactor setting of owner field into set_owner_from(2 versions), set_owner_from_BasicLock(), and try_set_owner_from(); the new functions include monitorinflation+owner logging; extract debug code from v2.06 and v2.07 and move to v2.07.debug; change 'jccb' -> 'jcc' and 'jmpb' -> 'jmp' as needed; checkpoint initial version of MacroAssembler::inc_om_ref_count(); update LP64 MacroAssembler::fast_lock() and fast_unlock() to use inc_om_ref_count(); fast_lock() return flag setting logic can use 'testptr(tmpReg, tmpReg)' instead of 'cmpptr(tmpReg, 0)' since that's more efficient; fast_unlock() LSuccess return flag setting logic can use 'testl (boxReg, 0)' instead of 'xorptr(boxReg, boxReg)' since that's more efficient; clean up "fast-path" vs "fast path" and "slow-path" vs "slow path"; update MacroAssembler::rtm_inflated_locking() to use inc_om_ref_count(); update MacroAssembler::fast_lock() to preserve the flags before decrementing ref_count and restore the flags afterwards; this is cleaner than depending on the contents of rax/tmpReg; coleenp CR - refactor async monitor deflation work from ServiceThread::service_thread_entry() to ObjectSynchronizer::deflate_idle_monitors_using_JT(); rehn,eosterlund CR - add support for HandshakeAfterDeflateIdleMonitors for platforms that don't have ObjectMonitor ref_count support implemented in C2 fast_lock() and fast_unlock().

   1 /*
   2  * Copyright (c) 1998, 2019, Oracle and/or its affiliates. All rights reserved.
   3  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
   4  *
   5  * This code is free software; you can redistribute it and/or modify it
   6  * under the terms of the GNU General Public License version 2 only, as
   7  * published by the Free Software Foundation.
   8  *
   9  * This code is distributed in the hope that it will be useful, but WITHOUT
  10  * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
  11  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
  12  * version 2 for more details (a copy is included in the LICENSE file that
  13  * accompanied this code).
  14  *
  15  * You should have received a copy of the GNU General Public License version
  16  * 2 along with this work; if not, write to the Free Software Foundation,
  17  * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
  18  *
  19  * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
  20  * or visit www.oracle.com if you need additional information or have any
  21  * questions.
  22  *
  23  */
  24 
  25 #ifndef SHARE_RUNTIME_OBJECTMONITOR_HPP
  26 #define SHARE_RUNTIME_OBJECTMONITOR_HPP
  27 
  28 #include "memory/allocation.hpp"
  29 #include "memory/padded.hpp"
  30 #include "oops/markWord.hpp"
  31 #include "runtime/os.hpp"
  32 #include "runtime/park.hpp"
  33 #include "runtime/perfData.hpp"
  34 
  35 class ObjectMonitor;
  36 
  37 // ObjectWaiter serves as a "proxy" or surrogate thread.
  38 // TODO-FIXME: Eliminate ObjectWaiter and use the thread-specific
  39 // ParkEvent instead.  Beware, however, that the JVMTI code
  40 // knows about ObjectWaiters, so we'll have to reconcile that code.
  41 // See next_waiter(), first_waiter(), etc.
  42 
  43 class ObjectWaiter : public StackObj {
  44  public:
  45   enum TStates { TS_UNDEF, TS_READY, TS_RUN, TS_WAIT, TS_ENTER, TS_CXQ };
  46   ObjectWaiter* volatile _next;
  47   ObjectWaiter* volatile _prev;
  48   Thread*       _thread;
  49   jlong         _notifier_tid;
  50   ParkEvent *   _event;
  51   volatile int  _notified;
  52   volatile TStates TState;
  53   bool          _active;           // Contention monitoring is enabled
  54  public:
  55   ObjectWaiter(Thread* thread);
  56 
  57   void wait_reenter_begin(ObjectMonitor *mon);
  58   void wait_reenter_end(ObjectMonitor *mon);
  59 };
  60 
  61 // The ObjectMonitor class implements the heavyweight version of a
  62 // JavaMonitor. The lightweight BasicLock/stack lock version has been
  63 // inflated into an ObjectMonitor. This inflation is typically due to
  64 // contention or use of Object.wait().
  65 //
  66 // WARNING: This is a very sensitive and fragile class. DO NOT make any
  67 // changes unless you are fully aware of the underlying semantics.
  68 //
  69 // ObjectMonitor Layout Overview/Highlights/Restrictions:
  70 //
  71 // - The _header field must be at offset 0 because the displaced header
  72 //   from markWord is stored there. We do not want markWord.hpp to include
  73 //   ObjectMonitor.hpp to avoid exposing ObjectMonitor everywhere. This
  74 //   means that ObjectMonitor cannot inherit from any other class nor can
  75 //   it use any virtual member functions. This restriction is critical to
  76 //   the proper functioning of the VM.
  77 // - The _header and _owner fields should be separated by enough space
  78 //   to avoid false sharing due to parallel access by different threads.
  79 //   This is an advisory recommendation.
  80 // - The general layout of the fields in ObjectMonitor is:
  81 //     _header
  82 //     <lightly_used_fields>
  83 //     <optional padding>
  84 //     _owner
  85 //     <remaining_fields>
  86 // - The VM assumes write ordering and machine word alignment with
  87 //   respect to the _owner field and the <remaining_fields> that can
  88 //   be read in parallel by other threads.
  89 // - Generally fields that are accessed closely together in time should
  90 //   be placed proximally in space to promote data cache locality. That
  91 //   is, temporal locality should condition spatial locality.
  92 // - We have to balance avoiding false sharing with excessive invalidation
  93 //   from coherence traffic. As such, we try to cluster fields that tend
  94 //   to be _written_ at approximately the same time onto the same data
  95 //   cache line.
  96 // - We also have to balance the natural tension between minimizing
  97 //   single-threaded capacity misses and minimizing multi-threaded
  98 //   coherency misses. There is no single optimal layout for both
  99 //   single-threaded and multi-threaded environments.
 100 //
 101 // - See TEST_VM(ObjectMonitor, sanity) gtest for how critical restrictions are
 102 //   enforced.
 103 // - Adjacent ObjectMonitors should be separated by enough space to avoid
 104 //   false sharing. This is handled by the ObjectMonitor allocation code
 105 //   in synchronizer.cpp. Also see TEST_VM(SynchronizerTest, sanity) gtest.
 106 //
 107 // Future notes:
 108 //   - Separating _owner from the <remaining_fields> by enough space to
 109 //     avoid false sharing might be profitable. Given
 110 //     http://blogs.oracle.com/dave/entry/cas_and_cache_trivia_invalidate
 111 //     we know that the CAS in monitorenter will invalidate the line
 112 //     underlying _owner. We want to avoid an L1 data cache miss on that
 113 //     same line for monitorexit. Putting these <remaining_fields>:
 114 //     _recursions, _EntryList, _cxq, and _succ, all of which may be
 115 //     fetched in the inflated unlock path, on a different cache line
 116 //     would make them immune to CAS-based invalidation from the _owner
 117 //     field.
 118 //
 119 //   - The _recursions field should be of type int, or int32_t but not
 120 //     intptr_t. There's no reason to use a 64-bit type for this field
 121 //     in a 64-bit JVM.
 122 
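The layout rules in the comment above (header at offset 0, padding between _header and _owner) can be illustrated with a small standalone sketch. This is not HotSpot code: PaddedMonitorSketch, kCacheLineSize and cache_line_index() are hypothetical names introduced for illustration, and 64 bytes is an assumed cache line size.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size for illustration; real values are platform-specific.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical stand-in for the leading fields of ObjectMonitor.
struct PaddedMonitorSketch {
  std::uintptr_t       header;  // stays at offset 0
  void*                object;  // rarely written, shares the header's line
  PaddedMonitorSketch* next;    // list linkage, also on the header's line
  // Fill the rest of the first cache line. This mirrors the idea behind
  // DEFINE_PAD_MINUS_SIZE(0, <line size>, <bytes already used>).
  char pad0[kCacheLineSize -
            (sizeof(std::uintptr_t) + sizeof(void*) +
             sizeof(PaddedMonitorSketch*))];
  void*                owner;       // contended field, starts the next line
  std::intptr_t        recursions;  // accessed together with owner
};

// Index of the cache line, relative to the start of the object, that holds
// the byte at the given offset.
constexpr std::size_t cache_line_index(std::size_t offset) {
  return offset / kCacheLineSize;
}

static_assert(offsetof(PaddedMonitorSketch, header) == 0,
              "header must be at offset 0");
static_assert(offsetof(PaddedMonitorSketch, owner) == kCacheLineSize,
              "owner must start exactly one cache line after header");
```

Because header and owner are a full line apart in this sketch, they cannot land on the same 64-byte line regardless of how the object itself is aligned.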
 123 class ObjectMonitor {
 124   friend class ObjectSynchronizer;
 125   friend class ObjectWaiter;
 126   friend class VMStructs;
 127   JVMCI_ONLY(friend class JVMCIVMStructs;)
 128 
 129   // The sync code expects the header field to be at offset zero (0).
 130   // Enforced by the assert() in header_addr().
 131   volatile markWord _header;        // displaced object header word - mark
 132   void* volatile _object;           // backward object pointer - strong root
 133  public:
 134   ObjectMonitor* _next_om;          // Next ObjectMonitor* linkage
 135  private:
 136   // Separate _header and _owner on different cache lines since both can
 137   // have busy multi-threaded access. _header and _object are set at
 138   // initial inflation and _object doesn't change until deflation so
 139   // _object is a good choice to share the cache line with _header.
 140   // _next_om shares _header's cache line for pre-monitor list historical
 141   // reasons. _next_om only changes if the next ObjectMonitor is deflated.
 142   DEFINE_PAD_MINUS_SIZE(0, DEFAULT_CACHE_LINE_SIZE,
 143                         sizeof(volatile markWord) + sizeof(void* volatile) +
 144                         sizeof(ObjectMonitor *));
 145   void* volatile _owner;            // pointer to owning thread OR BasicLock
 146   volatile jlong _previous_owner_tid;  // thread id of the previous owner of the monitor
 147   volatile intptr_t _recursions;    // recursion count, 0 for first entry
 148   ObjectWaiter* volatile _EntryList;  // Threads blocked on entry or reentry.
 149                                       // The list is actually composed of ObjectWaiters,
 150                                       // acting as proxies for Threads.
 151 
 152   ObjectWaiter* volatile _cxq;      // LL of recently-arrived threads blocked on entry.
 153   Thread* volatile _succ;           // Heir presumptive thread - used for futile wakeup throttling
 154   Thread* volatile _Responsible;
 155 
 156   volatile int _Spinner;            // for exit->spinner handoff optimization
 157   volatile int _SpinDuration;
 158 
 159   volatile jint  _contentions;      // Number of active contentions in enter(). It is used by is_busy()
 160                                     // along with other fields to determine if an ObjectMonitor can be
 161                                     // deflated. See ObjectSynchronizer::deflate_monitor().
 162  protected:
 163   ObjectWaiter* volatile _WaitSet;  // LL of threads wait()ing on the monitor
 164   volatile jint  _waiters;          // number of waiting threads
 165  private:
 166   volatile int _WaitSetLock;        // protects Wait Queue - simple spinlock
 167 
 168  public:
 169   static void Initialize();
 170 
 171   // Only perform a PerfData operation if the PerfData object has been
 172   // allocated and if the PerfDataManager has not freed the PerfData
 173   // objects, which can happen at normal VM shutdown.
 174   //
 175   #define OM_PERFDATA_OP(f, op_str)              \
 176     do {                                         \
 177       if (ObjectMonitor::_sync_ ## f != NULL &&  \
 178           PerfDataManager::has_PerfData()) {     \
 179         ObjectMonitor::_sync_ ## f->op_str;      \
 180       }                                          \
 181     } while (0)
 182 
 183   static PerfCounter * _sync_ContendedLockAttempts;
 184   static PerfCounter * _sync_FutileWakeups;
 185   static PerfCounter * _sync_Parks;
 186   static PerfCounter * _sync_Notifications;
 187   static PerfCounter * _sync_Inflations;
 188   static PerfCounter * _sync_Deflations;
 189   static PerfLongVariable * _sync_MonExtant;
 190 
 191   static int Knob_SpinLimit;
 192 
 193   void* operator new (size_t size) throw();
 194   void* operator new[] (size_t size) throw();
 195   void operator delete(void* p);
 196   void operator delete[] (void* p);
 197 
 198   // TODO-FIXME: the "offset" routines should return a type of off_t instead of int ...
 199   // ByteSize would also be an appropriate type.
 200   static int header_offset_in_bytes()      { return offset_of(ObjectMonitor, _header); }
 201   static int object_offset_in_bytes()      { return offset_of(ObjectMonitor, _object); }
 202   static int owner_offset_in_bytes()       { return offset_of(ObjectMonitor, _owner); }
 203   static int recursions_offset_in_bytes()  { return offset_of(ObjectMonitor, _recursions); }
 204   static int cxq_offset_in_bytes()         { return offset_of(ObjectMonitor, _cxq); }
 205   static int succ_offset_in_bytes()        { return offset_of(ObjectMonitor, _succ); }
 206   static int EntryList_offset_in_bytes()   { return offset_of(ObjectMonitor, _EntryList); }
 207 
 208   // ObjectMonitor references can be ORed with markWord::monitor_value
 209   // as part of the ObjectMonitor tagging mechanism. When we combine an
 210   // ObjectMonitor reference with an offset, we need to remove the tag
 211   // value in order to generate the proper address.
 212   //
 213   // We can either adjust the ObjectMonitor reference and then add the
 214   // offset or we can adjust the offset that is added to the ObjectMonitor
 215   // reference. The latter avoids an AGI (Address Generation Interlock)
 216   // stall so the helper macro adjusts the offset value that is returned
 217   // to the ObjectMonitor reference manipulation code:
 218   //
 219   #define OM_OFFSET_NO_MONITOR_VALUE_TAG(f) \
 220     ((ObjectMonitor::f ## _offset_in_bytes()) - markWord::monitor_value)
 221 
 222   markWord           header() const;
 223   volatile markWord* header_addr();
 224   void               set_header(markWord hdr);
 225 
 226   intptr_t is_busy() const {
 227     // TODO-FIXME: assert that _owner == NULL implies _recursions == 0
 228     return _contentions|_waiters|intptr_t(_owner)|intptr_t(_cxq)|intptr_t(_EntryList);
 229   }
 230   const char* is_busy_to_string(stringStream* ss);
 231 
 232   intptr_t  is_entered(Thread* current) const;
 233 
 234   void*     owner() const;
 235   void      set_owner(void* owner);
 236 
 237   jint      waiters() const;
 238 
 239   jint      contentions() const;
 240   intptr_t  recursions() const                                         { return _recursions; }
 241 
 242   // JVM/TI GetObjectMonitorUsage() needs this:
 243   ObjectWaiter* first_waiter()                                         { return _WaitSet; }
 244   ObjectWaiter* next_waiter(ObjectWaiter* o)                           { return o->_next; }
 245   Thread* thread_of_waiter(ObjectWaiter* o)                            { return o->_thread; }
 246 
 247  protected:
 248   // We don't typically expect or want the ctors or dtors to run.
 249   // Normal ObjectMonitors are type-stable and immortal.
 250   ObjectMonitor() { ::memset((void*)this, 0, sizeof(*this)); }
 251 
 252   ~ObjectMonitor() {
 253     // TODO: Add asserts ...
 254     // _cxq == 0 _succ == NULL _owner == NULL _waiters == 0
 255     // _contentions == 0 _EntryList  == NULL etc
 256   }
 257 
 258  private:
 259   void Recycle() {
 260     // TODO: add stronger asserts ...
 261     // _cxq == 0 _succ == NULL _owner == NULL _waiters == 0
 262     // _contentions == 0 _EntryList == NULL
 263     // _recursions == 0 _WaitSet == NULL
 264     DEBUG_ONLY(stringStream ss;)
 265     assert((is_busy() | _recursions) == 0, "freeing in-use monitor: %s, "
 266            "recursions=" INTPTR_FORMAT, is_busy_to_string(&ss), _recursions);
 267     _succ          = NULL;
 268     _EntryList     = NULL;
 269     _cxq           = NULL;
 270     _WaitSet       = NULL;
 271     _recursions    = 0;
 272   }
 273 
 274  public:
 275 
 276   void*     object() const;
 277   void*     object_addr();
 278   void      set_object(void* obj);
 279 
 280   // Returns true if the specified thread owns the ObjectMonitor. Otherwise
 281   // returns false and throws IllegalMonitorStateException (IMSE).
 282   bool      check_owner(Thread* THREAD);
 283   void      clear();
 284 
 285   void      enter(TRAPS);
 286   void      exit(bool not_suspended, TRAPS);
 287   void      wait(jlong millis, bool interruptable, TRAPS);
 288   void      notify(TRAPS);
 289   void      notifyAll(TRAPS);
 290 
 291   void      print() const;
 292   void      print_on(outputStream* st) const;
 293 
 294 // Use the following at your own risk
 295   intptr_t  complete_exit(TRAPS);
 296   void      reenter(intptr_t recursions, TRAPS);
 297 
 298  private:
 299   void      AddWaiter(ObjectWaiter* waiter);
 300   void      INotify(Thread* self);
 301   ObjectWaiter* DequeueWaiter();
 302   void      DequeueSpecificWaiter(ObjectWaiter* waiter);
 303   void      EnterI(TRAPS);
 304   void      ReenterI(Thread* self, ObjectWaiter* self_node);
 305   void      UnlinkAfterAcquire(Thread* self, ObjectWaiter* self_node);
 306   int       TryLock(Thread* self);
 307   int       NotRunnable(Thread* self, Thread * Owner);
 308   int       TrySpin(Thread* self);
 309   void      ExitEpilog(Thread* self, ObjectWaiter* Wakee);
 310   bool      ExitSuspendEquivalent(JavaThread* self);
 311 };
 312 
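The comment on OM_OFFSET_NO_MONITOR_VALUE_TAG in the class above describes two ways to reach a field through a tagged ObjectMonitor reference. The sketch below shows both side by side. It is a simplified illustration, not the VM's code: MonitorSketch, kMonitorTag and the helper functions are hypothetical, and the tag value 2 is an assumption standing in for markWord::monitor_value. It also relies on the usual implementation-defined round trip between pointers and std::uintptr_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed tag value for illustration; stands in for markWord::monitor_value.
constexpr std::uintptr_t kMonitorTag = 2;

// Hypothetical stand-in for ObjectMonitor with only the fields needed here.
struct alignas(8) MonitorSketch {
  std::uintptr_t header;
  void*          object;
  void*          owner;
};

// Tag a monitor reference by ORing the tag into its (zero) low bits.
inline std::uintptr_t tag_monitor(MonitorSketch* m) {
  std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(m);
  assert((raw & kMonitorTag) == 0 && "alignment must leave the tag bit clear");
  return raw | kMonitorTag;
}

// Option 1: strip the tag from the reference, then add the field offset.
inline void** owner_addr_strip_tag(std::uintptr_t tagged) {
  std::uintptr_t base = tagged & ~kMonitorTag;
  return reinterpret_cast<void**>(base + offsetof(MonitorSketch, owner));
}

// Option 2: leave the reference tagged and fold the tag into the offset,
// which is the approach the OM_OFFSET_NO_MONITOR_VALUE_TAG comment describes.
inline void** owner_addr_adjusted_offset(std::uintptr_t tagged) {
  constexpr std::size_t adjusted = offsetof(MonitorSketch, owner) - kMonitorTag;
  return reinterpret_cast<void**>(tagged + adjusted);
}
```

Both helpers yield the same address; the second needs no separate masking step on the reference, only a constant offset computed ahead of time.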
 313 #endif // SHARE_RUNTIME_OBJECTMONITOR_HPP
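The is_busy() test in the class above ORs counters and pointers into one word, so a single comparison against zero answers "is anything using this monitor?". A minimal sketch of that idea follows, assuming a hypothetical BusySketch struct that holds only the inspected fields; assert_idle() loosely mirrors the check in Recycle() and is likewise illustrative. The sketch assumes that a null pointer converts to integer 0, which holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in holding only the fields that is_busy() inspects.
struct BusySketch {
  int   contentions;
  int   waiters;
  void* owner;
  void* cxq;
  void* entry_list;
};

// Non-zero if any counter is non-zero or any pointer is non-NULL.
inline std::intptr_t is_busy(const BusySketch& m) {
  return static_cast<std::intptr_t>(m.contentions) |
         static_cast<std::intptr_t>(m.waiters) |
         reinterpret_cast<std::intptr_t>(m.owner) |
         reinterpret_cast<std::intptr_t>(m.cxq) |
         reinterpret_cast<std::intptr_t>(m.entry_list);
}

// Debug-only guard in the spirit of Recycle(): refuse to reuse a busy monitor.
inline void assert_idle(const BusySketch& m) {
  assert(is_busy(m) == 0 && "freeing in-use monitor");
}
```

The result is only meaningful as zero versus non-zero; the combined bits do not identify which field was set.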
--- EOF ---