src/share/classes/java/util/concurrent/ForkJoinPool.java

Print this page




 172      * that may be stolen by other workers.  Preference rules give
 173      * first priority to processing tasks from their own queues (LIFO
 174      * or FIFO, depending on mode), then to randomized FIFO steals of
 175      * tasks in other queues.
 176      *
 177      * WorkQueues
 178      * ==========
 179      *
 180      * Most operations occur within work-stealing queues (in nested
 181      * class WorkQueue).  These are special forms of Deques that
 182      * support only three of the four possible end-operations -- push,
 183      * pop, and poll (aka steal), under the further constraints that
 184      * push and pop are called only from the owning thread (or, as
 185      * extended here, under a lock), while poll may be called from
 186      * other threads.  (If you are unfamiliar with them, you probably
 187      * want to read Herlihy and Shavit's book "The Art of
 188      * Multiprocessor programming", chapter 16 describing these in
 189      * more detail before proceeding.)  The main work-stealing queue
 190      * design is roughly similar to those in the papers "Dynamic
 191      * Circular Work-Stealing Deque" by Chase and Lev, SPAA 2005
 192      * (http://research.sun.com/scalable/pubs/index.html) and
 193      * "Idempotent work stealing" by Michael, Saraswat, and Vechev,
 194      * PPoPP 2009 (http://portal.acm.org/citation.cfm?id=1504186).
 195      * See also "Correct and Efficient Work-Stealing for Weak Memory
 196      * Models" by Le, Pop, Cohen, and Nardelli, PPoPP 2013
 197      * (http://www.di.ens.fr/~zappa/readings/ppopp13.pdf) for an
 198      * analysis of memory ordering (atomic, volatile etc) issues.  The
 199      * main differences ultimately stem from GC requirements that we
 200      * null out taken slots as soon as we can, to maintain as small a
 201      * footprint as possible even in programs generating huge numbers
 202      * of tasks. To accomplish this, we shift the CAS arbitrating pop
 203      * vs poll (steal) from being on the indices ("base" and "top") to
 204      * the slots themselves.  So, both a successful pop and poll
 205      * mainly entail a CAS of a slot from non-null to null.  Because
 206      * we rely on CASes of references, we do not need tag bits on base
 207      * or top.  They are simple ints as used in any circular
 208      * array-based queue (see for example ArrayDeque).  Updates to the
 209      * indices must still be ordered in a way that guarantees that top
 210      * == base means the queue is empty, but otherwise may err on the
 211      * side of possibly making the queue appear nonempty when a push,
 212      * pop, or poll have not fully committed. Note that this means




 172      * that may be stolen by other workers.  Preference rules give
 173      * first priority to processing tasks from their own queues (LIFO
 174      * or FIFO, depending on mode), then to randomized FIFO steals of
 175      * tasks in other queues.
 176      *
 177      * WorkQueues
 178      * ==========
 179      *
 180      * Most operations occur within work-stealing queues (in nested
 181      * class WorkQueue).  These are special forms of Deques that
 182      * support only three of the four possible end-operations -- push,
 183      * pop, and poll (aka steal), under the further constraints that
 184      * push and pop are called only from the owning thread (or, as
 185      * extended here, under a lock), while poll may be called from
 186      * other threads.  (If you are unfamiliar with them, you probably
 187      * want to read Herlihy and Shavit's book "The Art of
 188      * Multiprocessor programming", chapter 16 describing these in
 189      * more detail before proceeding.)  The main work-stealing queue
 190      * design is roughly similar to those in the papers "Dynamic
 191      * Circular Work-Stealing Deque" by Chase and Lev, SPAA 2005
 192      * (http://labs.oracle.com/pls/apex/f?p=labs:10:0) and
 193      * "Idempotent work stealing" by Michael, Saraswat, and Vechev,
 194      * PPoPP 2009 (http://portal.acm.org/citation.cfm?id=1504186).
 195      * See also "Correct and Efficient Work-Stealing for Weak Memory
 196      * Models" by Le, Pop, Cohen, and Nardelli, PPoPP 2013
 197      * (http://www.di.ens.fr/~zappa/readings/ppopp13.pdf) for an
 198      * analysis of memory ordering (atomic, volatile etc) issues.  The
 199      * main differences ultimately stem from GC requirements that we
 200      * null out taken slots as soon as we can, to maintain as small a
 201      * footprint as possible even in programs generating huge numbers
 202      * of tasks. To accomplish this, we shift the CAS arbitrating pop
 203      * vs poll (steal) from being on the indices ("base" and "top") to
 204      * the slots themselves.  So, both a successful pop and poll
 205      * mainly entail a CAS of a slot from non-null to null.  Because
 206      * we rely on CASes of references, we do not need tag bits on base
 207      * or top.  They are simple ints as used in any circular
 208      * array-based queue (see for example ArrayDeque).  Updates to the
 209      * indices must still be ordered in a way that guarantees that top
 210      * == base means the queue is empty, but otherwise may err on the
 211      * side of possibly making the queue appear nonempty when a push,
 212      * pop, or poll have not fully committed. Note that this means