


Up:  One-Sided Communications
Next:  Atomicity
Previous:  Error Classes
  
 
The semantics of RMA operations is best understood by assuming that the
system maintains a separate public copy of each window, in addition to
the original location in process memory (the private window copy).
There is only one instance of each variable in process memory, but a
distinct public copy of the variable for each window that contains it.
A load accesses the instance in process memory (this includes MPI
sends). A store accesses and updates the instance in process memory
(this includes MPI receives), but the update may affect other public
copies of the same locations. A get on a window accesses the public
copy of that window. A put or accumulate on a window accesses and
updates the public copy of that window, but the update may affect the
private copy of the same locations in process memory, and public copies
of other overlapping windows. This is illustrated in Figure 5.

  
 
Figure 5: Schematic description of window 
  
  
The following rules specify the latest time at which an operation must
complete at the origin or the target. The update performed by a get
call in the origin process memory is visible when the get operation is
complete at the origin (or earlier); the update performed by a put or
accumulate call in the public copy of the target window is visible when
the put or accumulate has completed at the target (or earlier). The
rules also specify the latest time at which an update of one window
copy becomes visible in another overlapping copy.
 
 
 
1. An  RMA  operation is completed at the origin  
by the ensuing call to  
 MPI_WIN_COMPLETE, MPI_WIN_FENCE or  
 MPI_WIN_UNLOCK  
that synchronizes this access at the origin.  
 
 
2. If an  RMA operation is completed at the origin by a call to  
 MPI_WIN_FENCE  
then the operation is completed at the target  by the  
matching call to  MPI_WIN_FENCE by the target process.  
 
 
3. If an  RMA operation is completed at the origin  
by a call to  MPI_WIN_COMPLETE  
then the operation is completed at the target  by the  
matching call to   MPI_WIN_WAIT by the target process.  
 
 
4. If an  RMA operation is completed at the origin by a call to  
 MPI_WIN_UNLOCK  
then the operation is completed at the target by  
that same call to  MPI_WIN_UNLOCK.  
 
 
5. An update of a location in a private window copy in process memory
becomes visible in the public window copy at the latest when an ensuing
call to MPI_WIN_POST, MPI_WIN_FENCE, or MPI_WIN_UNLOCK is executed on
that window by the window owner.
 
 
6. An update by a put or accumulate call to a public window copy
becomes visible in the private copy in process memory at the latest
when an ensuing call to MPI_WIN_WAIT, MPI_WIN_FENCE, or MPI_WIN_LOCK is
executed on that window by the window owner.

The MPI_WIN_FENCE or MPI_WIN_WAIT call that completes the transfer from
public copy to private copy (6) is the same call that completes the put
or accumulate operation in the window copy (2, 3). If a put or
accumulate access was synchronized with a lock, then the update of the
public window copy is complete as soon as the updating process has
executed MPI_WIN_UNLOCK. On the other hand, the update of the private
copy in process memory may be delayed until the target process executes
a synchronization call on that window (6). Thus, updates to process
memory can always be delayed until the process executes a suitable
synchronization call. Updates to a public window copy can also be
delayed until the window owner executes a synchronization call, if
fence or post-start-complete-wait synchronization is used. Only when
lock synchronization is used does it become necessary to update the
public window copy even if the window owner does not execute any
related synchronization call.

The rules above also define, by implication, when an update to a
public window copy becomes visible in another overlapping public
window copy. Consider, for example, two overlapping windows, win1 and
win2. A call to MPI_WIN_FENCE(0, win1) by the window owner makes
previous updates to window win1 by remote processes visible in process
memory. A subsequent call to MPI_WIN_FENCE(0, win2) makes these updates
visible in the public copy of win2.
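
The public/private copy model and the win1/win2 example above can be
sketched as a small single-process simulation in plain C. This is only
an illustrative model, not MPI code: ModelWin, model_win_init,
model_put, model_get and model_fence are hypothetical names introduced
here, and the model assumes that updates propagate at the latest time
rules 5 and 6 allow. A real implementation may make them visible
earlier.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX 8  /* capacity of each modeled window, in ints */

/* Hypothetical model of one window. The private copy is process memory
   itself (base[0..len-1]); pub[] is the separate public copy. */
typedef struct {
    int *base;
    int  len;
    int  pub[MODEL_MAX];
    bool remote_dirty[MODEL_MAX]; /* put landed in pub[], not yet in memory */
} ModelWin;

/* Expose len ints starting at base; both copies start out identical. */
void model_win_init(ModelWin *w, int *base, int len) {
    assert(len >= 0 && len <= MODEL_MAX);
    w->base = base;
    w->len = len;
    for (int i = 0; i < len; i++) {
        w->pub[i] = base[i];
        w->remote_dirty[i] = false;
    }
}

/* A put updates only the public copy of the window it targets. */
void model_put(ModelWin *w, int disp, int value) {
    assert(disp >= 0 && disp < w->len);
    w->pub[disp] = value;
    w->remote_dirty[disp] = true;
}

/* A get reads the public copy of the window it targets. */
int model_get(const ModelWin *w, int disp) {
    assert(disp >= 0 && disp < w->len);
    return w->pub[disp];
}

/* Synchronization by the window owner, modeled at the latest time the
   rules allow: rule 6 moves remote updates into process memory, then
   rule 5 refreshes the public copy from process memory. */
void model_fence(ModelWin *w) {
    for (int i = 0; i < w->len; i++) {
        if (w->remote_dirty[i]) {
            w->base[i] = w->pub[i];
            w->remote_dirty[i] = false;
        }
        w->pub[i] = w->base[i];
    }
}
```

With win1 covering mem[0..5] and win2 covering mem[4..7] of the same
array, a model_put of 42 to displacement 4 of win1 is at first visible
only in the public copy of win1. After model_fence on win1 it is visible
in mem[4], and only after a subsequent model_fence on win2 does
model_get on displacement 0 of win2 return it.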
 
A correct program must obey the following rules.  
 
 
 
1. A location in a window must not be accessed locally once an update to  
that location has started, until the update becomes visible in the  
private window copy in process  
memory.  
 
 
2. A location in a window must not  be accessed as a target of an  RMA  
operation once an update to that location has started, until the  
update becomes visible in the public window copy.  There is one  
exception to this rule, in the case where the same variable is updated  
by two concurrent accumulates that use the same operation,  
with the same predefined datatype, on the same   
window.  
 
 
3. A put or accumulate must not access a target window once a local
update or a put or accumulate update to another (overlapping) target
window has started on a location in the target window, until the update
becomes visible in the public copy of the window. Conversely, a local
update in process memory to a location in a window must not start once
a put or accumulate update to that target window has started, until the
put or accumulate update becomes visible in process memory. In both
cases, the restriction applies to operations even if they access
disjoint locations in the window.

A program is erroneous if it violates these rules.
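
The third rule can be sketched as a per-window check that deliberately
ignores which locations are touched. The code below is a hypothetical
bookkeeping model, not part of MPI: WinEpoch, epoch_sync,
epoch_local_store and epoch_put_or_accumulate are names invented for
this illustration, and epoch_sync stands for any synchronization call
after which all pending updates are visible in both copies. The model
does not attempt to check rules 1 and 2, which depend on individual
locations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-window record of what has started since the last
   synchronization call that made all pending updates visible. */
typedef struct {
    bool local_update;   /* a local store to the window has started */
    bool remote_update;  /* a put or accumulate to the window has started */
} WinEpoch;

/* All pending updates are now visible in both copies. */
void epoch_sync(WinEpoch *e) {
    assert(e != NULL);
    e->local_update = false;
    e->remote_update = false;
}

/* Returns false if a local store would violate rule 3. */
bool epoch_local_store(WinEpoch *e) {
    assert(e != NULL);
    if (e->remote_update) return false;
    e->local_update = true;
    return true;
}

/* Returns false if a put or accumulate would violate rule 3. */
bool epoch_put_or_accumulate(WinEpoch *e) {
    assert(e != NULL);
    if (e->local_update) return false;
    e->remote_update = true;
    return true;
}
```

Because no location is recorded, a local store followed by a put to the
same window is rejected even when the two touch disjoint locations,
which is exactly the restriction stated in the rule.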
 
 
 Rationale.  
 
The last constraint on correct  RMA accesses may seem unduly  
restrictive, as it forbids concurrent accesses to nonoverlapping  
locations in a window.  The reason for this constraint is that, on  
some architectures, explicit coherence restoring operations may be  
needed at synchronization points.  
A different operation may be needed for locations that were  
locally updated by stores and for locations that were remotely  
updated by put or accumulate operations. Without this constraint,
the MPI library would have to track precisely which locations in a
window were updated by a put or accumulate call. The additional
overhead of maintaining such information is considered prohibitive.
 ( End of rationale.) 
 
 
 
 Advice to users.  
 
A user can write correct programs by adhering to the following rules:
 
 
- fence:
  During each period between fence calls, each window is either
  updated by put or accumulate calls, or updated by local stores, but
  not both. Locations updated by put or accumulate calls should not be
  accessed during the same period (with the exception of concurrent
  updates to the same location by accumulate calls). Locations
  accessed by get calls should not be updated during the same period.
 
- post-start-complete-wait:
  A window should not be updated locally while being posted, if it is
  being updated by put or accumulate calls. Locations updated by put
  or accumulate calls should not be accessed while the window is
  posted (with the exception of concurrent updates to the same
  location by accumulate calls). Locations accessed by get calls
  should not be updated while the window is posted.

  With the post-start synchronization, the target process can tell
  the origin process that its window is now ready for RMA access; with
  the complete-wait synchronization, the origin process can tell the
  target process that it has finished its RMA accesses to the window.
  
 
 
- lock:
  Updates to the window are protected by exclusive locks if they may
  conflict. Nonconflicting accesses (such as read-only accesses or
  accumulate accesses) are protected by shared locks, both for local
  accesses and for RMA accesses.
  
 
- changing window or synchronization mode:
  One can change synchronization mode, or change the window used to
  access a location that belongs to two overlapping windows, when the
  process memory and the window copy are guaranteed to have the same
  values. This is true after a local call to MPI_WIN_FENCE, if RMA
  accesses to the window are synchronized with fences; after a local
  call to MPI_WIN_WAIT, if the accesses are synchronized with
  post-start-complete-wait; or after the call at the origin (local or
  remote) to MPI_WIN_UNLOCK, if the accesses are synchronized with
  locks.
 
In addition, a process should not access the local buffer of a  
get operation until the operation is complete, and should not update  
the local buffer of a put or accumulate operation until that operation  
is complete.   
 ( End of advice to users.) 
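
The lock advice above relies on the usual compatibility between shared
and exclusive locks: any number of shared holders may coexist, while an
exclusive holder excludes all others. A minimal sketch of that
compatibility follows, assuming a non-blocking "try" interface purely
for illustration. ModelWinLock, model_lock_init, model_try_lock and
model_unlock are hypothetical names and not MPI calls. A real program
would pass MPI_LOCK_SHARED or MPI_LOCK_EXCLUSIVE to MPI_WIN_LOCK, which
does not report a failed attempt in this way.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { MODEL_LOCK_SHARED, MODEL_LOCK_EXCLUSIVE } ModelLockType;

/* Hypothetical lock state of one window at its target process. */
typedef struct {
    int  shared_holders;  /* number of current shared-lock holders */
    bool exclusive_held;  /* true while one exclusive holder exists */
} ModelWinLock;

void model_lock_init(ModelWinLock *l) {
    l->shared_holders = 0;
    l->exclusive_held = false;
}

/* Returns true if a lock of the given type is compatible with the
   current holders, and records the new holder in that case. */
bool model_try_lock(ModelWinLock *l, ModelLockType type) {
    if (l->exclusive_held) return false;
    if (type == MODEL_LOCK_EXCLUSIVE) {
        if (l->shared_holders > 0) return false;
        l->exclusive_held = true;
    } else {
        l->shared_holders++;
    }
    return true;
}

/* Releases a lock previously granted with the same type. */
void model_unlock(ModelWinLock *l, ModelLockType type) {
    if (type == MODEL_LOCK_EXCLUSIVE) {
        assert(l->exclusive_held);
        l->exclusive_held = false;
    } else {
        assert(l->shared_holders > 0);
        l->shared_holders--;
    }
}
```

In this model, two readers or two accumulating processes can hold the
shared lock at once, whereas a process that may perform a conflicting
update must wait until it is the only holder.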
 
 



MPI-2.0 of July 18, 1997
HTML Generated on November 1, 2000