Avoid Using Model State when in Parallel Execution Context

Dan Maljovec edited this page Sep 26, 2017 · 1 revision

When creating a new Model subclass, be aware that separate evaluations can be executed in parallel, so the model should be treated as a read-only object inside the evaluateSample function. The reasoning is explained below.

Note that when <InternalParallel>True</InternalParallel> is set, the state of the Model instance is copied to each separate process and is therefore frozen at the call to submit. Any updates made to the Model in those processes are transient and are not communicated back to the copy of the Model on the main RAVEN execution process (the one in charge of collecting all of the results). Furthermore, the state of the Model at a subsequent collectOutput call need not match its state when the model run was queued for execution. Therefore, any information that collectOutput needs should be passed in as an argument to evaluateSample and returned from it, so that collectOutput can adequately process special cases.
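The effect of the per-process copy can be illustrated with a small sketch. It does not use RAVEN itself: ToyModel is a hypothetical stand-in for a Model subclass, its evaluateSample signature is simplified, and copy.deepcopy is assumed here only as a stand-in for the copy each worker process receives at submit.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a Model subclass (not RAVEN's actual class)."""

    def __init__(self):
        self.counter = 0

    def evaluateSample(self, x):
        # Anti-pattern: writing to self inside evaluateSample.
        self.counter += 1
        return x * 2.0


main_model = ToyModel()

# Assumption: deepcopy mimics the frozen copy handed to each worker process.
worker_copies = [copy.deepcopy(main_model) for _ in range(3)]
results = [m.evaluateSample(float(i)) for i, m in enumerate(worker_copies)]

# Each copy updated its own counter; the main copy never sees those updates.
print(results)
print([m.counter for m in worker_copies], main_model.counter)
```

The returned values reach the main process, but the counter updates stay on the copies, which is why anything collectOutput needs has to travel through the return value.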

If instead <InternalParallel>False</InternalParallel> is used, the state of the Model instance is shared; however, there are no thread-safety locks preventing multiple threads from writing to the model object. In this case too, avoid writing any state of the Model inside the evaluateSample function call to prevent data corruption.
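A minimal sketch of how unlocked shared state can corrupt a result is given here. It is not RAVEN code: ToyModel, the scratch attribute, and the pause hook are all hypothetical, and the events exist only to force an interleaving that would otherwise happen unpredictably.

```python
import threading


class ToyModel:
    """Hypothetical model that stashes its input on self (anti-pattern)."""

    def __init__(self):
        self.scratch = None

    def evaluateSample(self, x, pause=None):
        self.scratch = x              # write to shared state
        if pause is not None:
            pause()                   # hypothetical hook to force the interleaving
        return self.scratch * 2       # may read another thread's value


model = ToyModel()
first_wrote = threading.Event()
second_done = threading.Event()
results = {}


def pause_first():
    first_wrote.set()
    second_done.wait(timeout=5)


def run_first():
    results["first"] = model.evaluateSample(1, pause=pause_first)


def run_second():
    first_wrote.wait(timeout=5)
    results["second"] = model.evaluateSample(10)
    second_done.set()


threads = [threading.Thread(target=run_first), threading.Thread(target=run_second)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The first evaluation was given 1 but reports 20, because the second
# thread overwrote self.scratch between the write and the read.
print(results)
```

Using a local variable instead of self.scratch removes the problem without any locking.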

It is safest to not depend on the state of the Model anywhere in the submit and collectOutput pipeline. The Model's state offers no guarantee that it matches what it was when the sample being collected was generated. In practice, this means that nothing in evaluateSample, or in any method called by it, should write to an attribute of self (any variable starting with self.). In addition, the only variables read from self in evaluateSample, collectOutput, or any function called within them should be those that remain constant throughout the current Step.
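One way to follow this guidance is sketched below. The class name, the simplified evaluateSample and collectOutput signatures, and the use of ThreadPoolExecutor in place of RAVEN's job handling are all assumptions made for illustration; the point is only that evaluateSample reads constants from self and returns everything collectOutput needs.

```python
from concurrent.futures import ThreadPoolExecutor


class StatelessToyModel:
    """Hypothetical model that never writes to self during evaluation."""

    def __init__(self, scale):
        self.scale = scale  # constant throughout the Step, so safe to read

    def evaluateSample(self, x):
        y = self.scale * x  # local variables only
        # Return the special-case flag instead of stashing it on self.
        return {"input": x, "output": y, "clipped": y > 10.0}

    def collectOutput(self, result, output):
        # Relies only on what evaluateSample returned.
        output.append((result["input"], result["output"], result["clipped"]))


model = StatelessToyModel(scale=3.0)
collected = []

with ThreadPoolExecutor(max_workers=4) as pool:
    for result in pool.map(model.evaluateSample, [1.0, 2.0, 5.0]):
        model.collectOutput(result, collected)

print(collected)
```

Because the evaluation is a pure function of its argument and of constants, the same code behaves identically whether the model is shared between threads or copied to separate processes.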