Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a sometimes failing concurrency test #1712

Merged
merged 3 commits into from
Oct 6, 2022
Merged

a sometimes failing concurrency test #1712

merged 3 commits into from
Oct 6, 2022

Conversation

rwmcintosh
Copy link
Member

@Yuri05 @msevestre @PavelBal

When I run this test it will fail similarly often to the R code that inspired it.

The SequenceEqual test is checking that all the sequences are identical besides any from failing runs.

@msevestre
Copy link
Member

This is good. We removed R altogether from this issue.
@Yuri05 finally has an issue worth looking at;-)

@Yuri05
Copy link
Member

Yuri05 commented Sep 29, 2022

Nice 🙂

@Yuri05
Copy link
Member

Yuri05 commented Oct 5, 2022

fun to debug :)
I temporarily extended the code in SimModelBatch.CreateAndFinalizeSimulation with this:

grafik

This writes the xmls which are passed to SimModel for calculation.
In the simulation, exactly 1 output variable is set as persistable:
grafik

Now when some simulations do not retrieve any results: this output variable is set as non-persistable!
grafik

Here the xmls of a failing run: simulation 27 is the one with persistable = 0
xmls.zip

So what happens: SimModel sometimes becomes an input with 0 required outputs, calculates and returns empty set of outputs, because none of the outputs is marked as to be returned.
SimModel should surely check for this and either produce a warning or stop with the error (not sure what's better) - but the initial issue is apparently the SimModel-XML creation in core.
Something is not thread safe there...

@msevestre
Copy link
Member

SO GOOD @Yuri05
debug master did it again.
Ok so we need to make SimModel generation thread safe!. WTF. THe service needs to be instantiated on each Manager instead of recup once and cached. This is probably an easy fix

@rwmcintosh
Copy link
Member Author

rwmcintosh commented Oct 5, 2022

once again, bug reports go in to SimModel and then bounce straight back out. Thanks @Yuri05

@msevestre
Copy link
Member

So we are already creating a new instance of the simulation exporter

   private SimulationExport createSimulationExport(IModelCoreSimulation simulation, SimModelExportMode simModelExportMode, IReadOnlyList<string> variableMoleculePaths)
      {
         var simulationExportCreator = _simulationExportCreatorFactory.Create();
         var simulationExport = simulationExportCreator.CreateExportFor(simulation.Model, simModelExportMode, variableMoleculePaths);
         simulationExport.AddSimulationConfiguration(simulation.SimulationSettings);
         return simulationExport;
      }

So the quick fix that I was hoping is not happening

@msevestre
Copy link
Member

Ok I think I have it....

@msevestre msevestre marked this pull request as ready for review October 6, 2022 02:35
@msevestre msevestre requested a review from Yuri05 October 6, 2022 02:35
/// Return a clone of the simulation that can be used during batch run.
/// This simulation will contain the minimal information required to run. Configuration settings such as some building blocks will not be copied over
/// </summary>
/// <param name="simulation"></param>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be removed. You really just wanted the summary here I guess.

Comment on lines +34 to +44
// Initialize a BuildConfiguration with only SimulationSettings because some of the properties to complete the initialization are required
BuildConfiguration = new BuildConfiguration
{
SimulationSettings = _cloneManagerForModel.Clone(simulationToClone.SimulationSettings)
}
};

// Initialize a BuildConfiguration with only SimulationSettings because some of the properties to complete the initialization are required
// None of the other properties are required to complete the simulation
var simulationBuildConfiguration = new BuildConfiguration
{
SimulationSettings = _cloneManagerForModel.Clone(simulationToClone.SimulationSettings)
};
simulation.BuildConfiguration = simulationBuildConfiguration;

//Once we prepare a simulation for a batch, we are assuming that it won't be changed by the caller.
//We update the persistable at this stage to avoid any possible thread problems
_simulationPersistableUpdater.UpdateSimulationPersistable(simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the problem was that cloning did not properly copy the simulation settings and thus the curve was not set as persistable?
But why did it happen only sometimes and not every time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that's not it. I put it there because it's an implementation details that needs to happen as we are about to start the simulation run.

Befote, it was done in the initialize of the batch, which was using the clone already. However it was the same instance shared amongst all run of the batch. as this was happening in //, there was potentially a moment where one batch would cleaer the output to set them again, but the other one was done and was generating XML..

Anyways this line was added to fix another bug and I remembered not being too happy about doing it that way but I didn't find anything better. Now that we have a place to clone the simulation, this felt to me like the right location

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I see. cool, cool :)

@@ -25,7 +25,7 @@ public class ConcurrentRunSimulationBatch : IDisposable, IWithId

public ConcurrentRunSimulationBatch(IModelCoreSimulation simulation, SimulationBatchOptions simulationBatchOptions)
{
Simulation = Api.GetSimulationTask().Clone(simulation);
Simulation = Api.GetSimulationTask().CloneForBatchRun(simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwmcintosh @Yuri05 I have renamed this guy because we are not creating a real clone
Just one that is enough to calculate

}

public void Initialize(IModelCoreSimulation simulation, SimulationBatchOptions simulationBatchOptions)
{
_simulationPersistableUpdater.UpdateSimulationPersistable(simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the culprit. It is resetting the simulations output and then setting them again. This call is NOT thread safe and was called in mult thread environment


//Once we prepare a simulation for a batch, we are assuming that it won't be changed by the caller.
//We update the persistable at this stage to avoid any possible thread problems
_simulationPersistableUpdater.UpdateSimulationPersistable(simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am doing it here. As soon as we cloen the simulation (for batch), I am making it also up to date as far as output selection is concerned. It is similar to what we do in normal run where we update before we reun

@@ -165,6 +228,7 @@ public override void GlobalContext()
{
base.GlobalContext();
_simulation = _simulationPersister.LoadSimulation(HelperForSpecs.DataFile("S1.pkml"));
_simulationPersistableUpdater.UpdateSimulationPersistable(_simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know why this was required . The test is bypassing the runner and calling directly the underlying stuff so that's probably why

Comment on lines +34 to +44
// Initialize a BuildConfiguration with only SimulationSettings because some of the properties to complete the initialization are required
BuildConfiguration = new BuildConfiguration
{
SimulationSettings = _cloneManagerForModel.Clone(simulationToClone.SimulationSettings)
}
};

// Initialize a BuildConfiguration with only SimulationSettings because some of the properties to complete the initialization are required
// None of the other properties are required to complete the simulation
var simulationBuildConfiguration = new BuildConfiguration
{
SimulationSettings = _cloneManagerForModel.Clone(simulationToClone.SimulationSettings)
};
simulation.BuildConfiguration = simulationBuildConfiguration;

//Once we prepare a simulation for a batch, we are assuming that it won't be changed by the caller.
//We update the persistable at this stage to avoid any possible thread problems
_simulationPersistableUpdater.UpdateSimulationPersistable(simulation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that's not it. I put it there because it's an implementation details that needs to happen as we are about to start the simulation run.

Befote, it was done in the initialize of the batch, which was using the clone already. However it was the same instance shared amongst all run of the batch. as this was happening in //, there was potentially a moment where one batch would cleaer the output to set them again, but the other one was done and was generating XML..

Anyways this line was added to fix another bug and I remembered not being too happy about doing it that way but I didn't find anything better. Now that we have a place to clone the simulation, this felt to me like the right location

@msevestre
Copy link
Member

Damm I had written all those comments yesterday in the PR and forgot to submit review. I HATE THIS FEATURE IN GITHUB

@msevestre
Copy link
Member

going to merge now. I think this will solve all our issues
FYI @PavelBal

@msevestre msevestre merged commit ccb2b9f into develop Oct 6, 2022
@msevestre msevestre deleted the concurrency-test branch October 6, 2022 18:41
@PavelBal
Copy link
Member

PavelBal commented Oct 7, 2022

Awesome. Once PK-Sim is updated, we should also update Core in R and should be able to use simulation batches again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants