
Changes to MOOSE checkpoints prevent MRAD model restarts #369

Open
harterj opened this issue Mar 15, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@harterj

harterj commented Mar 15, 2024

Bug Description

The `checkpoint = true` setting in HPMR_thermal_ss.i used to write a directory called HPMR_dfem_griffin_ss_out_bison0_cp containing checkpoint files. With the recent changes to MOOSE checkpoints, it appears that only the main app input can write checkpoint files.

```
> mpiexec -n 48 ~/sawtooth-projects/dire_wolf/dire_wolf-opt -i HPMR_dfem_griffin_tr.i
*** ERROR ***
No checkpoint file found!
```
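For reference, the pre-change behavior relied on the shorthand checkpoint flag in the sub-app's `[Outputs]` block. A minimal sketch (the other output options in the real input are omitted here):

```
[Outputs]
  # Shorthand form: before the MOOSE checkpoint changes, this caused the
  # sub-app to write its own <file_base>_cp/ directory of checkpoint files.
  checkpoint = true
[]
```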

Steps to Reproduce

Obtain the latest versions of Dire Wolf and MOOSE. Run the steady-state case, HPMR_dfem_griffin_ss.i, as normal. When the simulation finishes, look in HPMR_dfem_griffin_ss_out_bison0_cp and notice that it contains 0 files. You will be unable to run the null transient, or any other simulation relying on restarts.

```
[hartjack][~/projects/virtual_test_bed/microreactors/mrad/steady/HPMR_dfem_griffin_ss_out_bison0_cp] (devel)> l
total 160K
drwx------ 2 hartjack hartjack   0 Mar 15 13:12 .
drwxrwxr-x 7 hartjack hartjack 14K Mar 15 13:51 ..
```

As a comparison to show the new functionality, add `checkpoint = true` to `[Outputs]` in HPMR_dfem_griffin_ss.i. Run the steady-state case and look at the two output directories, HPMR_dfem_griffin_ss_out_cp and HPMR_dfem_griffin_ss_out_bison0_cp. You will see checkpoints in the former directory but not the latter.

It is not as trivial as just using the main app's checkpoint files: the neutronics and conduction meshes are different, with different BCs. I did not create the mesh, but I'm guessing some changes might need to be made, hence tagging the model creators.

Impact

I am trying to use a copy of MRAD for another application and can't progress much further right now.

Tagging: @miaoyinb @nstauff @GiudGiud @markdehart

@harterj harterj added the bug Something isn't working label Mar 15, 2024
@miaoyinb (Collaborator)

@GiudGiud Is this the expected behavior of the new app version? I recently used the INL HPC blue_crab-opt compiled on 3/15/2024 for a different problem and the checkpoint files of the child app are generated as usual.

@GiudGiud (Collaborator)

We should still be able to generate sub-app checkpoints with `checkpoint = true` in the sub-app. I need to look into this.

@harterj (Author)

harterj commented Mar 21, 2024

Tag @YaqiWang

@lindsayad (Member)

I believe we've been moving toward having the main app handle all of the restart, even if the main input file "changes" from before-restart to after-restart.

@loganharbour

@harterj (Author)

harterj commented Mar 28, 2024

BTW, @GiudGiud helped me get this running, but obviously it is not the ideal fix. This is `[Outputs]` in HPMR_thermo_ss.i:

```
[Outputs]
  perf_graph = true
  exodus = true
  color = true
  csv = true
  [check]
    type = Checkpoint
    execute_on = FINAL
    num_files = 1e5
  []
[]
```

This will generate checkpoint files for the SubApp.
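For completeness, a restart input would then point at those checkpoint files. In MOOSE this is typically done through `restart_file_base` in the `[Problem]` block; a hedged sketch, where the path is illustrative (derived from the `[check]` output name above) and not taken verbatim from the MRAD inputs:

```
[Problem]
  # Hypothetical path: <steady-state file base>_check_cp/ plus a checkpoint
  # index; recent MOOSE also accepts LATEST to pick the newest checkpoint.
  restart_file_base = HPMR_thermo_ss_out_check_cp/LATEST
[]
```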
