PIO_INTERNAL_ERROR with ERS_P256.hcru_hcru.I20TRGSWCNPRDCTCBC.pm-cpu_intel.elm-erosion
#6486
With ERS.hcru_hcru.I20TRGSWCNPRDCTCBC.pm-cpu_intel.elm-erosion, the test uses 128 tasks on 1 node and was a little slow. I wanted to try using 2 nodes (256 tasks), but hit an error described here. Note that to still improve the speed of these tests, I went ahead with a PR to increase tasks to 192 (still using 2 nodes):
After #6484, we are now using 192 tasks for all components.
To reproduce the error:
ERS_P256.hcru_hcru.I20TRGSWCNPRDCTCBC.pm-cpu_intel.elm-erosion
Note I don't see the error with an SMS test, so it seems to be related to writing restarts.
I was seeing:
Jayesh asked "Does increasing the number of I/O processes (./xmlchange PIO_NUMTASKS=16) fix the issue with ERS.hcru_hcru.I20TRGSWCNPRDCTCBC ? Looks like an error from PnetCDF on the total size of the pending writes from a single process being > INT_MAX"
I've not tried that, but I don't think we would want to use that setting going forward.
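To illustrate why more I/O processes might help under Jayesh's hypothesis, here is a rough sketch of the arithmetic. It assumes the pending restart data is spread evenly across the I/O tasks, which may not match how PIO actually aggregates data, and the 12 GiB total is a made-up number, not a measurement from this test. The function names are hypothetical helpers, not part of PIO or PnetCDF.

```python
# Rough sketch: does the per-I/O-task share of pending writes exceed INT_MAX?
# Assumptions (not from the failing run): even distribution across I/O tasks,
# and a hypothetical total of 12 GiB of pending restart data.

INT_MAX = 2**31 - 1  # largest value of a 32-bit signed C int


def bytes_per_io_task(total_bytes: int, num_io_tasks: int) -> int:
    """Bytes held by the most loaded I/O task under an even split (ceiling division)."""
    return -(-total_bytes // num_io_tasks)


def exceeds_int_max(total_bytes: int, num_io_tasks: int) -> bool:
    """True if a single I/O task's pending writes would be larger than INT_MAX."""
    return bytes_per_io_task(total_bytes, num_io_tasks) > INT_MAX


if __name__ == "__main__":
    total = 12 * 1024**3  # hypothetical pending restart data, in bytes
    for n in (2, 4, 8, 16):
        share = bytes_per_io_task(total, n)
        print(f"{n:2d} I/O tasks: {share} bytes each, over INT_MAX: {exceeds_int_max(total, n)}")
```

With these made-up numbers, 2 or 4 I/O tasks would each hold more than INT_MAX bytes, while 8 or 16 would not. Whether that matches the actual failure depends on the real restart size and I/O task count, neither of which is recorded here.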