View Issue Details

ID: 0001542
Project: OpenFOAM
Category: Bug
View Status: public
Last Update: 2015-04-09 12:54
Reporter: feymark
Assigned To: henry
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: no change required
Platform: Linux
OS: Arch Linux
Summary: 0001542: Restart introduces perturbations when increasing number of processors
Description: When running the channel395 case on 8 processors, restarting from a previous time works perfectly and exactly the same residuals are found for all subsequent time steps.
However, when the number of processors is increased to 16, the restart introduces perturbations, and different residuals (and thus different p and U) are found.
Steps To Reproduce: Run the attached cases using Allrun.
Tags: No tags attached.

Activities

feymark

2015-02-17 21:31

reporter  

channel395.tar.gz (3,028 bytes)

henry

2015-02-17 21:48

manager   ~0003812

Yes, this is to be expected. In order to reduce this perturbation you would need to run with tighter tolerances, and even then machine round-off may limit the degree to which you can mitigate this perturbation. The best approach is not to change the decomposition mid-run.
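
As a generic illustration of the machine round-off mentioned above (plain Python, not OpenFOAM code and not taken from the attached case): floating-point addition is not associative, so combining the same numbers in a different order can change the last bits of the result. The values below are arbitrary.

```python
# Floating-point addition is not associative: grouping the same three
# numbers differently changes the last bits of the sum.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # sum the first two terms first
right_to_left = a + (b + c)   # sum the last two terms first

# The two results differ, but only at round-off level.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

A parallel reduction over a different number of sub-domains regroups its partial sums in a similar way, which is one reason results from different decompositions are not expected to match bit for bit.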

feymark

2015-02-17 21:50

reporter   ~0003813

Last edited: 2015-02-17 21:50

I notice from your answer that you haven't looked at my files. I do not change the decomposition mid-run. I'm not a newbie.

henry

2015-02-17 22:36

manager   ~0003814

Are the statistics, when averaged long enough to converge, different between the two runs?

feymark

2015-02-17 22:51

reporter   ~0003815

Perhaps I was not specific enough. Look at the final pressure iteration for the 8-processor case, starting from time 0. At time = 3 it reads,

GAMG: Solving for p, Initial residual = 0.75319219919945, Final residual = 3.5915425979914e-07, No Iterations 12

Then restart from time = 2. At time = 3 it reads,

GAMG: Solving for p, Initial residual = 0.75319219919945, Final residual = 3.5915425979914e-07, No Iterations 12

Exactly the same, which is very good. But if you do the same thing with 16 processors you get,

GAMG: Solving for p, Initial residual = 0.77177480795539, Final residual = 8.4216285201411e-07, No Iterations 11

and after restart from time = 2

GAMG: Solving for p, Initial residual = 0.77177480795462, Final residual = 8.4216285156141e-07, No Iterations 11

They are close but they are not the same. I do not mean that 8 processors should give exactly the same answer as 16 processors. But, the restart should be exact regardless of the number of processors.
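
A minimal sketch of how such a comparison could be automated, assuming the solver log lines have exactly the form quoted above. The helper names `parse_residuals` and `same_restart` are hypothetical and not part of OpenFOAM.

```python
import re

# Pattern for the quoted "Solving for p" log lines (format assumed from this thread).
RESIDUAL_RE = re.compile(
    r"Initial residual = (?P<initial>[^,]+), "
    r"Final residual = (?P<final>[^,]+), "
    r"No Iterations (?P<iters>\d+)"
)


def parse_residuals(line):
    """Return (initial residual, final residual, iterations) from one log line."""
    match = RESIDUAL_RE.search(line)
    if match is None:
        raise ValueError(f"not a residual line: {line!r}")
    return float(match["initial"]), float(match["final"]), int(match["iters"])


def same_restart(original, restarted):
    """True if the original run and the restarted run report identical values."""
    return parse_residuals(original) == parse_residuals(restarted)


# Log lines copied from this comment.
run_8 = "GAMG: Solving for p, Initial residual = 0.75319219919945, Final residual = 3.5915425979914e-07, No Iterations 12"
restart_8 = "GAMG: Solving for p, Initial residual = 0.75319219919945, Final residual = 3.5915425979914e-07, No Iterations 12"
run_16 = "GAMG: Solving for p, Initial residual = 0.77177480795539, Final residual = 8.4216285201411e-07, No Iterations 11"
restart_16 = "GAMG: Solving for p, Initial residual = 0.77177480795462, Final residual = 8.4216285156141e-07, No Iterations 11"

print(same_restart(run_8, restart_8))
print(same_restart(run_16, restart_16))

# Relative difference of the initial residuals in the 16-processor case.
initial_run, initial_restart = parse_residuals(run_16)[0], parse_residuals(restart_16)[0]
relative_difference = abs(initial_run - initial_restart) / initial_run
print(relative_difference)
```

The relative difference in the 16-processor case is tiny compared with the residual itself, but it is not zero, which is the point being made here.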

feymark

2015-04-09 10:06

reporter   ~0004583

I found out what's causing this. The problem is that the uniform/time file isn't written in binary, although its format entry clearly states binary.

/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.3.x                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version 2.0;
    format binary;
    class dictionary;
    location "2/uniform";
    object time;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

value 1.9999999999999998;

name "2";

index 10;

deltaT 0.20000000000000001;

deltaT0 0.20000000000000001;


// ************************************************************************* //

henry

2015-04-09 10:17

manager   ~0004584

The uniform/time file is written in the format you specified in your controlDict, and if you wrote field data into this file it would be binary. However, individual scalars are written in ASCII, as they are in field files.

I could change the uniform/time file to be explicitly written in ASCII, but then all data, including field data, would be written in ASCII.

wyldckat

2015-04-09 10:45

updater   ~0004586

Greetings to all!

I believe this issue was addressed or at least talked about in issue #815: http://www.openfoam.org/mantisbt/view.php?id=815 - the changes are present in OpenFOAM-dev, but I think they aren't in 2.3.x, because the changes were substantial.

@Henry: And if I remember correctly, the "uniform/time" file is always written in ASCII.

@Feymark: Why do you need the time precision to be set to 14? Because writing floating point numbers in ASCII is always prone to error.

Best regards,
Bruno

feymark

2015-04-09 10:46

reporter   ~0004587

The value, deltaT and deltaT0 entries need to be written in binary (or in most cases ASCII with write precision 17) for the restart to work perfectly.
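
A small sketch of the round-trip argument, in plain Python rather than OpenFOAM code. It assumes IEEE 754 double precision and uses the `value` and `deltaT` entries from the uniform/time file quoted earlier. The helper `round_trips` is hypothetical, and the precision of 14 is included only for contrast because it is mentioned in this thread.

```python
def round_trips(x, digits):
    """True if writing x with `digits` significant digits and reading it back recovers x exactly."""
    text = f"{x:.{digits}g}"   # ASCII representation with the given precision
    return float(text) == x    # parse the text back and compare bit for bit


value = 1.9999999999999998     # `value` entry from uniform/time
delta_t = 0.20000000000000001  # `deltaT` entry from uniform/time

for digits in (14, 16, 17):
    print(digits, round_trips(value, digits), round_trips(delta_t, digits))
```

With 17 significant digits any double survives the trip through ASCII unchanged. With fewer digits, a value such as the `value` entry here is rounded to 2 on output, so the restarted run begins from a slightly different time than the one that was written.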

henry

2015-04-09 12:53

manager   ~0004588

Writing these entries in binary will not improve the accuracy of the internal representation. If 17 decimal places are not sufficiently accurate for your simulations, you will need to recompile OpenFOAM in quad precision or use an exact kernel.

Given that you are not changing the time-step in your simulations you do not need the uniform/time file anyway and you could delete it.

Issue History

Date Modified Username Field Change
2015-02-17 21:31 feymark New Issue
2015-02-17 21:31 feymark File Added: channel395.tar.gz
2015-02-17 21:48 henry Note Added: 0003812
2015-02-17 21:48 henry Status new => closed
2015-02-17 21:48 henry Assigned To => henry
2015-02-17 21:48 henry Resolution open => fixed
2015-02-17 21:50 feymark Note Added: 0003813
2015-02-17 21:50 feymark Status closed => feedback
2015-02-17 21:50 feymark Resolution fixed => reopened
2015-02-17 21:50 feymark Note Edited: 0003813
2015-02-17 22:36 henry Note Added: 0003814
2015-02-17 22:51 feymark Note Added: 0003815
2015-02-17 22:51 feymark Status feedback => assigned
2015-04-09 10:06 feymark Note Added: 0004583
2015-04-09 10:17 henry Note Added: 0004584
2015-04-09 10:45 wyldckat Note Added: 0004586
2015-04-09 10:46 feymark Note Added: 0004587
2015-04-09 12:53 henry Note Added: 0004588
2015-04-09 12:53 henry Status assigned => closed
2015-04-09 12:54 henry Resolution reopened => no change required