View Issue Details

ID: 0001792
Project: OpenFOAM
Category: Bug
View Status: public
Last Update: 2015-11-21 18:38
Reporter: GRAUPS
Assigned To: henry
Priority: normal
Severity: major
Reproducibility: sometimes
Status: resolved
Resolution: fixed
Platform: Intel64
OS: RHEL
OS Version: 6.6
Summary: 0001792: dgraphFold2 IOstream meshing error non-axis aligned pipe
Description
I've finally been able to reproduce, on a somewhat small and shareable model, a meshing error I've seen many times with non-axis-aligned geometry but until now have never been able to reproduce reliably.

The error manifests itself as a "dgraphFold2: out of memory" error in OpenFOAM 2.3.x (build 2.3.x-6586d545d605, OpenMPI 1.6.5, scotch 6.0.0), and an "error in IOstream" in OpenFOAM 2.4.x (build 2.4.x-664271ec09a7, OpenMPI 1.8.5, scotch 6.0.3). The error only occurs when using certain processor counts and appears to disappear when the decomposition is set to simple instead of scotch.

Please find attached a few screenshots of the errors as well as the model I'm using (a non-axis aligned pipe). Please see the steps to reproduce for all the scenarios I've run.
Steps To Reproduce
Scenario 1 (case is uploaded with this setup)
- 8 procs, scotch decomp, OpenFOAM 2.3.x, OpenMPI 1.6.5, scotch 6.0.0
- run allrun, the result is a dgraphFold2 error

Scenario 2
- 4 procs, scotch decomp, OpenFOAM 2.3.x, OpenMPI 1.6.5, scotch 6.0.0
- run allrun, the mesh is created successfully

Scenario 3
- 8 procs, simple decomp, OpenFOAM 2.3.x, OpenMPI 1.6.5, scotch 6.0.0
- run allrun, the mesh is created successfully

Scenario 4
- 8 procs, scotch decomp, OpenFOAM 2.4.x, OpenMPI 1.8.5, scotch 6.0.3
- run allrun, the result is an IOstream error

Scenario 5
- 4 procs, scotch decomp, OpenFOAM 2.4.x, OpenMPI 1.8.5, scotch 6.0.3
- run allrun, the result is a MPI_Recv error

Scenario 6
- 4 procs, simple decomp, OpenFOAM 2.4.x, OpenMPI 1.8.5, scotch 6.0.3
- run allrun, the mesh is created successfully

Scenario 7
- 8 procs, simple decomp, OpenFOAM 2.4.x, OpenMPI 1.8.5, scotch 6.0.3
- run allrun, the mesh is created successfully
Additional Information
I suspect the bug is actually in one of the third-party libraries (scotch or OpenMPI). Nevertheless, it has been affecting a number of jobs I've been running, and it has been discussed in a CFD Online thread by other users (including myself).

http://www.cfd-online.com/Forums/openfoam-meshing-snappyhexmesh/113692-snappyhexmesh-running-out-memory-without-reason.html

In addition, I think this might also be related to the interior faceZone I am creating during the meshing process. If I remove the interior, meshing seems to proceed normally. It also appears related to orientation: if I create an axis-aligned pipe with an interior, everything seems to mesh normally.
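For context, the kind of snappyHexMeshDict fragment that creates such an interior faceZone looks roughly like this. This is a hedged sketch only: the surface name and refinement levels are illustrative and not taken from the attached case.

```cpp
// Sketch of the relevant castellatedMeshControls entries for an internal
// faceZone. Names and levels are illustrative placeholders.
castellatedMeshControls
{
    refinementSurfaces
    {
        interior                    // the internal baffle surface
        {
            level       (2 2);
            faceZone    interior;   // collect its faces into a faceZone
            faceType    internal;   // keep them as internal faces
        }
    }
}
```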

It would be great if this could finally be tracked down. Thanks again for your support and your work on OpenFOAM. Let me know if you need anything else from me.
Tags: decomposition, OpenMPI, parallel, Scotch, simple, snappyHexMesh

Relationships

related to 0001914 (closed, wyldckat): meshSearch::findCellWalk sometimes fails in parallel

Activities

GRAUPS

2015-07-21 17:31

reporter  

meshing_bug.tar.gz (87,148 bytes)

GRAUPS

2015-07-21 17:32

reporter  

2.3.x_scotch_8procs_error.png (3,426 bytes)   

GRAUPS

2015-07-21 17:32

reporter  

2.4.x_scotch_4procs_error.png (15,581 bytes)   

GRAUPS

2015-07-21 17:32

reporter  

2.4.x_scotch_8procs_error.png (11,839 bytes)   

GRAUPS

2015-07-21 20:16

reporter   ~0005117

Last edited: 2015-07-21 20:56

Not sure if it's related to the errors I'm seeing or not, but it looks like mattijs submitted a bug report over a year ago to the scotch bug tracker that has yet to be resolved. I'll document the link here just in case it is relevant...

http://gforge.inria.fr/tracker/index.php?func=detail&aid=17142&group_id=248&atid=1079

EDIT: I didn't find any non-manifold edges at the processor boundaries per mattijs' bug report, so my errors may not be related to mattijs' report.

user4

2015-07-22 09:04

  ~0005120

For another case we've heard that scotch 6.0.4 actually solved a dgraphFold2 error. This might just be related to a slightly different decomposition but it might be good to try this version anyway.

GRAUPS

2015-08-03 21:53

reporter   ~0005180

mattijs,

I've been unsuccessful in getting scotch 6.0.4 to compile properly, and judging by the comment on commit 38415f8 in OpenFOAM-dev, it looks like OpenFOAM-dev hasn't been upgraded to 6.0.4 for the same reason.

Have you been able to successfully compile OpenFOAM with scotch 6.0.4?

user4

2015-08-04 10:00

  ~0005185

You can add -DCOMMON_PTHREAD to the CFLAGS in $WM_THIRD_PARTY_DIR/etc/wmakeFiles/scotch/Makefile.inc.i686_pc_linux2.shlib-OpenFOAM-64Int32

However, this will enable thread support. scotch should compile without it, but there is currently a bug in 6.0.4:

https://gforge.inria.fr/tracker/index.php?group_id=248&atid=1079
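The edit above can be scripted. A hedged sketch, assuming the Makefile.inc filename user4 cites (it varies with architecture and label, so adjust the path for your build):

```shell
# Append -DCOMMON_PTHREAD to the CFLAGS line of scotch's Makefile.inc.
# The filename below is the one cited in this thread; yours may differ.
INC="$WM_THIRD_PARTY_DIR/etc/wmakeFiles/scotch/Makefile.inc.i686_pc_linux2.shlib-OpenFOAM-64Int32"
sed -i 's/^\(CFLAGS[[:space:]]*=.*\)/\1 -DCOMMON_PTHREAD/' "$INC"
grep 'COMMON_PTHREAD' "$INC"    # confirm the flag was appended
```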

GRAUPS

2015-08-04 15:48

reporter   ~0005192

Mattijs, thanks for the additional compile flag; it allowed me to build scotch 6.0.4. Unfortunately, I received the same errors with scotch 6.0.4 that I did with scotch 6.0.3 for this model.

GRAUPS

2015-08-04 21:22

reporter   ~0005196

Mattijs, I started researching scotch strategy strings today to see if I could find a workaround for this error. It looks like OpenFOAM tries to support custom strategy strings through the 'strategy' keyword under 'scotchCoeffs' in the decomposeParDict file. However, none of the examples in $FOAM_SRC/parallel/decompose/scotchDecomp/scotchDecomp.C work when placed in the quotes after the 'strategy' keyword. Even the long default string referred to in that file produces an error. Is this functionality broken at the moment?
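For reference, the decomposeParDict syntax being tested has this form. A sketch only: the strategy value shown is a placeholder, not a known-working string.

```cpp
// Sketch of the decomposeParDict entries under test.
// The strategy string below is a placeholder.
numberOfSubdomains 8;

method scotch;

scotchCoeffs
{
    // Custom scotch strategy string; see scotchDecomp.C for examples
    strategy "<strategy-string>";
}
```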

user4

2015-08-05 09:18

  ~0005198

The strategy string varies from version to version of scotch, so the one in our header file may be out of date. You can get the default strategy string by keeping the scotch binaries (remove the 'make realclean' from $WM_THIRD_PARTY_DIR/Allwmake) and running gmap with the -vs option. I've never tweaked these settings myself, but I'd be interested in feedback. Can you post your findings here?

GRAUPS

2015-08-06 21:18

reporter   ~0005202

NEW DEVELOPMENT!

I was able to reproduce the IOstream error completely separate from the scotch decomposition. I had a feeling that we might have been hitting two separate errors, so I started running meshing tests with simple decomp (varying proc counts and xyz decomps). I was able to reproduce the IOstream error using only simple decomp. These are the settings...

- 24 procs, simple decomp, OpenFOAM 2.4.x, OpenMPI 1.8.7
simpleCoeffs
{
    n (3 4 2);
    delta 0.001;
}

I've uploaded an additional model (meshing_bug_simple_24procs.tar.gz) with these settings preset.

Can you please take a look at this new model, Mattijs? This looks like it might be pointing to a bug elsewhere (possibly within OpenFOAM's mesher, snappyHexMesh).
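For completeness, a decomposeParDict matching the failing 24-processor run would look roughly like this. Only n and delta are taken from the note above; the remaining entries are standard boilerplate, not from the attached case.

```cpp
// Sketch of a decomposeParDict for the failing 24-processor simple run.
numberOfSubdomains 24;

method simple;

simpleCoeffs
{
    n       (3 4 2);    // 3 x 4 x 2 = 24 subdomains
    delta   0.001;
}
```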

GRAUPS

2015-08-06 21:19

reporter  

meshing_bug_simple_24procs.tar.gz

user4

2015-08-26 09:20

 

0001-BUG-polyMesh-construct-tetBasePtIs-in-parallel-befor.patch (1,005 bytes)   
From 34c1a8f33e097bcddf7f2c829be66960553a8a7a Mon Sep 17 00:00:00 2001
From: mattijs <mattijs>
Date: Wed, 26 Aug 2015 09:15:04 +0100
Subject: [PATCH] BUG: polyMesh: construct tetBasePtIs in parallel before any
 usage

---
 src/OpenFOAM/meshes/polyMesh/polyMesh.C |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/OpenFOAM/meshes/polyMesh/polyMesh.C b/src/OpenFOAM/meshes/polyMesh/polyMesh.C
index a8156d2..abc38d4 100644
--- a/src/OpenFOAM/meshes/polyMesh/polyMesh.C
+++ b/src/OpenFOAM/meshes/polyMesh/polyMesh.C
@@ -1478,7 +1478,11 @@ Foam::label Foam::polyMesh::findCell
     const cellRepresentation decompMode
 ) const
 {
-    if (Pstream::parRun() && decompMode == FACEDIAGTETS)
+    if
+    (
+        Pstream::parRun()
+     && (decompMode == FACEDIAGTETS || decompMode == CELL_TETS)
+    )
     {
         // Force construction of face-diagonal decomposition before testing
         // for zero cells. If parallel running a local domain might have zero
-- 
1.7.10.4

user4

2015-08-26 09:23

  ~0005300

The problem is in the findCell routine in polyMesh, where some modes can trigger the building of the cell tet-decomposition, and this needs to be done in parallel.

Please apply the uploaded patch.

(this patch is also in commit 34e062da in https://github.com/OpenFOAM/OpenFOAM-dev.git)

Kind regards,

Mattijs
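Applying the uploaded patch can be sketched as follows. This is a hedged sketch under standard-layout assumptions; the wmake target path is the usual location of polyMesh.C, adjust if your tree differs.

```shell
# Apply the uploaded patch to an OpenFOAM-2.4.x checkout and rebuild the
# library containing polyMesh.C. Assumes the patch file sits in the
# checkout root and $WM_PROJECT_DIR points at it.
cd "$WM_PROJECT_DIR"
git apply 0001-BUG-polyMesh-construct-tetBasePtIs-in-parallel-befor.patch
wmake libso src/OpenFOAM
```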

GRAUPS

2015-08-26 19:02

reporter   ~0005301

Mattijs, thanks for looking at this. The patch however doesn't appear to fix the problem in 2.4.x or dev.

I'm currently on OpenFOAM-dev commit 32b7a26 from Aug 10th (after the June commit you referenced) and after re-running the case I still receive the same error. Can you reproduce this on your end still in OpenFOAM-dev? Which commit did you test with?

I also applied your patch to build bb146c867bf2 of 2.4.x and retested. I still receive the same error in the patched 2.4.x as well.

Which build did you test with where the patch worked?

GRAUPS

2015-08-26 19:23

reporter   ~0005303

Last edited: 2015-08-26 19:24

Mattijs, my apologies. The patch does appear to have worked for this model for 2.4.x build bb146c8. I got my OpenFOAM versions mixed up when I did the initial test of 2.4.x.

OpenFOAM-dev still appears to have regressed though. Can you take a look there and make sure there isn't anything additional that needs to be applied to 2.4.x?

Thanks


user4

2015-08-27 09:22

  ~0005306

The patch is for 2.4.x. I will test dev.

GRAUPS

2015-09-03 18:55

reporter   ~0005321

Mattijs, were you able to reproduce the errors I'm seeing in dev?

henry

2015-10-19 17:01

manager   ~0005432

@GRAUPS: I am looking into this problem in OpenFOAM-dev and pushed a tentative fix:

commit 2c46bf1554a1d5e581d5ebcaa4516359bdc6bdf0

could you test it and provide feedback?

henry

2015-10-19 17:22

manager   ~0005433

The issues in this thread relate to the confusion in thread:

http://www.openfoam.org/mantisbt/view.php?id=1544

in which there is no clear resolution of the issue, and the purpose of the proposed "fixes" is also unclear. I will now work on this directly to resolve the issue.

henry

2015-10-19 18:06

manager   ~0005434

For the meshing_bug_simple_24procs.tar.gz case you provided I get a different error:

Introducing baffles to block off problem cells
----------------------------------------------

[12]
[12]
[12] --> FOAM FATAL ERROR:
[12] Maximum number of iterations reached. Increase maxIter.
    maxIter:1
    nChangedCells:0
    nChangedFaces:0

Does this case mesh for you in either OpenFOAM-dev or OpenFOAM-2.4.x?

GRAUPS

2015-10-19 18:18

reporter   ~0005435

Henry, thanks for picking this up. With Mattijs' patch for 2.4.x provided in this thread I was able to mesh it previously. I will run it again in 2.4.x to confirm and get back to you ASAP.

henry

2015-10-19 18:24

manager   ~0005436

I get "Maximum number of iterations reached. Increase maxIter." for both of your cases for both OpenFOAM-dev and OpenFOAM-2.4.x built with gcc or clang with or without the changes to the handling of CELL_TETS.

GRAUPS

2015-10-19 18:38

reporter   ~0005437

Henry, I can confirm that the attached model meshes successfully on my build. Here are the details of my current build...

2.4.x-8685344a0b1f with Mattijs' patch applied
Built with Gcc 5.2.0
OpenMPI-1.8.7
scotch_6.0.4 (compiled with -DCOMMON_PTHREAD to avoid errors)

Let me know if you need more info.

henry

2015-10-19 18:40

manager   ~0005438

I always get "Maximum number of iterations reached. Increase maxIter." for your cases.

Could you test the latest OpenFOAM-dev?

henry

2015-10-19 18:49

manager   ~0005439

P.S. If it does not run with the latest OpenFOAM-dev try reverting

commit 2c46bf1554a1d5e581d5ebcaa4516359bdc6bdf0

GRAUPS

2015-10-19 18:52

reporter   ~0005440

My current dev build is pretty dated. I will update, recompile, and retest and get back to you when I am finished.

GRAUPS

2015-10-20 15:43

reporter   ~0005449

I've compiled and tested the latest version of dev (2c46bf1) and also the previous commit (95c4f93). In dev I receive the same "Maximum number of iterations reached. Increase maxIter." error you got, for both the latest and previous commits. This is a new error that I hadn't encountered before.

So now to determine why it meshes flawlessly in my 2.4.x build (with Mattijs' patch) but not in dev. Time to review the commits to 2.4.x since my 2.4.x-8685344a0b1f build, since you said you were getting the error in 2.4.x as well.

GRAUPS

2015-10-20 15:46

reporter   ~0005450

To be clear, I was using the meshing_bug_simple_24procs.tar.gz test case to do my testing.

GRAUPS

2015-10-20 16:13

reporter   ~0005451

In dev I reverted the change made to globalMeshData.C in commit 511489a and then tested again. I now receive the "Iostream" error that I previously documented. I then re-applied the changes made in the latest commit (2c46bf1) and then tested, but I still receive the "Iostream" error.

So the "Maximum number of iterations reached. Increase maxIter." error must have been introduced in commit 511489a in dev and efc3810 in 2.4.x.

henry

2015-10-21 12:53

manager   ~0005452

Resolved by commit 1da3d4aa6f9572bf43d027b54830317fd9189d1c in OpenFOAM-2.4.x
Resolved by commit e64a8929a6483af1311c0b6853d912adff72532b in OpenFOAM-dev

Issue History

Date Modified Username Field Change
2015-07-21 17:31 GRAUPS New Issue
2015-07-21 17:31 GRAUPS File Added: meshing_bug.tar.gz
2015-07-21 17:32 GRAUPS File Added: 2.3.x_scotch_8procs_error.png
2015-07-21 17:32 GRAUPS File Added: 2.4.x_scotch_4procs_error.png
2015-07-21 17:32 GRAUPS File Added: 2.4.x_scotch_8procs_error.png
2015-07-21 17:34 GRAUPS Tag Attached: snappyHexMesh
2015-07-21 17:34 GRAUPS Tag Attached: parallel
2015-07-21 17:35 GRAUPS Tag Attached: OpenMPI
2015-07-21 17:35 GRAUPS Tag Attached: Scotch
2015-07-21 20:16 GRAUPS Note Added: 0005117
2015-07-21 20:56 GRAUPS Note Edited: 0005117
2015-07-22 09:04 user4 Note Added: 0005120
2015-08-03 21:53 GRAUPS Note Added: 0005180
2015-08-04 10:00 user4 Note Added: 0005185
2015-08-04 15:48 GRAUPS Note Added: 0005192
2015-08-04 21:22 GRAUPS Note Added: 0005196
2015-08-05 09:18 user4 Note Added: 0005198
2015-08-06 21:18 GRAUPS Note Added: 0005202
2015-08-06 21:19 GRAUPS File Added: meshing_bug_simple_24procs.tar.gz
2015-08-06 21:24 GRAUPS Tag Attached: simple
2015-08-06 21:24 GRAUPS Tag Attached: decomposition
2015-08-26 09:20 user4 File Added: 0001-BUG-polyMesh-construct-tetBasePtIs-in-parallel-befor.patch
2015-08-26 09:23 user4 Note Added: 0005300
2015-08-26 19:02 GRAUPS Note Added: 0005301
2015-08-26 19:23 GRAUPS Note Added: 0005303
2015-08-26 19:24 GRAUPS Note Edited: 0005303
2015-08-27 09:22 user4 Note Added: 0005306
2015-09-03 18:55 GRAUPS Note Added: 0005321
2015-10-19 17:01 henry Note Added: 0005432
2015-10-19 17:22 henry Note Added: 0005433
2015-10-19 18:06 henry Note Added: 0005434
2015-10-19 18:18 GRAUPS Note Added: 0005435
2015-10-19 18:24 henry Note Added: 0005436
2015-10-19 18:38 GRAUPS Note Added: 0005437
2015-10-19 18:40 henry Note Added: 0005438
2015-10-19 18:49 henry Note Added: 0005439
2015-10-19 18:52 GRAUPS Note Added: 0005440
2015-10-20 15:43 GRAUPS Note Added: 0005449
2015-10-20 15:46 GRAUPS Note Added: 0005450
2015-10-20 16:13 GRAUPS Note Added: 0005451
2015-10-21 12:53 henry Note Added: 0005452
2015-10-21 12:53 henry Status new => resolved
2015-10-21 12:53 henry Resolution open => fixed
2015-10-21 12:53 henry Assigned To => henry
2015-11-21 18:38 wyldckat Relationship added related to 0001914