implement regions #175

nschloe · 2017-12-21T11:13:27Z

Some data formats have the concept of regions/submeshes/domains, e.g., gmsh and Exodus. Think about how to best integrate this into meshio.

gdmcbain · 2017-12-22T19:56:10Z

Reading about Exodus boundary conditions, I see that besides what it calls 'element blocks', corresponding to the above, it also has 'nodesets' and 'sidesets', the last two being useful for boundary and initial conditions, and 'sides' being sides of elements.
I can immediately envisage 'nodesets' being useful for Dirichlet conditions.

gdmcbain · 2017-12-22T20:17:51Z

Exodus Element Blocks are stricter than Gmsh Physical Regions; whereas the latter encode only the continuous concept of subdomain, the former mixes in the discrete notion of type of element.

Element Blocks (also referred to as simply, Blocks) are a logical grouping of elements all having the same basic geometry and number of nodes. All elements within an Element Block are required to have the same element type. (ibid)

In a 'quad-dominant' mesh of a surface, this would necessitate two blocks, for the triangles and quadrilaterals. I can see computational advantages to this: the connectivity would be described by a rectangular matrix rather than a list. But it also seems like an admixture of concerns if one just wants to specify that a subdomain is of one material.

cbcoutinho · 2017-12-22T20:47:47Z

What would the computational advantage be of having the elemental group data as matrix instead of a list? For one, a list (or list of lists) would be much sparser, plus this information is already provided by meshio in the form of dictionaries with list/array values, which I think is now provided by the 'cells' and 'cell_data' objects.

…

On Fri, Dec 22, 2017, 21:17 Geordie McBain ***@***.***> wrote: Exodus Element Blocks are stricter than Gmsh Physical Regions; whereas the latter encode only the continuous concept of subdomain, the former mixes in the discrete notion of type of element. Element Blocks (also referred to as simply, Blocks) are a logical grouping of elements all having the same basic geometry and number of nodes. All elements within an Element Block are required to have the same element type. (ibid) In a 'quad-dominant' mesh of a surface, this would necessitate two blocks, for the triangles and quadrilaterals. I can see computational advantages to this: the connectivity would be described by a rectangular matrix rather than a list. But it also seems like an admixture of concerns if one just wants to specify that a subdomain is of one material. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#175 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AMTd7KePPmdbBvud0QNXM3SQ00aY8aZ8ks5tDA5wgaJpZM4RJrl-> .

gdmcbain · 2017-12-23T01:45:13Z

Hmm. I was trying to second-guess why Exodus insisted on one type of element per block.
In which case each element has the same number of nodes and the array with entry [e, n] the global index of the n-th node in the e-th element would be 100% dense and perfectly rectangular, which I thought (at the risk of prematurely optimizing) might be more efficient to allocate and look up.
I didn't find an explicit statement of the motivation for the Exodus restriction yet, but thought it might be something like that.

gdmcbain · 2017-12-23T02:08:01Z

As to the relation between this and what's currently in the meshio cell_data and field_data, I think they're like inverses, as discussed in the final comments yesterday on #169.

nschloe · 2018-01-30T16:15:05Z

Hmm. I was trying to second-guess why Exodus insisted on one type of element per block.
In which case each element has the same number of nodes and the array with entry [e, n] the global index of the n-th node in the e-th element would be 100% dense and perfectly rectangular,

I also think this will be the reason. This is by the way also what comes out of a meshio read: The cells a segregated by their element type so the connectivity data is just one big n-by-k array, where n is the number of cells and k the number of nodes per cell. This makes many things really convenient.

nschloe · 2018-01-30T16:21:54Z

Anyhow, on to regions. Any opinion yet on how to implement them? The possibilities I see right now are

An dict of arrays that assigns an integer value to each cell, à la

regions = {'triangle': [0 ,0, 1, 0, ...], 'quad': [1, 0, 1, ...]}

A dictionary of indices into the cells dict

regions = {'boundary': [0, 2, 3, 4, 5, ...], 'copper': [0, 1, 2, 3, 10, ...]}

or

regions = {'boundary': {'triangle': [0, ...]}, 'copper': {'triangle': [0, ...]}}

I generally like the second approach better because it allows to naturally associate a string with the regions and permits cells to be a member of an arbitrary number of regions (also none at all). Perhaps Exodus-style point sets could also be incorporated in the this approach with the 'point' keyword.

dalcinl · 2018-01-30T18:50:56Z

Your last one is promising, however there is something missing. In some formats (e.g. Gmsh), the region names are mapped to an integer tag. Maybe

regions = {
    'boundary': (int_tag_1, {'triangle': [0, ...]}), 
    'copper': (int_tag_2, {'triangle': [0, ...]}),
}

Also, maybe the region dict should be allowed to have integer keys ? (again thinking on Gmsh, where physical regions are always identified with an integer, and an optional name). Or maybe keys could be tuples ("name", integer_tag) or "name" or integer_tag, the three variants allowed ?

PS: At this point I'm wondering whether meshio should use a container class to hold its data, rather than having read() return many things.

PS: What we are trying to achieve made me remember this joke https://xkcd.com/927/.

nschloe · 2018-01-30T18:55:56Z

Yeah, we should definitely allow integer tags. Also, parallel to cell_data, names and cell types should be ordered like

regions = {'triangle': {'boundary': [0, ...], 8: [0, ...]}}

@dalcinl Do you know if it's guaranteed in gmsh that physical tags and geometrical tags are always different?

nschloe · 2018-01-30T18:57:58Z

Also, good idea about the container type. This would allow us to add more data without breaking the API (as we have to do now).

dalcinl · 2018-01-30T18:58:23Z

In gmsh, physical tags and geometrical tags are orthogonal concepts, they can certainly be the same. You can have let say a boundary face with physical tag 1 that comes from a geometrical entity with tag 1.

PS: So is the tuple[str,int] or str or int not good enough for your?

dalcinl · 2018-01-30T19:02:36Z

Also, eventually you need something like a metadata entry where you can stash stuff that are too much specific to a mesh format, something like metadata = {'format': 'gmsh', stuff, stuff, stuff}, so that you can implement a roundtrip like write('mesh2.msh', *read('mesh1.msh')) and get a mesh2.msh file that matches as much as possible what is inside mesh1.msh without loss of information.

nschloe · 2018-01-30T19:06:35Z

Also, eventually you need something like a metadata entry

All good suggestions. This should probably be made a separate issue.

About the different gmsh tags: Tuples as keys is certainly possible, but seems a bit wasteful. The fact that there are multiple different tag types appears to be very specific to gmsh. A more natural approach seems to be to use

'physical{}'.format(k)
'geometrical{}'.format(k)

as keys. Optionally, the physical list could be replaced by their respective entris in $PhysicalNames.

dalcinl · 2018-01-30T19:12:05Z

Well, to add complications, gmsh physical regions can have names, I submitted a PR some time ago adding support for them. IMHO, Gmsh's geometrical tags are rather useless in a mesh, they are sort of references to the tags of geometrical entities in the *.geo file, abd they would be a good candidate for stashing in a metadata entry.

nschloe · 2018-01-30T19:22:37Z

Well, to add complications, gmsh physical regions can have names

No problem, I got this covered. I've now added gmsh region reading in this branch; will see about writing now. Feedback welcome.

gdmcbain · 2018-01-30T23:16:46Z

PS: At this point I'm wondering whether meshio should use a container class to hold its data, rather than having read() return many things.

Yes, so at the moment I have something like this in my unpublished finite-element module based on meshio and pygmsh. (At least, I hadn't heard of "container class" before but I think I get the idea.)

class Mesh(object):

    dimension = NotImplemented

    def __init__(self, points, cells,
                 point_data=None, cell_data=None, field_data=None):

        self.points, self.cells = points, cells

        self.point_data = {} if point_data is None else point_data
        self.cell_data = {} if cell_data is None else cell_data
        self.field_data = {} if field_data is None else field_data
    
    @classmethod
    def from_geo(cls, geometry: Geometry, *args, **kwargs):
     
        return cls(*generate_mesh(geometry, dim=cls.dimension, *args, **kwargs))

Is that what you meant? If meshio had such a class, I would use it or subclass it, I think, rather than catching the multiple values returned by read (via generate_mesh) in my classmethod constructor.

nschloe · 2018-01-30T23:49:13Z

Let's not mix discussions. I've just opened an issue for the container discussion at #198.

nschloe · 2018-01-31T10:20:02Z

Alright, the regions now kind of work in the branch, but I'm not sure if I'm a hundred percent happy with it. The design is now such that the reader will return something like

regions = {
    'triangle': {
        'physical-1-copper': [...],
        'physical-5-iron': [...],
        'geometrical-8': [...],
        }}

from a gmsh mesh. The lists are indices in the cell array. When writing, you supply such an object to the writer, and tags and physical names are extracted from the keys.

While I think the above format is most useful for applications, I cannot see a mesh format that actually implements tags like that. (Please correct me if I'm wrong.) Even gmsh demands one and only one tag to be present for each cell in both physical and geometrical groups. I don't see much difference to a cell data array with int dtype. This is also how you're supposed to handle subdomains in VTK.

I'm wondering now if we should implement pseudo-regions simply as cell data arrays with a specific name, e.g., 'region-<something>'.

gdmcbain · 2018-01-31T23:53:33Z

I'm not so keen on packing metadata into strings. @dalcinl proposed more structured tags, which generally seems more Pythonic.

Another idea is to store the metadata hierarchically. What's in master at the moment is hierarchical, where in cell_data, each type of cell ('vertex', 'line', …) has a dict with keys 'physical', 'geometric'. Here in #175 we're talking about inverting this, similar to the discussion ‘tags in VTK’ with Andy Bauer cited above, but a hierarchy could be used that way too; e.g. (without having thought this through), in field_data, instead of a flat dict keyed by user-level names of regions (as in master), put a 'physical' key there (and 'geometrical' if desired, though I agree with the assessment of @dalcinl that they're ‘rather useless’).

gdmcbain · 2018-01-31T23:57:34Z

More generally, doesn't the current field_data fulfil what @dalcinl proposed for 'metadata'? What's the difference? Except that at the moment, the keys are user-level names of regions of various dimension, but couldn't those be pushed down one level?

nschloe · 2018-02-01T14:38:02Z

I'm not so keen on packing metadata into strings. dalcinl proposed more structured tags, which generally seems more Pythonic.

Well, Gmsh want's to store a string and an integer with each region. I guess that's what you get when you settle for a format like that, but it isn't pretty. I'm not a fan of making a tuple a key in dictionary, but, as suggested above, what we could do is simply store two separate things:

The regions as int-valued cell data arrays.
The string data as a dictionary (as part of field data)

I think that's more or less what we had done earlier anyways. Advantages of this:

The data structure maps 1:1 what gmsh does in physical and geometrical data.
It seamlessly integrates in our cell data array; no need to change the API.
Cell data can be written to almost all formats, so information isn't lost in translation.

Disadvantages:

Each cell needs to have and only one value for each region array. This is fine for materials (after all, each cell is made of one), but you cannot simply mark a particular boundary set or so.
The Gmsh standard actually allows for marking more than physical and geometrical regions, but that's not covered here. I haven't seen more markings than those two anywhere in the wild though.

What do you think?

dalcinl · 2018-02-01T15:06:38Z

@gdmcbain By metadata I mean a special dict of something that is very specific to the format that you are reading from or writing to. For example, if you read gmsh and next write vtk, the vtk writer can just ignore this metadata, but if you write back gmsh, then you honor it. The use case is doing some pre-processing on the input mesh data, and writing it back in the same format. Of course, it is up to the user to doing the read-transform-write to update the metadata if needed.

dalcinl · 2018-02-01T15:09:59Z

@gdmcbain @nschloe Can you define what field_data means for you? Maybe yes, field_data is what I called metadata, but I'm not sure whether that's the actual meaning.

nschloe · 2018-02-01T15:49:20Z

@dalcinl Remember we're discussing regions here. If it's about the a container class, please move you're contributions to the appropriate issue.

dalcinl · 2018-02-01T15:56:33Z

Yes, I know. Should be regions (or part of the info about them) be considered metadata that is too much format-specific?

nschloe · 2018-02-01T16:04:01Z

First we have to decide how to do regions. If we decide to just wrap them as part of cell data (see suggestion above), we hardly need any meta data.

dalcinl · 2018-02-01T16:18:56Z

Perhaps cell_data is too much generic for enconding regions? For me, point_data and cell_data means scalar or vector fields on either points or cells, in the VTK sense. The gmsh format also support point and cell data (real values, not integers!), but confusing/mixing such data with regions was the root cause of my initial complaints.

If you still want to encode regions with cell data, that's fine, but we need to devise a mechanism to distinguish integral values (for gmsh regions, aka physical/elementary tags) from scalar/vector data (for gmsh NodeData or ElementData).

nschloe · 2018-02-01T18:51:16Z

Perhaps cell_data is too much generic for enconding regions?

Generally you're right here. The thing is that Gmsh doesn't make use of the freedom: Both physical and geometrical data are simply integer arrays over cells, just like cell_data. The gmsh standard does actually allow for more flexibility by tagging, but I've never seen this used anywhere.

nut we need to devise a mechanism to distinguish integral values (for gmsh regions, aka physical/elementary tags)

That should be easy: The two region values of gmsh are called physical and geometrical.

dalcinl · 2018-02-01T22:20:32Z

What's in a name? :-) What would happen if you read from a VTK file with a cell data entry named physical with let say some scalar and vector field, and then you want to write out a gmsh file?

Anyway, you seem to suggest going back to previous approach where physical and geometrical were dumped in gmsh files as integer tags, and not as now in the $ElementData section. I'm certainly not opposed to that, I already said that IMHO the current behavior is a regression. However, maybe I would suggest to somehow namespace these special cell data names, let say something like gmsh:physical and gmsh:geometrical (or maybe region:physical to make it more format-independent), so now the gmsh writer can have some confidence that the data is actually meant to be dumped as region tags. With a bit of namespacing on the keys for cell data, I think you can go back to the original approach of using cell_data, and forget about a special treatment for regions.

gdmcbain · 2018-02-02T01:32:19Z

Thanks for clarifications. This is all sounding O. K. to me as far as it goes, but…

I've just realized why I hadn't been bitten by the change as @dalcinl was this time, whereas I had been back in #169. I don't ask meshio to write .msh files, only read them, and that implicitly by using pygmsh.generate_mesh; the hidden temporary .msh files being written by Gmsh in a subprocess. Thus far, I'm working directly with the five objects returned by generate_mesh (i.e. by meshio.read) and haven't needed to serialize the mesh; if I did, and chose meshio.write and Gmsh .msh for the purpose, I would have hit this problem, yes. I would certainly have expected the physical tagging written by pygmsh and Gmsh into the $Element block to have persisted there.

So this isn't really about what I thought it was when this issue was launched in December; i.e. whether the data structure for the regions should be inverted. This sounds more like #197, but I see that the discussion from there was moved back here.

To return it for the moment to the original question of which way the looking-up of regions should go, I see that Gmsh in the MSH format in the $Element block stores it as element -> region. And if it were in $ElementData it would still be nearly the same: elm-number -> region, 'elm-number' being the index into the list or array of elements. This means that if I only want to integrate over a region, I need to filter either of those look-ups to get the elements or elm-numbers that lie in the region of interest. The intention of the inverse was to have dicts going the other way: region -> elm-number, so that such a traversal would be more direct. I believe, as mentioned above, that this is what Exodus's 'element blocks' do, but studying the spec for Gmsh MSH again, I can't see any way to encode a mapping in that direction without inverting it. It also sounds like VTK has no support for region -> elm-number either. Is Exodus on its own? Anyway, without support in Gmsh or VTK, it's maybe not worth having inverting the look-up read from the $Element block on the way in as one is just going to have to undo that on the way out in most cases.

nschloe · 2018-02-02T11:22:19Z

@gdmcbain Thanks for your comment! If I would sign all of this I could.

For my own applications, I would rather need region -> elm-index, but as you point out correctly, this is unfortunately not what mesh formats implement. Granted, Exodus node sets and side sets is approximately that, but only for, well nodes and "sides". I don't know why they make this restriction. I'm almost inclined

Okay, I think this is how it's gonna go:

Implement regions as integer cell data. As for gmsh's groups, I like @dalcinl's suggestion to prefix the name; 'gmsh:physical' and 'gmsh:geometrical' are probably good enough.
If there ever comes a file format that does region -> element-index (or the need arises for Exodus node, side sets), we can always implement that as proper regions.

nschloe · 2018-02-02T12:40:29Z

Alright, thanks everyone for hanging in there. I've just published 1.11.7 with the proposed fix.

gdmcbain · 2018-02-05T05:31:00Z

What I'm thinking more generally, trying to not to be Gmsh-centric, is that some input or output formats will store regions as element -> region and some as region -> element. (I'm not imagining any other possibilities than these two, though I did try to.)

It's true that we don't know of any mesh formats that support region -> element-index with elements being allowed to belong to more than one region, Exodus II with its Element Blocks does more work in the direction of region -> element than element -> region.

Another reference for the format (especially ‘Anatomy of an Exodus II file’ in §3):

Kostka, T. D. (2013, January). Exomerge user's manual: A lightweight Python interface for manipulating Exodus files. Technical Report SAND2013-0725, Sandia National Laboratories, Livermore, California.

But anyway, not tying meshio too closely to any existing format, I wondered about whether the format-neutral data-structure that meshio builds and populates when called to read and uses when called to write, what if it had both possibilities?

element -> region, say implemented as at present in cell_data[type]['gmsh:physical'] from Regions naive #199
region -> element, not yet implemented but maybe with a new dict cell_data['regions'].

Then in meshio.*_io.read, if reading:

a type 1 format such as Gmsh MSH, populate cell_data[type]['gmsh:physical']
a type 2 format, populate cell_data['regions']

Then in meshio.*_io.write, if we have

cell_data[type]['gmsh:physical'] and we have to write
1. a type 1 format, that's easy, already implemented
2. a type 2 format, do the inversion. Go through the cell_data[type]['gmsh:physical'] and construct the maps region -> element
cell_data['regions'] and we have to write
1. a type 1 format, invert
2. a type 2 format, it should be straightforward.

I imagine not usually having both cell_data[type]['gmsh:physical'] and cell_data['regions'] in the same cell_data dict.

gdmcbain · 2018-02-05T05:52:35Z

I was disappointed to learn by way of the above that

Even gmsh demands one and only one tag to be present for each cell in both physical and geometrical groups.

but indeed, from Gmsh's MSH ASCII file format:

By default, the first tag is the number of the physical entity to which the element belongs; the second is the number of the elementary geometrical entity to which the element belongs; the third is the number of mesh partitions to which the element belongs, followed by the partition ids (negative partition ids indicate ghost cells). A zero tag is equivalent to no tag. Gmsh and most codes using the MSH 2 format require at least the first two tags (physical and elementary tags).

That's quite a shortcoming. I wondered what happens if two physical entities are defined that overlap. I tried this with:

Point (1) = {0, 0, 0, 1};
Point (2) = {1, 0, 0, 1};
Point (3) = {2, 0, 0, 1};

Line (1) = {1, 2};
Line (2) = {2, 3};

Physical Line("left", 1) = {1};
Physical Line("all", 2) = {1, 2};

and found that Gmsh makes two lots of elements over 0 < x < 1.

$MeshFormat
2.2 0 8
$EndMeshFormat
$PhysicalNames
2
1 1 "left"
1 2 "all"
$EndPhysicalNames
$Nodes
3
1 0 0 0
2 1 0 0
3 2 0 0
$EndNodes
$Elements
3
1 1 2 1 1 1 2
2 1 2 2 1 1 2
3 1 2 2 2 2 3
$EndElements

The first two elements here have the same endpoints.

I suppose in practice (and I'm very surprised that I haven't hit this yet in my work) that what one would have to do is define three separate physical entities for the intersection and two relative complements and then use the union of the intersection and one of the two relative complements. Not nice.

nschloe · 2018-02-05T08:43:23Z

Thanks Geordie for the concise write-up!

what if it had both possibilities?

Yeah that'd be nice to have. I believe it'd make sense to wait for an actual use case of this and then design meshio around that. Having cell_data['regions'] will certainly a possibility, another is regions[name], but really we don't have to decide that before a user comes along with a mesh format that supports that.

That's quite a shortcoming. I wondered what happens if two physical entities are defined that overlap. I tried this with: [...]

Very interesting discovery! I'm surprised that not even Gmsh itself makes proper use of its tags. That makes me wonder why they didn't use $ElementData, or why that mesh format was created in the first place when VTK & friends were already around. Well, well...

gdmcbain · 2018-09-07T00:42:28Z

I wondered what happens if two physical entities are defined that overlap.

This was touched on by C. Geuzaine (2018-09-06) as one of the motivations for the introduction of MSH4 #280.

The handling of physical groups in version 2 had numerous issues (see https://gitlab.onelab.info/gmsh/gmsh/issues/94). Two of the most serious were that

if you defined more than 1 group for a given geometrical entity, the elements would be duplicated

storing the physical group for each element was wasteful (as many integers as elements, instead of a single one)

These points might guide #289.

nschloe mentioned this issue Dec 21, 2017

Regression: Meshio doesn't recognize physical groups #169

Closed

nschloe mentioned this issue Jan 30, 2018

GMSH: physical and elementary tags in the $Elements section #197

Closed

gdmcbain mentioned this issue Jan 31, 2018

discussion: container class? #198

Closed

nschloe closed this as completed Feb 2, 2018

gdmcbain mentioned this issue Sep 25, 2018

medit_io: Missing functionality for MEDIT .mesh files. #239

Open

3 tasks

gdmcbain mentioned this issue Nov 9, 2018

piecewise-definition of coefficients on subdomains kinnala/scikit-fem#89

Closed

gdmcbain mentioned this issue Dec 4, 2018

Backward compatibility: make Mesh class iterable and yield points, cells, point_data, cell_data, field_data #319

Closed

tianyikillua mentioned this issue Mar 18, 2019

MED 4.0 to .msh or .xml with entities #347

Closed

gdmcbain mentioned this issue Nov 29, 2019

Mesh.save subdomains and boundaries kinnala/scikit-fem#261

Closed

gdmcbain mentioned this issue Jan 6, 2020

Unification of point / cell sets among formats #619

Open

anufagbemi mentioned this issue Apr 27, 2020

Multi-regions created from Abaqus .inp file to vtk #792

Closed

LTMeyer mentioned this issue Jun 11, 2021

Support for MED multiple meshes #1133

Open

implement regions #175

implement regions #175

Comments

nschloe commented Dec 21, 2017

gdmcbain commented Dec 22, 2017

gdmcbain commented Dec 22, 2017

cbcoutinho commented Dec 22, 2017 via email

gdmcbain commented Dec 23, 2017

gdmcbain commented Dec 23, 2017

nschloe commented Jan 30, 2018

nschloe commented Jan 30, 2018

dalcinl commented Jan 30, 2018

nschloe commented Jan 30, 2018

nschloe commented Jan 30, 2018

dalcinl commented Jan 30, 2018

dalcinl commented Jan 30, 2018

nschloe commented Jan 30, 2018 • edited Loading

dalcinl commented Jan 30, 2018

nschloe commented Jan 30, 2018

gdmcbain commented Jan 30, 2018

nschloe commented Jan 30, 2018

nschloe commented Jan 31, 2018

gdmcbain commented Jan 31, 2018

gdmcbain commented Jan 31, 2018

nschloe commented Feb 1, 2018 • edited Loading

dalcinl commented Feb 1, 2018

dalcinl commented Feb 1, 2018

nschloe commented Feb 1, 2018

dalcinl commented Feb 1, 2018

nschloe commented Feb 1, 2018

dalcinl commented Feb 1, 2018

nschloe commented Feb 1, 2018

dalcinl commented Feb 1, 2018

gdmcbain commented Feb 2, 2018 • edited Loading

nschloe commented Feb 2, 2018 • edited Loading

nschloe commented Feb 2, 2018

gdmcbain commented Feb 5, 2018 • edited Loading

gdmcbain commented Feb 5, 2018

nschloe commented Feb 5, 2018

gdmcbain commented Sep 7, 2018

nschloe commented Jan 30, 2018 •

edited

Loading

nschloe commented Feb 1, 2018 •

edited

Loading

gdmcbain commented Feb 2, 2018 •

edited

Loading

nschloe commented Feb 2, 2018 •

edited

Loading

gdmcbain commented Feb 5, 2018 •

edited

Loading