Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Species] Add molecular weight attribute #1266

Merged
merged 10 commits into from
Jun 24, 2022

Conversation

bryanwweber
Copy link
Member

@bryanwweber bryanwweber commented Apr 30, 2022

The molecular weight is computed for each species when it is created.
The Phase private attribute m_mwt is not changed.

This will be useful in general, but specifically for #1185. This does not need to go in for 2.6.

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@bryanwweber
Copy link
Member Author

Posted a question about the test failures on the UG: https://groups.google.com/g/cantera-users/c/mdNdCulK4gc/m/3G53KMDzAAAJ

@ischoegl
Copy link
Member

ischoegl commented May 1, 2022

I think you need to implement a Species::setComposition method in C++ that also calculates the weight, similar to what is done in newSpecies (which won't be called as it is only used for the YAML interface).

I also replied on the UG

@speth
Copy link
Member

speth commented May 2, 2022

Can you explain more about why this is a necessary / useful feature? I don't quite understand what role it plays in implementing yaml2ck. One reason that the molecular weight isn't already a property of the Species object is because in the most general case (which includes custom element definitions) it can't be determined without information that's only held by the ThermoPhase object.

@bryanwweber
Copy link
Member Author

Thanks @speth! Good question. It will be useful to simplify sorting species in the CK output. I also think it's a logical place to keep the molecular weight... ideally, a species knows its own data. It's where I'd expect to find it if I had only a Species object.

That said, one of my test failures is because I'm not handling custom elements right now. I think it would make sense to raise an error if a custom element is present and the species is not part of a phase. Then the phase can set the molecular weight when the species is added. What do you think?

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

@speth and @bryanwweber ... I actually agree that having Species know their molecular weights adds a feature of interest. I can think of at least one instance where not having this available for species forced me to use some kludgy work-arounds.

Per implementation, I see that custom elements add a wrinkle to what I said on the UG. My preferred solution here would still be to strengthen the C++ constructor for Species, i.e.

[...] addressing things at the root, by adding a Species(const std::string&, const Composition&, double, double) constructor in C++, exposing it to Cython, and calling it in __cinit__ instead of the default constructor. This constructor would then call the existing Species(const std::string&, const compositionMap&, double, double), which has a more restrictive interface. [...] the pre-existing constructor (and not newSpecies) should handle the calculation of molecular weights

I haven't looked into details about custom elements. As these are just computational entities/placeholders, they presumably don't need a weight, so there isn't an issue? Regarding species that contain (custom or regular) elements that aren't part of the phase, this is absolutely an error, and can be handled separately from the instantiation.

@speth
Copy link
Member

speth commented May 2, 2022

I'm unclear on what the extra constructor is supposed to do differently -- Composition and compositionMap are both just typedefs for std::map<std::string, double>. In any case, I agree that this calculation does not belong in the newSpecies function. The Species constructor is one option, but ThermoPhase::addSpecies is the only place where you're guaranteed to have enough information to do so.

It will be useful to simplify sorting species in the CK output.

Don't you have access to a ThermoPhase object containing the Species when generating this output?

I also think it's a logical place to keep the molecular weight... ideally, a species knows its own data.

I haven't looked into details about custom elements. As these are just computational entities/placeholders, they presumably don't need a weight, so there isn't an issue?

That's one use case. Another is handling isotopes, where of course you do want to include the different value of the atomic weight.

Regarding species that contain (custom or regular) elements that aren't part of the phase, this is absolutely an error, and can be handled separately from the instantiation.

This is already handled, of course. The default behavior for "regular" elements is to just add them to the phase, but this can be changed by calling one of Phase::ignoreUndefinedElements(), Phase::addUndefinedElements(), or Phase::throwUndefinedElements().

I think it would make sense to raise an error if a custom element is present and the species is not part of a phase.

So how do you create such a Species and add it to a phase?

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

I'm unclear on what the extra constructor is supposed to do differently -- Composition and compositionMap are both just typedefs for std::map<std::string, double>.

Ugh. That is correct - didn't track down the definitions

//! Map connecting a string name with a double.
/*!
* This is used mostly to assign concentrations and mole fractions to species.
*/
typedef std::map<std::string, double> compositionMap;
//! Map from string names to doubles. Used for defining species mole/mass
//! fractions, elemental compositions, and reaction stoichiometries.
typedef std::map<std::string, double> Composition;

In this case, I honestly think that there is one typedef too many.

So ... no new constructor needed. Good point about isotopes, and it frankly would have been surprising if there hadn't already been exception handling for undefined elements.

Either way, my personal preference here would be to strengthen the constructor for Species ...

@bryanwweber
Copy link
Member Author

bryanwweber commented May 2, 2022

Don't you have access to a ThermoPhase object containing the Species when generating this output?

Yes, but it feels strange to write:

solution = ct.Solution(...)
species = {s.name: s for s in solution.species()}
dict(sorted(species.items(), key=lambda s: solution[s[0]].molecular_weights))
# or
dict(sorted(species, key=lambda s: solution.molecular_weight(solution.species_index(s[0]))))

or something like that, when it could be:

sorted(solution.species(), key=operator.attrgetter("molecular_weight"))

since, at least in this the yaml2ck script, we'll always have the complete phase.

So how do you create such a Species and add it to a phase?

I meant throw if someone tries to read/get the molecular weight when the Species is not attached to a Phase (possibly also only if there are unknown elements involved). Presumably, once the species is attached to a Phase, the phase can tell it "This is your molecular weight".

@speth
Copy link
Member

speth commented May 2, 2022

I meant throw if someone tries to read/get the molecular weight when the Species is not attached to a Phase (possibly also only if there are unknown elements involved). Presumably, once the species is attached to a Phase, the phase can tell it "This is your molecular weight".

If you want to just let the Phase object set the value stored by the Species when Phase::addSpecies is called, I think I'd withdraw my objection to adding the molecular weight to the Species object. That would avoid the possibility of introducing any inconsistency between the two values.

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

@bryanwweber ... I had the exact same issue (minus resorting to lambda), i.e.

solution = ct.Solution(...)
species = {s.name: s for s in solution.species()}
dict(sorted(species.items(), key=lambda s: solution[s[0]].molecular_weights))

while wanting to implement a dictionary comprehension.

@speth ... I still don't fully understand what the hangup is about phases needing to set molecular weights (instead of species themselves). Can you direct me to a YAML input file where this shows up? (I mention YAML and not C++ here intentionally)

@bryanwweber
Copy link
Member Author

bryanwweber commented May 2, 2022

That would avoid the possibility of introducing any inconsistency between the two values.

@speth I think it would be useful to be able to get the molecular weight of an unattached species, provided it used only the elements/isotopes defined in Elements.h. However, I recognize that this introduces some further complexity to the situation that may not be worth the effort (that is, there's the risk of values becoming inconsistent).

I think this can be resolved by introducing a new/modified constructor as @ischoegl suggested, or perhaps a setter/getter method pair... The question to me is what the signature should look like to be most clear. Or, put another way, how can addSpecies tell the Species constructor "don't worry about molecular weight" and Python (via ct.Species, although this would extend to classes created directly in C++ as well) can say "please calculate and set the molecular weight for me if you know about all these elements and throw an error if I try to get the molecular weight and you found an element you don't know about".

Maybe there isn't a way to resolve this satisfactorily, but I hope there is.

@speth
Copy link
Member

speth commented May 2, 2022

Here is an existing example from the test suite that would lead to errors/inconsistencies with some of the implementations proposed here. Phase definition:

- name: element-remote
thermo: ideal-gas
elements:
- default: [N, O]
- test_subdir/species-elements.yaml/isotopes: [Ar]
species: [O2, NO, N2, AR]
note: replace Argon with custom element from a different file
state: {T: 900.0, P: 5 atm, Y: {O2: 0.4, N2: 0.4, AR: 0.2}}

where the referenced element definitions are:

isotopes:
- symbol: Ar
atomic-weight: 38
- symbol: O18
atomic-weight: 18

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

@speth ... thank you, this sheds some light! This is an interesting case. But the larger question here is: what do you do if you have more than one isotope? ... One alternative would be to specify the isotope further down in the definition of the species, i.e.

- name: AR
composition: {Ar: 1}
thermo:
model: NASA7
temperature-ranges: [300.00, 1000.00, 5000.00]
data:
- [2.500000000E+00, 0.000000000E+00, 0.000000000E+00, 0.000000000E+00,
0.000000000E+00, -7.453750000E+02, 4.366000000E+00]
- [2.500000000E+00, 0.000000000E+00, 0.000000000E+00, 0.000000000E+00,
0.000000000E+00, -7.453750000E+02, 4.366000000E+00]
note: "120186"

for example by adding a key for molecular weight.

(which imho would be a cleaner solution to this conundrum). I believe that would allow for the co-existence of multiple isotopes of the same element.

@speth
Copy link
Member

speth commented May 2, 2022

I think this can be resolved by introducing a new/modified constructor as @ischoegl suggested, or perhaps a setter/getter method pair... The question to me is what the signature should look like to be most clear. Or, put another way, how can addSpecies tell the Species constructor "don't worry about molecular weight" and Python (via ct.Species, although this would extend to classes created directly in C++ as well) can say "please calculate and set the molecular weight for me if you know about all these elements and throw an error if I try to get the molecular weight and you found an element you don't know about".

If you take my suggestion of letting Phase::addSpecies be the place where molecular weight is calculated, I would suggest the following:

  • Add m_molecular_weight (or whatever you want to call it) as a protected member of Species
  • Initialize it to nan in the Species constructor
  • Add a setter that will be called from Phase::addSpecies
  • Add a getter that checks if the value is nan and throws if it is
  • Do not implement the setter in Python

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

While I do not like the Phase::addSpecies avenue at all (it just feels wrong), it avoids code refactoring. Isotopes have come up elsewhere recently, so this is probably a good candidate for a feature request, which could address this more thoroughly (perhaps adding a // @todo: revisit this feature comment when initializing the value may also be good).

That said, what @speth suggests is exactly how things are usually implemented based on our coding style recommendations (no surprises!). I only have one friendly amendment: initialize it to a negative value, as it is an easier sentinel value to handle.

@speth
Copy link
Member

speth commented May 2, 2022

@ischoegl - that test case admittedly isn't super exciting. It uses just Ar as the element name precisely because the goal is to test that the default atomic weight for that element symbol can be overridden.

The way to have multiple isotopes of an element would be to give them distinct names, like O16 and O18. The set of species you might end up defining would start to multiply, e.g. you would have up to three species for what we normally just call O2, with compositions set to {O16: 2}, {O18: 2} and {O16: 1, O18: 1}, and at least in the general case would have slightly different thermo data.

I don't think you want to just store the molecular weight in the species definition (this seems like it's a long way from the original topic of this PR...), because we still want reactions and equilibrium calculations to preserve isotopic abundances.

@ischoegl
Copy link
Member

ischoegl commented May 2, 2022

@speth ... I agree.

The way to have multiple isotopes of an element would be to give them distinct names, like O16 and O18. The set of species you might end up defining would start to multiply, e.g. you would have up to three species for what we normally just call O2, with compositions set to {O16: 2}, {O18: 2} and {O16: 1, O18: 1}, and at least in the general case would have slightly different thermo data.

That makes sense (of course), which leads to a compromise: calculate the molecular weight if it's listed in Elements.h or defer if custom elements are found. To be honest, I find the test example somewhat contrived ("the goal is to test that the default atomic weight for that element symbol can be overridden"), and it's about the only case where what I just said would fail.

@speth
Copy link
Member

speth commented May 3, 2022

To be honest, I find the test example somewhat contrived ("the goal is to test that the default atomic weight for that element symbol can be overridden"), and it's about the only case where what I just said would fail.

The test may be a bit contrived, but being able to do this has been a feature in the CTI format since Cantera 1.6, so it's better to test it than not. I'd agree that overriding the default element weights is kind of a niche feature, but I'm not ready to assert that there's no use case for it just so a Species object not attached to a Phase can still have a molecular_weight property.

@ischoegl
Copy link
Member

ischoegl commented May 3, 2022

Ok. As mentioned above:

While I do not like the Phase::addSpecies avenue at all (it just feels wrong), it avoids code refactoring. […]

That said, what @speth suggests is exactly how things are usually implemented based on our coding style recommendations (no surprises!). I only have one friendly amendment: initialize it to a negative value, as it is an easier sentinel value to handle.

At the same time, with CTI gone we could revisit this historical feature (which I don’t think is good practice). I just don’t like it. If anything, switching out an element at the phase level should not be viewed as an ambiguity, but a deliberate choice by the user, where changing values are to be expected.

@speth
Copy link
Member

speth commented May 3, 2022

At the same time, with CTI gone we could revisit this historical feature (which I don’t think is good practice). I just don’t like it. If anything, switching out an element at the phase level should not be viewed as an ambiguity, but a deliberate choice by the user, where changing values are to be expected.

What ambiguity is there in the existing implementation? You have to be quite explicit in the phase definition to use alternative element definitions, as shown in the example I linked to before:

 - name: element-remote 
   thermo: ideal-gas 
   elements: 
   - default: [N, O] 
   - test_subdir/species-elements.yaml/isotopes: [Ar] 
   species: [O2, NO, N2, AR] 
   note: replace Argon with custom element from a different file 
   state: {T: 900.0, P: 5 atm, Y: {O2: 0.4, N2: 0.4, AR: 0.2}} 

@ischoegl
Copy link
Member

ischoegl commented May 3, 2022

I agree that there is no ambiguity. What I mean here is that on principle, there shouldn't be an issue with a default value (as created by a Species constructor) being overwritten due to a differing specification in the phase YAML input.

PS: I am stopping here, as I'm ok with any path taken at this point - it's ultimately not a big enough issue to me

@bryanwweber
Copy link
Member Author

bryanwweber commented May 4, 2022

@speth / @ischoegl I updated the implementation to remove code from newSpecies and move it to the Species constructor and Phase::addSpecies. I think I hit a compromise of having a useful molecular_weight attribute when a species is created without a phase, while hopefully avoiding inconsistent data. This does need a few more tests to address code coverage and I need to fix anything that breaks running on CI, but I wanted to post some actual code to move the discussion forward, I think we got a little bit off track in the thread here.

FWIW, I think having Phase::addSpecies be able to set attributes on Species is fine. There are places where the abstraction of a "species" is leaky, with one case being isotopes or "custom" elements that the Species can't know about without information that's only available from outside Species (in this case, the elements specification).

@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch from d9a5e55 to 8067877 Compare May 7, 2022 13:51
@codecov
Copy link

codecov bot commented May 7, 2022

Codecov Report

Merging #1266 (27d1af3) into main (8f4c212) will increase coverage by 0.02%.
The diff coverage is 95.60%.

@@            Coverage Diff             @@
##             main    #1266      +/-   ##
==========================================
+ Coverage   68.02%   68.04%   +0.02%     
==========================================
  Files         314      314              
  Lines       41966    42007      +41     
  Branches    16863    16880      +17     
==========================================
+ Hits        28547    28584      +37     
  Misses      11166    11166              
- Partials     2253     2257       +4     
Impacted Files Coverage Δ
src/thermo/Elements.cpp 88.00% <94.73%> (-1.16%) ⬇️
src/thermo/Species.cpp 91.46% <95.45%> (+1.46%) ⬆️
src/thermo/Phase.cpp 82.56% <100.00%> (+0.06%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch from 8067877 to 918d14b Compare May 16, 2022 20:38
@bryanwweber
Copy link
Member Author

@speth / @ischoegl I think I need to add some more tests here to address the coverage, but can you guys take a look at the C++ changes? I think I got things correct, but I'd be happy to have some advice on improvements, especially with respect to improving the memory usage of the new functions I've added.

@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch from 918d14b to 9d8a815 Compare May 17, 2022 15:09
@bryanwweber
Copy link
Member Author

bryanwweber commented May 17, 2022

I think I fixed the pre-Python 3.9 failures with Element. Python 3.9 adds the ability to chain the classmethod decorator with the property decorator, see https://docs.python.org/3/library/functions.html#classmethod. Once we drop Python 3.8 support, we can remove the _element_names and _element_symbols helper methods. I'll add a note to that effect.

Edit: The helper functions are necessary because using a list/tuple comprehension with the elementSymbols() C++ function directly results in a compiler error.

@bryanwweber bryanwweber requested a review from ischoegl May 17, 2022 23:55
@speth speth self-requested a review May 25, 2022 03:08
@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch from 3d868ac to 78c7591 Compare May 25, 2022 08:42
Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @bryanwweber. I had a few more suggested updates.

interfaces/cython/cantera/thermo.pyx Outdated Show resolved Hide resolved
include/cantera/thermo/Elements.h Outdated Show resolved Hide resolved
include/cantera/thermo/Species.h Outdated Show resolved Hide resolved
include/cantera/thermo/Species.h Outdated Show resolved Hide resolved
include/cantera/thermo/Species.h Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/thermo.pyx Show resolved Hide resolved
src/thermo/Phase.cpp Outdated Show resolved Hide resolved
src/thermo/Species.cpp Outdated Show resolved Hide resolved
@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch 2 times, most recently from bb3f91d to 9d4b0f1 Compare May 25, 2022 17:05
@bryanwweber bryanwweber added enhancement Species Issues related to species definitions labels May 26, 2022
@bryanwweber bryanwweber force-pushed the add-species-molecular-weight branch from 9d4b0f1 to e6d3fad Compare May 26, 2022 15:58
Copy link
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @bryanwweber ... thanks for working on this, as I believe that this is a nice addition. I have a few suggestions that should simplify a couple of things.

One thing I believe could be improved in this context (although I do not suggest to add to this PR) is the handling of modified element or isotope weights - while this can be done (see the element-override example), it appears to exist completely outside of the element handler.

include/cantera/thermo/Elements.h Outdated Show resolved Hide resolved
include/cantera/thermo/Elements.h Outdated Show resolved Hide resolved
include/cantera/thermo/Elements.h Outdated Show resolved Hide resolved
include/cantera/thermo/Species.h Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Show resolved Hide resolved
src/thermo/Elements.cpp Outdated Show resolved Hide resolved
include/cantera/thermo/Elements.h Show resolved Hide resolved
src/thermo/Elements.cpp Show resolved Hide resolved
src/thermo/Elements.cpp Show resolved Hide resolved
src/thermo/Species.cpp Outdated Show resolved Hide resolved
bryanwweber and others added 8 commits June 13, 2022 08:38
The type is more properly the size_t type since it's a size not just an
integer.
The methods added here get the element symbols and names as vectors of
strings and a map of the element symbols and names to the atomic
weights. This is simplified by converting the weight tables to
std::vectors instead of C-style vectors.
The molecular weight is computed for each species when the molecular
weight is accessed on the species or added to a phase. The Phase
private attribute m_mwt is not changed.
Since setting the species molecular weight can throw, all modifications
to Phase member variables should be done after calling
Species::setMolecularWeight() so that the Phase instance is not left in
an inconsistent state.
If the species molecular weight is undefined when retrieved, the weight
is computed in the getter. This simplifies the logic in the setter,
which now only has to worry about checking the value, not computing the
value.
ischoegl
ischoegl previously approved these changes Jun 15, 2022
Copy link
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I did notice some really minor formatting issues that are trivially easy.

src/thermo/Elements.cpp Outdated Show resolved Hide resolved
src/thermo/Elements.cpp Outdated Show resolved Hide resolved
src/thermo/Elements.cpp Outdated Show resolved Hide resolved
Co-authored-by: Ingmar Schoegl <[email protected]>
Copy link
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@speth speth merged commit abe3137 into Cantera:main Jun 24, 2022
@bryanwweber bryanwweber deleted the add-species-molecular-weight branch August 2, 2024 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Species Issues related to species definitions
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

3 participants