Use "dataclasses" #60

ziotom78 · 2020-09-23T10:49:03Z

This PR adds the support for
dataclasses, a
feature of Python 3.7+ that is available on Python 3.6 too thanks to a
backward-compatible package.

This PR uses @dataclass to define the following classes:

DetectorInfo (previously Detector)
FreqChannelInfo
A way to quickly read from a TOML file/dictionary a list of detectors, possibly specifying entire channels at once
InstrumentInfo
Documentation
Agree on a few breaking changes for Instrument/InstrumentInfo (Proposal: a few breaking changes #65)

Using `@dataclass` vastly simplifies the implementation of `Detector`, and the new name `DetectorInfo` makes the connection with the IMO quantity clearer.

Apparently, Poetry 1.0 does not handle dependencies on specific versions of Python correctly: python-poetry/poetry#1413 While waiting for Poetry 1.1, this commit tries to fix the problem.

ziotom78 · 2020-09-24T10:09:40Z

The classes DetectorInfo and FreqChannelInfo are ready. This PR however should wait the resolution of the following minor issues in litebird_imo before it can be merged:

The reason why these are needed is because the fields in DetectorInfo and FreqChannelInfo refer to the new field names proposed in these issues.

This is a **breaking** change: the `Instrument` class is renamed to `InstrumentInfo`, to match the names `DetectorInfo` and `FreqChannelInfo`. For consistency, the class has been moved from `instruments.py` to `detectors.py`, which already contained the definition for classes `DetectorInfo` and `FreqChannelInfo`. Moreover, the fields used by `InstrumentInfo` to hold angles can no longer be initialized with values in *degrees*: you must explicitly pass them as radians. This makes the interface to the class clearer: the parameters you pass to the constructor are the same you can access once the object has been constructed. The mock IMO used in tests has been updated to use the name `instrument_info` instead of `focal_plane`, to be consistent with the true IMO. A few new data files have been added to this IMO, to test the ability to produce data files. All the examples in the documentation have been updated accordingly.

This was missing from the previous commit. It implements a general-purpose way to produce a list of `DetectorInfo` objects out of a set of definitions in a dictionary, and it is meant to be used with TOML files. A new page about detectors, frequency channels, and instruments has been added to the documentation.

This might resolve a few problems when installing the "pillow" dependency under Mac OS X: python-pillow/Pillow#4242 (the title of the issue is misleading, as the problem does not affect only Windows user, as it's evident from the thread.)

ziotom78 · 2020-10-16T09:55:09Z

The PR seems in good shape now. Please have a look at the documentation here:

https://litebird-sim.readthedocs.io/en/dataclasses/detectors.html

It will be merged mid next week, if no concerns arise.

dpole · 2020-10-16T10:22:20Z

Hi, I've got two comments.

Starting with the quick one, it seems to me that we should change name to detector.py, even though I don't have a specific suggestion.

Second, we touched the topic elsewhere, but I promised during the last call that I'll drop a line also here. I'm am a bit afraid of the proliferation of classes storing the information of some part of the instrument. As I said in #53 , I prefer bare data: it is less safe, less OO (which is actually a pro to me) but more transparent and minimal (in the sense that it requires the user to get familiar with fewer concepts). I am fine with creating specialized classes for central pieces of information (as in this PR), but, looking more into the future, I am wondering whether we have clear ideas about where to draw the line and how to handle information beyond that line. Otherwise we'll end up with one dataclass per IMO entity.

ziotom78 · 2020-10-16T10:58:29Z

Hi @dpole , a few comments:

detectors.py was named similarly to observations.py, simulations.py, quaternions.py, instruments.py, i.e., plurals. If we decide to change it, we must change everything to stay consistent. Do you think it's worth doing this? As far as I can imagine, the user will just write import litebird_sim, and the name of the sub-modules will be rarely used.
I share your concerns about OO, but in my experience its biggest pain points are:
- Heavy use of complex inheritance hierarchies, which we avoid almost completely in our codebase
- Difficulty to access "private" members from the outside during debugging, which we avoid entirely because Python has no private members
- Having so many "getter" and "setter" methods in classes hurts eyes. However, so far we have avoided these entirely
- Boresome definition of constructors that have to initialize all the members one by one, which this PR solves

While I agree that OO can make the codebase more difficult to understand (the old joke about the NotificationStrategyFactory!), I have never seen any data that backs up the claim that just using plain structs makes things more difficult for users. (Yes, I used the word "struct", because the way we are using classes in this codebase is more similar to C/Julia struct types than to C++ classes.) The ease of use depends on the API, not on the fact that we're avoiding the class keyword and the object.member notation (just two concepts to learn).

Finally, I too would like to avoid having too many classes, but my reason is against it is because of the "too many" part of the sentence, not because any OO construct must be avoided at all costs and in any case. I suggest we employ the following rule (abridged from here):

If the fields are always the same during runtime, implement a dataclass;
If the fields can change at runtime depending on the user's input, use a dictionary.

The three classes DetectorInfo, FreqChannelInfo, and InstrumentInfo are meant to be a one-to-one replica of IMO objects, and thus they fall under point 1. For custom data like the parameters field of the Simulation class, we use the dictionary according to point 2.

dpole · 2020-10-16T11:20:40Z

I see, thanks. I think we're talking about the same (or close enough) shade of gray from different angles. In particular, I agree that if we use dataclasses as C-like structures no complexity is added. This intention wasn't clear to me. A last question to clarify my practical concern about where to draw the line. Suppose that we add information about the HWP in the IMo with all sort of instrumental properties defined in it. How would you handle it? Is the idea to define a dedicated class?

About my first comment, it wasn't about the plural but rather the fact that there is more than detectors in the module (namely channel and instrument).

Finally...I am a bit disappointed because I didn't get the NotificationStrategyFactory joke. It sounded like one of those nerdy things that make my day ;P

ziotom78 · 2020-10-16T12:39:15Z

Suppose that we add information about the HWP in the IMo with all sort of instrumental properties defined in it. How would you handle it? Is the idea to define a dedicated class?

Yes, indeed: if the object is to be read from the IMO, it means that the set of metadata is "frozen" by a Specification Document, and thus it falls in the 1st category above.

In this way, we can get the information about an HWP using the construct

hwp_info = HwpInfo.from_imo(imo, "/ref/to/hwp")

similarly to what we already do with DetectorInfo.from_imo and similar.

Now I get the point about detectors.py. What about something that underlines their close connection with the IMO? E.g., imo_datatypes.

dpole · 2020-10-16T15:07:43Z

Yes, indeed: if the object is to be read from the IMO, it means that the set of metadata is "frozen" by a Specification Document, and thus it falls in the 1st category above.

Maybe I'm overestimating the problem, but I imagine that the IMo will have a considerable number of different entities in the future. We are committing to writing a dataclass replicating the content of the imo specification for every entity. It seems to me a significant burden (maybe unnecessary) on developers.

Beyond the effort issue, maybe it is somehow forbidden by some IMo specification, but I can imagine an entity with many child entities. Walking this hierarchy of entities and calling at each step the proper constructor can lead to quite a bit of complicated code. Having an "anonymous" data container into which any entity may be loaded using a generic function seems a more straightforward and "scalable" solution to me.

ziotom78 · 2020-10-16T15:13:33Z

Maybe I'm overestimating the problem, but I imagine that the IMo will have a considerable number of different entities in the future. We are committing to writing a dataclass replicating the content of the imo specification for the entity. It seems to me an significant burden (maybe unnecessary) on developers.

I do not expect that many details in the IMO will need to be accessed by our framework. (E.g., mechanical structure of the many hardware parts, thermal models, optical models, RF models, circuit diagrams, component datasheets, etc.) We're going to implement dataclasses only for those modules we actually need for the simulation.

Beyond the effort issue, maybe it is somehow forbidden by some IMo specification, but I can imagine an entity with many child entities. Walking this hierarchy of entities and calling at each step the proper constructor can lead to quite a bit of complicated code. Having an "anonymous" data container into which any entity may be loaded using a generic function seems a more straightforward and "scalable" solution to me.

This framework is supposed only to load data files (not entities), and data files, like quantities, have no hierarchy in the current specification of the IMO.

ziotom78 added 7 commits September 23, 2020 11:10

Add dependency on dataclasses for py36

2b45e19

Rename Detector->DetectorInfo and make it a dataclass

5760d37

Using `@dataclass` vastly simplifies the implementation of `Detector`, and the new name `DetectorInfo` makes the connection with the IMO quantity clearer.

Ensure that detector quaternions are normalized

ac2de73

Trying to overcome Poetry's bug

1b8914d

Apparently, Poetry 1.0 does not handle dependencies on specific versions of Python correctly: python-poetry/poetry#1413 While waiting for Poetry 1.1, this commit tries to fix the problem.

Add class FreqChannelInfo

505db6a

Add FreqChannelInfo class

9c102c7

Merge branch 'master' into dataclasses

34f3466

Merge branch 'master' into dataclasses

8e19f5c

ziotom78 mentioned this pull request Oct 9, 2020

from_imo method in the Instrument class #63

Closed

ziotom78 added 8 commits October 9, 2020 12:59

Install a mock IMO in the Travis CI machine

39e83a2

Add bandcenter and bandwidth to DetectorInfo

40cfea5

Fix configuration of mock IMO in Travis

edc5d86

Fix wrong path in Travis script

22092a2

Add debug messages for Travis

778a79b

Fix path in Travis

a3e7e59

ziotom78 mentioned this pull request Oct 9, 2020

Proposal: a few breaking changes #65

Closed

ziotom78 added 3 commits October 14, 2020 12:56

Improve the docs

835c3f8

Add missing parenthesis in example

d9cd1ac

ziotom78 merged commit e5d3ec9 into master Oct 23, 2020

ziotom78 deleted the dataclasses branch October 23, 2020 04:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use "dataclasses" #60

Use "dataclasses" #60

ziotom78 commented Sep 23, 2020 •

edited

Loading

ziotom78 commented Sep 24, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020 •

edited

Loading

ziotom78 commented Oct 16, 2020

Use "dataclasses" #60

Use "dataclasses" #60

Conversation

ziotom78 commented Sep 23, 2020 • edited Loading

ziotom78 commented Sep 24, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020

ziotom78 commented Oct 16, 2020

dpole commented Oct 16, 2020 • edited Loading

ziotom78 commented Oct 16, 2020

ziotom78 commented Sep 23, 2020 •

edited

Loading

dpole commented Oct 16, 2020 •

edited

Loading