NXexperiment and NXexperimentdata #580

jacobfilik · 2017-06-22T14:04:10Z

Contributed definitions to allow NXdata groups to be related to each other and an NXdetector.

With the aim of making writing software for visualisation and processing of data collections with multiple detectors in the same scan easier by explicitly describing how the NXdata groups are released to each other and the corresponding NXDetector.

prjemian · 2017-06-22T15:01:59Z

contributed_definitions/NXexperiment.nxdl.xml

+	    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	    xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd">
+	<doc>
+      Describe the contents of an NXentry resulting from a data collection using multiple detectors,


per #469 & #270, make the first line a one-line summary since only the first line will be used to generate the summary documentation about each NXDL shown on http://download.nexusformat.org/doc/html/classes/contributed_definitions/index.html

prjemian · 2017-06-22T15:02:27Z

contributed_definitions/NXexperiment.nxdl.xml

+      Describe the contents of an NXentry resulting from a data collection using multiple detectors,
+      generating raw and derived data, potentially being scanned across multiple dimensions.
+
+      It is not uncommon for a beamline at a synchrotron to collect data from multiple detectors in during a single scan. NeXus files


remove in

prjemian · 2017-06-22T15:04:07Z

contributed_definitions/NXexperimentdata.nxdl.xml

+	    xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd"
+	    >
+	<doc>
+		Describe all the plottable data from a single detector in a single entity. To be used


per #469 & #270, make the first line a one-line summary since only the first line will be used to generate the summary documentation about each NXDL shown on http://download.nexusformat.org/doc/html/classes/contributed_definitions/index.html

prjemian · 2017-06-22T15:05:42Z

contributed_definitions/NXexperimentdata.nxdl.xml

+
+		(**required**) :ref:`NXdata` at least one NXdata, related to the NXdetector
+  </doc>
+  <attribute name="DATANAME_origin">


add nameType="any" (per #562) to indicate that DATANAME is not required text in the name

Just to check, the issue suggests using this in fields, whereas this is an attribute. No other attributes (such as AXISNAME_indices) follow this convention.

prjemian · 2017-06-22T15:09:56Z

contributed_definitions/NXexperimentdata.nxdl.xml

+      name of an :ref:`NXdata` that must be in the same :ref:`NXexperimentdata`.
+    </doc>
+  </attribute>
+  <attribute name="primary">


change name of this attribute to default (per #380)

documentation should mention that an absolute HDF5 path address (starts with /, such as /entry/instrument/detector1/ccd3) must be used.

I would prefer not to have to use the absolute HDF5 path for this, just the name of (one of) the NXdata in this class.

This would be expected to be used more like the signal attribute in NXData, that is, from my collection of objects, which is the one i should actually care about. If it is a path to a location outside the group, it loses all the connections in the group.
For example, NXexperimentdata contains

Xspress3 (/entry/instrument/detector)[NXDetector] AllElementSum(/entry/AllElementSum)[NXData] FeKAlpha(/entry/FeKAlpha)[NXData]

with default wanting to refer to FeKalpha (which has an origin of AllElementSum, which has an origin of Xspress3). If default was /entry/FeKalpha I have to strip the HDF5 path to make any use of the origin attributes.
There is the requirement that all the NXdata groups in this class must have unique names (which could be explicitly stated in the doc) but if there are mulitple NXdata, derived from the same NXDetector, with the same name, that suggests to me bigger issues with the file.

Since this behaviour breaks with the use of default in the issue (by not using the full path), but would work with the external default definition (I assume there is no reason why the default in an NXentry couldn't contain the full path to the NXdata in an NXexperimentaldata), maybe a different name would be better.

rayosborn · 2017-06-23T14:04:28Z

Thanks for making this suggestion, which looks interesting, but I'm struggling to understand exactly how these classes would be used. Could you provide an example file tree, such as your SAXS example, with its various detectors?

jacobfilik · 2017-06-23T14:39:08Z

Happy to, but it will take a few days to find suitable example data and construct the file by hand.

The main issue the classes aim to solve is relating NXdata to each other and NXdetector groups. For simple files with one NXdata and one NXdetector the problem is trivial.

The minute there are multiple NXdata and NXdetector groups there is no way to relate them, except potentially by comparing names/strings which can't be good. (If there is an official NeXus way to achieve this that I have overlooked I am happy to be educated). Relating an NXdata to the NXdetector is crucial, if you want to, say, convert from pixel to q space.

Also, for a grid scan, if one NXdata contains the detector sum (which would be a 2D image), and another contains the detector data (a 3 or 4D block) there seems to be no way to encode that each pixel of the sum is derived from a frame of the detector. Again for one detector you could guess, but for N detectors you are back to comparing names.

I have no emotional attachment to these classes. @markbasham and I have discussed around this issue for a long time, and this seems to be one method of encapsulating the information we need, without just adding more fields to the existing base classes. I'm sure there are many elegant solutions, and I am happy to discuss the most NeXus way to get the information we need from the file.

rayosborn · 2017-06-23T17:55:25Z

It is possible that the functionality that you are looking for is already provided by NeXus links. Here is part of the tree structure of one of the example files distributed with NeXpy:

>>> print(chopper.tree)
chopper:NXroot
  entry:NXentry
    analysis = 'TOFNDGS'
    data:NXdata
      @axes = ['polar_angle' 'time_of_flight']
      @signal = 'data'
      data = int32(148x750)
      polar_angle -> /entry/instrument/detector/polar_angle
        @target = '/entry/instrument/detector/polar_angle'
      time_of_flight -> /entry/instrument/detector/time_of_flight
        @target = '/entry/instrument/detector/time_of_flight'
    instrument:NXinstrument
      detector:NXdetector
        polar_angle = float32(148)
          @target = '/entry/instrument/detector/polar_angle'
        time_of_flight = float32(751)
          @target = '/entry/instrument/detector/time_of_flight'

So polar_angle and time_of_flight are added to the /entry/data NXdata group as links to fields that are stored in the /entry/instrument/detector group. The target attribute can be used to identify the NXdetector group from which the NXdata axes are derived. One of the rationales for treating these as soft links within NeXus (even though they are stored as hard links in the HDF5 file) is so that you can identify what detector group it comes from.

It also allows the mixing of axes from multiple groups. For example, if you are scanning both incident energy and detector angle, the former could be stored in a NXmonochromator group and the latter in a NXdetector group.

Your method may be more transparent, although high-level APIs can make the association clearer. For example, in the attached image from NeXpy, the chain-link icon shows that the item is a link. If you right-click on the link, one of the options is Show Link, which will automatically select the target field in the detector group.

Let us know if this meets your needs, or if you think new classes will still be useful. We will certainly consider them if you do.

jacobfilik · 2017-06-24T20:27:20Z

Thank you, I had seen the @target attribute in Nexus files before, but I could never find anything in the documentation, so assumed it was something internal. This would work perfectly for my first use case (better in-fact including the axes example you show), but I would want to add some doc to the NXData class to describe this use-case. Assuming NXdata is the main place @target is used.

I think the second use case (multiple NXData, some derived from processing others) could be met by clever use of @target and NXprocess. I will have a think about this and possibly submit an alternative pull request, so the relative merits of each can be discussed.

Thanks again for taking time to review and discuss this request, it is greatly appreciated.

prjemian · 2017-06-25T14:50:38Z

Good point. The @target attribute is more general than just the NXdata base class, it is common to all links in a NeXus data file. And you have identified another TODO item in the NeXus manual. The only documentation is an h5dump example of a NeXus link.

That documentation will be addressed with (new) issue #581.

prjemian · 2017-06-25T16:05:36Z

This is where the @target attribute is defined in NeXus: http://download.nexusformat.org/doc/html/nxdl_desc.html#linktype

prjemian · 2017-06-25T17:04:37Z

The Design chapter of the manual describes NeXus links and gives an example use of the @target attribute.

create a NeXus link

jacobfilik · 2017-06-25T18:30:19Z

Brilliant! Thank you, I will start making sure target attributes are added appropriately. I assume it is also correct to add them when the data is externally linked (e.g. unless the data is only written directly in the NXdata i should expect/write @target)?

Finally, is there an approved manner of representing that a certain NXdata contains the result of processing another NXdata? I am considering including an NXprocess in the processed 'NXdata`, showing the provenance. If this is acceptable (and ideally recommended) I will happily close this pull request, as all the functionality I hoped to add is already there.

prjemian · 2017-06-25T21:12:44Z

There is not a standard set for NXdata contains NXprocess, it makes some sense.

Another way is to place both NXprocess and NXprocess as siblings within NXentry. There may be one or more such NXprocess groups in a numbered sequence, describing a series of processing events. Each one of those might result in a NXdata group.

prjemian · 2017-06-25T21:19:48Z

The NAPI code makes no @target attribute assignment for externally linked data.

You'd best avoid using the @target attribute with external file links. Should also be added to the documentation? Seems reasonable at this point.

jacobfilik · 2017-06-25T22:13:01Z

Thank you so much for this again, I just want to check with one final example and then I will close the request (although at this point it sounds like I should try the NAPI if that is the reference implementation).

If I have some data in myfile.h5, and I make an eternal link to it in my NeXus file in the NXdetector at /entry/instrument/detector/data, this should have no @target (it would make absolutely no sense to have one).
If then in the same NeXus file I make an NXdata /entry/mydata/data which is just a link to the data in the NXdetector this should have a @target, because it is important to show that this is the data from the detector (and it is a link to another dataset in the same file, even though the data is in an external file)?

prjemian · 2017-06-25T22:46:38Z

The procedure sounds right. BUT, HDF5 will not let you be do this ("Interfile hard links are not allowed"). Just tried it with Python and h5py, following the external file example in the manual. Continuing that example:

In [62]: f = h5py.File(FILE_HDF5_MASTER, "a")

In [63]: ds = f["/entry/data/counts"]

In [64]: ds.name
Out[64]: u'/entry/instrument/detector/counts'

In [65]: ds.attrs["target"] = ds.name

In [66]: f["/entry/nexus_link"] = ds
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-66-659843c8241d> in <module>()
----> 1 f["/entry/nexus_link"] = ds

h5py\_objects.pyx in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482483473\work\h5py\_objects.c:2705)
()

h5py\_objects.pyx in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482483473\work\h5py\_objects.c:2663)
()

D:\Apps\Anaconda\lib\site-packages\h5py\_hl\group.pyc in __setitem__(self, name, obj)
    275
    276         if isinstance(obj, HLObject):
--> 277             h5o.link(obj.id, self.id, name, lcpl=lcpl, lapl=self._lapl)
    278
    279         elif isinstance(obj, SoftLink):

h5py\_objects.pyx in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482483473\work\h5py\_objects.c:2705)
()

h5py\_objects.pyx in h5py._objects.with_phil.wrapper (C:\Minonda\conda-bld\h5py_1474482483473\work\h5py\_objects.c:2663)
()

h5py\h5o.pyx in h5py.h5o.link (C:\Minonda\conda-bld\h5py_1474482483473\work\h5py\h5o.c:3619)()

RuntimeError: Unable to create link (Interfile hard links are not allowed)

Conclusion is that when you want to make a link to an object in a NeXus HDF5 data file and that object is itself externally linked, then instead of making a hard link at the new location, you make an external file link at the new location back to the data source. Clear enough?

rayosborn · 2017-06-26T02:17:22Z

Your external link example does reveal a limitation in using the @target attribute to logically associate two groups. The only way to add such an attribute to the externally linked HDF5 dataset is for it to be added to the dataset itself in the remote file, where it might not make sense, e.g., it might be an invalid path. In practice, or at least in my own experience, external links are most often used for very large multidimensional arrays, i.e., signal arrays rather than one-dimensional axis arrays. If you stored the axis arrays in the local file, you could still use them to associate the NXdata and NXdetector groups, as in the chopper example, even if the signal array is externally linked in both groups. But I agree that external links expose a flaw in this approach.

jacobfilik · 2017-06-26T08:30:45Z

That's a shame, the @target attribute looks like exactly what I want, but a significant amount of our data is externally linked.

I don't suppose there is a plan to move the @target attribute off the data set into the group (in the same way @signal was moved)? Defining @DATASETNAME_target with all the axes and axes_indicies etc, feels logical to me.

(As an aside, should I close this pull request - i'm now not convinced this is the most NeXus approach - and open an issue pointing at it? Are more people likely to read pull request comments or issues?)

prjemian · 2017-06-26T16:27:27Z

The @target attribute can be used with both fields (a.k.a. HDF5 datasets) and groups to indicate a link to another item in the same file. No plans to change this as this is one of the older internal structures in NeXus, many software tools expect this to remain as they know it now.

Perhaps we re-examine your original intention, which was to associate the NXdata and NXdetector groups. I agree that closing this PR without a merge is the right course, and open a new issue to discuss how best to implement your intention. The scope may expand to include associations between NXprocess, NXlog, and NXnote as well.

mkoennecke · 2017-06-27T14:33:29Z

Hi,

to add my 2c's worth: In a normal in-file link look at the target attribute. For an external link, NAPI
writes the napimount attribute. Which contains a URL like string pointing into the external file.

So, yes, we have the use case for NXexperiment_data well covered.

rayosborn · 2017-06-27T14:46:24Z

Hi Mark,
This discussion has now been transferred to #583. I'm not sure the @napimount attribute does what Jacob needs. It points to the object in the remote file, but Jacob is asking how to link an NXdata and NXdetector group in the same file that happen to have the same external link. Also, does the @napimount attribute still exist once the HDF5 file has been written? I thought it was just used by the API in creating the external link. In HDF5, the external link in the parent file cannot have different attributes to the dataset in the remote file, so once the link has been made and saved to the HDF5 file, these extra attributes must disappear. That is what happens in the Python nexusformat API.

P.S. I see that Jacob was already making a similar point.

jacobfilik · 2017-06-27T14:46:32Z

Hi,

Thanks, but if this points to the external file, the only way I can find the corresponding NXDetector is to parse all the NXDetectors looking for the same url which I don't really want to do. Also, is the napimount atttribute well documented NeXus standard (I haven't seen it before, but I am certainly not a NeXus expert)?

Finally structuring code like:

if (contains target attribute) {
//Can go directly to NXdetector
} else if (contains napimount attribute) {
//walk detectors finding same url
} else {
//walk detectors finding same HDF5 dataset id
}

doesn't seem ideal. I admit that the NXexperiment/NXexperimentdata is not the way we want to do things, but the above can't be the recommended approach?

jacobfilik · 2017-06-28T16:45:37Z

Not the right solution for this functionality, discussion moved to issue #583

Definitions of NXexperiment and NXexperimentdata

68e6946

prjemian reviewed Jun 22, 2017

View reviewed changes

prjemian added a commit that referenced this pull request Jun 25, 2017

fixes #581 highlight target attribute in index (refs #580), steps to

33b59e2

create a NeXus link

prjemian added a commit that referenced this pull request Jun 25, 2017

#580

715b11a

prjemian added a commit that referenced this pull request Jun 25, 2017

#580 add URL to NX5nativeexternallink

ffb1654

prjemian added a commit that referenced this pull request Jun 26, 2017

refs #580

6b8a348

prjemian mentioned this pull request Jun 26, 2017

Cannot mix hard links with external links -- need to check for this nexusformat/code#451

Open

jacobfilik mentioned this pull request Jun 27, 2017

How to relate NXData to NXDetector and each other #583

Open

jacobfilik closed this Jun 28, 2017

prjemian added a commit that referenced this pull request Jun 30, 2017

#577, #580 start a history file

115f939

prjemian added a commit that referenced this pull request Jun 30, 2017

#577, #580 link to main README

ebdf2c9

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NXexperiment and NXexperimentdata #580

NXexperiment and NXexperimentdata #580

jacobfilik commented Jun 22, 2017

prjemian Jun 22, 2017

prjemian Jun 22, 2017

prjemian Jun 22, 2017

prjemian Jun 22, 2017

jacobfilik Jun 23, 2017

prjemian Jun 22, 2017

jacobfilik Jun 23, 2017

rayosborn commented Jun 23, 2017

jacobfilik commented Jun 23, 2017 •

edited

Loading

rayosborn commented Jun 23, 2017

jacobfilik commented Jun 24, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

jacobfilik commented Jun 25, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

jacobfilik commented Jun 25, 2017

prjemian commented Jun 25, 2017

rayosborn commented Jun 26, 2017

jacobfilik commented Jun 26, 2017

prjemian commented Jun 26, 2017

mkoennecke commented Jun 27, 2017

rayosborn commented Jun 27, 2017 •

edited

Loading

jacobfilik commented Jun 27, 2017 •

edited

Loading

jacobfilik commented Jun 28, 2017 •

edited

Loading

NXexperiment and NXexperimentdata #580

NXexperiment and NXexperimentdata #580

Conversation

jacobfilik commented Jun 22, 2017

prjemian Jun 22, 2017

Choose a reason for hiding this comment

prjemian Jun 22, 2017

Choose a reason for hiding this comment

prjemian Jun 22, 2017

Choose a reason for hiding this comment

prjemian Jun 22, 2017

Choose a reason for hiding this comment

jacobfilik Jun 23, 2017

Choose a reason for hiding this comment

prjemian Jun 22, 2017

Choose a reason for hiding this comment

jacobfilik Jun 23, 2017

Choose a reason for hiding this comment

rayosborn commented Jun 23, 2017

jacobfilik commented Jun 23, 2017 • edited Loading

rayosborn commented Jun 23, 2017

jacobfilik commented Jun 24, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

jacobfilik commented Jun 25, 2017

prjemian commented Jun 25, 2017

prjemian commented Jun 25, 2017

jacobfilik commented Jun 25, 2017

prjemian commented Jun 25, 2017

rayosborn commented Jun 26, 2017

jacobfilik commented Jun 26, 2017

prjemian commented Jun 26, 2017

mkoennecke commented Jun 27, 2017

rayosborn commented Jun 27, 2017 • edited Loading

jacobfilik commented Jun 27, 2017 • edited Loading

jacobfilik commented Jun 28, 2017 • edited Loading

jacobfilik commented Jun 23, 2017 •

edited

Loading

rayosborn commented Jun 27, 2017 •

edited

Loading

jacobfilik commented Jun 27, 2017 •

edited

Loading

jacobfilik commented Jun 28, 2017 •

edited

Loading