WIP: Implement Pydra code-generators #2665

tclose · 2023-06-29T03:55:41Z

Implementation of #2605

Tasks:

__print_pydra_usage__ for C++ commands.
__print_pydra_usage__ for Python commands
Script to loop through all cmds and print their "pydra usage" to *.py files
Fix up package TOML (maybe switch to hatchling?)
GitHub action to be run on new release to upload pydra-mrtrix3 package to PyPI
Tests to check generated code can print cmdlines properly (@Lestropie, do you have any ideas for this? I was thinking if there was a hidden "dry-run" option then

Blocked by:

Complete run.command(): Support for command-line piping #1392 so that Python commands type file-types

Lestropie · 2023-06-29T04:44:37Z

As mentioned in #2605, Python will require #1392 via ankitasanil#1 to have typed command-line parameters.
@ankitasanil: Still interest in completing the PR, or will @MRtrix3/mrtrix3-devs need to finish it off?

tclose · 2023-06-30T00:45:22Z

As mentioned in #2605, Python will require #1392 via ankitasanil#1 to have typed command-line parameters. @ankitasanil: Still interest in completing the PR, or will @MRtrix3/mrtrix3-devs need to finish it off?

I have written the Python version as best as I can with the current level of typing, but you are right that it will need more type information to get it to work in a guaranteed way.

On typing of files, how many different formats does MRtrix interact with (e.g. bvec, bval, response.txt, etc...) and would it be too much to ask to be able to give each format its own ArgType (like Image* and Track* do currently)? Because I have a PR that refactors Pydra's type-checking to be aware of file-formats (nipype/pydra#662), and it would be great to be able to specify them exactly instead of defaulting to a generic FileSet class.

Lestropie · 2023-06-30T01:54:12Z

how many different (file) formats does MRtrix interact with

Assuming upstream exclusion of images & tractograms from that set, here's what comes to mind:

1D & 2D numerical data input & output (currently ASCII only but also NPY after Npy file format support #2437)
TSFs
JSONs (in multiple contexts)
.dcm files
Other text files:
- Lists of paths to other files (eg. stats command inputs)
- Generic tabulated data (eg. parcellation lookup tables but others also)

As far as further custom types in user interfacing, I would argue that:

That could potentially be the case for TSFs / JSONs / .dcms, since those represent singular unambiguous cases of file extension / content of data / requisite parsing code / interpretation of data.
The specific examples you provide (bvec / bval / response functions) I'm more sceptical of, as they are just numerical data, albeit sometimes with expectations of dimensionality / size; they're just subsets of "Nd numerical data" to which certain interpretations are applied after the data are parsed.

Curious to hear others' thoughts.

tclose · 2023-06-30T02:05:11Z

how many different (file) formats does MRtrix interact with

Assuming upstream exclusion of images & tractograms from that set, here's what comes to mind:

1D & 2D numerical data input & output (currently ASCII only but also NPY after Npy file format support #2437)

TSFs

JSONs (in multiple contexts)

.dcm files

Other text files:

Lists of paths to other files (eg. stats command inputs)

Generic tabulated data (eg. parcellation lookup tables but others also)

As far as further custom types in user interfacing, I would argue that:

That could potentially be the case for TSFs / JSONs / .dcms, since those represent singular unambiguous cases of file extension / content of data / requisite parsing code / interpretation of data.

The specific examples you provide (bvec / bval / response functions) I'm more sceptical of, as they are just numerical data, albeit sometimes with expectations of dimensionality / size; they're just subsets of "Nd numerical data" to which certain interpretations are applied after the data are parsed.

Curious to hear others' thoughts.

Yes, maybe file formats is a bit of a misnomer, data types would be more accurate/actually what is captured in Pydra. In any case, it would be very helpful on the Pydra code-gen front if these file-data-types could be specified at bit finer granularity

tclose · 2023-07-12T06:23:55Z

Ok, so this just needs some unit tests to test out the generated Pydra interfaces and to wait for #1392.

@Lestropie, on the unit tests, do you think it would be an issue if I added a --dry-run option to the general app options, which basically tests to see whether the provided options parse and then exits?

tclose · 2023-07-12T06:28:28Z

Ok, so this just needs some unit tests to test out the generated Pydra interfaces and to wait for #1392.

@Lestropie, on the unit tests, do you think it would be an issue if I added a --dry-run option to the general app options, which basically tests to see whether the provided options parse and then exits?

If you don't want to clutter the help menu further, maybe there could be special configure arg to insert it in?

Lestropie · 2023-08-18T01:17:36Z

One note I might throw in here just because I encountered it while working on the code and it's of some limited relevance.

For a command-line option, in C++, you specify the option, and then you can specify some number of compulsory command-line arguments that must follow that option, and get consumed from the command-line by the parser immediately following the appearance of the option. Each of those arguments may have its own type.

In Python, this can't currently be done. In argparse, the option itself is assigned a type, as well as a number of arguments. Therefore you can't have >1 arguments following a command-line option where those arguments have different types to one another.

daljit46 · 2023-08-24T09:19:40Z

Please note that this PR will need to conform to the formatting requirements introduced in #2652, thus all C++ files need to be formatted using clang-format (version 16). See here.

tclose · 2023-08-25T05:22:26Z

In Python, this can't currently be done. In argparse, the option itself is assigned a type, as well as a number of arguments. Therefore you can't have >1 arguments following a command-line option where those arguments have different types to one another.

I think the typical way to represent this is to use the typing.Tuple[MyType1, MyType2] form

Lestropie · 2024-02-05T02:08:58Z

I think the typical way to represent this is to use the typing.Tuple[MyType1, MyType2] form

Having had a small taste of Pydra now, this might be better than doing the most permissible thing of just accepting any string in such a scenario, but doesn't fully address the issue to the extent that the MRtrix3 C++ interface does, at least to the best of my understanding.

Say for instance a command-line option takes as input a string and then an integer. Yes you could make that command-line option a typing.Tuple[str, int], but that would mean that the user could erroneously specify an integer and then a string, and that wouldn't be caught by the command-line parser.

PS. Relevance to #2580: making the argument types in C++ a bitwise flag indicating what set of types is permissible would be faithful to the typing.Tuple[MyType1, MyType2] usage I've seen.

…ory of repo

…make it more pydronic

…ent pipeline

Lestropie · 2024-02-28T09:40:02Z

@tclose: After discussion with @jdtournier, there is trepidation about embedding the Pydra interfaces within the repository.

On the plus side, it would enforce consistency, PR changeset would show interface changes that might otherwise be invisible to the CI checks of the online documentation, and be convenient from the perspective of workflow managers grabbing both the software and the interface to it.

On the minus side, it's implicitly taking on management responsibility for interfaces to an environment for which none of the core development team have experience; even if the interfaces were to be committed in a functional state, and the MRtrix3 commands were to not change, it would nevertheless be possible for there to be changes to Pydra that would necessitate revisions here.

What would it look like from the Pydra perspective if the MRtrix3 Pydra interface were to be provided as a stand-alone repository? Ie. What would be the magnitude of inconvenience? There's obviously the issue that on a tagged release of MRtrix3 you would need to update that external repository and generate a corresponding tag.

tclose · 2024-02-28T11:47:31Z

@Lestropie @jdtournier just to clarify that the interfaces are not being stored, instead generated from the MRtrix C++ & Python command declarations (see __print_usage_pydra__). Therefore, the interfaces should evolve with changes to the MRtrix commands as long as the command definition syntax stays stable.

It would be great if the print_usage_pydra special command (or whatever it is called) can live in the MRtrix source, as I'm not sure it will be easy to implement/maintain that in a separate code-base (if possible at all). However, the generation code and Python package can be stored in a separate repository if required. This is what has been typically done for other toolboxes.

We are hoping to move to a model where the Pydra interfaces could be integrated into the tool source repos to make it easier for them them stay in step with each other. Given it is my code, I'm happy to put my hand up to deal with any issues that might arise, as I would need to do this if it were stored in a separate repository. But in any case, totally up to you guys, I can integrate the changes into the existing nipype/pydra-mrtrix3 repo instead.

jdtournier · 2024-02-28T12:19:30Z

Hey @tclose!
Just a few points/questions:

No problem at all from our point of view to include the code responsible for producing the interface definitions. That would indeed be really tricky to maintain outside of the core repo, but it's not something that I expect to change much once implemented, and it makes sense for it to be there. I would opt for a name such as __print_usage_json__ (assuming Pydra does expect JSON...) just to avoid giving the impression that this can only be used in Pydra - if it's JSON, there's no reason it can't be used in other contexts as well .

The bigger question from my point of view is whether to store the generated definitions in the core repo. It's trivial to generate them at build-time, and maybe make it part of the CMake install or something. But that doesn't require the generated interface definitions themselves to also be stored on the main repo. What I was trying to figure out was how you envisage these files to be used in practice. If they're stored alongside a full MRtrix install, it would be possible for an application like Pydra to query them at run-time and offer a version of the interface that is always guaranteed to be in lock-step with the version of MRtrix installed. If they're stored elsewhere, then there is always the risk of a mismatch between the version installed and the interface files, which could lead to odd errors. I expect with that kind of approach, you would need to host interface definitions for each version of MRtrix, along with the ability to determine the current version of MRtrix, and would only guarantee correct operation for official release versions. But the main reason I'm discussing this is because in the former case, I don't see a need to host these interface definitions on the main repo, while I might see a call for that with the latter approach (though I still think it would be cleaner from our point of view to have them hosted elsewhere, presumably somewhere more obviously related to the Pydra project).

What do you reckon?

Lestropie · 2024-02-28T23:09:07Z

It might be the case that we need to consider separately:

The implementation of __print_usage_pydra__
(which without a doubt should be implemented within the main repository, even if it may one day need to be changed in response to Pydra changes)
The storage of the outcomes of __print_usage_pydra__, which would extend Generate interfaces #2808, and potentially not require any more developer effort (eg. have a single script that generates both the interfaces and the online documentation), and CI / code review would potentially highlight CLI changes not exposed by help page changes.
The additional content currently present in this PR.
Current discussion with @jdtournier may have overlooked the fact that there's a lot of additional content here over and above the direct outputs from __print_usage_pydra__. Which if your concern is developer maintenance burden, this would exacerbate that quite a lot.

I wondered about an intermediate solution where we do 1 and 2 but not 3. However I don't think that that solves @jdtournier's concerns, and it might not make @tclose's handling any cleaner.

cmd/amp2response.cpp

…trix3

tclose · 2024-02-29T07:36:24Z

Ooops, I just realised that in between switching between the old build script and the cmake version I inadvertently committed my old pre-cmake build directory to the branch, which might have been confusing things. After speaking to Rob today we decided to strip everything out of this PR except to the print_usage_pydra special methods in favour of storing the package stub in https://github.com/nipype/pydra-mrtrix3. However, if you don't mind, I have kept in the GH action workflow which generates the interfaces and runs the tests with the PARSE_ONLY env var set.

The Pydra export is actually Python code rather than JSON so it will need to be specific to Pydra if that is ok. Also, the interface syntax is likely/hopefully going to change a fair bit in the near future if my proposal is accepted, so we can probably just leave this PR as a WIP until then (I can build and release the package on PyPI with this dev branch in the mean time) and the first cmake release.

tclose · 2024-02-29T07:39:57Z

cmd/amp2response.cpp

@@ -57,7 +57,7 @@ void usage() {
  ARGUMENTS
    + Argument ("amps", "the amplitudes image").type_image_in()
    + Argument ("mask", "the mask containing the voxels from which to estimate the response function").type_image_in()
-    + Argument ("directions", "a 4D image containing the estimated fibre directions").type_image_in()
+    + Argument ("directions_image", "a 4D image containing the estimated fibre directions").type_image_in()


I changed this to avoid a clash with an option of the same name

(Hid / resolved prior post about the same issue)

Issue here is that the argument "directions" and the command-line option "directions" encode two very different pieces of information; so disambiguating by calling one an "image" may not be the optimal change. I'm hesitant to change the command-line option, as there are many commands relating to "amp" images that have an option "-directions" (even if it's not currently centralised with a const Option in core/ / src/). Conversely, the functionality of the positional argument here also appears in other commands; so renaming it to something different only here would also be slightly unusual.

A few possibilities come to mind:

Across various commands (thinking dwi2response in particular), where an image encoding a unit 3-vector per voxel is used, adopt "orientations" rather than "directions".

While it's technically applicable to the data type in terms of command naming convention, "peaks" would likely not be a good choice here, since can't have multiple vectors per voxel, and the data may have been derived using some approach other than ODF peak-finding.

Rename here (and preferably in other comparable instances also) to "fibre_directions".

Create a function that returns an Option, creating the current "-directions" command-line option with a custom help string per use (I'm pretty sure from experience it's the considerable difference in these help strings that has caused there to not be centralisation of this command-line option). Once this works, change the command-line option name to something like "-amp_directions".
(We might need to think carefully about our use of the header field "Directions" also...)

Lestropie · 2024-03-11T11:13:51Z

Add test data and tests for new interface generation capability (both C++ and Python)

Lestropie marked this pull request as draft June 29, 2023 04:44

tclose force-pushed the pydra-usage branch from 0a07600 to 8cb732d Compare July 12, 2023 06:37

This was referenced Jul 12, 2023

Python CLI changes #2678

Merged

Runtime test of command-line parsing #2679

Open

Lestropie mentioned this pull request Aug 11, 2023

Docs: Fix missing boolean types in full usage #2693

Open

tclose force-pushed the pydra-usage branch 3 times, most recently from cb2e711 to 0dff23a Compare August 22, 2023 07:44

This was referenced Jan 31, 2024

mrcat: Change positional argument structure #2792

Merged

Save software interfaces in multiple formats #2798

Open

Lestropie mentioned this pull request Feb 19, 2024

Generate interfaces #2808

Closed

tclose added 9 commits February 22, 2024 16:17

implemented print_pydra_usage for C++ commands

057980c

touch up imports in pydra_usage_print output

c8568b0

implementing type-aware output_file_template

ea474a5

fixed up output_file_template

cae2e23

added return statement to make gcc happy

e5c56f8

implemented python __print_usage_pydra__ option

1451c06

cleaned up pylint issues in app.py

3152260

debugged handling of type printing in python __print_usage_pydra__

35fbece

started to add in pydra-package to be auto-generated as nested direct…

192e30c

…ory of repo

tclose added 3 commits February 22, 2024 20:59

moved pydra src inside 'interfaces' directory

a6dfbee

added _version.py to gitignore

3350178

removed stray .DS_Store

86271ff

tclose force-pushed the pydra-usage branch from 025b718 to 86271ff Compare February 22, 2024 10:02

tclose added 12 commits February 23, 2024 10:00

replace 'input' and 'output' fields with "in_file" and "out_file" to …

ea3d590

…make it more pydronic

added pytest_cache to gitignore

8adb7ff

removed generated pydra files from git repo

9f41b6a

touch up

3fecc1f

updating pydra package github action

147993b

touched up pydra GH action workflow

76d6bd8

debugging pydra GH action

d1c2ddd

fix up

617d9b2

log generation errors instead of raising them to test rest of deploym…

6bfef76

…ent pipeline

touched up logging in generation script

8f90927

fix up GH action

520bcb1

fix up

834da49

tclose added 2 commits February 29, 2024 12:52

delete stray generated files

b5d2802

removed more stray files

2072799

tclose commented Feb 29, 2024

View reviewed changes

cmd/amp2response.cpp Outdated Show resolved Hide resolved

tclose added 3 commits February 29, 2024 18:20

stripped back pydra GH action to be testing only

5c30f14

deleted pydra package stub in favour of storing it in nipype/pydra-mr…

098361b

…trix3

removed flake8 config

0ff5bd1

Merge branch 'dev' into pydra-usage

b24273f

tclose commented Feb 29, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Implement Pydra code-generators #2665

WIP: Implement Pydra code-generators #2665

tclose commented Jun 29, 2023 •

edited

Loading

Lestropie commented Jun 29, 2023

tclose commented Jun 30, 2023 •

edited

Loading

Lestropie commented Jun 30, 2023

tclose commented Jun 30, 2023

tclose commented Jul 12, 2023

tclose commented Jul 12, 2023

Lestropie commented Aug 18, 2023

daljit46 commented Aug 24, 2023 •

edited

Loading

tclose commented Aug 25, 2023

Lestropie commented Feb 5, 2024

Lestropie commented Feb 28, 2024

tclose commented Feb 28, 2024

jdtournier commented Feb 28, 2024

Lestropie commented Feb 28, 2024

tclose commented Feb 29, 2024

tclose Feb 29, 2024

Lestropie Mar 1, 2024

Lestropie commented Mar 11, 2024

WIP: Implement Pydra code-generators #2665

Are you sure you want to change the base?

WIP: Implement Pydra code-generators #2665

Conversation

tclose commented Jun 29, 2023 • edited Loading

Tasks:

Blocked by:

Lestropie commented Jun 29, 2023

tclose commented Jun 30, 2023 • edited Loading

Lestropie commented Jun 30, 2023

tclose commented Jun 30, 2023

tclose commented Jul 12, 2023

tclose commented Jul 12, 2023

Lestropie commented Aug 18, 2023

daljit46 commented Aug 24, 2023 • edited Loading

tclose commented Aug 25, 2023

Lestropie commented Feb 5, 2024

Lestropie commented Feb 28, 2024

tclose commented Feb 28, 2024

jdtournier commented Feb 28, 2024

Lestropie commented Feb 28, 2024

tclose commented Feb 29, 2024

tclose Feb 29, 2024

Choose a reason for hiding this comment

Lestropie Mar 1, 2024

Choose a reason for hiding this comment

Lestropie commented Mar 11, 2024

tclose commented Jun 29, 2023 •

edited

Loading

tclose commented Jun 30, 2023 •

edited

Loading

daljit46 commented Aug 24, 2023 •

edited

Loading