Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement v1 file format [WIP] #299

Draft
wants to merge 269 commits into
base: master
Choose a base branch
from
Draft

Implement v1 file format [WIP] #299

wants to merge 269 commits into from

Conversation

t7phy
Copy link
Member

@t7phy t7phy commented Jul 12, 2024

Feature TODOs:

  • Generalize Channel to support arbitrarily many PIDs. This is implemented with commits f32902a, 2417c92 and 03f2dc6.
  • The structure Order should support a fragmentation scale. Started with commits 5e281ae and 02e48e9.
  • The CLI should support varying this new fragmentation scale. We need the following point variations: 1 (no scale variation), 3 (vary all scales with the same factor), 7 and 9 (assuming that there's no fragmentation scale), and 17 and 27 (assuming there's a fragmentation scale)
  • Extend the evolution code to take care of (at least) one, two and three convolutions with possibly one, two and three different EKOs
  • Finally, we need new subgrid types that support possibly arbitrarily many scales and momentum fractions and which support filling or are optimized for space. We should use the PackedArray struct that we wrote some time ago.
  • To fix the tests, we need import v0 PineAPPL grids and convert the subgrids into the new subgrid types.
  • Add machine-readable MC results and uncertainties to calculate interpolation error #270

Code TODOs:

  • Fill in all the TODO comments.
  • Change types of members of Order to u8
  • Generalize channel! to arbitrarily many PIDs
  • Support arbitrarily many scales and fix import of flexible-scale fastNLO tables
  • Test evolution for flexible-scale fastNLO tables
  • Simplify bin treatment: remove BinInfo, and have instead fill limits (1d limits that only concern Grid::fill) and bin limits (n-dimensional limits)
  • Add fragmentation scales to Grid::evolve_info
  • Make it possible to have different initial factorization and fragmentation scales in an FK-table
  • Rename lagrange_subgrid
  • Add support for static scale detection in previous subgrid type
  • Rename PackedQ1X2SubgridV1
  • Remove NodeValues
  • Remove Mu2
  • Implement D=3 for general_tensor_mul in evolution.rs
  • In Grid::merge check if grids are compatible with each other before merging
  • Check whether there are similar functions

CAPI TODO:

  • increase code coverage
  • unbreak GridOptFlags (see commit 54a59f3)
  • rename pineappl_channel_* to pineappl_channels_*?

Fortran module TODO:

  • add new functions

Python API:

  • update to pyo3-0.22.5, this possible now since numpy-0.22.0 was released. Figure out what new features we can leverage from 0.22.x (see its CHANGELOG):
    • numpy-2 support
    • declarative modules and submodules
    • Python 3.13
    • recheck whether we still need #![allow(unsafe_op_in_unsafe_fn)]
  • Update tutorials
  • raise minimum required Python version to 3.7 in pyproject.toml

@t7phy t7phy linked an issue Jul 12, 2024 that may be closed by this pull request
15 tasks
@cschwan cschwan linked an issue Jul 31, 2024 that may be closed by this pull request
5 tasks
@cschwan cschwan removed a link to an issue Jul 31, 2024
15 tasks
@cschwan
Copy link
Contributor

cschwan commented Jul 31, 2024

I believe the list shown in #118 is probably a bit too ambitious for this pull request. Realistically, we should implement all the items listed in #172.

@cschwan cschwan changed the title makes the necessary changes for v1 format [WIP] Implement v1 file format [WIP] Aug 6, 2024
/// slices : Iterable
/// list of (PyOperatorSliceInfo, 5D array) describing each convolution
/// slices : list(list(tuple(PyOperatorSliceInfo, PyReadOnlyArray4)))
/// list of EKOs where each element is a list of (PyOperatorSliceInfo, 5D array)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// list of EKOs where each element is a list of (PyOperatorSliceInfo, 5D array)
/// list of EKOs where each element is a list of (PyOperatorSliceInfo, 4D array)

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is indeed a 4D,

Btw, this is what I briefly asked in yesterday's CM re pineko (to which I now know the answer). For the time being, each operator is passed as a vector but it'll be changed to a generator.

@Radonirinaunimi
Copy link
Member

There are two roughly things:

1. the submodule `register` functions add some manual `import sys; sys.modules['...'] = m`. I wonder whether this has been fixed/added to the new PyO3 version

2. If that's the case, we could try to completely get rid of the `register` function by using the declarative syntax.

I've pondered on this (cc comment) for while now and I don't think there is a way (with all so far PyO3 versions) to use declarative modules and have all of the following works:

import pineappl
from pineappl import grid
from pineappl.grid import Grid

So if we still want to have all of the nested modules to get expanded, we are stuck with the current approach.

@Radonirinaunimi
Copy link
Member

The python and C interfaces are more or less at a good stage now, so I'll them at these for the time being. If there is anything I can help with the Fortran part or the main PineAPPL crate just let me know.

- in the column 'svmaxreldiff' only show differences from scale-varied
  results, not from the central result
- preserve sign which was previously stripped away
@janw20
Copy link
Collaborator

janw20 commented Nov 15, 2024

Hi, I completed the Fortran interface. I made some (minor) design decisions that I would like your opinion on:

  • Since Kinematics is an enum with tuple-like variants, it is converted to a C struct holding a tag variable and a C union. Fortran doesn't support unions, so I use a Fortran derived type (pineappl_kinematics) that also has a tag variable, but only one integer variable for the index instead of the C union. I found this to be the cleanest solution, or do you have a better idea?
  • For some reason, Fortran cannot do arrays of procedure pointers, so I created a new derived type for the PDF and alpha_s functions, pineappl_xfx and pineappl_alphas, that each holds the procedure pointer as a member named proc. This is needed so that an array of PDF functions can be passed to pineappl_grid_convolve. For consistency, I also changed this in pineappl_grid_convolve_with_one and pineappl_grid_convolve_with_two. This goes against keeping backward compatibility with the 0.x of Pineappl versions, but until there is a solution to Investigate if we can compile/install the Fortran module #257, I figured that the user would make their own copy of pineappl.f90 anyway and use that, so backward compatibility is not that important in the Fortran case.

I also noticed two small bugs in the v1 C++ examples which I fixed in my first commit. Related to that: Is there a reason why pineappl_grid_new creates the kinematics in the order (Q², x₁, x₂), and not (x₁, x₂, Q²), as was always the order in, e.g., pineappl_grid_fill? I think this might have been the reason for the bug in examples/cpp/fill-grid-v1.cpp line 113

@Radonirinaunimi
Copy link
Member

Hi @janw20, thanks a lot for completing the Fortran interface!

Hi, I completed the Fortran interface. I made some (minor) design decisions that I would like your opinion on:

* Since `Kinematics` is an enum with tuple-like variants, it is converted to a C struct holding a tag variable and a C union. Fortran doesn't support unions, so I use a Fortran derived type (`pineappl_kinematics`) that also has a tag variable, but only one integer variable for the index instead of the C union. I found this to be the cleanest solution, or do you have a better idea?

* For some reason, Fortran cannot do arrays of procedure pointers, so I created a new derived type for the PDF and alpha_s functions, `pineappl_xfx` and `pineappl_alphas`, that each holds the procedure pointer as a member named `proc`. This is needed so that an array of PDF functions can be passed to `pineappl_grid_convolve`. For consistency, I also changed this in `pineappl_grid_convolve_with_one` and `pineappl_grid_convolve_with_two`. This goes against keeping backward compatibility with the 0.x of Pineappl versions, but until there is a solution to [Investigate if we can compile/install the Fortran module #257](https://github.com/NNPDF/pineappl/issues/257), I figured that the user would make their own copy of `pineappl.f90` anyway and use that, so backward compatibility is not that important in the Fortran case.

I had a look at it and it looks good. I don't think there is a cleanest a more robust to approach this apart from what is being done now (actually, this was how I also envisioned how it'd look like).

I also noticed two small bugs in the v1 C++ examples which I fixed in my first commit.

Thanks for spotting and fixing this bug! I had this under my radar before but somehow forgot to fix it (maybe I remembered to only do so on the Python API).

Related to that: Is there a reason why pineappl_grid_new creates the kinematics in the order (Q², x₁, x₂), and not (x₁, x₂, Q²), as was always the order in, e.g., pineappl_grid_fill? I think this might have been the reason for the bug in examples/cpp/fill-grid-v1.cpp line 113

I don't think there is a real reason why the kinematics have to be ordered in such a way. The restriction is coming from these lines:

for entry in channel.entry() {
// TODO: we assume `idx` to be ordered as scale, x1, x2
let fx_prod = cache.as_fx_prod(&entry.0, order.alphas, &idx);
lumi += fx_prod * entry.1;
}

which has a TODO, so likely this will change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants