Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add doxygen-check pre-commit hook #11076

Merged
merged 13 commits into from
Jun 20, 2022

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Jun 8, 2022

  • Adds doxygen-check pre-commit hook (verbose)
    • doxygen-check passes with warning if doxygen is not present
    • doxygen-check passes with warning if supported doxygen version 1.8.20 to 1.9.1 is not present
    • doxygen-check fails if any warning or error is thrown by doxygen
    • since pre-commit runs as part of style check, it's removed from style.sh
  • updated Doxyfile configuration to work with versions 1.8.20 to 1.9.1
  • added doxygen=1.8.20 to conda dev environment
  • mentioned doxygen.sh in CONTRIBUTING.md guide
    Related to [FEA] Add Doxygen CI check  #9373 (Add Doxygen CI check)

@karthikeyann karthikeyann added feature request New feature or request 3 - Ready for Review Ready for review by team gpuCI libcudf Affects libcudf (C++/CUDA) code. 4 - Needs Review Waiting for reviewer to review or respond non-breaking Non-breaking change labels Jun 8, 2022
@karthikeyann karthikeyann requested a review from a team as a code owner June 8, 2022 06:55
@karthikeyann karthikeyann self-assigned this Jun 8, 2022
@github-actions github-actions bot removed the libcudf Affects libcudf (C++/CUDA) code. label Jun 8, 2022
@github-actions github-actions bot added the conda label Jun 8, 2022
@karthikeyann karthikeyann requested a review from bdice June 8, 2022 18:58
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome @karthikeyann. This is going to work great with a few alterations. Comments below.

@@ -0,0 +1,18 @@
#!/bin/bash
Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my knowledge, there is currently no way to let pre-commit manage the doxygen executable it uses, which means we have to use language: system. (This is solved for clang-format by this project: https://github.com/ssciwr/clang-format-wheel)

I don't think we have any hard dependencies on packages outside of what pre-commit manages right now. I would adapt this script to fail cleanly (exit 0) if doxygen is not installed. It will still fail in CI. I believe that is how the cmake-format script works already (with some additional details that you can see by reading it).

@@ -55,13 +55,12 @@ HEADER_META_RETVAL=$?
echo -e "$HEADER_META"

# Run doxygen and check for missing documentation
DOXYGEN=`cd cpp/doxygen && { cat Doxyfile ; echo QUIET = YES; echo GENERATE_HTML = NO; } | doxygen - 2>&1 | tail -n +3`
DOXYGEN=`ci/checks/doxygen.sh`
Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section can be removed. With this PR, the invocation of pre-commit above will call doxygen for us.

The clang-format check is also redundant with pre-commit now, but that hasn't been removed yet either. We should remove that in a separate PR -- we'll need to survey the team to find if anyone relies on the Python script to manually call clang-format. It is no longer needed by our CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit checks could be skipped pretty easily. so, we will need a CI check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that our CI style check already calls pre-commit explicitly. The CI does not manually invoke black, isort, flake8, etc. It just runs this pre-commit configuration.

cudf/ci/checks/style.sh

Lines 23 to 24 in a00cca6

pre-commit run --hook-stage manual --all-files
PRE_COMMIT_RETVAL=$?

If we add this as a pre-commit hook, we don't need to call it once via pre-commit and again manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit will not run doxygen if there are no file changes in cpp/include in that PR. It may not run always.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI command above is invoked with --all-files for this reason. It will always run, as we intend.

Try pre-commit run --hook-stage manual --all-files locally and you can see for yourself.

Copy link
Contributor Author

@karthikeyann karthikeyann Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed doxygen check in style.sh, updated pre-commit to run in manual mode too (it did not run earlier since stages 'commit' was specified).

@@ -43,6 +43,7 @@ dependencies:
- black=22.3.0
- isort=5.6.4
- mypy=0.782
- doxygen=1.8.20
Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest on conda-forge is 1.9.2. Is that supported? If so, I would update this pinning. If not, we should try to make our code compatible with the latest release. https://anaconda.org/conda-forge/doxygen

Suggested change
- doxygen=1.8.20
- doxygen=1.9.2

Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I installed 1.9.2 and got these warnings.

warning: Tag 'OUTPUT_TEXT_DIRECTION' at line 102 of file '-' has become obsolete.             
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'COLS_IN_ALPHA_INDEX' at line 1089 of file '-' has become obsolete.              
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'LATEX_SOURCE_CODE' at line 1836 of file '-' has become obsolete.                
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'RTF_SOURCE_CODE' at line 1926 of file '-' has become obsolete.                  
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'DOCBOOK_PROGRAMLISTING' at line 2031 of file '-' has become obsolete.           
         To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"

Since this is a language: system package that is not managed by pre-commit, we will need to make sure it works for slightly-older and current versions. At the time of writing:

I'd say that supporting 1.8.13 and newer is an ideal goal. We may need to add filters for warnings that arise in 1.9.x, or simply call doxygen -u to update the configuration if it is still correct for 1.8.x.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using doxygen 1.8.20 because that's the version use to generate https://docs.rapids.ai/api/libcudf/nightly/

Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit is not managing this dependency and uses a system (or conda) installation of doxygen, whatever is on the user's path. Users of Ubuntu 22.04 who do not use a conda environment to build would likely be fetching the system-installed apt package for doxygen 1.9.1. Those users would get warnings and the pre-commit check would fail. Regardless of whether users are in a conda environment, they can make use of all other pre-commit hooks because they are self-managed by pre-commit itself. Because this hook cannot be self-managed by pre-commit, we need to take care of two small things:

  • Users without doxygen installed must be able to make pre-commit pass locally (that's my suggestion above for exit 0 if doxygen isn't found). In this case, CI (which has doxygen installed) will catch any issues.
  • Users with newer versions of doxygen must be able to make pre-commit pass locally (because we cannot pin a dependency that is not managed by pre-commit). I think we can easily adapt our configuration to be compatible with both 1.8.x and 1.9.x, and that should solve 95% of the potential problems here.

If you'd like to see a concrete solution, I can push a small commit to help resolve both issues.

Copy link
Contributor

@bdice bdice Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to add -- if we solve the two problems above (no doxygen / new doxygen), it's totally fine to pin this the same way as our official documentation builds (1.8.20).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the requirement about system python. That's the reason for using wheel version of clang-format too!

Copy link
Contributor

@bdice bdice Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! First, we prefer pre-commit hooks that are Python packages (or pip-installable binaries like clang-format, which doesn't have any Python code but is still pip-installable). Second, we prefer language: system as a fallback option that may require additional work. From the list of Supported Languages, we really only want python or system (or possibly script?). All the others require the system to have other tools installed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from unsupported tag warnings, there are many bogus warning in older versions <1.8.20.
Adding regex to ignore all those warnings is a hacky approach, and not recommended. I will enforce the doxygen version check in ci/check/doxygen.sh.

Regarding updating the Doxyfile configuration for different version, unsupported tags alone can be removed to not throw unsupported tag warnings on version from 1.8.20 to 1.9.1.
For doxygen versions >1.9.1 there are bogus warnings again. So, we cannot support them yet. (for example, default constructor without argument is throwing warning that their parameters are not all documented)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will enforce the doxygen version check in ci/check/doxygen.sh.

That sounds good to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR updated. Please re-review

.pre-commit-config.yaml Show resolved Hide resolved
@karthikeyann karthikeyann requested a review from bdice June 8, 2022 20:33
@karthikeyann karthikeyann requested a review from a team as a code owner June 14, 2022 20:25
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jun 14, 2022
@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.08@a00cca6). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.08   #11076   +/-   ##
===============================================
  Coverage                ?   86.34%           
===============================================
  Files                   ?      144           
  Lines                   ?    22741           
  Branches                ?        0           
===============================================
  Hits                    ?    19635           
  Misses                  ?     3106           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a00cca6...e0a44c7. Read the comment docs.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to exit gracefully if doxygen versions don't match (or doxygen isn't found).

if [ $(version "$DOXYGEN_VERSION") -lt $(version "1.8.20") ] || [ $(version $DOXYGEN_VERSION) -gt $(version "1.9.1") ]; then
echo -e "Unsupported doxygen version $DOXYGEN_VERSION"
echo -e "Expecting doxygen version from 1.8.20 to 1.9.1"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if doxygen is not installed?
What happens if an incompatible version is installed?

I would prefer for those cases to pass (in other words, silently skip this check if it cannot be completed correctly with the user's installed software). Any issues will be caught by CI. Otherwise users have to manually override pre-commit's failure in order to commit.

This is the same strategy we adopt for cmake-format: https://github.com/rapidsai/cudf/blob/branch-22.08/cpp/scripts/run-cmake-format.sh

We need a way to invoke CMake linting commands without causing pre-commit failures (which could block local commits or CI)

... exits gracefully if the file is not found.

echo "The rapids-cmake cmake-format configuration file was not found at any of the default search locations: "
echo ""
( IFS=$'\n'; echo "${DEFAULT_FORMAT_FILE_LOCATIONS[*]}" )
echo ""
echo "Try setting the environment variable RAPIDS_CMAKE_FORMAT_FILE to the path to the config file."
exit 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if doxygen is not installed?
Fails
What happens if an incompatible version is installed?
Fails

If doxygen-check passes when doxygen is not installed or with unsupported version, I am concerned that users might be confused that local pre-commit doxygen-check passes, but pre-commit in CI failed.
Since pre-commit swallows STDOUT and STDERR when exit code is zero, users may not know the reason for local doxygen-check pass. IMO, it's showing different behavior in CI vs local machine silently, which ideally should not happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already lean towards letting things pass locally and fail in CI rather than putting requirements on users to install all system dependencies just to run basic code formatters on their local system. Python users in particular shouldn’t be required to install doxygen if they only make a small C++ tweak. I’d like to hear what @vyasr or @PointKernel think since they’ve also worked with pre-commit hooks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it is not possible to satisfy this version requirement (1.8.20 <= version <= 1.9.1) using apt on Ubuntu 18.04 or 20.04. Only Ubuntu 22.04 has a compatible version (doxygen 1.9.1) and that could be upgraded to an incompatible version in the future at any time. The lack of graceful exit means that users effectively must use conda to satisfy the doxygen requirement (or build doxygen from source), which we specifically sought to avoid in the past. Users should be able to make pre-commit pass with software from only a system package manager, with no conda required. Otherwise users will simply not utilize pre-commit and benefit from the majority of features that do work on all systems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already lean towards letting things pass locally and fail in CI rather than putting requirements on users to install all system dependencies

Can you please provide more details on how this decision was made? I personally find it's impractical that sometimes my local pre-commit checks passed but then the CI failed. e.g. some of the cmake format failures, if not found by the local pre-commit hook, were obvious but others took me a while to "guess" where went wrong. If users choose to contribute to the code, I think it's fair to require them to install all the proper dependencies.

Python users in particular shouldn’t be required to install doxygen if they only make a small C++ tweak.

If they commit the "tweak", enforcing doxygen check is a reasonable requirement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the errors come at once. No early exit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d support exploring the build-from-source option within pre-commit.

Do you mean, build-doxygen-from-source?
It brings doxygen's compilation problems into pre-commit installation. I would support doxygen binaries through pip-installable wheels. https://sourceforge.net/projects/doxygen/files/ is actively maintained. so it's possible to implement similar to clang-format wheels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused here.

If we're going to require a specific version of a thing, we should automate getting it. We did this with clang-format.

By "we did it with clang-format", do you mean in the form of our conda environment? If so, that's still the case. Someone creating a custom environment for building cudf (or using compose) will get the correct version. If someone tries to use their own environment, though, they're on their own. That is also identical with how clang-format worked before.

I wonder if pre-commit hooks can execute code on installation (or build on first run?). That would be equally as good as pip-installable wheels.

I don't know how much control you have over the installation step, but you can execute basically arbitrary code on first run to do the installation if it doesn't already exist (in some prespecified directory). I wouldn't say that it's equally as good, though. Compiling code within a pre-commit hook immediately opens you up to any sort of build failure, which is also essentially out of your control to debug.

However, there is already a model for making wheels for non-Python packages in the clang-format wheels so I might lean towards trying that way first.

This is the better approach if we feel strongly about doxygen needing to always be available in this fashion. If we're already going to do the work to build it in a pre-commit hook, we're most of the way there to building the wheels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"We did this with clang-format"

clang-format binary is packaged as pip installable repo. ssciwr/clang-format-wheel). So it avoided using creating conda environment.
My first idea is to use conda for automate installing specific version of doxygen. but it creates additional dependency to have conda installed in the system, which we don't like to put these additional conditions (conda) on the developer.

@jrhemstad is suggesting to create a pip installable doxygen repo which can install doxygen binary, similar to clang-format.
@jrhemstad are you suggesting to build doxygen-from-source in pre-commit installation OR in pip installable doxygen repo releases?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d echo all of what @vyasr said here: #11076 (comment)

Building from source on the first run of the pre-commit hook is not ideal — I only proposed exploring it because it was a different approach than previous ideas. Packaging doxygen binary wheels is the way I’d like to go if we decide to move forward here.

@karthikeyann karthikeyann requested a review from bdice June 16, 2022 11:57
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once we have consensus on the exit code for local runs of pre-commit when doxygen is not installed / wrong versions. edit: resolved by verbose: true and exit 0. I really appreciate your hard work on this, it’s tricky to consider all the cases but it will be very nice for maintaining the quality of the docs.

@PointKernel
Copy link
Member

Do we need to add a section in contributing guide mentioning how Doxygen check works in cudf?

@karthikeyann
Copy link
Contributor Author

Do we need to add a section in contributing guide mentioning how Doxygen check works in cudf?

Added a line about ci/checks/doxygen.sh as linter.

Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! Thanks @karthikeyann !

@jrhemstad
Copy link
Contributor

Can we update the PR description with a little more detail on what this will actually do?

@davidwendt
Copy link
Contributor

Is the required/recommended doxygen version documented anywhere?

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with one minor style suggestion.

function version { echo "$@" | awk -F. '{ printf("%d%03d%03d%03d\n", $1,$2,$3,$4); }'; }

# doxygen supported version 1.8.20 to 1.9.1
DOXYGEN_VERSION=`doxygen --version`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DOXYGEN_VERSION=`doxygen --version`
DOXYGEN_VERSION=$(doxygen --version)

For consistency with the rest of the script, we should switch this to use $() instead of ``.

@vyasr
Copy link
Contributor

vyasr commented Jun 17, 2022

Is the required/recommended doxygen version documented anywhere?

For that matter, is the required clang-format version documented anywhere (I do not consider environment or pre-commit files to be documentation)? We probably want both.

Also as discussed above, we should change the cmakelang hooks to also use verbose: true. That's not really in scope for this PR, so I'd be happy to see that done after this one is done, just don't want to see it fall off the radar.

@karthikeyann
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit ff1f9d3 into rapidsai:branch-22.08 Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants