Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup suggested by Linux kernel's coccicheck #14372

Closed
wants to merge 9 commits into from
Closed

Conversation

ryao
Copy link
Contributor

@ryao ryao commented Jan 10, 2023

Motivation and Context

Upon reading an article discussing the PVS Studio's developers most recent analysis of the Linux kernel, I noticed that it spoke well of Linux's "Coccinelle analyzer":

https://pvs-studio.com/en/blog/posts/cpp/0858/

I had not considered Coccinelle to be a static analyzer since its primarily purpose is semantic patching, but it appears that the Linux kernel has adopted it as a QA tool for detecting patterns in the kernel source code that probably merit cleanup:

https://www.kernel.org/doc/html/latest/dev-tools/coccinelle.html

Inspired by that, I tried running it on the OpenZFS kernel modules. Its devm_free.cocci, pci_free_consistent.cocci and of_table.cocci checks caused the spatch tool to crash. However, the rest of the checks worked. It only succeeded in identifying various cleanup opportunities. It tried to find some actual bugs, since it claimed to have caught 3 NULL pointer dereferences, but it was unsuccessful since the reports were false positives. In specific, it reported three NULL pointer dereferences on this:

                        IMPLY(lwb != NULL, 
                            lwb->lwb_state == LWB_STATE_ISSUED ||
                            lwb->lwb_state == LWB_STATE_WRITE_DONE ||
                            lwb->lwb_state == LWB_STATE_FLUSH_DONE);

However, it reported none on this:

IMPLY(lwb != NULL, lwb->lwb_state != LWB_STATE_CLOSED);

Interestingly, it scanned the FreeBSD code in the tree, and reported a few cleanup possibilities in it.

I have addressed most of the reports that it made. The ones that I did not address were:

  • 9 suggestions to use swap() - that is not cross platform compatible
  • 7 suggestions to use ARRAY_SIZE() - this appears to be cross platform compatible, but I wanted to double check the safety of our ARRAY_SIZE macro in the contexts it identified and was busy with other things.
  • 4 suggestions that a NULL check before some freeing functions is not needed. - I am not sure if this is cross platform compatible
  • 1 suggestion to use kvfree() in the SPL - this is not compatible with Linux 3.10; we might as well continue doing the equivalent of kvfree() to avoid unnecessary code bloat.
  • 5 nonsensical suggestions from ./scripts/coccinelle/api/stream_open.cocci that are likely a bug in either the .cocci file or in spatch.
  • 1 suggestion to use MIN() - 2 of these suggestions were taken elsewhere, but not in the zstd code since the headers currently used do not provide MIN(). More details are in the relevant commit message.
  • 1 suggestion to use kmem_cache_zalloc() instead of kmem_cache_alloc()+memset() - kmem_cache_zalloc() is non-portable and while we could use a constructor, we might as well just continue using memset() to zero memory since that avoids a pointer indirection.
  • 1 suggestion to convert ->i_count in struct inode from atomic_t to refcount_t - this is up to mainline Linux. I would not be surprised if they change this at some point.
  • The 3 false positives where it claimed IMPLY had 3 NULL pointer dereferences.

Description

The changes are documented in the patch commit messages.

How Has This Been Tested?

A local build test has been done.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

The Linux 5.16.14 kernel's coccicheck caught these. The semantic patch
that caught them was:

./scripts/coccinelle/api/alloc/alloc_cast.cocci

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic patch
that caught it was:

./scripts/coccinelle/api/alloc/zalloc-simple.cocci

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

The Linux kernel's documentation makes a good case for why we should not
use these:

https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Richard Yao <[email protected]>
fdc2d30 accidentally broke the
indentation.

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

However, unlike the cases where the GNU zero length array extension had
been used, coccicheck would not suggest patches for the older style
single member arrays. That was good because blindly changing them would
break size calculations in most cases.

Therefore, this required care to make sure that we did not break size
calculations. In the case of `indirect_split_t`, we use
`offsetof(indirect_split_t, is_child[is->is_children])` to calculate
size. This might be subtly wrong according to an old mailing list
thread:

https://inbox.sourceware.org/gcc-prs/[email protected]/T/

That is because the C99 specification should consider the flexible array
members to start at the end of a structure, but compilers prefer to put
padding at the end. A suggestion was made to allow compilers to allocate
padding after the VLA like compilers already did:

http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n983.htm

However, upon thinking about it, whether or not we allocate end of
structure padding does not matter, so using offsetof() to calculate the
size of the structure is fine, so long as we do not mix it with sizeof()
on structures with no array members.

In the case that we mix them and padding causes offsetof(struct_t,
vla_member[0]) to differ from sizeof(struct_t), we would be doing unsafe
operations if we underallocate via `offsetof()` and then overcopy via
sizeof().

Signed-off-by: Richard Yao <[email protected]>
In zfs_zaccess_dataset_check(), we have the following subexpression:

(!IS_DEVVP(ZTOV(zp)) ||
    (IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS)))

When !IS_DEVVP(ZTOV(zp)) is false, IS_DEVVP(ZTOV(zp)) is true under the
law of the excluded middle since we are not doing pseudoboolean alegbra.
Therefore doing:

(IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS))

Is unnecessary and we can just do:

(v4_mode & WRITE_MASK_ATTRS)

The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/excluded_middle.cocci

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/minmax.cocci

There was a third opportunity to use `MIN()`, but that was in
`FSE_minTableLog()` in `module/zstd/lib/compress/fse_compress.c`.
Upstream zstd has yet to make this change and I did not want to change
header includes just for MIN, or do a one off, so I left it alone.

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/semicolon.cocci

Signed-off-by: Richard Yao <[email protected]>
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/null/badzero.cocci

Signed-off-by: Richard Yao <[email protected]>
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup all looks reasonable to me.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Jan 11, 2023
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic patch
that caught it was:

./scripts/coccinelle/api/alloc/zalloc-simple.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

The Linux kernel's documentation makes a good case for why we should not
use these:

https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
fdc2d30 accidentally broke the
indentation.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

However, unlike the cases where the GNU zero length array extension had
been used, coccicheck would not suggest patches for the older style
single member arrays. That was good because blindly changing them would
break size calculations in most cases.

Therefore, this required care to make sure that we did not break size
calculations. In the case of `indirect_split_t`, we use
`offsetof(indirect_split_t, is_child[is->is_children])` to calculate
size. This might be subtly wrong according to an old mailing list
thread:

https://inbox.sourceware.org/gcc-prs/[email protected]/T/

That is because the C99 specification should consider the flexible array
members to start at the end of a structure, but compilers prefer to put
padding at the end. A suggestion was made to allow compilers to allocate
padding after the VLA like compilers already did:

http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n983.htm

However, upon thinking about it, whether or not we allocate end of
structure padding does not matter, so using offsetof() to calculate the
size of the structure is fine, so long as we do not mix it with sizeof()
on structures with no array members.

In the case that we mix them and padding causes offsetof(struct_t,
vla_member[0]) to differ from sizeof(struct_t), we would be doing unsafe
operations if we underallocate via `offsetof()` and then overcopy via
sizeof().

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
In zfs_zaccess_dataset_check(), we have the following subexpression:

(!IS_DEVVP(ZTOV(zp)) ||
    (IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS)))

When !IS_DEVVP(ZTOV(zp)) is false, IS_DEVVP(ZTOV(zp)) is true under the
law of the excluded middle since we are not doing pseudoboolean alegbra.
Therefore doing:

(IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS))

Is unnecessary and we can just do:

(v4_mode & WRITE_MASK_ATTRS)

The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/excluded_middle.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/minmax.cocci

There was a third opportunity to use `MIN()`, but that was in
`FSE_minTableLog()` in `module/zstd/lib/compress/fse_compress.c`.
Upstream zstd has yet to make this change and I did not want to change
header includes just for MIN, or do a one off, so I left it alone.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/semicolon.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
behlendorf pushed a commit that referenced this pull request Jan 13, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/null/badzero.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught these. The semantic patch
that caught them was:

./scripts/coccinelle/api/alloc/alloc_cast.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic patch
that caught it was:

./scripts/coccinelle/api/alloc/zalloc-simple.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

The Linux kernel's documentation makes a good case for why we should not
use these:

https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
fdc2d30 accidentally broke the
indentation.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

However, unlike the cases where the GNU zero length array extension had
been used, coccicheck would not suggest patches for the older style
single member arrays. That was good because blindly changing them would
break size calculations in most cases.

Therefore, this required care to make sure that we did not break size
calculations. In the case of `indirect_split_t`, we use
`offsetof(indirect_split_t, is_child[is->is_children])` to calculate
size. This might be subtly wrong according to an old mailing list
thread:

https://inbox.sourceware.org/gcc-prs/[email protected]/T/

That is because the C99 specification should consider the flexible array
members to start at the end of a structure, but compilers prefer to put
padding at the end. A suggestion was made to allow compilers to allocate
padding after the VLA like compilers already did:

http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n983.htm

However, upon thinking about it, whether or not we allocate end of
structure padding does not matter, so using offsetof() to calculate the
size of the structure is fine, so long as we do not mix it with sizeof()
on structures with no array members.

In the case that we mix them and padding causes offsetof(struct_t,
vla_member[0]) to differ from sizeof(struct_t), we would be doing unsafe
operations if we underallocate via `offsetof()` and then overcopy via
sizeof().

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
In zfs_zaccess_dataset_check(), we have the following subexpression:

(!IS_DEVVP(ZTOV(zp)) ||
    (IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS)))

When !IS_DEVVP(ZTOV(zp)) is false, IS_DEVVP(ZTOV(zp)) is true under the
law of the excluded middle since we are not doing pseudoboolean alegbra.
Therefore doing:

(IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS))

Is unnecessary and we can just do:

(v4_mode & WRITE_MASK_ATTRS)

The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/excluded_middle.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/minmax.cocci

There was a third opportunity to use `MIN()`, but that was in
`FSE_minTableLog()` in `module/zstd/lib/compress/fse_compress.c`.
Upstream zstd has yet to make this change and I did not want to change
header includes just for MIN, or do a one off, so I left it alone.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/semicolon.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Mar 3, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/null/badzero.cocci

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
datacore-rm pushed a commit to DataCoreSoftware/openzfs that referenced this pull request Aug 14, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/minmax.cocci

There was a third opportunity to use `MIN()`, but that was in
`FSE_minTableLog()` in `module/zstd/lib/compress/fse_compress.c`.
Upstream zstd has yet to make this change and I did not want to change
header includes just for MIN, or do a one off, so I left it alone.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#14372
tonyhutter pushed a commit that referenced this pull request Sep 27, 2023
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:

./scripts/coccinelle/misc/flexible_array.cocci

However, unlike the cases where the GNU zero length array extension had
been used, coccicheck would not suggest patches for the older style
single member arrays. That was good because blindly changing them would
break size calculations in most cases.

Therefore, this required care to make sure that we did not break size
calculations. In the case of `indirect_split_t`, we use
`offsetof(indirect_split_t, is_child[is->is_children])` to calculate
size. This might be subtly wrong according to an old mailing list
thread:

https://inbox.sourceware.org/gcc-prs/[email protected]/T/

That is because the C99 specification should consider the flexible array
members to start at the end of a structure, but compilers prefer to put
padding at the end. A suggestion was made to allow compilers to allocate
padding after the VLA like compilers already did:

http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n983.htm

However, upon thinking about it, whether or not we allocate end of
structure padding does not matter, so using offsetof() to calculate the
size of the structure is fine, so long as we do not mix it with sizeof()
on structures with no array members.

In the case that we mix them and padding causes offsetof(struct_t,
vla_member[0]) to differ from sizeof(struct_t), we would be doing unsafe
operations if we underallocate via `offsetof()` and then overcopy via
sizeof().

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14372
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants