Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong drive format on Unix #107365

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

MojtabaTajik
Copy link
Contributor

@MojtabaTajik MojtabaTajik commented Sep 4, 2024

Add reading filesystem type from /proc/self/mountinfo on Linux and use sys call as the fallback approach.

  • Prioritize reading the filesystem type from /proc/self/mountinfo on Linux.
  • Retain the existing P/Invoke fallback for non-Linux platforms and as a fallback on Linux.
  • Added try/catch block to handle potential but unlikely access issues or unrecognized data.
    Although /proc/self/mountinfo is process-specific and should generally be accessible,
    this ensures robustness in case of unexpected errors (e.g., temporary unavailability, corrupted data).
  • Updated documentation to reflect the reasoning and kernel behaviour regarding procfs on different platforms.

No breaking changes are expected. The refactor should be functionally equivalent but more reliable and efficient.

FIX #95099

Change GetFormatInfoForMountPoint method to "use /proc/self/mountinfo" file

- Replaced system calls (`statfs`/`statvfs`) with a simplified parsing of `/proc/self/mountinfo`.
- Removed native SystemNative_GetFormatInfoForMountPoint as we don't need anymore
- Parsing the mount info file allows for efficient retrieval of filesystem format and type.
- No functional changes; the method continues to return the correct filesystem format and drive type for the given mount point.

FIX dotnet#95099
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 4, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 4, 2024
@huoyaoyuan huoyaoyuan added area-System.IO and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Sep 5, 2024
@am11
Copy link
Member

am11 commented Sep 5, 2024

with a more simplified approach that parses /proc/self/mountinfo to retrieve filesystem information

  1. procfs can be unavailable on linux. It can also be misbehaving and we have made runtime implementation resilient for such cases in number of places using fallbacks.

  2. procfs is also optional in FreeBSD. It is not enabled by default.

  3. On SunOS et al. procfs is different; provides binary interface instead of text-based one.

  4. there is no procfs on macOS.

Given all that, we should not remove this common P/Invoke call, but keep it as primary source for non-linux and as a fallback on linux.

@MojtabaTajik
Copy link
Contributor Author

MojtabaTajik commented Sep 5, 2024

@am11 Should we go by the "mountinfo" first and go for that statfs syscall if that failed?
In that case, I think checking against "/proc/self/mountinfo" file exists is enough for a fallback mechanism, yes?

In the case of something like SunOS, where the symlink is represented as a binary format, how should we handle it? We may need to check the operating system or implement a mechanism for the parser method to fail gracefully, triggering a fallback syscall as a recovery option.

@am11
Copy link
Member

am11 commented Sep 5, 2024

@am11 Should we go by the "mountinfo" first and go for that statfs syscall if that failed? In that case, I think checking against "/proc/self/mountinfo" file exists is enough for a fallback mechanism, yes?

Problem with File.Exists is that we still may not be able to read the file due to access issues. The reliable way is to actually read the file and TryParse whatever comes back. If it doesn't exist or the data is unavailable or unrecognized, fallback. We could use one of the existing patterns: #62513 (simplest would be: try/catch { /* ignore */ }).

In the case of something like SunOS, where the symlink is represented as a binary format, how should we handle it? We may need to check the operating system or implement a mechanism for the parser method to fail gracefully, triggering a fallback syscall as a recovery option.

We can keep P/Invoking for non-linux:
e.g.

#if TARGET_LINUX
try
{
  // read from procfs
  // return;
}
catch { /* ignored */ }
#endif

// existing GetFormatInfoForMountPoint logic

@MojtabaTajik
Copy link
Contributor Author

MojtabaTajik commented Sep 5, 2024

Problem with File.Exists is that we still may not be able to read the file due to access issues.

Based on the Kernel documentation, There is one such file per process because not all processes see the same mount points.

Since /proc/self/mountinfo refers to the process that is reading the file, you should have no issues accessing it within your process. You can simply open and read it like any other file in the /proc directory.

Btw, I agree that using try-catch is a good approach to handle any unexpected behaviour, as there could be other unknown issues.

…Linux

Add support for reading filesystem type from /proc/self/mountinfo on Linux

- Prioritize reading the filesystem type from `/proc/self/mountinfo` on Linux.
- Retain the existing P/Invoke fallback for non-Linux platforms and as a fallback on Linux.
- Added `try/catch` block to handle potential but unlikely access issues or unrecognized data.
  Although `/proc/self/mountinfo` is process-specific and should generally be accessible,
  this ensures robustness in case of unexpected errors (e.g., temporary unavailability, corrupted data).
- Updated documentation to reflect the reasoning and kernel behavior regarding procfs on different platforms.
@am11
Copy link
Member

am11 commented Sep 5, 2024

Btw, I agree that using try-catch is a good approach to handle any unexpected behaviour, as there could be other unknown issues.

Yup, we have actually seen those issues in chroot type environments (qemu, unikernels like nanos (which is the implementation of Linux interface and not the stock kernel), scratch containers).

fields.MoveNext(); // Skip Parent ID
fields.MoveNext(); // Skip Major:Minor
fields.MoveNext(); // Skip Root
fields.MoveNext(); // Skip to MountPoint field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR should include additional changes so we use the MountPoint field as the Name/RootDirectory for DriveInfo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is: for the instances returned by DriveInfo.GetDrives.

format = line[fields.Current].ToString();

fields.MoveNext();
type = GetDriveType(line[fields.Current].ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be passing format to GetDriveType instead of reading the next field?

man proc has this example line:

36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue


return 0;
}
catch { /* ignored */ }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the format is well documented, we shouldn't need a catch all.

We can place this under if (File.Exists("/proc/self/mountinfo")) to handle the case where proc is unavailable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this was discussed in #107365 (comment).

Personally, I think the File.Exists should be fine, but I defer to the maintainers' preferences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why all/most procfs usage in libs are using try-catch pattern is because anything could go wrong (#62513). Also see #74316 (comment) and #52753 to read on unreliability issues attached with FileSystemInfo.Exists out in the wild.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some reasons why we don't expect these to occur for /proc/self/mountinfo

Anyway, my main concern here is that a generic catch all masks failures in the parsing.

Perhaps we can filter for IOException/UnauthorizedAccessException. These are the exceptions we might expect when not being able to read the procfs file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, this approach is similar to what has been used in networking info https://github.com/dotnet/runtime/pull/63696/files#diff-deb3e580b54bccfac2148f60c5d90c6e88b55b508948052e9cf2d1e5421f6ff5R21. Maybe adding a Debug.Fail in catch body for unexpected parsing issues during the development/testing is enough so implementation bugs don't go undetected?

/// <param name="name">The mount point name to query.</param>
/// <param name="format">Output parameter for the filesystem format.</param>
/// <param name="type">Output parameter for the drive type.</param>
/// <returns>0 if successful, otherwise an error code.</returns>
private static unsafe int GetFormatInfoForMountPoint(string name, out string format, out DriveType type)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have some parsing tests for this, similar to

public void ParseValidStatFiles_Success(
.

MemoryExtensions.SpanSplitEnumerator<char> fields = line.Split(' ');

// Skip fields we don't care about (Fields 1-4)
fields.MoveNext(); // Skip Mount ID
Copy link
Member

@tmds tmds Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I find the Skip in this comment misleading. The call to MoveNext puts us at the Mount ID, not past it, right?

Maybe you can drop the Skip.

fields.MoveNext(); // Mount ID
...


fields.MoveNext();
type = GetDriveType(line[fields.Current].ToString());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a matching closing } for the while loop. If CI doesn't complain, then TARGET_LINUX may not be set.

type = GetDriveType(line[fields.Current].ToString());

return 0;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we went over the lines of mountinfo and didn't find a match, there's no need to do the PInvoke. We can just return immediately.

Comment on lines +78 to +110
#if TARGET_LINUX
try
{
using StreamReader reader = new("/proc/self/mountinfo");

string rawLine;
while ((rawLine = reader.ReadLine()) is not null)
{
ReadOnlySpan<char> line = rawLine.AsSpan();
MemoryExtensions.SpanSplitEnumerator<char> fields = line.Split(' ');

// Skip fields we don't care about (Fields 1-4)
fields.MoveNext(); // Skip Mount ID
fields.MoveNext(); // Skip Parent ID
fields.MoveNext(); // Skip Major:Minor
fields.MoveNext(); // Skip Root
fields.MoveNext(); // Skip to MountPoint field

if (!MemoryExtensions.Equals(line[fields.Current], name, StringComparison.Ordinal)) continue;

// Skip to the separator which is end of optional fields (Field 8)
while (fields.MoveNext() && !MemoryExtensions.Equals(line[fields.Current], "-", StringComparison.Ordinal));

fields.MoveNext();
format = line[fields.Current].ToString();

fields.MoveNext();
type = GetDriveType(line[fields.Current].ToString());

return 0;
}
catch { /* ignored */ }
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if TARGET_LINUX
try
{
using StreamReader reader = new("/proc/self/mountinfo");
string rawLine;
while ((rawLine = reader.ReadLine()) is not null)
{
ReadOnlySpan<char> line = rawLine.AsSpan();
MemoryExtensions.SpanSplitEnumerator<char> fields = line.Split(' ');
// Skip fields we don't care about (Fields 1-4)
fields.MoveNext(); // Skip Mount ID
fields.MoveNext(); // Skip Parent ID
fields.MoveNext(); // Skip Major:Minor
fields.MoveNext(); // Skip Root
fields.MoveNext(); // Skip to MountPoint field
if (!MemoryExtensions.Equals(line[fields.Current], name, StringComparison.Ordinal)) continue;
// Skip to the separator which is end of optional fields (Field 8)
while (fields.MoveNext() && !MemoryExtensions.Equals(line[fields.Current], "-", StringComparison.Ordinal));
fields.MoveNext();
format = line[fields.Current].ToString();
fields.MoveNext();
type = GetDriveType(line[fields.Current].ToString());
return 0;
}
catch { /* ignored */ }
#endif
if (OperatingSystem.IsLinux())
{
try
{
using StreamReader reader = new("/proc/self/mountinfo");
string? rawLine;
while ((rawLine = reader.ReadLine()) is not null)
{
ReadOnlySpan<char> line = rawLine.AsSpan();
MemoryExtensions.SpanSplitEnumerator<char> fields = line.Split(' ');
// Skip fields we don't care about (Fields 1-4)
fields.MoveNext(); // Skip Mount ID
fields.MoveNext(); // Skip Parent ID
fields.MoveNext(); // Skip Major:Minor
fields.MoveNext(); // Skip Root
fields.MoveNext(); // Skip to MountPoint field
if (!MemoryExtensions.Equals(line[fields.Current], name, StringComparison.Ordinal)) continue;
// Skip to the separator which is end of optional fields (Field 8)
while (fields.MoveNext() && !MemoryExtensions.Equals(line[fields.Current], "-", StringComparison.Ordinal));
fields.MoveNext();
format = line[fields.Current].ToString();
fields.MoveNext();
type = GetDriveType(line[fields.Current].ToString());
return 0;
}
}
catch { /* ignored */ }
}

I just noticed that this wasn't building on Linux because only CoreLib has TARGET_LINUX type of constants defined. Also, there is a syntax error try and while blocks have only one closing }.

OperatingSystem.IsLinux() block will be optimized out on non-Linux by the JIT (since it returns a constant bool).

Copy link
Member

@tmds tmds Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had noticed the syntax error and based on that assumed TARGET_LINUX must not be set (#107365 (comment)).

if (OperatingSystem.IsLinux())

Yes, this is the way!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.IO community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DriveInfo.DriveFormat property reports ext3 format on ext4 formatted filesystems (Unix)
4 participants