Implement Tar Global Extended Attributes API changes #70869

Merged
merged 8 commits into from
Jun 22, 2022

carlossanlop
Member

@carlossanlop carlossanlop commented Jun 17, 2022

Addresses the rest of the APIs approved in: #69935

The other half of the APIs got merged here: #70325

@dotnet-issue-labeler

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost

ghost commented Jun 17, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Addresses the rest of the APIs approved in: #69935

Depends on the conversion constructors PR: #70325

While I merge that other PR, this one can be reviewed by looking at the 3 individual commits (look at that pretty separator commit 🙂):

Author: carlossanlop
Assignees: carlossanlop
Labels:

area-System.IO

Milestone: 7.0.0

@carlossanlop carlossanlop force-pushed the MultipleGEA branch 2 times, most recently from 7fcc8bb to 27c1734 Compare June 20, 2022 06:23
@carlossanlop
Member Author

Force pushing a rebase merge to fix merge conflicts with main.

@carlossanlop
Member Author

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

{
Debug.Assert(globalExtendedAttributesEntryNumber >= 1);

Member

@danmoseley danmoseley Jun 21, 2022

As an aside: the code below reads TMPDIR and then falls back to /tmp. Why doesn't it just call Path.GetTempPath(), which does the same thing?

Member Author

Good suggestion. I'll use Path.GetTempPath.

Here's why I wasn't using that:

The GNU tar manual specifies how the global extended attribute header name field obtains its value:

8.3.7.1 Controlling Extended Header Keywords

$TMPDIR/GlobalHead.%p.%n

[...] $TMPDIR stands for the value of the TMPDIR environment variable. If TMPDIR is not set, tar uses /tmp.

This description looks very similar to the remarks section of Path.GetTempPath:

Remarks

This method checks for the existence of environment variables in the following order and uses the first path found:

Linux

  1. The path specified by the TMPDIR environment variable. If the path is not specified in the TMPDIR environment variable, the default path /tmp/ is used.

Windows

  1. The path specified by the TMP environment variable.
  2. The path specified by the TEMP environment variable.
  3. The path specified by the USERPROFILE environment variable.
  4. The Windows directory.

So a couple of things need to be considered if we are to use Path.GetTempPath:

  • I don't see why any tool would read the GEA path, or make its reading/writing behavior depend on it. The field isn't very interesting, unless the user cares about the process ID or the sequence number.
  • Keep in mind that none of the tar specs (at least the ones I read) specify what the expected behavior on Windows should be. Everything is Unix-focused. If we format that string with embedded Windows paths, I hope other tools don't break when they find unexpected drive names or \ separators in that field. But going back to the previous point: I don't think any tool should make its behavior depend on this field.
  • If the Windows path ends up being longer than 100 bytes, the sequence number and process ID would get truncated. Which is fine, I guess, because a couple of lines below, I regenerate the string by forcing the usage of /tmp.
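
To make the points above concrete, here is a minimal sketch of how the name could be built from Path.GetTempPath with the /tmp fallback. It is not the code in this PR: the class and method names (GlobalHeadNameSketch, Format, ForCurrentProcess) are hypothetical, and the UTF-8 byte count and the 100-byte limit for the name field are assumptions taken from the discussion, not from the implementation.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not the System.Formats.Tar implementation.
static class GlobalHeadNameSketch
{
    // Assumed size of the header name field, in bytes.
    private const int NameFieldSize = 100;

    // Formats "<tempDir>/GlobalHead.<pid>.<n>", following the GNU tar template
    // $TMPDIR/GlobalHead.%p.%n. Falls back to /tmp when the result would not fit.
    public static string Format(string tempDir, int processId, int entryNumber)
    {
        if (entryNumber < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(entryNumber));
        }

        // Path.GetTempPath() usually ends with a separator, so trim it first.
        string dir = tempDir.TrimEnd('/', '\\');
        string name = $"{dir}/GlobalHead.{processId}.{entryNumber}";

        // A long temp path (more likely on Windows) would push the process ID and
        // sequence number past the field, so regenerate the name using /tmp.
        if (Encoding.UTF8.GetByteCount(name) > NameFieldSize)
        {
            name = $"/tmp/GlobalHead.{processId}.{entryNumber}";
        }

        return name;
    }

    // Convenience overload using the current process and the platform temp path.
    public static string ForCurrentProcess(int entryNumber) =>
        Format(Path.GetTempPath(), Environment.ProcessId, entryNumber);
}
```

Note that on Windows this sketch would still embed a drive name and \ separators whenever the path fits in the field, which is exactly the concern raised in the second bullet.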

Member

@adamsitnik adamsitnik left a comment

Overall LGTM. The only thing I would like to know before hitting the accept button is why the code is using the TMPDIR env var rather than calling the Path API (issue initially raised by Dan).

Comment on lines +228 to +232
"00000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999999/");

TarEntry file = reader.GetNextEntry();
VerifyRegularFileEntry(file, format,
$"00000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999999/00000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999.txt",
Member

nit: it's hard to verify the long string lengths, perhaps you could use the string ctor that accepts char count and just repeat a single char? Example:

- "00000000001111111111222222222233333333334444444444555555555566666666667777777777888888888899999999/"
+ new string('0', 98) + "/"

Member Author

There are two reasons for these long strings:

  1. The tar assets from dotnet/runtime-assets already use these strings for long directory names and filenames, so changing these would mean changing the assets first, then bumping the package version number in this PR.
  2. When I was starting to test long names, I discovered it was very annoying to not be able to determine where exactly the string was getting unexpectedly truncated, or causing a failure. Having groups of 10 equal characters made it much easier to track the exact location.
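
As a rough illustration of the second point, the grouped pattern can also be generated instead of typed out, which keeps the positional hint while making the length easy to verify. This is only a sketch: LongNameSketch and Grouped are hypothetical names and are not part of the tests in this PR.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the test suite.
static class LongNameSketch
{
    // Builds a string where the character at index i is the digit (i / 10) % 10,
    // i.e. ten '0's, then ten '1's, and so on.
    public static string Grouped(int length)
    {
        var sb = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            sb.Append((char)('0' + (i / 10) % 10));
        }
        return sb.ToString();
    }
}
```

With this pattern, the tail of a truncated name shows where it was cut: a name cut to 57 characters ends in seven '5's (indices 50 through 56), while a prefix of new string('0', 98) carries no such hint. LongNameSketch.Grouped(98) + "/" produces the same 98-character directory name used in the test above.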

@carlossanlop
Member Author

/azp run runtime-extra-platforms

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@carlossanlop
Member Author

carlossanlop commented Jun 22, 2022

The runtime-extra-platforms job may hit either timestamp related failures or long path related failures. Both are being fixed in other PRs:

#71038 (timestamps)

#71115 (long paths)

Member

@adamsitnik adamsitnik left a comment

LGTM, thank you @carlossanlop !
