Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get sdist requires from egg-info #88

Merged

Conversation

Avasam
Copy link
Contributor

@Avasam Avasam commented Jan 15, 2023

CC @AlexWaygood This was the idea I mentioned.

When the requirements can't be found in requires_dist, fallback to searching in *.egg-info. All without executing any arbitrary code.

I move the f"Expected dependency {req} to be present in the allowlist" error higher so it can fail earlier and leave the network requests last.

It's my first time playing around with downloading and manipulating archives in python. Let me know what I can do better here.

I tested the idea with pycocotools (best case scenario) click-default-group (on my windows machine with 7zip there's an extra folder between the gz and tar, still worked), numpy (no *.egg-info folder), mypy (has empty lines and [extras]) and pywin32 (no Source Distribution)

@Avasam
Copy link
Contributor Author

Avasam commented Jan 15, 2023

Tests found at least one case where this errors out. I'll look into it.
Edit: empty lines and [extras] definitions. Now accounted for and fixed.

stub_uploader/metadata.py Outdated Show resolved Hide resolved
@Avasam
Copy link
Contributor Author

Avasam commented Jan 15, 2023

Should I delete the files afterward or does it not matter as code meant for the CI? (which I assume just cleans itself up)

stub_uploader/metadata.py Outdated Show resolved Hide resolved
Copy link
Contributor

@srittau srittau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, remarks below.

Comment on lines 199 to 202
for info in file_in:
folder_name = info.name
# Assume a single folder
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a check here, whether folder_name was actually set, or maybe even check that there really is exactly one folder to account for broken eggs. But for the former case, the following should be simpler and have that required error check:

Suggested change
for info in file_in:
folder_name = info.name
# Assume a single folder
break
# Assume a single folder
folder_name = next(iter(file_in)).name

You can then also drop the initialization of folder_name.

Or just use * instead of folder_name in the glob below.

Copy link
Contributor Author

@Avasam Avasam May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just use * instead of folder_name in the glob below.

New code looks much cleaner like this, and makes no assumption. Especially since it's extracting to a temp folder now.

)


def get_sdist_requires(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "get" makes this function sound simpler than what it actually does. Maybe "extract"?

Also I think it would be safer to extract the sdist into an empty temporary directory than to where ever we currently are.

use glob to not have to assume match count
@Avasam Avasam requested a review from srittau May 3, 2023 16:22
@AlexWaygood
Copy link
Contributor

At least some of the test failures will be fixed by #95

@Avasam
Copy link
Contributor Author

Avasam commented May 3, 2023

I was thinking integration tests might be too fragile, but I can just pin the version I'm testing against. I'm sure you'd like me to add tests for this?

Quick testing script I've been using:

from __future__ import annotations

import requests
from packaging.requirements import Requirement

from stub_uploader.metadata import extract_sdist_requires, validate_response

for req_name in ("pycocotools", "click-default-group", "numpy", "mypy", "pywin32"):  # Pin these
    req = Requirement(req_name)

    resp = requests.get(f"https://pypi.org/pypi/{req.name}/json")
    validate_response(resp, req)
    data = resp.json()

    sdist_data = next(
        (
            url_data
            for url_data in reversed(data["urls"])
            if url_data["packagetype"] == "sdist"
        ),
        None,
    )
    if sdist_data is None:
        print(f"{req_name}: sdist_data is None!")
        continue

    requirements = extract_sdist_requires(sdist_data, req)
    print(f"{req_name}: {list(requirements)}")

Results:

pycocotools: [<Requirement('matplotlib>=2.1.0')>, <Requirement('numpy')>]
click-default-group: [<Requirement('click')>]
numpy: []
mypy: [<Requirement('typing_extensions>=3.10')>, <Requirement('mypy_extensions>=1.0.0')>, <Requirement('tomli>=1.1.0')>, <Requirement('typed_ast<2,>=1.4.0')>, <Requirement('psutil>=4.0')>, <Requirement('pip')>, <Requirement('typed_ast<2,>=1.4.0')>, <Requirement('lxml')>]
pywin32: sdist_data is None!

@srittau srittau merged commit 518ddfe into typeshed-internal:main May 4, 2023
@Avasam Avasam deleted the get-sdist-requires-from-egg-info branch May 4, 2023 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants