Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add urls support to collections. #62

Merged
merged 11 commits into from
Jul 19, 2023
Merged

Add urls support to collections. #62

merged 11 commits into from
Jul 19, 2023

Conversation

FledgeXu
Copy link
Collaborator

Fix #59

Copy link
Member

@rgaudin rgaudin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good!
Can you please:

  • Add a check for clashing filenames? If we request (directly or indirectly) several files to be stored at the same path, the first one will get the spot and the rest will silently point to it. We should raise on such conflict.
  • Update the README to mention the new format(s)
  • Update the CHANGELOG to add the feature.

nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Show resolved Hide resolved
nautiluszim/scraper.py Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
@FledgeXu FledgeXu requested a review from rgaudin July 19, 2023 13:59
CHANGELOG Outdated Show resolved Hide resolved
CHANGELOG Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
nautiluszim/scraper.py Outdated Show resolved Hide resolved
@FledgeXu FledgeXu requested a review from rgaudin July 19, 2023 14:50
Copy link
Member

@rgaudin rgaudin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great but duplicate name is not working

[
    {
        "title": "Fledge PDF",
        "description": "Someting in his ZIP",
        "authors": "n/a",
        "files": ["test.pdf"]
    },
    {
        "title": "Tarifs I SAGO",
        "description": "Tarifs électricité Mali",
        "authors": "EDM",
        "files": [
            {
                "archive_member": "test.pdf",
                "url": "https://isago.rskg.org/static/tarifs-isago.pdf",
                "filename": "test.pdf"
            }
        ]
    }
]

This doesn't raise while it should.

The problem is you are looking for duplicates in path instead of filename. path is used only to retrieve member_name (maybe it should be renamed) and we don't care about using it multiple times.

@FledgeXu
Copy link
Collaborator Author

The problem is you are looking for duplicates in path instead of filename. path is used only to retrieve member_name (maybe it should be renamed) and we don't care about using it multiple times.

I see. you are right, filename is the name of the last file that will be put into zim.

@rgaudin
Copy link
Member

rgaudin commented Jul 19, 2023

You may also want to only display duplicate ones (using a set) ; otherwise

ValueError: Files in collection are duplicate:
 - test.pdf
 - test.pdf

You should also look into refactoring a bit test_collection to please Codefactor

@FledgeXu FledgeXu requested a review from rgaudin July 19, 2023 16:00
@FledgeXu
Copy link
Collaborator Author

maybe it should be renamed

I just changed the path to the uri. I think it's more appropriate.

@rgaudin rgaudin merged commit 15ac800 into main Jul 19, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add URLs support to collection
2 participants