[BUG] 11yr old post title includes "\n", which is treated as a linefeed/illegal filesystem character when attempting to save the post #616

Closed
3 tasks done
mbarr564 opened this issue Mar 19, 2022 · 1 comment
Assignees
Labels
bug Something isn't working


mbarr564 commented Mar 19, 2022

  • I am reporting a bug.
  • I am running the latest version of BDfR
  • I have read the Opening an issue

Description

An old post's title contains a linefeed ("\n"), which causes a terminating error when BDFR attempts to save the post to disk.
Reddit MAY have long since stopped allowing this (is "\n" even permitted in new post titles anymore?), but I'm leaving the offending post up so the issue can be reproduced.
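
For illustration, here is a minimal sketch of one way a title could be cleaned before it is used as part of a file name. This is not BDFR's actual code or fix; the function name `sanitize_filename_part` and the choice to replace offending characters with a space are assumptions made for this example.

```python
import re

# Characters Windows rejects in file names: the reserved punctuation
# < > : " / \ | ? * plus ASCII control characters 0-31 (which includes "\n").
_ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename_part(text: str, replacement: str = " ") -> str:
    """Hypothetical helper: make a post title safe to embed in a file name."""
    cleaned = _ILLEGAL_CHARS.sub(replacement, text)
    # Collapse any run of whitespace left behind into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned)
    return cleaned.strip()


title = "Reston, VA\nSome info regarding shelters in the area."
print(sanitize_filename_part(title))
```

Replacing rather than deleting the linefeed keeps the two halves of the title from running together in the resulting file name.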

Command

Paste here the command(s) that causes the bug:
My wrapper isn't involved with the bug AFAIK, but it will allow easy repro in a Windows environment:

  1. PS> Install-Script -Name New-SubredditHTMLArchive -RequiredVersion 2.0.9 -Force
  2. PS> New-SubredditHTMLArchive.ps1 -Subreddit 'AtheistHavens'
  3. BDFR reaches this post, and then throws the below log output when attempting to save to disk:
    https://www.reddit.com/r/AtheistHavens/comments/jiecu/reston_va_some_info_regarding_shelters_in_the_area/

Environment (please complete the following information):

Windows 11
PowerShell: 5.1.22000.282
Python: 3.10.2
BDFR: 2.5.2

Logs

[2022-03-19 09:18:07,794 - bdfr.downloader - DEBUG] - Attempting to download submission jiecu
[2022-03-19 09:18:07,795 - bdfr.downloader - DEBUG] - Using SelfPost with url https://www.reddit.com/r/AtheistHavens/comments/jiecu/reston_va_some_info_regarding_shelters_in_the_area/
[2022-03-19 09:18:09,106 - bdfr.downloader - ERROR] - [Errno 22] Invalid argument: 'C:\Users\username\Documents\New-SubredditHTMLArchive\JSON\AtheistHavens\Rendaril_Reston, VA\nSome info regarding shelters in the area._jiecu.txt'
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\downloader.py", line 110, in _download_submission
    with open(destination, 'wb') as file:
OSError: [Errno 22] Invalid argument: 'C:\Users\username\Documents\New-SubredditHTMLArchive\JSON\AtheistHavens\Rendaril_Reston, VA\nSome info regarding shelters in the area._jiecu.txt'
[2022-03-19 09:18:09,111 - bdfr.downloader - ERROR] - Failed to write file in submission jiecu to C:\Users\username\Documents\New-SubredditHTMLArchive\JSON\AtheistHavens\Rendaril_Reston, VA
Some info regarding shelters in the area._jiecu.txt: [Errno 22] Invalid argument: 'C:\Users\username\Documents\New-SubredditHTMLArchive\JSON\AtheistHavens\Rendaril_Reston, VA\nSome info regarding shelters in the area._jiecu.txt'

[2022-03-19 09:18:09,112 - bdfr.archive_entry.submission_archive_entry - DEBUG] - Retrieving full comment tree for submission jiecu
[2022-03-19 09:18:09,753 - root - ERROR] - Scraper exited unexpectedly
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\__main__.py", line 120, in cli_clone
    reddit_scraper.download()
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\cloner.py", line 21, in download
    self.write_entry(submission)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\archiver.py", line 75, in write_entry
    self._write_entry_json(archive_entry)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\archiver.py", line 87, in _write_entry_json
    self._write_content_to_disk(resource, content)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\bdfr\archiver.py", line 102, in _write_content_to_disk
    with open(file_path, 'w', encoding="utf-8") as file:
OSError: [Errno 22] Invalid argument: 'C:\Users\username\Documents\New-SubredditHTMLArchive\JSON\AtheistHavens\Rendaril_Reston, VA\nSome info regarding shelters in the area._jiecu.json'
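
A note on reading the logs above: the "Failed to write file" message is split across two lines, while the OSError text shows a literal `\n`. A plausible explanation, assuming the log message formats the path with `str()` while `OSError` shows the file name via `repr()`, is that the path contains a real linefeed character. The sketch below reproduces that difference with a shortened, hypothetical path; it never touches the filesystem.

```python
from pathlib import PureWindowsPath

# The title as the logs suggest Reddit returned it, with a real linefeed inside.
title = "Reston, VA\nSome info regarding shelters in the area."

# Shortened stand-in for the destination path seen in the logs (assumption).
path = PureWindowsPath("AtheistHavens") / f"Rendaril_{title}_jiecu.txt"

as_text = str(path)       # what a plain formatted log message would contain
as_repr = repr(as_text)   # what an escaped rendering would contain

print(as_text)   # breaks onto two lines at the linefeed
print(as_repr)   # stays on one line and shows the linefeed as \n
```

If that reading is right, the offending character is a single linefeed (ASCII 10) rather than a backslash followed by the letter "n".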

@mbarr564 mbarr564 added the bug Something isn't working label Mar 19, 2022
@mbarr564 mbarr564 changed the title from '[BUG] 11yr old post title includes "\n", which is treated as a linefeed when attempting to save the post' to '[BUG] 11yr old post title includes "\n", which is treated as a linefeed/directory when attempting to save the post' on Mar 19, 2022
@mbarr564 mbarr564 changed the title from '[BUG] 11yr old post title includes "\n", which is treated as a linefeed/directory when attempting to save the post' to '[BUG] 11yr old post title includes "\n", which is treated as a linefeed/illegal filesystem character when attempting to save the post' on Mar 21, 2022
@Serene-Arc Serene-Arc self-assigned this Mar 25, 2022