[Fantia] Post-processor events running at unintuitive times? #4627

Open
shrublet opened this issue Oct 5, 2023 · 2 comments

shrublet commented Oct 5, 2023

I believe that since the Fantia extractor was refactored (f845298), the post and post-after events trigger once for each content section/entry rather than once for the post as a whole. I don't think that's inherently a bad change, but I'm now unsure how to run a post-processor at the point where post-after would typically trigger, i.e. once the entire post is done. If count were added as a keyword, a filter comparing num to count would do the trick for post-after, and for post, checking num alone suffices. I still find this somewhat unintuitive, especially relative to how other extractors work. For lack of a better alternative, I ended up adding this inside the for loop in items:

post["count"] = sum(
    len(self._process_content(post, content))
    for content in post["_data"]["post_contents"]
) + 1

Hope you have some insight or thoughts on this.

Edit: Whoops, I can't get count from _get_post_contents because it inserts the thumbnail into the list when it's called. I changed the code to add 1 manually to the total computed from post_contents to account for the thumbnail.

Edit 2: I also ended up adding a count and a num value for content, set after the loop yields the directory, because I hit some edge cases where comparing num to count wasn't enough for what I needed.
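
To make the num/count idea above concrete, here is a minimal, self-contained sketch. It is not gallery-dl code: `count_files`, `numbered_files`, `matches`, and the sample data are all hypothetical, the thumbnail is assumed to always be the first file, and `num == 1` is only one reading of "just num suffices".

```python
def count_files(post_contents, files_in):
    """Files across all content sections, plus one for the thumbnail."""
    return sum(len(files_in(content)) for content in post_contents) + 1


def numbered_files(post_contents, files_in):
    """Yield metadata dicts carrying a 1-based num and the total count."""
    count = count_files(post_contents, files_in)
    names = ["thumbnail"]  # assumption: the thumbnail always comes first
    for content in post_contents:
        names.extend(files_in(content))
    for num, name in enumerate(names, 1):
        yield {"num": num, "count": count, "file": name}


def matches(expression, metadata):
    """Evaluate a filter-style Python expression against one metadata dict."""
    return bool(eval(expression, {"__builtins__": {}}, metadata))


# Made-up post with three content sections, one of them empty.
sections = [{"files": ["a.jpg", "b.jpg"]}, {"files": []}, {"files": ["c.png"]}]
files = list(numbered_files(sections, lambda content: content["files"]))

first = [m["file"] for m in files if matches("num == 1", m)]
last = [m["file"] for m in files if matches("num == count", m)]
print(first, last)  # ['thumbnail'] ['c.png']
```

With a post-wide count available, each expression matches exactly one file per post, which is what a once-per-post trigger needs.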

mikf (Owner) commented Oct 13, 2023

I tried something that would potentially solve your problem with content_num == content_count as filter, but it doesn't work due to how post-after is implemented and how objects get reused. It currently triggers for the last content section as well as the second-to-last one … (833dce1)

(man, the code has become such a mess over time)
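
The "objects get reused" part can be illustrated with a toy model. This is only one plausible mechanism and not the actual gallery-dl implementation: it assumes the "after" callback for a section runs when the next section arrives (and once more at the end), and that every section shares one mutable dict. `run` and `copy_metadata` are hypothetical names.

```python
def run(sections, copy_metadata):
    """Return the content_num each deferred "after" callback observes."""
    observed = []
    shared = {"content_count": len(sections)}
    previous = None
    for num, _ in enumerate(sections, 1):
        shared["content_num"] = num  # the reused dict is mutated in place
        if previous is not None:
            # Deferred callback for the previous section fires only now.
            observed.append(previous["content_num"])
        previous = dict(shared) if copy_metadata else shared
    if previous is not None:
        observed.append(previous["content_num"])  # final callback at the end
    return observed


reused = run(["a", "b", "c"], copy_metadata=False)
copied = run(["a", "b", "c"], copy_metadata=True)
print(reused)  # [2, 3, 3]
print(copied)  # [1, 2, 3]

# A filter like content_num == content_count would match this many times:
print(reused.count(3), copied.count(3))  # 2 1
```

In this model the reused dict makes the filter match for both the last and the second-to-last section, while a per-section copy matches only once.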

mikf added a commit that referenced this issue Oct 14, 2023:
at least this makes "filter": "content_num == content_count+1" with "event": "post-after" work
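
For reference, a configuration using that filter might look roughly like the sketch below, written as a Python dict and dumped to JSON. The "filter" and "event" values come from the commit message above; everything else is an assumption: the extractor.fantia.postprocessors layout, the choice of the exec post-processor, and the placeholder command.

```python
import json

# Hypothetical post-processor entry; only "filter" and "event" are
# taken from the commit message, the rest is illustrative.
postprocessor = {
    "name": "exec",
    "event": "post-after",
    "filter": "content_num == content_count+1",
    "command": ["echo", "post finished"],  # placeholder command
}

# Assumed config layout: post-processors scoped to the Fantia extractor.
config = {"extractor": {"fantia": {"postprocessors": [postprocessor]}}}

print(json.dumps(config, indent=4))
```
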

shrublet (Author) commented

> I tried something that would potentially solve your problem with content_num == content_count as filter, but it doesn't work due to how post-after is implemented and how objects get reused

It's good to know it wasn't just a me problem! As I mentioned in my second edit, I had to put post["content_num"] += 1 after yield Message.Directory, post. If I put it where it logically makes sense, I end up with the result you're describing, where it overcounts. It's not a solution, but it's a workaround that's good enough (for me) for now, as I don't necessarily need the actual number. All things considered, it's a pretty minor concern, and I at least have something that works on my end.
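
Why moving the increment after the yield helps can be sketched with a toy generator. Again, this is a simplified model and not the extractor's code: `items` and `deferred_views` are hypothetical, and the consumer is assumed to run each section's "after" hook only once the next section arrives.

```python
def items(sections, increment_after_yield):
    """Toy generator that hands out the same dict for every section."""
    post = {"content_num": 0, "content_count": len(sections)}
    for _ in sections:
        if not increment_after_yield:
            post["content_num"] += 1
        yield post  # the same object is yielded each time
        if increment_after_yield:
            post["content_num"] += 1  # runs when the consumer asks for more


def deferred_views(messages):
    """Record the content_num a deferred "after" hook sees per section."""
    seen, previous = [], None
    for current in messages:
        if previous is not None:
            seen.append(previous["content_num"])
        previous = current
    if previous is not None:
        seen.append(previous["content_num"])
    return seen


print(deferred_views(items("abc", increment_after_yield=False)))  # [2, 3, 3]
print(deferred_views(items("abc", increment_after_yield=True)))   # [1, 2, 3]
```

In this model the late increment gives the deferred hook the right numbers, but anything that reads content_num immediately at each yield sees values that are one too low, which fits the caveat about not needing the actual number.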
