Generated HTML layout is messy for threads where many floors are posted by the same person #8

Closed
brookchou opened this issue Oct 8, 2019 · 1 comment

Comments

@brookchou

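The problem shows up in a thread where the original poster occupies many consecutive floors and there are very few images. I had never scraped a thread with these characteristics before, so I had not noticed the issue.

The generated HTML file merges the body text of many floors together, and the basic per-floor information (date, poster ID, floor number) is omitted. In addition, the beginning of some floors is placed at the end of the previous floor's body, for example the word "预计" ("expected") that follows an illustration in screenshot 1, and text that starts with the numbering "1、" or "2、". Together these make the page look rather messy.

See the partial screenshots below and the screenshots of the original web page for details. The original thread is at this link.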

Screenshot: 捕获-1
Screenshot: 捕获-3
Screenshot: 捕获-2

@hjhee
Owner

hjhee commented Oct 24, 2019

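The program merges some floors on purpose, to make novel-style threads easier to read. This is by design.
Floors are considered mergeable when they are consecutive, posted by the same author, and contain no nested (floor-in-floor) comments.
The merge simply appends each floor's content to the preceding floor. When a floor does not end with an image, the text of adjacent floors is still separated by a line break. If merging floors makes the layout messy, you can either drop floor merging altogether or manually append HTML that starts a new line whenever content is merged.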
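
For illustration only, the merge rule described above could be sketched roughly as follows. This is not the project's actual code: the `Post` type, its field names, `can_merge`, `merge_posts`, and the `separator` parameter are all hypothetical, and the sketch assumes that both the previous and the current floor must be free of nested comments. It shows the second suggested fix, inserting an explicit line break between merged floors, so a floor that ends with an image no longer runs into the next floor's text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Post:
    """One floor of a thread. All names here are hypothetical."""
    author: str
    floor: int
    html: str  # floor body as an HTML fragment
    replies: List[str] = field(default_factory=list)  # nested comments


def can_merge(prev: Post, cur: Post) -> bool:
    # Mergeable: consecutive floors, same author, no nested comments.
    # Assumption: the "no nested comments" rule applies to both floors.
    return (
        cur.floor == prev.floor + 1
        and cur.author == prev.author
        and not prev.replies
        and not cur.replies
    )


def merge_posts(posts: List[Post], separator: str = "<br>") -> List[Post]:
    """Fold mergeable floors into the preceding one.

    `separator` is inserted between merged bodies; pass "" to get plain
    concatenation with no explicit line break.
    """
    merged: List[Post] = []
    last: Optional[Post] = None  # last original floor folded into merged[-1]
    for post in posts:
        if last is not None and can_merge(last, post):
            merged[-1].html += separator + post.html
        else:
            # Copy so the input list is left untouched.
            merged.append(
                Post(post.author, post.floor, post.html, list(post.replies))
            )
        last = post
    return merged
```

Calling `merge_posts(posts, separator="")` in this sketch gives plain concatenation, while skipping the call entirely corresponds to dropping floor merging and keeping every floor's date, poster ID, and floor number.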

@hjhee hjhee closed this as completed in f099948 Apr 27, 2020