When scraping an e-book, multiple font attributes within a single paragraph are not captured #157

Closed
blakehan opened this issue Dec 4, 2022 · 5 comments
Labels
bug Something isn't working

Comments


blakehan commented Dec 4, 2022

Book URL: https://www.dedao.cn/ebook/reader?id=kQX7yD4MVoN52PDAnlRdzK6qvg8XEwbM1R3ZJjBb7rO4ypxGa9LeQm1kYng9YzK5

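Classic literary works often use several fonts within a single paragraph. In this book, for example, FangSong (仿宋) marks French text and Jinkai (今楷) marks Russian text.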
image
The SVG code is as follows:
image
After scraping, both the FangSong and the Jinkai text come out as PingFang (苹方).
image
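
For readers who cannot see the screenshots, here is a minimal sketch of the idea of keeping the font per text run instead of per paragraph. The real SVG markup was only posted as an image, so the structure assumed here (one `<text>` element per run, with the font in a `font-family` attribute or an inline `style`) is hypothetical, as are the function names and the font names in `SAMPLE`. It is not the project's actual code.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input: each run of text is its own <text> element.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <text style="font-family:PingFang SC;">He said, </text>
  <text style="font-family:FangSong;">bonjour</text>
  <text style="font-family:FangSong;">, mon ami</text>
  <text font-family="JinKai">privet</text>
</svg>"""


def font_family(elem, default="PingFang SC"):
    """Read the font from a font-family attribute or an inline style."""
    family = elem.get("font-family")
    if family:
        return family.strip()
    match = re.search(r"font-family\s*:\s*([^;]+)", elem.get("style", ""))
    return match.group(1).strip() if match else default


def paragraph_runs(svg_source):
    """Return (font, text) runs, merging neighbours that share a font."""
    runs = []
    for elem in ET.fromstring(svg_source).iter():
        # Strip the XML namespace so "{...}text" and "text" both match.
        if elem.tag.rsplit("}", 1)[-1] != "text":
            continue
        text = "".join(elem.itertext())
        if not text:
            continue
        family = font_family(elem)
        if runs and runs[-1][0] == family:
            runs[-1] = (family, runs[-1][1] + text)
        else:
            runs.append((family, text))
    return runs


def runs_to_html(runs):
    """Emit one <span> per run so each keeps its own font."""
    spans = "".join(
        '<span style="font-family:{}">{}</span>'.format(
            escape(family, quote=True), escape(text)
        )
        for family, text in runs
    )
    return "<p>" + spans + "</p>"
```

Calling `runs_to_html(paragraph_runs(SAMPLE))` gives one paragraph with three spans, so the FangSong and JinKai runs are no longer flattened into the default font.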

Author

blakehan commented Dec 4, 2022

Temporary fix:

image

Owner

yann0917 commented Dec 5, 2022

Bug #141. I hadn't run into this case before, so it was never handled.

yann0917 added the bug label Dec 5, 2022

Owner

yann0917 commented Dec 6, 2022

Your approach isn't general enough. Let me think about other options.

Author

blakehan commented Dec 6, 2022

@yann0917 I've revised the code. It should now work in all cases.

comment.diff.txt

Owner

yann0917 commented Dec 6, 2022

image
It works now.
