skip over noscript tags #55

Open: wants to merge 1 commit into base: master
2 changes: 1 addition & 1 deletion html2text.go
@@ -291,7 +291,7 @@ func (ctx *textifyTraverseContext) handleElement(node *html.Node) error {
ctx.isPre = false
return err

- case atom.Style, atom.Script, atom.Head:
+ case atom.Style, atom.Script, atom.Head, atom.Noscript :
// Ignore the subtree.
return nil
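
The effect of the extra case can be illustrated with a minimal, self-contained sketch. It does not use the real `html.Node` or `atom` types; the `node` struct, the `ignored` set, and `textOf` are hypothetical stand-ins invented for this illustration, and the sketch only collects plain text rather than reproducing the library's link or image handling.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, simplified stand-in for html.Node.
type node struct {
	tag      string // element name; empty for text nodes
	text     string // content of a text node
	children []*node
}

// ignored lists the elements whose whole subtree is dropped,
// mirroring the case list after this change (assumed names, lowercase tags).
var ignored = map[string]bool{
	"style": true, "script": true, "head": true, "noscript": true,
}

// textOf collects the text of n and its descendants,
// skipping every subtree rooted at an ignored element.
func textOf(n *node) string {
	var parts []string
	var walk func(n *node)
	walk = func(n *node) {
		if n.tag == "" {
			// Text node: keep non-empty trimmed content.
			if s := strings.TrimSpace(n.text); s != "" {
				parts = append(parts, s)
			}
			return
		}
		if ignored[n.tag] {
			// Ignore the subtree.
			return
		}
		for _, c := range n.children {
			walk(c)
		}
	}
	walk(n)
	return strings.Join(parts, " ")
}

func main() {
	doc := &node{tag: "body", children: []*node{
		{tag: "p", children: []*node{{text: "Hello"}}},
		{tag: "noscript", children: []*node{
			{tag: "a", children: []*node{{text: "tracking link"}}},
		}},
	}}
	fmt.Printf("%q\n", textOf(doc)) // prints "Hello"
}
```

Returning early for an ignored element means none of its descendants are visited, which is why the link and image inside `<noscript>` contribute nothing to the output.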

4 changes: 4 additions & 0 deletions html2text_test.go
@@ -775,6 +775,10 @@ func TestIgnoreStylesScriptsHead(t *testing.T) {
`<html><head><title>Title</title></head><body></body></html>`,
"",
},
+ {
+ "<noscript><a href=\"/action/clickThrough?id=some_identifier\" target=\"_blank\"><img src=\"https://example.com/image/some_identifier\"></a></noscript>",
+ "",
+ },
}

for _, testCase := range testCases {
6 changes: 6 additions & 0 deletions testdata/utf8.html
@@ -17,6 +17,12 @@
<p class="calibre_7">这本书的创作耗费了相当多的时间和精力。在成长的过程中,我在我的小房间里从未想过等待我的会是这样的战斗。在创作中,我的思想逐渐成熟;爱恋从分崩离析,到失而复得,世界冠军头衔从失之交臂,到囊中取物。如果说在我人生的第一个二十九年中,我学到了什么,那就是,我们永远无法预测结局,无论是重要的比赛、冒险,还是轰轰烈烈的爱情。我们唯一可以肯定的只有,出乎意料。不管我们做了多么万全的准备,在生活的真实场景中,我们总是会处于陌生的境地。我们也许会无法冷静,失去理智,感觉似乎整个世界都在针对我们。在这个时候,我们所要做的是要付出加倍的努力,要表现得比预想得更好。我认为,关键在于准备好随机应变,准备好在所能想象的高压下发挥出创造力。</p>
<p class="calibre_7">读者朋友们,我非常希望你们在读过这本书后,可以得到启发,甚至会得到触动,从而能够根据各自的天赋与特长,去实现自己的梦想。这就是我写作此书的目的。我在字里行间所传达的理念曾经使我受益匪浅,我很希望它们可以为大家提供一个基本的框架和方向。如果我的方法言之有理,那么就请接受它,琢磨它,并加之自己的见解。忘记我的那些数字。真正的掌握需要通过自己发现一些最能够引起共鸣的信息,并将其彻底地融合进来,直至成为一体,这样我们才能随心所欲地驾驭它。</p>
<div class="mbp_pagebreak" id="calibre_pb_4"></div>
+ <!-- No JavaScript support is a call to an image -->
+ <noscript>
+ <a href="/action/clickThrough?id=some_identifier" target="_blank">
+ <img src="https://example.com/image/some_identifier">
+ </a>
+ </noscript>
</body>

</html>