Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NBSP support. Closes: #834. #851

Merged
merged 5 commits into from
Oct 11, 2023
Merged

Conversation

rysson
Copy link

@rysson rysson commented Jul 11, 2023

Fix NBSP, it's not-break but varible-width space. Fixes #834

Checklist:

  • The GitHub pipeline is OK (green),
    meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.

  • A unit test is covering the code added / modified by this PR

  • This PR is ready to be merged

  • In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder

  • A mention of the change is present in CHANGELOG.md

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

@rysson rysson requested a review from Lucas-C as a code owner July 11, 2023 06:33
@rysson
Copy link
Author

rysson commented Jul 11, 2023

I need to add tests and changelog.
I'm not sure if docs need be updated.

@Lucas-C
Copy link
Member

Lucas-C commented Jul 11, 2023

I need to add tests and changelog. I'm not sure if docs need be updated.

You can just add a mention of your change in CHANGELOG.md, that would be enough 😊

@rysson
Copy link
Author

rysson commented Jul 11, 2023

I don't speak English well, is it enough?

Fixed:

  • variable-width non-breaking space (NBSP) support issue #834

@gmischler
Copy link
Collaborator

I'm not sure if docs need be updated.

It might be helpful to mention this eg. in the paragraph about Character or Word Based Line Wrapping. After the first sentence, you could add something like this:

Non-breaking spaces (\U00a0) do not trigger a word wrap, but are otherwise treated exactly as a normal space character.

With a test to actually verify that, this is a very nice little fix!
👍

@Lucas-C
Copy link
Member

Lucas-C commented Jul 12, 2023

Based on the GitHub Actions execution logs, it seems that your PR is breaking a few existing unit tests:

FAILED test/fonts/test_charmap.py...
FAILED test/html/test_html.py::test_html_whitespace_handling

You can try running them on your computer with this command:

pytest -vv test/fonts/test_charmap.py test/html/test_html.py

Related doc: https://pyfpdf.github.io/fpdf2/Development.html

docs/Text.md Outdated
@@ -32,8 +32,9 @@ fixed_text = get_display(reshape(some_text))
```

### Character or Word Based Line Wrapping
By default, `multi_line()` and `write()` will wrap lines based on words, using space characters and soft hyphens as seperators.
For languages like Chinese and Japanese, that don't usually seperate their words, character based wrapping is more appropriate.
By default, `multi_line()` and `write()` will wrap lines based on words, using space characters and soft hyphens as separators.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On that occasion, you could also fix an old mistake, and change the incorrect multi_line() in that sentence to multi_cell(). 😉

@Lucas-C
Copy link
Member

Lucas-C commented Aug 1, 2023

Hi @rysson!

Do you still want to finish this PR? 😊

Otherwise, maybe another contributor could take over

@rysson
Copy link
Author

rysson commented Aug 2, 2023

@Lucas-C it depends... I'm on holidays now. Then another contributor could finish it faster.
I'll back about Aug, 16.

@Lucas-C
Copy link
Member

Lucas-C commented Aug 3, 2023

@Lucas-C it depends... I'm on holidays now. Then another contributor could finish it faster. I'll back about Aug, 16.

No worries, this can definitively wait until the end of August if you wish to finish this! 😊

I'm just going to perform a new release in the meantime.

@rysson
Copy link
Author

rysson commented Sep 2, 2023

I'm back and I trying to finish this PR.
Everything going wrong, eg.

I'm kidding, nothing to spot me :-P

OK. I've made some fixes.

@Lucas-C – about unitttest, test_html_whitespace_handling is OK IMHO, test_charmap.py looks the same, but files differs a bit. Could you look?

@Lucas-C
Copy link
Member

Lucas-C commented Sep 4, 2023

Hi @rysson

Thank you for getting back to this 😊👍

I haven't reviewed all your changes, but only based on the ones in CHANGELOG.md, I think you have rebase issues:
your branch / Pull Request contains many changes in this file that make no sense in this context.

Could you please fix this?
If you cannot manage to properly rebase your branch,
it may be worth creating a new "clean" branch with your patch,
and open a new Pull Request.

@Lucas-C
Copy link
Member

Lucas-C commented Oct 9, 2023

Hi @rysson

Have you been able to get back to this?

Do you need with the rebase or with anything else?

@rysson
Copy link
Author

rysson commented Oct 10, 2023

Hi @rysson
Have you been able to get back to this?

Oh, I've forgot, I'm very sorry.
Live it too short a little, too many tasks ;-)

Ok, I've used cherry-pick for patch and I've generate same test PDF.
BUT, I have a few more test failed:

test/html/test_html.py::test_html_features FAILED
test/html/test_html.py::test_html_customize_ul FAILED
test/html/test_html.py::test_html_description FAILED
test/html/test_html.py::test_html_HTMLMixin_deprecation_warning FAILED

It's seems, that space (20) is replaced with nbsp (a0), for example:

E             +  b'BT 31.18 779.94 Td (          description details ) Tj ET',
E             -  b'BT 31.18 779.94 Td <a0a0a0a0a0a0a0a0a0a06465736372697074696f6e2064657461696c'

Could you look into those test and tell me if I should generate those PDF too?
Path replace a0 to 20 then I suggest to generate PDF again.

@rysson
Copy link
Author

rysson commented Oct 10, 2023

OK, I generated PDF, they look good IMHO.

I have another test failed:

        with time_execution() as duration:
            build_pdf_with_big_images()
>       assert duration.seconds > 0.3
E       assert 0.2659818779939087 > 0.3
E        +  where 0.2659818779939087 = namespace(seconds=0.2659818779939087).seconds

Maybe build server be a little bit slower and test passes.

@Lucas-C
Copy link
Member

Lucas-C commented Oct 10, 2023

OK, I generated PDF, they look good IMHO.

I have another test failed:

        with time_execution() as duration:
            build_pdf_with_big_images()
>       assert duration.seconds > 0.3
E       assert 0.2659818779939087 > 0.3
E        +  where 0.2659818779939087 = namespace(seconds=0.2659818779939087).seconds

Maybe build server be a little bit slower and test passes.

Yes, this is a recurring issue tracked in #923
I'm going to bump this threshold to 22s, but simply retrying your pipeline execution should be enough to have it pass for now.

@codecov-commenter
Copy link

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (a99de0a) 93.56% compared to head (058b9e5) 93.57%.
Report is 1 commits behind head on master.

❗ Current head 058b9e5 differs from pull request most recent head 2f0cd05. Consider uploading reports for the commit 2f0cd05 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #851   +/-   ##
=======================================
  Coverage   93.56%   93.57%           
=======================================
  Files          28       28           
  Lines        8209     8213    +4     
  Branches     1500     1501    +1     
=======================================
+ Hits         7681     7685    +4     
  Misses        326      326           
  Partials      202      202           
Files Coverage Δ
fpdf/line_break.py 99.14% <100.00%> (+<0.01%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Lucas-C
Copy link
Member

Lucas-C commented Oct 10, 2023

This PR looks good to me 🙂

@gmischler would you like to get a look at it before we merge?

@gmischler
Copy link
Collaborator

@gmischler would you like to get a look at it before we merge?

Looks good to me. Let's send it.

@gmischler gmischler merged commit 1546f70 into py-pdf:master Oct 11, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Non-breaking space support in multi_cell() and write_html()
4 participants