Question regarding mark_utf8 #35

Open
MartinKies opened this issue Sep 24, 2020 · 1 comment

@MartinKies
Contributor

In write.output.solution (create_ps.r) you have

out.txt = mark_utf8(out.txt)

I am unsure about its purpose. This line sometimes leads to errors when used before my function "fix.parser.inconsistencies" due to incompatibilities with the stringr package, e.g. regarding stringr::str_length().

Commenting out the line fixes the error, and the resulting solution looks fine to me (in particular regarding umlauts). Perhaps the following code makes my point clearer:

fix.parser.inconsistencies("Test ü")
[1] "Test ü"
mark_utf8("Test ü")
[1] "Test \xfc"
str_length("Test ü")
[1] 6
str_length("Test \xfc")
[1] 6
str_length(mark_utf8("Test ü"))
[1] NA
Warning message:
In stri_length(string) :
invalid UTF-8 byte sequence detected; try calling stri_enc_toutf8()
fix.parser.inconsistencies(mark_utf8("Test ü"))
[1] "Test �"

I am a bit wary about whether commenting out the line is the way to go, because I do not fully understand its purpose. Maybe I found an error in mark_utf8 itself, as str_length("Test \xfc") does work?
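
A minimal sketch of what may be going on, using only base R. It assumes (this is not verified against the mark_utf8 source) that mark_utf8 only sets the encoding mark on the string, as `Encoding<-` does, without converting any bytes. The variable names are illustrative. If the input still holds the single latin1 byte `\xfc`, labelling it as UTF-8 produces an invalid string, whereas converting it with enc2utf8() produces a valid one:

```r
# "\xfc" is the single latin1 byte for "ü"; in UTF-8 the same
# character is the two-byte sequence c3 bc.
x_latin1 <- "Test \xfc"
Encoding(x_latin1) <- "latin1"      # declare what the bytes really are

# Assumption: this mimics what mark_utf8 does. The bytes stay the same,
# only the label changes, so the string is now invalid UTF-8.
x_mislabelled <- "Test \xfc"
Encoding(x_mislabelled) <- "UTF-8"
validUTF8(x_mislabelled)

# Converting instead of relabelling rewrites the bytes.
x_utf8 <- enc2utf8(x_latin1)
validUTF8(x_utf8)
charToRaw(x_utf8)                   # the last two bytes are c3 bc
nchar(x_utf8)
```

Under that assumption, the NA from str_length() would come from the invalid byte sequence rather than from stringr, and converting the text before marking it would avoid the problem.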

@skranz
Owner

skranz commented Sep 24, 2020

Hmm, honestly all this UTF-8 code was mainly trial and error. Perhaps there was a problem that caused me to add the line, but I don't remember. If all problem sets (with special UTF-8 chars like ü) convert well, you can comment it out.

Note that I usually expect all _sol.Rmd files to be saved with UTF-8 encoding. If the problem arises because you saved the _sol.Rmd files in a different encoding, such as the Windows default encoding, first try changing the encoding.
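
If a file was saved in a different encoding, one possible way to convert it is sketched below in base R. The helper `reencode_to_utf8` and its arguments are hypothetical and not part of the package. The default source encoding "latin1" is an assumption; a file saved with the Windows default encoding may need a name such as "CP1252", and which names iconv() accepts depends on the platform.

```r
# Hypothetical helper: rewrite a text file as UTF-8, working on raw bytes
# so the result does not depend on the current locale.
reencode_to_utf8 <- function(path_in, path_out, from = "latin1") {
  bytes <- readBin(path_in, what = "raw", n = file.size(path_in))
  txt <- iconv(rawToChar(bytes), from = from, to = "UTF-8")
  if (is.na(txt)) stop("could not convert from ", from)
  writeBin(charToRaw(txt), path_out)
  invisible(path_out)
}

# Demo on a temporary file that contains the latin1 byte for "ü".
src <- tempfile(fileext = ".Rmd")
dst <- tempfile(fileext = ".Rmd")
writeBin(c(charToRaw("Test "), as.raw(0xfc)), src)
reencode_to_utf8(src, dst)
converted <- readBin(dst, what = "raw", n = file.size(dst))
validUTF8(rawToChar(converted))
```

Saving the file as UTF-8 from the editor achieves the same result; the sketch is only meant for converting files in bulk.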
