Reason behind pipx run forcing the subprocess' encoding to utf-8? #1423

Open
J3ronimo opened this issue May 21, 2024 · 3 comments

Comments


J3ronimo commented May 21, 2024

Hi. Today I ran into a problem with pipx run: the Python tool I was running prints German umlauts like "ü", and they didn't show up correctly in the terminal, although they do when I run the raw Python script without pipx wrapped around it.

Turns out that the reason for this lies in pipx.util, where _fix_subprocess_env sets

env["PYTHONIOENCODING"] = "utf-8"
env["PYTHONLEGACYWINDOWSSTDIO"] = "utf-8"

in the environment inherited by the subprocess, and then exec_app sets

subprocess.run( ..., encoding="utf-8")

alongside to match that.
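
To see the effect of that environment variable in isolation, here is a minimal sketch that does not use any pipx code. The helper name child_stdout_bytes is hypothetical; it relies only on Python's documented PYTHONIOENCODING variable and captures the child's raw stdout bytes through a pipe instead of a console.

```python
import os
import subprocess
import sys


def child_stdout_bytes(forced_encoding):
    """Run a child Python that prints "ü" and return its raw stdout bytes.

    Hypothetical helper for illustration only; not part of pipx.
    """
    env = dict(os.environ)
    # The same variable that, per this issue, pipx puts into the child environment.
    env["PYTHONIOENCODING"] = forced_encoding
    result = subprocess.run(
        [sys.executable, "-c", r"print('\xfc')"],  # '\xfc' is "ü"
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    # strip() removes the trailing newline (or CR LF on Windows).
    return result.stdout.strip()


print(child_stdout_bytes("utf-8"))  # b'\xc3\xbc'
print(child_stdout_bytes("cp850"))  # b'\x81'
```

With the variable set to utf-8 the child writes two bytes for "ü", whatever encoding the receiving terminal expects.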

The problem is that my German Windows terminal (cmd.exe) uses CP850, not UTF-8, so anything Python emits as UTF-8 looks like gibberish in my terminal.
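
The mismatch can be reproduced without a terminal by encoding text as UTF-8 and decoding the bytes as CP850. This is a small sketch; the function name as_seen_by_cp850_terminal is made up for illustration.

```python
def as_seen_by_cp850_terminal(text):
    """Encode text as UTF-8, then interpret those bytes as CP850."""
    utf8_bytes = text.encode("utf-8")
    return utf8_bytes.decode("cp850")


print(as_seen_by_cp850_terminal("hello äöü"))  # hello ├ñ├Â├╝
```

Each umlaut becomes two UTF-8 bytes, and CP850 renders each of those bytes as a separate character.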

I'd like to know whether there was a specific reason for forcing the encoding here, or whether anything speaks against simply leaving these settings out so that Python can detect and use the terminal's encoding, which works nicely in my case.

Thanks and cheers.

Contributor

chrysle commented May 21, 2024

Hmm yes, there is #335 (comment) and context.

@J3ronimo J3ronimo changed the title Reason behind pipx run forcing the subprocess' encoding utf-8? Reason behind pipx run forcing the subprocess' encoding to utf-8? May 21, 2024
Author

J3ronimo commented May 22, 2024

Thanks for the link, @chrysle. Unfortunately, the commit message doesn't make clear to me why this was added, and it doesn't seem related to the issue that it fixes. To make the whole thing a little more concrete:

pipx run cowsay -t "hello äöü"

prints

  _________
| hello ├ñ├Â├╝ |
  =========
         \
          \
            ^__^
            (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

on my machine (Win10, cmd.exe in Windows Terminal, chcp says 850, pipx 1.5.0, Python 3.11.5), whereas just

cowsay -t "hello äöü"

in the same terminal prints everything correctly. And removing the above lines related to the subprocess encoding fixes this.
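
For anyone comparing the two situations, a small diagnostic sketch like the following could be run once directly and once from a process started through pipx, to see which encoding-related settings differ. The function name describe_stdio_encoding is hypothetical; the script only reads standard Python attributes and environment variables.

```python
import locale
import os
import sys


def describe_stdio_encoding():
    """Collect the settings that influence how this process encodes stdout."""
    return {
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "preferred_encoding": locale.getpreferredencoding(False),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "PYTHONLEGACYWINDOWSSTDIO": os.environ.get("PYTHONLEGACYWINDOWSSTDIO"),
    }


for name, value in describe_stdio_encoding().items():
    print(f"{name}: {value}")
```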

Contributor

chrysle commented May 22, 2024

Unfortunately to me the commit message doesn't make clear why this was added, and it doesn't seem related to the issue that it fixes.

As stated in #335 (comment), this was added to prevent any edge cases that might occur otherwise – normally, you're on the safe side with UTF-8 encoding, because it's that widespread. But I agree the behaviour you experience is unpleasant. Probably, we should make pipx's output encoding configurable, with an environment variable prefixed PIPX_ to avoid any unintended behaviour originating from user-specified PYTHONIOENCODING.
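
One possible shape for such a setting, purely as a sketch: the variable name PIPX_OUTPUT_ENCODING, the special value "auto", and the helper build_child_env are all assumptions made up for illustration and do not exist in pipx. The idea is to keep forcing UTF-8 by default while letting the user opt out or pick another encoding.

```python
# Hypothetical variable name; nothing in this thread says pipx defines it.
ENCODING_OVERRIDE_VAR = "PIPX_OUTPUT_ENCODING"
DEFAULT_ENCODING = "utf-8"


def build_child_env(parent_env):
    """Return a copy of parent_env with the child's stdio encoding decided.

    - variable unset: keep the current behaviour and force UTF-8
    - variable set to "auto" (assumed value): force nothing, let Python detect it
    - any other value: force that encoding
    """
    env = dict(parent_env)
    choice = env.pop(ENCODING_OVERRIDE_VAR, DEFAULT_ENCODING)
    if choice == "auto":
        # Drop any inherited PYTHONIOENCODING so the child detects the encoding itself.
        env.pop("PYTHONIOENCODING", None)
    else:
        env["PYTHONIOENCODING"] = choice
    return env


print(build_child_env({"PIPX_OUTPUT_ENCODING": "cp850"})["PYTHONIOENCODING"])  # cp850
```

Because the decision is keyed on a PIPX_-prefixed variable, a PYTHONIOENCODING value the user set for other purposes never changes what pipx passes to the child.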
