Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discussion: bandit linter showing high severity_score for tarfile library #3841

Closed
joydeep049 opened this issue Feb 19, 2024 · 8 comments
Closed

Comments

@joydeep049
Copy link
Contributor

joydeep049 commented Feb 19, 2024

In my recent commit to my PR #3543 bandit linter shows that the used library tarfile has high severity_score. However, I went through all the documentations of the repo and also of python.
I was not able to find any suitable method to extract tarfiles without using the tarfile library.
Even the utility functions that we have in async_utils.py use tarfile library as they call unpack_archive function from shutil.py which calls tarfile to unpack a tarfile.
Does anyone have any suggestions on how to tackle this problem?
@terriko @anthonyharrison @b31ngd3v @Rexbeast2

@ffontaine
Copy link
Contributor

Hello, unblob could be used to extract a wide variety of file formats including tar. unblob implemented a SafeTarFile class as tarfile is indeed vulnerable to path traversal (onekey-sec/unblob#459).

@joydeep049
Copy link
Contributor Author

Hello, unblob could be used to extract a wide variety of file formats including tar. unblob implemented a SafeTarFile class as tarfile is indeed vulnerable to path traversal (onekey-sec/unblob#459).

Thank you @ffontaine . Will try it in code and let you know how it works

@terriko
Copy link
Contributor

terriko commented Feb 20, 2024

A few things:

  • bandit is currently reporting even safe usage of the library (there's an open bug about that, or at least it was open when I checked last week. If we need to just update bandit someone could submit a pr for that.)
  • but there isn't an approved safe usage supported by all versions of python that we support
  • I've got a tentative setup in fix: use tarfile extract filters to open tarfiles more safely #3769 but it still doesn't cover everything and is missing a test at the moment.
  • Somewhat ironically, it would actually be safer to call tar as a subprocess, the way we used to do 🤦
  • If unblob works on windows we may need to integrate it into our extractor.py functions. I haven't looked at it yet so let me know how it works for you if you play with it.

That said, for PR #3543 I would suggest you just call our existing extractor functions rather than calling tarfile directly. That way, when extractor.py is fixed your code will also be fixed.

@joydeep049
Copy link
Contributor Author

joydeep049 commented Feb 20, 2024

A few things:

  • bandit is currently reporting even safe usage of the library (there's an open bug about that, or at least it was open when I checked last week. If we need to just update bandit someone could submit a pr for that.)
  • but there isn't an approved safe usage supported by all versions of python that we support
  • I've got a tentative setup in fix: use tarfile extract filters to open tarfiles more safely #3769 but it still doesn't cover everything and is missing a test at the moment.
  • Somewhat ironically, it would actually be safer to call tar as a subprocess, the way we used to do 🤦
  • If unblob works on windows we may need to integrate it into our extractor.py functions. I haven't looked at it yet so let me know how it works for you if you play with it.

That said, for PR #3543 I would suggest you just call our existing extractor functions rather than calling tarfile directly. That way, when extractor.py is fixed your code will also be fixed.

@terriko I took your advice and changed the code and committed .
Next, Can I work on adding a Fuzz Testing code to the DebParser?
Also, I'll look into unblob and if it works on windows or not.
(Although I just tried to install it in windows and showed some wheel-build related errors :) )

@joydeep049
Copy link
Contributor Author

joydeep049 commented Feb 20, 2024

Actually , since we are talking about extractors, I did come across one little thing.
The tar files inside the test.deb that we have in test/assets (now also in test/language_data) contains a control.tar.xz file, which our current tar extractor cannot read probably because it has the .xz extension as well.
What can we do about it ?
@terriko @ffontaine

@mastersans
Copy link
Contributor

Actually , since we are talking about extractors, I did come across one little thing. The tar files inside the test.deb that we have in test/assets (now also in test/language_data) contains a control.tar.xz file, which our current tar extractor cannot read probably because it has the .xz extension as well. What can we do about it ? @terriko @ffontaine

Hey @crazytrain328 i was recently working with extractors, we do have .tar.xz extractor it is handled with extract_file_tar() in extractor.py , there is whole list of file extension that are supported you can check out the onces that are useful to your issue.

@joydeep049
Copy link
Contributor Author

joydeep049 commented Feb 21, 2024

Hello @mastersans , Thanx for letting me know.
The function I had used to extract was the aio_unpack_archive(), which is basically an asynchronous wrapper around the shutil function unpack_archive().
We would want the above function to be able to extract tar.xz files as well right? @terriko
Also, @terriko Can you review this whenever you have time.
Thanx

@joydeep049
Copy link
Contributor Author

I'm gonna close this since it has already been resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants