Does this pyelftools support for multi thread handling? #465

ytxloveyou · 2023-04-06T01:52:39Z

When the input ELF files are very large, the processing time becomes very long. So I am wondering whether it could support multi-threaded or multi-process handling.

sevaa · 2023-04-11T13:52:05Z

Python is notoriously single threaded :)

Pyelftools tries to lazy-load where possible. What's the usage scenario exactly? ELF proper parsing, or DWARF? For one thing, the DWARF handling portions of the library tend to load sections as a whole, with no support for progressive loading.

ytxloveyou · 2023-04-11T15:36:24Z

OK, thanks for the feedback. I am trying to use it to reconstruct and analyze the variable types (structures, unions, and so on) from the ELF DWARF info. When the ELF file is smaller than 5 MB, it is fine. But if it is bigger, it can take 5-15 minutes to go through all the compile units and their contents. So I am wondering whether multi-process or multi-thread could help in this case.
If not, maybe I need to find out how to do it in C, or find some other way to run multiple processes at the same time.

The reason I raise this issue is that at our company (automotive products) we use the Vector toolchain (maybe you know it). It has a tool called the ASAP2 tool, which can analyze ELF files to generate symbols with types. It runs very fast. I really want to know how they achieve that.

sevaa · 2023-04-11T15:47:59Z

So it's DWARF. Have you timed the execution - which call exactly is taking 15 minutes?
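One way to answer the timing question is to run the workload under the standard-library profiler. The sketch below is only an illustration: `profile_call` and `walk_dwarf` are hypothetical helper names, and `walk_dwarf` assumes the workload is a plain walk over every DIE in every CU.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, report_text).

    The report lists the `top` entries sorted by cumulative time, which
    shows which call the minutes are actually spent in.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


def walk_dwarf(path):
    """Hypothetical workload: touch every DIE in every compile unit."""
    # Imported lazily so the profiling helper works without pyelftools.
    from elftools.elf.elffile import ELFFile

    count = 0
    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        for cu in dwarf.iter_CUs():
            for _die in cu.iter_DIEs():
                count += 1
    return count
```

Calling `profile_call(walk_dwarf, "firmware.elf")` (the file name is a placeholder) and printing the second element of the result should show whether the time goes into section loading, CU iteration, or DIE parsing.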

ytxloveyou · 2023-04-11T15:59:17Z

Just assume that there are more than 10000 compile units, so calling cu_iter might take 5 minutes. Then, for each compile unit, if I need to export all variables with their types, I need to loop through all the DT members to find the relevant type definition, and this might take 10 minutes in some cases. Something like that. Maybe I did something wrong.

sevaa · 2023-04-11T16:12:05Z

So off the top of my head, one thing you can try is iterating through DIEs smarter. Are you after all variables, or all static lifetime variables? Globals, static class members, or both? When it comes to modern DWARF, navigating to a sibling is a fast operation. You could try that instead of scrolling through all DIEs.
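A minimal sketch of the sibling-based walk, assuming only file-scope variables are wanted. It visits the direct children of each CU's top DIE through `iter_children()` instead of scanning every DIE with `iter_DIEs()`, so function bodies and nested scopes are skipped. The helper names are made up for illustration; variables inside namespaces and static class members are not direct children of the top DIE and would need an extra descent.

```python
def iter_top_level_variables(cu):
    """Yield DW_TAG_variable DIEs that are direct children of the CU's top DIE."""
    top_die = cu.get_top_DIE()
    # iter_children() steps from sibling to sibling, so the subtrees under
    # functions, structs, etc. are not walked here.
    for die in top_die.iter_children():
        if die.tag == "DW_TAG_variable":
            yield die


def variable_names(cu):
    """Return the names of the top-level variables in one compile unit."""
    names = []
    for die in iter_top_level_variables(cu):
        attr = die.attributes.get("DW_AT_name")
        if attr is not None:
            # pyelftools stores DW_AT_name as bytes.
            names.append(attr.value.decode("utf-8", errors="replace"))
    return names
```

With a `dwarf` object from `ELFFile(f).get_dwarf_info()`, this would be used as `for cu in dwarf.iter_CUs(): print(variable_names(cu))`.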

Iterating between compile units is not a long operation per se - there is a linked list-like data structure there, going from a CU to the next CU is fast. That said, the section needs to be loaded first, and that's a time consuming piece of I/O.
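On the original multi-process question, here is a rough sketch under those two observations: since stepping between CUs is cheap, each worker process can open the file itself, skip to its own share of CUs, and only walk the DIEs there. Each worker still pays the section-loading I/O, so whether this is a net win is untested. All function names are hypothetical, and DIE counting stands in for the real per-CU work.

```python
from concurrent.futures import ProcessPoolExecutor


def take_share(items, worker_index, worker_count):
    """Yield every worker_count-th item, starting at worker_index."""
    for i, item in enumerate(items):
        if i % worker_count == worker_index:
            yield item


def count_dies_in_share(path, worker_index, worker_count):
    """Worker: open the ELF independently and process only its own CUs."""
    from elftools.elf.elffile import ELFFile  # lazy import, per process

    total = 0
    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        for cu in take_share(dwarf.iter_CUs(), worker_index, worker_count):
            total += sum(1 for _ in cu.iter_DIEs())
    return total


def count_dies_parallel(path, worker_count=4):
    """Split the CUs round-robin across worker_count processes."""
    with ProcessPoolExecutor(max_workers=worker_count) as pool:
        futures = [
            pool.submit(count_dies_in_share, path, k, worker_count)
            for k in range(worker_count)
        ]
        return sum(f.result() for f in futures)
```

On platforms that spawn worker processes, `count_dies_parallel` has to be called from under an `if __name__ == "__main__":` guard. Type references that cross CU boundaries would still have to be resolved inside whichever worker encounters them.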

One more thing I thought of: you might be able to shave some time off the I/O by partially reimplementing get_dwarf_info() so that it doesn't load the miscellaneous sections. There is no built-in support for lazy loading of those (that I know of); you'd have to roll your own. It's a good idea for improvement, though. Depending on what kind of information you want to dump, loclists might be necessary.
