perf: Use serde to aid in loading docs #91

Merged
merged 2 commits into bergercookie:master from docs_parsing
Jul 8, 2024


WillLillis
Collaborator

@WillLillis WillLillis commented Jul 8, 2024

This PR addresses #75 by changing how data is loaded at server start. Previously, each XML file was parsed at server start into a vector of Instructions, Registers, or Directives. Now, the following takes place:

  • A separate binary (asm_docs_parsing) parses the XML files into vectors ahead of time, serializes them, and writes the serialized content to a separate file
  • At server start, the serialized file is deserialized, and the rest of the server's operation proceeds as normal
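The two-phase flow above can be sketched as follows. This is a minimal, dependency-free illustration rather than the PR's actual code: the `Instruction` struct is a simplified, hypothetical stand-in for the server's type, and a hand-written tab-separated encoding stands in for the serde-based serializer so the example compiles with `std` alone. All function names here are assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical, simplified stand-in for the server's `Instruction` type.
#[derive(Debug, Clone, PartialEq)]
struct Instruction {
    name: String,
    summary: String,
}

/// Build step: encode already-parsed docs, one tab-separated record per line.
/// Stand-in for a serde serializer; assumes fields contain no tabs or newlines.
fn serialize(docs: &[Instruction]) -> String {
    docs.iter()
        .map(|i| format!("{}\t{}\n", i.name, i.summary))
        .collect()
}

/// Server start: decode the pre-built contents. No XML parsing happens here.
/// Returns `None` if any record is malformed.
fn deserialize(data: &str) -> Option<Vec<Instruction>> {
    data.lines()
        .map(|line| {
            let (name, summary) = line.split_once('\t')?;
            Some(Instruction {
                name: name.to_string(),
                summary: summary.to_string(),
            })
        })
        .collect()
}

/// What the separate build-time binary would do after parsing the XML.
fn write_docs(path: &Path, docs: &[Instruction]) -> io::Result<()> {
    fs::write(path, serialize(docs))
}

/// What the server would do at start instead of parsing XML.
fn load_docs(path: &Path) -> io::Result<Vec<Instruction>> {
    let data = fs::read_to_string(path)?;
    deserialize(&data)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "malformed record"))
}
```

A sync test in the same spirit could parse the XML fresh, call `load_docs` on the checked-in serialized file, and assert that the two vectors are equal.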

Before marking this PR as ready, I'd like to do a thorough performance measurement and write some tests to ensure the serialized docs stay in sync with their XML counterparts.

Averaged over 10 trials each, the server now starts about 12% faster, which is a clear win. These gains should carry over as we add more architectures.
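The PR does not show how the timings were taken. One way to average a measurement over a fixed number of trials, using only `std`, might look like the sketch below. The helper names are hypothetical, and reading "12% faster" as a 12% reduction in elapsed time is an assumption.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `trials` times and returns the mean wall-clock duration.
fn mean_duration<F: FnMut()>(trials: u32, mut f: F) -> Duration {
    assert!(trials > 0, "need at least one trial");
    let mut total = Duration::ZERO;
    for _ in 0..trials {
        let start = Instant::now();
        f();
        total += start.elapsed();
    }
    total / trials
}

/// Speedup factor of `after` relative to `before` (2.0 means twice as fast).
fn speedup(before: Duration, after: Duration) -> f64 {
    before.as_secs_f64() / after.as_secs_f64()
}

/// Percentage reduction in elapsed time going from `before` to `after`.
fn percent_faster(before: Duration, after: Duration) -> f64 {
    (1.0 - after.as_secs_f64() / before.as_secs_f64()) * 100.0
}
```

Calling `mean_duration(10, || { /* load docs */ })` once for the XML path and once for the serialized path would give the two durations to pass to `speedup` or `percent_faster`.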

Some initial per-file measurements are also promising:

  • The x86 register set showed a 1.88x loading time improvement
  • The x86 instruction set showed a 1.69x loading time improvement
  • Raw and serialized data file sizes are all in the same ballpark, with an aggregate reduction of about 737 KB (736,997 bytes). See below for details:
| | x86 Instructions | x86_64 Instructions | z80 Instructions | x86 Registers | x86_64 Registers | z80 Registers | Gas Directives |
|---|---|---|---|---|---|---|---|
| Raw Size (bytes) | 4882695 | 5241289 | 117347 | 55149 | 101735 | 5047 | 57010 |
| Serialized Size (bytes) | 4557857 | 4754451 | 177181 | 58589 | 111965 | 6125 | 57107 |
| Change (bytes) | -324838 | -486838 | +59834 | +3440 | +10230 | +1078 | +97 |
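As a cross-check of the aggregate figure, the per-file sizes from the table can be summed directly. The sketch below is illustrative only; the constant and function names are made up for this example, while the numbers are copied from the table.

```rust
/// (label, raw bytes, serialized bytes), copied from the table above.
const SIZES: [(&str, i64, i64); 7] = [
    ("x86 Instructions", 4_882_695, 4_557_857),
    ("x86_64 Instructions", 5_241_289, 4_754_451),
    ("z80 Instructions", 117_347, 177_181),
    ("x86 Registers", 55_149, 58_589),
    ("x86_64 Registers", 101_735, 111_965),
    ("z80 Registers", 5_047, 6_125),
    ("Gas Directives", 57_010, 57_107),
];

/// Net change in bytes (serialized minus raw) across all files.
fn total_change(sizes: &[(&str, i64, i64)]) -> i64 {
    sizes.iter().map(|&(_, raw, ser)| ser - raw).sum()
}

/// Net change as a percentage of the total raw size.
fn percent_change(sizes: &[(&str, i64, i64)]) -> f64 {
    let raw_total: i64 = sizes.iter().map(|&(_, raw, _)| raw).sum();
    total_change(sizes) as f64 / raw_total as f64 * 100.0
}
```

`total_change(&SIZES)` evaluates to -736997 bytes, roughly a 7% reduction overall. The two large instruction files account for the entire saving; the five smaller files each grow slightly.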

Closes #75

@WillLillis WillLillis mentioned this pull request Jul 8, 2024
13 tasks
@WillLillis WillLillis force-pushed the docs_parsing branch 4 times, most recently from 055eb52 to 74bb4b1 Compare July 8, 2024 21:56
@WillLillis WillLillis marked this pull request as ready for review July 8, 2024 22:03
@WillLillis WillLillis merged commit 4d54f6b into bergercookie:master Jul 8, 2024
15 checks passed
@WillLillis WillLillis deleted the docs_parsing branch July 8, 2024 22:08
@lu-zero

lu-zero commented Jul 8, 2024

If you store it as a bincode, it could lead to even bigger gains.

@WillLillis
Collaborator Author

WillLillis commented Jul 9, 2024

> If you store it as a bincode, it could lead to even bigger gains.

Thanks, wasn't aware of that crate!

#93

Development

Successfully merging this pull request may close these issues.

Use serde for deserializing the data sources
2 participants