Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request for Feedback: Improving Startup Latency in X11 #16033

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Sewer56
Copy link
Contributor

@Sewer56 Sewer56 commented Jun 15, 2024

What does the pull request do?

This pull request reduces the latency required to display the first Window when running Avalonia under the X11 backend.

What is the updated/expected behavior with this PR?

On my machine, running Hyprland and a 5900X, a blank Avalonia project compiled with NativeAOT previously took 80-105ms from running AppBuilder.Configure to running Initialize in App.axaml.cs.

This PR improves it to 66-81ms or 40-51ms depending on commit used.

How was the solution implemented (if it's not obvious)?

Startup is largely bottlenecked on X11 by the following methods:

  • X11CursorFactory__ctor
  • InitializeGraphics(options, Info)

During the backend initialization.
These two methods alone take around half the startup time.

Because these methods don't depend on each other's inputs/outputs,
and have approximately the same runtime, they can be ran in parallel
to speed up booting.

Note: We call XInitThreads, which enables concurrent access to X11.

Two Implementations

There are two implementations in this PR, pick the preferred commit.

Baseline: 80-105ms

  • cf30995 creates the cursors in parallel to graphics, resulting in 66-81ms (0.825x the time).
  • bfb2ef2 instead lazily creates the X11 cursors as needed, resulting in 40-51ms startup times. (0.5x the time).

My lower end laptop with 4 threads yielded similar results.

Request for Feedback

Note

This concerns the original parallel implementation
PR was updated to instead make the lazy implementation the current one, since I believe it may be preferable.

I would like to know if this could potentially be further improved.
Oddly, I was expecting slightly greater savings, something approximately equal to not calling X11CursorFactory at all.

Although I am seeing an improvement, the improvement is not as large as I expected it to be. I'm not sure if this is due to some sort of resource locking in the XWayland implementation powering my desktop behind the scenes, or whether it's due to another reason.

I did investigate the latency with which the threadpool tasks were immediately started; starting a dedicated thread, etc. And it seems to be somewhere in well, not Avalonia code.

Why?

I thought Startup could be faster.
So I made it a bit faster.

I think this can even be further improved by starting the processing right at the call to UseX11(this AppBuilder builder), but I can't see that passing review, as that should be used for configuration, not execution.

@avaloniaui-bot
Copy link

You can test this PR using the following package version. 11.2.999-cibuild0049056-alpha. (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json) [PRBUILDID]

@cla-avalonia
Copy link
Collaborator

cla-avalonia commented Jun 15, 2024

  • All contributors have signed the CLA.

@Sewer56
Copy link
Contributor Author

Sewer56 commented Jun 15, 2024

@cla-avalonia agree

@maxkatz6 maxkatz6 requested a review from kekekeks June 15, 2024 17:33
var cursorShapes = GetAllCursorShapes();
_cursors = new Dictionary<CursorFontShape, IntPtr>(cursorShapes.Length);
foreach (var shape in cursorShapes)
_cursors[shape] = XLib.XCreateFontCursor(_display, shape);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is XCreateFontCursor a bottleneck? If so, I think this call could be done lazily on GetCursor.

Copy link
Contributor Author

@Sewer56 Sewer56 Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a viable strategy aswell.

Without initializing the cursors at all, startup is around 50ms.
Hence why I noted the savings were less than what I expected, and there might have been some contention.

It might actually be better than the current solution; as the user is unlikely to immediately jump in and use all cursor styles. Let me make a quick commit to test this.


As for the force pushes, that was just me fixing up spacing as to not make unnecessary changes in the diff, aligning myself with the contributors guide.

Copy link
Contributor Author

@Sewer56 Sewer56 Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some testing, I've it seems that lazy initialization for cursors is probably the better approach.
It gets a startup time of 40-51ms on my machine, so around twice as fast than the baseline.
Loading a previously uncreated cursor seems to be less than 1ms, so no horrific frame spikes to be found.

I originally thought XCreateFontCursor was doing a file read under the hood, since I was seeing fgets in the stack trace, which is where I got the idea of sending that to background came from. But even if it was, that data would be cached one way or another, so the read would be fast.

I've committed bfb2ef2 which achieves that as the current commit and updated the opening post. The previous PR is still available in the branch history.

@Sewer56 Sewer56 force-pushed the improve-x11-startup-aot branch 3 times, most recently from d9a2ff6 to 2c635b8 Compare June 15, 2024 19:52
@avaloniaui-bot
Copy link

You can test this PR using the following package version. 11.2.999-cibuild0049070-alpha. (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json) [PRBUILDID]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants