Nexus segfault when there are offline CPUs #31

Open
yilongli opened this issue Jun 27, 2019 · 4 comments · May be fixed by #36

Comments

@yilongli

I got a segfault when running the create_session_test at the following line:

get_lcores_for_numa_node(numa_node).at(sm_thread_lcore_index));

The problem is that sm_thread_lcore_index is set to the last lcore at line 61 without checking whether that lcore is online, while get_lcores_for_numa_node returns only online lcores, so the index can be out of range for the returned vector.
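
A minimal sketch of one possible fix, assuming the session management thread only needs some online lcore on the node: derive the choice from the online list itself instead of from the total lcore count. `pick_sm_thread_lcore` is a hypothetical helper, not an existing eRPC function, and `online_lcores` stands in for the vector returned by get_lcores_for_numa_node.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of eRPC): choose the lcore for the session
// management thread from the *online* lcores of one NUMA node.
size_t pick_sm_thread_lcore(const std::vector<size_t> &online_lcores) {
  if (online_lcores.empty()) {
    throw std::runtime_error("No online lcores on this NUMA node");
  }
  // Use the last online lcore, so the lookup can never go out of range.
  return online_lcores.back();
}
```

With this approach the result depends only on the lcores that are actually online, so offline CPUs no longer affect the lookup.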

@yilongli
Author

BTW, the code in numautils.h seems to assume that every NUMA node has the same number of (online) lcores, which is quite fragile.
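
To illustrate what a less fragile version could look like, here is a rough sketch that builds each node's online lcore list separately, so nodes may end up with different counts. It assumes Linux-style CPU list strings such as "0-3,8,10-11"; `parse_cpu_list` and `online_lcores_for_node` are hypothetical names and this is not how numautils.h is actually implemented.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parse a Linux-style CPU list such as "0-3,8,10-11" into individual CPU ids.
std::vector<size_t> parse_cpu_list(const std::string &list) {
  std::vector<size_t> cpus;
  std::stringstream ss(list);
  std::string token;
  while (std::getline(ss, token, ',')) {
    // Skip empty tokens and a bare trailing newline.
    if (token.find_first_of("0123456789") == std::string::npos) continue;
    const size_t dash = token.find('-');
    const size_t first = std::stoul(token.substr(0, dash));
    const size_t last = (dash == std::string::npos)
                            ? first
                            : std::stoul(token.substr(dash + 1));
    for (size_t cpu = first; cpu <= last; cpu++) cpus.push_back(cpu);
  }
  return cpus;
}

// Keep only the CPUs of one NUMA node that are also online.
std::vector<size_t> online_lcores_for_node(const std::string &node_cpu_list,
                                           const std::string &online_cpu_list) {
  const std::vector<size_t> online = parse_cpu_list(online_cpu_list);
  const std::set<size_t> online_set(online.begin(), online.end());
  std::vector<size_t> result;
  for (size_t cpu : parse_cpu_list(node_cpu_list)) {
    if (online_set.count(cpu) > 0) result.push_back(cpu);
  }
  return result;
}
```

On Linux, the two input strings could be read from /sys/devices/system/node/node<N>/cpulist and /sys/devices/system/cpu/online. Callers would then use the size of each node's own vector rather than a single per-node count.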

@anujkaliaiitd
Collaborator

Hi, Yilong. The approach suggested in this issue would be nice to have in eRPC. Machines with offline CPUs are uncommon IMO, so this is a low-priority task for us. We would welcome a patch.

As a temporary workaround, you might hard-code the core for the session management thread. Or, you might delete the core pinning for this thread altogether. The session management thread has near-zero CPU use when sessions aren't being actively created or destroyed, so my hope is that disabling core pinning won't affect performance.
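
Both workarounds can be sketched with one small helper, assuming Linux with glibc: pass a fixed core to hard-code the pinning, or pass a sentinel to skip pinning entirely. `pin_current_thread` and `kNoPinning` are hypothetical names for illustration, not eRPC code.

```cpp
#include <sched.h>

#include <cstddef>

// Sentinel meaning "do not pin"; a hypothetical name, not an eRPC constant.
constexpr size_t kNoPinning = static_cast<size_t>(-1);

// Pin the calling thread to `lcore`, or leave it unpinned if
// lcore == kNoPinning. Returns true on success (including the no-op case)
// and false if the core cannot be used.
bool pin_current_thread(size_t lcore) {
  if (lcore == kNoPinning) return true;  // Let the OS scheduler place the thread
  // Reject ids that do not fit in a fixed-size cpu_set_t.
  if (lcore >= static_cast<size_t>(CPU_SETSIZE)) return false;

  cpu_set_t cpuset;
  CPU_ZERO(&cpuset);
  CPU_SET(lcore, &cpuset);
  // A pid of 0 applies the affinity mask to the calling thread.
  return sched_setaffinity(0, sizeof(cpuset), &cpuset) == 0;
}
```

The session management thread would call `pin_current_thread(kNoPinning)` to disable pinning, or pass a core id known to be online on the machine. Pinning to an offline core fails and returns false instead of crashing.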

@yilongli
Author

I had hyperthreading turned off, so half of the CPUs were offline. I agree that machines with offline CPUs are rare in production, but this setup is quite convenient for running experiments. Anyway, I might submit a patch if this becomes more of a problem for me. Thanks.

@anujkaliaiitd
Collaborator

Ah - I didn't think of the HT-disabled case. That's a scenario that we would like to support.
