Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generator: Don't reload udev in the generator #304

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

daniloegea
Copy link
Collaborator

@daniloegea daniloegea commented Dec 16, 2022

The systemd documentation states that any kind of process communication shouldn't happen inside generators. It can cause problems like the one reported in LP#1999178:

Generators are run very early at boot and cannot rely on any external services. They may not talk to any other process. That includes simple things such as logging to syslog(3), or systemd itself (this means: no systemctl(1))!

Ubuntu IS also affected by the same issue! We are observing long pauses during upgrades from Lunar to Mantic due to a package (usbmuxd) installing a udev rules file referring to a user that will be created late in the upgrade process. Every daemon-reload will stuck for 2 minutes, and it's happening many times. While the workaround for this specific case is easy (create the user earlier), users might run in to situations that's out of our control.

This change moves the call to the udevadm command from the generator to the Python code. To keep the behavior as close to the previous one as possible, the call is made as part of the generate command.

We could simply check if the generator is being called as a systemd generator and don't call udevadm, but in the end the binary still is a generator so we shouldn't even have that logic there.

The same is true for the calls we make to systemd using systemctl. Even though they are not executed when the binary is called as a generator, they probably shouldn't be there.

UPDATE:

I'm also proposing that we stop using generators to generate configuration. There are 2 main reasons for that:

  1. According to systemd.generator(7), it's just wrong: Generators should only be used to generate unit files, .d/*.conf drop-ins for them and symlinks to them, not any other kind of non-unit related configuration.
  2. Every daemon-reload will run the generator and regenerate all the configuration. It's a lightweight operation if one doesn't have a lot of YAMLs, but that's totally unnecessary.

This PR also includes an experimental systemd service that will call the generate binary before running systemd-networkd, Network Manager and cloud-init. While I'm not completely sure it's correct, it appears to work fine.

I have a PPA for Mantic containing these changes here https://launchpad.net/~danilogondolfo/+archive/ubuntu/netplan.io/+packages

Description

Checklist

  • Runs make check successfully.
  • Retains 100% code coverage (make check-coverage).
  • New/changed keys in YAML format are documented.
  • (Optional) Adds example YAML for new feature.
  • (Optional) Closes an open bug in Launchpad.

@daniloegea daniloegea added the RFC Request for comment (don't merge yet) label Dec 16, 2022
@daniloegea daniloegea marked this pull request as draft December 16, 2022 11:29
@jpsutton
Copy link

jpsutton commented Jan 3, 2023

I rebuilt the netplan pkg for ArchLinux with this PR applied and it resolved the issue for me.

@slyon
Copy link
Collaborator

slyon commented Jan 5, 2023

Just a note (not a full review, yet): We need to double check if this still works for LP#1669564 (f4b0c54)

@daniloegea
Copy link
Collaborator Author

These changes break the new dbus integration tests:

test_dbus_config_apply (__main__.TestNetworkd.test_dbus_config_apply) ... FAIL
test_dbus_config_get (__main__.TestNetworkd.test_dbus_config_get) ... ok
test_dbus_config_set (__main__.TestNetworkd.test_dbus_config_set) ... ok

======================================================================
FAIL: test_dbus_config_apply (__main__.TestNetworkd.test_dbus_config_apply)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/autopkgtest.iwnz9O/build.JXT/real-tree/tests/integration/dbus.py", line 202, in test_dbus_config_apply
    self.assertEqual(out.returncode, 0, msg=f"Busctl Apply() failed with error: {out.stderr}")
AssertionError: 1 != 0 : Busctl Apply() failed with error: Call failed: netplan apply failed: Child process exited with code 1
stdout: ''
stderr: 'Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 56, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 244, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 63, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 244, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 123, in command_apply
    if run_generate and subprocess.call(generator_call) != 0:
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 389, in call
    with Popen(*popenargs, **kwargs) as p:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'netplan'
'


----------------------------------------------------------------------
Ran 3 tests in 2.343s

FAILED (failures=1)
autopkgtest [15:05:08]: test dbus: -----------------------]
autopkgtest [15:05:09]: test dbus:  - - - - - - - - - - results - - - - - - - - - -
dbus                 FAIL non-zero exit status 1

@slyon
Copy link
Collaborator

slyon commented Apr 24, 2023

I wonder if we could do something similar to #162

Instead of calling udevadm syncronoulsy, we could consider deploying a new "netplan-udev-reload.service" unit, which executes something like udevadm control --reload && udevadm trigger --attr-match=subsystem=net as a OneShot unit.

From the generator we could then eque this "oneshot" command by calling systemctl start --no-block --no-ask-password [--fail] netplan-udev-reload.service (or maybe use systemd-run ...?), which should make it be executed after the current "daemon-reload" transaction is done, while still executing our commands.

@daniloegea daniloegea force-pushed the udev_reload branch 2 times, most recently from 9d631f1 to 1104634 Compare August 29, 2023 14:56
@daniloegea daniloegea force-pushed the udev_reload branch 2 times, most recently from 676bb2d to 0572baf Compare September 5, 2023 17:07
daniloegea and others added 4 commits September 6, 2023 10:40
The systemd documentation states that any kind of process communication
shouldn't happen inside generators. It can cause problems like the one
reported in LP#1999178.

Ubuntu seems to not be affected by the same issue yet, but as the
problem happens due to a race condition we might run into it in the
future as well.

This change moves the call to the udevadm command from the generator to
the Python code. To keep the behaviour as close to the previous one as
possible, the call is made as part of the generate command.

We could simply check if the generator is being called as a systemd
generator and don't call udevadm, but in the end the binary still is a
generator so we shouldn't even have that logic there.

The same is true for the calls we make to systemd using systemctl. Even
though they are not executed when the binary is called as a generator,
they probably shouldn't be there.
slyon added a commit to slyon/netplan that referenced this pull request Jul 4, 2024
The udev rules directories are monitored and re-loaded automatically with
modern systemd-udevd. No need to manually reload them in the generator,
causing side-effects.

We can still force-reload them when issuing 'netplan generate' or
'netplan apply' through the CLI, just to be on the safe side.

Replaces: canonical#304
slyon added a commit to slyon/netplan that referenced this pull request Jul 4, 2024
The udev rules directories are monitored and re-loaded automatically with
modern systemd-udevd. No need to manually reload them in the generator,
causing side-effects.

We can still force-reload them when issuing 'netplan generate' or
'netplan apply' through the CLI, just to be on the safe side.

Replaces: canonical#304
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RFC Request for comment (don't merge yet)
Projects
None yet
3 participants