Controller seems to get stuck with queued actions when deleting namespaces ungracefully (Removing finalizers) #3755

Open
WyWyFe opened this issue Jun 26, 2024 · 0 comments

WyWyFe commented Jun 26, 2024

Describe the bug
When a K8s namespace that contains ingress objects managed by the AWS Load Balancer Controller (ALBC) is deleted ungracefully, the ALBC keeps trying to reconcile ingress objects from the namespace that no longer exists for up to 16m40s, as per targetgroupbinding-max-exponential-backoff-delay.

This delays the provisioning of other ingresses, because the ALBC appears to be stuck trying to reconcile ingress objects in the namespace that no longer exists.
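To make the 16m40s figure concrete, here is a minimal sketch of capped per-item exponential backoff. It assumes a 5 ms base delay that doubles on every consecutive failure (the client-go default for per-item backoff; the base delay the ALBC actually uses is not stated in this issue) and a cap of 1000 seconds, which is 16m40s. The function names are hypothetical and are not part of the controller.

```python
from datetime import timedelta

# Assumptions: 5 ms base delay (client-go default, not confirmed for the ALBC)
# and a 1000 s cap, i.e. the 16m40s reported in this issue.
BASE_DELAY_MS = 5
MAX_DELAY_MS = 1_000_000


def backoff_delay(failures: int, base_ms: int = BASE_DELAY_MS,
                  cap_ms: int = MAX_DELAY_MS) -> timedelta:
    """Delay before the next retry after `failures` consecutive failures."""
    if failures < 1:
        raise ValueError("failures must be >= 1")
    # The delay doubles with each failure and is clamped to the cap.
    return timedelta(milliseconds=min(base_ms * 2 ** (failures - 1), cap_ms))


def failures_until_cap(base_ms: int = BASE_DELAY_MS,
                       cap_ms: int = MAX_DELAY_MS) -> int:
    """Smallest number of consecutive failures whose delay reaches the cap."""
    if base_ms <= 0:
        raise ValueError("base_ms must be positive")
    n = 1
    while base_ms * 2 ** (n - 1) < cap_ms:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 5, 10, 15, failures_until_cap()):
        print(f"failure {n:>2}: retry in {backoff_delay(n)}")
```

Under these assumptions the cap is reached on the 19th consecutive failure, after which every further retry of the same item waits the full 16m40s.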

NOTE:
Deleting the namespace and letting it gracefully delete all of its objects before the namespace itself terminates does not cause this issue. It occurs only when the finalizers of the namespace are removed so that the namespace is removed immediately.

Steps to reproduce

  1. Delete a namespace containing ingress objects that are related to target group bindings for an ALB provisioned by the ALBC.
  2. Remove the finalizers from the namespace so that it is removed immediately.
  3. Observe logs from the ALBC stating that it is trying to reconcile an ingress object from the namespace, which no longer exists.
  4. Create a new ingress object in an existing namespace and observe that the ingress is not provisioned until the 16m40s period for targetgroupbinding-max-exponential-backoff-delay elapses.

This greatly slows ALB reconciliation, as the ALBC seems to be stuck processing items in its queue, such as the ingress objects that belonged to the ungracefully terminated namespace.
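For step 2 above, one way to remove a namespace's finalizers is to PUT the namespace object with an empty `spec.finalizers` list to the namespace `finalize` subresource. The sketch below only builds the request path and body from a namespace object (for example, the JSON printed by `kubectl get namespace <name> -o json`). The helper name `finalize_request` and the namespace name `demo` are hypothetical, and the issue does not say which method was actually used.

```python
import copy
import json
from typing import Tuple


def finalize_request(namespace_obj: dict) -> Tuple[str, str]:
    """Return (path, body) for a PUT that clears a namespace's finalizers."""
    name = namespace_obj["metadata"]["name"]
    stripped = copy.deepcopy(namespace_obj)  # leave the caller's object untouched
    stripped.setdefault("spec", {})["finalizers"] = []
    return f"/api/v1/namespaces/{name}/finalize", json.dumps(stripped)


if __name__ == "__main__":
    # Hypothetical namespace that is stuck in Terminating.
    ns = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": "demo"},
        "spec": {"finalizers": ["kubernetes"]},
    }
    path, body = finalize_request(ns)
    print(path)
    print(body)
```

The resulting body can be saved to a file and sent with `kubectl replace --raw <path> -f <file>`.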

Expected outcome
If ingress objects belong to a namespace that no longer exists, the reconciler should skip them from then on instead of applying the exponential backoff delay logic from targetgroupbindings, and move straight to the next action in the queue (such as creating a new ALB for a new ingress object).
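The requested behaviour can be sketched as a small decision function. This is a simplified model, not the controller's code: `IngressKey`, `Outcome`, `decide`, and the two callbacks are hypothetical names introduced for illustration. The idea is to check for the namespace first and drop the item when it is gone, so that only genuine reconcile failures are retried with backoff.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    DONE = "done"                        # reconciled, forget the item
    DROP = "drop"                        # namespace is gone, forget without retrying
    REQUEUE_WITH_BACKOFF = "requeue"     # real failure, retry later


@dataclass(frozen=True)
class IngressKey:
    namespace: str
    name: str


def decide(key: IngressKey,
           namespace_exists: Callable[[str], bool],
           reconcile: Callable[[IngressKey], None]) -> Outcome:
    """Decide what to do with one queued ingress key."""
    if not namespace_exists(key.namespace):
        # Nothing left to reconcile: skip instead of entering backoff.
        return Outcome.DROP
    try:
        reconcile(key)
    except Exception:
        return Outcome.REQUEUE_WITH_BACKOFF
    return Outcome.DONE


if __name__ == "__main__":
    live_namespaces = {"default"}
    keys = [IngressKey("deleted-ns", "old"), IngressKey("default", "new")]
    for key in keys:
        result = decide(key, live_namespaces.__contains__, lambda k: None)
        print(key, result.name)
```

With this ordering, the key from the deleted namespace is dropped on its first attempt and the key in the existing namespace is reconciled without waiting.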

Environment

  • AWS Load Balancer controller version v2.5.4
  • Kubernetes version 1.27
  • Using EKS (yes/no), if so version? Yes, k8s 1.27

Additional Context:
Creating this issue as discussed offline with colleague @oliviassss
Thanks team! :)
