
Issue with StatefulSet Rolling Update Strategy #180

Open · amacciola opened this issue Jul 21, 2022 · 10 comments

amacciola commented Jul 21, 2022

Precursor:

Currently all of our applications are deployed as StatefulSets rather than Deployments. The current UpdateStrategy of our StatefulSets is RollingUpdate. Here is an explanation of what it does and of the other option we have:
[Screenshot: Kubernetes documentation describing the RollingUpdate and OnDelete update strategies for StatefulSets]

Issue:

The combination of Rolling Updates and libcluster makes it so that we can never add new services to the libcluster/horde registry, because we end up with the following sequence (a rough sketch of the topology config for this kind of setup is included after the list for context):

  1. 3 pods running, let's say all on version 1
  2. Version 1 has libcluster/horde running, but only Genserver_A is registered to it
  3. Then we trigger an update to this env with version 2
  4. In version 2 we have added a Genserver_B that also gets registered in the libcluster/horde registry
  5. The rolling update will start with pod 2 (out of 0, 1, 2), and it will not update the other pods to the new version until pod 2 is up and running
  6. However, when pod 2 starts, libcluster detects pods 0 and 1 using the k8s labels and IPs, and pod 2 tries to register Genserver_B
  7. But pods 0 and 1 do not have the code for Genserver_B yet, so pod 2 crashes because it cannot start Genserver_B on the pod it is trying to start it on
  8. And the rolling update never proceeds to the other pods because pod 2 never comes up successfully
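
For context, the libcluster side of this is shaped roughly like the topology below. This is only an illustrative sketch: the selector, namespace, and node basename are placeholders, not our exact configuration.

    # Illustrative libcluster topology for a StatefulSet-based deployment.
    # The selector, namespace, and node basename below are placeholders.
    config :libcluster,
      topologies: [
        cogynt_k8s: [
          strategy: Cluster.Strategy.Kubernetes,
          config: [
            mode: :ip,
            kubernetes_namespace: "default",
            kubernetes_selector: "app=cogynt",
            kubernetes_node_basename: "cogynt"
          ]
        ]
      ]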

Or at least that is what I think is happening here. For the most part I believe I have the issue right, and the error message on the pod that is crashing is:

        ** (EXIT) an exception was raised:
            ** (UndefinedFunctionError) function Cogynt.Servers.Workers.CustomFields.start_link/1 is undefined or private
                (cogynt 0.1.0) Cogynt.Servers.Workers.CustomFields.start_link([name: {:via, Horde.Registry, {Cogynt.Horde.HordeRegistry, Cogynt.Servers.Workers.CustomFields}}])
                (horde 0.8.7) lib/horde/processes_supervisor.ex:766: Horde.ProcessesSupervisor.start_child/3
                (horde 0.8.7) lib/horde/processes_supervisor.ex:752: Horde.ProcessesSupervisor.handle_start_child/2
                (stdlib 3.17) gen_server.erl:721: :gen_server.try_handle_call/4
                (stdlib 3.17) gen_server.erl:750: :gen_server.handle_msg/6
                (stdlib 3.17) proc_lib.erl:226: :proc_lib.init_p_do_apply/3

Even though I know the version on that pod has the code for Cogynt.Servers.Workers.CustomFields.start_link, so it must be referring to one of the other 2 pods that have not received the new version yet.

Has anyone else ever run into this problem?

@AndrewDryga

I think you would have the same issue even if you used a Deployment; the new pods would crash-loop anyway.

The way we solved this: we have our own implementation of the Kubernetes strategy that polls the k8s API and only joins nodes of the same version (taken from a version label) into the cluster. This makes it impossible to hand off state between application versions, but it makes sure that code that was never tested to co-exist in a cluster does not end up crashing in production.

@amacciola

@AndrewDryga do you think this custom k8s strategy is worth a PR, or could it be shared? Because I wonder how more people are not running into this same issue. Is everyone else using this library only ever deploying libcluster once and then never adding new features to its registry from that point forward?

@AndrewDryga

@amacciola the problem with our strategy is that it is very opinionated (it uses specific labels named for our environment, node names from k8s labels, etc.). I will think about open-sourcing it, but it's a relatively easy change: just leverage labelSelector in the get_nodes/4 callback implementation and query only the pods of a specific version.
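
Roughly, the shape of the change is something like this (a hypothetical sketch: the :myapp app name and the "app.kubernetes.io/version" label are placeholders, your deployment may set different ones):

    # Hypothetical sketch: narrow the selector so the K8s API only returns pods
    # that run the same release version as this node. The :myapp app name and
    # the "app.kubernetes.io/version" label are assumptions, not a real setup.
    version =
      :myapp
      |> Application.spec(:vsn)
      |> to_string()

    selector = "#{selector},app.kubernetes.io/version=#{version}"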

@amacciola

@AndrewDryga okay, I will try this. So if I am trying to extend the k8s DNS strategy here:
https://github.com/bitwalker/libcluster/blob/main/lib/strategy/kubernetes_dns.ex

you are suggesting that I need to tweak the get_nodes function here:

    defp get_nodes(%State{topology: topology, config: config}) do

to additionally query for a specific version, or at least only for matching version numbers?

@AndrewDryga

@amacciola you can't extract that information from the DNS server; instead, you should modify that function in lib/strategy/kubernetes.ex. The K8s API returns a lot of information about each pod, including the labels, which is where you need to store the version.

@amacciola

@AndrewDryga I see. So it's changing

    path =
      case ip_lookup_mode do
        :endpoints -> "api/v1/namespaces/#{namespace}/endpoints?labelSelector=#{selector}"
        :pods -> "api/v1/namespaces/#{namespace}/pods?labelSelector=#{selector}"
      end

    headers = [{'authorization', 'Bearer #{token}'}]
    http_options = [ssl: [verify: :verify_none], timeout: 15000]

    case :httpc.request(:get, {'https://#{master}/#{path}', headers}, http_options, []) do
      {:ok, {{_version, 200, _status}, _headers, body}} ->
        parse_response(ip_lookup_mode, Jason.decode!(body))
        |> Enum.map(fn node_info ->
          format_node(
            Keyword.get(config, :mode, :ip),
            node_info,
            app_name,
            cluster_name,
            service_name
          )
        end)

so I should include additional params so that it only returns info for pods with a certain version?
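
i.e. something along these lines (a hypothetical tweak; the "version" label name and the app_version variable are placeholders for however the version actually gets exposed):

    # Hypothetical tweak: append a version constraint to the labelSelector so
    # the K8s API only returns pods of the current release.
    version_selector = "#{selector},version=#{app_version}"

    path =
      case ip_lookup_mode do
        :endpoints -> "api/v1/namespaces/#{namespace}/endpoints?labelSelector=#{version_selector}"
        :pods -> "api/v1/namespaces/#{namespace}/pods?labelSelector=#{version_selector}"
      end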

@AndrewDryga

@amacciola yes, you want to query for pods and return only the ones that match your current version

@amacciola

@AndrewDryga I am working on testing this new strategy out now, so thanks for the insight. But I just wanted to make sure I understand how some of the libcluster code, combined with the Horde registry code, works under the hood.

Say we have 3 pods running for the same application, and each of these pods has, let's say, server_1 registered in the HordeRegistry:

  • pod_1 with version_1
  • pod_2 with version_1 -> running server_1
  • pod_3 with version_1

We then trigger an update to version_2, which contains a new service, server_2, that gets registered in the Horde registry. Let's say the update starts with pod_3. When pod_3 comes online with the new version, it finds pod_1 and pod_2.

Does it just pick one of the 3 pods to try to start the new service on? So there would be a 2-in-3 chance it tries to start the new service on a pod with version_1 rather than version_2?

@AndrewDryga

I'm not using Horde, but the pods with version 1 would not see pods with version 2 in the Erlang cluster, so basically, within each of the islands (one per version), everything behaves like a cluster with a single codebase. If you have globally unique jobs, it also means that you will have two of the workers started (one per island).

@amacciola

For now I have just created a separate HordeRegistry for each GenServer that we want to leverage the libcluster strategies with. As long as we don't have too many, it's a minor annoyance that works around this issue.
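
Roughly, the supervision tree ends up with one Horde registry (and supervisor) per GenServer. A minimal sketch, assuming Horde 0.8 child specs and with illustrative module names rather than our real ones:

    # Illustrative sketch only: one Horde registry/supervisor pair per GenServer.
    # Module names are placeholders; `members: :auto` lets Horde manage
    # membership across the connected nodes automatically.
    children = [
      {Horde.Registry, name: Cogynt.HordeRegistry.ServerA, keys: :unique, members: :auto},
      {Horde.DynamicSupervisor, name: Cogynt.HordeSupervisor.ServerA, strategy: :one_for_one, members: :auto},
      {Horde.Registry, name: Cogynt.HordeRegistry.ServerB, keys: :unique, members: :auto},
      {Horde.DynamicSupervisor, name: Cogynt.HordeSupervisor.ServerB, strategy: :one_for_one, members: :auto}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: Cogynt.Supervisor)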
