Releases: nebuly-ai/nos

0.1.2

06 Apr 10:32

Release notes nos 0.1.2

This is a minor release fixing a bug affecting Dynamic MPS Partitioning.

Changelog

  • 🔨 Fix bug in MPS nodes initialization

0.1.1

03 Apr 13:57

Release notes nos 0.1.1

This is a minor release fixing some bugs and adding minor improvements.

Changelog

  • 🔨 Fix typo in Helm chart preventing the mount of the scheduler config into the GPU Partitioner (thanks @nickpetrovic!)
  • 🔨 Fix bug preventing Dynamic MIG Partitioning from working correctly on multi-GPU nodes (thanks to @likku123 and @WindowsXp-Beta for their help detecting and troubleshooting the issue!)
  • ✨ Initialize the GPUs of the MIG-enabled nodes with the largest available MIG profile
  • ✨ Include NVIDIA-A100-SXM4-80GB and NVIDIA-A100-SXM4-40GB models in known MIG geometries
  • ✨ Update documentation to include k8s and CUDA version constraints

Contributors

@Telemaco019
@nickpetrovic
@5cat

0.1.0

30 Jan 10:46
d6c98c6

Release notes nos 0.1.0

This is the first major release of nos, the Nebuly Operating System. It implements two main features:

  • ✂️ Dynamic GPU Partitioning: think of this as a cluster autoscaler for GPUs. Instead of scaling up the number of nodes and GPUs, it dynamically partitions the existing GPUs so that each workload uses only the GPU resources it actually needs, leaving spare GPU capacity for other workloads. To partition GPUs, nos leverages NVIDIA’s MPS and MIG, finally making them dynamic.
  • 🤝 Elastic Resource Quota management: it increases the number of Pods that can run on the cluster by letting teams (namespaces) borrow reserved resource quotas from other teams for as long as those teams are not using them.
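
The quota-borrowing idea in the second feature can be illustrated with a small, self-contained Python sketch. This is not the nos implementation or its API: `TeamQuota`, `free_capacity`, `admit`, and `borrowed` are hypothetical names invented for illustration, resources are reduced to a single integer unit, and what happens when a lending team wants its quota back is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TeamQuota:
    """Hypothetical per-namespace quota, in abstract resource units."""
    reserved: int   # units guaranteed to the team
    used: int = 0   # units currently consumed by the team's Pods


def free_capacity(quotas: dict[str, TeamQuota]) -> int:
    """Reserved units that no team is currently using, cluster-wide."""
    return sum(q.reserved for q in quotas.values()) - sum(q.used for q in quotas.values())


def borrowed(quota: TeamQuota) -> int:
    """Units a team is using beyond its own reservation."""
    return max(quota.used - quota.reserved, 0)


def admit(quotas: dict[str, TeamQuota], team: str, request: int) -> bool:
    """Admit a request if unused reserved units exist anywhere in the cluster.

    A team that has exhausted its own reservation can still be admitted,
    in which case it is borrowing units other teams have left idle.
    """
    if request > free_capacity(quotas):
        return False
    quotas[team].used += request
    return True
```

For example, with `team-a` at 4 of 4 reserved units and `team-b` at 1 of 4, a request of 2 units from `team-a` is admitted by borrowing from `team-b`'s idle share, while a further request of 2 units is rejected because only 1 unit remains unused.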