Releases: huggingface/tgi-gaudi

v2.0.1: SynapseAI v1.16.0

24 Jun 09:58

SynapseAI v1.16.0

The codebase is validated with SynapseAI 1.16.0 and optimum-habana 1.12.0.

Tested configurations

  • Llama2 7B BF16 / FP8 on 1xGaudi2
  • Llama2 70B BF16 / FP8 on 8xGaudi2 (see the launch sketch after this list)
  • Falcon 180B BF16 / FP8 on 8xGaudi2
  • Mistral 7B BF16 / FP8 on 1xGaudi2
  • Mixtral 8x7B BF16 / FP8 on 1xGaudi2
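
As a rough illustration of how one of the multi-card configurations above (Llama2 70B on 8xGaudi2) might be launched, here is a minimal Python sketch that shells out to Docker. The image tag, launcher flags, and environment variables are assumptions based on typical TGI / tgi-gaudi usage, not taken from this release; check the repository README for the exact command.

```python
# Hypothetical launch script, not part of this repository. Image tag, flags and
# environment variables are assumptions; consult the README for the real invocation.
import os
import subprocess

MODEL_ID = "meta-llama/Llama-2-70b-hf"          # gated model: requires a Hub token
IMAGE = "ghcr.io/huggingface/tgi-gaudi:2.0.1"   # assumed image tag
DATA_DIR = os.path.expanduser("~/tgi-data")     # host cache for downloaded weights

cmd = [
    "docker", "run", "-p", "8080:80",
    "-v", f"{DATA_DIR}:/data",
    "--runtime=habana",                          # Habana container runtime
    "-e", "HABANA_VISIBLE_DEVICES=all",          # expose all 8 Gaudi2 cards
    "-e", f"HUGGING_FACE_HUB_TOKEN={os.environ.get('HUGGING_FACE_HUB_TOKEN', '')}",
    "--cap-add=sys_nice", "--ipc=host",
    IMAGE,
    "--model-id", MODEL_ID,
    "--sharded", "true", "--num-shard", "8",     # tensor-parallel across 8 cards
    "--max-input-tokens", "1024", "--max-total-tokens", "2048",
]
subprocess.run(cmd, check=True)
```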

Highlights

  • Add support for the grammar feature (see the request sketch after this list)
  • Add support for Habana Flash Attention
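
A minimal sketch of how the grammar feature can be exercised from a client, assuming a server is already running on localhost:8080. The request shape (a `grammar` object with a JSON schema under `parameters`) follows the upstream TGI guided-generation API; field names and the endpoint should be checked against the TGI documentation.

```python
# Hypothetical client call against a locally running server.
import requests

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

payload = {
    "inputs": "Extract the person mentioned in: 'Alice is 31 years old.'",
    "parameters": {
        "max_new_tokens": 64,
        "grammar": {"type": "json", "value": schema},  # constrain output to the schema
    },
}

resp = requests.post("http://localhost:8080/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["generated_text"])  # JSON text conforming to the schema
```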

Full Changelog: v2.0.0...v2.0.1

v2.0.0: SynapseAI v1.15.0

13 May 12:33 · 1a8c7d0

SynapseAI v1.15.0

The codebase is validated with SynapseAI 1.15.0 and optimum-habana 1.11.1.

Tested configurations

  • Llama2 70B BF16 / FP8 on 8xGaudi2

Highlights

  • Add support for FP8 precision (a conceptual sketch follows this list)
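
For context, FP8 inference keeps weights and activations in an 8-bit floating-point format (E4M3 on Gaudi2), with a per-tensor scale chosen so values fit that format's range. The sketch below only illustrates that scaling idea; the actual FP8 cast and kernels are provided by the Habana software stack, and nothing here reflects this repository's implementation.

```python
# Conceptual illustration only: per-tensor scaling into the FP8 E4M3 range.
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fp8_scale_and_clamp(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Scale a tensor so its values fit the E4M3 range, then clamp."""
    scale = float(np.max(np.abs(x))) / E4M3_MAX or 1.0
    return np.clip(x / scale, -E4M3_MAX, E4M3_MAX), scale

weights = np.random.randn(4, 4).astype(np.float32) * 100
scaled, scale = fp8_scale_and_clamp(weights)
# On Gaudi the scaled tensor would then be rounded to 8-bit E4M3 by the
# hardware/library, and outputs rescaled by `scale` after the matmul.
print("scale:", scale, "max |scaled value|:", np.max(np.abs(scaled)))
```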

Full Changelog: v1.2.1...v2.0.0

v1.2.1: SynapseAI v1.14.0

19 Mar 07:43 · d752317

SynapseAI v1.14.0

The codebase is validated with SynapseAI 1.14.0 and optimum-habana 1.10.4.

Tested configuration

  • Llama2 70B BF16 on 8xGaudi2

Highlights

  • Add support for continuous batching on Intel Gaudi
  • Add batch size bucketing
  • Add sequence bucketing for the prefill operation (both forms of bucketing are sketched after this list)
  • Optimize concatenate operation
  • Add speculative scheduling
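
Bucketing rounds the runtime batch size and prefill sequence length up to the nearest configured bucket, so the accelerator compiles graphs for a small fixed set of shapes instead of every shape it encounters. The sketch below is purely conceptual; the bucket values and helper functions are made up for illustration and are not this repository's implementation.

```python
# Conceptual sketch of shape bucketing, not the repository's implementation.
import math

BATCH_BUCKETS = [1, 2, 4, 8, 16, 32]   # hypothetical configured buckets
SEQ_BUCKET_STEP = 128                  # hypothetical prefill bucket step

def round_up_to_bucket(n: int, buckets: list[int]) -> int:
    """Smallest bucket >= n (fall back to the largest bucket)."""
    return next((b for b in buckets if b >= n), buckets[-1])

def round_up_to_step(length: int, step: int) -> int:
    """Pad a sequence length up to the next multiple of `step`."""
    return math.ceil(length / step) * step

# A request batch of 5 sequences whose longest prompt is 300 tokens:
padded_batch = round_up_to_bucket(5, BATCH_BUCKETS)      # -> 8 (padded with dummy rows)
padded_seq_len = round_up_to_step(300, SEQ_BUCKET_STEP)   # -> 384 (padded with pad tokens)
print(padded_batch, padded_seq_len)
```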

Full Changelog: v1.2.0...v1.2.1