
# Changelog

We follow semantic versioning with a strict backwards-compatibility policy.

You can find our backwards-compatibility policy here.

Changes for the upcoming release can be found in the `changelog.d` directory in our repository.

## Changes

- `--quantize` now takes `int8` and `int4` instead of `8bit` and `4bit`, to be consistent with bitsandbytes conventions (see the example below).

  The `openllm` CLI now caches all available model commands, allowing faster startup time.

  Fixed `openllm start model-id --debug` to filter out debug log messages from `bentoml.Server`.

  `--model-id` for `openllm start` now supports choices for easier selection.

  Updated the `ModelConfig` implementation with `__getitem__` support and auto-generated values.

  Cleaned up the CLI and improved loading time; `openllm start` should be 'blazingly fast'. #28
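As a minimal illustration of the renamed `--quantize` values (a sketch only; `model-id` is a placeholder for whichever model you serve):

```bash
# Previously: openllm start model-id --quantize 8bit
openllm start model-id --quantize int8

# Previously: openllm start model-id --quantize 4bit
openllm start model-id --quantize int4
```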

## Features

- Added support for quantization during serving time.

  `openllm start` now supports `--quantize int8` and `--quantize int4`. GPTQ quantization support is on the roadmap and currently being worked on.

  `openllm start` now also supports `--bettertransformer` to use BetterTransformer for serving.

  Refactored `openllm.LLMConfig` so that it can be used with `__getitem__`: `openllm.DollyV2Config()['requirements']`. The access order is `__openllm_*__` > `self.<key>` > `__openllm_generation_class__` > `__openllm_extras__` (a short sketch follows this entry).

  Added a towncrier workflow to easily generate changelog entries.

  Added `use_pipeline` and `bettertransformer` flags to `ModelSettings`.

  `LLMConfig` now supports the `__dataclass_transform__` protocol to help with type checking.

  `openllm download-models` has been renamed to `openllm download`. #27
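A minimal sketch of the new `__getitem__` access, using the `openllm.DollyV2Config()['requirements']` call quoted in the entry above; key lookup follows the documented order:

```python
import openllm

# Keys resolve in the documented order:
# __openllm_*__ > self.<key> > __openllm_generation_class__ > __openllm_extras__
config = openllm.DollyV2Config()
print(config["requirements"])  # equivalent to openllm.DollyV2Config()['requirements']
```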