Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from caikit:main #38

Closed
wants to merge 16 commits into from
Closed

[pull] main from caikit:main #38

wants to merge 16 commits into from

Conversation

pull[bot]
Copy link

@pull pull bot commented Sep 13, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

dependabot bot and others added 16 commits August 26, 2024 19:12
Updates the requirements on [caikit[runtime-grpc,runtime-http]](https://github.com/caikit/caikit) to permit the latest version.
- [Release notes](https://github.com/caikit/caikit/releases)
- [Changelog](https://github.com/caikit/caikit/blob/main/releases.md)
- [Commits](caikit/caikit@v0.26.34...v0.27.0)

---
updated-dependencies:
- dependency-name: caikit[runtime-grpc,runtime-http]
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
When CONFIG_FILES setting is not used, the caikit-nlp config values
are not read from environment variables unless caikit has already
imported caikit-nlp.  The code was loading the embedding config too
soon in this case so the env vars were ignored.

This commit reads the config during bootstrap and load for settings that are needed
at bootstrap/load time and also reads config during module init for runtime settings.

In addition:

* bootstrap() kwargs can be used (e.g. for trust_remote_code param)
* load() was made to work with model_path as ModuleConfig (because str is deprecated and spammy)
* utils/env_val_to_int was removed because the caikit config handles this well
* utils/env_val_to_bool was NOT YET removed because the caikit config does not handle this quite as well

Signed-off-by: Mark Sturdevant <[email protected]>
ModuleConfig should always have a model_path, but this check will
make the error better if this gets called with an empty ModuleConfig.

Signed-off-by: Mark Sturdevant <[email protected]>
Signed-off-by: Mark Sturdevant <[email protected]>
Read embeddings config during bootstrap/load and instance creation
…cruntime-http--gte-0.26.34-and-lt-0.28.0

Update caikit[runtime-grpc,runtime-http] requirement from <0.27.0,>=0.26.34 to >=0.26.34,<0.28.0
This module is closely related to EmbeddingModule.

Cross-encoder models use Q and A pairs and are trained return a relevance score for rank().
The existing rerank APIs in EmbeddingModule had to encode Q and A
separately and use cosine similarity as a score. So the API is the same, but the results
are supposed to be better (and slower).

Cross-encoder models do not support returning embedding vectors or sentence-similarity.

Support for the existing tokenization and model_info endpoints was also added.

Signed-off-by: Mark Sturdevant <[email protected]>
* mostly removing unnecessary code
* some better clarity

Signed-off-by: Mark Sturdevant <[email protected]>
* The already borrowed errors are fixed with tokenizers per thread,
  so there were some misleading comments about not changing params
  for truncation (which we do for cross-encoder truncation).

Signed-off-by: Mark Sturdevant <[email protected]>
Default is 32.
Can override with embedding batch_size in config or EMBEDDING_BATCH_SIZE env var.

Signed-off-by: Mark Sturdevant <[email protected]>
* Moved the truncation check to a place that can determine
  the proper index for the error message (with batching).

* Added test to validate some results after truncation.
  This is with a tiny model, but works for sanity.

Signed-off-by: Mark Sturdevant <[email protected]>
The part that really tests that a token is truncated was wrong.

* It was backwards and passing because the scores are sorted by rank
* Using the index to get scores in the order of the inputs
* Now correctly xx != xy but xy == xyz (truncated z)

Signed-off-by: Mark Sturdevant <[email protected]>
CrossEncoderModule with rerank API
Copy link

openshift-ci bot commented Sep 13, 2024

Hi @pull[bot]. Thanks for your PR.

I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@vaibhavjainwiz
Copy link
Member

/ok-to-test

@vaibhavjainwiz
Copy link
Member

/lgtm

Copy link

openshift-ci bot commented Sep 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pull[bot], vaibhavjainwiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants