-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pull] main from caikit:main #38
Conversation
Updates the requirements on [caikit[runtime-grpc,runtime-http]](https://github.com/caikit/caikit) to permit the latest version. - [Release notes](https://github.com/caikit/caikit/releases) - [Changelog](https://github.com/caikit/caikit/blob/main/releases.md) - [Commits](caikit/caikit@v0.26.34...v0.27.0) --- updated-dependencies: - dependency-name: caikit[runtime-grpc,runtime-http] dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]>
When CONFIG_FILES setting is not used, the caikit-nlp config values are not read from environment variables unless caikit has already imported caikit-nlp. The code was loading the embedding config too soon in this case so the env vars were ignored. This commit reads the config during bootstrap and load for settings that are needed at bootstrap/load time and also reads config during module init for runtime settings. In addition: * bootstrap() kwargs can be used (e.g. for trust_remote_code param) * load() was made to work with model_path as ModuleConfig (because str is deprecated and spammy) * utils/env_val_to_int was removed because the caikit config handles this well * utils/env_val_to_bool was NOT YET removed because the caikit config does not handle this quite as well Signed-off-by: Mark Sturdevant <[email protected]>
Signed-off-by: Christina Xu <[email protected]>
Convert tensor dtype to float64
ModuleConfig should always have a model_path, but this check will make the error better if this gets called with an empty ModuleConfig. Signed-off-by: Mark Sturdevant <[email protected]>
Signed-off-by: Mark Sturdevant <[email protected]>
Read embeddings config during bootstrap/load and instance creation
…cruntime-http--gte-0.26.34-and-lt-0.28.0 Update caikit[runtime-grpc,runtime-http] requirement from <0.27.0,>=0.26.34 to >=0.26.34,<0.28.0
This module is closely related to EmbeddingModule. Cross-encoder models use Q and A pairs and are trained return a relevance score for rank(). The existing rerank APIs in EmbeddingModule had to encode Q and A separately and use cosine similarity as a score. So the API is the same, but the results are supposed to be better (and slower). Cross-encoder models do not support returning embedding vectors or sentence-similarity. Support for the existing tokenization and model_info endpoints was also added. Signed-off-by: Mark Sturdevant <[email protected]>
* mostly removing unnecessary code * some better clarity Signed-off-by: Mark Sturdevant <[email protected]>
* The already borrowed errors are fixed with tokenizers per thread, so there were some misleading comments about not changing params for truncation (which we do for cross-encoder truncation). Signed-off-by: Mark Sturdevant <[email protected]>
Default is 32. Can override with embedding batch_size in config or EMBEDDING_BATCH_SIZE env var. Signed-off-by: Mark Sturdevant <[email protected]>
* Moved the truncation check to a place that can determine the proper index for the error message (with batching). * Added test to validate some results after truncation. This is with a tiny model, but works for sanity. Signed-off-by: Mark Sturdevant <[email protected]>
The part that really tests that a token is truncated was wrong. * It was backwards and passing because the scores are sorted by rank * Using the index to get scores in the order of the inputs * Now correctly xx != xy but xy == xyz (truncated z) Signed-off-by: Mark Sturdevant <[email protected]>
Signed-off-by: Mark Sturdevant <[email protected]>
CrossEncoderModule with rerank API
Hi @pull[bot]. Thanks for your PR. I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pull[bot], vaibhavjainwiz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )