NOTE: This repository is being integrated into the official Torchserve repo and is subject to heavy changes in the future.
This repository contains an example of deploying models with third-party dependencies (like 🤗 Transformers, SparseML, etc.) on Torchserve as ready-to-use Docker containers on cloud services such as AWS.
For the purposes of this repository, we deploy the models on an AWS `t2.micro` instance, which is free for 750 hours per month under the AWS Free Tier on a new account. We work with a 🤗 MobileViT Transformer model for the task of image classification using its `pipeline` feature. The handler code in `scripts` can also be used as a simple template for deploying any 🤗 `pipeline` for any supported task with Torchserve.
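As a rough illustration, a custom Torchserve handler wrapping a 🤗 `pipeline` could look like the minimal sketch below. This is not a copy of the handler in `scripts`: the class name, image decoding, and batching details are simplified assumptions.

```python
# A minimal sketch of a Torchserve handler wrapping a 🤗 pipeline.
import io

from PIL import Image
from transformers import pipeline
from ts.torch_handler.base_handler import BaseHandler


class MobileViTHandler(BaseHandler):
    """Serves apple/mobilevit-xx-small via the image-classification pipeline."""

    def initialize(self, context):
        # Load the pipeline once per worker; device=-1 keeps it on CPU,
        # which is all a t2.micro offers.
        self.pipe = pipeline(
            task="image-classification",
            model="apple/mobilevit-xx-small",
            device=-1,
        )
        self.initialized = True

    def preprocess(self, data):
        # Torchserve passes each request body under "data" or "body".
        images = []
        for row in data:
            payload = row.get("data") or row.get("body")
            images.append(Image.open(io.BytesIO(payload)).convert("RGB"))
        return images

    def inference(self, images):
        return self.pipe(images)

    def postprocess(self, outputs):
        # One list of {label, score} dicts per request in the batch.
        return outputs
```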
This work may also be extended to deploy Torchserve Docker containers with HF models at scale using AWS CloudFormation and AWS EKS (as explained in the official Torchserve repo) or AWS SageMaker, incorporating utilities like AWS ELB and CloudWatch.
We also benchmark the REST API calls (wall-clock latency) and compare model performance for the following approaches (a timing sketch follows this list):
- Deploying the MobileViT XX Small Hugging Face model with a custom Torchserve handler (see the `HF-only` directory).
- Deploying the MobileViT XX Small Hugging Face model in scripted (TorchScript) mode with a custom Torchserve handler (see the `HF-scripted` directory; a TorchScript export sketch also follows this list).
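For the scripted-mode approach, the model can be exported to TorchScript roughly as in the sketch below. The actual export in `HF-scripted` may differ, and the 256x256 input shape is assumed from MobileViT's default preprocessing.

```python
# A sketch of exporting MobileViT to TorchScript for scripted-mode serving.
import torch
from transformers import MobileViTForImageClassification

# torchscript=True makes the model return tuples, which torch.jit.trace needs.
model = MobileViTForImageClassification.from_pretrained(
    "apple/mobilevit-xx-small", torchscript=True
)
model.eval()

# Trace with a dummy batch of one 256x256 RGB image (BCHW).
dummy = torch.randn(1, 3, 256, 256)
traced = torch.jit.trace(model, dummy)
traced.save("mobilevit-xx-small-scripted.pt")
```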
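The latency benchmark itself can be as simple as timing repeated POSTs against the Torchserve inference endpoint, as sketched below; the model name `mobilevit` and the test image path are assumptions, not the repo's actual values.

```python
# A simple latency probe against Torchserve's inference API (default port 8080).
import time

import requests

URL = "http://localhost:8080/predictions/mobilevit"  # assumed model name

with open("test.jpg", "rb") as f:  # any test image
    payload = f.read()

requests.post(URL, data=payload)  # warm-up call
latencies = []
for _ in range(50):
    start = time.perf_counter()
    resp = requests.post(URL, data=payload)
    resp.raise_for_status()
    latencies.append(time.perf_counter() - start)

print(f"mean latency: {1000 * sum(latencies) / len(latencies):.1f} ms")
```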
- Verify HF `pipeline` functionality on an AWS EC2 `t2.micro` instance.
- Add benchmarking scripts for throughput with Torchserve utilities.
- Add a dynamic batching explanation (a minimal registration sketch follows this list).
- Integrate inference optimizations from the 🤗 Optimum library (an ONNX Runtime sketch follows this list):
  - ONNX Runtime inference
  - Quantized ONNX Runtime inference
- Optional: Try ONNX-TensorRT integration (reference)
- Try LLM.int8 integration
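Until the full dynamic-batching explanation lands, the short version is that Torchserve batches requests per model, configured at registration time through the management API (port 8081). The sketch below shows an assumed registration; the `batch_size` and `max_batch_delay` values are illustrative, not tuned.

```python
# A sketch of registering a model with dynamic batching via the management API.
import requests

resp = requests.post(
    "http://localhost:8081/models",
    params={
        "url": "mobilevit.mar",   # assumed model archive name
        "model_name": "mobilevit",
        "batch_size": 8,          # max requests merged into one batch
        "max_batch_delay": 50,    # ms to wait for the batch to fill
        "initial_workers": 1,
    },
)
print(resp.json())
```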
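For the planned 🤗 Optimum integration, ONNX Runtime inference slots into the same `pipeline` flow, roughly as below. Treat this as a sketch: the export flag has changed across Optimum releases (older versions use `from_transformers=True` instead of `export=True`), and quantized inference would additionally go through Optimum's `ORTQuantizer`.

```python
# A sketch of ONNX Runtime inference through 🤗 Optimum.
from optimum.onnxruntime import ORTModelForImageClassification
from transformers import AutoFeatureExtractor, pipeline

# Export the PyTorch checkpoint to ONNX and load it with ONNX Runtime.
model = ORTModelForImageClassification.from_pretrained(
    "apple/mobilevit-xx-small", export=True
)
extractor = AutoFeatureExtractor.from_pretrained("apple/mobilevit-xx-small")

pipe = pipeline("image-classification", model=model, feature_extractor=extractor)
print(pipe("test.jpg"))  # assumed local test image
```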
- https://github.com/pytorch/serve
- https://github.com/huggingface/transformers
- https://github.com/huggingface/optimum
- https://huggingface.co/docs/transformers/main_classes/pipelines
- My Torchserve + AWS Notion journal
- https://huggingface.co/apple/mobilevit-xx-small
- https://huggingface.co/course/chapter2/2?fw=pt
- https://github.com/aws-samples/amazon-sagemaker-endpoint-deployment-of-siamese-network-with-torchserve
- https://github.com/cceyda/lit-NER
- https://github.com/tescal2/TorchServeOnAWS
There are many ways to support open-source work, and ⭐ing it is one of them.
In case of bugs or queries, raise an issue, or even better, open a PR with a fix.