Add web endpoint to the inference server #38
Conversation
Force-pushed from 3658c66 to 3aa7a9e
I think the run_name change makes sense even just on its own, but can we update the readme too?
src/inference.py (outdated)

    config = yaml.safe_load(f.read())
    model_path = (Path(run_folder) / config["output_dir"] / "merged").resolve()

    def __init__(self, run_name: str = "", model_dir: str = "/runs") -> None:
model_dir is a little confusing since the two things we store in volumes are (1) pretrained models and (2) finetuned models. They’re all models :)
but everything under /runs is finetuned models, right?
Arguably not, since there’s config, training logs, metrics, etc., but it’s a pretty nitpicky point either way
ok i'll rename it to run_dir
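A minimal sketch of how that rename might look, wrapped in a placeholder class (the class name and attribute assignments are assumptions for illustration, not taken from the PR):

    class Inference:
        # Hypothetical sketch of the rename discussed above: model_dir -> run_dir.
        def __init__(self, run_name: str = "", run_dir: str = "/runs") -> None:
            self.run_name = run_name
            self.run_dir = run_dir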
src/inference.py (outdated)

    run_name = VOLUME_CONFIG[self.model_dir].listdir("/")[-1].path

    # Grab the output dir (usually "lora-out")
    with open(f"/runs/{run_name}/config.yml") as f:
Looks like we have /runs hardcoded here even though it’s a parameter?
yep let me fix
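A minimal sketch of the fix, assuming the run_dir rename from the thread above (VOLUME_CONFIG, listdir, and the yaml load come from the snippets earlier in this review):

    # Sketch: build the path from the configured directory instead of the
    # hardcoded "/runs".
    run_name = VOLUME_CONFIG[self.run_dir].listdir("/")[-1].path
    with open(f"{self.run_dir}/{run_name}/config.yml") as f:
        config = yaml.safe_load(f.read())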
Force-pushed from 3aa7a9e to 29ca3b7
This lets you do

    modal serve src.inference

in one window, and

    curl 'https://modal-labs--example-axolotl-inference-web-dev.modal.run?input=what+time+is+it'

in another window, and it streams!
Note that most of the complexity here is that we want to support an unparametrized class constructor, so we need to pick the run ID automatically. To do that, it was slightly easier to change the argument to be run_name rather than the full path. If run_name isn't provided (which it never is in the case of web endpoints, since we don't support parametrized functions with web endpoints), then it just picks the last run by doing a directory listing.
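A minimal sketch of that fallback, reusing the VOLUME_CONFIG listing from the snippets above (attribute names assume the run_dir rename; this is illustrative, not the PR's exact code):

    # If no run_name is given (always the case for web endpoints, since the
    # constructor can't be parametrized there), default to the last entry in
    # the runs volume's directory listing.
    if not self.run_name:
        self.run_name = VOLUME_CONFIG[self.run_dir].listdir("/")[-1].path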