Skip to content
This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Remote push refactor #297

Merged
merged 159 commits into from
Jun 14, 2024
Merged

Remote push refactor #297

merged 159 commits into from
Jun 14, 2024

Conversation

robertgshaw2-neuralmagic
Copy link
Collaborator

SUMMARY:

  • updated model test structure to focus on core models
  • refactored tests to use environment variables (currently at "test group" level - so each folder has an env variable). All tests are off by default and they are explicitly enabled
  • refactored workflows build-test workflow to use a list of env variables rather than skip test list

WHY:

  • this enables us to be more sane about what is and is not on - as opposed to a long list of files
  • this enables us to actually track what is run and what is not run (via testmo, which tracks skipped tests)
  • this enables us to have more fine-grained control over what is run vs not run (we can add more env vars at the sub-group level to turn off more tests)

alexm-neuralmagic and others added 30 commits June 8, 2024 16:39
Allow dummy load format for fp8,
torch.uniform_ doesn't support FP8 at the moment

Co-authored-by: Mor Zusman <[email protected]>
Pass the CUDA stream into the CUTLASS GEMMs, to avoid future issues with CUDA graphs
…ct#4893)

The 2nd PR for vllm-project#4532.

This PR supports loading FP8 kv-cache scaling factors from a FP8 checkpoint (with .kv_scale parameter).
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
@@ -147,7 +147,7 @@ def __init__(
self,
model_name: str,
dtype: str = "half",
access_token: Optional[str] = None,
**kwargs,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • access token not needed (we set HF_TOKEN in automation)
  • kwargs enables us to pass whatever we want for hf runner

@@ -472,21 +472,21 @@ def _decode_token_by_position_index(

def generate_greedy_logprobs_nm_use_tokens(
self,
prompts: List[str],
input_ids_lst: List[torch.Tensor],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • previously, we were passing a prompt, formatted with a chat template (which appends bos token)
  • then, we tokenize here which appends another bos token

This change prevents that double permanently, by forcing the test to tokenize everything fully

@@ -37,7 +37,7 @@ jobs:
test_label_solo: gcp-k8s-l4-solo
test_label_multi: ignore
test_timeout: 480
test_skip_list: neuralmagic/tests/skip-for-remote-push-tmp.txt
test_skip_env_vars: neuralmagic/tests/test_skip_env_vars/smoke.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For remote-push, maybe we don't need to run all 4 python versions, and only 38/311 is good enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, let's get to that after this is merged

tests/nm_utils/utils_skip.py Outdated Show resolved Hide resolved
Copy link
Member

@andy-neuma andy-neuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer not having a file as input. instead, it'd be more flexible to just have a string listing the ENV with setting. something like,

TEST_ACCURACY=DISABLE,TEST_CORE=ENABLE

@@ -37,7 +37,7 @@ jobs:
test_label_solo: gcp-k8s-l4-solo
test_label_multi: ignore
test_timeout: 480
test_skip_list: neuralmagic/tests/skip-for-remote-push-tmp.txt
test_skip_env_vars: neuralmagic/tests/test_skip_env_vars/smoke.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, let's get to that after this is merged

Copy link
Member

@andy-neuma andy-neuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! we can adjust calling parameters and style later.

@dbarbuzzi dbarbuzzi merged commit a3cc7a8 into main Jun 14, 2024
30 of 34 checks passed
@dbarbuzzi dbarbuzzi deleted the remote-push-refactor branch June 14, 2024 14:18
derekk-nm pushed a commit that referenced this pull request Jun 24, 2024
SUMMARY:

* updated model test structure to focus on core models
* refactored tests to use environment variables (currently at "test
group" level - so each folder has an env variable). All tests are off by
default and they are explicitly enabled
* refactored workflows build-test workflow to use a list of env
variables rather than skip test list

WHY:
* this enables us to be more sane about what is and is not on - as
opposed to a long list of files
* this enables us to actually track what is run and what is not run (via
testmo, which tracks skipped tests)
* this enables us to have more fine-grained control over what is run vs
not run (we can add more env vars at the sub-group level to turn off
more tests)

---------

Signed-off-by: kerthcet <[email protected]>
Signed-off-by: Muralidhar Andoorveedu <[email protected]>
Signed-off-by: pandyamarut <[email protected]>
Co-authored-by: Alexander Matveev <[email protected]>
Co-authored-by: Woosuk Kwon <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: Wenwei Zhang <[email protected]>
Co-authored-by: Alexei-V-Ivanov-AMD <[email protected]>
Co-authored-by: Alexey Kondratiev <[email protected]>
Co-authored-by: Mor Zusman <[email protected]>
Co-authored-by: Mor Zusman <[email protected]>
Co-authored-by: Aurick Qiao <[email protected]>
Co-authored-by: Kuntai Du <[email protected]>
Co-authored-by: Antoni Baum <[email protected]>
Co-authored-by: HUANG Fei <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Simon Mo <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Kante Yin <[email protected]>
Co-authored-by: sasha0552 <[email protected]>
Co-authored-by: SangBin Cho <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Co-authored-by: raywanb <[email protected]>
Co-authored-by: Nick Hill <[email protected]>
Co-authored-by: Philipp Moritz <[email protected]>
Co-authored-by: Letian Li <[email protected]>
Co-authored-by: Murali Andoorveedu <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Varun Sundar Rabindranath <[email protected]>
Co-authored-by: Elisei Smirnov <[email protected]>
Co-authored-by: Elisei Smirnov <[email protected]>
Co-authored-by: youkaichao <[email protected]>
Co-authored-by: leiwen83 <[email protected]>
Co-authored-by: Lei Wen <[email protected]>
Co-authored-by: Eric Xihui Lin <[email protected]>
Co-authored-by: beagleski <[email protected]>
Co-authored-by: bapatra <[email protected]>
Co-authored-by: Barun Patra <[email protected]>
Co-authored-by: Lily Liu <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Zhuohan Li <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Michał Moskal <[email protected]>
Co-authored-by: Ruth Evans <[email protected]>
Co-authored-by: Divakar Verma <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Junichi Sato <[email protected]>
Co-authored-by: Marut Pandya <[email protected]>
Co-authored-by: afeldman-nm <[email protected]>
Co-authored-by: Ronen Schaffer <[email protected]>
Co-authored-by: Itay Etelis <[email protected]>
Co-authored-by: omkar kakarparthi <[email protected]>
Co-authored-by: Alexei V. Ivanov <[email protected]>
Co-authored-by: Breno Faria <[email protected]>
Co-authored-by: Breno Faria <[email protected]>
Co-authored-by: Hyunsung Lee <[email protected]>
Co-authored-by: Chansung Park <[email protected]>
Co-authored-by: SnowDist <[email protected]>
Co-authored-by: functionxu123 <[email protected]>
Co-authored-by: xuhao <[email protected]>
Co-authored-by: Domenic Barbuzzi <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.