MiGraphx CPU/GPU Status Tracking #325
@zjgarvey added llvm/torch-mlir#3647 to some of the models, as we need that along with iree-org/iree#18229.
cc @lialan as well. Can you coordinate with Zach to track CPU codegen issues?
Also adding llvm/torch-mlir#3651, which needs to be done to support a broad range of models.
Updated benchmarks for static-dim bert tests on mi300:

Passing Summary
TOTAL TESTS = 18

Fail Summary
TOTAL TESTS = 18

Test Run Detail
Test was run with the following arguments:
This issue will be used to track compilation failures for migraphx models on CPU and GPU. Compile failures for each model should have a link to an issue with a smaller reproducer in the notes column.
Notes:
- migraphx_ORT__bert_base_cased_1 fails on CPU but passes on GPU. Other adjacent models fail for similar reasons on both. Very odd.
- migraphx_sdxl__unet__model and migraphx_ORT__bert_large_uncased_1 are not included because they cause a crash (likely OOM).

CPU Status Table
The following report was generated with IREE compiler version iree-org/iree@caacf6c
and torch-mlir version llvm/torch-mlir@2665ed3.
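For reproduction, a minimal sketch of checking out those pinned revisions (assuming a source build; directory layout is arbitrary):

```shell
# Sketch: fetch the sources at the pinned commits referenced above.
git clone https://github.com/iree-org/iree.git
git -C iree checkout caacf6c
git -C iree submodule update --init

git clone https://github.com/llvm/torch-mlir.git
git -C torch-mlir checkout 2665ed3
```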
Passing Summary
TOTAL TESTS = 30
Fail Summary
TOTAL TESTS = 30
Test Run Detail
Test was run with the following arguments:
Namespace(device='local-task', backend='llvm-cpu', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=True, stages=None, skip_stages=None, benchmark=False, load_inputs=False, groups='all', test_filter='migraphx', testsfile=None, tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='mi_10_10.md')
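For context, a rough standalone-tool equivalent of that CPU configuration. This is a sketch, not the exact commands the test harness runs (with torchtolinalg=True it routes through torch-mlir); the model file name, entry function, and input are placeholders:

```shell
# Import one ONNX model to MLIR (placeholder file names).
iree-import-onnx model.onnx -o model.torch_onnx.mlir

# Compile for the llvm-cpu backend, matching backend='llvm-cpu' above.
iree-compile model.torch_onnx.mlir \
  --iree-hal-target-backends=llvm-cpu \
  -o model_cpu.vmfb

# Run on the local-task driver, matching device='local-task' above.
# Entry function name and input shape/value are placeholders.
iree-run-module --module=model_cpu.vmfb --device=local-task \
  --function=main --input="1x128xi64=0"
```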
OLD STATUS (Will update and migrate issues to current table)
GPU Status Table
Last generated with pip-installed IREE tools at version
Summary
Test Run Detail
Test was run with the following arguments:
Namespace(device='hip://1', backend='rocm', iree_compile_args=['iree-hip-target=gfx942'], mode='onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, load_inputs=False, groups='all', test_filter='migraphx', tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, report=True, report_file='9_3_migraphx.md')
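As with the CPU table, a rough standalone-tool sketch of this GPU configuration; the model file, entry function, and input are again placeholders:

```shell
# Import one ONNX model to MLIR (placeholder file names).
iree-import-onnx model.onnx -o model.torch_onnx.mlir

# Compile for the rocm backend targeting gfx942, matching the args above.
iree-compile model.torch_onnx.mlir \
  --iree-hal-target-backends=rocm \
  --iree-hip-target=gfx942 \
  -o model_gfx942.vmfb

# Run on the second HIP device, matching device='hip://1' above.
iree-run-module --module=model_gfx942.vmfb --device=hip://1 \
  --function=main --input="1x128xi64=0"
```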
Note: the GPU run is missing the sd model (it runs out of memory and kills the test). This probably happens during native inference, so it might need some looking into.
Performance data with iree-benchmark-module on GPU
Summary
Test Run Detail
Test was run with the following arguments:
Namespace(device='local-task', backend='llvm-cpu', iree_compile_args=None, mode='cl-onnx-iree', torchtolinalg=False, stages=None, skip_stages=None, benchmark=True, load_inputs=False, groups='all', test_filter='migraphx', testsfile=None, tolerance=None, verbose=True, rundirectory='test-run', no_artifacts=False, cleanup='0', report=True, report_file='report.md')
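A hedged example of how per-model latency can be collected with iree-benchmark-module under the arguments above (module, entry function, and input are placeholders; a GPU run would use a hip:// device string instead of local-task):

```shell
# Benchmark one compiled module; --benchmark_repetitions is the standard
# Google Benchmark repetition flag passed through by the tool.
iree-benchmark-module \
  --module=model_cpu.vmfb \
  --device=local-task \
  --function=main \
  --input="1x128xi64=0" \
  --benchmark_repetitions=10
```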