This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Upstream sync 2024 07 01 #350

Merged 113 commits on Jul 3, 2024

Commits on Jul 1, 2024

  1. 54348ea
  2. 5ae4430
  3. 1601d82
  4. 20faeb6
  5. 62ecd68
  6. ca8bc83
  7. [ci] Remove aws template (vllm-project#5757)

    Signed-off-by: kevin <[email protected]>
    khluu authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    91b2d1d
  8. 21450bc
  9. 1d55e23
  10. 980c10b
  11. 3b261da
  12. 8d6c12f
  13. c3bc8c6
  14. a9e34b9
  15. 21f69d1
  16. resolved

    mawong-amd authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    ece7c7f
  17. e6935bd
  18. 81a21d2
  19. f9775e9
  20. [Core] Refactor Worker and ModelRunner to consolidate control plane communication (vllm-project#5408)

    Signed-off-by: Stephanie Wang <[email protected]>
    Signed-off-by: Stephanie <[email protected]>
    Co-authored-by: Stephanie <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    fb41934
  21. ce9da79
  22. cb364ef
  23. 9744700
  24. 2f7eba7
  25. 74952fd
  26. [Kernel] Adding bias epilogue support for cutlass_scaled_mm (vllm-project#5560)

    Co-authored-by: Chih-Chieh-Yang <[email protected]>
    Co-authored-by: Lucas Wilkinson <[email protected]>
    3 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    1d1929b
  27. 5095252
  28. 1653293
  29. e423b2c
  30. 698f968
  31. 182cdaa
  32. 750539c
  33. [BugFix] Fix cuda graph for MLPSpeculator (vllm-project#5875)

    Co-authored-by: Abhinav Goyal <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    7823612
  34. 0844ba8
  35. 5855a8e
  36. 2102a46
  37. f483510
  38. 684c441
  39. dcb8246
  40. db62aa3
  41. 0c7ef70
  42. 6e594ee
  43. 81ddde3
  44. c1d4964
  45. 209a147
  46. 4d5e0b9
  47. 5f1316e
  48. [VLM][BugFix] Make sure that multi_modal_kwargs can broadcast properly with ring buffer. (vllm-project#5905)

    Signed-off-by: Xiaowei Jiang <[email protected]>
    Co-authored-by: Roger Wang <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    74bf88f
  49. f177c04
  50. [Core] Registry for processing model inputs (vllm-project#5214)

    Co-authored-by: ywang96 <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    70af85d
  51. fd59ff4
  52. 2e67191
  53. [Bugfix] Better error message for MLPSpeculator when `num_speculative_tokens` is set too high (vllm-project#5894)

    Signed-off-by: Thomas Parnell <[email protected]>
    tdoublep authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    0d4c0c6
  54. 1ce7d18
  55. 4b9894c
  56. 6664f2a
  57. 42cdb40
  58. [ Misc ] Remove fp8_shard_indexer from Col/Row Parallel Linear (Simplify Weight Loading) (vllm-project#5928)

    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    robertgshaw2-neuralmagic and Robert Shaw committed Jul 1, 2024
    7c1515e
  59. [ Bugfix ] Enabling Loading Models With Fused QKV/MLP on Disk with FP8 (vllm-project#5921)

    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    robertgshaw2-neuralmagic and Robert Shaw committed Jul 1, 2024
    9598197
  60. Support Deepseek-V2 (vllm-project#4650)

    Co-authored-by: Philipp Moritz <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    3441c30
  61. a5ef790
  62. ccd94db
  63. [Bugfix] Fix Engine Failing After Invalid Request - AsyncEngineDeadError (vllm-project#5963)

    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    robertgshaw2-neuralmagic and Robert Shaw committed Jul 1, 2024
    f49047a
  64. [Kernel] Flashinfer for prefill & decode, with Cudagraph support for decode (vllm-project#4628)

    Co-authored-by: LiuXiaoxuanPKU <[email protected]>, bong-furiosa <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    eeb9d99
  65. 026b28e
  66. f281c2e
  67. b89416e
  68. [Misc] Extend vLLM Metrics logging API (vllm-project#5925)

    Co-authored-by: Antoni Baum <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    acf1f76
  69. b9acdae
  70. aa72bdc
  71. [Misc] Update Phi-3-Vision Example (vllm-project#5981)

    Co-authored-by: Cyrus Leung <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    33fecd4
  72. 00f60d2
  73. 270105d
  74. b22f1be
  75. [ CI/Build ] Added E2E Test For Compressed Tensors (vllm-project#5839)

    Co-authored-by: Michael Goin <[email protected]>
    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    3 people committed Jul 1, 2024
    aa49ffe
  76. b481fe3
  77. [ CI/Build ] LM Eval Harness Based CI Testing (vllm-project#5838)

    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    robertgshaw2-neuralmagic and Robert Shaw committed Jul 1, 2024
    47407b7
  78. 3d215cc
  79. d0b7111
  80. 445b0d3
  81. cea9f6b
  82. [ci][distributed] fix device count call

    [ci][distributed] fix some cuda init that makes it necessary to use spawn (vllm-project#5991)
    youkaichao authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    4f7381a
  83. [Frontend]: Support base64 embedding (vllm-project#5935)

    Co-authored-by: Cyrus Leung <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    3ceed36
  84. [Lora] Use safetensor keys instead of adapter_config.json to find unexpected modules. (vllm-project#5909)

    Co-authored-by: sang <[email protected]>
    2 people authored and robertgshaw2-neuralmagic committed Jul 1, 2024
    51f3e3f
  85. 9c74b00
  86. 4153e58
  87. [ Misc ] Refactor w8a8 to use process_weights_after_load (Simplify Weight Loading) (vllm-project#5940)

    Co-authored-by: Robert Shaw <rshaw@neuralmagic>
    robertgshaw2-neuralmagic and Robert Shaw committed Jul 1, 2024
    27a711a
  88. format

    robertgshaw2-neuralmagic committed Jul 1, 2024
    53655b2
  89. isort

    robertgshaw2-neuralmagic committed Jul 1, 2024
    07abe05
  90. format

    robertgshaw2-neuralmagic committed Jul 1, 2024
    0f0fec4
  91. 1cc7c46
  92. a699814
  93. updated

    robertgshaw2-neuralmagic committed Jul 1, 2024
    9a4be7f
  94. format

    robertgshaw2-neuralmagic committed Jul 1, 2024
    b4eec34
  95. 08dedd5
  96. dac4bb3
  97. 87a4288
  98. 81e1c3e
  99. 2c3c43b
  100. cf4e758
  101. 9c7608c
  102. 1b7245f
  103. fa05042
  104. 484a2e3

Commits on Jul 2, 2024

  1. 99f1474
  2. format

    robertgshaw2-neuralmagic committed Jul 2, 2024
    afb93b9
  3. fcb4dd3
  4. ceaf019
  5. 655389d
  6. formatted

    robertgshaw2-neuralmagic committed Jul 2, 2024
    206af82

Commits on Jul 3, 2024

  1. cd2aa72
  2. f43cb06
  3. 7a45bfa