inference

Here are 1,340 public repositories matching this topic...

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

ai deep-learning hpc distributed-computing inference big-model large-scale data-parallelism model-parallelism pipeline-parallelism foundation-models heterogeneous-training

Updated Dec 23, 2024
Python

ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++

inference transformer speech-recognition openai speech-to-text whisper

Updated Dec 24, 2024
C++

microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

machine-learning compression deep-learning gpu inference pytorch zero data-parallelism model-parallelism mixture-of-experts pipeline-parallelism billion-parameters trillion-parameters

Updated Dec 20, 2024
Python

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving tpu hpu mlops xpu llm inferentia llmops llm-serving trainium

Updated Dec 24, 2024
Python

google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

android c-plus-plus calculator machine-learning framework computer-vision deep-learning inference pipeline-framework stream-processing video-processing perception mobile-development audio-processing graph-framework graph-based mediapipe

Updated Dec 21, 2024
C++

Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Updated Dec 24, 2024
C++

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

deep-learning inference transformer speech-recognition openai speech-to-text quantization whisper

Updated Dec 23, 2024
Python

gvergnaud / ts-pattern

🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.

javascript typescript matching pattern pattern-matching branching inference ts conditions type-inference exhaustive

Updated Dec 18, 2024
TypeScript

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

deep-learning inference nvidia gpu-acceleration tensorrt

Updated Dec 13, 2024
C++

aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

training aws data-science machine-learning reinforcement-learning deep-learning examples jupyter-notebook inference sagemaker mlops

Updated Dec 23, 2024
Jupyter Notebook

huggingface / text-generation-inference

Large Language Model Text Generation Inference

nlp bloom deep-learning inference pytorch falcon transformer gpt starcoder

Updated Dec 23, 2024
Python

triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

machine-learning cloud deep-learning gpu inference edge datacenter

Updated Dec 23, 2024
Python

dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Updated Oct 16, 2024
C++

openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference

nlp natural-language-processing ai computer-vision deep-learning transformers inference speech-recognition yolo recommendation-system performance-boost good-first-issue openvino diffusion-models stable-diffusion generative-ai llm-inference optimize-ai deploy-ai

Updated Dec 24, 2024
C++

Linzaer / Ultra-Light-Fast-Generic-Face-Detector-1MB

💎 1MB lightweight face detection model

arm inference face-detection mnn ncnn

Updated Dec 29, 2023
Python

gcanti / io-ts

Runtime type system for IO decoding/encoding

typescript validation types runtime inference

Updated Dec 10, 2024
TypeScript

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

cuda inference pytorch transformer moe llama vlm llm llm-serving llava llama2 llama3 llama3-1

Updated Dec 24, 2024
Python

xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Updated Dec 24, 2024
Python

Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

python machine-learning privacy ai attack extraction inference artificial-intelligence evasion red-team poisoning adversarial-machine-learning blue-team adversarial-examples adversarial-attacks trusted-ai trustworthy-ai

Updated Dec 23, 2024
Python

superduper-io / superduper

Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.

Updated Dec 24, 2024
Python
