    Repositories list

    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      11k16018 Updated Nov 20, 2025
    • model-validation-configs

      Public
      0202 Updated Nov 19, 2025
    • research

      Public
      Repository to enable research flows
      Python
      0303 Updated Nov 19, 2025
    • Arena-Hard-Auto: An automatic LLM benchmark.
      Python
      136001 Updated Nov 19, 2025
    • lighteval

      Public
      Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
      Python
      384000 Updated Nov 18, 2025
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      1.2k005 Updated Nov 16, 2025
    • speculators-research

      Public
      Python
      0000 Updated Nov 13, 2025
    • opendatahub-operator

      Public
      Open Data Hub operator to manage ODH component integrations
      Go
      207001 Updated Nov 12, 2025
    • lm-evaluation-harness

      Public
      A framework for few-shot evaluation of language models.
      Python
      2.9k401 Updated Nov 10, 2025
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      26k102 Updated Oct 29, 2025
    • lmms-eval

      Public
      Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
      Python
      4360010 Updated Oct 29, 2025
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      3.5k101 Updated Oct 17, 2025
    • Neural Magic GHA
      Python
      0003 Updated Oct 8, 2025
    • DeepEP: an efficient expert-parallel communication library
      Cuda
      999000 Updated Sep 26, 2025
    • Common mixins, registries, and utilities with native support for Pydantic used across popular repos such as GuideLLM and Speculators
      0000 Updated Sep 17, 2025
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      31k100 Updated Sep 12, 2025
    • Perplexity GPU Kernels
      C++
      70000 Updated Aug 29, 2025
    • DeepEP

      Public
      DeepEP: an efficient expert-parallel communication library
      Cuda
      999000 Updated Aug 29, 2025
    • DeepGEMM

      Public
      DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
      Cuda
      748000 Updated Aug 29, 2025
    • proving grounds for GitHub to JIRA ... yay!
      0000 Updated Aug 27, 2025
    • flashinfer

      Public
      FlashInfer: Kernel Library for LLM Serving
      Cuda
      573000 Updated Jul 18, 2025
    • Python
      0000 Updated Jul 11, 2025
    • LMCache

      Public
      Redis for LLMs
      Python
      731001 Updated Jun 18, 2025
    • Fast and memory-efficient exact attention
      Python
      2.2k500 Updated Jun 11, 2025
    • Pytest plugin used by the Release Engineering team
      Python
      0000 Updated Jun 9, 2025
    • yolov5

      Public archive
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      17k1900 Updated Jun 4, 2025
    • yolov3

      Public archive
      YOLOv3 in PyTorch > ONNX > CoreML > TFLite
      Python
      3.5k300 Updated Jun 4, 2025
    • transformers

      Public archive
      🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      31k900 Updated Jun 4, 2025
    • llm-d

      Public
      llm-d is a Kubernetes-native high-performance distributed LLM inference framework
      Makefile
      242100 Updated Jun 3, 2025
    • deepsparse

      Public archive
      Sparsity-aware deep learning inference runtime for CPUs
      Python
      1913.2k10 Updated Jun 2, 2025