@FMInference

Foundation Model Inference

Inference Systems for Foundation Models

Pinned

  1. FlexiGen Public

    Running large language models on a single GPU for throughput-oriented scenarios.

    Python · 9.2k stars · 548 forks

Repositories

Showing 3 of 3 repositories
  • FlexiGen Public

    Running large language models on a single GPU for throughput-oriented scenarios.

    Python · 9,171 stars · Apache-2.0 · 548 forks · 51 open issues (3 need help) · 6 pull requests · Updated Oct 8, 2024
  • H2O Public

    [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

    Python · 377 stars · 37 forks · 30 open issues · 1 pull request · Updated Aug 1, 2024
  • DejaVu Public
    Python · 276 stars · 33 forks · 26 open issues · 1 pull request · Updated Apr 2, 2024

People

This organization has no public members. You must be a member to see who’s a part of this organization.

Top languages

Python