Models: 1,166

Active filters: fp8

unsloth/MiniMax-M2.1

Text Generation • 229B • Updated 16 days ago • 155 • 9

unsloth/DeepSeek-V3-0324-GGUF

Text Generation • 671B • Updated May 22, 2025 • 3.08k • 195

Qwen/Qwen3-0.6B-FP8

Text Generation • 0.8B • Updated Jul 26, 2025 • 58.4k • 55

unsloth/Qwen3-8B-FP8

Text Generation • 8B • Updated May 11, 2025 • 2.92k • 1

Qwen/Qwen3-30B-A3B-FP8

Text Generation • 31B • Updated Jul 26, 2025 • 41.2k • 79

stabilityai/stable-diffusion-3.5-large-tensorrt

Text-to-Image • Updated Oct 20, 2025 • 1.24k • 50

Qwen/Qwen3-235B-A22B-Thinking-2507-FP8

Text Generation • 235B • Updated Jul 30, 2025 • 29.2k • 77

Qwen/Qwen3-4B-Thinking-2507-FP8

Text Generation • 4B • Updated Aug 6, 2025 • 164k • 46

Qwen/Qwen3-4B-Instruct-2507-FP8

Text Generation • 4B • Updated Sep 17, 2025 • 95.8k • 60

stabilityai/stable-diffusion-3.5-controlnets-tensorrt

Text-to-Image • Updated Oct 20, 2025 • 99 • 5

RedHatAI/gpt-oss-120b-FP8-dynamic

Text Generation • 117B • Updated Aug 26, 2025 • 3.04k • 10

deepseek-ai/DeepSeek-V3.1

Text Generation • 685B • Updated Sep 5, 2025 • 50.8k • 811

brandonbeiler/InternVL3_5-GPT-OSS-20B-A4B-Preview-FP8-Dynamic

Image-Text-to-Text • 21B • Updated Aug 30, 2025 • 100 • 2

Qwen/Qwen3-Next-80B-A3B-Thinking-FP8

Text Generation • 81B • Updated Sep 22, 2025 • 293k • 45

RedHatAI/NVIDIA-Nemotron-Nano-9B-v2-FP8-dynamic

Text Generation • 9B • Updated Oct 14, 2025 • 1.46k • 3

Qwen/Qwen3-VL-235B-A22B-Instruct-FP8

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 118k • 34

cerebras/MiniMax-M2-REAP-172B-A10B

Text Generation • 173B • Updated Nov 15, 2025 • 1.03k • 17

ai-sage/GigaChat3-10B-A1.8B

Text Generation • 11B • Updated Dec 11, 2025 • 6.19k • 57

deepseek-ai/DeepSeek-Math-V2

Text Generation • 685B • Updated Nov 27, 2025 • 2.22k • 677

jiangchengchengNLP/qwen3-4b-fp8-scaled

Updated Nov 28, 2025 • 43 • 21

Aratako/Ministral-3-14B-Instruct-2512-TextOnly

14B • Updated Dec 2, 2025 • 719 • 4

mlx-community/Ministral-3-8B-Instruct-2512

Text Generation • Updated Dec 3, 2025 • 588 • 2

cerebras/DeepSeek-V3.2-REAP-345B-A37B

Text Generation • 345B • Updated Dec 9, 2025 • 1.93k • 29

XiaomiMiMo/MiMo-V2-Flash-Base

Text Generation • 310B • Updated 25 days ago • 650 • 37

MedAIBase/AntAngelMed-FP8

103B • Updated 7 days ago • 58 • 2

openaudio/qwen3_omni_fp8_dynamic

32B • Updated 9 days ago • 579 • 2

0xSero/MiniMax-M2.1-REAP-40-REPAIR-IN-PROGRESS

Text Generation • 139B • Updated 7 days ago • 42 • 1

FriendliAI/Meta-Llama-3-8B-Instruct-fp8

Text Generation • 8B • Updated Nov 3, 2024 • 27 • 2

RedHatAI/Meta-Llama-3-8B-Instruct-FP8

Text Generation • 8B • Updated Jul 18, 2024 • 2.28k • 24

RedHatAI/Mixtral-8x7B-Instruct-v0.1-AutoFP8

Text Generation • 47B • Updated Jul 18, 2024 • 6 • 3