HoML

Models

Discover and download curated models.

DeepSeek R1

DeepSeek R1 is a family of open reasoning models with performance approaching that of leading models.

DeepSeek V2

DeepSeek V2 is a powerful open-source Mixture-of-Experts (MoE) language model from DeepSeek AI. It has 236 billion total parameters, with 21 billion activated for each token, enabling strong performance while maintaining efficiency.

DeepSeek V3

DeepSeek V3 is a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

GPT-OSS

OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

Gemma 3

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. The 4B, 12B, and 27B models can process both text and image inputs.

Gemma 3N

Gemma 3N is a variant of Gemma 3, optimized for efficient execution on everyday devices. Note these are not chat models.

LLaVA

LLaVA is a large multimodal model that combines a vision encoder and a language model for general-purpose visual and language understanding.

Llama 3.1

Llama 3.1 is a state-of-the-art model family from Meta, available in 8B, 70B, and 405B parameter sizes.

Llama 3.2

Meta's Llama 3.2 goes small with 1B and 3B models.

Meta Llama 3

Llama 3 is a family of large language models (LLMs) from Meta, designed for a wide array of applications and demonstrating state-of-the-art performance on various industry benchmarks.

MiniCPM-V

A series of multimodal LLMs (MLLMs) designed for vision-language understanding.

Mistral

Mistral models are a series of powerful and efficient large language models from Mistral AI. They are known for their strong performance and open-source availability.

Mixtral

Mixtral models are a series of Sparse Mixture of Experts (SMoE) models from Mistral AI. They are designed to be highly efficient, using only a fraction of their total parameters for any given token, which leads to faster inference times.
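The sparse-routing idea behind these MoE models (and the DeepSeek MoE models above) can be sketched as follows. This is a toy illustration, not any model's actual implementation: the sizes, the top-2 gate, and the per-expert weight matrices are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_EXPERTS, TOP_K = 16, 8, 2  # toy sizes, not real model dimensions

# One tiny feed-forward "expert" per slot, plus a gating (router) matrix.
W_experts = rng.standard_normal((N_EXPERTS, HIDDEN, HIDDEN)) * 0.1
W_gate = rng.standard_normal((HIDDEN, N_EXPERTS)) * 0.1

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ W_gate                        # one routing score per expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the chosen experts only
    # Only TOP_K of N_EXPERTS experts run for this token,
    # so most parameters stay idle -- the source of the efficiency gain.
    return sum(w * (x @ W_experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(HIDDEN)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Because only `TOP_K / N_EXPERTS` of the expert parameters participate per token, compute per token scales with the activated parameter count (e.g. 37B of 671B for DeepSeek V3), not the total.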

Phi-2

Phi-2 is a 2.7 billion-parameter language model from Microsoft. Despite its small size, it demonstrates remarkable performance, excelling at common sense, language understanding, and logical reasoning. It was trained on 'textbook-quality' data.

Phi-3

Phi-3 is a family of lightweight, state-of-the-art open models from Microsoft. These small, cost-effective language models outperform models of similar and larger sizes on various benchmarks, and they are instruction-tuned and ready for use 'off-the-shelf'.

Phi-4

Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft.

Qwen2

Qwen2 is a series of large language models from Alibaba Cloud. They are Transformer-based models with SwiGLU activation, attention QKV bias, and group query attention. They have strong performance in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning.
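Grouped-query attention, mentioned above, lets several query heads share one key/value head, shrinking the KV cache. A toy sketch of the mechanism (head counts and dimensions are illustrative, not Qwen's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

SEQ, HEAD_DIM = 4, 8
N_Q_HEADS, N_KV_HEADS = 8, 2       # every 4 query heads share one KV head
GROUP = N_Q_HEADS // N_KV_HEADS

# Random projections standing in for the Q/K/V outputs of a real layer.
q = rng.standard_normal((N_Q_HEADS, SEQ, HEAD_DIM))
k = rng.standard_normal((N_KV_HEADS, SEQ, HEAD_DIM))
v = rng.standard_normal((N_KV_HEADS, SEQ, HEAD_DIM))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

outputs = []
for h in range(N_Q_HEADS):
    kv = h // GROUP                # which shared KV head this query head uses
    scores = q[h] @ k[kv].T / np.sqrt(HEAD_DIM)
    outputs.append(softmax(scores) @ v[kv])
out = np.stack(outputs)            # shape: (N_Q_HEADS, SEQ, HEAD_DIM)
```

With only `N_KV_HEADS` key/value heads cached instead of `N_Q_HEADS`, the KV cache shrinks by the group factor while attention quality stays close to full multi-head attention.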

Qwen3

Qwen3 is the latest series of large language models from Alibaba Cloud, succeeding Qwen2. The models offer strong performance in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning.