HoML

Models

Discover and download curated models.

DeepSeek R1

DeepSeek R1 is a family of open reasoning models with performance approaching that of leading models.

DeepSeek V2

DeepSeek V2 is a powerful open-source Mixture-of-Experts (MoE) language model from DeepSeek AI. It has 236 billion total parameters, with 21 billion activated for each token, enabling strong performance while maintaining efficiency.

DeepSeek V3

DeepSeek V3 is a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token.

GPT-OSS

OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

Gemma 3

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. The 4B, 12B, and 27B models can process both text and image inputs.

Gemma 3N

Gemma 3N is a variant of Gemma 3, optimized for efficient execution on everyday devices. Note these are not chat models.

LLaVA

LLaVA is a large multimodal model that combines a vision encoder and a language model for general-purpose visual and language understanding.

Llama 3.1

Llama 3.1 is a state-of-the-art model family from Meta, available in 8B, 70B, and 405B parameter sizes.

Llama 3.2

Meta's Llama 3.2 goes small with 1B and 3B models.

Meta Llama 3

Llama 3 is a family of large language models (LLMs) from Meta, designed for a wide array of applications and demonstrating state-of-the-art performance on various industry benchmarks.

MiniCPM-V

A series of multimodal LLMs (MLLMs) designed for vision-language understanding.

Mistral

Mistral models are a series of powerful and efficient large language models from Mistral AI. They are known for their strong performance and open-source availability.

Mixtral

Mixtral models are a series of Sparse Mixture of Experts (SMoE) models from Mistral AI. They are designed to be highly efficient, using only a fraction of their total parameters for any given token, which leads to faster inference times.
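The sparse-routing idea behind these MoE models (and the DeepSeek MoE models above) can be sketched as follows. This is a toy illustration, not any model's actual implementation: the sizes, the top-2 gate, and the per-expert weight matrices are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_EXPERTS, TOP_K = 16, 8, 2  # toy sizes, not real model dimensions

# One tiny feed-forward "expert" per slot, plus a gating (router) matrix.
W_experts = rng.standard_normal((N_EXPERTS, HIDDEN, HIDDEN)) * 0.1
W_gate = rng.standard_normal((HIDDEN, N_EXPERTS)) * 0.1

def moe_forward(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ W_gate                        # one routing score per expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the chosen experts only
    # Only TOP_K of N_EXPERTS experts run for this token,
    # so most parameters stay idle -- the source of the efficiency gain.
    return sum(w * (x @ W_experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(HIDDEN)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Because only `TOP_K / N_EXPERTS` of the expert parameters participate per token, compute per token scales with the activated parameter count (e.g. 37B of 671B for DeepSeek V3), not the total.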

Phi-2

Phi-2 is a 2.7 billion-parameter language model from Microsoft. Despite its small size, it demonstrates remarkable performance, excelling at common sense, language understanding, and logical reasoning. It was trained on 'textbook-quality' data.

Phi-3

Phi-3 is a family of lightweight, state-of-the-art open models from Microsoft. These small, cost-effective language models outperform models of similar and larger sizes on various benchmarks, and they are instruction-tuned and ready for use 'off-the-shelf'.

Phi-4

Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft.

Qwen2

Qwen2 is a series of large language models from Alibaba Cloud. They are Transformer-based models with SwiGLU activation, attention QKV bias, and group query attention. They have strong performance in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning.
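Grouped-query attention, mentioned above, lets several query heads share one key/value head, shrinking the KV cache. A toy sketch of the mechanism (head counts and dimensions are illustrative, not Qwen's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

SEQ, HEAD_DIM = 4, 8
N_Q_HEADS, N_KV_HEADS = 8, 2       # every 4 query heads share one KV head
GROUP = N_Q_HEADS // N_KV_HEADS

# Random projections standing in for the Q/K/V outputs of a real layer.
q = rng.standard_normal((N_Q_HEADS, SEQ, HEAD_DIM))
k = rng.standard_normal((N_KV_HEADS, SEQ, HEAD_DIM))
v = rng.standard_normal((N_KV_HEADS, SEQ, HEAD_DIM))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

outputs = []
for h in range(N_Q_HEADS):
    kv = h // GROUP                # which shared KV head this query head uses
    scores = q[h] @ k[kv].T / np.sqrt(HEAD_DIM)
    outputs.append(softmax(scores) @ v[kv])
out = np.stack(outputs)            # shape: (N_Q_HEADS, SEQ, HEAD_DIM)
```

With only `N_KV_HEADS` key/value heads cached instead of `N_Q_HEADS`, the KV cache shrinks by the group factor while attention quality stays close to full multi-head attention.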

Qwen3

Qwen3 is the latest series of large language models from Alibaba Cloud, succeeding Qwen2. The models offer strong performance in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning.