LLaVA
LLaVA (Large Language and Vision Assistant) is a large multimodal model that combines a vision encoder with a large language model for general-purpose visual and language understanding.
llava-1.5-7b
7B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:1.5-7b
```
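Once the model is pulled, a multimodal request pairs a text prompt with an image attachment in the OpenAI-style chat format. The sketch below only builds such a request payload; the endpoint path, server address, and the assumption that HoML exposes an OpenAI-compatible API are illustrative and not documented here — check your HoML setup for the actual serving details.

```python
def build_vision_request(model: str, prompt: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload that pairs text with an image.

    This mirrors the widely used multi-part "content" format for vision
    models; whether HoML accepts exactly this shape is an assumption.
    """
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

payload = build_vision_request(
    "llava:1.5-7b",
    "What is in this image?",
    "https://example.com/photo.jpg",  # placeholder image URL
)
# POST this payload to the server's chat-completions endpoint
# (e.g. with requests.post), assuming an OpenAI-compatible API.
print(payload["model"])
```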
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 14 GB | 14 GB |
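The figures in these tables follow directly from the precision: BF16 stores each parameter in 2 bytes, so model weights occupy roughly 2 GB per billion parameters, on disk and in GPU memory alike. A minimal sketch of that arithmetic (note it counts weights only; actual serving also needs memory for the KV cache and activations):

```python
def bf16_weight_gb(params_billion: float) -> float:
    # BF16 = 2 bytes per parameter; using 1 GB = 1e9 bytes to match
    # the rounded figures in the tables on this page.
    return params_billion * 2

print(bf16_weight_gb(7))   # 14.0 GB, as listed for the 7B variants
print(bf16_weight_gb(13))  # 26.0 GB for the 13B variants
print(bf16_weight_gb(34))  # 68.0 GB for llava-v1.6-34b
```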
llava-1.5-13b
13B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:1.5-13b
```
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 26 GB | 26 GB |
llava-v1.6-mistral-7b
7B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:v1.6-mistral-7b
```
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 14 GB | 14 GB |
llava-v1.6-vicuna-7b
7B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:v1.6-vicuna-7b
```
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 14 GB | 14 GB |
llava-v1.6-vicuna-13b
13B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:v1.6-vicuna-13b
```
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 26 GB | 26 GB |
llava-v1.6-34b
34B parameters
Multimodal
Pull this model
Use the following command with the HoML CLI:
```shell
homl pull llava:v1.6-34b
```
Resource Requirements
| Quantization | Disk Space | GPU Memory |
|---|---|---|
| BF16 | 68 GB | 68 GB |