Hugging Face GGUF Models
Browse, download, and import GGUF models from Hugging Face
Overview
OllaMan supports importing GGUF models from Hugging Face, which hosts a large collection of quantized model files. This gives you access to far more models than the official Ollama registry alone.
There are two ways to get a Hugging Face GGUF model into OllaMan:
1. GGUF Marketplace
Browse and search a catalog of Hugging Face GGUF models, then download in one click
2. Manual URL Import
Paste a Hugging Face model path or file link into the Downloads page
GGUF Model Marketplace
The GGUF page is a dedicated view for browsing Hugging Face GGUF models.
Open the GGUF Page
Click GGUF in the sidebar.

Browse and Search
Find models in several ways:
- Search: type a model name, source, or quant format (e.g. qwen, Q4_K_M)
- Filter by Capability: narrow results to Text, Vision, Code, or Audio
- Filter by Model Type: show only specific families (Llama, Mistral, Qwen, etc.)
- Sort: by Most Downloaded, Most Liked, or Newest
More results load automatically as you scroll.
Read Model Cards
Each card shows the essentials at a glance: quant format, file size, download and like counts, and the GPU / minimum RAM it needs. Click any card to open its detail page, where you'll find the full model description, license, parameter size, and every GGUF variant the repository offers.

Download a Model
Open the Model Detail
Click a model card to open its detail page.
Pick a GGUF Variant
In the GGUF Variants list, choose the file that matches your hardware (see Choosing a Quantization below).
Click Download
Click Download next to the variant. Progress appears in the download manager at the bottom-right, just like any other model. When it finishes, the model shows up in Installed (Local Models).
Manual URL Import
If you already have a specific Hugging Face model in mind, import it directly from the Downloads page.
Open the Downloads Page
Click Downloads in the sidebar.
Click "Pull Model"
Click the Pull Model button at the top-right corner.

Enter the Model Reference
The input accepts several Hugging Face formats in addition to standard Ollama names:
| What you enter | Example | Result |
|---|---|---|
| Short model path | hf.co/user/repo | Downloads the model's default GGUF |
| Short path + quant | hf.co/user/repo:Q8_0 | Downloads the specified quantization |
| Link to a .gguf file | https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf | Downloads that specific file |
| Full filename as tag | hf.co/user/repo:model-Q4_K_M.gguf | Downloads that specific file |
A hint under the input shows the detected format so you can confirm before pulling.
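To make the four formats in the table concrete, here is a minimal Python sketch of how an input string could be sorted into them. It is not OllaMan's implementation: the function name `classify_reference`, the regular expression, and the returned labels are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# hf.co/user/repo with an optional :tag (a quant label or a .gguf filename).
# This pattern is an assumption; OllaMan's real parsing rules may differ.
SHORT_FORM = re.compile(r"^hf\.co/([^/\s:]+)/([^/\s:]+)(?::(\S+))?$")


def classify_reference(text: str) -> str:
    """Return a hypothetical label describing the shape of the input."""
    text = text.strip()
    short = SHORT_FORM.match(text)
    if short:
        tag = short.group(3)
        if tag is None:
            return "short path"
        if tag.lower().endswith(".gguf"):
            return "filename tag"
        return "short path + quant"
    if text.startswith(("http://", "https://")):
        # Only the URL path is inspected here; host checks are a separate concern.
        if urlsplit(text).path.lower().endswith(".gguf"):
            return "file link"
        return "other link"
    # Anything else is treated as a standard Ollama model name.
    return "ollama name"


for example in (
    "hf.co/user/repo",
    "hf.co/user/repo:Q8_0",
    "https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf",
    "hf.co/user/repo:model-Q4_K_M.gguf",
):
    print(f"{example} -> {classify_reference(example)}")
```

The loop runs the four example inputs from the table through the classifier, one per row.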
Click Pull
Click Pull. The model downloads and appears in Installed (Local Models) when complete.
Only Hugging Face links are supported
Links must point to Hugging Face — huggingface.co, hf-mirror.com, or the hf.co/user/repo short form. Links to other hosts cannot be pulled and will be rejected with a clear error.
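If you generate import links in a script, you can check them against this host rule before pasting. The sketch below assumes the accepted hosts are exactly the ones named above; the function name `is_hugging_face_reference` is hypothetical, and OllaMan's own validation may be stricter or looser.

```python
from urllib.parse import urlsplit

# Hosts named in the note above. Treating this set as exhaustive is an assumption.
ALLOWED_HOSTS = {"huggingface.co", "hf-mirror.com"}


def is_hugging_face_reference(text: str) -> bool:
    """Return True if the input uses the hf.co short form or an allowed host."""
    text = text.strip()
    if text.startswith("hf.co/"):
        return True
    # urlsplit returns hostname=None when the text has no scheme and host.
    host = urlsplit(text).hostname
    return host is not None and host in ALLOWED_HOSTS


print(is_hugging_face_reference("https://hf-mirror.com/user/repo/resolve/main/model-Q4_K_M.gguf"))
print(is_hugging_face_reference("https://example.com/model.gguf"))
```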
Choosing a Quantization
When a repository offers several GGUF files, pick the one that fits your hardware:
- Q8_0 — highest quality, largest file
- Q6_K / Q5_K — excellent quality, good compression
- Q4_K_M / Q4_0 — the most popular balance of size and quality
- Q3 / Q2 — smallest, with reduced accuracy
For most users Q4_K_M is the sweet spot. Choose Q8_0 if you have plenty of RAM and want maximum fidelity.
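One rough way to apply this guidance is to take the file sizes shown in the GGUF Variants list and pick the largest file that still fits in memory. The sketch below assumes that a larger file in the same repository means higher quality, and it uses a made-up 1.2x headroom factor for runtime overhead. The function name, the headroom value, and the example sizes are all illustrative, not measured.

```python
from typing import Optional


def pick_variant(variants: dict[str, float], free_ram_gb: float,
                 headroom: float = 1.2) -> Optional[str]:
    """Return the largest variant whose size times headroom fits in free RAM.

    variants maps a quant name to its file size in GB. headroom is an
    assumed safety margin, not a figure published by OllaMan or Ollama.
    """
    fitting = {name: size for name, size in variants.items()
               if size * headroom <= free_ram_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


# Made-up sizes for illustration only; read real ones from the variant list.
example = {"Q8_0": 8.5, "Q6_K": 6.6, "Q4_K_M": 4.9, "Q2_K": 3.2}
print(pick_variant(example, free_ram_gb=8.0))
```

With these example numbers and 8 GB free, the sketch picks Q6_K, because Q8_0 no longer fits once the headroom is applied.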
Hugging Face Mirror Settings
If the official Hugging Face site is slow or blocked in your region, switch to a mirror or add your own.
Open Settings
Go to Settings → Hugging Face.

Choose an Access Source
- Auto select (default) — OllaMan tries the official Hugging Face first and automatically falls back to a mirror when it's unreachable. Best for most users.
- Use selected source — always use the endpoint you pick below. Best when you already know which mirror works on your network.
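The Auto select behavior can be pictured as a simple ordered fallback. The sketch below is only a model of that idea, not OllaMan's code. It assumes the two built-in endpoint URLs, and the function name `choose_endpoint` is hypothetical. The reachability check is passed in as a callable, so the example runs without network access.

```python
from typing import Callable, Optional, Sequence

OFFICIAL = "https://huggingface.co"
MIRROR = "https://hf-mirror.com"


def choose_endpoint(is_reachable: Callable[[str], bool],
                    candidates: Sequence[str] = (OFFICIAL, MIRROR)) -> Optional[str]:
    """Return the first reachable endpoint, trying the official site first."""
    for endpoint in candidates:
        if is_reachable(endpoint):
            return endpoint
    return None  # nothing reachable; the caller decides how to report it


# Simulate a network where only the mirror responds.
print(choose_endpoint(lambda url: url == MIRROR))
```

In a real client, `is_reachable` might be a short HTTP request with a timeout. How OllaMan actually probes endpoints is not described here.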
Pick or Add an Endpoint
Two sources are built in:
- Hugging Face: https://huggingface.co
- HF-Mirror: https://hf-mirror.com
Click Test next to any endpoint to check connectivity and latency. To add your own mirror, enter a Name and Endpoint URL under Custom source, then click Add source.
What does the selected source affect?
The chosen source is used when browsing the GGUF marketplace and downloading models from it. When you import a model by URL on the Downloads page, the host is taken from the link you paste — so you can point a single import at a mirror by using an hf-mirror.com link, independent of this setting.
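For example, a file link can be pointed at the mirror by swapping only its host. The sketch below assumes the mirror serves files under the same path layout as the official site, which you should confirm for your mirror. The helper name `with_host` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def with_host(url: str, host: str) -> str:
    """Return url with its host replaced and everything else unchanged."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=host))


official = "https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf"
print(with_host(official, "hf-mirror.com"))
```

Pasting the resulting link into the Downloads page would route that single import through the mirror, whatever source is selected in Settings.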
When to Use a Mirror
- The official site is slow or times out during downloads
- Hugging Face is blocked or unreliable on your network
- Your organization runs an internal mirror
Restricted (Gated) Models
Some repositories restrict access to their model files. When you try to pull one, OllaMan shows a notice that the model requires authorization.
To pull a gated or private model:
- Open the model on huggingface.co, accept any terms, and request access
- Add your Ollama public key (~/.ollama/id_ed25519.pub) to your Hugging Face SSH Keys
- Click I configured access, continue pull in OllaMan
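To copy the key for the second step, you can print the contents of the public key file. The sketch below assumes the file is in the usual OpenSSH one-line format beginning with `ssh-ed25519`; that check and the function name `read_public_key` are illustrative assumptions. Only the .pub file is read, never the private key.

```python
from pathlib import Path

# Location given in the steps above.
DEFAULT_KEY = Path.home() / ".ollama" / "id_ed25519.pub"


def read_public_key(path: Path = DEFAULT_KEY) -> str:
    """Return the public key line, stripped of surrounding whitespace."""
    text = path.read_text(encoding="utf-8").strip()
    # Assumed format check: OpenSSH ed25519 public keys start with this prefix.
    if not text.startswith("ssh-ed25519 "):
        raise ValueError(f"{path} does not look like an ed25519 public key")
    return text


if __name__ == "__main__":
    if DEFAULT_KEY.exists():
        print(read_public_key())
    else:
        print(f"No key found at {DEFAULT_KEY}")
```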