Hugging Face GGUF Models

Overview

OllaMan supports importing GGUF models from Hugging Face — a large open repository of quantized model files. This gives you access to far more models than the official Ollama registry alone.

There are two ways to get a Hugging Face GGUF model into OllaMan:

1. GGUF Marketplace: browse and search a catalog of Hugging Face GGUF models, then download in one click.

2. Manual URL Import: paste a Hugging Face model path or file link into the Downloads page.

GGUF Model Marketplace

The GGUF page is a dedicated view for browsing Hugging Face GGUF models.

Open the GGUF Page

Click GGUF in the sidebar.

Browse and Search

Find models in several ways:

Search: type a model name, source, or quant format (e.g. qwen, Q4_K_M)
Filter by Capability: narrow results to Text, Vision, Code, or Audio
Filter by Model Type: show only specific families (Llama, Mistral, Qwen, etc.)
Sort: by Most Downloaded, Most Liked, or Newest

More results load automatically as you scroll.

Each card shows the essentials at a glance: quant format, file size, download and like counts, and the GPU / minimum RAM it needs. Click any card to open its detail page, where you'll find the full model description, license, parameter size, and every GGUF variant the repository offers.

Download a Model

Open the Model Detail

Click a model card to open its detail page.

Pick a GGUF Variant

In the GGUF Variants list, choose the file that matches your hardware (see Choosing a Quantization below).

Click Download

Click Download next to the variant. Progress appears in the download manager at the bottom-right, just like any other model. When it finishes, the model shows up in Installed (Local Models).

Manual URL Import

If you already have a specific Hugging Face model in mind, import it directly from the Downloads page.

Open the Downloads Page

Click Downloads in the sidebar.

Click "Pull Model"

Click the Pull Model button at the top-right corner.

Enter the Model Reference

The input accepts several Hugging Face formats in addition to standard Ollama names:

| What you enter | Example | Result |
| --- | --- | --- |
| Short model path | `hf.co/user/repo` | Downloads the model's default GGUF |
| Short path + quant | `hf.co/user/repo:Q8_0` | Downloads the specified quantization |
| Link to a .gguf file | `https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf` | Downloads that specific file |
| Full filename as tag | `hf.co/user/repo:model-Q4_K_M.gguf` | Downloads that specific file |

A hint under the input shows the detected format so you can confirm before pulling.
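
To make the accepted formats concrete, here is a minimal Python sketch of how a reference string could be sorted into the categories listed above. It is not OllaMan's actual detection logic: the function name `classify_reference` and the returned labels are hypothetical, and the sketch only inspects the shape of the string, not whether the host is allowed or the repository exists.

```python
import re

# Hypothetical pattern for the short form: hf.co/<user>/<repo>[:<tag>]
SHORT_FORM = re.compile(r"hf\.co/([^/\s:]+)/([^/\s:]+)(?::(\S+))?")


def classify_reference(ref: str) -> str:
    """Return an illustrative label for the kind of reference entered."""
    ref = ref.strip()

    # Full links: only the file extension is checked here.
    if ref.startswith(("http://", "https://")):
        path = ref.split("?", 1)[0]
        return "link to .gguf file" if path.lower().endswith(".gguf") else "other link"

    match = SHORT_FORM.fullmatch(ref)
    if match:
        tag = match.group(3)
        if tag is None:
            return "short model path"
        if tag.lower().endswith(".gguf"):
            return "full filename as tag"
        return "short path + quant"

    # Anything else is treated as a standard Ollama model name.
    return "ollama model name"


if __name__ == "__main__":
    for example in (
        "hf.co/user/repo",
        "hf.co/user/repo:Q8_0",
        "https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf",
        "hf.co/user/repo:model-Q4_K_M.gguf",
    ):
        print(example, "->", classify_reference(example))
```

The short form is matched before the fallback so that a tag ending in `.gguf` is read as a filename rather than as a quantization label.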

Click Pull

Click Pull. The model downloads and appears in Installed (Local Models) when complete.

Only Hugging Face links are supported

Links must point to Hugging Face — huggingface.co, hf-mirror.com, or the hf.co/user/repo short form. Links to other hosts cannot be pulled and will be rejected with a clear error.
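
The host rule can be illustrated with a short Python check. This is a sketch under stated assumptions rather than OllaMan's own validation: the helper name `is_supported_link` is hypothetical, and whether custom mirrors added in Settings are also accepted for URL imports is not stated here, so they are not modeled.

```python
from urllib.parse import urlparse

# Hosts named in the note above for full links.
FULL_LINK_HOSTS = {"huggingface.co", "hf-mirror.com"}


def is_supported_link(ref: str) -> bool:
    """Return True if the reference points at a supported Hugging Face host."""
    ref = ref.strip()

    # The hf.co/user/repo short form has no scheme, so handle it first.
    if ref.startswith("hf.co/"):
        return True

    # For full links, compare the exact hostname (no substring matching,
    # so "huggingface.co.example.com" is not accepted).
    host = urlparse(ref).hostname
    return host is not None and host.lower() in FULL_LINK_HOSTS


if __name__ == "__main__":
    print(is_supported_link("hf.co/user/repo:Q8_0"))
    print(is_supported_link("https://example.com/model-Q4_K_M.gguf"))
```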

Choosing a Quantization

When a repository offers several GGUF files, pick the one that fits your hardware:

Q8_0 — highest quality, largest file
Q6_K / Q5_K — excellent quality, good compression
Q4_K_M / Q4_0 — the most popular balance of size and quality
Q3 / Q2 — smallest, with reduced accuracy

For most users Q4_K_M is the sweet spot. Choose Q8_0 if you have plenty of RAM and want maximum fidelity.
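
As a rough illustration of the trade-off, the Python sketch below picks the highest-bit variant whose file fits a memory budget. Everything in it is an assumption made for the example: the function names, the file sizes, and the 1.2x headroom factor (a model needs some memory beyond its file size, but the exact amount depends on context length and runtime). Treat it as a rule of thumb, not as how OllaMan computes its hardware hints.

```python
import re
from typing import Dict, Optional


def quant_bits(label: str) -> int:
    """Extract the bit width from a label such as 'Q4_K_M' (0 if unknown)."""
    match = re.match(r"Q(\d+)", label.upper())
    return int(match.group(1)) if match else 0


def pick_variant(
    variants: Dict[str, float], ram_gb: float, headroom: float = 1.2
) -> Optional[str]:
    """Pick the highest-bit variant whose size * headroom fits in ram_gb.

    variants maps a quant label to its file size in GB.
    Returns None when nothing fits.
    """
    fitting = [
        (label, size_gb)
        for label, size_gb in variants.items()
        if size_gb * headroom <= ram_gb
    ]
    if not fitting:
        return None
    # Prefer more bits; break ties by the larger file.
    return max(fitting, key=lambda item: (quant_bits(item[0]), item[1]))[0]


if __name__ == "__main__":
    # Hypothetical sizes for illustration only.
    sizes = {"Q8_0": 8.5, "Q6_K": 6.6, "Q4_K_M": 4.9, "Q2_K": 3.2}
    print(pick_variant(sizes, ram_gb=6))
```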

Hugging Face Mirror Settings

If the official Hugging Face site is slow or blocked in your region, switch to a mirror or add your own.

Open Settings

Go to Settings → Hugging Face.

Choose an Access Source

Auto select (default) — OllaMan tries the official Hugging Face first and automatically falls back to a mirror when it's unreachable. Best for most users.
Use selected source — always use the endpoint you pick below. Best when you already know which mirror works on your network.
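
The difference between the two modes can be sketched in a few lines of Python. This is an illustration of the described behavior, not OllaMan's implementation: the function name `choose_endpoint`, the mode strings, and the `is_reachable` callback are hypothetical, and what the app does when no endpoint is reachable is not documented, so the sketch simply returns `None`. The two URLs are the built-in sources.

```python
from typing import Callable, Optional, Sequence

OFFICIAL = "https://huggingface.co"
MIRRORS = ("https://hf-mirror.com",)


def choose_endpoint(
    mode: str,
    selected: Optional[str],
    is_reachable: Callable[[str], bool],
    official: str = OFFICIAL,
    mirrors: Sequence[str] = MIRRORS,
) -> Optional[str]:
    """Return the endpoint to use for the given access-source mode."""
    if mode == "selected":
        # "Use selected source": always use the chosen endpoint, no fallback.
        return selected

    # "Auto select": try the official site first, then each mirror in order.
    for endpoint in (official, *mirrors):
        if is_reachable(endpoint):
            return endpoint
    return None  # Assumption: nothing reachable.


if __name__ == "__main__":
    # Simulate a network where only the mirror responds.
    only_mirror = lambda endpoint: endpoint == "https://hf-mirror.com"
    print(choose_endpoint("auto", None, only_mirror))
```

Passing the reachability check in as a callback keeps the sketch free of network calls, so it can be exercised with simulated conditions.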

Pick or Add an Endpoint

Two sources are built in:

Hugging Face — https://huggingface.co
HF-Mirror — https://hf-mirror.com

Click Test next to any endpoint to check connectivity and latency. To add your own mirror, enter a Name and Endpoint URL under Custom source, then click Add source.

What does the selected source affect?

The chosen source is used when browsing the GGUF marketplace and downloading models from it. When you import a model by URL on the Downloads page, the host is taken from the link you paste — so you can point a single import at a mirror by using an hf-mirror.com link, independent of this setting.
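
For a one-off import through a mirror, the link can be rewritten by swapping only its host. The Python sketch below shows one way to do that; the helper name `with_host` is hypothetical, and it assumes the mirror serves files under the same path layout as huggingface.co.

```python
from urllib.parse import urlsplit, urlunsplit


def with_host(url: str, host: str = "hf-mirror.com") -> str:
    """Return url with its host replaced, keeping scheme, path, and query."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=host))


if __name__ == "__main__":
    original = "https://huggingface.co/user/repo/resolve/main/model-Q4_K_M.gguf"
    print(with_host(original))
```

Pasting the rewritten link into the Pull Model input would then route that single download through the mirror, whatever source is selected in Settings.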

When to Use a Mirror

The official site is slow or times out during downloads
Hugging Face is blocked or unreliable on your network
Your organization runs an internal mirror

Restricted (Gated) Models

Some repositories restrict access to their model files. When you try to pull one, OllaMan shows a notice that the model requires authorization.

To pull a gated or private model:

1. Open the model on huggingface.co, accept any terms, and request access.
2. Add your Ollama public key (~/.ollama/id_ed25519.pub) to the SSH keys in your Hugging Face account settings.
3. In OllaMan, click "I configured access, continue pull".

Troubleshooting

Link rejected as unsupported
Download is very slow or stalls
Model requires authorization
Wrong quantization downloaded
No models appear in the marketplace

Next Steps

Browse Ollama Models
Manual Installation
Start Chatting