Install KoboldCpp & connect to Aihorde.net

Started by Hamuki, Mar 03, 2026, 08:48 AM

Hamuki

KoboldCpp Installation Guide + AiHorde.net connection
Windows • Local LLM (KoboldCpp) • Connect as an AiHorde worker




In this guide, I'll show you how to install KoboldCpp as your local LLM software and connect it to the AiHorde.net distributed computing network.


Quote
If you get stuck, ask in the official Discords:
AiHorde: https://discord.gg/93JgzVkBV9 
KoboldCpp: https://koboldai.org/discord



Requirements
  • A computer (modern GPU recommended, but not required)
  • Internet connection
  • ~15 minutes



1) Install KoboldCpp
  • Go to the official GitHub releases page and download the latest release
    Download koboldcpp.exe (under Assets at the bottom of the latest release).
  • Move the .exe into a folder you want to keep (Desktop is fine).
  • Run koboldcpp.exe
    On first launch Windows may warn about "untrusted software" because it's not signed/registered with Microsoft.
  • Allow it to run. A CMD window opens, and after ~10–30 seconds the KoboldCpp interface appears.
  • Done! KoboldCpp is installed.



2) Configure KoboldCpp (model + hardware)

We'll configure hardware + choose a model first, then add the AiHorde settings.




Quick Launch Window
  • Backend 
    Kobold will usually auto-pick the best option (e.g. CUDA for NVIDIA GPUs).
  • Recommended toggles
    • Launch Browser = Opens a web UI to chat with the model locally.
    • ContextShift = Reduces reprocessing (recommended).
    • Use FlashAttention = Performance boost for GGUF models (recommended).
    • Force AutoFit = Enable if the model fits fully in VRAM. If unsure, leave unchecked.
  • Context Size 
    Higher = better long-context output, but uses more memory and can reduce tokens/sec. 
    If VRAM runs out, Kobold can offload to RAM (slower).
  • GGUF Text Model 
    If you already downloaded a model, click Browse and select it. 
    If not, click HF Search to find a model on HuggingFace. 

    For testing, I recommend: 
    Llama 3.2 1B Instruct (GGUF)

    Tip: store your models in the same folder as koboldcpp.exe, or create a dedicated "Models" folder.
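If you'd rather download the GGUF file directly instead of using HF Search, HuggingFace serves repository files at a predictable "resolve" URL. A small sketch; the repo and file names in the comment are illustrative placeholders, not guaranteed to match the model above:

```python
def hf_gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for a file hosted in a HuggingFace repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical example (check the actual repo/file names on huggingface.co):
#   hf_gguf_url("some-org/Llama-3.2-1B-Instruct-GGUF",
#               "Llama-3.2-1B-Instruct-Q4_K_M.gguf")
```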




Hardware Window
Here you can fine-tune performance.
  • Batch Size: 512
  • Launch Browser: On (only if you want local web UI)
  • High Priority: Only if you are CPU-only
  • Force AutoFit: If model fits fully in VRAM
  • Use FlashAttention: On for GGUF models
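The same GUI settings can also be passed as command-line flags, which is handy for scripts or shortcuts. This sketch only assembles the argument list; the flag names are my best guess for recent KoboldCpp builds, so confirm them with `koboldcpp.exe --help` before relying on this:

```python
def build_args(model_path: str, context_size: int = 4096,
               gpu_layers: int = 999, flash_attention: bool = True) -> list[str]:
    """Assemble a KoboldCpp command line mirroring the GUI settings above.

    Flag names are assumptions; verify against `koboldcpp.exe --help`.
    """
    args = ["koboldcpp.exe",
            "--model", model_path,
            "--contextsize", str(context_size),
            "--gpulayers", str(gpu_layers)]
    if flash_attention:
        args.append("--flashattention")
    return args

# To actually launch (untested here):
#   import subprocess
#   subprocess.run(build_args("Llama-3.2-1B-Instruct-Q4_K_M.gguf"))
```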




3) Configure AiHorde details

This is where you enter your AiHorde worker info.


Quote
If you don't have an AiHorde account yet, register here: https://aihorde.net/register

  • Model Name: If it doesn't show up automatically, type it manually.
  • Gen. Length: Max tokens per request. 
    I usually use 4096, but 1024 is a good starting point.
  • Max Context: Set to 0 (it will use the context size from your settings above).
  • API Key: Your AiHorde API key.
  • Worker Name: Choose a unique name.
  • VERY IMPORTANT: Save your config before clicking Launch. 
    If you launch without saving, you may have to re-enter everything.

    Tip: Make a separate config file per model/use-case, so you can quickly switch later via Load Config.
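Since saved configs are plain JSON files (in my experience, .kcpps files can be opened in any text editor), you can also generate or version-control one per model. A sketch; the field names below are illustrative, not the exact .kcpps schema, so inspect a file you saved from the GUI for the real keys:

```python
import json
from pathlib import Path

def save_config(path: str, settings: dict) -> None:
    """Write a settings dict as a JSON config file (one file per model/use-case)."""
    Path(path).write_text(json.dumps(settings, indent=2))

def load_config(path: str) -> dict:
    """Read a previously saved JSON config back into a dict."""
    return json.loads(Path(path).read_text())

# Illustrative field names only; check a GUI-saved .kcpps for the real schema:
settings = {
    "model": "Llama-3.2-1B-Instruct-Q4_K_M.gguf",
    "contextsize": 4096,
    "hordekey": "YOUR_API_KEY",
    "hordeworkername": "my-unique-worker",
}
```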




4) Launch (you're done!)

Click Launch. The model will load into your GPU (or RAM if CPU-only). 
Once loaded, it will connect to AiHorde and start accepting requests.


  • New workers can take a few minutes to receive jobs.
  • If you see no work after ~10 minutes, your model might not be in demand; try another model.
  • Check what's popular here: currently active models
  • Download models via the built-in downloader or directly from: Huggingface.co
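To check demand programmatically before picking a model, the AiHorde API publishes model stats at /v2/status/models. The endpoint path is taken from the AiHorde API docs, but the response field names here (`name`, `queued`) are my reading of them and may differ; the ranking helper is separate from the network call so you can test it offline:

```python
import json
import urllib.request

def rank_models(models: list[dict], top: int = 5) -> list[tuple[str, int]]:
    """Sort models by their queued workload, most in-demand first."""
    ranked = sorted(models, key=lambda m: m.get("queued", 0), reverse=True)
    return [(m["name"], m.get("queued", 0)) for m in ranked[:top]]

def fetch_text_models() -> list[dict]:
    """Fetch currently active text models from AiHorde (needs network)."""
    url = "https://aihorde.net/api/v2/status/models?type=text"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Usage (needs network):
#   for name, queued in rank_models(fetch_text_models()):
#       print(name, queued)
```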



Questions? Post them below and I'll help if I can.

/Hamuki

henk717

Awesome, thanks for posting! If people have questions I'll do my best to answer them.

Hamuki

Quote from: henk717 on Mar 03, 2026, 01:10 PM
Awesome, thanks for posting! If people have questions I'll do my best to answer them.

Highly appreciated! :D