Install KoboldCpp & connect to Aihorde.net

Started by Hamuki, Mar 03, 2026, 08:48 AM

Hamuki

KoboldCpp Installation Guide + AiHorde.net connection
Windows • Local LLM (KoboldCpp) • Connect as an AiHorde worker




In this guide, I'll show you how to install KoboldCpp as your local LLM software and connect it to the AiHorde.net distributed computing network.


Quote
If you get stuck, ask in the official Discords:
AiHorde: https://discord.gg/93JgzVkBV9 
KoboldCpp: https://koboldai.org/discord



Requirements
  • A computer (modern GPU recommended, but not required)
  • Internet connection
  • ~15 minutes



1) Install KoboldCpp
  • Go to the official GitHub releases page and download the latest release
    Download koboldcpp.exe (under Assets at the bottom of the latest release).
  • Move the .exe into a folder you want to keep (Desktop is fine).
  • Run koboldcpp.exe
    On first launch Windows may warn about "untrusted software" because it's not signed/registered with Microsoft.
  • Allow it to run. A CMD window opens, and after ~10–30 seconds the KoboldCpp interface appears.
  • Done! KoboldCpp is installed.



2) Configure KoboldCpp (model + hardware)

We'll configure hardware + choose a model first, then add the AiHorde settings.




Quick Launch Window
  • Backend 
    Kobold will usually auto-pick the best option (e.g. CUDA for NVIDIA GPUs).
  • Recommended toggles
    • Launch Browser = Opens a web UI to chat with the model locally.
    • ContextShift = Reduces reprocessing (recommended).
    • Use FlashAttention = Performance boost for GGUF models (recommended).
    • Force AutoFit = Enable if the model fits fully in VRAM. If unsure, leave unchecked.
  • Context Size 
    Higher = better long-context output, but uses more memory and can reduce tokens/sec. 
    If VRAM runs out, Kobold can offload to RAM (slower).
  • GGUF Text Model 
    If you already downloaded a model, click Browse and select it. 
    If not, click HF Search to find a model on HuggingFace. 

    For testing, I recommend: 
    Llama 3.2 1B Instruct (GGUF)

    Tip: store your models in the same folder as koboldcpp.exe, or create a dedicated "Models" folder.
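If you'd rather download the GGUF file directly instead of using HF Search, HuggingFace serves repository files at a predictable "resolve" URL. A small sketch; the repo and file names in the comment are illustrative placeholders, not guaranteed to match the model above:

```python
def hf_gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for a file hosted in a HuggingFace repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical example (check the actual repo/file names on huggingface.co):
#   hf_gguf_url("some-org/Llama-3.2-1B-Instruct-GGUF",
#               "Llama-3.2-1B-Instruct-Q4_K_M.gguf")
```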




Hardware Window
Here you can fine-tune performance.
  • Batch Size: 512
  • Launch Browser: On (only if you want local web UI)
  • High Priority: Only if you are CPU-only
  • Force AutoFit: If model fits fully in VRAM
  • Use FlashAttention: On for GGUF models
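The same GUI settings can also be passed as command-line flags, which is handy for scripts or shortcuts. This sketch only assembles the argument list; the flag names are my best guess for recent KoboldCpp builds, so confirm them with `koboldcpp.exe --help` before relying on this:

```python
def build_args(model_path: str, context_size: int = 4096,
               gpu_layers: int = 999, flash_attention: bool = True) -> list[str]:
    """Assemble a KoboldCpp command line mirroring the GUI settings above.

    Flag names are assumptions; verify against `koboldcpp.exe --help`.
    """
    args = ["koboldcpp.exe",
            "--model", model_path,
            "--contextsize", str(context_size),
            "--gpulayers", str(gpu_layers)]
    if flash_attention:
        args.append("--flashattention")
    return args

# To actually launch (untested here):
#   import subprocess
#   subprocess.run(build_args("Llama-3.2-1B-Instruct-Q4_K_M.gguf"))
```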




3) Configure AiHorde details

This is where you enter your AiHorde worker info.


Quote
If you don't have an AiHorde account yet, register here: https://aihorde.net/register

  • Model Name: If it doesn't show up automatically, type it manually.
  • Gen. Length: Max tokens per request. 
    I usually use 4096, but 1024 is a good starting point.
  • Max Context: Set to 0 (it will use the context size from your settings above).
  • API Key: Your AiHorde API key.
  • Worker Name: Choose a unique name.
  • VERY IMPORTANT: Save your config before clicking Launch. 
    If you launch without saving, you may have to re-enter everything.

    Tip: Make a separate config file per model/use-case, so you can quickly switch later via Load Config.
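Since saved configs are plain JSON files (in my experience, .kcpps files can be opened in any text editor), you can also generate or version-control one per model. A sketch; the field names below are illustrative, not the exact .kcpps schema, so inspect a file you saved from the GUI for the real keys:

```python
import json
from pathlib import Path

def save_config(path: str, settings: dict) -> None:
    """Write a settings dict as a JSON config file (one file per model/use-case)."""
    Path(path).write_text(json.dumps(settings, indent=2))

def load_config(path: str) -> dict:
    """Read a previously saved JSON config back into a dict."""
    return json.loads(Path(path).read_text())

# Illustrative field names only; check a GUI-saved .kcpps for the real schema:
settings = {
    "model": "Llama-3.2-1B-Instruct-Q4_K_M.gguf",
    "contextsize": 4096,
    "hordekey": "YOUR_API_KEY",
    "hordeworkername": "my-unique-worker",
}
```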




4) Launch (you're done!)

Click Launch. The model will load into your GPU (or RAM if CPU-only). 
Once loaded, it will connect to AiHorde and start accepting requests.


  • New workers can take a few minutes to receive jobs.
  • If you see no work after ~10 minutes, your model might not be in demand; try another model.
  • Check what's popular here: currently active models
  • Download models via the built-in downloader or directly from: Huggingface.co
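To check demand programmatically before picking a model, the AiHorde API publishes model stats at /v2/status/models. The endpoint path is taken from the AiHorde API docs, but the response field names here (`name`, `queued`) are my reading of them and may differ; the ranking helper is separate from the network call so you can test it offline:

```python
import json
import urllib.request

def rank_models(models: list[dict], top: int = 5) -> list[tuple[str, int]]:
    """Sort models by their queued workload, most in-demand first."""
    ranked = sorted(models, key=lambda m: m.get("queued", 0), reverse=True)
    return [(m["name"], m.get("queued", 0)) for m in ranked[:top]]

def fetch_text_models() -> list[dict]:
    """Fetch currently active text models from AiHorde (needs network)."""
    url = "https://aihorde.net/api/v2/status/models?type=text"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Usage (needs network):
#   for name, queued in rank_models(fetch_text_models()):
#       print(name, queued)
```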



Questions? Post them below and I'll help if I can.

/Hamuki

henk717

Awesome, thanks for posting! If people have questions I'll do my best to answer them.

Hamuki

Quote from: henk717 on Mar 03, 2026, 01:10 PM
Awesome, thanks for posting! If people have questions I'll do my best to answer them.

Highly appreciated! :D