Recent posts

#1
OpenClaw / Model Talk: Planning on running Qwen3.5 lo...
Last post by Hamuki - Mar 05, 2026, 01:05 PM
As the title says.

I plan on running Qwen3.5-9b locally, to see what kind of performance I will get from it.

Will update the thread with my findings.

/Hamuki
#2
Text Generation / Re: Install KoboldCpp & connec...
Last post by Hamuki - Mar 03, 2026, 01:18 PM
Quote from: henk717 on Mar 03, 2026, 01:10 PM
Awesome, thanks for posting! If people have questions I'll do my best to answer them.

Highly appreciated! :D
#3
Text Generation / Re: Install KoboldCpp & connec...
Last post by henk717 - Mar 03, 2026, 01:10 PM
Awesome, thanks for posting! If people have questions I'll do my best to answer them.
#4
Text Generation / Guide: Install KoboldCpp & connect to...
Last post by Hamuki - Mar 03, 2026, 08:48 AM
KoboldCpp Installation Guide + AiHorde.net connection
Windows • Local LLM (KoboldCpp) • Connect as an AiHorde worker




In this guide, I will show you how to install KoboldCpp as your local LLM software and connect it to the AiHorde.net distributed computing network.


Quote
If you get stuck, ask in the official Discords:
AiHorde: https://discord.gg/93JgzVkBV9 
KoboldCpp: https://koboldai.org/discord



Requirements
  • A computer (modern GPU recommended, but not required)
  • Internet connection
  • ~15 minutes



1) Install KoboldCpp
  • Go to the official GitHub releases page and download the latest release
    Download koboldcpp.exe (under Assets at the bottom of the latest release).
  • Move the .exe into a folder you want to keep (Desktop is fine).
  • Run koboldcpp.exe
    On first launch Windows may warn about "untrusted software" because it's not signed/registered with Microsoft.
  • Allow it to run. A CMD window opens, and after ~10–30 seconds the KoboldCpp interface appears.
  • Done, KoboldCpp is installed.



2) Configure KoboldCpp (model + hardware)

We'll configure hardware + choose a model first, then add the AiHorde settings.




Quick Launch Window
  • Backend 
    Kobold will usually auto-pick the best option (e.g. CUDA for NVIDIA GPUs).
  • Recommended toggles
    • Launch Browser = Opens a web UI to chat with the model locally.
    • ContextShift = Reduces reprocessing (recommended).
    • Use FlashAttention = Performance boost for GGUF models (recommended).
    • Force AutoFit = Enable if the model fits fully in VRAM. If unsure, leave unchecked.
  • Context Size 
    Higher = better long-context output, but uses more memory and can reduce tokens/sec. 
    If VRAM runs out, Kobold can offload to RAM (slower).
  • GGUF Text Model 
    If you already downloaded a model, click Browse and select it. 
    If not, click HF Search to find a model on HuggingFace. 

    For testing, I recommend: 
    Llama 3.2 1B Instruct (GGUF)

    Tip: store your models in the same folder as koboldcpp.exe, or create a dedicated "Models" folder.
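With "Launch Browser" enabled, KoboldCpp also exposes a KoboldAI-compatible HTTP API locally (by default on port 5001), so you can script against your model instead of using the web UI. Here is a minimal sketch that only builds a request body for the /api/v1/generate endpoint; the field names follow the KoboldAI API, but verify them against the API docs your KoboldCpp build serves locally before relying on them:

```python
# Sketch: build a minimal request body for KoboldCpp's KoboldAI-compatible
# /api/v1/generate endpoint (default http://localhost:5001). Field names are
# assumed from the KoboldAI API spec -- check your local build's API docs.
import json

def build_generate_payload(prompt, max_length=128, temperature=0.7):
    """Assemble a minimal generation request body."""
    return {
        "prompt": prompt,
        "max_length": max_length,    # tokens to generate for this request
        "max_context_length": 4096,  # should match your Context Size setting
        "temperature": temperature,  # sampling randomness
    }

payload = build_generate_payload("Write a haiku about llamas.")
print(json.dumps(payload, indent=2))
```

You would POST this body (as JSON) to the endpoint once the model has finished loading.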




Hardware Window
Here you can fine-tune performance.
  • Batch Size: 512
  • Launch Browser: On (only if you want local web UI)
  • High Priority: Only if you are CPU-only
  • Force AutoFit: If model fits fully in VRAM
  • Use FlashAttention: On for GGUF models
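Whether you should enable Force AutoFit comes down to whether weights plus KV cache fit in VRAM. The sketch below is only a back-of-the-envelope estimate: the 1.2x overhead factor and the per-1k-context cache figure are my own ballpark assumptions, not KoboldCpp internals. When in doubt, leave Force AutoFit unchecked and watch the CMD window for out-of-memory messages.

```python
# Sketch: rough check for whether a GGUF model fits fully in VRAM.
# The overhead factor and KV-cache-per-context figures are ballpark
# assumptions for illustration, not values taken from KoboldCpp.
def fits_in_vram(model_file_gb, context_size, vram_gb,
                 overhead=1.2, kv_gb_per_1k_ctx=0.15):
    """Estimate total VRAM need: weights * overhead + KV cache for the context."""
    need = model_file_gb * overhead + (context_size / 1024) * kv_gb_per_1k_ctx
    return need <= vram_gb

# e.g. a ~0.8 GB Llama 3.2 1B quant with 4096 context on an 8 GB card
print(fits_in_vram(0.8, 4096, 8.0))
```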




3) Configure AiHorde details

This is where you enter your AiHorde worker info.


Quote
If you don't have an AiHorde account yet, register here: https://aihorde.net/register


  • Model Name: If it doesn't show up automatically, type it manually.
  • Gen. Length: Max tokens per request. 
    I usually use 4096, but 1024 is a good starting point.
  • Max Context: Set to 0 (it will use the context size from your settings above).
  • API Key: Your AiHorde API key.
  • Worker Name: Choose a unique name.
  • VERY IMPORTANT: Save your config before clicking Launch. 
    If you Launch without saving, you may need to re-enter everything.

    Tip: Make a separate config file per model/use-case, so you can quickly switch later via Load Config.




4) Launch (you're done!)

Click Launch. The model will load into your GPU (or RAM if CPU-only). 
Once loaded, it will connect to AiHorde and start accepting requests.


  • New workers can take a few minutes to receive jobs.
  • If you see no work after ~10 minutes, your model might not be in demand; try another model.
  • Check what's popular here: currently active models
  • Download models via the built-in downloader or directly from: Huggingface.co
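To pick an in-demand model programmatically, AiHorde publishes live stats for active text models (https://aihorde.net/api/v2/status/models?type=text). The sketch below sorts a toy sample of that data by queued work per active worker; the "name", "queued" and "count" field names are assumptions based on the public API, so check the live response before relying on them:

```python
# Sketch: rank AiHorde models by demand. Field names ("name", "queued",
# "count") are assumed from the public /v2/status/models response.
def busiest_models(models):
    """Sort models by queued work per active worker, busiest first."""
    return [m["name"] for m in sorted(
        models,
        key=lambda m: m.get("queued", 0) / max(m.get("count", 1), 1),
        reverse=True,
    )]

# Toy sample shaped like the API response:
sample = [
    {"name": "Llama-3.2-1B-Instruct", "queued": 120, "count": 6},
    {"name": "Nanbeige4.1-3B", "queued": 900, "count": 2},
]
print(busiest_models(sample))  # model with the most queued work per worker first
```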



Questions? Post them below and I'll help if I can.

/Hamuki
#5
Text Generation / Model Talk: Need models? List of where to ...
Last post by Hamuki - Mar 02, 2026, 02:04 PM
If you are new to running local models, you might wonder where you should be getting your models from.

Below is a list of sites where you can download models to run locally.

WARNING! Do not blindly trust downloads from any site.
Always verify the uploader's authenticity, check the comments on the models, and skip a model if you have any doubts.

You can download text models here:
  • Huggingface.co - The biggest model hub for all types of local models.
  • Ollama Library - A big model library, but without comments or community features.
  • Modelscope.cn - Looks like a Chinese clone of Huggingface.
  • LMstudio.ai - A simple model library with easy download function.

These are some of the biggest and most popular model hubs.
I personally recommend that you use Huggingface, as it contains the most models and a very active community that tests and provides feedback on new models.

If you know any other model hubs, please post them below :)

/Hamuki
#6
Image Generation / What software do you use for l...
Last post by Hamuki - Mar 02, 2026, 01:25 PM
I have only used Automatic1111 webui and ComfyUI for image generation.

Mainly A1111, as I thought it was way easier at the time.
But ComfyUI is growing on me, and I use that more than A1111 now.

What about you?

/Hamuki
#7
OpenClaw / Which models are you using for...
Last post by Hamuki - Mar 02, 2026, 01:21 PM
What models are you using for your OpenClaw agent(s)?

Personally, I have set up Codex using the OAuth login from my ChatGPT Plus subscription.
I plan on trying some local models to see if I can reduce the need for Codex/big expensive models for my usage.

Which models do you use, and how much does it approximately cost you monthly?

/Hamuki
#8
OpenClaw / What hardware is your OpenClaw...
Last post by Hamuki - Mar 02, 2026, 01:18 PM
What shell did you provide for your crustacean?

My own is running on an Intel NUC 5i5RYH.

  • CPU: i5-5250U - 2 cores
  • RAM: 8GB of DDR3 1333 MHz
  • Storage: 120GB SATA SSD
  • OS: Linux, Ubuntu

I plan on getting 16GB of RAM instead of 8GB, but I haven't spent much time with my Claw yet.
So it hasn't been much of a priority.

What about you?

/Hamuki
#9
Hardware / Best 8GB VRAM Models? TXT/IMG
Last post by Hamuki - Mar 02, 2026, 01:02 PM
Very short and simple: which model that runs on 8GB VRAM cards is your favorite?

Feel free to link to the model, mention your tokens per second and what you use the model for :D

My personal favorite is Nanbeige4.1-3B: https://huggingface.co/Nanbeige/Nanbeige4.1-3B
Only bad thing is that its thinking function can go on a little too long.

What's yours?

/Hamuki
#10
Show & Tell / 4080 Laptop Setup
Last post by Hamuki - Mar 01, 2026, 03:30 PM
My 4080 Gaming Laptop
I bought a nice MSI gaming laptop on sale back in early 2025.
Had to sell my old desktop with a 3060 12GB, because my office had to become a nursery.

1) Quick Overview
Primary use: chat / image gen / video 
Main goal: Speed / Budget, as fast as possible at "low" cost.
Total price: $2000 for the computer, $200 for RAM upgrade.
Build type: Laptop


2) Hardware Specs
CPU: i9-13980HX 24 cores
GPU(s): NVIDIA RTX 4080 (Laptop) 12GB VRAM
RAM: 64GB DDR5 5600 MHz (16GB originally)
Storage: 1TB NVMe
PSU: 330W power brick
Cooling: Dual fans


3) Software & Stack
OS: Windows 11
Runtime / tools: KoboldCpp / ComfyUI / A1111.


4) Models I Use
Text models:
Qwen3-Coder-Next
Nanbeige4.1-3b
Llama 3.2-3b

5) Performance Benchmarks

Text generation (tokens/sec):
Coming soon... Suggestions for models are also welcome :D 
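For anyone who wants to measure tokens/sec the same way across backends, here is a tiny backend-agnostic helper. The `generate` and `count_tokens` callables are placeholders: in practice `generate` would be your real call (e.g. an HTTP request to KoboldCpp) and `count_tokens` would read the token count your backend reports.

```python
# Sketch: backend-agnostic tokens/sec measurement. Both callables are
# placeholders for your real generation call and token counter.
import time

def tokens_per_second(generate, count_tokens):
    """Time one generation call and return its token throughput."""
    start = time.perf_counter()
    output = generate()
    elapsed = time.perf_counter() - start
    return count_tokens(output) / elapsed

# Fake backend for illustration: "generates" 20 tokens in ~0.1 s.
rate = tokens_per_second(
    lambda: time.sleep(0.1) or "tok " * 20,
    lambda text: len(text.split()),
)
print(f"{rate:.0f} tokens/sec")
```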


6) Workflow & Use Cases
Typical tasks: Coding small scripts, running KoboldCpp for AiHorde.