LoneStarCoder

Create a Simple RAG System with AnythingLLM and LMStudio

Author: Brody Kilpatrick | Created: October 9, 2025

Summary

This beginner-friendly tutorial guides you through building a local Retrieval-Augmented Generation (RAG) system without writing code. You’ll learn to set up your own AI assistant that can answer questions based on your documents, running entirely on your own hardware for complete privacy and control.

What is RAG? RAG combines document retrieval with AI language models, allowing the AI to reference your specific documents when answering questions, rather than relying solely on its training data.

Time Required: 1-2 hours for initial setup
Skill Level: Beginner (no coding required)


Hardware Requirements

This setup provides good performance for single-user environments with reasonably fast response times.

Note: This configuration is suitable for personal use or small team testing. For production environments with multiple concurrent users, significantly more GPU power is required.

Minimum Configuration

This setup will work but expect slower response times. Acceptable for proof-of-concept testing.

Performance Note: Without a GPU, the system will use CPU-only inference, which can be 10-20x slower depending on model size.
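If you’re not sure whether your machine has a usable GPU, you can check from Terminal before starting. This assumes an NVIDIA card with the driver installed; if the first command isn’t found, the second will at least list the graphics hardware present:

# Print the GPU name and total VRAM (requires the NVIDIA driver to be installed)
nvidia-smi --query-gpu=name,memory.total --format=csv

# Fallback: list graphics hardware even without a driver installed
lspci | grep -i -E "vga|3d"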


Software Components

This tutorial uses the following software:

  • Ubuntu Desktop 22.04 LTS or newer (the operating system)
  • LMStudio (downloads, runs, and serves the local language model)
  • AnythingLLM Desktop (the RAG interface that connects your documents to the model)
  • A local language model such as qwen2.5-7b-instruct (downloaded through LMStudio)

Installation Instructions

Step 1: Install Ubuntu

  1. Download Ubuntu Desktop 22.04 LTS or newer from ubuntu.com/download/desktop
  2. Install Ubuntu using the standard installation wizard (default options are fine)
  3. After installation completes, open Terminal and update the system:
sudo apt update
sudo apt upgrade -y
  4. Reboot your system to ensure all updates are applied:
sudo reboot
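If you want to double-check which release you ended up on after rebooting, a quick check:

# Should report Ubuntu 22.04 LTS or newer
lsb_release -d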

Step 2: Install LMStudio

LMStudio provides a user-friendly interface for downloading, managing, and running local language models.

  1. Visit lmstudio.ai and download the Linux version
  2. Open Terminal and configure system permissions for LMStudio’s sandbox environment:
# Enable unprivileged user namespaces (addresses kernel-level permission requirements)
echo "kernel.unprivileged_userns_clone = 1" | sudo tee /etc/sysctl.d/00-local-userns.conf

# Apply the new system configuration
sudo sysctl --system

# Configure AppArmor to allow unprivileged user namespaces
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
  3. Navigate to your Downloads folder, right-click the LMStudio AppImage file, and select “Properties”
  4. Go to the “Permissions” tab and check “Allow executing file as program”
  5. Double-click the file to launch LMStudio
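If LMStudio fails to launch, first confirm the kernel settings from step 2 actually took effect. A quick check (depending on your kernel version, one of these keys may not exist, which is fine):

# Print the values that were just configured
sysctl kernel.unprivileged_userns_clone
sysctl kernel.apparmor_restrict_unprivileged_userns

# Note: the value applied with "sysctl -w" above does not survive a reboot; to make it
# persistent, you can also append it to the conf file created earlier
echo "kernel.apparmor_restrict_unprivileged_userns = 0" | sudo tee -a /etc/sysctl.d/00-local-userns.conf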

First Launch Configuration:


Step 3: Install AnythingLLM

AnythingLLM provides the RAG interface that connects your documents to the language model.

Official documentation: docs.anythingllm.com/installation-desktop/linux

Open Terminal and run the following commands:

# Install FUSE (required for AppImage support)
sudo apt install libfuse2 -y

# Install cURL (for downloading the installer)
sudo apt install curl -y

# Download the AnythingLLM installer script
curl -fsSL https://cdn.anythingllm.com/latest/installer.sh -o installer.sh

# Make the script executable
chmod +x installer.sh

# Run the installer
./installer.sh

Follow the on-screen prompts to complete the installation. The application will typically install to your home directory.
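The exact install location can vary between installer versions. If you can’t find the application afterwards, a quick way to search your home directory for it:

# Look for the AnythingLLM desktop app (AppImage or launcher script) under your home folder
find ~ -maxdepth 3 -iname "*anythingllm*" 2>/dev/null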


Configuration

Step 4: Configure LMStudio

Download and Load Your First Model:

  1. Open LMStudio
  2. Click the Search icon (magnifying glass) in the left sidebar
  3. Search for and download one of these recommended starter models:
    • qwen2.5-7b-instruct (good balance of speed and quality)
    • granite-3-8b-instruct (excellent for RAG tasks)
    • llama-3.2-3b-instruct (fastest, lower resource usage)
  4. Once downloaded, click the 💬 Chat tab in the left sidebar
  5. Select your model from the dropdown at the top
  6. Test the model with a simple question like “What is the capital of France?”
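Optional: LMStudio also ships a small command-line tool called lms that can list and load models without the GUI. Treat the commands below as a sketch; the bootstrap path (~/.lmstudio/bin/lms) and subcommand names can differ between LMStudio versions, so the GUI steps above remain the reference:

# One-time setup: put the bundled lms CLI on your PATH
~/.lmstudio/bin/lms bootstrap

# List the models you have downloaded, then load one into memory
lms ls
lms load qwen2.5-7b-instruct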

Start the LMStudio API Server:

  1. Click the Developer tab (🔌) in the left sidebar
  2. Find the status indicator that says “Status: Stopped”
  3. Click the “Start Server” button
  4. Verify the server shows “Status: Running” and displays a local URL (typically http://localhost:1234)

Important: Keep LMStudio running with the server active whenever you’re using AnythingLLM.
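You can also confirm the server is reachable from Terminal. LMStudio exposes an OpenAI-compatible API, so listing the available models is a quick sanity check (adjust the port if your Developer tab shows a different URL):

# Should return a JSON list containing the model you loaded
curl http://localhost:1234/v1/models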


Step 5: Configure AnythingLLM

Initial Setup:

  1. Launch AnythingLLM Desktop (check your home directory or applications menu)
  2. Click “Get Started”
  3. On the “LLM Preference” screen:
    • Scroll down and select “LMStudio”
    • If you’re asked for a base URL, enter the address shown in LMStudio’s Developer tab (typically http://localhost:1234)
    • Choose the model you loaded in LMStudio (e.g., “qwen2.5-7b-instruct”)
  4. Click “Next” through the remaining setup screens (embedding and vector database defaults are fine)
  5. Create your first workspace:
    • Name it something descriptive like “Test Workspace” or “My Documents”
    • Click “Create Workspace”

Test Basic Functionality:

Send a test message like “Hello, can you introduce yourself?” to verify the connection is working.
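Behind the scenes, AnythingLLM sends this chat to the LMStudio server over the same OpenAI-compatible API. If the in-app test fails, you can rule LMStudio in or out by sending an equivalent request directly from Terminal (swap in whichever model you loaded):

# Minimal chat request straight to the LMStudio server
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "Hello, can you introduce yourself?"}]
  }'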


Step 6: Vectorize Your First Document

This is where the RAG magic happens—you’ll teach your AI about specific documents.

Upload and Embed a Document:

  1. In your workspace, click the Upload icon (up arrow) in the bottom-left corner
  2. Click “Upload Files” and select a well-formatted document:
    • Best formats: PDF, TXT, DOCX, Markdown
    • Best content: Documents with clear structure, headings, and relevant information
    • Size: Start with smaller documents (under 100 pages) for faster processing
  3. Once uploaded, you’ll see your document in the file list
  4. Select the document by checking its box
  5. Click “Move to Workspace”
  6. Click “Save and Embed”

What’s Happening? AnythingLLM is breaking your document into chunks, converting them into mathematical representations (vectors), and storing them in a searchable database.
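At question time, AnythingLLM searches those vectors for the chunks most similar to your question and includes them in the prompt it sends to the model. The exact prompt template is AnythingLLM’s own; the request it builds is only roughly shaped like the illustration below (the placeholders in angle brackets stand in for real retrieved text):

# Illustrative only: retrieved chunks are injected as context, and the model is asked
# to answer from that context rather than its general training data
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [
      {"role": "system", "content": "Answer using only the provided context. Context: <retrieved chunk 1> <retrieved chunk 2>"},
      {"role": "user", "content": "What does the document say about [specific topic]?"}
    ]
  }'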

Test Your RAG System:

  1. Go back to your workspace chat
  2. Start a new thread (click the + icon)
  3. Ask questions about your document:
    • “What documents do you have access to?”
    • “Summarize the main points from [document name]”
    • “What does the document say about [specific topic]?”

Pro Tips:


Troubleshooting

Common Issues and Solutions

LMStudio server won’t start:
Restart LMStudio and try again, and make sure no other application is already using port 1234 (the server can’t bind to a port that’s in use). If the app itself won’t launch, re-check the permission commands from Step 2.

AnythingLLM can’t connect to LMStudio:
Confirm LMStudio is still open and the Developer tab shows “Status: Running” at http://localhost:1234, then re-select LMStudio and your model under AnythingLLM’s LLM Preference settings.

Document embedding is slow or failing:
Start with smaller, well-formatted files (under 100 pages), and make sure you clicked “Save and Embed” after moving the document into the workspace.

Answers don’t reference the document:
Verify the document was moved to the workspace and embedded, start a new thread, and mention the document or topic by name in your question.


Next Steps

You’ve completed the basic setup! Here are ways to improve your RAG system:

Optimization Tips

  1. Try Different Models:
    • Larger models (13B+) provide better reasoning but are slower
    • Experiment with models specifically fine-tuned for RAG tasks
    • Browse lmstudio.ai/models for options
  2. Improve Document Quality:
    • Use well-formatted documents with clear headings
    • Break large documents into logical sections
    • Remove unnecessary formatting or images that don’t add value
  3. Tune Embedding Settings:
    • In AnythingLLM settings, adjust chunk size and overlap
    • Smaller chunks = more precise but potentially less context
    • Larger chunks = more context but potentially less precise
  4. Create Multiple Workspaces:
    • Organize documents by topic or project
    • Prevents irrelevant documents from affecting answers

Advanced Features to Explore


Important Notes


Additional Resources


Congratulations! You now have a working local RAG system. Start experimenting with different documents and questions to see what works best for your use case.