My Local AI Stack
Introduction
A few weeks ago, I set up a local AI coding assistant on my laptop that runs entirely offline. No cloud services, no monthly subscriptions, no data leaving my machine.
And it actually works really well.
In this article, I'll walk you through what I built, the alternative options available, why it matters, and how you can try it too. Whether you're a developer, a student, or just curious about AI - there's something here for you.
The Stack
Here's my current setup:
| Component | What I Use | Other Options |
|---|---|---|
| AI Agent | OpenWork | OpenCode CLI, Claude Code, Cursor, GitHub Copilot |
| LLM Runtime | Ollama | LM Studio, llama.cpp, text-generation-webui, GPT4All |
| AI Model | Llama 3.2 (3B) | Phi 3.5, Mistral, Qwen, Codellama, Gemma |
| Platform | Linux | macOS, Windows, WSL |
My Configuration
- RAM: 8GB (plenty for 3B models!)
- Storage: ~3GB for the model
- Time to set up: about 30 minutes
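Before downloading anything, it's worth a quick check that your machine has the headroom. A minimal sketch using standard Linux tools (the thresholds in the comments are just the numbers from my setup):

```shell
# Check the two numbers that matter before downloading a model:
free -h          # total/available RAM (3B models want roughly 6-8GB total)
df -h "$HOME"    # free disk space (the 3B model takes about 3GB on disk)
```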
Component Options Explained
AI Agents / Coding Assistants
| Tool | Type | Best For |
|---|---|---|
| OpenWork (what I use) | Full workspace agent | Complete development workflow |
| OpenCode CLI | CLI-based agent | Terminal-first developers |
| Claude Code | Autonomous CLI agent | Complex multi-step tasks |
| Cursor | IDE-integrated AI | IDE users wanting AI |
| GitHub Copilot | IDE plugin | Inline code suggestions |
LLM Runtimes
| Runtime | Platform | Best For |
|---|---|---|
| Ollama (what I use) | All platforms | Ease of use, great model library |
| LM Studio | macOS/Windows | GUI, model management |
| llama.cpp | All platforms | Maximum performance, no GPU needed |
| text-generation-webui | All platforms | Web UI, extensive features |
| GPT4All | All platforms | Privacy-focused, local-only |
AI Models
| Model | Size | Strengths | RAM Needed |
|---|---|---|---|
| Llama 3.2 3B (what I use) | 3B | Balanced performance | 6-8GB |
| Llama 3.2 1B | 1B | Lightweight, fast | 4GB |
| Phi 3.5 | 3.8B | Microsoft's efficient model | 6-8GB |
| Mistral 7B | 7B | Great reasoning | 12-16GB |
| Codellama 7B | 7B | Code-specialized | 12-16GB |
| Qwen 2.5 | 7B | Multilingual | 12-16GB |
| Gemma 2B | 2B | Google's lightweight option | 4GB |
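If Ollama is already installed, the rows above map directly onto tags you can pull. A hedged sketch - the tag name assumes Ollama's public model library, and the guard keeps it safe to run on a machine without Ollama:

```shell
# Pull the lightest model from the table above (~4GB RAM needed).
# The tag name assumes Ollama's public model library.
if command -v ollama >/dev/null 2>&1; then
  ollama pull llama3.2:1b
else
  echo "ollama not installed - see the Getting Started section"
fi
```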
What Can It Do?
My local AI assistant handles:
Development Tasks
- Write and edit code files
- Debug issues and explain errors
- Refactor and improve code
- Search through codebases
- Run terminal commands
General Tasks
- Answer technical questions
- Explain programming concepts
- Help with shell commands
- Do web research (when online)
It's not as smart as the big cloud AI systems, but for everyday development tasks, it's incredibly useful.
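For the general tasks, you don't even need the agent - Ollama's own CLI works for one-off questions. A minimal example (assumes `ollama pull llama3.2` has completed; the guard lets it no-op elsewhere):

```shell
# Ask the local model a one-off question straight from the terminal.
if command -v ollama >/dev/null 2>&1; then
  ollama run llama3.2 "Explain what a shell pipe does, in two sentences."
else
  echo "ollama not installed yet"
fi
```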
The Architecture: Why This Matters
Here's the cool part - everything runs locally:
```
┌─────────────────────────────────────────────────┐
│                    My Laptop                    │
│                                                 │
│   ┌─────────────┐      ┌─────────────────────┐  │
│   │  OpenWork   │─────▶│        Ollama       │  │
│   │ (AI Agent)  │      │ (Local LLM Runtime) │  │
│   └─────────────┘      └──────────┬──────────┘  │
│                                   │             │
│                        ┌──────────▼──────────┐  │
│                        │      Llama 3.2      │  │
│                        │        (3B)         │  │
│                        └─────────────────────┘  │
└─────────────────────────────────────────────────┘
```

100% Offline Capable! 🔒
No API calls. No data leaving my machine. No monthly bill.
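Concretely, the agent and the runtime talk over the loopback interface: Ollama serves an HTTP API on `localhost:11434`, so prompts and code never cross a network boundary. A sketch you can run to see it (the first `curl` just checks that the server is actually up):

```shell
# Query Ollama's local API directly; nothing leaves 127.0.0.1.
if curl -s --max-time 2 http://localhost:11434/api/tags >/dev/null 2>&1; then
  curl -s http://localhost:11434/api/generate -d '{
    "model": "llama3.2",
    "prompt": "Say hello in one word.",
    "stream": false
  }'
else
  echo "Ollama is not running on this machine"
fi
```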
Key Advantages
1. Privacy First
Your code, your files, your data - all stay on your machine. For developers working on proprietary projects, this is huge.
2. No Ongoing Costs
Once set up, it's free. No per-token billing, no subscription fees.
3. Works Offline
Perfect for travel, remote work, or spotty internet.
4. Zero Latency
No network round-trips. Responses are instant.
5. Complete Control
You own the stack. You decide when to update and what models to use.
Security Considerations 🔐
Running local AI has its own security profile:
| Aspect | Local AI | Cloud AI |
|---|---|---|
| Data transmission | None | All data sent to cloud |
| API keys | Not needed | Required |
| Updates | Manual | Automatic |
| Network exposure | Minimal | Standard web |
Best Practices:
- Keep your system software updated
- Use strong passwords on your machine
- Be mindful of what files you share externally
- Use a VPN for additional privacy when researching online
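One concrete check to add to that list: by default Ollama binds only to localhost, and you can verify that nothing is listening beyond the loopback interface. A quick sketch (assumes Linux with iproute2; harmless if Ollama isn't running):

```shell
# Show whether anything is listening on Ollama's port, and on which address.
ss -ltn 2>/dev/null | grep 11434 || echo "port 11434 not open (Ollama not running)"
```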
Pricing Breakdown 💰
One of the biggest advantages? The cost.
| Item | Local AI Stack | Cloud AI (Claude Pro/Max) |
|---|---|---|
| Monthly | $0 | $20-200/month |
| Year 1 | $0 | $240-2,400/year |
| Year 2 | $0 | $240-2,400/year |
| Year 3 | $0 | $240-2,400/year |
My total investment: Just my time (~30 min to set up).
Optional upgrades (not required):
- Extra RAM (16GB): ~$50-100
- SSD upgrade: Varies
Comparison: Local vs Cloud AI
| Feature | Local (My Stack) | Cloud AI (Claude/GPT) |
|---|---|---|
| Privacy | 🔒 100% private | Data goes to cloud |
| Cost | Free after setup | $20-200/month |
| Works Offline | ✅ Yes | ❌ Needs internet |
| Model Size | 3B parameters | 100B+ parameters |
| Capability | Solid for everyday tasks | State of the art |
| Context Window | 4K-8K tokens (default, configurable) | 100K+ tokens |
| Multimodal | ❌ Text only | ✅ Images, files |
| Setup Time | 30 minutes | 5 minutes |
When to Use Local AI
- Quick code edits and refactoring
- Learning and experimentation
- Privacy-sensitive projects
- Offline work
- Budget-conscious developers
When to Use Cloud AI
- Complex problem-solving
- Large codebase understanding
- Multimodal tasks (images, files)
- Latest information retrieval
- Production-grade code
The Future of Local AI
The local AI space is evolving rapidly. Here's what's coming:
- Better models - 7B and 8B models will run smoothly on 16GB+
- Specialized models - Code-specific, embedding, and vision models
- Fine-tuning - Train models on your private codebase
- More tools - MCP integrations for enhanced capabilities
If you're on a budget or care about privacy, now is a great time to start.
Getting Started
Ready to try it yourself? Here's how:
Step 1: Install Ollama
```shell
curl -fsSL https://ollama.com/install.sh | sh
```
Step 2: Pull a Model
```shell
# My recommendation - good balance of speed and capability
ollama pull llama3.2

# Lighter option
ollama pull llama3.2:1b

# Alternative model
ollama pull phi3.5
```
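To confirm the downloads landed, you can list what's installed along with each model's size (assumes the Ollama service is running; the guard keeps this safe on a machine without it):

```shell
# List downloaded models and their sizes.
if command -v ollama >/dev/null 2>&1; then
  ollama list
else
  echo "ollama not found on PATH - rerun Step 1"
fi
```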
Step 3: Install OpenCode
Visit opencode.ai for your platform's installation instructions.
Step 4: Connect Them
Configure OpenCode to use Ollama as the model provider. (See OpenCode documentation for details.)
Conclusion
I've been using this local setup for a few weeks now, and I'm genuinely impressed. It handles most of my daily development tasks - writing helper scripts, debugging code, explaining concepts - without ever needing to touch the cloud.
Is it as smart as Claude? No, not even close.
But it doesn't need to be. For quick tasks, offline work, and privacy-conscious development, it's perfect.
And the best part? It runs on an 8GB laptop - the same machine I use for everyday work.
What do you think? Would you try running AI locally? What's your stack? Let me know!