Posts

Showing posts from May, 2026

token

For your issue:

Small file → answer comes back
Large file → no answer

The usual solution is to not send the whole file to Ollama. Use this approach instead:

Large file
    ↓
Extract text
    ↓
Split into chunks (500–1000 tokens)
    ↓
Create embeddings
    ↓
Store in a vector DB (FAISS / Chroma)
    ↓
Retrieve only the relevant chunks
    ↓
Send to Ollama
    ↓
Get answer

If you want a quick fix (without a redesign), try these:

1. Increase context size

Example:

    OLLAMA_CONTEXT_LENGTH=8192 ollama run llama3

Or use a model with a larger context window. But this is only a temporary fix: a large enough file will still exceed any fixed context limit.

---

2. Reduce input size

Instead of sending a 200-page PDF in one request, split it into page ranges and process each range separately:

Pages 1–20
Pages 21–40
Pages 41–60

---

3. Add chunking in code

Example logic:

    text = extract_pdf()
    chunk_size = 1000
    chunks = split(text, chunk_size)
    for chunk in chunks:
        send_to_ollama(chunk)

---

4. Use RAG (recommended for company/intranet AI)

Stack:

Ollama
LangChain
FAISS or ChromaDB
Embedding m...
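The chunking and retrieval steps described above can be sketched in plain Python. This is only a rough illustration under stated assumptions: words stand in for tokens, and a simple word-overlap score stands in for real embedding similarity. The function names `split_into_chunks` and `top_chunks` are hypothetical and are not part of Ollama, LangChain, FAISS, or Chroma.

```python
import re


def split_into_chunks(text, chunk_size=1000, overlap=100):
    """Split text into chunks of at most chunk_size words.

    Assumption: words approximate tokens. Consecutive chunks share
    `overlap` words so a sentence cut at a boundary appears in both.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def top_chunks(question, chunks, k=3):
    """Return up to k chunks that share the most distinct words with the question.

    Toy stand-in for the embedding + vector DB lookup: a real setup would
    compare embedding vectors instead of counting shared words.
    """
    question_words = set(re.findall(r"\w+", question.lower()))
    scored = []
    for index, chunk in enumerate(chunks):
        chunk_words = set(re.findall(r"\w+", chunk.lower()))
        scored.append((len(question_words & chunk_words), index))
    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunks[index] for score, index in scored[:k] if score > 0]
```

Only the chunks returned by `top_chunks` would then go into the prompt, which keeps the request well inside the model's context window regardless of how large the original file is.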

system design

🚀 System Design Roadmap for Beginners

If you are starting out with system design, this is the best starting point for you!

If you like my work, please subscribe to my free newsletter to get articles like this in your inbox every week :)

If you’re an absolute beginner in system design, this roadmap will guide you step by step through all the key concepts using one consistent example: Instagram. Each week, I’ve also included the best YouTube videos to help you understand these concepts practically and visually!

🗓️ Week 1 – Foundations

What is System Design?: The process of building software systems like Instagram that can scale (handle millions of users), maintain reliability (stay available without crashing), and ensure maintainability (stay easy to update and extend).

4 Components: Client, Server, Database, APIs.
Client: Your Instagram app
Server: Processes requests like “upload photo”
API: Fetches user fe...