token

For your issue:

Small file → answer comes
Large file → no answer

The usual solution is not to send the whole file to Ollama. Use this approach:

Large File
    ↓
Extract text
    ↓
Split into chunks (500–1000 tokens)
    ↓
Create embeddings
    ↓
Store in Vector DB (FAISS / Chroma)
    ↓
Retrieve only relevant chunks
    ↓
Send to Ollama
    ↓
Get answer

If you want a quick fix (without a redesign), try these:

1. Increase context size

Example:

    OLLAMA_CONTEXT_LENGTH=8192 ollama run llama3

Or use a model with a larger context window. Note that this variable is read by the Ollama server, so if the server is already running, it has to be set where ollama serve is started. This is only a temporary fix: a file that exceeds the new limit will fail the same way.

---

2. Reduce input size

Instead of:

    Send 200 page PDF

Do:

    Page 1–20
    Page 21–40
    Page 41–60

Process each range separately.

---

3. Add chunking in code

Example logic:

    text = extract_pdf()
    chunk_size = 1000
    chunks = split(text, chunk_size)   # pass the chunk size to the splitter
    for chunk in chunks:
        send_to_ollama(chunk)

---

4. Use RAG (recommended for company/intranet AI)

Stack:

Ollama
LangChain
FAISS or ChromaDB
Embedding m...
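As a rough sketch of the "split into chunks → retrieve only relevant chunks → send to Ollama" flow described in this post, the Python below uses only the standard library. Several things here are assumptions for illustration, not part of the original post: the helper names (`split_into_chunks`, `top_chunks`, `ask_ollama`) are hypothetical, chunk size is counted in whitespace-separated words as a crude stand-in for tokens, and the ranking uses simple word overlap in place of real embeddings and a vector DB such as FAISS or Chroma. The request assumes a local Ollama server on its default port with the `llama3` model pulled.

```python
import json
import urllib.request
from collections import Counter

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def split_into_chunks(text, chunk_size=1000, overlap=100):
    """Split text into chunks of about chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in one of them. Words only approximate tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def top_chunks(question, chunks, k=3):
    """Return the k chunks sharing the most words with the question.

    This word-overlap score is a toy stand-in for embedding similarity.
    """
    q_words = Counter(question.lower().split())

    def score(chunk):
        c_words = Counter(chunk.lower().split())
        return sum(min(q_words[w], c_words[w]) for w in q_words)

    return sorted(chunks, key=score, reverse=True)[:k]


def ask_ollama(question, context, model="llama3", num_ctx=8192):
    """Send the question plus the selected context to Ollama's generate endpoint."""
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,                  # return one JSON object, not a stream
        "options": {"num_ctx": num_ctx},  # per-request context window size
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the extracted text of the large file in a variable `text`, a call would look like `ask_ollama(question, "\n\n".join(top_chunks(question, split_into_chunks(text))))`, so only a few relevant chunks reach the model instead of the whole file. For real use, the overlap scoring would be replaced by an embedding model and a vector store.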