docrag

Author	SHA1	Message	Date
Z User	7a6b6f1086	Add local tool selector: keyword parser picks relevant tools, no LLM. _select_tools() parses the user message with keyword matching: - News keywords → news_aggregate, news_get_top_stories, news_get_reddit - Finance/stock keywords → finance_get_stock_info/history (extracts ticker) - Crypto keywords → finance_get_crypto_price (extracts coin name), finance_get_top_cryptos - Weather keywords → weather_get_current/forecast/air_quality (extracts location) - Medical keywords → pubmed, fda, disease data, health topics - Science keywords → science_aggregate_search - Wikipedia keywords → wikipedia_search - Always: web_search + web_instant_answer as general fallback - URL in message → web_get_page_content. Entity extractors: - _extract_ticker: maps known company names, handles $TICKER format - _extract_crypto: maps known crypto names to CoinGecko IDs - _extract_location: preposition-based + known locations (prefers longest match) - _extract_subject: strips question patterns, leading articles, trailing punctuation. Flow remains: request → select tools → run in parallel → results into system prompt → 1 LLM call	2026-03-29 18:44:14 +00:00
Z User	70109d6889	Rewrite: firehose all tools in parallel, then single LLM call. No LLM needed for tool selection. Flow is now: Request → run ALL tools in parallel → results into system prompt → 1 LLM call. - _run_all_tools: fires every tool concurrently (30s timeout each) - No required args: run with schema defaults - Query-like required args (query, topic, title, etc): use user message - Specific args (symbol, url, pmid): skip (can't guess) - _build_tool_results_text: formats all results into system prompt - build_enhanced_messages: system prompt now has real-time data section - call_llm: dead simple, just prompt → response (replaces generate_response) - Removed: generate_response, _parse_tool_calls, _clean_tool_syntax, _build_tool_descriptions (all dead code now) - Streaming path: same flow, runs tools then streams the LLM response - Both streaming and non-streaming use identical tool pipeline	2026-03-29 18:36:37 +00:00
Z User	8a46a78a4e	Fix: add robust parsing, logging, and safety net for empty responses. Three fixes for the 'I apologize, couldnt generate a response' bug: 1. Safety net: if _clean_tool_syntax strips ALL content (e.g. the LLM output only the JSON tool call block and nothing else), return the original content instead of the useless error message. 2. Detailed logging: now logs the first 300 chars of every LLM response so we can see exactly what the model outputs. Also logs which parse pattern matched and which tool names were found. 3. Desperate fallback parser (Pattern 4): if none of the regex/brace patterns match, tries to json.loads() the entire content and looks for known tool names. Catches LLMs that output the array directly or use slightly different formatting.	2026-03-29 18:11:43 +00:00
Z User	a2285d3a48	Switch to mega-tool-call approach for unlimited tool calls. The upstream LLM only supports 2 native tool calls per response, but the user needs to fire many tools at once. Solution: content-based 'mega tool call' where the LLM bundles ALL tool calls into a single JSON array in its response text. Key changes: - System prompt: tells LLM to output {tool_calls: [...]} array with ALL needed tools in one block (no native tools param) - _parse_tool_calls: parses the tool_calls array format (with legacy tool_call single-object fallback) - generate_response: NO tools/tool_choice params to API, pure content-based parsing - generate_response: executes ALL tools concurrently via asyncio.gather - generate_response: feeds ALL results back in one consolidated message - _clean_tool_syntax: strips both tool_calls and tool_call blocks	2026-03-29 18:06:39 +00:00
Z User	57228625fc	Fix tool calling: switch to native OpenAI tools parameter. Problems fixed: - 'Mega tool call': LLM outputting multiple tool calls that got bundled into one. Now uses native OpenAI tools parameter which handles multiple tool calls properly via message.tool_calls array. - 'Returning nothing': _clean_tool_syntax was too aggressive, stripping the entire response. Now only strips code-fence-wrapped blocks. - Tool results were appended to system message growing it unboundedly; now uses proper 'tool' role messages in conversation history. Key changes: - generate_response: passes tools/tool_choice to OpenAI API (native tool calling), with retry without tool_choice for unsupported models - generate_response: handles multiple tool_calls per response natively - generate_response: uses proper 'tool' role for results instead of appending to system message - _parse_tool_calls (was _parse_tool_call): now returns a list, supports multiple tool calls, used as fallback for models without native tools - _clean_tool_syntax: much less aggressive, only strips code-fence blocks, no longer removes bare JSON (was eating valid responses) - System prompt: removed JSON format instructions (native tools handles format), simplified rules	2026-03-29 17:57:26 +00:00
Z User	c03bde8023	Fix tool call parsing, improve embeddings, and fix async issues - main.py: Rewrote _parse_tool_call with brace-counting for robust JSON extraction - main.py: Improved _clean_tool_syntax with brace-aware removal of tool_call JSON - main.py: Fixed dict key mismatches (chunks_ingested, pages_downloaded) - main.py: Run tool execution in asyncio.to_thread to avoid blocking event loop - main.py: Always clean tool syntax from responses (handles edge cases) - rag/__init__.py: Wrap blocking website_downloader in run_in_executor - rag/__init__.py: Replace deprecated datetime.utcnow() with datetime.now(timezone.utc) - rag/__init__.py: Add add_document_from_url method - rag/vector_store.py: Replace hash-based embeddings with TF-IDF inspired embeddings - rag/vector_store.py: Add embedding dimension mismatch handling in search - README.md: Update API key config documentation	2026-03-29 17:49:32 +00:00
Z User	6eb18ce7f3	Switch to context-based tool calling (no API tool limit). Instead of passing tools to the OpenRouter API (limited to 10 tools): - Tool descriptions are now embedded in the system prompt - LLM outputs tool calls as JSON: {"tool_call": {"name": "...", "arguments": {...}}} - We parse the response, execute tools, and feed results back - Supports all 33 tools without hitting the API limit. Changes: - Added _build_tool_descriptions() for tool docs in prompt - Added _parse_tool_call() to extract tool requests from LLM output - Added _clean_tool_syntax() to remove tool JSON from responses - Rewrote generate_response() for context-based approach - Updated system prompt with tool usage instructions	2026-03-29 17:02:02 +00:00
Z User	ac0eff1cdd	Fix: Prevent website re-downloads and skip automated tasks - Skip website download for Open WebUI automated tasks (title, tags, follow-ups) - Check if site already downloaded before re-downloading - Return cached site info if previously downloaded - Reduces unnecessary network calls and processing time	2026-03-29 16:54:38 +00:00
Z User	d966f8ea5d	Add detailed logging for debugging tool calling issues - Log full LLM response object - Log message content and tool calls - Log request start/end with request_id - Add traceback logging for errors	2026-03-29 16:25:44 +00:00
Z User	b811162f78	Implement tool calling loop for LLM - Pass all registered tools to LLM during chat completion - Handle tool_calls from LLM response - Execute tools and feed results back to LLM - Loop until LLM returns final response - Updated system prompt to encourage tool use - Updated streaming to handle tool calls - Increased MAX_TOOL_ITERATIONS to 5	2026-03-29 16:07:56 +00:00
Z User	973bf5ab88	Fix AsyncOpenAI proxy compatibility issue - Create custom httpx.AsyncClient to avoid proxy argument error - This fixes 'AsyncClient.__init__() got an unexpected keyword argument proxies'	2026-03-29 04:51:23 +00:00
Z User	5ec2ef5911	Fix .env loading and add debug logging for API key - Load .env from script directory explicitly - Add logging to show .env file location and existence - Show API key preview on startup for debugging	2026-03-29 04:47:54 +00:00
Z User	b23964b35a	Switch from ZAI SDK to OpenRouter with openrouter/free model - Replace z-ai-web-dev-sdk with openai SDK - Add OPENROUTER_API_KEY and OPENROUTER_BASE_URL config - Update AsyncOpenAI client for OpenRouter - Update generate_response and stream_chat_completion - Update .env.example with OpenRouter settings	2026-03-29 04:35:54 +00:00
Z User	10e61dd2f1	Fix: Auto-download websites BEFORE RAG retrieval. Key changes: - Add URL extraction and detection functions - Download websites BEFORE RAG retrieval (not after) - Expand trigger keywords to include common phrases like 'go to', 'headlines', etc. - Update system prompt to tell LLM it CAN access websites - Improve streaming response handling. Now when user asks 'go to orovillemr.com and give me the headlines': 1. System detects URL and access intent 2. Downloads and ingests website content 3. RAG retrieves relevant content 4. LLM generates response with actual website content	2026-03-29 03:58:39 +00:00
Z User	6aecc4b231	Integrate website_downloader_tool into RAG system. Features: - RAG system now uses website_downloader_tool as primary content ingestion method - download_and_ingest_website() method for complete website processing - Stores page pointers (source_url, page_url, local_path) in vector store - Site registry tracks all downloaded websites with metadata - New API endpoints for website management: - POST /v1/documents/website - Download and ingest a website - GET /v1/documents/sites - List all downloaded sites - GET /v1/documents/sites/{url} - Get site info - DELETE /v1/documents/sites/{url} - Delete a site and its content. Changes: - rag/__init__.py: Added download_and_ingest_website(), site registry - rag/document_processor.py: Added extract_text_from_html() public method - rag/vector_store.py: Added delete_by_source_url(), get_stats() - main.py: New website endpoints, integrated tool with RAG system	2026-03-29 02:36:59 +00:00
Z User	eabdadfb62	Implement full DocRAG server with OpenAI-compatible API. Features: - FastAPI server with OpenAI-compatible endpoints (/v1/chat/completions, /v1/models) - RAG system with document processing and vector storage - Support for multiple document formats (PDF, DOCX, HTML, text, code) - Streaming response support - Tool integration with website_downloader - Document management API endpoints - GLM-4.7-Flash integration via z-ai-web-dev-sdk - Works transparently with Open WebUI and other OpenAI clients. Components: - main.py: FastAPI application with OpenAI-compatible API - rag/: RAG system (document processor, vector store, retriever) - tools/: Tool manager with website_downloader integration - .env.example: Configuration template	2026-03-29 00:57:37 +00:00
turtle89431	e3681949e2	add main	2026-03-28 17:46:13 -07:00

17 Commits
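
The flow described in the two most recent commits (request → select tools → run in parallel → results into system prompt → 1 LLM call) can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the keyword table, the stub tool, and the function names select_tools, run_tools, and build_system_prompt are assumptions made up for this example. Only the tool names and the 30s per-tool timeout come from the commit messages.

```python
import asyncio
import re

# Hypothetical keyword table; the real keyword lists are not shown in the log.
KEYWORD_TOOLS = {
    "news": ["news_aggregate"],
    "headlines": ["news_aggregate"],
    "weather": ["weather_get_current"],
    "stock": ["finance_get_stock_info"],
}
FALLBACK_TOOLS = ["web_search", "web_instant_answer"]  # always selected
URL_RE = re.compile(r"https?://\S+")


def _add_unique(selected, tools):
    """Append each tool name once, preserving order."""
    for tool in tools:
        if tool not in selected:
            selected.append(tool)


def select_tools(message):
    """Pick tool names by keyword matching; no LLM involved."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    selected = []
    for keyword, tools in KEYWORD_TOOLS.items():
        if keyword in words:
            _add_unique(selected, tools)
    if URL_RE.search(message):
        _add_unique(selected, ["web_get_page_content"])
    _add_unique(selected, FALLBACK_TOOLS)
    return selected


async def fake_tool(name, message):
    """Stand-in for a real tool call (assumption for this sketch)."""
    await asyncio.sleep(0)
    return f"{name} result for {message!r}"


async def run_tools(names, message, tool=fake_tool, timeout=30.0):
    """Run every selected tool concurrently, each with its own timeout."""

    async def run_one(name):
        try:
            return await asyncio.wait_for(tool(name, message), timeout)
        except asyncio.TimeoutError:
            return "timed out"

    results = await asyncio.gather(*(run_one(n) for n in names))
    return dict(zip(names, results))


def build_system_prompt(results):
    """Format tool results as a real-time data section for one LLM call."""
    lines = ["Real-time data:"]
    lines += [f"[{name}] {text}" for name, text in results.items()]
    return "\n".join(lines)
```

A caller would then do something like asyncio.run(run_tools(select_tools(msg), msg)), pass the result through build_system_prompt, and make a single LLM request with that text as the system prompt. Keeping selection purely keyword-based is what lets the whole pipeline finish with one LLM call.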