docrag

Z User 6aecc4b231	2026-03-29 02:36:59 +00:00
Integrate website_downloader_tool into RAG system

Features:
- RAG system now uses website_downloader_tool as primary content ingestion method
- download_and_ingest_website() method for complete website processing
- Stores page pointers (source_url, page_url, local_path) in vector store
- Site registry tracks all downloaded websites with metadata
- New API endpoints for website management:
  - POST /v1/documents/website - Download and ingest a website
  - GET /v1/documents/sites - List all downloaded sites
  - GET /v1/documents/sites/{url} - Get site info
  - DELETE /v1/documents/sites/{url} - Delete a site and its content

Changes:
- rag/__init__.py: Added download_and_ingest_website(), site registry
- rag/document_processor.py: Added extract_text_from_html() public method
- rag/vector_store.py: Added delete_by_source_url(), get_stats()
- main.py: New website endpoints, integrated tool with RAG system
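
The commit message names delete_by_source_url() and get_stats() in rag/vector_store.py and says each stored page carries three pointers (source_url, page_url, local_path). The repository code is not shown here, so what follows is only a minimal sketch of how such a store could behave. PageChunk, InMemoryStore, add(), and the keys returned by get_stats() are hypothetical names chosen for illustration, and embeddings are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PageChunk:
    """One stored chunk plus the three page pointers named in the commit."""
    text: str
    source_url: str  # root URL of the downloaded site
    page_url: str    # URL of the specific page the text came from
    local_path: str  # location of the downloaded copy on disk


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the real vector store (no embeddings)."""
    chunks: List[PageChunk] = field(default_factory=list)

    def add(self, chunk: PageChunk) -> None:
        self.chunks.append(chunk)

    def delete_by_source_url(self, source_url: str) -> int:
        """Remove every chunk belonging to one site; return the count removed."""
        before = len(self.chunks)
        self.chunks = [c for c in self.chunks if c.source_url != source_url]
        return before - len(self.chunks)

    def get_stats(self) -> Dict[str, int]:
        """Report how many chunks and how many distinct sites are stored."""
        return {
            "chunks": len(self.chunks),
            "sites": len({c.source_url for c in self.chunks}),
        }


if __name__ == "__main__":
    store = InMemoryStore()
    store.add(PageChunk("Intro", "https://example.com",
                        "https://example.com/intro", "/tmp/example.com/intro.html"))
    store.add(PageChunk("FAQ", "https://example.com",
                        "https://example.com/faq", "/tmp/example.com/faq.html"))
    print(store.get_stats())
    print(store.delete_by_source_url("https://example.com"))
```

Keying deletion on source_url rather than page_url is what would let a single DELETE /v1/documents/sites/{url} call drop every page of a site at once, assuming the endpoint delegates to a method like this.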
__init__.py	Integrate website_downloader_tool into RAG system	2026-03-29 02:36:59 +00:00
document_processor.py	Integrate website_downloader_tool into RAG system	2026-03-29 02:36:59 +00:00
retriever.py	Implement full DocRAG server with OpenAI-compatible API	2026-03-29 00:57:37 +00:00
vector_store.py	Integrate website_downloader_tool into RAG system	2026-03-29 02:36:59 +00:00