knowledge-library

Auto-generated from project docs

North Star

Booklet Status

| Booklet | Status | Appetite | Notes |
|---|---|---|---|
| B0 | DONE | 30 min | File Browser deployed (replaced in B1) |
| B1 | DONE | 30 min | Filestash deployed, dirs restructured, File Browser removed |
| B2 | DONE | 2 hours | 181 docs, 2632 chunks. pgvector schema, hybrid search RPC, corpus-ingest.py, corpus-api (systemd:18793), corpus-search CLI |
| B3 | P0 DONE, P1 PENDING | 2 hours | dufs WebDAV live, auto-ingest working. Syncthing pending. Mister iPad test pending. |
| B4 | PLANNED | 3 hours | Datasette explorer + Radar bridge + auto-sourcing |
| B5 | PLANNED | 3 hours | Contextual retrieval, reranking, RAGAS evaluation |
| B6 | PLANNED | 3 hours | Fine-tuning data prep + Google migration (rclone) |
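
The B2 row lists a hybrid search RPC but does not say how vector and keyword results are combined. One common way to merge two ranked lists is reciprocal rank fusion (RRF); the sketch below illustrates that idea only and is not taken from the project. The function name, the chunk ids, and the constant `k=60` are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranking.

    Hypothetical helper: each list contributes 1 / (k + rank) per id,
    so ids that rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Made-up chunk ids standing in for vector-search and keyword-search hits.
vector_hits = ["c12", "c7", "c3"]
keyword_hits = ["c7", "c99", "c12"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Here `c7` and `c12` appear in both lists, so they outrank `c99` and `c3`, which each appear in only one. In a pgvector setup the same fusion could run inside the database function instead of in Python; where it belongs depends on how the actual RPC is written.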

Recent Decisions

| Date | Decision | Rationale |
|---|---|---|
| 2026-04-12 | Project kickstarted | CAO client workflow needs central reference library |
| 2026-04-12 | Filestash over File Browser | Beautiful UI, 30MB RAM, direct filesystem access |
| 2026-04-12 | No Nextcloud/Seafile/JVM tools | Bloated, proprietary storage, INCIDENT-039 |
| 2026-04-12 | RESHAPED: file browser → LLM training corpus | Primary purpose is agent grounding + fine-tuning, not file browsing |
| 2026-04-12 | pgvector over ChromaDB/Qdrant/Weaviate | Already running, zero new RAM (RESEARCH-223) |
| 2026-04-12 | Docling over Unstructured.io | MIT, lighter, no Docker (RESEARCH-223) |
| 2026-04-12 | nomic-embed-text on cmd-aorus | 768-dim matches mem0, free, offloads VPS |
| 2026-04-12 | Bucket C → B reclassified | Corpus grounds agents for revenue work (Safetii, CAO) |
| 2026-04-12 | Paperless-ngx deferred | Docling handles PDF/OCR; Paperless adds 400MB for marginal gain |

Source: /root/projects/knowledge-library/