mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-24 15:13:03 +02:00

Updated Setting up Search (markdown)

Nick Sweeting
2024-05-07 01:12:22 -07:00
parent 466cfd8bae
commit 602b8b5359

@@ -14,6 +14,7 @@ You can search your ArchiveBox data in a number of ways:
> This will be [improved in the future](https://zulip.archivebox.io/#narrow/stream/154-support/topic/Full.20Text.20Search.20works.2E.2E.2E.20but.20is.20there.20a.20UI.3F) to highlight the *specific paragraph/line/area that matched* within a Snapshot.
> For now we recommend using Ctrl+F in the browser or one of the external tools listed above to further filter for a term within a Snapshot's contents.
<br/>
---
@@ -151,6 +152,8 @@ docker compose run archivebox update --index-only
docker compose run archivebox list --filter-type=search 'some text to search'
```
*For more detailed instructions, [see here](https://github.com/ArchiveBox/ArchiveBox/issues/956#issuecomment-1320587158)...*
#### Pros
- extremely fast, most queries complete in microseconds even with 100k+ snapshots
@@ -211,3 +214,19 @@ archivebox config --set FTS_SQLITE_MAX_LENGTH=1000000000
- Maintains a (compressed, but still potentially large) duplicate copy of all searchable text in `search.sqlite3` db
- Does not support searching binary files (PDFs, eBooks, compressed archives, etc.)
- Search indexing and querying must be performed on the same server as the ArchiveBox data (we don't yet support sending FTS5 queries to a remote server)
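
The trade-offs above follow from how an SQLite FTS5 index works: the searchable text is copied into a virtual table that lives alongside the rest of the data, and queries run locally against that table. The following is a minimal sketch of that mechanism using Python's standard `sqlite3` module. It is not ArchiveBox's actual schema or code: the table name `snapshot_fts`, the column names, and the helpers `build_index` and `search` are hypothetical, and the sketch assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_index(docs):
    """Build an in-memory FTS5 index from a {snapshot_id: text} mapping.

    Hypothetical helper for illustration; ArchiveBox's real index lives in
    search.sqlite3 and may use a different layout.
    """
    conn = sqlite3.connect(":memory:")
    # The FTS5 virtual table stores its own copy of the text, which is why
    # the index duplicates the searchable content.
    conn.execute(
        "CREATE VIRTUAL TABLE snapshot_fts USING fts5(snapshot_id UNINDEXED, body)"
    )
    conn.executemany(
        "INSERT INTO snapshot_fts (snapshot_id, body) VALUES (?, ?)",
        list(docs.items()),
    )
    conn.commit()
    return conn


def search(conn, query):
    """Return the snapshot ids whose text matches the FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT snapshot_id FROM snapshot_fts WHERE snapshot_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_index(
        {
            "snap-1": "The Internet Archive preserves web pages",
            "snap-2": "Recipe for sourdough bread",
        }
    )
    print(search(conn, "archive"))
```

Only plain text can be inserted into the `body` column, so binary formats would need to be converted to text before indexing.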
<br/>
---
<br/>
### Further Reading
- https://github.com/ArchiveBox/ArchiveBox/blob/dev/docker-compose.yml#:~:text=SEARCH_BACKEND_ENGINE
- https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#ripgrep_binary
* [#1139 Feature Request: Add AI-assisted summarization, tagging, search, and more using LLMs / RAG](https://github.com/ArchiveBox/ArchiveBox/issues/1139)