diff --git a/Setting-up-Search.md b/Setting-up-Search.md
index 628b265..fcf9b7d 100644
--- a/Setting-up-Search.md
+++ b/Setting-up-Search.md
@@ -48,6 +48,8 @@ However, there are some fundamental limitations of scanning through every file o
+
+
 ### `ripgrep` *(the default)*
 
 If you do not already have `ripgrep` installed, follow the [instructions here](https://github.com/BurntSushi/ripgrep#installation) to get it.
@@ -78,6 +80,8 @@ archivebox list --filter-type=search 'text to search for'
+
+
 ### `ripgrep-all` (aka `rga`)
 
 The same as ripgrep except that it supports searching more binary filetypes like PDFs, eBooks, Office documents, zip, tar.gz, etc.
@@ -97,6 +101,8 @@ archivebox list --filter-type=search 'text to search for'
+
+
 ### `ugrep`
 
 Not tested by the ArchiveBox team but it's very similar to `ripgrep` and may work as a drop-in replacement, with some caveats. (contributions welcome to improve support)
@@ -123,6 +129,8 @@ archivebox config --set RIPGREP_BINARY=ugrep+

+
+
 ### `sonic` ⭐️ (the recommended upgrade path for most people)
 
 [Sonic](https://github.com/valeriansaliou/sonic) is a fast, lightweight, rust-based alternative to super-heavy traditional search backends like Elasticsearch. It is capable of normalizing natural language search queries, fuzzy matching, and searching Unicode, without needing to maintain a duplicate document store index of all the searchable text.
@@ -172,6 +180,8 @@ docker compose run archivebox list --filter-type=search 'some text to search'
+
+
 ### `SQLite FTS5`
 
 This is a [recently added](https://github.com/ArchiveBox/ArchiveBox/pull/1241) experimental option that uses a separate SQLite3 Database (similar to the one ArchiveBox already uses for Snapshot metadata) to provide full-text search.