mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-21 21:57:26 +02:00

Updated Roadmap (markdown)

Nick Sweeting
2020-08-15 01:05:29 -04:00
parent 6856a467f2
commit 783062969f

To see how this spec has been scheduled / implemented / released so far, read th
(this is not set in stone, just a rough estimate)
### `v0.5`: Remove live-updated JSON & HTML index in favor of `archivebox export`
- use SQLite as the main db and export staticfile indexes once at the *end* of the whole process instead of live-updating them during each extractor run (i.e. remove `patch_main_index`)
- create an `archivebox export` command
- we have to create a public view to replace `index.html` / `old.html` used for non-logged-in users
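The v0.5 change above can be sketched roughly: query the SQLite index once at the *end* of a run and write the static `index.html` in one pass, instead of live-patching it after every extractor. The `snapshot` table and its columns here are hypothetical, not ArchiveBox's real schema:

```python
import sqlite3
from pathlib import Path

def export_static_index(db_path: str, out_dir: str) -> Path:
    """Render index.html once from the SQLite index at the end of a run
    (hypothetical schema), rather than patching it during each extractor."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT url, title FROM snapshot ORDER BY added DESC"
    ).fetchall()
    conn.close()

    items = "\n".join(
        f'<li><a href="{url}">{title or url}</a></li>' for url, title in rows
    )
    out_file = Path(out_dir) / "index.html"
    out_file.write_text(f"<html><body><ul>\n{items}\n</ul></body></html>")
    return out_file
```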
### `v0.6`: Code cleanup / refactor
- move config loading logic into `settings.py`
- move all the extractors into "plugin"-style folders that register their own config
  - right now, the paths of the extractor output are scattered all over the codebase, e.g. `output.pdf` (these should be moved to constants at the top of each plugin's config file)
- make `out_dir` / `link_dir` / `extractor_dir` naming consistent across the codebase
- convert all `os.path` calls and raw string paths to `pathlib.Path`
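The `os.path` → `pathlib` conversion is mostly mechanical; a hypothetical before/after for one of the scattered `output.pdf` paths (the function and constant names are illustrative, not real ArchiveBox identifiers):

```python
import os
from pathlib import Path

# Before: raw string paths and os.path.join scattered through the codebase
def output_path_old(link_dir: str, extractor: str) -> str:
    return os.path.join(link_dir, extractor, "output.pdf")

# After: the filename lives in one constant at the top of the plugin
# config, and paths compose with the / operator
OUTPUT_FILE = "output.pdf"

def output_path_new(link_dir: Path, extractor: str) -> Path:
    return Path(link_dir) / extractor / OUTPUT_FILE
```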
### `v0.7`: Schema improvements
- remove `timestamps` as primary keys in favor of hashes, UUIDs, or some other slug
- create a migration system for the folder layout that is independent of the index (`mv` is atomic at the FS level, so we just need `transaction.atomic(): move(oldpath, newpath); snap.data_dir = newpath; snap.save()`)
- make `Tag` a real model with a `ManyToMany` relation to Snapshots
- allow multiple Snapshots of the same site over time, plus a CLI / UI to manage them, plus a migration from the old-style `#2020-01-01` hack to proper versioned snapshots
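A minimal sketch of that atomic move, using plain `sqlite3` and `shutil` rather than Django's ORM (the `snapshot` table and `data_dir` column are hypothetical stand-ins for the real models):

```python
import shutil
import sqlite3
from pathlib import Path

def move_snapshot_dir(conn: sqlite3.Connection, snapshot_id: str, new_dir: Path) -> None:
    # The index update and the folder move succeed or fail together:
    # if shutil.move raises, the `with conn:` block rolls the UPDATE back,
    # and if the UPDATE fails, the folder is never moved.
    with conn:
        (old_dir,) = conn.execute(
            "SELECT data_dir FROM snapshot WHERE id = ?", (snapshot_id,)
        ).fetchone()
        conn.execute(
            "UPDATE snapshot SET data_dir = ? WHERE id = ?",
            (str(new_dir), snapshot_id),
        )
        shutil.move(old_dir, str(new_dir))  # atomic rename on the same FS
```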
### `v0.8`: Security
- Add CSRF/CSP/XSS protection to rendered archive pages
- Provide a secure reverse proxy in front of the archivebox server in `docker-compose.yml`
- Create a UX flow for users to set up session cookies / auth for archiving private sites
  - cookies for wget, curl, and other low-level commands
  - localStorage, cookies, and IndexedDB setup for Chrome archiving methods
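For the low-level commands, one user-supplied Netscape-format `cookies.txt` can be shared across extractors: wget takes `--load-cookies` and curl takes `--cookie` (`-b`) with a filename. A hypothetical helper building those argv lists:

```python
from pathlib import Path

# Hypothetical sketch: one session cookie file, two real CLI flags.
def wget_cmd(url: str, cookies_file: Path) -> list:
    # wget reads a Netscape-format cookie jar via --load-cookies
    return ["wget", "--load-cookies", str(cookies_file), url]

def curl_cmd(url: str, cookies_file: Path) -> list:
    # curl's --cookie accepts a filename (Netscape format) as well
    return ["curl", "--cookie", str(cookies_file), "-O", url]
```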
### `v0.9`: Performance
- set up huey and break the archiving process into tasks on a queue that a worker pool executes
- set up pyppeteer2 to wrap Chrome so that it isn't opened/closed during each extractor run
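huey is the queue named above; the same shape (archiving one URL split into independent tasks executed by a worker pool) can be sketched dependency-free with the stdlib `concurrent.futures`. The `run_extractor` stub and extractor names are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_extractor(name: str, url: str) -> str:
    # stand-in for a real extractor (wget, pdf, screenshot, ...)
    return f"{name}:{url}:done"

def archive_url(url: str, extractors: list, workers: int = 4) -> list:
    # break the archiving of one URL into independent tasks and let a
    # worker pool execute them concurrently instead of running inline
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_extractor, name, url) for name in extractors]
        return sorted(f.result() for f in as_completed(futures))
```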
### `v1.0`: Full headless browser control
- run user-scripts / extensions in the context of the page during archiving
  - community userscripts for unrolling Twitter threads, Reddit threads, YouTube comment sections, etc.
- pywb-based headless browser session recording and WARC replay
- archive proxy support
  - support sending upstream requests through an external proxy
  - support exposing a proxy that archives all downstream traffic
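For the upstream-proxy side, a sketch of the two obvious wiring points: Chrome's real `--proxy-server` flag, and the `http_proxy`/`https_proxy` environment variables that wget and curl honor (the helper names here are hypothetical):

```python
def chrome_args(proxy_url: str = None) -> list:
    # Chrome/Chromium accepts an upstream proxy via --proxy-server
    args = ["chromium-browser", "--headless", "--dump-dom"]
    if proxy_url:
        args.append(f"--proxy-server={proxy_url}")
    return args

def proxy_env(proxy_url: str) -> dict:
    # wget and curl both honor these standard environment variables
    return {"http_proxy": proxy_url, "https_proxy": proxy_url}
```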
...
### `v2.0`: Federated or distributed archiving + paid hosted service offering
- Merkle tree for storing archive output subresource hashes
- DHT for assigning Merkle-tree hash:file shards to nodes
- tag system for tagging certain hashes with human-readable names, e.g. title, url, tags, filetype, etc.
- distributed tag lookup system
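A minimal Merkle-root sketch over subresource bytes, just to pin down the shape of the idea (a real spec would also fix leaf ordering, domain separation, and odd-node handling):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Hash each subresource, then pair-and-hash up to a single root.
    Hypothetical scheme: odd levels duplicate their last hash."""
    level = [sha256(leaf) for leaf in leaves]
    if not level:
        return sha256(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last hash on odd levels
        level = [
            sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)
        ]
    return level[0]
```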