From aebfda61a6625a6d1e1f1ee43cd82e7a5943153b Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Tue, 5 Mar 2019 12:26:01 -0500
Subject: [PATCH] Updated Home (markdown)

---
 Home.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Home.md b/Home.md
index 4550c41..249bb8c 100644
--- a/Home.md
+++ b/Home.md
@@ -53,11 +53,13 @@ organized by timestamp bookmarked. It's Powered by [headless](https://developer
 
 Wget doesn't work on sites you need to be logged into, but chrome headless does, see the [Configuration](#configuration)* section for `CHROME_USER_DATA_DIR`.
 
-**Large Exports & Estimated Runtime:**
+### Large Exports
 
 I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
 Those numbers are from running it single-threaded on my i5 machine with 50mbps down. YMMV.
 
+Storage requirements go up immensely if you're using `FETCH_MEDIA=True` and are archiving many pages with audio & video.
+
 You can run it in parallel by using the `resume` feature, or by manually splitting export.html into multiple files:
 ```bash
 ./archive export.html 1498800000 &  # second argument is timestamp to resume downloading from
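
For context on the parallel workflow the hunk above refers to, a minimal sketch of running several `./archive` invocations in the background, assuming the second argument is the resume timestamp as shown in the diff; the second timestamp and the number of jobs are illustrative, not taken from the wiki page:

```bash
# Sketch only: timestamps other than 1498800000 are hypothetical examples.
./archive export.html 1498800000 &   # first worker resumes from this timestamp (from the diff above)
./archive export.html 1520000000 &   # second worker, illustrative later resume point
wait                                  # block until both background jobs finish
```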