mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-30 17:50:04 +02:00

Updated Home (markdown)

Nick Sweeting
2019-03-05 12:26:01 -05:00
parent 243db316b8
commit aebfda61a6

@@ -53,11 +53,13 @@ organized by timestamp bookmarked. It's Powered by [headless](https://developer
Wget doesn't work on sites you need to be logged into, but headless Chrome does; see the [Configuration](#configuration) section for `CHROME_USER_DATA_DIR`.
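For example, you can point the archiver at an existing Chrome profile so headless Chrome reuses your login cookies. This is a sketch using the env-var config style this wiki uses elsewhere; the profile path below is an assumption (it varies by OS), so adjust it for your system:

```bash
# Reuse an existing Chrome profile for logged-in sites.
# The path below is a typical macOS location -- substitute your own user data dir.
env CHROME_USER_DATA_DIR="$HOME/Library/Application Support/Google/Chrome/Default" ./archive export.html
```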
### Large Exports
I've found it takes about an hour to download 1,000 articles, and they take up roughly 1GB of disk.
Those numbers are from running it single-threaded on my i5 machine with a 50Mbps downlink. YMMV.
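As a back-of-the-envelope check, those figures work out to roughly 3.6 seconds and 1MB per link, so you can estimate your own export up front. A minimal sketch (the per-link numbers are just the rough averages quoted above, not guarantees):

```bash
# Rough runtime/storage estimate from ~1 hour and ~1GB per 1,000 articles.
LINKS=2500  # number of links in your export
echo "estimated runtime: $(( LINKS * 3600 / 1000 / 60 )) minutes"
echo "estimated storage: ~$(( LINKS / 1000 ))GB"
```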
Storage requirements go up immensely if you're using `FETCH_MEDIA=True` and are archiving many pages with audio & video.
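If you want to keep storage manageable, you can leave media fetching off. A sketch using the same env-var config style as the rest of this wiki:

```bash
# Skip youtube-dl audio/video downloads, which dominate storage on media-heavy pages.
env FETCH_MEDIA=False ./archive export.html
```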
You can run it in parallel by using the `resume` feature, or by manually splitting export.html into multiple files:
```bash
./archive export.html 1498800000 & # second argument is timestamp to resume downloading from