mirror of
https://github.com/pirate/ArchiveBox.git
synced 2025-08-30 17:50:04 +02:00
Updated Home (markdown)
4
Home.md
@@ -53,11 +53,13 @@ organized by timestamp bookmarked. It's Powered by [headless](https://developer
Wget doesn't work on sites you need to be logged into, but headless Chrome does; see the [Configuration](#configuration) section for `CHROME_USER_DATA_DIR`.

**Large Exports & Estimated Runtime:**
### Large Exports

I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1 GB.
Those numbers are from running it single-threaded on my i5 machine with 50 Mbps down. YMMV.

Storage requirements go up immensely if you're using `FETCH_MEDIA=True` and are archiving many pages with audio & video.
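
As a rough back-of-the-envelope aid, the figures quoted above (about 1000 articles per hour, roughly 1 GB per 1000 articles) can be scaled to other export sizes. The sketch below assumes the scaling is linear, that the run is single-threaded, and that `FETCH_MEDIA` is off; `estimate_export` is a hypothetical helper for illustration, not part of the archiver.

```bash
#!/usr/bin/env bash
# Hypothetical helper: scales the rough figures from this page
# (~1000 articles/hour, ~1 GB per 1000 articles) to a given link count.
# Assumes linear scaling, single-threaded, FETCH_MEDIA off. YMMV.
estimate_export() {
    local links="$1"
    # Integer arithmetic: 1000 links/hour is 60 minutes per 1000 links
    local minutes=$(( links * 60 / 1000 ))
    # ~1 GB per 1000 links is ~1 MB per link
    local megabytes=$(( links ))
    printf '%d links: ~%d min, ~%d MB\n' "$links" "$minutes" "$megabytes"
}

estimate_export 2500
```

Treat the result as an order-of-magnitude guess only: media-heavy pages or a slower connection can push both numbers well past it.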

You can run it in parallel by using the `resume` feature, or by manually splitting `export.html` into multiple files:
```bash
./archive export.html 1498800000 & # second argument is timestamp to resume downloading from