mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-26 07:44:41 +02:00

Updated Home (markdown)

Nick Sweeting
2019-01-11 07:29:28 -05:00
parent fa10e3f006
commit f3c0f8dedb

15
Home.md

@@ -28,7 +28,7 @@ For each site, it outputs (configurable):
 - Index summary pages: index.html & index.json
 The archiving is additive, so you can schedule `./archive` to run regularly and pull new links into the index.
-All the saved content is static and indexed with json files, so it lives forever & is easily parseable, it requires no always-running backend.
+All the saved content is static and indexed with JSON files, so it lives forever & is easily parseable, it requires no always-running backend.
 [DEMO: archive.sweeting.me](https://archive.sweeting.me)
@@ -50,8 +50,9 @@ For each sites it saves:
 - `screenshot.png` 1440x900 screenshot of site using headless chrome
 - `output.html` DOM Dump of the HTML after rendering using headless chrome
 - `archive.org.txt` A link to the saved site on archive.org
-- `audio/` and `video/` for sites like youtube, soundcloud, etc. (using youtube-dl) (WIP)
-- `code/` clone of any repository for github, bitbucket, or gitlab links (WIP)
+- `warc/` for the html + gzipped warc file <timestamp>.gz
+- `media/` for sites like youtube, soundcloud, etc. (using youtube-dl)
+- `git/` clone of any repository for github, bitbucket, or gitlab links)
 - `index.json` JSON index containing link info and archive details
 - `index.html` HTML index containing link info and archive details (optional fancy or simple index)
@@ -79,11 +80,11 @@ which are already in the index.
 ## Info & Motivation
 This is basically an open-source version of [Pocket Premium](https://getpocket.com/premium) (which you should consider paying for!).
-I got tired of sites I saved going offline or changing their URLS, so I started
+I got tired of sites I saved going offline or changing their URLs, so I started
 archiving a copy of them locally now, similar to The Way-Back Machine provided
-by [archive.org](https://archive.org). Self hosting your own archive allows you to save
-PDFs & Screenshots of dynamic sites in addition to static html, something archive.org doesn't do.
-Now I can rest soundly knowing important articles and resources I like wont dissapear off the internet.
+by [archive.org](https://archive.org). Self-hosting your own archive allows you to save
+PDFs & Screenshots of dynamic sites in addition to static HTML, something archive.org doesn't do.
+Now I can rest soundly knowing important articles and resources I like won't disappear off the internet.
 My published archive as an example: [archive.sweeting.me](https://archive.sweeting.me).