mirror of
https://github.com/pirate/ArchiveBox.git
synced 2025-08-23 14:44:21 +02:00
update url paths
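A minimal sketch of how a rename like this could be reproduced mechanically, assuming the only change is the repository owner in GitHub URLs. The helper name `rewrite_repo_urls` is hypothetical, and this is not necessarily how the commit itself was produced.

```shell
#!/bin/sh
# rewrite_repo_urls: read text on stdin and print it with the old repository
# path replaced by the new one. Hypothetical helper, named for illustration.
rewrite_repo_urls() {
    sed -e 's#github\.com/pirate/ArchiveBox#github.com/ArchiveBox/ArchiveBox#g' \
        -e 's#githubusercontent\.com/pirate/ArchiveBox#githubusercontent.com/ArchiveBox/ArchiveBox#g'
}

# Example: rewrite a single line of wiki text.
echo 'See https://github.com/pirate/ArchiveBox/issues for help.' | rewrite_repo_urls
```

Both patterns require the `/ArchiveBox` path segment, so links to other repositories under the old owner are left untouched.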
16
Changelog.md
@@ -1,6 +1,6 @@
|
|||||||
# Changelog
|
# Changelog
|
||||||
|
|
||||||
▶️ *If you're having an issue with a breaking change, or migrating your data between versions, open an [issue](https://github.com/pirate/ArchiveBox/issues) to get help.*
|
▶️ *If you're having an issue with a breaking change, or migrating your data between versions, open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues) to get help.*
|
||||||
|
|
||||||
**`ArchiveBox` was previously named `Pocket Archive Stream` and then `Bookmark Archiver`.**
|
**`ArchiveBox` was previously named `Pocket Archive Stream` and then `Bookmark Archiver`.**
|
||||||
|
|
||||||
@@ -8,7 +8,7 @@
|
|||||||
|
|
||||||
<div align="center">
|
<div align="center">
|
||||||
|
|
||||||
See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versioned source downloads and full changelog.
|
See the [releases](https://github.com/ArchiveBox/ArchiveBox/releases) page for versioned source downloads and full changelog.
|
||||||
🍰 Many thanks to our 30+ contributors and everyone in the web archiving community! 🏛
|
🍰 Many thanks to our 30+ contributors and everyone in the web archiving community! 🏛
|
||||||
|
|
||||||
</div>
|
</div>
|
||||||
@@ -19,7 +19,7 @@ See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versi
|
|||||||
- `pip install archivebox` https://pypi.org/project/archivebox/
|
- `pip install archivebox` https://pypi.org/project/archivebox/
|
||||||
- `docker run nikisweeting/archivebox` https://hub.docker.com/r/nikisweeting/archivebox
|
- `docker run nikisweeting/archivebox` https://hub.docker.com/r/nikisweeting/archivebox
|
||||||
- https://archivebox.readthedocs.io/en/latest/
|
- https://archivebox.readthedocs.io/en/latest/
|
||||||
- https://github.com/pirate/ArchiveBox/releases
|
- https://github.com/ArchiveBox/ArchiveBox/releases
|
||||||
- easy migration from previous versions
|
- easy migration from previous versions
|
||||||
```bash
|
```bash
|
||||||
cd path/to/your/archive/folder
|
cd path/to/your/archive/folder
|
||||||
@@ -33,7 +33,7 @@ See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versi
|
|||||||
- new subcommands-based CLI for `archivebox` (see below)
|
- new subcommands-based CLI for `archivebox` (see below)
|
||||||
- new Web UI with pagination, better search, filtering, permissions, and more
|
- new Web UI with pagination, better search, filtering, permissions, and more
|
||||||
- 30+ assorted bugfixes, new features, and tickets closed
|
- 30+ assorted bugfixes, new features, and tickets closed
|
||||||
- for more info, see: https://github.com/pirate/ArchiveBox/releases/tag/v0.4.9
|
- for more info, see: https://github.com/ArchiveBox/ArchiveBox/releases/tag/v0.4.9
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -81,7 +81,7 @@ See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versi
|
|||||||
|
|
||||||
---
|
---
|
||||||
- v0.2.0 released with new name
|
- v0.2.0 released with new name
|
||||||
- [renamed](https://github.com/pirate/ArchiveBox/issues/108) from **Bookmark Archiver** -> **ArchiveBox**
|
- [renamed](https://github.com/ArchiveBox/ArchiveBox/issues/108) from **Bookmark Archiver** -> **ArchiveBox**
|
||||||
|
|
||||||
---
|
---
|
||||||
- v0.1.0 released
|
- v0.1.0 released
|
||||||
@@ -104,10 +104,10 @@ See the [releases](https://github.com/pirate/ArchiveBox/releases) page for versi
|
|||||||
- Index links now work without nginx url rewrites, archive can now be hosted on github pages
|
- Index links now work without nginx url rewrites, archive can now be hosted on github pages
|
||||||
- added setup.sh script & docstrings & help commands
|
- added setup.sh script & docstrings & help commands
|
||||||
- made Chromium the default instead of Google Chrome (yay free software)
|
- made Chromium the default instead of Google Chrome (yay free software)
|
||||||
- added [env-variable](https://github.com/pirate/ArchiveBox/pull/25) configuration (thanks to https://github.com/hannah98!)
|
- added [env-variable](https://github.com/ArchiveBox/ArchiveBox/pull/25) configuration (thanks to https://github.com/hannah98!)
|
||||||
- renamed from **Pocket Archive Stream** -> **Bookmark Archiver**
|
- renamed from **Pocket Archive Stream** -> **Bookmark Archiver**
|
||||||
- added [Netscape-format](https://github.com/pirate/ArchiveBox/pull/20) export support (thanks to https://github.com/ilvar!)
|
- added [Netscape-format](https://github.com/ArchiveBox/ArchiveBox/pull/20) export support (thanks to https://github.com/ilvar!)
|
||||||
- added [Pinboard-format](https://github.com/pirate/ArchiveBox/pull/7) export support (thanks to https://github.com/sconeyard!)
|
- added [Pinboard-format](https://github.com/ArchiveBox/ArchiveBox/pull/7) export support (thanks to https://github.com/sconeyard!)
|
||||||
- front-page of HN, oops! apparently I have users to support now :grin:?
|
- front-page of HN, oops! apparently I have users to support now :grin:?
|
||||||
- added Pocket-format export support
|
- added Pocket-format export support
|
||||||
|
|
||||||
|
@@ -50,4 +50,4 @@ apt install google-chrome-beta
|
|||||||
|
|
||||||
## Troubleshooting
|
## Troubleshooting
|
||||||
|
|
||||||
If you encounter problems setting up Google Chrome or Chromium, see the [Troubleshooting](https://github.com/pirate/ArchiveBox/wiki/Troubleshooting#chromiumgoogle-chrome) page.
|
If you encounter problems setting up Google Chrome or Chromium, see the [Troubleshooting](https://github.com/ArchiveBox/ArchiveBox/wiki/Troubleshooting#chromiumgoogle-chrome) page.
|
||||||
|
@@ -1,11 +1,11 @@
|
|||||||
# Configuration
|
# Configuration
|
||||||
|
|
||||||
▶️ *The full ArchiveBox config file definition with defaults can be found here: [`archivebox/config.py`](https://github.com/pirate/ArchiveBox/blob/master/archivebox/config.py#L27).*
|
▶️ *The full ArchiveBox config file definition with defaults can be found here: [`archivebox/config.py`](https://github.com/ArchiveBox/ArchiveBox/blob/master/archivebox/config.py#L27).*
|
||||||
|
|
||||||
Configuration of ArchiveBox is done by using the `archivebox config` command, modifying the `ArchiveBox.conf` file in the data folder, or by using environment variables. All three methods work equivalently when using Docker as well.
|
Configuration of ArchiveBox is done by using the `archivebox config` command, modifying the `ArchiveBox.conf` file in the data folder, or by using environment variables. All three methods work equivalently when using Docker as well.
|
||||||
|
|
||||||
*Some equivalent examples of setting some configuration options:*
|
*Some equivalent examples of setting some configuration options:*
|
||||||
```bash
|
```bash[][]
|
||||||
archivebox config --set CHROME_BINARY=google-chrome-stable
|
archivebox config --set CHROME_BINARY=google-chrome-stable
|
||||||
# OR
|
# OR
|
||||||
echo "CHROME_BINARY=google-chrome-stable" >> ArchiveBox.conf
|
echo "CHROME_BINARY=google-chrome-stable" >> ArchiveBox.conf
|
||||||
@@ -28,7 +28,7 @@ Environment variables take precedence over the config file, which is useful if y
|
|||||||
|
|
||||||
<br/>
|
<br/>
|
||||||
|
|
||||||
All the available config options are described in this document below, but can also be found along with examples in [`etc/ArchiveBox.conf.default`](https://github.com/pirate/ArchiveBox/blob/master/etc/ArchiveBox.conf.default). The code that loads the config is in [`archivebox/config/__init__.py`](https://github.com/pirate/ArchiveBox/blob/master/archivebox/config/__init__.py#L45).
|
All the available config options are described in this document below, but can also be found along with examples in [`etc/ArchiveBox.conf.default`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/ArchiveBox.conf.default). The code that loads the config is in [`archivebox/config/__init__.py`](https://github.com/ArchiveBox/ArchiveBox/blob/master/archivebox/config/__init__.py#L45).
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -84,7 +84,7 @@ Maximum allowed download time for fetching media when `SAVE_MEDIA=True` in secon
|
|||||||
---
|
---
|
||||||
#### `TEMPLATES_DIR`
|
#### `TEMPLATES_DIR`
|
||||||
**Possible Values:** [`$REPO_DIR/archivebox/templates`]/`/path/to/custom/templates`/...
|
**Possible Values:** [`$REPO_DIR/archivebox/templates`]/`/path/to/custom/templates`/...
|
||||||
Path to a directory containing custom index html templates for theming your archive output. Files found in the folder at the specified path can override any of the defaults in the [`archivebox/themes`](https://github.com/pirate/ArchiveBox/tree/master/archivebox/themes) directory. If you've used `django` before, this works exactly the same way that `django` template overrides work (because it uses `django` under the hood).
|
Path to a directory containing custom index html templates for theming your archive output. Files found in the folder at the specified path can override any of the defaults in the [`archivebox/themes`](https://github.com/ArchiveBox/ArchiveBox/tree/master/archivebox/themes) directory. If you've used `django` before, this works exactly the same way that `django` template overrides work (because it uses `django` under the hood).
|
||||||
|
|
||||||
*Related options:*
|
*Related options:*
|
||||||
[`FOOTER_INFO`](#footer_info)
|
[`FOOTER_INFO`](#footer_info)
|
||||||
@@ -428,3 +428,4 @@ This can be installed using `npm install -g git+https://github.com/pirate/readab
|
|||||||
|
|
||||||
|
|
||||||
<img src="https://i.imgur.com/almAbwK.png" width="100%"/>
|
<img src="https://i.imgur.com/almAbwK.png" width="100%"/>
|
||||||
|
[]:
|
||||||
|
@@ -37,7 +37,7 @@ docker run -v $PWD:/data -p 8000:8000 nikisweeting/archivebox server 0.0.0.0:800
|
|||||||
|
|
||||||
## Docker Compose
|
## Docker Compose
|
||||||
|
|
||||||
An example [`docker-compose.yml`](https://github.com/pirate/ArchiveBox/blob/master/docker-compose.yml) config with ArchiveBox and an Nginx server to serve the archive is included in the project root. You can edit it as you see fit, or just run it as it comes out-of-the-box.
|
An example [`docker-compose.yml`](https://github.com/ArchiveBox/ArchiveBox/blob/master/docker-compose.yml) config with ArchiveBox and an Nginx server to serve the archive is included in the project root. You can edit it as you see fit, or just run it as it comes out-of-the-box.
|
||||||
|
|
||||||
Just make sure you have a Docker version that's [new enough](https://docs.docker.com/compose/compose-file/) to support `version: 3` format:
|
Just make sure you have a Docker version that's [new enough](https://docs.docker.com/compose/compose-file/) to support `version: 3` format:
|
||||||
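The version requirement can also be checked from a script. The sketch below compares two version strings with `sort -V`. The helper name `version_ge` is hypothetical, and the sample values are the ones quoted on this page (`18.09.1` installed, `17.04.0` required). In practice the current version would be parsed from `docker --version`.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="17.04.0"   # minimum stated on this page
current="18.09.1"    # sample value; normally taken from `docker --version`

if version_ge "$current" "$required"; then
    echo "Docker $current is new enough"
else
    echo "Docker $current is too old, need >= $required"
fi
```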
|
|
||||||
@@ -50,7 +50,7 @@ Docker version 18.09.1, build 4c52b90 # must be >= 17.04.0
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
mkdir archivebox && cd archivebox
|
mkdir archivebox && cd archivebox
|
||||||
wget https://raw.githubusercontent.com/pirate/ArchiveBox/master/docker-compose.yml
|
wget https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml
|
||||||
docker-compose up -d
|
docker-compose up -d
|
||||||
docker-compose run archivebox init
|
docker-compose run archivebox init
|
||||||
docker-compose run archivebox manage createsuperuser
|
docker-compose run archivebox manage createsuperuser
|
||||||
@@ -213,4 +213,4 @@ echo 'https://example.com' | docker run -it -v $PWD:/data -e FETCH_SCREENSHOT=Fa
|
|||||||
docker run -i -v $PWD:/data --env-file=ArchiveBox.env nikisweeting/archivebox
|
docker run -i -v $PWD:/data --env-file=ArchiveBox.env nikisweeting/archivebox
|
||||||
```
|
```
|
||||||
|
|
||||||
You can also edit the `data/ArchiveBox.conf` file directly and the changes will take effect on the next run.
|
You can also edit the `data/ArchiveBox.conf` file directly and the changes will take effect on the next run.
|
||||||
|
6
Home.md
@@ -10,7 +10,7 @@
|
|||||||
|
|
||||||
**❓If you need help or have a question, you can:**
|
**❓If you need help or have a question, you can:**
|
||||||
<!-- - 💬 Ask our community by joining the ArchiveBox IRC [chat room](http://webchat.freenode.net?channels=ArchiveBox&uio=d4)-->
|
<!-- - 💬 Ask our community by joining the ArchiveBox IRC [chat room](http://webchat.freenode.net?channels=ArchiveBox&uio=d4)-->
|
||||||
- 🐞 Open an [issue](https://github.com/pirate/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) in our bug tracker
|
- 🐞 Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) in our bug tracker
|
||||||
- 🗣 Reach out to me on [Twitter](https://twitter.com/theSquashSH)
|
- 🗣 Reach out to me on [Twitter](https://twitter.com/theSquashSH)
|
||||||
- 💠 Reach out to me via DM on [Patreon](https://patreon.com/theSquashSH) (you'll get the fastest response here)
|
- 💠 Reach out to me via DM on [Patreon](https://patreon.com/theSquashSH) (you'll get the fastest response here)
|
||||||
|
|
||||||
@@ -22,11 +22,11 @@
|
|||||||
<img src="https://i.imgur.com/viklZNG.png" width="30%" alt="Desktop index screenshot" align="top">
|
<img src="https://i.imgur.com/viklZNG.png" width="30%" alt="Desktop index screenshot" align="top">
|
||||||
<img src="https://i.imgur.com/RefWsXB.jpg" width="30%" alt="Desktop details page Screenshot"/><br/>
|
<img src="https://i.imgur.com/RefWsXB.jpg" width="30%" alt="Desktop details page Screenshot"/><br/>
|
||||||
|
|
||||||
<a href="https://github.com/pirate/ArchiveBox">Readme</a> | <a href="https://archive.sweeting.me/">Demo</a> | <a href="https://github.com/pirate/ArchiveBox/wiki/Quickstart">Quickstart</a> | <a href="https://github.com/pirate/ArchiveBox/wiki/Usage">Usage</a> | <a href="https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community">Community</a>
|
<a href="https://github.com/ArchiveBox/ArchiveBox">Readme</a> | <a href="https://archive.sweeting.me/">Demo</a> | <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart">Quickstart</a> | <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Usage">Usage</a> | <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community">Community</a>
|
||||||
|
|
||||||
<br/>
|
<br/>
|
||||||
<hr/>
|
<hr/>
|
||||||
|
|
||||||
[](https://www.patreon.com/theSquashSH)
|
[](https://www.patreon.com/theSquashSH)
|
||||||
|
|
||||||
</div>
|
</div>
|
||||||
|
@@ -6,13 +6,13 @@
|
|||||||
|
|
||||||
▶️ *It only takes about 5 minutes to get up and running with ArchiveBox.*
|
▶️ *It only takes about 5 minutes to get up and running with ArchiveBox.*
|
||||||
|
|
||||||
ArchiveBox [officially supports](https://github.com/pirate/ArchiveBox/wiki/Install#supported-systems) **macOS**, **Ubuntu/Debian**, and **BSD**, but likely runs on many other systems. You can run it on any system that supports **Docker**, including Windows (using Docker in WSL2).
|
ArchiveBox [officially supports](https://github.com/ArchiveBox/ArchiveBox/wiki/Install#supported-systems) **macOS**, **Ubuntu/Debian**, and **BSD**, but likely runs on many other systems. You can run it on any system that supports **Docker**, including Windows (using Docker in WSL2).
|
||||||
|
|
||||||
If you want to use Docker or Docker Compose to run ArchiveBox, see the [[Docker]] page.
|
If you want to use Docker or Docker Compose to run ArchiveBox, see the [[Docker]] page.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
First, we install the ArchiveBox [dependencies](./Install#dependencies), then we create a folder to [store the archive data](https://github.com/pirate/ArchiveBox/wiki/Usage#Disk-Layout), and finally, we [import the list of links](https://github.com/pirate/ArchiveBox/wiki/Usage#CLI-Usage) to the archive by running:
|
First, we install the ArchiveBox [dependencies](./Install#dependencies), then we create a folder to [store the archive data](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout), and finally, we [import the list of links](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#CLI-Usage) to the archive by running:
|
||||||
`archivebox add < [links_file]`
|
`archivebox add < [links_file]`
|
||||||
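As a rough illustration of the import step, the sketch below builds a small links file and feeds it to `archivebox add`. The URLs and the temporary file are placeholders, and the guard around the command is only there so the snippet is safe to run on a machine without ArchiveBox installed.

```shell
#!/bin/sh
# Build a small links file (one URL per line); the URLs are placeholders.
links_file=$(mktemp)
printf '%s\n' 'https://example.com' 'https://example.org/page' > "$links_file"

# Only call ArchiveBox when it is actually installed.
if command -v archivebox >/dev/null 2>&1; then
    archivebox add < "$links_file"
else
    echo "archivebox not found; skipping import of $links_file"
fi
```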
|
|
||||||
## 1. Set up ArchiveBox
|
## 1. Set up ArchiveBox
|
||||||
@@ -27,7 +27,7 @@ docker run -v $PWD:/data -it nikisweeting/archivebox init
|
|||||||
|
|
||||||
# alternatively, install ArchiveBox and its dependencies directly on your system without docker
|
# alternatively, install ArchiveBox and its dependencies directly on your system without docker
|
||||||
# (script prompts for user confirmation before installing anything)
|
# (script prompts for user confirmation before installing anything)
|
||||||
curl https://raw.githubusercontent.com/pirate/ArchiveBox/master/bin/setup.sh | sh
|
curl https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/bin/setup.sh | sh
|
||||||
# or follow the manual setup instructions if you don't like using curl | sh
|
# or follow the manual setup instructions if you don't like using curl | sh
|
||||||
```
|
```
|
||||||
|
|
||||||
|
12
Roadmap.md
@@ -3,7 +3,7 @@
|
|||||||
<img src="https://i.imgur.com/es97GGV.png" width="20%" align="right"/>
|
<img src="https://i.imgur.com/es97GGV.png" width="20%" align="right"/>
|
||||||
|
|
||||||
▶️ *Comment here to discuss the contribution roadmap:
|
▶️ *Comment here to discuss the contribution roadmap:
|
||||||
[Official Roadmap Discussion](https://github.com/pirate/ArchiveBox/issues/120).*
|
[Official Roadmap Discussion](https://github.com/ArchiveBox/ArchiveBox/issues/120).*
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -81,7 +81,7 @@
|
|||||||
- full-text search of extracted text with elasticsearch/elasticlunr/ag
|
- full-text search of extracted text with elasticsearch/elasticlunr/ag
|
||||||
- download closed-caption subtitles from Youtube and other video sites for full-text indexing of video content
|
- download closed-caption subtitles from Youtube and other video sites for full-text indexing of video content
|
||||||
- try pulling dead sites from archive.org and other sources if original is down (https://github.com/hartator/wayback-machine-downloader)
|
- try pulling dead sites from archive.org and other sources if original is down (https://github.com/hartator/wayback-machine-downloader)
|
||||||
- And more in the [issues list](https://github.com/pirate/ArchiveBox/issues/)...
|
- And more in the [issues list](https://github.com/ArchiveBox/ArchiveBox/issues/)...
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -93,7 +93,7 @@
|
|||||||
## Past Releases
|
## Past Releases
|
||||||
|
|
||||||
To see how this spec has been scheduled / implemented / released so far, read these pull requests:
|
To see how this spec has been scheduled / implemented / released so far, read these pull requests:
|
||||||
- ✅ [v0.2.x](https://github.com/pirate/ArchiveBox/tree/483a3bef9e2b1a7b80611947a3be99b0cf4f9959)
|
- ✅ [v0.2.x](https://github.com/ArchiveBox/ArchiveBox/tree/483a3bef9e2b1a7b80611947a3be99b0cf4f9959)
|
||||||
- ✅ [v0.3.x](https://github.com/pirate/ArchiveBox/pull/197)
|
- ✅ [v0.3.x](https://github.com/ArchiveBox/ArchiveBox/pull/197)
|
||||||
- ✅ [v0.4.x](https://github.com/pirate/ArchiveBox/pull/207)
|
- ✅ [v0.4.x](https://github.com/ArchiveBox/ArchiveBox/pull/207)
|
||||||
- 🛠 [v0.5.x](https://github.com/pirate/ArchiveBox/pull/275)
|
- 🛠 [v0.5.x](https://github.com/ArchiveBox/ArchiveBox/pull/275)
|
||||||
|
@@ -8,7 +8,7 @@ ArchiveBox ignores links that are imported multiple times (keeping the earliest
|
|||||||
This means you can add cron jobs that regularly poll the same file or URL for new links, adding only new
|
This means you can add cron jobs that regularly poll the same file or URL for new links, adding only new
|
||||||
ones as necessary.
|
ones as necessary.
|
||||||
|
|
||||||
For some example configs, see the [`etc/cron.d`](https://github.com/pirate/ArchiveBox/blob/master/etc/cron.d) and [`etc/supervisord`](https://github.com/pirate/ArchiveBox/blob/master/etc/supervisord) folders.
|
For some example configs, see the [`etc/cron.d`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/cron.d) and [`etc/supervisord`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/supervisord) folders.
|
||||||
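To see why repeated polling is safe, here is a small sketch of the "first occurrence wins" idea described above. It is not ArchiveBox's own implementation; `dedupe_links` is a hypothetical helper that only mimics the behaviour on a plain list of URLs.

```shell
#!/bin/sh
# Hypothetical helper (not part of ArchiveBox): print each URL once,
# keeping the first occurrence and preserving the original order.
dedupe_links() {
    awk '!seen[$0]++'
}

# Two polls of the same feed overlap on example.com/a; it is kept only once.
printf '%s\n' \
    'https://example.com/a' \
    'https://example.com/b' \
    'https://example.com/a' \
    'https://example.com/c' | dedupe_links

# A cron entry could then re-run the import on a schedule, for example
# (the schedule, path, and file name are assumptions):
#   0 3 * * * cd /path/to/your/archive/folder && archivebox add < links.txt
```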
|
|
||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
|
@@ -28,7 +28,7 @@ This mode should not be used for archiving entire browser history or authenticat
|
|||||||
|
|
||||||
~~ArchiveBox is designed to be able to archive content that requires authentication or cookies. This includes paywalled content, private forums, LAN-only content, etc.~~
|
~~ArchiveBox is designed to be able to archive content that requires authentication or cookies. This includes paywalled content, private forums, LAN-only content, etc.~~
|
||||||
|
|
||||||
~~To get started, set [`CHROME_USER_DATA_DIR`](https://github.com/pirate/ArchiveBox/wiki/Configuration#chrome_user_data_dir) and [`COOKIES_FILE`](https://github.com/pirate/ArchiveBox/wiki/Configuration#COOKIES_FILE) to point to a Chrome user folder that has your sessions and a wget `cookies.txt` file respectively.~~
|
~~To get started, set [`CHROME_USER_DATA_DIR`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#chrome_user_data_dir) and [`COOKIES_FILE`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#COOKIES_FILE) to point to a Chrome user folder that has your sessions and a wget `cookies.txt` file respectively.~~
|
||||||
|
|
||||||
~~If you're importing private links or authenticated content, you probably don't want to share your archive folder publicly on a webserver, so don't follow the [[Publishing Your Archive]] instructions unless you are only serving it on a trusted LAN or have some sort of authentication in front of it. Make sure to point ArchiveBox to an output folder with conservative permissions, as it may contain archived content with secret session tokens or pieces of your user data. You may also wish to encrypt the archive using an encrypted disk image or filesystem like ZFS as it will contain all requests and response data, including session keys, user data, usernames, etc.~~
|
~~If you're importing private links or authenticated content, you probably don't want to share your archive folder publicly on a webserver, so don't follow the [[Publishing Your Archive]] instructions unless you are only serving it on a trusted LAN or have some sort of authentication in front of it. Make sure to point ArchiveBox to an output folder with conservative permissions, as it may contain archived content with secret session tokens or pieces of your user data. You may also wish to encrypt the archive using an encrypted disk image or filesystem like ZFS as it will contain all requests and response data, including session keys, user data, usernames, etc.~~
|
||||||
|
|
||||||
@@ -38,8 +38,8 @@ This mode should not be used for archiving entire browser history or authenticat
|
|||||||
|
|
||||||
~~If you want ArchiveBox to be less noisy and avoid leaking any URLs to 3rd-party APIs during archiving, you can disable the options below. Disabling these is recommended if you plan on archiving any sites that use secret tokens in the URL to grant access to private content without authentication, e.g. Google Docs, CodiMD notepads, etc.~~
|
~~If you want ArchiveBox to be less noisy and avoid leaking any URLs to 3rd-party APIs during archiving, you can disable the options below. Disabling these is recommended if you plan on archiving any sites that use secret tokens in the URL to grant access to private content without authentication, e.g. Google Docs, CodiMD notepads, etc.~~
|
||||||
|
|
||||||
- `https://web.archive.org/save/{url}` when [`SUBMIT_ARCHIVE_DOT_ORG`](https://github.com/pirate/ArchiveBox/wiki/Configuration#submit_archive_dot_org) is `True`, full URLs are submitted to the Wayback Machine for archiving, but no cookies or content from the local authenticated archive are shared
|
- `https://web.archive.org/save/{url}` when [`SUBMIT_ARCHIVE_DOT_ORG`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#submit_archive_dot_org) is `True`, full URLs are submitted to the Wayback Machine for archiving, but no cookies or content from the local authenticated archive are shared
|
||||||
- `https://www.google.com/s2/favicons?domain={domain}` when [`FETCH_FAVICON`](https://github.com/pirate/ArchiveBox/wiki/Configuration#fetch_favicon) is `True`, the domains for each link are shared in order to get the favicon, but not the full URL~~
|
- `https://www.google.com/s2/favicons?domain={domain}` when [`FETCH_FAVICON`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#fetch_favicon) is `True`, the domains for each link are shared in order to get the favicon, but not the full URL~~
|
||||||
|
|
||||||
## Do not run as root
|
## Do not run as root
|
||||||
|
|
||||||
@@ -60,7 +60,7 @@ chown -R archivebox:archivebox /home/archivebox
|
|||||||
sudo -u archivebox archivebox add ...
|
sudo -u archivebox archivebox add ...
|
||||||
```
|
```
|
||||||
|
|
||||||
~~If you absolutely must run it as root for some reason, a footgun is provided: you can set [`ALLOW_ROOT=True`](https://github.com/pirate/ArchiveBox/wiki/Configuration#ALLOW_ROOT) via environment variable or in your ArchiveBox.conf file.~~ It was removed.
|
~~If you absolutely must run it as root for some reason, a footgun is provided: you can set [`ALLOW_ROOT=True`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#ALLOW_ROOT) via environment variable or in your ArchiveBox.conf file.~~ It was removed.
|
||||||
|
|
||||||
<img src="https://i.imgur.com/ca1he6I.png" width="40px" align="right"/>
|
<img src="https://i.imgur.com/ca1he6I.png" width="40px" align="right"/>
|
||||||
|
|
||||||
@@ -68,7 +68,7 @@ sudo -u archivebox archivebox add ...
|
|||||||
|
|
||||||
### Permissions
|
### Permissions
|
||||||
|
|
||||||
What are the permissions on the archive folder? Limit access to the fewest possible users by checking folder ownership and setting [`OUTPUT_PERMISSIONS`](https://github.com/pirate/ArchiveBox/wiki/Configuration#OUTPUT_PERMISSIONS) accordingly.
|
What are the permissions on the archive folder? Limit access to the fewest possible users by checking folder ownership and setting [`OUTPUT_PERMISSIONS`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#OUTPUT_PERMISSIONS) accordingly.
|
||||||
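A minimal sketch of the ownership and mode check, run against a throwaway directory so it is safe to try anywhere. The mode `750` is an assumed example value, not a recommendation taken from this page.

```shell
#!/bin/sh
# Stand-in for the real archive folder so the sketch can run anywhere.
ARCHIVE_DIR=$(mktemp -d)

# Remove all access for "other" users and write access for the group.
chmod 750 "$ARCHIVE_DIR"

# Show owner, group, and mode so they can be reviewed.
ls -ld "$ARCHIVE_DIR"

# The matching ArchiveBox setting could then be applied with the config
# command shown elsewhere in this wiki (the value is an assumption):
#   archivebox config --set OUTPUT_PERMISSIONS=750
```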
|
|
||||||
### Filesystem
|
### Filesystem
|
||||||
|
|
||||||
@@ -78,4 +78,4 @@ How much are you planning to archive? Only a few bookmarked articles, or thousa
|
|||||||
|
|
||||||
Are you publishing your archive? If so, make sure you're only serving it as HTML and not accidentally running it as php or cgi, and put it on its own domain not shared with other services. This is done in order to avoid cookies leaking between your main domain and domains hosting content you don't control. Many companies put user provided files on separate domains like googleusercontent.com and github.io to avoid this problem.
|
Are you publishing your archive? If so, make sure you're only serving it as HTML and not accidentally running it as php or cgi, and put it on its own domain not shared with other services. This is done in order to avoid cookies leaking between your main domain and domains hosting content you don't control. Many companies put user provided files on separate domains like googleusercontent.com and github.io to avoid this problem.
|
||||||
|
|
||||||
Published archives automatically include a `robots.txt` with `Disallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer using [`FOOTER_INFO`](https://github.com/pirate/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
|
Published archives automatically include a `robots.txt` with `Disallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer using [`FOOTER_INFO`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
|
||||||
|
@@ -1,11 +1,11 @@
|
|||||||
# Troubleshooting
|
# Troubleshooting
|
||||||
|
|
||||||
▶️ *If you need help or have a question, you can open an [issue](https://github.com/pirate/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) or reach out on [Twitter](https://twitter.com/theSquashSH).*
|
▶️ *If you need help or have a question, you can open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) or reach out on [Twitter](https://twitter.com/theSquashSH).*
|
||||||
|
|
||||||
What are you having an issue with?:
|
What are you having an issue with?:
|
||||||
|
|
||||||
- [Installing](#Installing)
|
- [Installing](#Installing)
|
||||||
- [Configuration](https://github.com/pirate/ArchiveBox/wiki/Configuration)
|
- [Configuration](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration)
|
||||||
- [Archiving Process](#Archiving)
|
- [Archiving Process](#Archiving)
|
||||||
- [Hosting the Archive](#Hosting-the-Archive)
|
- [Hosting the Archive](#Hosting-the-Archive)
|
||||||
|
|
||||||
@@ -77,7 +77,7 @@ a bug in versions `<=1.19.1_1` that caused wget to fail for perfectly valid site
|
|||||||
|
|
||||||
### No links parsed from export file
|
### No links parsed from export file
|
||||||
|
|
||||||
Please open an [issue](https://github.com/pirate/ArchiveBox/issues) with a description of where you got the export, and
|
Please open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues) with a description of where you got the export, and
|
||||||
preferably your export file attached (you can redact the links). We'll fix the parser to support your format.
|
preferably your export file attached (you can redact the links). We'll fix the parser to support your format.
|
||||||
|
|
||||||
### Lots of skipped sites
|
### Lots of skipped sites
|
||||||
@@ -91,12 +91,12 @@ If you're still having issues, try deleting or moving the `output/archive` folde
|
|||||||
### Lots of errors
|
### Lots of errors
|
||||||
|
|
||||||
Make sure you have all the dependencies installed and that you're able to visit the links from your browser normally.
|
Make sure you have all the dependencies installed and that you're able to visit the links from your browser normally.
|
||||||
Open an [issue](https://github.com/pirate/ArchiveBox/issues) with a description of the errors if you're still having problems.
|
Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues) with a description of the errors if you're still having problems.
|
||||||
|
|
||||||
### Lots of broken links from the index
|
### Lots of broken links from the index
|
||||||
|
|
||||||
Not all sites can be effectively archived with each method, which is why it's best to use a combination of `wget`, PDFs, and screenshots.
|
Not all sites can be effectively archived with each method, which is why it's best to use a combination of `wget`, PDFs, and screenshots.
|
||||||
If it seems like more than 10-20% of sites in the archive are broken, open an [issue](https://github.com/pirate/ArchiveBox/issues)
|
If it seems like more than 10-20% of sites in the archive are broken, open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues)
|
||||||
with some of the URLs that failed to be archived and I'll investigate.
|
with some of the URLs that failed to be archived and I'll investigate.
|
||||||
|
|
||||||
### Removing unwanted links from the index
|
### Removing unwanted links from the index
|
||||||
@@ -106,5 +106,5 @@ If you accidentally added lots of unwanted links into index and they slow down y
|
|||||||
## Hosting the Archive
|
## Hosting the Archive
|
||||||
|
|
||||||
If you're having issues trying to host the archive via nginx, make sure you already have nginx running with SSL.
|
If you're having issues trying to host the archive via nginx, make sure you already have nginx running with SSL.
|
||||||
If you don't, google around, there are plenty of tutorials to help get that set up. Open an [issue](https://github.com/pirate/ArchiveBox/issues)
|
If you don't, google around, there are plenty of tutorials to help get that set up. Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues)
|
||||||
if you have a problem with a particular nginx config.
|
if you have a problem with a particular nginx config.
|
||||||
|
6
Usage.md
@@ -1,6 +1,6 @@
|
|||||||
# Usage
|
# Usage
|
||||||
|
|
||||||
▶️ _Make sure the dependencies are [fully installed](https://github.com/pirate/ArchiveBox/wiki/Install) before running any ArchiveBox commands._
|
▶️ _Make sure the dependencies are [fully installed](https://github.com/ArchiveBox/ArchiveBox/wiki/Install) before running any ArchiveBox commands._
|
||||||
|
|
||||||
**ArchiveBox API Reference:**
|
**ArchiveBox API Reference:**
|
||||||
|
|
||||||
@@ -18,7 +18,7 @@
|
|||||||
- [[Scheduled Archiving]]: Learn how to set up automatic daily archiving
|
- [[Scheduled Archiving]]: Learn how to set up automatic daily archiving
|
||||||
- [[Publishing Your Archive]]: Learn how to host your archive for others to access
|
- [[Publishing Your Archive]]: Learn how to host your archive for others to access
|
||||||
- [[Troubleshooting]]: Resources if you encounter any problems
|
- [[Troubleshooting]]: Resources if you encounter any problems
|
||||||
- [Screenshots](https://github.com/pirate/ArchiveBox#Screenshots): See what the CLI and outputted HTML look like
|
- [Screenshots](https://github.com/ArchiveBox/ArchiveBox#Screenshots): See what the CLI and outputted HTML look like
|
||||||
|
|
||||||
|
|
||||||
## CLI Usage
|
## CLI Usage
|
||||||
@@ -230,7 +230,7 @@ from archivebox import *
|
|||||||
schedule
|
schedule
|
||||||
|
|
||||||
[i] Welcome to the ArchiveBox Shell!
|
[i] Welcome to the ArchiveBox Shell!
|
||||||
https://github.com/pirate/ArchiveBox/wiki/Usage#Shell-Usage
|
https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Shell-Usage
|
||||||
|
|
||||||
Hint: Example use:
|
Hint: Example use:
|
||||||
print(Snapshot.objects.filter(is_archived=True).count())
|
print(Snapshot.objects.filter(is_archived=True).count())
|
||||||
|
@@ -1,7 +1,7 @@
|
|||||||
<div align="center">
|
<div align="center">
|
||||||
|
|
||||||
[✏️ Help improve our documentation...](https://github.com/pirate/ArchiveBox/issues/new?assignees=&labels=&template=documentation_change.md&title=)
|
[✏️ Help improve our documentation...](https://github.com/ArchiveBox/ArchiveBox/issues/new?assignees=&labels=&template=documentation_change.md&title=)
|
||||||
|
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||

|

|
||||||
|
10
_Sidebar.md
@@ -14,8 +14,8 @@
|
|||||||
|
|
||||||
- [[Usage]]
|
- [[Usage]]
|
||||||
- [[Configuration]]
|
- [[Configuration]]
|
||||||
- [Supported Sources](https://github.com/pirate/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive)
|
- [Supported Sources](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive)
|
||||||
- [Supported Outputs](https://github.com/pirate/ArchiveBox#output-formats)
|
- [Supported Outputs](https://github.com/ArchiveBox/ArchiveBox#output-formats)
|
||||||
- [[Scheduled Archiving]]
|
- [[Scheduled Archiving]]
|
||||||
- [[Publishing Your Archive]]
|
- [[Publishing Your Archive]]
|
||||||
- [[Chromium Install]]
|
- [[Chromium Install]]
|
||||||
@@ -27,8 +27,8 @@
|
|||||||
- [[Roadmap]]
|
- [[Roadmap]]
|
||||||
- [[Changelog]]
|
- [[Changelog]]
|
||||||
- [[Donations]]
|
- [[Donations]]
|
||||||
- [Background & Motivation](https://github.com/pirate/ArchiveBox#background--motivation)
|
- [Background & Motivation](https://github.com/ArchiveBox/ArchiveBox#background--motivation)
|
||||||
- [Comparison to Other Tools](https://github.com/pirate/ArchiveBox#comparison-to-other-projects)
|
- [Comparison to Other Tools](https://github.com/ArchiveBox/ArchiveBox#comparison-to-other-projects)
|
||||||
- [[Web Archiving Community]]
|
- [[Web Archiving Community]]
|
||||||
|
|
||||||
---
|
---
|
||||||
@@ -37,4 +37,4 @@
|
|||||||
<a href="https://archivebox.io"><img src="https://i.imgur.com/4nkFjdv.png" height="30px"/></a><br/><br/>
|
<a href="https://archivebox.io"><img src="https://i.imgur.com/4nkFjdv.png" height="30px"/></a><br/><br/>
|
||||||
<a href="https://twitter.com/thesquashSH"><img src="https://img.shields.io/twitter/url/http/shields.io.svg?style=social"/></a>
|
<a href="https://twitter.com/thesquashSH"><img src="https://img.shields.io/twitter/url/http/shields.io.svg?style=social"/></a>
|
||||||
<a href="https://www.patreon.com/theSquashSH"><img src="https://img.shields.io/badge/Donate-Patreon-%23DD5D76.svg"/></a>
|
<a href="https://www.patreon.com/theSquashSH"><img src="https://img.shields.io/badge/Donate-Patreon-%23DD5D76.svg"/></a>
|
||||||
</p>
|
</p>
|
||||||
|
4
conf.py
@@ -38,8 +38,8 @@ VERSION = json.loads((Path(ROOT_DIR) / 'package.json').read_text().strip())['ver
|
|||||||
project = 'ArchiveBox'
|
project = 'ArchiveBox'
|
||||||
copyright = '2020, Nick Sweeting'
|
copyright = '2020, Nick Sweeting'
|
||||||
author = 'Nick Sweeting'
|
author = 'Nick Sweeting'
|
||||||
github_url = 'https://github.com/pirate/ArchiveBox'
|
github_url = 'https://github.com/ArchiveBox/ArchiveBox'
|
||||||
github_doc_root = 'https://github.com/pirate/ArchiveBox/tree/master/docs/'
|
github_doc_root = 'https://github.com/ArchiveBox/ArchiveBox/tree/master/docs/'
|
||||||
language = 'en'
|
language = 'en'
|
||||||
|
|
||||||
# The full version, including alpha/beta/rc tags
|
# The full version, including alpha/beta/rc tags
|
||||||
|
@@ -3,9 +3,9 @@
|
|||||||
Just getting started?
|
Just getting started?
|
||||||
Check out the `Quickstart <Quickstart.html>`_ guide.
|
Check out the `Quickstart <Quickstart.html>`_ guide.
|
||||||
Need help with something?
|
Need help with something?
|
||||||
Ping us on `Twitter <https://twitter.com/theSquashSH>`_ or `Github <https://github.com/pirate/ArchiveBox/issues>`_.
|
Ping us on `Twitter <https://twitter.com/theSquashSH>`_ or `Github <https://github.com/ArchiveBox/ArchiveBox/issues>`_.
|
||||||
Want to join the community?
|
Want to join the community?
|
||||||
See our `Community Wiki <https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community>`_ page.
|
See our `Community Wiki <https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community>`_ page.
|
||||||
|
|
||||||
.. image:: logo.png
|
.. image:: logo.png
|
||||||
:width: 200px
|
:width: 200px
|
||||||
@@ -18,7 +18,7 @@ ArchiveBox
|
|||||||
|
|
||||||
"The open-source self-hosted internet archive."
|
"The open-source self-hosted internet archive."
|
||||||
|
|
||||||
`Website <https://archivebox.io>`_ | `Github <https://github.com/pirate/ArchiveBox>`_ | `Source <https://github.com/pirate/ArchiveBox/tree/master>`_ | `Bug Tracker <https://github.com/pirate/ArchiveBox/issues>`_
|
`Website <https://archivebox.io>`_ | `Github <https://github.com/ArchiveBox/ArchiveBox>`_ | `Source <https://github.com/ArchiveBox/ArchiveBox/tree/master>`_ | `Bug Tracker <https://github.com/ArchiveBox/ArchiveBox/issues>`_
|
||||||
|
|
||||||
.. code-block:: bash
|
.. code-block:: bash
|
||||||
|
|
||||||
|