From 6c54fca67e07ea25682df5a4a3c500a1a96e1332 Mon Sep 17 00:00:00 2001
From: Nick Sweeting
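
Every hunk in this patch makes the same substitution: links to `pirate/ArchiveBox` on `github.com` and `raw.githubusercontent.com` become links to `ArchiveBox/ArchiveBox`. Below is a minimal sketch of how such a rewrite could be scripted. It is an illustration only, not the method used to produce this patch; the names `rewrite_links` and `rewrite_wiki` are hypothetical, and the sketch assumes the wiki pages are `*.md` files in a single directory.

```python
import re
from pathlib import Path

# Matches links to the old repo owner on both hosts that appear in this patch.
# The trailing \b keeps other repos under the same owner untouched.
OLD_REPO = re.compile(r"(github\.com|raw\.githubusercontent\.com)/pirate/ArchiveBox\b")


def rewrite_links(text: str) -> str:
    """Return text with pirate/ArchiveBox links pointed at ArchiveBox/ArchiveBox."""
    return OLD_REPO.sub(r"\1/ArchiveBox/ArchiveBox", text)


def rewrite_wiki(root: str) -> list:
    """Rewrite every *.md file directly under root; return the names of changed files."""
    changed = []
    for path in sorted(Path(root).glob("*.md")):
        original = path.read_text(encoding="utf-8")
        updated = rewrite_links(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Calling `rewrite_wiki(".")` from a wiki checkout would apply the substitution in place. Running it a second time changes nothing, because the rewritten URLs no longer match the pattern.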
-All the available config options are described in this document below, but can also be found along with examples in [`etc/ArchiveBox.conf.default`](https://github.com/pirate/ArchiveBox/blob/master/etc/ArchiveBox.conf.default). The code that loads the config is in [`archivebox/config/__init__.py`](https://github.com/pirate/ArchiveBox/blob/master/archivebox/config/__init__.py#L45).
+All the available config options are described in this document below, but can also be found along with examples in [`etc/ArchiveBox.conf.default`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/ArchiveBox.conf.default). The code that loads the config is in [`archivebox/config/__init__.py`](https://github.com/ArchiveBox/ArchiveBox/blob/master/archivebox/config/__init__.py#L45).
---
@@ -84,7 +84,7 @@ Maximum allowed download time for fetching media when `SAVE_MEDIA=True` in secon
---
#### `TEMPLATES_DIR`
**Possible Values:** [`$REPO_DIR/archivebox/templates`]/`/path/to/custom/templates`/...
-Path to a directory containing custom index html templates for theming your archive output. Files found in the folder at the specified path can override any of the defaults in the [`archivebox/themes`](https://github.com/pirate/ArchiveBox/tree/master/archivebox/themes) directory. If you've used `django` before, this works exactly the same way that `django` template overrides work (because it uses `django` under the hood).
+Path to a directory containing custom index html templates for theming your archive output. Files found in the folder at the specified path can override any of the defaults in the [`archivebox/themes`](https://github.com/ArchiveBox/ArchiveBox/tree/master/archivebox/themes) directory. If you've used `django` before, this works exactly the same way that `django` template overrides work (because it uses `django` under the hood).
*Related options:*
[`FOOTER_INFO`](#footer_info)
@@ -428,3 +428,4 @@ This can be installed using `npm install -g git+https://github.com/pirate/readab
+[]:
diff --git a/Docker.md b/Docker.md
index 8c7a752..566ce0e 100644
--- a/Docker.md
+++ b/Docker.md
@@ -37,7 +37,7 @@ docker run -v $PWD:/data -p 8000:8000 nikisweeting/archivebox server 0.0.0.0:800
## Docker Compose
-An example [`docker-compose.yml`](https://github.com/pirate/ArchiveBox/blob/master/docker-compose.yml) config with ArchiveBox and an Nginx server to serve the archive is included in the project root. You can edit it as you see fit, or just run it as it comes out-of-the-box.
+An example [`docker-compose.yml`](https://github.com/ArchiveBox/ArchiveBox/blob/master/docker-compose.yml) config with ArchiveBox and an Nginx server to serve the archive is included in the project root. You can edit it as you see fit, or just run it as it comes out-of-the-box.
Just make sure you have a Docker version that's [new enough](https://docs.docker.com/compose/compose-file/) to support `version: 3` format:
@@ -50,7 +50,7 @@ Docker version 18.09.1, build 4c52b90 # must be >= 17.04.0
```bash
mkdir archivebox && cd archivebox
-wget https://raw.githubusercontent.com/pirate/ArchiveBox/master/docker-compose.yml
+wget https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml
docker-compose up -d
docker-compose run archivebox init
docker-compose run archivebox manage createsuperuser
@@ -213,4 +213,4 @@ echo 'https://example.com' | docker run -it -v $PWD:/data -e FETCH_SCREENSHOT=Fa
docker run -i -v $PWD:/data --env-file=ArchiveBox.env nikisweeting/archivebox
```
-You can also edit the `data/ArchiveBox.conf` file directly and the changes will take effect on the next run.
\ No newline at end of file
+You can also edit the `data/ArchiveBox.conf` file directly and the changes will take effect on the next run.
diff --git a/Home.md b/Home.md
index 74db05c..222a264 100644
--- a/Home.md
+++ b/Home.md
@@ -10,7 +10,7 @@
**❓If you need help or have a question, you can:**
- - 🐞 Open an [issue](https://github.com/pirate/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) in our bug tracker
+ - 🐞 Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) in our bug tracker
- 🗣 Reach out to me on [Twitter](https://twitter.com/theSquashSH)
- 💠 Reach out to me via DM on [Patreon](https://patreon.com/theSquashSH) (you'll get the fastest response here)
@@ -22,11 +22,11 @@
-Readme | Demo | Quickstart | Usage | Community
+Readme | Demo | Quickstart | Usage | Community
[](https://www.patreon.com/theSquashSH)
-
\ No newline at end of file
+
diff --git a/Quickstart.md b/Quickstart.md
index 901fe11..46554fc 100644
--- a/Quickstart.md
+++ b/Quickstart.md
@@ -6,13 +6,13 @@
▶️ *It only takes about 5 minutes to get up and running with ArchiveBox.*
-ArchiveBox [officially supports](https://github.com/pirate/ArchiveBox/wiki/Install#supported-systems) **macOS**, **Ubuntu/Debian**, and **BSD**, but likely runs on many other systems. You can run it on any system that supports **Docker**, including Windows (using Docker in WSL2).
+ArchiveBox [officially supports](https://github.com/ArchiveBox/ArchiveBox/wiki/Install#supported-systems) **macOS**, **Ubuntu/Debian**, and **BSD**, but likely runs on many other systems. You can run it on any system that supports **Docker**, including Windows (using Docker in WSL2).
If you want to use Docker or Docker Compose to run ArchiveBox, see the [[Docker]] page.
---
-First, we install the ArchiveBox [dependencies](./Install#dependencies), then we create a folder to [store the archive data](https://github.com/pirate/ArchiveBox/wiki/Usage#Disk-Layout), and finally, we [import the list of links](https://github.com/pirate/ArchiveBox/wiki/Usage#CLI-Usage) to the archive by running:
+First, we install the ArchiveBox [dependencies](./Install#dependencies), then we create a folder to [store the archive data](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout), and finally, we [import the list of links](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#CLI-Usage) to the archive by running:
`archivebox add < [links_file]`
## 1. Set up ArchiveBox
@@ -27,7 +27,7 @@ docker run -v $PWD:/data -it nikisweeting/archivebox init
# alternatively, install ArchiveBox and its dependencies directly on your system without docker
# (script prompts for user confirmation before installing anything)
-curl https://raw.githubusercontent.com/pirate/ArchiveBox/master/bin/setup.sh | sh
+curl https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/bin/setup.sh | sh
# or follow the manual setup instructions if you don't like using curl | sh
```
diff --git a/Roadmap.md b/Roadmap.md
index 9603a7d..2dcb8c9 100644
--- a/Roadmap.md
+++ b/Roadmap.md
@@ -3,7 +3,7 @@
▶️ *Comment here to discuss the contribution roadmap:
-[Official Roadmap Discussion](https://github.com/pirate/ArchiveBox/issues/120).*
+[Official Roadmap Discussion](https://github.com/ArchiveBox/ArchiveBox/issues/120).*
---
@@ -81,7 +81,7 @@
- full-text search of extracted text with elasticsearch/elasticlunr/ag
- download closed-caption subtitles from Youtube and other video sites for full-text indexing of video content
- try pulling dead sites from archive.org and other sources if original is down (https://github.com/hartator/wayback-machine-downloader)
- - And more in the [issues list](https://github.com/pirate/ArchiveBox/issues/)...
+ - And more in the [issues list](https://github.com/ArchiveBox/ArchiveBox/issues/)...
---
@@ -93,7 +93,7 @@
## Past Releases
To see how this spec has been scheduled / implemented / released so far, read these pull requests:
- - ✅ [v0.2.x](https://github.com/pirate/ArchiveBox/tree/483a3bef9e2b1a7b80611947a3be99b0cf4f9959)
- - ✅ [v0.3.x](https://github.com/pirate/ArchiveBox/pull/197)
- - ✅ [v0.4.x](https://github.com/pirate/ArchiveBox/pull/207)
- - 🛠 [v0.5.x](https://github.com/pirate/ArchiveBox/pull/275)
+ - ✅ [v0.2.x](https://github.com/ArchiveBox/ArchiveBox/tree/483a3bef9e2b1a7b80611947a3be99b0cf4f9959)
+ - ✅ [v0.3.x](https://github.com/ArchiveBox/ArchiveBox/pull/197)
+ - ✅ [v0.4.x](https://github.com/ArchiveBox/ArchiveBox/pull/207)
+ - 🛠 [v0.5.x](https://github.com/ArchiveBox/ArchiveBox/pull/275)
diff --git a/Scheduled-Archiving.md b/Scheduled-Archiving.md
index eef457e..3ba4e61 100644
--- a/Scheduled-Archiving.md
+++ b/Scheduled-Archiving.md
@@ -8,7 +8,7 @@ ArchiveBox ignores links that are imported multiple times (keeping the earliest
This means you can add cron jobs that regularly poll the same file or URL for new links, adding only new
ones as necessary.
-For some example configs, see the [`etc/cron.d`](https://github.com/pirate/ArchiveBox/blob/master/etc/cron.d) and [`etc/supervisord`](https://github.com/pirate/ArchiveBox/blob/master/etc/supervisord) folders.
+For some example configs, see the [`etc/cron.d`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/cron.d) and [`etc/supervisord`](https://github.com/ArchiveBox/ArchiveBox/blob/master/etc/supervisord) folders.
## Examples
diff --git a/Security-Overview.md b/Security-Overview.md
index dffb7e2..68b7ba1 100644
--- a/Security-Overview.md
+++ b/Security-Overview.md
@@ -28,7 +28,7 @@ This mode should not be used for archiving entire browser history or authenticat
~~ArchiveBox is designed to be able to archive content that requires authentication or cookies. This includes paywalled content, private forums, LAN-only content, etc.~~
-~~To get started, set [`CHROME_USER_DATA_DIR`](https://github.com/pirate/ArchiveBox/wiki/Configuration#chrome_user_data_dir) and [`COOKIES_FILE`](https://github.com/pirate/ArchiveBox/wiki/Configuration#COOKIES_FILE) to point to a Chrome user folder that has your sessions and a wget `cookies.txt` file respectively.~~
+~~To get started, set [`CHROME_USER_DATA_DIR`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#chrome_user_data_dir) and [`COOKIES_FILE`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#COOKIES_FILE) to point to a Chrome user folder that has your sessions and a wget `cookies.txt` file respectively.~~
~~If you're importing private links or authenticated content, you probably don't want to share your archive folder publicly on a webserver, so don't follow the [[Publishing Your Archive]] instructions unless you are only serving it on a trusted LAN or have some sort of authentication in front of it. Make sure to point ArchiveBox to an output folder with conservative permissions, as it may contain archived content with secret session tokens or pieces of your user data. You may also wish to encrypt the archive using an encrypted disk image or filesystem like ZFS as it will contain all requests and response data, including session keys, user data, usernames, etc.~~
@@ -38,8 +38,8 @@ This mode should not be used for archiving entire browser history or authenticat
~~If you want ArchiveBox to be less noisy and avoid leaking any URLs to 3rd-party APIs during archiving, you can disable the options below. Disabling these is recommended if you plan on archiving any sites that use secret tokens in the URL to grant access to private content without authentication, e.g. Google Docs, CodiMD notepads, etc.~~
- - `https://web.archive.org/save/{url}` when [`SUBMIT_ARCHIVE_DOT_ORG`](https://github.com/pirate/ArchiveBox/wiki/Configuration#submit_archive_dot_org) is `True`, full URLs are submitted to the Wayback Machine for archiving, but no cookies or content from the local authenticated archive are shared
- - `https://www.google.com/s2/favicons?domain={domain}` when [`FETCH_FAVICON`](https://github.com/pirate/ArchiveBox/wiki/Configuration#fetch_favicon) is `True`, the domains for each link are shared in order to get the favicon, but not the full URL~~
+ - `https://web.archive.org/save/{url}` when [`SUBMIT_ARCHIVE_DOT_ORG`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#submit_archive_dot_org) is `True`, full URLs are submitted to the Wayback Machine for archiving, but no cookies or content from the local authenticated archive are shared
+ - `https://www.google.com/s2/favicons?domain={domain}` when [`FETCH_FAVICON`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#fetch_favicon) is `True`, the domains for each link are shared in order to get the favicon, but not the full URL~~
## Do not run as root
@@ -60,7 +60,7 @@ chown -R archivebox:archivebox /home/archivebox
sudo -u archivebox archivebox add ...
```
-~~If you absolutely must run it as root for some reason, a footgun is provided: you can set [`ALLOW_ROOT=True`](https://github.com/pirate/ArchiveBox/wiki/Configuration#ALLOW_ROOT) via environment variable or in your ArchiveBox.conf file.~~ It was removed.
+~~If you absolutely must run it as root for some reason, a footgun is provided: you can set [`ALLOW_ROOT=True`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#ALLOW_ROOT) via environment variable or in your ArchiveBox.conf file.~~ It was removed.
@@ -68,7 +68,7 @@ sudo -u archivebox archivebox add ...
### Permissions
-What are the permissions on the archive folder? Limit access to the fewest possible users by checking folder ownership and setting [`OUTPUT_PERMISSIONS`](https://github.com/pirate/ArchiveBox/wiki/Configuration#OUTPUT_PERMISSIONS) accordingly.
+What are the permissions on the archive folder? Limit access to the fewest possible users by checking folder ownership and setting [`OUTPUT_PERMISSIONS`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#OUTPUT_PERMISSIONS) accordingly.
### Filesystem
@@ -78,4 +78,4 @@ How much are you planning to archive? Only a few bookmarked articles, or thousa
Are you publishing your archive? If so, make sure you're only serving it as HTML and not accidentally running it as php or cgi, and put it on its own domain not shared with other services. This is done in order to avoid cookies leaking between your main domain and domains hosting content you don't control. Many companies put user provided files on separate domains like googleusercontent.com and github.io to avoid this problem.
-Published archives automatically include a `robots.txt` `Dissallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer though using [`FOOTER_INFO`](https://github.com/pirate/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
+Published archives automatically include a `robots.txt` `Disallow: /` to block search engines from indexing them. You may still wish to publish your contact info in the index footer though using [`FOOTER_INFO`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to any DMCA and copyright takedown notices if you accidentally rehost copyrighted content.
diff --git a/Troubleshooting.md b/Troubleshooting.md
index 68b80bd..3a72faf 100644
--- a/Troubleshooting.md
+++ b/Troubleshooting.md
@@ -1,11 +1,11 @@
# Troubleshooting
-▶️ *If you need help or have a question, you can open an [issue](https://github.com/pirate/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) or reach out on [Twitter](https://twitter.com/theSquashSH).*
+▶️ *If you need help or have a question, you can open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc) or reach out on [Twitter](https://twitter.com/theSquashSH).*
What are you having an issue with?:
- [Installing](#Installing)
-- [Configuration](https://github.com/pirate/ArchiveBox/wiki/Configuration)
+- [Configuration](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration)
- [Archiving Process](#Archiving)
- [Hosting the Archive](#Hosting-the-Archive)
@@ -77,7 +77,7 @@ a bug in versions `<=1.19.1_1` that caused wget to fail for perfectly valid site
### No links parsed from export file
-Please open an [issue](https://github.com/pirate/ArchiveBox/issues) with a description of where you got the export, and
+Please open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues) with a description of where you got the export, and
preferably your export file attached (you can redact the links). We'll fix the parser to support your format.
### Lots of skipped sites
@@ -91,12 +91,12 @@ If you're still having issues, try deleting or moving the `output/archive` folde
### Lots of errors
Make sure you have all the dependencies installed and that you're able to visit the links from your browser normally.
-Open an [issue](https://github.com/pirate/ArchiveBox/issues) with a description of the errors if you're still having problems.
+Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues) with a description of the errors if you're still having problems.
### Lots of broken links from the index
Not all sites can be effectively archived with each method, which is why it's best to use a combination of `wget`, PDFs, and screenshots.
-If it seems like more than 10-20% of sites in the archive are broken, open an [issue](https://github.com/pirate/ArchiveBox/issues)
+If it seems like more than 10-20% of sites in the archive are broken, open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues)
with some of the URLs that failed to be archived and I'll investigate.
### Removing unwanted links from the index
@@ -106,5 +106,5 @@ If you accidentally added lots of unwanted links into index and they slow down y
## Hosting the Archive
If you're having issues trying to host the archive via nginx, make sure you already have nginx running with SSL.
-If you don't, google around, there are plenty of tutorials to help get that set up. Open an [issue](https://github.com/pirate/ArchiveBox/issues)
+If you don't, google around, there are plenty of tutorials to help get that set up. Open an [issue](https://github.com/ArchiveBox/ArchiveBox/issues)
if you have a problem with a particular nginx config.
diff --git a/Usage.md b/Usage.md
index cc3259a..4883763 100644
--- a/Usage.md
+++ b/Usage.md
@@ -1,6 +1,6 @@
# Usage
-▶️ _Make sure the dependencies are [fully installed](https://github.com/pirate/ArchiveBox/wiki/Install) before running any ArchiveBox commands._
+▶️ _Make sure the dependencies are [fully installed](https://github.com/ArchiveBox/ArchiveBox/wiki/Install) before running any ArchiveBox commands._
**ArchiveBox API Reference:**
@@ -18,7 +18,7 @@
- [[Scheduled Archiving]]: Learn how to set up automatic daily archiving
- [[Publishing Your Archive]]: Learn how to host your archive for others to access
- [[Troubleshooting]]: Resources if you encounter any problems
-- [Screenshots](https://github.com/pirate/ArchiveBox#Screenshots): See what the CLI and outputted HTML look like
+- [Screenshots](https://github.com/ArchiveBox/ArchiveBox#Screenshots): See what the CLI and outputted HTML look like
## CLI Usage
@@ -230,7 +230,7 @@ from archivebox import *
schedule
[i] Welcome to the ArchiveBox Shell!
- https://github.com/pirate/ArchiveBox/wiki/Usage#Shell-Usage
+ https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Shell-Usage
Hint: Example use:
print(Snapshot.objects.filter(is_archived=True).count())
diff --git a/_Footer.md b/_Footer.md
index 2f18a05..b487e93 100644
--- a/_Footer.md
+++ b/_Footer.md
@@ -1,7 +1,7 @@
-