mirror of
https://github.com/pirate/ArchiveBox.git
synced 2025-08-21 13:52:30 +02:00
Updated Security Overview (markdown)
@@ -49,9 +49,47 @@ If you're importing private links or authenticated content, you probably don't w
*An example of a session cookie reflected in `headers.json` visible in the archive.*

<img src="https://imgur.zervice.io/DfyQUDV.png" width="50px" align="right"/>
<br/>

---

<br/>

### Publishing

> [!CAUTION]
> Re-hosting untrusted archived content on a domain can potentially compromise *all apps on that domain*!
> (including other subdomains)

Make sure you thoroughly understand the dangers of [hosting untrusted HTML/JS/CSS that may be captured during archiving](https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy), and how viewing it can enable [CSRF attacks](https://en.wikipedia.org/wiki/Cross-site_request_forgery) across all apps on the same domain. If a logged-in user visits an archived page containing malicious JavaScript, that script runs with the user's session on the domain: it can read any cookies not marked `HttpOnly` and send authenticated requests as that user, potentially exfiltrating or modifying other Snapshots/data on your server.

(This is why we don't support serving ArchiveBox from a subdirectory like `myapps.example.com/archivebox/`: it's too dangerous to share a domain with other apps.)

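To make the subdirectory problem concrete, here is a minimal sketch of the comparison browsers use for same-origin checks: two URLs share an origin when their scheme, host, and port all match, regardless of path. The `/admin/` path on `myapps.example.com` is a hypothetical neighbouring app used only for illustration. Note that separate subdomains count as different origins here, but cookies set with a parent `Domain` attribute can still be shared between them, which is why the caution above mentions subdomains too.

```python
from urllib.parse import urlsplit

# Default ports implied by each scheme when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple compared in same-origin checks."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme, -1)
    return scheme, host, port


def same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs share scheme, host, and port (paths are ignored)."""
    return origin(url_a) == origin(url_b)


# A subdirectory install shares its origin with every other app on that host.
print(same_origin("https://myapps.example.com/archivebox/",
                  "https://myapps.example.com/admin/"))          # True
# Separate hostnames are separate origins.
print(same_origin("https://demo.archivebox.io/admin/",
                  "https://demo-static.archivebox.io/archive/"))  # False
```
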
The industry-standard approach is to use a separate domain for untrusted content. For example, GitHub uses `githubusercontent.com` and Google uses `googleusercontent.com` for all user-uploaded files. If you host ArchiveBox publicly, do the same and keep it on an isolated domain to limit the damage from leaked cookies, CORS abuse, and CSRF attacks.

To protect the admin dashboard, we also recommend serving all content under `/archive/` on a separate domain from `/admin/`. We do this on our servers using a simple redirect rule in nginx/Cloudflare, like so:

- https://demo.archivebox.io: only serves `/`, redirects `/archive/*` to `demo-static.`
- https://demo-static.archivebox.io: only serves `/archive/`, redirects everything else to `demo.`

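The split between the two hostnames above can be sketched as a small routing function. This is only an illustration of the redirect logic, not the actual nginx or Cloudflare configuration used on the demo servers; the function name `redirect_target` and the reading that the main host serves every path except `/archive/*` are assumptions.

```python
from typing import Optional

# Hostnames taken from the example above.
ADMIN_HOST = "demo.archivebox.io"
STATIC_HOST = "demo-static.archivebox.io"


def redirect_target(host: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None if this host serves the path itself.

    Hypothetical helper: archived content is only served from STATIC_HOST,
    everything else is only served from ADMIN_HOST.
    """
    is_archive_path = path == "/archive" or path.startswith("/archive/")
    if host == ADMIN_HOST and is_archive_path:
        # Untrusted snapshot content must not be served from the admin origin.
        return f"https://{STATIC_HOST}{path}"
    if host == STATIC_HOST and not is_archive_path:
        # The static host never serves the UI or admin pages.
        return f"https://{ADMIN_HOST}{path}"
    return None


print(redirect_target("demo.archivebox.io", "/archive/123/index.html"))
print(redirect_target("demo-static.archivebox.io", "/admin/"))
print(redirect_target("demo.archivebox.io", "/"))
```
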
Published archives automatically include a `robots.txt` with `Disallow: /` to ask search engines not to index them. You may still wish to publish your contact info in the index footer using [`FOOTER_INFO`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#FOOTER_INFO) so that you can respond to DMCA and copyright takedown notices if you accidentally rehost copyrighted content.

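As a rough check of what a blanket `Disallow: /` rule means for a well-behaved crawler, the sketch below feeds a minimal `robots.txt` to Python's standard `urllib.robotparser`. The two-line file contents and the `ExampleBot` user agent are assumptions for illustration; the file ArchiveBox actually generates may differ. Keep in mind that `robots.txt` is advisory, so it does not stop crawlers that choose to ignore it.

```python
from urllib.robotparser import RobotFileParser

# Assumed minimal robots.txt; the real generated file may contain more rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


# Every path is disallowed for every compliant crawler.
print(may_crawl("https://demo-static.archivebox.io/archive/123/index.html"))
print(may_crawl("https://demo-static.archivebox.io/"))
```
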
⚠️ Make sure to read all the warnings [above](https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview#%EF%B8%8F-things-to-watch-out-for-%EF%B8%8F) about the dangers of exposing Chrome profile data, cookies, secret tokens in URLs, and the risks of viewing archived JS on a shared origin before publishing your archive.

More info:
- https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive
- https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive#security-concerns
- https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive#copyright-concerns
- https://en.wikipedia.org/wiki/Cross-site_request_forgery
- https://github.com/ArchiveBox/ArchiveBox/issues/239

<br/>

---

<br/>

## Do not run as root

<img src="https://imgur.zervice.io/yDqJc4I.jpg" width="150px" align="right"/>
@@ -123,32 +161,3 @@ More info:
- https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#database-troubleshooting
- https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#filesystem-doesnt-support-fsync-eg-network-mounts
- https://github.com/ArchiveBox/ArchiveBox#storage-requirements