mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-15 11:04:17 +02:00

Reformats document, fixes spelling and caps

I know this is kind of uncool to do, but there's no need to have this indent on the list elements :)

- `+` secondary list bullets → `-`
- Github → GitHub
- ipfs → IPFS
- Heretrix → Heritrix
- warc → WARC (where appropriate)
- wayback machine → Wayback Machine (where appropriate, as used to relate to Internet Archive's tool)
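
For anyone wanting to reproduce the mechanical part of this cleanup, here is a rough sketch, not the script actually used for this commit. `normalize_line` and `REPLACEMENTS` are hypothetical names, and the sketch assumes Markdown link targets `](...)` should be left untouched so URLs are not rewritten. The "where appropriate" items (WARC, Wayback Machine) need human judgment and are deliberately left out.

```python
import re

# Hypothetical table of the unconditional fixes from the list above.
REPLACEMENTS = [
    (re.compile(r"\bGithub\b"), "GitHub"),
    (re.compile(r"\bipfs\b"), "IPFS"),
    (re.compile(r"\bHeretrix\b"), "Heritrix"),
]

# Matches a Markdown link target such as "](https://example.com)".
LINK_TARGET = re.compile(r"(\]\([^)]*\))")


def normalize_line(line: str) -> str:
    """Apply the bullet and spelling fixes to one Markdown line."""
    # Swap a leading `+` bullet for `-`, keeping whatever indentation exists.
    line = re.sub(r"^(\s*)\+ ", r"\1- ", line)
    # Splitting on a capturing group keeps the link targets at odd indexes,
    # so only the even-indexed parts (ordinary text) are rewritten.
    parts = LINK_TARGET.split(line)
    for i in range(0, len(parts), 2):
        for pattern, fixed in REPLACEMENTS:
            parts[i] = pattern.sub(fixed, parts[i])
    return "".join(parts)
```

Running each line of the document through `normalize_line` and reviewing the resulting diff by hand would still be advisable, since a word-boundary match cannot tell prose from a code identifier.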
Henry Wilkinson
2024-10-09 13:40:36 -04:00
committed by Nick Sweeting
parent 678856ed43
commit 74331e4082

@@ -16,32 +16,31 @@ The internet archiving community is surprisingly far-reaching and almost univers
Whether you want to learn which organizations are the big players in the web archiving space, want to find a specific open source tool for your web archiving need, or just want to see where archivists hang out online, this is my attempt at an index of the entire web archiving community. Whether you want to learn which organizations are the big players in the web archiving space, want to find a specific open source tool for your web archiving need, or just want to see where archivists hang out online, this is my attempt at an index of the entire web archiving community.
<img src="https://imgur.zervice.io/duS8Lm7.png" width="200px" align="right" style="float: right; margin: 5px"/> <img src="https://imgur.zervice.io/duS8Lm7.png" width="200px" align="right" style="float: right; margin: 5px"/>
- [The Master Lists](#The-Master-Lists) - [The Master Lists](#The-Master-Lists)
*Community-maintained indexes of web archiving tools and groups by IIPC, COPTR, ArchiveTeam, Wikipedia, & the ASA.* *Community-maintained indexes of web archiving tools and groups by IIPC, COPTR, ArchiveTeam, Wikipedia, & the ASA.*
- [Web Archiving Software](#Web-Archiving-Projects) - [Web Archiving Software](#Web-Archiving-Projects)
*Open source tools and projects in the internet archiving space.* *Open source tools and projects in the internet archiving space.*
+ [Bookmarking Services](#bookmarking-services) - [Bookmarking Services](#bookmarking-services)
+ [Well-Known Open Source Projects](#from-the-archiveorg--archive-it-teams) - [Well-Known Open Source Projects](#from-the-archiveorg--archive-it-teams)
+ [Public Archiving Services](#other-public-archiving-services) - [Public Archiving Services](#other-public-archiving-services)
+ [ArchiveBox Alternatives](#other-archivebox-alternatives) - [ArchiveBox Alternatives](#other-archivebox-alternatives)
+ [Smaller Utilities](#smaller-utilities) - [Smaller Utilities](#smaller-utilities)
- [Reading List](#Reading-List) - [Reading List](#Reading-List)
*Articles, posts, and blogs relevant to ArchiveBox and web archiving in general.* *Articles, posts, and blogs relevant to ArchiveBox and web archiving in general.*
+ [Blogs](#Blogs) - [Blogs](#Blogs)
+ [Articles](#Articles) - [Articles](#Articles)
+ [ArchiveBox-Specific Posts, Tutorials, and Guides](#archivebox-specific-posts-tutorials-and-guides) - [ArchiveBox-Specific Posts, Tutorials, and Guides](#archivebox-specific-posts-tutorials-and-guides)
+ [ArchiveBox Discussions in News & Social Media](#archivebox-discussions-in-news--social-media) - [ArchiveBox Discussions in News & Social Media](#archivebox-discussions-in-news--social-media)
- [Communities](#Communities) - [Communities](#Communities)
*A collection of the most active internet archiving communities and initiatives.* *A collection of the most active internet archiving communities and initiatives.*
+ [Most Active Web-Archiving Communities](#most-active-communities) - [Most Active Web-Archiving Communities](#most-active-communities)
+ [Other Web Archiving Communities](#web-archiving-communities) - [Other Web Archiving Communities](#web-archiving-communities)
+ [General Archiving Foundations, Coalitions, Initiatives, and Institutes](#general-archiving-foundations-coalitions-initiatives-and-institutes) - [General Archiving Foundations, Coalitions, Initiatives, and Institutes](#general-archiving-foundations-coalitions-initiatives-and-institutes)
--- ---
@@ -73,14 +72,14 @@ Indexes of archiving institutions and software maintained by other people. If t
### Bookmarking Services ### Bookmarking Services
- **[Pocket Premium](https://getpocket.com) Bookmarking tool that provides an archiving service in their paid version, run by Mozilla** - **[Pocket Premium](https://getpocket.com) Bookmarking tool that provides an archiving service in their paid version, run by Mozilla**
- **[Pinboard](https://pinboard.in) Bookmarking tool that provides archiving in a paid version, run by a single independent developer** - **[Pinboard](https://pinboard.in) Bookmarking tool that provides archiving in a paid version, run by a single independent developer**
- **[Raindrop](https://raindrop.io) Bookmarking tool with archiving in their paid version, run by a company est. 2011** - **[Raindrop](https://raindrop.io) Bookmarking tool with archiving in their paid version, run by a company est. 2011**
- [Instapaper](https://www.instapaper.com) Bookmarking alternative to Pocket/Pinboard (with no archiving) - [Instapaper](https://www.instapaper.com) Bookmarking alternative to Pocket/Pinboard (with no archiving)
- [Wallabag](https://wallabag.org) / [Wallabag.it](https://wallabag.it) Self-hostable web archiving server that can import via RSS - [Wallabag](https://wallabag.org) / [Wallabag.it](https://wallabag.it) Self-hostable web archiving server that can import via RSS
- [Shaarli](https://github.com/shaarli/Shaarli) Self-hostable bookmark tagging, archiving, and sharing service - [Shaarli](https://github.com/shaarli/Shaarli) Self-hostable bookmark tagging, archiving, and sharing service
- [ReadWise](https://readwise.io/) A paid Pocket/Pinboard alternative that includes article snippet and highlight saving - [ReadWise](https://readwise.io/) A paid Pocket/Pinboard alternative that includes article snippet and highlight saving
- [Diigo](https://www.diigo.com/) Another bookmarking/annotation service with archiving as a paid feature - [Diigo](https://www.diigo.com/) Another bookmarking/annotation service with archiving as a paid feature
--- ---
@@ -89,15 +88,15 @@ Indexes of archiving institutions and software maintained by other people. If t
<img src="https://docs.monadical.com/uploads/9b9d97d4-eaa4-4399-b7e2-628f1ef81675.png" width="115px" align="right" style="float:right; margin: 5px"/> <img src="https://docs.monadical.com/uploads/9b9d97d4-eaa4-4399-b7e2-628f1ef81675.png" width="115px" align="right" style="float:right; margin: 5px"/>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/Internet_Archive_logo_and_wordmark.svg/250px-Internet_Archive_logo_and_wordmark.svg.png" width="100px" align="right" style="float:right; margin: 5px"/> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/84/Internet_Archive_logo_and_wordmark.svg/250px-Internet_Archive_logo_and_wordmark.svg.png" width="100px" align="right" style="float:right; margin: 5px"/>
- **[Archive.org](https://archive.org) The O.G. wayback machine provided publicly by the Internet Archive (Archive.org)** - **[Archive.org](https://archive.org) The O.G. Wayback Machine provided publicly by the Internet Archive (Archive.org)**
- **[Archive.it](https://archive-it.org) commercial Wayback-Machine solution** - **[Archive.it](https://archive-it.org) commercial Wayback Machine solution**
- **[Heretrix](https://github.com/internetarchive/heritrix3) The king of internet archiving crawlers, powers the Wayback Machine** - **[Heritrix](https://github.com/internetarchive/heritrix3) The king of internet archiving crawlers, powers the Wayback Machine**
- **[Brozzler](https://github.com/internetarchive/brozzler) chrome headless crawler + WARC archiver maintained by Archive.org** - **[Brozzler](https://github.com/internetarchive/brozzler) chrome headless crawler + WARC archiver maintained by Archive.org**
- [WarcProx](https://github.com/internetarchive/warcprox) warc proxy recording and playback utility - [WarcProx](https://github.com/internetarchive/warcprox) WARC proxy recording and playback utility
- [WarcTools](https://github.com/internetarchive/warctools) utilities for dealing with WARCs - [WarcTools](https://github.com/internetarchive/warctools) utilities for dealing with WARCs
- [Grab-Site](https://github.com/ArchiveTeam/grab-site) An easy preconfigured web crawler designed for backing up websites - [Grab-Site](https://github.com/ArchiveTeam/grab-site) An easy preconfigured web crawler designed for backing up websites
- [WPull](https://github.com/ArchiveTeam/wpull) A pure python implementation of wget with WARC saving - [WPull](https://github.com/ArchiveTeam/wpull) A pure python implementation of wget with WARC saving
- [More on their Github...](https://github.com/internetarchive) - [More on their GitHub...](https://github.com/internetarchive)
--- ---
@@ -128,14 +127,14 @@ Indexes of archiving institutions and software maintained by other people. If t
<img src="https://avatars2.githubusercontent.com/u/4416806?s=280&v=4" width="130px" align="right" style="float: right; margin: 5px"/> <img src="https://avatars2.githubusercontent.com/u/4416806?s=280&v=4" width="130px" align="right" style="float: right; margin: 5px"/>
- **[ipwb](https://github.com/oduwsdl/ipwb) A distributed web archiving solution using pywb with ipfs for storage** - **[ipwb](https://github.com/oduwsdl/ipwb) A distributed web archiving solution using pywb with IPFS for storage**
- **[archivenow](https://github.com/oduwsdl/archivenow) tool that pushes urls into all the online archive services like Archive.is and Archive.org** - **[archivenow](https://github.com/oduwsdl/archivenow) tool that pushes urls into all the online archive services like Archive.is and Archive.org**
- [node-warc](https://github.com/N0taN3rd/node-warc) Parse And Create Web ARChive (WARC) files with node.js - [node-warc](https://github.com/N0taN3rd/node-warc) Parse And Create Web ARChive (WARC) files with node.js
- [WAIL](https://machawk1.github.io/wail/) Web archiver GUI using Heritrix and OpenWayback - [WAIL](https://machawk1.github.io/wail/) Web archiver GUI using Heritrix and OpenWayback
- [Squidwarc](https://github.com/N0taN3rd/Squidwarc) User-scriptable, archival crawler using Chrome - [Squidwarc](https://github.com/N0taN3rd/Squidwarc) User-scriptable, archival crawler using Chrome
- [WAIL (Electron)](https://github.com/n0tan3rd/wail) Electron app version of the original [wail](https://github.com/machawk1/wail) for creating and interacting with web archives - [WAIL (Electron)](https://github.com/n0tan3rd/wail) Electron app version of the original [wail](https://github.com/machawk1/wail) for creating and interacting with web archives
- **[warcreate](https://github.com/machawk1/warcreate) a Chrome extension for creating WARCs from any webpage** - **[warcreate](https://github.com/machawk1/warcreate) a Chrome extension for creating WARCs from any webpage**
- [More on their Github...](https://github.com/oduwsdl) - [More on their GitHub...](https://github.com/oduwsdl)
--- ---
@@ -143,9 +142,9 @@ Indexes of archiving institutions and software maintained by other people. If t
<img src="https://archivesunleashed.org/images/hairball-roboto.png" width="220px" align="right" style="float: right; margin: 5px"/> <img src="https://archivesunleashed.org/images/hairball-roboto.png" width="220px" align="right" style="float: right; margin: 5px"/>
- [AUT](https://github.com/archivesunleashed/aut) Archives Unleashed Toolkit for analyzing web archives (formerly WarcBase) - [AUT](https://github.com/archivesunleashed/aut) Archives Unleashed Toolkit for analyzing web archives (formerly WarcBase)
- [Warclight](https://github.com/archivesunleashed/warclight) A Rails engine for finding and searching web archives - [Warclight](https://github.com/archivesunleashed/warclight) A Rails engine for finding and searching web archives
- [More on their Github...](https://github.com/archivesunleashed) - [More on their GitHub...](https://github.com/archivesunleashed)
--- ---
@@ -153,10 +152,10 @@ Indexes of archiving institutions and software maintained by other people. If t
### From the IIPC team ### From the IIPC team
- **[OpenWayback](https://github.com/iipc/openwayback/wiki) Open source project developing core Wayback-Machine components** - **[OpenWayback](https://github.com/iipc/openwayback/wiki) Open source project developing core Wayback Machine components**
- **[awesome-web-archiving](https://github.com/iipc/awesome-web-archiving) Large list of archiving projects and orgs** - **[awesome-web-archiving](https://github.com/iipc/awesome-web-archiving) Large list of archiving projects and orgs**
- [JWARC](https://github.com/iipc/jwarc) A Java library for reading and writing WARC files. - [JWARC](https://github.com/iipc/jwarc) A Java library for reading and writing WARC files.
- [More on their Github...](https://github.com/iipc) - [More on their GitHub...](https://github.com/iipc)
--- ---
@@ -164,23 +163,23 @@ Indexes of archiving institutions and software maintained by other people. If t
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/10/Archive.is.jpg/250px-Archive.is.jpg" width="150px" align="right" style="float: right; margin: 5px"/> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/10/Archive.is.jpg/250px-Archive.is.jpg" width="150px" align="right" style="float: right; margin: 5px"/>
- https://archive.is / https://archive.today - https://archive.is / https://archive.today
- https://ghostarchive.org - https://ghostarchive.org
- https://perma.cc - https://perma.cc
- https://arquivo.pt - https://arquivo.pt
- https://www.pagefreezer.com - https://www.pagefreezer.com
- https://www.smarsh.com - https://www.smarsh.com
- https://www.stillio.com - https://www.stillio.com
- https://archive.st - https://archive.st
- https://theoldnet.com/ - https://theoldnet.com/
- https://timetravel.mementoweb.org/ - https://timetravel.mementoweb.org/
- https://freezepage.com/ - https://freezepage.com/
- https://webcitation.org/archive - https://webcitation.org/archive
- https://archiveofourown.org/ - https://archiveofourown.org/
- https://megalodon.jp/ - https://megalodon.jp/
- https://www.webarchive.org.uk/ukwa/ - https://www.webarchive.org.uk/ukwa/
- https://github.com/HelloZeroNet/ZeroNet (super cool project) - https://github.com/HelloZeroNet/ZeroNet (super cool project)
- Google, Bing, DuckDuckGo, and other [search engine caches](https://www.clickminded.com/google-cache-search/) - Google, Bing, DuckDuckGo, and other [search engine caches](https://www.clickminded.com/google-cache-search/)
--- ---
@@ -188,75 +187,75 @@ Indexes of archiving institutions and software maintained by other people. If t
> *There are lots more projects listed here too: https://github.com/stars/pirate/lists/internet-archiving* > *There are lots more projects listed here too: https://github.com/stars/pirate/lists/internet-archiving*
- **[SingleFile](https://github.com/gildas-lormeau/SingleFile/) Web Extension / CLI util for Firefox and Chrome to save a web page as a single HTML file**
- **[Memex by Worldbrain.io](https://github.com/WorldBrain/Memex) a beautiful, user-friendly browser extension that archives all history with full-text search, annotation support, and more**
- **[Hypothes.is](https://web.hypothes.is/) a web/pdf/ebook annotation tool that also archives content**
- **[Reminiscence](https://github.com/kanishka-linux/reminiscence/) extremely similar to ArchiveBox, uses a Django backend + UI and provides auto-tagging and summary features with NLTK**
- **[Shaarchiver](https://github.com/nodiscc/shaarchiver) very similar project that archives Firefox, Shaarli, or Delicious bookmarks and all linked media, generating a markdown/HTML index**
- **[Archivy](https://github.com/archivy/archivy) Python-based self-hosted knowledge base embedded into your filesystem**
- **[Polarized](https://getpolarized.io/) a desktop application for bookmarking, annotating, and archiving articles offline**
- **[LinkWarden](https://github.com/linkwarden/linkwarden) Link archival and curation web app, very similar to ArchiveBox**
- **[Photon](https://github.com/s0md3v/Photon) a fast crawler with archiving and asset extraction support**
- **[Scoop](https://github.com/harvard-lil/scoop)** Create high-fidelity WARC/WACZ captures using a playwright browser, with support for signing, media extraction, PDFs, etc. ([by the perma.cc team](https://lil.law.harvard.edu/blog/2023/04/13/scoop-witnessing-the-web/))
- **[Browsertrix](https://webrecorder.net/browsertrix) + [ArchiveWeb.page](https://webrecorder.net/archivewebpage) + [ReplayWeb.page](https://webrecorder.net/replaywebpage) Webrecorder's archiving suite has the highest fidelity, and can flawlessly archive YouTube, X, Facebook, and other complex, JS-heavy SPAs** - **[Browsertrix](https://webrecorder.net/browsertrix) + [ArchiveWeb.page](https://webrecorder.net/archivewebpage) + [ReplayWeb.page](https://webrecorder.net/replaywebpage) Webrecorder's archiving suite has the highest fidelity, and can flawlessly archive YouTube, X, Facebook, and other complex, JS-heavy SPAs**
- **[SingleFile](https://github.com/gildas-lormeau/SingleFile/) Web Extension / CLI util for Firefox and Chrome to save a web page as a single HTML file**
- **[Memex by Worldbrain.io](https://github.com/WorldBrain/Memex) a beautiful, user-friendly browser extension that archives all history with full-text search, annotation support, and more**
- **[Hypothes.is](https://web.hypothes.is/) a web/pdf/ebook annotation tool that also archives content**
- **[Reminiscence](https://github.com/kanishka-linux/reminiscence/) extremely similar to ArchiveBox, uses a Django backend + UI and provides auto-tagging and summary features with NLTK**
- **[Shaarchiver](https://github.com/nodiscc/shaarchiver) very similar project that archives Firefox, Shaarli, or Delicious bookmarks and all linked media, generating a markdown/HTML index**
- **[Archivy](https://github.com/archivy/archivy) Python-based self-hosted knowledge base embedded into your filesystem**
- **[Polarized](https://getpolarized.io/) a desktop application for bookmarking, annotating, and archiving articles offline**
- **[LinkWarden](https://github.com/linkwarden/linkwarden) Link archival and curation web app, very similar to ArchiveBox**
- **[Photon](https://github.com/s0md3v/Photon) a fast crawler with archiving and asset extraction support**
- **[Scoop](https://github.com/harvard-lil/scoop)** Create high-fidelity WARC/WACZ captures using a playwright browser, with support for signing, media extraction, PDFs, etc. ([by the Perma.cc team](https://lil.law.harvard.edu/blog/2023/04/13/scoop-witnessing-the-web/))
Ones I haven't personally vetted: Ones I haven't personally vetted:
- [Shiori](https://github.com/go-shiori/shiori) Simple bookmark manager + readability archiver built with Go (like a clone of Pocket) - [Shiori](https://github.com/go-shiori/shiori) Simple bookmark manager + readability archiver built with Go (like a clone of Pocket)
- [Percollate](https://github.com/danburzo/percollate) A command-line tool to turn web pages into beautiful, readable PDF, EPUB, or HTML docs. - [Percollate](https://github.com/danburzo/percollate) A command-line tool to turn web pages into beautiful, readable PDF, EPUB, or HTML docs.
- [LinkAce](https://www.linkace.org/) A self-hosted bookmark management tool that saves snapshots to archive.org - [LinkAce](https://www.linkace.org/) A self-hosted bookmark management tool that saves snapshots to archive.org
- [LinkDing](https://github.com/sissbruecker/linkding) Self-hosted bookmark manager that is designed to be minimal, fast, and easy to set up using Docker. - [LinkDing](https://github.com/sissbruecker/linkding) Self-hosted bookmark manager that is designed to be minimal, fast, and easy to set up using Docker.
- [LinkWallet](https://github.com/tardisx/linkwallet) A self-hosted bookmark database with full-text page content search and limited archiving features - [LinkWallet](https://github.com/tardisx/linkwallet) A self-hosted bookmark database with full-text page content search and limited archiving features
- [Espial](https://github.com/jonschoning/espial) Bookmark manager and search tool with limited archiving features - [Espial](https://github.com/jonschoning/espial) Bookmark manager and search tool with limited archiving features
- [Diskernet](https://github.com/dosyago/DiskerNet) Archiving tool that uses the Chrome debugger protocol to save each page as-loaded in the browser (aka 22120 by c0fe or i5ik) - [Diskernet](https://github.com/dosyago/DiskerNet) Archiving tool that uses the Chrome debugger protocol to save each page as-loaded in the browser (aka 22120 by c0fe or i5ik)
- [Trilium](https://github.com/zadam/trilium) Personal web UI based knowledge-base with web clipping and note-taking - [Trilium](https://github.com/zadam/trilium) Personal web UI based knowledge-base with web clipping and note-taking
- [Herodotus](https://github.com/alaskanpuffin/herodotus-core) Django-based web archiving tool with a focus on collecting text-based content - [Herodotus](https://github.com/alaskanpuffin/herodotus-core) Django-based web archiving tool with a focus on collecting text-based content
- [Buku](https://github.com/jarun/buku) Browser-independent bookmark manager CLI written in Python3 and SQLite3 - [Buku](https://github.com/jarun/buku) Browser-independent bookmark manager CLI written in Python3 and SQLite3
- [ReadableWebProxy](https://github.com/fake-name/ReadableWebProxy) A proxying archiver that downloads content from sites and can snapshot multiple versions of sites over time - [ReadableWebProxy](https://github.com/fake-name/ReadableWebProxy) A proxying archiver that downloads content from sites and can snapshot multiple versions of sites over time
- [Perkeep](https://perkeep.org/) "Perkeep lets you permanently keep your stuff, for life." - [Perkeep](https://perkeep.org/) "Perkeep lets you permanently keep your stuff, for life."
- [Fossilo](https://www.fossilo.com/) A commercial archiving solution that appears to be very similar to ArchiveBox - [Fossilo](https://www.fossilo.com/) A commercial archiving solution that appears to be very similar to ArchiveBox
- [NeonLink](https://github.com/AlexSciFier/neonlink) Simple self-hosted bookmark management + [Benotes](https://noted.lol/benotes/) note-taking app with limited archiving features - [NeonLink](https://github.com/AlexSciFier/neonlink) Simple self-hosted bookmark management + [Benotes](https://noted.lol/benotes/) note-taking app with limited archiving features
- [Archivematica](https://github.com/artefactual/archivematica) web GUI for institutional long-term archiving of web and other content - [Archivematica](https://github.com/artefactual/archivematica) web GUI for institutional long-term archiving of web and other content
- [Headless Chrome Crawler](https://github.com/yujiosaka/headless-chrome-crawler) distributed web crawler built on puppeteer with screenshots - [Headless Chrome Crawler](https://github.com/yujiosaka/headless-chrome-crawler) distributed web crawler built on puppeteer with screenshots
- [WWWOFFLE](http://www.gedanken.org.uk/software/wwwoffle/) old proxying recorder software similar to ArchiveBox - [WWWOFFLE](http://www.gedanken.org.uk/software/wwwoffle/) old proxying recorder software similar to ArchiveBox
- [Erised](https://github.com/marvelm/erised) Super simple CLI utility to bookmark and archive webpages - [Erised](https://github.com/marvelm/erised) Super simple CLI utility to bookmark and archive webpages
- [Zotero](https://www.zotero.org/) collect, organize, cite, and share research (mainly for technical/scientific papers & citations) - [Zotero](https://www.zotero.org/) collect, organize, cite, and share research (mainly for technical/scientific papers & citations)
- [TiddlyWiki](https://tiddlywiki.com/) Non-linear bookmark and note-taking tool with archiving support - [TiddlyWiki](https://tiddlywiki.com/) Non-linear bookmark and note-taking tool with archiving support
- [Joplin](https://joplinapp.org/) Desktop + mobile app for knowledge-base-style info collection and notes (w/ optional plugin for archiving) - [Joplin](https://joplinapp.org/) Desktop + mobile app for knowledge-base-style info collection and notes (w/ optional plugin for archiving)
- [Hunchly](https://www.hunch.ly/) A paid web archiving / session recording tool designed for OSINT - [Hunchly](https://www.hunch.ly/) A paid web archiving / session recording tool designed for OSINT
- [Monolith](https://github.com/Y2Z/monolith) CLI tool for saving complete web pages as a single HTML file - [Monolith](https://github.com/Y2Z/monolith) CLI tool for saving complete web pages as a single HTML file
- [Obelisk](https://github.com/go-shiori/obelisk) Go package and CLI tool for saving web page as single HTML file - [Obelisk](https://github.com/go-shiori/obelisk) Go package and CLI tool for saving web page as single HTML file
- [Munin Archiver](https://github.com/peterk/munin-indexer) Social media archiver for Facebook, Instagram and VKontakte accounts. - [Munin Archiver](https://github.com/peterk/munin-indexer) Social media archiver for Facebook, Instagram and VKontakte accounts.
- **[Wayback](https://github.com/wabarc/wayback) Archiving in style like ArchiveBox, but with a chat.** - **[Wayback](https://github.com/wabarc/wayback) Archiving in style like ArchiveBox, but with a chat.**
--- ---
### Smaller Utilities ### Smaller Utilities
Random helpful utilities for web archiving, WARC creation and replay, and more... Random helpful utilities for web archiving, WARC creation and replay, and more...
- https://github.com/TheCakeIsNaOH/xbs-to-archivebox A utility to sync xBrowserSync bookmarks with ArchiveBox - https://github.com/TheCakeIsNaOH/xbs-to-archivebox A utility to sync xBrowserSync bookmarks with ArchiveBox
- https://github.com/karlicoss/promnesia A browser extension that [collects and collates all the URLs you visit](https://beepb00p.xyz/promnesia.html) into a hierarchical/graph structure with metadata - https://github.com/karlicoss/promnesia A browser extension that [collects and collates all the URLs you visit](https://beepb00p.xyz/promnesia.html) into a hierarchical/graph structure with metadata
- https://github.com/vrtdev/save-page-state A Chrome extension for saving the state of a page in multiple formats - https://github.com/vrtdev/save-page-state A Chrome extension for saving the state of a page in multiple formats
- https://github.com/jsvine/waybackpack command-line tool that lets you download the entire Wayback Machine archive for a given URL - https://github.com/jsvine/waybackpack command-line tool that lets you download the entire Wayback Machine archive for a given URL
- https://github.com/hartator/wayback-machine-downloader Download an entire website from the Internet Archive Wayback Machine. - https://github.com/hartator/wayback-machine-downloader Download an entire website from the Internet Archive Wayback Machine.
- https://github.com/Lifesgood123/prevent-link-rot Replace any broken URLs in some content with Wayback Machine URL equivalents - https://github.com/Lifesgood123/prevent-link-rot Replace any broken URLs in some content with Wayback Machine URL equivalents
- https://en.archivarix.com download an archived page or entire site from the Wayback Machine - https://en.archivarix.com download an archived page or entire site from the Wayback Machine
- https://proofofexistence.com prove that a certain file existed at a given time using the blockchain - https://proofofexistence.com prove that a certain file existed at a given time using the blockchain
- https://github.com/chfoo/warcat for merging, extracting, and verifying WARC files - https://github.com/chfoo/warcat for merging, extracting, and verifying WARC files
- https://github.com/mozilla/readability tool for extracting article contents and text - https://github.com/mozilla/readability tool for extracting article contents and text
- https://github.com/mholt/timeliner All your digital life on a single timeline, stored locally - https://github.com/mholt/timeliner All your digital life on a single timeline, stored locally
- https://github.com/wkhtmltopdf/wkhtmltopdf Webkit HTML to PDF archiver/saver - https://github.com/wkhtmltopdf/wkhtmltopdf Webkit HTML to PDF archiver/saver
- [Sheetsee-Pocket](http://jlord.us/sheetsee-pocket/) project that provides a pretty auto-updating index of your Pocket links (without archiving them) - [Sheetsee-Pocket](http://jlord.us/sheetsee-pocket/) project that provides a pretty auto-updating index of your Pocket links (without archiving them)
- [Pocket -> IFTTT -> Dropbox](https://christopher.su/2013/saving-pocket-links-file-day-dropbox-ifttt-launchd/) Post by Christopher Su on his Pocket saving IFTTT recipe - [Pocket -> IFTTT -> Dropbox](https://christopher.su/2013/saving-pocket-links-file-day-dropbox-ifttt-launchd/) Post by Christopher Su on his Pocket saving IFTTT recipe
- http://squidman.net/squidman/index.html - http://squidman.net/squidman/index.html
- https://wordpress.org/plugins/broken-link-checker/ - https://wordpress.org/plugins/broken-link-checker/
- https://github.com/ArchiveTeam/wpull - https://github.com/ArchiveTeam/wpull
- http://freedup.org/ - http://freedup.org/
- https://en.wikipedia.org/wiki/Furl - https://en.wikipedia.org/wiki/Furl
- https://preservica.com/digital-archive-software-1/active-digital-preservation For-profit company offering a digital preservation software suite - https://preservica.com/digital-archive-software-1/active-digital-preservation For-profit company offering a digital preservation software suite
- https://github.com/karlicoss/grasp capture webpages from Firefox and Chrome into Org-mode documents - https://github.com/karlicoss/grasp capture webpages from Firefox and Chrome into Org-mode documents
- https://github.com/dgtlmoon/changedetection.io Change detection and monitoring of web page content changes - https://github.com/dgtlmoon/changedetection.io Change detection and monitoring of web page content changes
- [And many more on the other lists...](#the-master-lists) - [And many more on the other lists...](#the-master-lists)
--- ---
@@ -271,44 +270,44 @@ A collection of blog posts and articles about internet archiving, contact me / o
<img src="https://media.npr.org/assets/img/2017/06/28/istock-506236357-5961b1f611e5136a7cd3fd5f74d97f4575f48c66-s800-c85.jpg" width="350px" align="right" style="float: right; margin: 5px"/> <img src="https://media.npr.org/assets/img/2017/06/28/istock-506236357-5961b1f611e5136a7cd3fd5f74d97f4575f48c66-s800-c85.jpg" width="350px" align="right" style="float: right; margin: 5px"/>
- https://blog.archive.org - https://blog.archive.org
- https://webrecorder.net/blog - https://webrecorder.net/blog
- https://netpreserveblog.wordpress.com - https://netpreserveblog.wordpress.com
- https://blog.conifer.rhizome.org/ - https://blog.conifer.rhizome.org/
- https://ws-dl.blogspot.com - https://ws-dl.blogspot.com
- https://siarchives.si.edu/blog - https://siarchives.si.edu/blog
- https://parameters.ssrc.org - https://parameters.ssrc.org
- https://sr.ithaka.org/publications - https://sr.ithaka.org/publications
- https://ait.blog.archive.org - https://ait.blog.archive.org
- https://brewster.kahle.org - https://brewster.kahle.org
- https://ianmilligan.ca - https://ianmilligan.ca
- https://medium.com/@giovannidamiola - https://medium.com/@giovannidamiola
--- ---
### Articles We Like About Internet Archiving ### Articles We Like About Internet Archiving
- https://items.ssrc.org/parameters/on-the-importance-of-web-archiving/ - https://items.ssrc.org/parameters/on-the-importance-of-web-archiving/
- https://theconversation.com/your-internet-data-is-rotting-115891 - https://theconversation.com/your-internet-data-is-rotting-115891
- https://www.bbc.com/future/story/20190401-why-theres-so-little-left-of-the-early-internet - https://www.bbc.com/future/story/20190401-why-theres-so-little-left-of-the-early-internet
- https://sr.ithaka.org/publications/the-state-of-digital-preservation-in-2018/ - https://sr.ithaka.org/publications/the-state-of-digital-preservation-in-2018/
- https://gizmodo.com/delete-never-the-digital-hoarders-who-collect-tumblrs-1832900423 - https://gizmodo.com/delete-never-the-digital-hoarders-who-collect-tumblrs-1832900423
- https://siarchives.si.edu/blog/we-are-not-alone-progress-digital-preservation-community - https://siarchives.si.edu/blog/we-are-not-alone-progress-digital-preservation-community
- https://www.gwern.net/Archiving-URLs - https://www.gwern.net/Archiving-URLs
- http://brewster.kahle.org/2015/08/11/locking-the-web-open-a-call-for-a-distributed-web-2/ - http://brewster.kahle.org/2015/08/11/locking-the-web-open-a-call-for-a-distributed-web-2/
- https://lwn.net/Articles/766374/ - https://lwn.net/Articles/766374/
- https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives - https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives
- https://medium.com/@giovannidamiola/making-the-internet-archives-full-text-search-faster-30fb11574ea9 - https://medium.com/@giovannidamiola/making-the-internet-archives-full-text-search-faster-30fb11574ea9
- https://xkcd.com/1909/ - https://xkcd.com/1909/
- https://samsaffron.com/archive/2012/06/07/testing-3-million-hyperlinks-lessons-learned#comment-31366 - https://samsaffron.com/archive/2012/06/07/testing-3-million-hyperlinks-lessons-learned#comment-31366
- https://www.gwern.net/docs/linkrot/2011-muflax-backup.pdf - https://www.gwern.net/docs/linkrot/2011-muflax-backup.pdf
- https://thoughtstreams.io/higgins/permalinking-vs-transience/ - https://thoughtstreams.io/higgins/permalinking-vs-transience/
- http://ait.blog.archive.org/files/2014/04/archiveit_life_cycle_model.pdf - http://ait.blog.archive.org/files/2014/04/archiveit_life_cycle_model.pdf
- https://blog.archive.org/2016/05/26/web-archiving-with-national-libraries/ - https://blog.archive.org/2016/05/26/web-archiving-with-national-libraries/
- https://blog.archive.org/2014/10/28/building-libraries-together/ - https://blog.archive.org/2014/10/28/building-libraries-together/
- https://ianmilligan.ca/2018/03/27/ethics-and-the-archived-web-presentation-the-ethics-of-studying-geocities/ - https://ianmilligan.ca/2018/03/27/ethics-and-the-archived-web-presentation-the-ethics-of-studying-geocities/
- https://ianmilligan.ca/2018/05/22/new-article-if-these-crawls-could-talk-studying-and-documenting-web-archives-provenance/ - https://ianmilligan.ca/2018/05/22/new-article-if-these-crawls-could-talk-studying-and-documenting-web-archives-provenance/
- https://ws-dl.blogspot.com/2019/02/2019-02-08-google-is-being-shuttered.html - https://ws-dl.blogspot.com/2019/02/2019-02-08-google-is-being-shuttered.html
If any of these links are dead, you can find an archived version on https://archive.sweeting.me or https://web.archive.org. If any of these links are dead, you can find an archived version on https://archive.sweeting.me or https://web.archive.org.
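As a minimal sketch of the tip above, the helpers below build Wayback Machine lookup addresses for a dead link. The function names are hypothetical and chosen only for illustration. The sketch assumes the public `https://web.archive.org/web/<timestamp>/<url>` address pattern and the `https://archive.org/wayback/available` lookup endpoint. It only constructs URLs and makes no network requests.

```python
from urllib.parse import urlencode

# Assumed public Wayback Machine address patterns (not part of this list's tooling).
WAYBACK_PREFIX = "https://web.archive.org/web/"
AVAILABILITY_API = "https://archive.org/wayback/available"


def wayback_snapshot_url(url: str, timestamp: str = "") -> str:
    """Build a Wayback Machine address for `url` (hypothetical helper).

    `timestamp` is an optional YYYYMMDDhhmmss string (a prefix such as "2019"
    is assumed to be accepted); when empty, the address without a timestamp
    is returned.
    """
    if timestamp:
        return f"{WAYBACK_PREFIX}{timestamp}/{url}"
    return f"{WAYBACK_PREFIX}{url}"


def availability_query(url: str) -> str:
    """Build a query address for the availability lookup endpoint (hypothetical helper)."""
    return f"{AVAILABILITY_API}?{urlencode({'url': url})}"


if __name__ == "__main__":
    dead_link = "https://xkcd.com/1909/"
    print(wayback_snapshot_url(dead_link))
    print(wayback_snapshot_url(dead_link, "2019"))
    print(availability_query(dead_link))
```

Opening the first printed address in a browser should lead to a capture of the page, if one exists. The third address can be fetched by a script to check for a capture before following it.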
@@ -319,50 +318,50 @@ If any of these links are dead, you can find an archived version on https://arch
*Beware: many of these may be outdated, as ArchiveBox has frequent updates and continual improvement.* *Beware: many of these may be outdated, as ArchiveBox has frequent updates and continual improvement.*
- "Install ArchiveBox on SaltBox.dev" https://docs.saltbox.dev/sandbox/apps/archivebox/#3-setup - "Install ArchiveBox on SaltBox.dev" https://docs.saltbox.dev/sandbox/apps/archivebox/#3-setup
- "ArchiveBox is an open-source self-hosted web archiving system for the web and the desktop" https://medevel.com/archivebox/ - "ArchiveBox is an open-source self-hosted web archiving system for the web and the desktop" https://medevel.com/archivebox/
- "Install ArchiveBox on a One-Click Docker Application" https://www.vultr.com/docs/install-archivebox-on-a-oneclick-docker-application/ - "Install ArchiveBox on a One-Click Docker Application" https://www.vultr.com/docs/install-archivebox-on-a-oneclick-docker-application/
- "ArchiveBox, una solución para crear nuestro propio Archive.org en miniatura y personalizado" https://www.genbeta.com/herramientas/archivebox-solucion-para-crear-nuestro-propio-archive-org-miniatura-personalizado - "ArchiveBox, una solución para crear nuestro propio Archive.org en miniatura y personalizado" https://www.genbeta.com/herramientas/archivebox-solucion-para-crear-nuestro-propio-archive-org-miniatura-personalizado
- "网页存档的开源工具ArchiveBox可以将网页文字、图片、媒体文件等都保存下来供日后查看。基于Python的开源项目可搭建私人的网络存档服务。" https://www.bilibili.com/s/video/BV1ib4y1X7SL - "网页存档的开源工具ArchiveBox可以将网页文字、图片、媒体文件等都保存下来供日后查看。基于Python的开源项目可搭建私人的网络存档服务。" https://www.bilibili.com/s/video/BV1ib4y1X7SL
- "Персональный интернет-архив без боли" https://habr.com/ru/company/vdsina/blog/550180/ - "Персональный интернет-архив без боли" https://habr.com/ru/company/vdsina/blog/550180/
- "Preserve the Internet With ArchiveBox" https://www.cyberpunks.com/preserve-the-internet-with-archivebox/ - "Preserve the Internet With ArchiveBox" https://www.cyberpunks.com/preserve-the-internet-with-archivebox/
- "Сам себе архивариус. Изучаем возможности ArchiveBox" https://xakep.ru/2021/02/01/archivebox/ - "Сам себе архивариус. Изучаем возможности ArchiveBox" https://xakep.ru/2021/02/01/archivebox/
- "使用存档盒制作自己的Internet存档" http://www.diglog.com/story/1045192.html - "使用存档盒制作自己的Internet存档" http://www.diglog.com/story/1045192.html
- "How to Make Your Own Internet Archive With ArchiveBox" https://nixintel.info/osint-tools/make-your-own-internet-archive-with-archive-box/ - "How to Make Your Own Internet Archive With ArchiveBox" https://nixintel.info/osint-tools/make-your-own-internet-archive-with-archive-box/
- "Mit ArchiveBox Webseiten auf der Festplatte archivieren" https://www.linux-community.de/ausgaben/linuxuser/2020/12/mit-archivebox-webseiten-auf-der-festplatte-archivieren/ - "Mit ArchiveBox Webseiten auf der Festplatte archivieren" https://www.linux-community.de/ausgaben/linuxuser/2020/12/mit-archivebox-webseiten-auf-der-festplatte-archivieren/
- "ArchiveBox开源的WEB存档" https://zhen.bushini.de/14738.html / https://www.1fishsauce.com/?p=4206 - "ArchiveBox开源的WEB存档" https://zhen.bushini.de/14738.html / https://www.1fishsauce.com/?p=4206
- "两个基于爬虫的项目: Kiwix & ArchiveBox" https://blog.csdn.net/JackLang/article/details/108328791 - "两个基于爬虫的项目: Kiwix & ArchiveBox" https://blog.csdn.net/JackLang/article/details/108328791
- "如何创建自己的私人自托管即时阅读应用程序" https://www.pcpc.me/tech/self-hosted-read-later-app - "如何创建自己的私人自托管即时阅读应用程序" https://www.pcpc.me/tech/self-hosted-read-later-app
- "How to install ArchiveBox to preserve websites you care about" - "How to install ArchiveBox to preserve websites you care about"
https://blog.sleeplessbeastie.eu/2019/06/19/how-to-install-archivebox-to-preserve-websites-you-care-about/ https://blog.sleeplessbeastie.eu/2019/06/19/how-to-install-archivebox-to-preserve-websites-you-care-about/
- "How to remotely archive websites using ArchiveBox" - "How to remotely archive websites using ArchiveBox"
https://blog.sleeplessbeastie.eu/2019/06/26/how-to-remotely-archive-websites-using-archivebox/ https://blog.sleeplessbeastie.eu/2019/06/26/how-to-remotely-archive-websites-using-archivebox/
- "How to Create Your Own Private Self-Hosted Read-It-Later App" https://www.makeuseof.com/tag/self-hosted-read-later-app/ - "How to Create Your Own Private Self-Hosted Read-It-Later App" https://www.makeuseof.com/tag/self-hosted-read-later-app/
- "How to use CutyCapt inside ArchiveBox" - "How to use CutyCapt inside ArchiveBox"
https://blog.sleeplessbeastie.eu/2019/07/10/how-to-use-cutycapt-inside-archivebox/ https://blog.sleeplessbeastie.eu/2019/07/10/how-to-use-cutycapt-inside-archivebox/
- "Automate ArchiveBox with Google Spreadsheet to Backup your internet" - "Automate ArchiveBox with Google Spreadsheet to Backup your internet"
https://manfred.life/archivebox https://manfred.life/archivebox
- "【デモ有♪】ConoHaのArchiveBoxアプリケーションを使ってみたよ" - "【デモ有♪】ConoHaのArchiveBoxアプリケーションを使ってみたよ"
https://qiita.com/CloudRemix/items/691caf91efa3ef19a7ad https://qiita.com/CloudRemix/items/691caf91efa3ef19a7ad
- "WEB-ARCHIV TEIL 8: WALLABAG UND ARCHIVEBOX" - "WEB-ARCHIV TEIL 8: WALLABAG UND ARCHIVEBOX"
http://webermartin.net/blog/web-archiv-teil-8-wallabag-und-archivebox/ http://webermartin.net/blog/web-archiv-teil-8-wallabag-und-archivebox/
- https://metaxyntax.neocities.org/entries/7.html - https://metaxyntax.neocities.org/entries/7.html
### ArchiveBox Discussions in News & Social Media ### ArchiveBox Discussions in News & Social Media
<img src="https://cdn.dribbble.com/users/896843/screenshots/2560608/news_media_icons-07.png" width="380px" align="right" style="float: right; margin: 5px"/> <img src="https://cdn.dribbble.com/users/896843/screenshots/2560608/news_media_icons-07.png" width="380px" align="right" style="float: right; margin: 5px"/>
- **Aggregators:** - **Aggregators:**
**[ProductHunt](https://www.producthunt.com/posts/archivebox)**, **[AlternativeTo](https://alternativeto.net/software/archivebox/)**, **[SaaSHub](https://www.saashub.com/archivebox)**, [Logiciels](https://www.logiciels.pro/logiciel-saas/archivebox/), [SteemHunt](https://steemhunt.com/@adnan556644/archivebox-the-open-source-self-hosted-internet-archiving-solution), [Recurse Center: The Joy of Computing](https://joy.recurse.com/posts/224-archivebox), [Github Changelog](https://changelog.com/news/archivebox-opensource-selfhosted-web-archive-6D0d), [Dev.To Ultra List](https://dev.to/teamxenox/-ultra-list-one-list-to-rule-them-all-march-19-4p4f), [O'Reilly 4 Short Links](https://www.oreilly.com/ideas/four-short-links-15-april-2019), [JaxEnter](https://jaxenter.com/github-trending-march-2019-157470.html) **[ProductHunt](https://www.producthunt.com/posts/archivebox)**, **[AlternativeTo](https://alternativeto.net/software/archivebox/)**, **[SaaSHub](https://www.saashub.com/archivebox)**, [Logiciels](https://www.logiciels.pro/logiciel-saas/archivebox/), [SteemHunt](https://steemhunt.com/@adnan556644/archivebox-the-open-source-self-hosted-internet-archiving-solution), [Recurse Center: The Joy of Computing](https://joy.recurse.com/posts/224-archivebox), [GitHub Changelog](https://changelog.com/news/archivebox-opensource-selfhosted-web-archive-6D0d), [Dev.To Ultra List](https://dev.to/teamxenox/-ultra-list-one-list-to-rule-them-all-march-19-4p4f), [O'Reilly 4 Short Links](https://www.oreilly.com/ideas/four-short-links-15-april-2019), [JaxEnter](https://jaxenter.com/github-trending-march-2019-157470.html)
- **Blog Posts & Podcasts:** - **Blog Posts & Podcasts:**
[Korben.info](https://korben.info/archivebox-un-clone-darchive-org-et-de-la-wayback-machine-a-auto-heberger.html), [Defining Desktop Linux Podcast #296 (0:55:00)](https://linuxunplugged.com/296), [Binärgewitter Podcast #221](http://blog.binaergewitter.de/2019/01/18/binaergewitter-talk-number-221-vertieft-in-die-andere-richtung/), [Schrankmonster.de](https://www.schrankmonster.de/2019/04/10/archive-your-slice-of-the-web/), [La Ferme Du Web](https://www.lafermeduweb.net/veille/archivebox-archivez-des-copies-de-sites-en-local-avec-tous-les-medias-lies) [Korben.info](https://korben.info/archivebox-un-clone-darchive-org-et-de-la-wayback-machine-a-auto-heberger.html), [Defining Desktop Linux Podcast #296 (0:55:00)](https://linuxunplugged.com/296), [Binärgewitter Podcast #221](http://blog.binaergewitter.de/2019/01/18/binaergewitter-talk-number-221-vertieft-in-die-andere-richtung/), [Schrankmonster.de](https://www.schrankmonster.de/2019/04/10/archive-your-slice-of-the-web/), [La Ferme Du Web](https://www.lafermeduweb.net/veille/archivebox-archivez-des-copies-de-sites-en-local-avec-tous-les-medias-lies)
- **Hacker News threads and comments:** - **Hacker News threads and comments:**
[#1](https://news.ycombinator.com/item?id=14272133), [#2](https://news.ycombinator.com/item?id=18728546), [#3](https://news.ycombinator.com/item?id=18876685), **[#4](https://news.ycombinator.com/item?id=19346985)**, [and many more...](https://www.google.com/search?q=site%3Anews.ycombinator.com+%22archivebox%22) [#1](https://news.ycombinator.com/item?id=14272133), [#2](https://news.ycombinator.com/item?id=18728546), [#3](https://news.ycombinator.com/item?id=18876685), **[#4](https://news.ycombinator.com/item?id=19346985)**, [and many more...](https://www.google.com/search?q=site%3Anews.ycombinator.com+%22archivebox%22)
- **Reddit r/DataHoarder, r/SelfHosted, etc. posts and comments**: - **Reddit r/DataHoarder, r/SelfHosted, etc. posts and comments**:
[#1](https://www.reddit.com/r/DataHoarder/comments/69e6i9/archive_a_browseable_copy_of_your_saved_pocket/), [#2](https://www.reddit.com/r/DataHoarder/comments/6kepv6/bookmarkarchiver_now_supports_archiving_all_major/), [#3](https://www.reddit.com/r/DataHoarder/comments/apnud4/continually_archive_websites_and_keep_the_older/), [#4](https://www.reddit.com/r/DataHoarder/comments/azdhd9/archivebox_open_source_selfhosted_web_archive/), [#5](https://www.reddit.com/r/DataHoarder/comments/b0o10h/archivebox_self_hosting_clone_of_archiveorg/) , **[#6](https://www.reddit.com/r/DataHoarder/comments/b4nrlc/in_case_you_havent_seen_it_archivebox_has_a/)**, [#7](https://www.reddit.com/r/selfhosted/comments/69eoi3/pocket_stream_archive_your_own_personal_wayback/), [#8](https://www.reddit.com/r/selfhosted/comments/an2368/archivebox_the_opensource_selfhosted_web_archive/), [and many more...](https://www.google.com/search?q=site%3Areddit.com+%22archivebox%22) [#1](https://www.reddit.com/r/DataHoarder/comments/69e6i9/archive_a_browseable_copy_of_your_saved_pocket/), [#2](https://www.reddit.com/r/DataHoarder/comments/6kepv6/bookmarkarchiver_now_supports_archiving_all_major/), [#3](https://www.reddit.com/r/DataHoarder/comments/apnud4/continually_archive_websites_and_keep_the_older/), [#4](https://www.reddit.com/r/DataHoarder/comments/azdhd9/archivebox_open_source_selfhosted_web_archive/), [#5](https://www.reddit.com/r/DataHoarder/comments/b0o10h/archivebox_self_hosting_clone_of_archiveorg/) , **[#6](https://www.reddit.com/r/DataHoarder/comments/b4nrlc/in_case_you_havent_seen_it_archivebox_has_a/)**, [#7](https://www.reddit.com/r/selfhosted/comments/69eoi3/pocket_stream_archive_your_own_personal_wayback/), [#8](https://www.reddit.com/r/selfhosted/comments/an2368/archivebox_the_opensource_selfhosted_web_archive/), [and many more...](https://www.google.com/search?q=site%3Areddit.com+%22archivebox%22)
- **Twitter:** - **Twitter:**
[Python Trending](https://twitter.com/pythontrending/status/1092492387182628865), [PyCoder's Weekly](https://twitter.com/pycoders/status/1105803699799105536), [Python Hub](https://twitter.com/PythonHub/status/1107601343395651589), [Smashing Magazine](https://twitter.com/smashingmag/status/1107990604774928386), <a href="https://twitter.com/search?q=archivebox.io%20OR%20archivebox%2Farchivebox%20OR%20archiveboxapp&src=typed_query&f=live">and many more...</a> [Python Trending](https://twitter.com/pythontrending/status/1092492387182628865), [PyCoder's Weekly](https://twitter.com/pycoders/status/1105803699799105536), [Python Hub](https://twitter.com/PythonHub/status/1107601343395651589), [Smashing Magazine](https://twitter.com/smashingmag/status/1107990604774928386), <a href="https://twitter.com/search?q=archivebox.io%20OR%20archivebox%2Farchivebox%20OR%20archiveboxapp&src=typed_query&f=live">and many more...</a>
--- ---
@@ -373,16 +372,16 @@ If any of these links are dead, you can find an archived version on https://arch
<img src="https://www.archiveteam.org/images/f/f3/Archive_team.png" width="230px" align="right" style="float: right; margin: 5px"/> <img src="https://www.archiveteam.org/images/f/f3/Archive_team.png" width="230px" align="right" style="float: right; margin: 5px"/>
- **[The Internet Archive (Archive.org)](https://archive.org/iathreads/forums.php)** (USA) - **[The Internet Archive (Archive.org)](https://archive.org/iathreads/forums.php)** (USA)
- **[International Internet Preservation Consortium (IIPC)](http://netpreserve.org/)** (International) - **[International Internet Preservation Consortium (IIPC)](http://netpreserve.org/)** (International)
- **[The Archive Team](https://www.archiveteam.org/), [URL Team](https://www.archiveteam.org/index.php?title=URLTeam), [r/ArchiveTeam](https://reddit.com/r/ArchiveTeam)** (International) - **[The Archive Team](https://www.archiveteam.org/), [URL Team](https://www.archiveteam.org/index.php?title=URLTeam), [r/ArchiveTeam](https://reddit.com/r/ArchiveTeam)** (International)
- **[Rhizome.org](http://archive.rhizome.org/)** The digital preservation group that works on [Conifer by Rhizome](https://conifer.rhizome.org/) formerly Webrecorder.io (USA) - **[Rhizome.org](http://archive.rhizome.org/)** The digital preservation group that works on [Conifer by Rhizome](https://conifer.rhizome.org/) formerly Webrecorder.io (USA)
- **[Webrecorder.net](https://webrecorder.net/)** Formerly known[¹](https://blog.conifer.rhizome.org/2020/06/11/webrecorder-conifer.html) as Webrecorder.io is a project Led by Ilya Kreymer, that researches and develops web archiving tools, widely used by the community. - **[Webrecorder](https://webrecorder.net/)** Formerly known[¹](https://blog.conifer.rhizome.org/2020/06/11/webrecorder-conifer.html) as Webrecorder.io, a company led by Ilya Kreymer that researches and develops web archiving tools widely used by the community.
- **[Old Dominion University: Web Science and Digital Libraries (WS-DL @ ODU)](https://ws-dl.cs.odu.edu)** (Virginia, USA) - **[Old Dominion University: Web Science and Digital Libraries (WS-DL @ ODU)](https://ws-dl.cs.odu.edu)** (Virginia, USA)
- **[r/DataHoarder](https://www.reddit.com/r/DataHoarder), [r/Archivists](https://www.reddit.com/r/Archivists/), [r/DHExchange](https://www.reddit.com/r/DHExchange/)** (International) - **[r/DataHoarder](https://www.reddit.com/r/DataHoarder), [r/Archivists](https://www.reddit.com/r/Archivists/), [r/DHExchange](https://www.reddit.com/r/DHExchange/)** (International)
- [The Eye](https://the-eye.eu) Non-profit working on content archival and long-term preservation (Europe) - [The Eye](https://the-eye.eu) Non-profit working on content archival and long-term preservation (Europe)
- [Digital Preservation Coalition](https://www.dpconline.org/about) & their [Software Tool Registry (COPTR)](http://coptr.digipres.org/Main_Page) (UK & Wales) - [Digital Preservation Coalition](https://www.dpconline.org/about) & their [Software Tool Registry (COPTR)](http://coptr.digipres.org/Main_Page) (UK & Wales)
- [Archives Unleashed Project](https://archivesunleashed.org/about-project/) and [UAP Github](https://github.com/archivesunleashed) (Canada) - [Archives Unleashed Project](https://archivesunleashed.org/about-project/) and [UAP GitHub](https://github.com/archivesunleashed) (Canada)
--- ---
@@ -392,21 +391,21 @@ If any of these links are dead, you can find an archived version on https://arch
Follow these technological and organizational archiving hubs for the latest archiving news. Follow these technological and organizational archiving hubs for the latest archiving news.
- [Canadian Web Archiving Coalition](https://www.carl-abrc.ca/advancing-research/digital-preservation/cwac/) (Canada) - [Canadian Web Archiving Coalition](https://www.carl-abrc.ca/advancing-research/digital-preservation/cwac/) (Canada)
- [Web Archives for Historical Research Group](https://uwaterloo.ca/web-archive-group/about) (Canada) - [Web Archives for Historical Research Group](https://uwaterloo.ca/web-archive-group/about) (Canada)
- [Smithsonian Institution Archives: Digital Curation](https://siarchives.si.edu/what-we-do/digital-curation) (Washington D.C., USA) - [Smithsonian Institution Archives: Digital Curation](https://siarchives.si.edu/what-we-do/digital-curation) (Washington D.C., USA)
- [National Digital Stewardship Alliance (NDSA)](http://www.digitalpreservation.gov/ndsa/NDSAtoDLF.html) (USA) - [National Digital Stewardship Alliance (NDSA)](http://www.digitalpreservation.gov/ndsa/NDSAtoDLF.html) (USA)
- [Digital Library Federation (DLF)](https://www.diglib.org/about/) (USA) - [Digital Library Federation (DLF)](https://www.diglib.org/about/) (USA)
- [Council on Library and Information Resources (CLIR)](http://www.clir.org/about) (USA) - [Council on Library and Information Resources (CLIR)](http://www.clir.org/about) (USA)
- [Digital Curation Centre (DCC)](http://www.dcc.ac.uk/about-us) (UK) - [Digital Curation Centre (DCC)](http://www.dcc.ac.uk/about-us) (UK)
- [ArchiveMatica](https://www.archivematica.org/en/) & their [Community Wiki](https://wiki.archivematica.org/Community) (International) - [ArchiveMatica](https://www.archivematica.org/en/) & their [Community Wiki](https://wiki.archivematica.org/Community) (International)
- [Professional Development Institutes for Digital Preservation (POWRR)](https://digitalpowrr.niu.edu/) (USA) - [Professional Development Institutes for Digital Preservation (POWRR)](https://digitalpowrr.niu.edu/) (USA)
- [Institute of Museum and Library Services (IMLS)](https://www.imls.gov/about/mission) (USA) - [Institute of Museum and Library Services (IMLS)](https://www.imls.gov/about/mission) (USA)
- [Stanford Libraries Web Archiving](https://library.stanford.edu/projects/web-archiving) (USA) - [Stanford Libraries Web Archiving](https://library.stanford.edu/projects/web-archiving) (USA)
- [Society of American Archivists: Electronic Records (SAA)](https://www2.archivists.org/groups/electronic-records-section) (USA) - [Society of American Archivists: Electronic Records (SAA)](https://www2.archivists.org/groups/electronic-records-section) (USA)
- [BitCurator Consortium (BCC)](https://bitcuratorconsortium.org/mission) (USA) - [BitCurator Consortium (BCC)](https://bitcuratorconsortium.org/mission) (USA)
- [Ethics & Archiving the Web Conference (Rhizome/Webrecorder.io)](https://eaw.rhizome.org/) (USA) - [Ethics & Archiving the Web Conference (Rhizome)](https://eaw.rhizome.org/) (USA)
- [Archivists Round Table of NYC](https://www.nycarchivists.org/) (USA) - [Archivists Round Table of NYC](https://www.nycarchivists.org/) (USA)
--- ---
@@ -416,56 +415,56 @@ Follow these technological and organizational archiving hubs for the latest arch
Find your local archiving group in the list and see how you can contribute! Find your local archiving group in the list and see how you can contribute!
- [Community Archives and Heritage Group](https://www.communityarchives.org.uk/content/about/history-and-purpose) (UK & Ireland) - [Community Archives and Heritage Group](https://www.communityarchives.org.uk/content/about/history-and-purpose) (UK & Ireland)
- [Open Preservation Foundation (OPF)](https://openpreservation.org/about/organisation/) (UK & Europe) - [Open Preservation Foundation (OPF)](https://openpreservation.org/about/organisation/) (UK & Europe)
- [Software Preservation Network](https://www.softwarepreservationnetwork.org/about/) (International) - [Software Preservation Network](https://www.softwarepreservationnetwork.org/about/) (International)
- [ITHAKA](https://www.ithaka.org/content/our-mission), [Portico](https://www.portico.org/why-portico/), [JSTOR](https://www.jstor.org/), [ARTSTOR](http://www.artstor.org/), [S+R](https://sr.ithaka.org/our-work/collections-and-preservation/) (USA) - [ITHAKA](https://www.ithaka.org/content/our-mission), [Portico](https://www.portico.org/why-portico/), [JSTOR](https://www.jstor.org/), [ARTSTOR](http://www.artstor.org/), [S+R](https://sr.ithaka.org/our-work/collections-and-preservation/) (USA)
- [Archives and Records Association](https://www2.archivists.org/assoc-orgs/archives-and-records-association-united-kingdom-ireland) (UK & Ireland) - [Archives and Records Association](https://www2.archivists.org/assoc-orgs/archives-and-records-association-united-kingdom-ireland) (UK & Ireland)
- [Arkivrådet AAS](http://www.arkivradet.se/) (Sweden) - [Arkivrådet AAS](http://www.arkivradet.se/) (Sweden)
- [Asociación Española de Archiveros, Bibliotecarios, Museologos y Documentalistas (ANABAD)](https://www2.archivists.org/assoc-orgs/asociaci%C3%B3n-espa%C3%B1ola-de-archiveros-bibliotecarios-museologos-y-documentalistas-anabad) (Spain) - [Asociación Española de Archiveros, Bibliotecarios, Museologos y Documentalistas (ANABAD)](https://www2.archivists.org/assoc-orgs/asociaci%C3%B3n-espa%C3%B1ola-de-archiveros-bibliotecarios-museologos-y-documentalistas-anabad) (Spain)
- [Associação dos Arquivistas Brasileiros (AAB)](https://www2.archivists.org/assoc-orgs/associacao-dos-arquivistas-brasileiros-aab) (Brazil) - [Associação dos Arquivistas Brasileiros (AAB)](https://www2.archivists.org/assoc-orgs/associacao-dos-arquivistas-brasileiros-aab) (Brazil)
- [Associação Portuguesa de Bibliotecários, Archivistas e Documentalistas (BAD)](https://www2.archivists.org/assoc-orgs/associacao-portuguesa-de-bibliotecarios-archivistas-e-documentalistas-bad) (Portugal) - [Associação Portuguesa de Bibliotecários, Archivistas e Documentalistas (BAD)](https://www2.archivists.org/assoc-orgs/associacao-portuguesa-de-bibliotecarios-archivistas-e-documentalistas-bad) (Portugal)
- [Association des archivistes français (AAF)](https://www2.archivists.org/assoc-orgs/association-des-archivistes-francais-aaf) (France) - [Association des archivistes français (AAF)](https://www2.archivists.org/assoc-orgs/association-des-archivistes-francais-aaf) (France)
- [Associazione Nazionale Archivistica Italiana (ANAI)](https://www2.archivists.org/assoc-orgs/associazione-nazionale-archivistica-italiana-anai) (Italy) - [Associazione Nazionale Archivistica Italiana (ANAI)](https://www2.archivists.org/assoc-orgs/associazione-nazionale-archivistica-italiana-anai) (Italy)
- [Australian Society of Archivists Inc.](https://www2.archivists.org/assoc-orgs/australian-society-of-archivists-inc) (Australia) - [Australian Society of Archivists Inc.](https://www2.archivists.org/assoc-orgs/australian-society-of-archivists-inc) (Australia)
- [International Council on Archives (ICA)](https://www2.archivists.org/assoc-orgs/international-council-on-archives-ica) - [International Council on Archives (ICA)](https://www2.archivists.org/assoc-orgs/international-council-on-archives-ica)
- [International Records Management Trust (IRMT)](https://www2.archivists.org/assoc-orgs/international-records-management-trust-irmt) - [International Records Management Trust (IRMT)](https://www2.archivists.org/assoc-orgs/international-records-management-trust-irmt)
- [Irish Society for Archives](https://www2.archivists.org/assoc-orgs/irish-society-for-archives) (Ireland) - [Irish Society for Archives](https://www2.archivists.org/assoc-orgs/irish-society-for-archives) (Ireland)
- [Koninklijke Vereniging van Archivarissen in Nederland](https://www2.archivists.org/assoc-orgs/koninklijke-vereniging-van-archivarissen-in-nederland) (Netherlands) - [Koninklijke Vereniging van Archivarissen in Nederland](https://www2.archivists.org/assoc-orgs/koninklijke-vereniging-van-archivarissen-in-nederland) (Netherlands)
- [State Archives Administration of the People's Republic of China](https://www2.archivists.org/assoc-orgs/state-archives-administration-of-the-peoples-republic-of-china) (China) - [State Archives Administration of the People's Republic of China](https://www2.archivists.org/assoc-orgs/state-archives-administration-of-the-peoples-republic-of-china) (China)
- [Academy of Certified Archivists](https://www2.archivists.org/assoc-orgs/academy-of-certified-archivists) - [Academy of Certified Archivists](https://www2.archivists.org/assoc-orgs/academy-of-certified-archivists)
- [Archivists and Librarians in the History of the Health Sciences](https://www2.archivists.org/assoc-orgs/archivists-and-librarians-in-the-history-of-the-health-sciences) - [Archivists and Librarians in the History of the Health Sciences](https://www2.archivists.org/assoc-orgs/archivists-and-librarians-in-the-history-of-the-health-sciences)
- [Archivists for Congregations of Women Religious](https://www2.archivists.org/assoc-orgs/archivists-for-congregations-of-women-religious) - [Archivists for Congregations of Women Religious](https://www2.archivists.org/assoc-orgs/archivists-for-congregations-of-women-religious)
- [Archivists of Religious Institutions](https://www2.archivists.org/assoc-orgs/archivists-of-religious-institutions) - [Archivists of Religious Institutions](https://www2.archivists.org/assoc-orgs/archivists-of-religious-institutions)
- [Association of Catholic Diocesan Archivists](https://www2.archivists.org/assoc-orgs/association-of-catholic-diocesan-archivists) - [Association of Catholic Diocesan Archivists](https://www2.archivists.org/assoc-orgs/association-of-catholic-diocesan-archivists)
- [Association of Moving Image Archivists](https://www2.archivists.org/assoc-orgs/association-of-moving-image-archivists) - [Association of Moving Image Archivists](https://www2.archivists.org/assoc-orgs/association-of-moving-image-archivists)
- [Council of State Archivists](https://www2.archivists.org/assoc-orgs/council-of-state-archivists) - [Council of State Archivists](https://www2.archivists.org/assoc-orgs/council-of-state-archivists)
- [National Association of Government Archives and Records Administrators](https://www2.archivists.org/assoc-orgs/national-association-of-government-archives-and-records-administrators) - [National Association of Government Archives and Records Administrators](https://www2.archivists.org/assoc-orgs/national-association-of-government-archives-and-records-administrators)
- [National Episcopal Historians and Archivists](https://www2.archivists.org/assoc-orgs/national-episcopal-historians-and-archivists)
- [Archival Education and Research Institute](https://www2.archivists.org/assoc-orgs/archival-education-and-research-institute)
- [Archives Leadership Institute](https://www2.archivists.org/assoc-orgs/archives-leadership-institute)
- [Georgia Archives Institute](https://www2.archivists.org/assoc-orgs/georgia-archives-institute)
- [Modern Archives Institute](https://www2.archivists.org/assoc-orgs/modern-archives-institute)
- [Western Archives Institute](https://www2.archivists.org/assoc-orgs/western-archives-institute)
- [Association des archivistes du Québec](https://www2.archivists.org/assoc-orgs/association-des-archivistes-du-quebec)
- [Association of Canadian Archivists](https://www2.archivists.org/assoc-orgs/association-of-canadian-archivists)
- [Canadian Council of Archives/Conseil canadien des archives](https://www2.archivists.org/assoc-orgs/canadian-council-of-archivesconseil-canadien-des-archives)
- [Archives Association of British Columbia](https://www2.archivists.org/assoc-orgs/archives-association-of-british-columbia)
- [Archives Association of Ontario](https://www2.archivists.org/assoc-orgs/archives-association-of-ontario)
- [Archives Council of Prince Edward Island](https://www2.archivists.org/assoc-orgs/archives-council-of-prince-edward-island)
- [Archives Society of Alberta](https://www2.archivists.org/assoc-orgs/archives-society-of-alberta)
- [Association for Manitoba Archives](https://www2.archivists.org/assoc-orgs/association-for-manitoba-archives)
- [Association of Newfoundland and Labrador Archives](https://www2.archivists.org/assoc-orgs/association-of-newfoundland-and-labrador-archives)
- [Council of Nova Scotia Archives](https://www2.archivists.org/assoc-orgs/council-of-nova-scotia-archives)
- [Réseau des services d'archives du Québec](https://www2.archivists.org/assoc-orgs/reseau-des-services-darchives-du-quebec)
- [Saskatchewan Council for Archives and Archivists](https://www2.archivists.org/assoc-orgs/saskatchewan-council-for-archives-and-archivists)
You can find more organizations and initiatives on these other lists:
- [Wikipedia.org List of Web Archiving Initiatives](https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives)
- [SAA List of USA & Canada Based Archiving Organizations](https://www2.archivists.org/assoc-orgs/directory)
- [SAA List of International Archiving Organizations](https://www2.archivists.org/assoc-orgs/i_a_o)
- [Digital Preservation Coalition's Member List](https://www.dpconline.org/about/members)
---
@@ -473,25 +472,25 @@ You can find more organizations and initiatives on these other lists:
### ArchiveBox Chat Rooms
- [Official ArchiveBox Zulip Chat Server](https://zulip.archivebox.io)
- [Unofficial ArchiveBox Matrix chat room](https://matrix.to/#/#archivebox:matrix.org) (old)
- [Github Discussions](https://github.com/ArchiveBox/ArchiveBox/discussions) - [GitHub Discussions](https://github.com/ArchiveBox/ArchiveBox/discussions)
### ArchiveBox on Social Media
- [Twitter: @ArchiveBoxApp](https://twitter.com/ArchiveBoxApp)
- [LinkedIn: ArchiveBox](https://www.linkedin.com/company/archivebox/)
- [YouTube: @ArchiveBoxApp](https://www.youtube.com/@ArchiveBoxApp)
- [Reddit: r/ArchiveBox](https://www.reddit.com/r/ArchiveBox/)
- [Alternative.to](https://alternativeto.net/software/archivebox/about/)
- [ReposHub](https://reposhub.com/python/web-crawling/pirate-ArchiveBox.html)
### ArchiveBox on Package Distribution Platforms
- [Python PyPI](https://pypi.org/project/archivebox/)
- [Docker Hub](https://hub.docker.com/r/archivebox/archivebox)
- [ArchLinux AUR](https://aur.archlinux.org/packages/archivebox)
- [Ubuntu Launchpad PPA](https://launchpad.net/~archivebox/+archive/ubuntu/archivebox)
---
@@ -502,4 +501,4 @@ You can find more organizations and initiatives on these other lists:
<br/><br/>
<small><a href="#contents">^ &nbsp; Back to Top &nbsp; ^</a></small>
</div> </div>