mirror of https://github.com/pirate/ArchiveBox.git synced 2025-09-02 02:42:38 +02:00

Updated Web Archiving Community (markdown)

Nick Sweeting
2019-01-22 15:38:42 -05:00
parent d8ed24c6df
commit 221ad6598a

@@ -22,17 +22,16 @@ Start with the master list: the [Awesome Web Archiving List](https://github.com/
### Similar Projects
- [Reminiscence](https://github.com/kanishka-linux/reminiscence/) extremely similar to ArchiveBox, uses a Django backend + UI and provides auto tagging and summary features with NLTK
- [Webrecorder.io](https://webrecorder.io/) Save full browsing sessions and archive all the content (ipwb)
- [Brozzler](https://github.com/internetarchive/brozzler) chrome headless crawler + WARC archiver maintained by Archive.org
- [Memex by Worldbrain.io](https://github.com/WorldBrain/Memex) a browser extension that saves all your history and does full-text search
- [Hypothes.is](https://web.hypothes.is/) a web/pdf/ebook annotation tool that also archives content
- [Perkeep](https://perkeep.org/) "Perkeep lets you permanently keep your stuff, for life."
- [Fetching.io](http://fetching.io/) A personal search engine/archiver that lets you search through all archived websites that you've bookmarked
- [Shaarchiver](https://github.com/nodiscc/shaarchiver) very similar project that archives Firefox, Shaarli, or Delicious bookmarks and all linked media, generating a markdown/HTML index
- [Webrecorder.io](https://webrecorder.io/) Save full browsing sessions and archive all the content
- [Wallabag](https://wallabag.org) Save articles you read locally or on your phone
- [Archivematica](https://github.com/artefactual/archivematica) web GUI for institutional long-term archiving of web and other content
### Tools
- https://github.com/chfoo/warcat for merging, extracting, and verifying WARC files