mirror of https://github.com/pirate/ArchiveBox.git synced 2025-08-31 10:01:52 +02:00

new generic_html parser for extracting hrefs
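
The commit title describes a parser that pulls hrefs out of arbitrary HTML. The parser's code is not shown on this page, so the following is only a minimal sketch of that idea using the Python standard library, not the implementation from this commit. The names `HrefCollector`, `extract_hrefs`, and `base_url` are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this hook.
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href, or with an empty one.
            if name == "href" and value:
                self.hrefs.append(value)


def extract_hrefs(html_text, base_url=None):
    """Return the hrefs found in html_text, in document order.

    If base_url is given (an assumption of this sketch), relative links
    are resolved against it with urljoin.
    """
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    if base_url is None:
        return list(collector.hrefs)
    return [urljoin(base_url, href) for href in collector.hrefs]
```

Calling `extract_hrefs(page_source, base_url=page_url)` on a saved page would yield absolute URLs that could then be handed to an archiver; how the real parser resolves relative links is not stated here.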

Nick Sweeting
2020-08-18 08:29:05 -04:00
parent a682a9c478
commit 15efb2d5ed
5 changed files with 106 additions and 39 deletions

@@ -70,6 +70,7 @@ archivebox/index/json.py
 archivebox/index/schema.py
 archivebox/index/sql.py
 archivebox/parsers/__init__.py
+archivebox/parsers/generic_html.py
 archivebox/parsers/generic_json.py
 archivebox/parsers/generic_rss.py
 archivebox/parsers/generic_txt.py