mirror of https://github.com/RSS-Bridge/rss-bridge.git synced 2025-08-27 10:04:53 +02:00

Compare commits

181 Commits

Author SHA1 Message Date
Hunter T.
32f324dbb5 [nginx.conf]: Set 'server_tokens off'; won't leak nginx version (#4681) 2025-08-25 23:28:06 +02:00
Quentin B.
29b47d95dc [CentreFranceBridge] Fix title parsing (#4680)
* [CentreFranceBridge] Fix title parsing

* [CentreFranceBridge] Fix title
2025-08-25 23:21:45 +02:00
Tone
bed2de02c3 [GolemBridge] changes due to redesign (#4684)
* [GolemBridge] changes due to redesign

* Update GolemBridge.php

remove "öffnet in neuen Fenster" in Links

* Update GolemBridge.php

- remove redundant stuff, like title in text
- add first part of article
- fix categories
2025-08-25 23:21:21 +02:00
Florent V.
096e398e41 [SamMobileBridge] New bridge to fetch the latest security patches for Samsung devices (#4676)
* [SamMobileBridge] Fetches the latest security patches for Samsung devices

* [SamMobileBridge] Add date handling

* [SamMobileBridge] Remove empty spaces

* [SamMobileBridge] add strict_types

---------

Co-authored-by: Florent VIOLLEAU <florent.violleau@samsic.fr>
2025-08-20 16:30:22 +02:00
Florent V.
b423b13bd5 [EdfPricesBridge] Update for DOM change (#4675)
Co-authored-by: Florent VIOLLEAU <florent.violleau@samsic.fr>
2025-08-19 18:23:27 +02:00
Mynacol
e30698f12f [GolemBridge] Add multi-page headings
On multi-page articles like [1], some paragraph headers were missing
because they are headers of the article pages.

These headers were previously removed in
c5f586497f for being redundant with the
original header. The article at [1] proves us wrong, but I added logic
to ignore truly duplicate headers.

[1] https://www.golem.de/news/es-muss-nicht-immer-apple-sein-fuenf-ueberzeugende-airpods-pro-alternativen-im-test-2508-195000.html
2025-08-17 14:56:42 +02:00
Lukas Nabakowski
876d3c8ae7 [ZDFMediathekBridge] add bridge (#4672)
* Add ZDFMediathekBridge

* Declare strict types = 1
2025-08-15 16:46:32 +02:00
Matt DeMoss
ee4f85cc94 pcgamer: meta tag change (#4670)
* pcgamer: the parsely tags are gone, use different tags

* apply phpcs.xml rules
2025-08-14 19:06:39 +02:00
tillcash
1b584b4551 [CybernewsBridge] add bridge (#4665)
* [CybernewsBridge] add bridge

* [CybernewsBridge] fix lint

* [CybernewsBridge] add header

* [CybernewsBridge] fix url

* [CybernewsBridge] fix url 2

* [CybernewsBridge] revert header

* [CybernewsBridge] refactor

* [CybernewsBridge] final

* [CybernewsBridge] lint
2025-08-14 14:31:47 +02:00
xnand-dot-xyz
3a9e398228 [ModrinthBridge] Add bridge (#4651)
* [ModrinthBridge] Add bridge

Support for querying updates to projects on https://modrinth.com

May need modification, and I'm alright with the maintainer name being changed or cleared if actual maintenance is expected.

* Added declare and fixed linting errors

* Skip parsing lists if null, and trim trailing space
2025-08-14 14:28:33 +02:00
Simone Dotto
2e387eb9d6 [SubitoBridge] Add bridge (#1800) (#4628)
* [SubitoBridge] Add bridge (issue #1800)

* php 74 compat

* user-agent blocking bypass

* constant variable access

* strict types

---------

Co-authored-by: Simone Dotto <simonedotto@proton.me>
2025-08-14 14:26:16 +02:00
User123698745
9b6fa7cd97 [prtester] improve prtester.py and prhtmlgenerator.yml for running in forks (#4313)
* [prtester] support forks to upload to their own "rss-bridge-tests"

add parameter "--artifact-base-url" and "--artifact-directory"

* [prtester] review feedback: add 'github.event.number' fallback to 'none'
2025-08-14 08:17:42 +02:00
Mynacol
5382dee516 [GolemBridge] Fix removal of affiliate images
On
https://www.golem.de/news/anlage-in-etfs-was-alternativen-zum-msci-world-bringen-2508-199041.html
the affiliate box isn't properly filtered out.
The reason seems to be switching from a `div` to an `aside` element.
HTML source fragment:
```html
<aside class="gbox_affiliate" data-nosnippet>
    <div class="gbox_attribution"></div>
    <div class="gbox_fx1">
<a href="https://www.financeads.net/tc.php?t=36731C67231788T" target="_blank" rel="nofollow" onclick="_gcpx.push(['ev','d','rklmbox/14387']); return true;"><img src="https://scr3.golem.de/screenshots/affiliate/14387/9caaa476f979dcf7457395f39ac9ed9f.png" alt=""></a>
        <div class="gbox_fx2">
            <div class="gbox_title">Tagesgeld, Festgeld, ETFs, Aktien und mehr bei raisin</div>
            <div><a class="gbox_btn" data-cta="Jetzt Investmentm&ouml;glichkeiten bei raisin entdecken" href="https://www.financeads.net/tc.php?t=36731C67231788T" target="_blank" rel="nofollow" onclick="_gcpx.push(['ev','d','rklmbox/14387']); return true;"></a></div>
        </div>
    </div>
<!-- /gbox --></aside>
```
2025-08-13 12:15:26 +02:00
Mynacol
b60556ffb4 [HeiseBridge] Remove "Videos by heise" ads
This seems to be a new middle-of-content self-ad.
Seen on https://heise.de/-10519045

The code snippet in that case was:
```html
<div class="ad ad--inread">

  <div class="ad--inread-header">
    <p class="ad--inread-header__text">
      Videos by heise
    </p>

    <div class="ad--inread-header__more">
      <button class="ad--inread-header-menu-toggle" popovertarget="ad--inread-header-menu">
        mehr Videos
        <svg fill="none" height="24" viewbox="0 0 24 24" width="24" xmlns="http://www.w3.org/2000/svg">
          <path d="M8.625 12.0023C8.625 12.2094 8.45711 12.3773 8.25 12.3773C8.04289 12.3773 7.875 12.2094 7.875 12.0023C7.875 11.7952 8.04289 11.6273 8.25 11.6273C8.45711 11.6273 8.625 11.7952 8.625 12.0023ZM8.625 12.0023H8.25M12.375 12.0023C12.375 12.2094 12.2071 12.3773 12 12.3773C11.7929 12.3773 11.625 12.2094 11.625 12.0023C11.625 11.7952 11.7929 11.6273 12 11.6273C12.2071 11.6273 12.375 11.7952 12.375 12.0023ZM12.375 12.0023H12M16.125 12.0023C16.125 12.2094 15.9571 12.3773 15.75 12.3773C15.5429 12.3773 15.375 12.2094 15.375 12.0023C15.375 11.7952 15.5429 11.6273 15.75 11.6273C15.9571 11.6273 16.125 11.7952 16.125 12.0023ZM16.125 12.0023H15.75M21 12.0023C21 16.9729 16.9706 21.0023 12 21.0023C7.02944 21.0023 3 16.9729 3 12.0023C3 7.03176 7.02944 3.00232 12 3.00232C16.9706 3.00232 21 7.03176 21 12.0023Z" stroke="#777" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5"></path>
        </svg>
      </button>

      <div class="ad--inread-header-menu" id="ad--inread-header-menu" popover>
        <ul class="a-u-mb-0">
          <li>
            <a class="ad--inread-header-menu-link" href="https://www.youtube.com/@ct3003" target="_blank">
              c&#39;t 3003
            </a>
          </li>
          <li>
            <a class="ad--inread-header-menu-link" href="https://www.youtube.com/heiseonline" target="_blank">
              heise &amp; ct
            </a>
          </li>
          <li>
            <a class="ad--inread-header-menu-link" href="https://peertube.heise.de/" target="_blank">
              Peertube
            </a>
          </li>
        </ul>
      </div>
    </div>
  </div>

  <figure class="video video--fullwidth">
    <a-video entry-id="25969" height="9" instant is-target-video-playlist style="aspect-ratio: 16 / 9" type="targetvideo" width="16"></a-video>
  </figure>
</div>
```

Hence filtering anything with the class `ad` or `ad--inread` gets rid of
it.
2025-08-12 21:25:42 +02:00
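The class-based filtering described in the HeiseBridge commit above can be sketched roughly as follows. This is an illustration in Python, not the bridge's PHP code; `ClassFilter` and `strip_classes` are hypothetical names. It drops every element whose `class` attribute contains a blocked token, whatever the tag name is, and matches whole class tokens so that `ad` does not also match `advert`.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassFilter(HTMLParser):
    """Hypothetical helper: re-emit HTML without elements of blocked classes.

    Comments and doctype declarations are dropped for brevity.
    """

    def __init__(self, blocked):
        super().__init__(convert_charrefs=False)
        self.blocked = set(blocked)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a blocked element

    def _is_blocked(self, attrs):
        # Compare whole class tokens, not substrings.
        classes = (dict(attrs).get("class") or "").split()
        return bool(self.blocked.intersection(classes))

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
        elif self._is_blocked(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a subtree.
        if not self.skip_depth and not self._is_blocked(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_classes(html, blocked):
    """Return html with every element carrying a blocked class removed."""
    parser = ClassFilter(blocked)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Because the check looks only at the class list, a site switching the wrapper from one tag to another would not defeat the filter.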
Dag
37174f01e5 fix: throw client exception in some bridges (#4661) 2025-08-08 02:24:13 +02:00
Dag
a599f4ba83 fix: dont log user errors (#4660) 2025-08-08 02:16:43 +02:00
Dag
81ce9c9483 fix: introduce system env var, remove debug mode (#4658)
* fix: introduce system env var

* docs

* docs
2025-08-08 01:38:12 +02:00
Dag
a128c05a97 docs: emphasize strict types (#4657) 2025-08-05 21:06:40 +02:00
Dag
9caa043fe1 lint: add returnClientError and returnServerError to forbiddenFunctions (#4656) 2025-08-05 20:55:04 +02:00
Dag
f11571ae78 refactor: rename functions (#4655)
returnClientError => throwClientException
returnServerError => throwServerException

New convenience function: throwRateLimitException

Old functions are kept but deprecated.
2025-08-05 20:44:40 +02:00
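The "rename but keep the old name as deprecated" pattern from the commit above can be illustrated with a minimal sketch. The project itself is PHP; the Python names below (`ClientException`, `throw_client_exception`, `return_client_error`) are hypothetical stand-ins, not the project's functions.

```python
import warnings


class ClientException(Exception):
    """Hypothetical stand-in for an error caused by bad user input."""


def throw_client_exception(message):
    """New name: raise the exception directly."""
    raise ClientException(message)


def return_client_error(message):
    """Old name, kept for compatibility: warn, then delegate to the new one."""
    warnings.warn(
        "return_client_error is deprecated; use throw_client_exception",
        DeprecationWarning,
        stacklevel=2,
    )
    throw_client_exception(message)
```

Existing callers keep working while the warning nudges them toward the new name.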
Dag
b39964cee3 chore: prepare for aug 2025 release (#4654) 2025-08-05 19:50:27 +02:00
Joseph
9c43921a33 [FirstLookMediaTechBridge] Remove bridge (#4653)
Website no longer exists
2025-08-04 22:57:35 +02:00
Joseph
9e2975048f [AskfmBridge] Remove bridge (#4652)
Website closed in December 2024 https://web.archive.org/web/20241129120541/https://about.ask.fm/closure-notice-the-platform-to-be-deactivated-december-1-2024/
2025-08-04 22:56:27 +02:00
Joseph
fb153f9a92 [DansTonChatBridge] Remove bridge (#4650)
bridge is broken and website has native feeds.

https://danstonchat.com/category/quote/feed
2025-08-04 17:19:24 +02:00
Joseph
20fec74c63 [DailymotionBridge] Fetch playlist title from API (#4649) 2025-08-04 15:41:04 +02:00
Simone Dotto
b5f90f8d47 [AmazonPriceTracker] Fix price not shown, new default source (#4631)
Fixes issue #4586

Co-authored-by: Simone Dotto <simonedotto@proton.me>
2025-08-04 14:31:43 +02:00
shaun
aba38845d2 [YoutubeCommunityTabsBridge] Rename Community→Posts to fix broken bridge (#4606)
* youtube community posts are just called "Posts" now

* finish renaming Community -> Posts

* add feedName fallbacks (thanks @Mar-Koeh)

* rename YouTubePostsTabBridge back to YouTubeCommunityTabBridge

* fix linter error by breaking up long expression

* fix optional-chaining regression by using ‘?? null’
2025-08-04 14:30:48 +02:00
Joseph
1211ac63d9 Update DailymotionBridge.php (#4648) 2025-08-04 14:28:16 +02:00
Joseph
640503168e [FirefoxAddonsBridge] Minor change to item content html (#4647) 2025-08-04 14:27:40 +02:00
Arnav Jain
93de253d01 [GoComicsBridge] cache individual comic page for 24h (#4646) 2025-08-04 14:27:19 +02:00
User123698745
6ec4da854f [FallGuysBridge] fix: handle new data structure (#4640)
* [FallGuysBridge] fix: handle new data structure

* [FallGuysBridge] review feedback: removed mixed
2025-08-04 01:36:44 +02:00
Dag
e5f9fe6251 lint (#4645) 2025-08-04 01:36:15 +02:00
Dag
47c9983e16 fix: dont cache basic auth response (#4644) 2025-08-04 01:32:36 +02:00
Sandro
69eda522c8 Mention php extension filter (#4608)
While experimenting to minimize my installation, I noticed that this
extension is not mentioned anywhere.
2025-08-04 01:09:38 +02:00
User123698745
172e7eb280 [prtester] fix wrong pr check fail when refactoring code (the bridge html output has not changed) (#4642)
ignore "nothing to commit, working tree clean"
2025-08-04 01:08:25 +02:00
User123698745
acb9373c10 [DRKBlutspendeBridge] add offers to content & add caption to images & use cached request (#4641) 2025-08-04 01:07:41 +02:00
Joseph
85497238c5 Update HaveIBeenPwnedBridge.php (#4638) 2025-08-04 00:58:09 +02:00
Marcin Morawski
a2334838a6 Fix deprecations (#4636)
* Fix PHP 8.4 deprecation

Implicitly marking parameter as nullable is deprecated, the explicit nullable type must be used instead

* [github workflow] Add additional php versions
2025-08-04 00:55:50 +02:00
mruac
c65fbd5543 [BlueskyBridge] Fix cases for missing reply post context and QoL fix for video loading (#4635)
* added fix for missing reply post context

* qol fix - no preload on videos
2025-08-04 00:50:12 +02:00
sysadminstory
e241f3dcde [PepperBridgeAbstract, DealabsBridge, HotUKDealsBridge, MydealsBridge] Adapt RSS bridge to website content update; remove country of origin due to missing data (#4634)
The website now uses "vue3" and some classes and attributes have changed their
names: the bridge was updated to use the new class and attribute names.

The country of origin has been removed from the deal list: it is disabled for
now, but the code is still present in the bridge in case the website
enables it again.
2025-08-04 00:48:27 +02:00
Pavel Korytov
16bb6156a5 [UniverseTodayBridge] Add bridge (#4627) 2025-08-04 00:22:50 +02:00
Pavel Korytov
9f8dc411a4 [InstituteForTheStudyOfWarBridge] Increase caching time (#4626) 2025-08-04 00:21:57 +02:00
July
5b97899734 [FanaticalBridge] Create a new bridge (#4624)
Provides a fairly barebones bridge for Fanatical bundles:
- Tags detail bundle tiers and prices
- Contents name and link to each bundle item
- Images for each item are in enclosures
2025-08-04 00:21:04 +02:00
July
8ae2c2e3c3 [HumbleBundleBridge] Overhaul to include more information (#4621)
* [HumbleBundleBridge] Overhaul to include more information

* [HumbleBundleBridge] Remove use of named args in calls

PHP 7.4 lacks named arg support and fails unit tests
2025-08-04 00:20:00 +02:00
July
9ec6ae39a2 [ComickBridge] Add new bridge (#4625)
Adds a new bridge for manga from comick.io. Like the CubariProxyBridge,
it can provide manga page images in feed entry content or enclosures.
2025-08-04 00:19:08 +02:00
July
3517cda4a5 [YouTubeFeedExpanderBridge] More reliable channel icons (#4622) 2025-08-04 00:17:30 +02:00
July
52be29d3ec [AnnasArchiveBridge] Fix book list CSS selector (#4619) 2025-08-04 00:17:01 +02:00
July
696aed22cc [CubariProxyBridge] Replace MangaSee with WeebCentral (#4618) 2025-08-04 00:16:30 +02:00
July
e394be7ca5 [KemonoBridge] Add search query support (#4620) 2025-08-04 00:16:14 +02:00
jaydeethree
3835f290c1 Update GOGBridge to use GOG's REST API. I have tested this locally and it seems to work correctly. (#4616) 2025-08-04 00:14:51 +02:00
Nomis
c7de5c95be Update 06_Public_Hosts.md (#4614)
Remove bridge.easter.fr
2025-08-04 00:12:38 +02:00
Tobias Alexander Franke
71808aaa81 [WarhammerComBridge] Bridge for Warhammer Community blog (#4610)
* [WarhammerComBridge] Bridge for Warhammer Community blog

* Fix Linter issues
2025-08-04 00:10:58 +02:00
Anton Smirnov
2ca696c1cf [EpicGamesFreeBridge] productSlug can be null; also add a universal future-proof-ish fallback (#4595)
* productSlug can be null, do more discovery, add fallback

* productSlug can be garbage too, remove it completely
2025-08-03 23:59:42 +02:00
Sebastian K
c90b98b965 Error handling in ExplosmBridge (#4600)
Skip further processing if element was not found to avoid errors
2025-08-03 23:58:24 +02:00
Quentin B.
8e880de3d2 [CentreFranceBridge] Fix parser following website update (#4596)
* [CentreFranceBridge] Fix parser following website update

* [CentreFranceBridge] Fix empty content

* [CentreFranceBridge] Fix title parsing
2025-08-03 23:52:06 +02:00
Tone
bfa6c4c080 [HeiseBridge] removes language-info-text, add archive.is link for people without subscription (#4594)
* [HeiseBridge] removes language-info-text, add archive.is link for people without subscription

* fix annoying phpcs
2025-08-03 23:50:54 +02:00
User.
5ab938ada7 [WaggaCouncilBridge] Add bridge (#4593)
Co-authored-by: Scrub000 <scrub@example.com>
2025-08-03 23:49:10 +02:00
Petr Prenghy
4d2fe2f12d [NasestrechaBridge] Add bridge (#4591)
* Add files via upload

Bridge for NaseStrecha.cz - NaseStrecha.cz is a specialized Czech news and advice portal focusing on roofs, construction, and home improvement, offering reliable expert guidance on roofing materials, insulation, and energy-saving techniques. It is run by the team behind the Strechy-Solar-Remeslo trade fair and includes up-to-date news, practical tips, and industry events.

* phpcs fix

* Bridge for i4wifi.cz for product news.
The website i4wifi.cz is a wholesale distributor specializing in wireless, networking, and photovoltaic equipment, offering products from brands like MikroTik, Ubiquiti, and Hikvision. It provides a wide range of network solutions, technical support, and training services for businesses and professional installers in the Czech Republic and beyond.
2025-08-03 23:46:35 +02:00
Mynacol
4c0b97d605 [ZeitBridge] Add advertorial marker to article
So users are aware that it's a paid article.

Some might still find them interesting, so we cannot just filter them
away.
2025-07-20 01:35:28 +02:00
Mynacol
1d5bcba41f [ZeitBridge] Hide magazine ads in articles
Test article: https://www.zeit.de/campus/2025/03/kyoto-university-abschlussfeier-kostueme-japan
2025-07-20 01:35:28 +02:00
Mynacol
d19ce75d4b Merge pull request #4613 from Mynacol/golem-add-table
[GolemBridge] Add tables to content
2025-07-16 13:53:53 +02:00
Mynacol
bfbe2abdce [GolemBridge] Add tables to content
For example the following article has such tables that should be
included:
https://www.golem.de/news/immobilien-mieten-oder-kaufen-warum-es-dabei-nicht-nur-ums-geld-geht-2507-197406.html
2025-07-16 11:50:00 +00:00
Jonathan Kay
354cea09a7 [GoComicsBridge] Add fallback when link to current comic is missing (#4589) 2025-06-08 21:57:41 +02:00
sysadminstory
8dada08e69 [IdealoBridge] Bypass bot protection (#4588)
Add some headers (User-Agent, Accept, Accept-Language) and activate
compression to bypass the bot protection
2025-06-07 23:31:02 +02:00
Jonathan Kay
514b3edf0b [GoComicsBridge] Fix for JSON being removed (#4585)
- Now redirects to first comic from landing page
- Switched to meta tags
2025-06-05 23:41:20 +02:00
Tobias Alexander Franke
7aa54602cf [FabBridge] Pull 100% discounted items via Fab API (#4584)
* [FabBridge] Pull 100% discounted items via Fab API

* [FabBridge] Linter fixes
2025-06-04 22:15:28 +02:00
Dag
98e03011db chore: prepare for 2025-06-03 release (#4583) 2025-06-03 21:24:35 +02:00
Anton Smirnov
b8064d9dfe [EpicGamesFree] Fixes: url not set, other promos shown (#4575)
* URI was not set because of the typo

* Filter out other promos
2025-05-30 11:05:36 +02:00
Mynacol
976217111c [GolemBridge] Add code elements
The extractor missed <pre> elements for code snippets.
For example the code line in
https://www.golem.de/news/falsch-deklarierte-hdds-betrug-bei-festplatten-bleibt-ein-problem-2505-196675.html
2025-05-28 21:21:44 +02:00
Joseph
419844f010 Delete OpenlyBridge.php (#4572) 2025-05-26 22:46:42 +02:00
Joseph
e5b3ec85d9 Delete CuriousCatBridge.php (#4571) 2025-05-26 22:46:28 +02:00
Stéphane
7b55eb3824 Adding a bridge for Paul Graham's essays (#4570)
* Adding a bridge for Paul Graham's essays

* lint

---------

Co-authored-by: Dag <me@dvikan.no>
2025-05-25 20:46:50 +02:00
Dag
7397cabeee fix(telegram): remove meta message (#4569) 2025-05-24 19:29:04 +02:00
Thiago Ferreira
daef06c6dd devcontainer: Fixed Dev Containers setup (#4556)
The current setup for Dev Containers was not working, with multiple
different errors. So, in order to restore its functionality (and allow
for things like linting and debugging), the following changes were made:

- The Dockerfile was substantially altered. Now, the `docker-php-ext-enable` binary is installed before its usage,
  it points to the correct PHP binary, and we install Composer for
  loading dev-dependencies later on.

- Moved the "postCreateCommand" section (defined in the `devcontainer.json` file) into its own script file (for a
  more readable experience)

- In the post-creation script, moved the `xdebug.ini` to the correct
  directory (alongside the PHP-FPM bin), installed PHPUnit,
  PHP_CodeSniffer (and the 'PHP Compatibility' sniffer) with Composer in
  a global location, and changed the owner of the `cache` directory

- Changed the VSCode-specific customization settings in order to
  update some binary paths. Also made sure the binaries of globally installed
  Composer packages are accessible via PATH
2025-05-24 19:18:52 +02:00
Dag
ec5b32c551 ci: fix broken ci (#4568)
* fix: deprecation warning

* ci: fix broken ci
2025-05-24 19:14:53 +02:00
Dag
0130adcd6c fix: deprecation warning (#4567) 2025-05-23 22:55:41 +02:00
Dawid Wróbel
b7c04f8587 Overhaul the usage of libcurl-impersonate (#4535)
libcurl-impersonate was not being used properly, as the code was
overriding the headers set by it to prevent detection.

- update the libcurl-impersonate to an actively managed lexiforest
  fork
- impersonate Chrome 131
- move the defaultHttpHeaders to http.php, where it belongs
- only set defaultHttpHeaders if curl-impersonate is not detected
- make useragent ini setting optional and disabled by default
- add necessary documentation updates
2025-05-17 20:18:36 +02:00
Christian Schabesberger
0f77d3ae0a fix nnplus article filter (#4555) 2025-05-10 21:48:54 +02:00
Dag
8f21a030a8 fix(furaffinity): type error (#4554)
fixes array_filter(): Argument #1 ($array) must be of type array, null given

fix #4553
2025-05-09 09:39:35 +02:00
Dag
d36b335725 fix: do not log rate limit exceptions (#4552) 2025-05-09 06:14:13 +02:00
Dag
b8c0c1f3b8 fix: tweak logging rules (#4551) 2025-05-09 05:58:11 +02:00
tillcash
fd267df0e9 [LinuxBlogBridge] fix typo (#4549) 2025-05-09 05:41:10 +02:00
Dag
6c4225441a fix(tiktok) (#4550) 2025-05-09 05:40:48 +02:00
Apollo Nargang
5bd767b862 [TikTokBridge] Use oEmbed for video metadata (#4514)
* [TikTokBridge] Use oEmbed for video metadata

Fetches oEmbed-formatted metadata for videos through the TikTok API to
provide post titles, thumbnails, and authors. This hasn't yet been
tested, so it's possible it doesn't work.

* [TikTokBridge] Add back view count parsing

oops

* [TikTokBridge] Prepend www to the oEmbed API endpoint URL

The non-www URL resulted in a 301 redirect to the www URL, so this just
skips that redirect, improving performance a bit and hopefully helping
with the 400 errors.

* [TikTokBridge] Retry failed oEmbed requests

If an oEmbed request fails, retry a few times, waiting a bit in between
each retry. This should fix the problem for the most part, since I think
the problem was related to some sort of rate limit (it isn't mentioned
in the docs, but it seems to only happen when sending large quantities
of sequential requests).
2025-05-09 05:10:04 +02:00
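The retry behaviour described in the TikTokBridge commit above (retry a few times, waiting between attempts) might look like the following sketch. It is an illustration only: `fetch_with_retry` and its parameters are hypothetical, and the growing delay is an assumption, since the commit message does not state the exact wait times.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is any zero-argument callable; sleep is injectable for testing.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:  # real code would catch a narrower HTTP error
            last_error = error
            if attempt < attempts:
                # Assumed policy: wait a little longer after each failure.
                sleep(delay * attempt)
    raise last_error
```

Spacing the attempts out is what helps if the failures come from an undocumented rate limit, as the commit author suspects.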
Dawid Wróbel
72e1998e16 [AllegroBridge]: fix, use JSON instead of HTML (#4536)
Cookie is now obligatory, otherwise 403 is returned
2025-05-09 05:06:23 +02:00
Tone
083ba1e4f7 [FinanzflussBridge] fix for images not displayed (#4538) 2025-05-09 04:36:22 +02:00
Jonathan Kay
1cb9e91697 [GoComicsBridge] Update fix for latest layout changes (#4539) 2025-05-09 04:35:59 +02:00
sysadminstory
6342b8387e [InstagramBridge] Use fallback when User ID can not be found (#4531)
- In case the userId cannot be found, use the fallback method

- Fallback method moved to its own function
2025-05-09 04:33:18 +02:00
tillcash
648fcc38b5 [LinuxBlogBridge] add bridge (#4528)
* [LinuxBlogBridge] add bridge

* refactor

* Update LinuxBlogBridge.php
2025-05-09 04:28:31 +02:00
√(noham)²
9fb4a5dd72 Apple App Store bridge fix (#4516)
* Apple App Store bridge fix

* Fix AppleAppStore + lint

* fix endpoint
2025-05-09 03:33:56 +02:00
Dag
83edf5a48b fix(CssSelector): html entity decode bug, fix #4484 (#4547) 2025-05-09 03:26:10 +02:00
Dag
66f1d449a7 test (#4546) 2025-05-09 02:15:28 +02:00
Petr Prenghy
908937383b [ElektroARGOSBridge] add new bridge - News, events and promotions on ARGOS electro shop (#4523) 2025-05-09 00:23:21 +02:00
Dag
67c5198cbb chore(fdroid): remove dead bridge (#4545) 2025-05-09 00:15:48 +02:00
Dag
9dc673a038 fix(github): PRs and issues (#4544) 2025-05-09 00:09:28 +02:00
Dag
58e30f8b4b fix(furaffinity): date and tags, #4513 (#4543) 2025-05-08 23:33:18 +02:00
Dag
e6a84052f0 fix(reddit): handle absent search keywords, #4502 (#4542) 2025-05-08 23:04:12 +02:00
Dag
e364dd1a20 fix(atom): omit item timestamp if absent (#4541)
prev behavior inserted current time, which seems wrong
2025-05-08 22:37:56 +02:00
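The behaviour change in the atom commit above (leave the timestamp element out instead of inserting the current time) can be sketched as below. `build_entry` is a hypothetical helper, and the entry is reduced to a title and an optional `updated` element without XML namespaces.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET


def build_entry(title, timestamp=None):
    """Serialize a minimal entry; omit <updated> when no timestamp is known."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = title
    if timestamp is not None:
        moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        ET.SubElement(entry, "updated").text = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    return ET.tostring(entry, encoding="unicode")
```

Checking `timestamp is not None` rather than truthiness keeps a legitimate Unix timestamp of `0` from being treated as absent.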
tillcash
e69ceba237 [ZonebourseBridge] Add Bridge (#4501) 2025-05-08 22:15:55 +02:00
Dag
0d20a8c48c fix(telegram): trim username for convenience #4520 (#4521) 2025-04-16 02:47:57 +02:00
Petr Prenghy
a6ee840533 Update 06_Public_Hosts.md (#4519)
New mirror in the Czech Republic
2025-04-14 12:55:41 +02:00
Dag
95af1ffddf fix(reuters): tweak, try to avoid antibot (#4515) 2025-04-08 21:12:42 +02:00
July
d6a9da1cc8 [SubstackProfileBridge] Add new bridge (#4507) 2025-04-03 07:51:58 +02:00
Jonathan Kay
85962e18d3 [GoComicsBridge] New layout fix and added features (#4510)
* Updated to use the new layout launched April 1st
* Adds new title date/full name option
* Adds limit option for how many days of comics to get
2025-04-03 07:50:16 +02:00
July
a19b63e840 [AO3Bridge] Add option to make one entry per fic (#4508) 2025-04-02 04:09:28 +02:00
tillcash
5365b57638 [MinecraftBridge] fix favicon (#4506) 2025-04-02 03:57:40 +02:00
Dag
462c005f2c fix: dont read /etc if open_basedir #4502 (#4505) 2025-04-01 01:15:59 +02:00
ORelio
db42f2786c [FeedExpander] Add prepareXml() overridable function (#4485)
* FeedExpander: Remove trailing content in XML

- Move preprocessing code into overridable preprocessXml()
- Auto-remove trailing data after root xml node

* FeedExpander: Add PR reference with use case

* FeedExpander: Code linting

* [FeedExpander] Keep content at end of document for now

Will add back later if more sites have the same issue

* [FeedExpander] prepareXml: Add type hints
2025-04-01 00:42:08 +02:00
ORelio
26a4c255d3 [html] convertLazyLoading: Add parseSrcset() (#4503)
* [html] convertLazyLoading: Add parseSrcset()

Add srcset parser closer to the specifications

* [html] code linting

* [html] parseSrcset: Add type hints, check preg_match_all
2025-04-01 00:41:33 +02:00
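To show why a dedicated `srcset` parser is useful, here is a simplified sketch in Python. It is not the project's `parseSrcset()` and it covers only part of the HTML specification's algorithm; `parse_srcset` and `DESCRIPTOR` are hypothetical names. The main point is that URLs may contain commas, so splitting the attribute on `,` is not enough.

```python
import re

# Simplified descriptors: a width such as "100w" or a density such as "1.5x".
DESCRIPTOR = re.compile(r"\d+w|\d+(?:\.\d+)?x")


def parse_srcset(srcset):
    """Return (url, descriptor) pairs; descriptor is None when absent."""
    candidates = []
    pos, length = 0, len(srcset)
    while pos < length:
        # Skip whitespace and commas between candidates.
        while pos < length and (srcset[pos].isspace() or srcset[pos] == ","):
            pos += 1
        if pos >= length:
            break
        # The URL is a run of non-whitespace characters (commas included).
        start = pos
        while pos < length and not srcset[pos].isspace():
            pos += 1
        url = srcset[start:pos]
        descriptor = ""
        if url.endswith(","):
            # A trailing comma ends the candidate; it has no descriptor.
            url = url.rstrip(",")
        else:
            # Otherwise the descriptor runs up to the next comma.
            end = srcset.find(",", pos)
            if end == -1:
                end = length
            descriptor = srcset[pos:end].strip()
            pos = end + 1
        # Candidates with an unrecognized descriptor are dropped.
        if url and (not descriptor or DESCRIPTOR.fullmatch(descriptor)):
            candidates.append((url, descriptor or None))
    return candidates
```

For example, `parse_srcset("img,w_100.jpg 100w, img,w_200.jpg 200w")` keeps both URLs intact, whereas a naive split on commas would cut each of them in two.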
subtle4553
3055e69c23 [ManyVidsBridge] Fix parsing of URL input (#4499) 2025-03-27 21:02:12 +01:00
tillcash
7c1e01b45a [MinecraftBridge] Add Bridge (#4497) 2025-03-26 19:46:02 +01:00
Dag
4d8a46d46e feat: add sanity check for required curl module (#4495) 2025-03-26 00:07:33 +01:00
Dag
9d6aa5ee38 fix: operator precedence bug (#4494) 2025-03-25 23:52:47 +01:00
subtle4553
1c45eff505 [ManyVidsBridge] Create proper feed content (#4493) 2025-03-25 23:34:19 +01:00
Joseph
68ff39e164 [TheFarSideBridge] Remove hotlink protection bypass (#4492) 2025-03-25 21:55:09 +01:00
mruac
abb1602524 fix #4475 (#4491)
* support embeds for feeds, lists and starter packs

* lint
2025-03-25 21:54:25 +01:00
Pavel Korytov
87112497de [AnthropicBridge] Delete bridges (#4490) 2025-03-25 21:52:53 +01:00
Niehztog
38bb5115c9 fix issues reported in https://github.com/RSS-Bridge/rss-bridge/issues/4477 (#4488) 2025-03-24 21:12:26 +01:00
Tomasz Molski
23cb9349fc [CeskaTelevizeBridge] Adjusted getting article timestamp (#4486)
* [CeskaTelevizeBridge] Adjusted getting article timestamp

* [CeskaTelevizeBridge] Removed excess whitespace
2025-03-23 21:30:45 +01:00
Pavel Korytov
05a9ac0f06 [OpenCVEBridge] Rewrite for API change (#4476)
* [OpenCVEBridge] Rewrite for API change

* [OpenCVEBridge] Fix lint
2025-03-23 21:01:21 +01:00
Dan Wainwright
91fe6c1fae [BazarakiBridge] Add new bridge (#4473)
* [BazarakiBridge] Add new bridge

* fix

---------

Co-authored-by: Dag <me@dvikan.no>
2025-03-23 20:57:17 +01:00
chibicitiberiu
7260f28e10 [RedditBridge] Added time interval and filter for min comment count (#4471)
* Reddit Bridge - added filter for min comment count and time interval.

* [RedditBridge] Add sort by comment count

* lint

* consistent commas

---------

Co-authored-by: Dag <me@dvikan.no>
2025-03-23 20:45:35 +01:00
Tomasz Molski
87ab1e4513 [BruegelBridge] Initial commit (#4470) 2025-03-23 19:50:11 +01:00
André Andersson
dee734d360 Add Auctionet bridge (#4452) 2025-03-05 19:41:24 +01:00
Latz
744f996224 Added bridge for Toms Touché (https://taz.de/#!tom=tomdestages) (#4438) 2025-03-05 19:39:18 +01:00
Pavel Korytov
f270cd35e7 [TldrTechBridge] Fix duplicate entries and empty sections (#4466) 2025-03-05 19:36:41 +01:00
Tomasz Molski
83c36a87e2 [ReutersBridge] Adjust Fact Check feed path (#4465) 2025-03-05 19:35:12 +01:00
Tomasz Molski
810e17b556 feat: added LeagueOfLegendsNewsBridge (#4462) 2025-03-05 19:34:35 +01:00
sysadminstory
97f07cf216 [InstagramBridge] Add a fallback to the "Username" mode (#4461)
- Added some headers that could help keep Instagram from blocking RSS-Bridge
- Added a fallback function that uses Instagram's "Embed profile" feature
  to get the content shared by one Instagram user
2025-03-05 19:32:03 +01:00
sysadminstory
62fafdc24b [FreeTelechargerBridge] Update URL and some fix (#4459)
- Updated the URL to the new URL in the bridge metadata
- Use another URL that (sometimes) seems to bypass CF protection
2025-03-05 19:30:38 +01:00
sysadminstory
cd4cdcfd65 [RadioMelodieBridge] Fix media content (#4458)
- Fix the audio source with the absolute URL
- Fix the picture enclosure URL (those are already absolute URLs)
2025-03-05 19:30:09 +01:00
Tobias Alexander Franke
00a24e2f69 New bridge for the latest Shadertoy submissions (#4456)
* New bridge for the latest Shadertoy submissions

* [ShadertoyBridge] Linter fixes

* [ShadertoyBridge] More Linter fixes

* [ShadertoyBridge] Even more Linter fixes
2025-02-26 10:20:28 +01:00
André Andersson
92b5e7093f Fix data-lot-id not being correctly set so use href instead (#4453) 2025-02-24 17:58:24 +01:00
Dag
b52f01505d fix(github): semi-repair (#4449) 2025-02-14 02:42:23 +01:00
Dag
e4c32bb046 fix(vk): semi-disable broken bridge (#4448) 2025-02-14 02:00:07 +01:00
Christian Schabesberger
dd4dcfa59c fix nn.de description and paywall filter (#4444) 2025-02-08 01:41:51 +01:00
Tostiman
4e678c955f fix CarThrottleBridge (#4442) 2025-02-05 18:41:42 +01:00
July
549bed64d2 [YouTubeFeedExpanderBridge] Add bridge (#4430) 2025-02-04 20:11:43 +01:00
sysadminstory
94924d8e16 [PepperBridgeAbstract, DealabsBridge, HotUKDealsBridge, MydealsBridge] Fix parameters typo (#4439)
Fixed typo in DealabsBridge and HotUKDealsBridge parameters name
2025-02-03 23:24:42 +01:00
sysadminstory
920b21b1fd [PepperBridgeAbstract, DealabsBridge, HotUKDealsBridge, MydealsBridge] Fixing bridge and add subcategories (#4436)
- Follow site change to get deal data (fix for #4432)
- Add Categories (sub categories in reality) support
2025-02-03 15:35:48 +01:00
Dag
935075072b fix: set default cache ttl of 1d (#4434) 2025-01-30 21:05:17 +01:00
July
3ae7a10223 [GovTrackBridge] Rebase on top of official RSS feed (#4429) 2025-01-29 11:11:25 +01:00
Tone
bf431a6eae [AnisearchBridge] changed id of div so trailers work again (#4428) 2025-01-27 21:55:34 +01:00
Dag
824ac5e373 docs (#4427)
* docs

* docs
2025-01-26 21:24:33 +01:00
Bartosz Sosna
ae8394d976 Fix lfc.pl bug with page content when comments exist (#4425)
* Add lfc.pl bridge

* Adjust bridge

* Add comments section

* Fix a bug with page content when comments exist

* Add brtsos to CONTRIBUTORS.md
2025-01-26 18:58:03 +01:00
Dag
4da61b7922 chore: prepare 2025-01-26 release (#4424) 2025-01-26 11:16:35 +01:00
burrow335
8b1ba003a8 Add support for custom feeds in posts (#4413) 2025-01-25 18:46:12 +01:00
Bartosz Sosna
230edf602e Add lfc.pl bridge (#4419)
* Add lfc.pl bridge

* Adjust bridge

* Add comments section
2025-01-25 18:43:27 +01:00
Eugene Molotov
bd7d1734c3 [RutubeBridge] Use publication time instead of creation time (#4417)
The publication time is what the video page itself shows, so it is the more relevant value
2025-01-25 18:40:13 +01:00
Dag
dd8bc077ed feat(FeedParser): recursively parse rss modules (#4422)
Also stop excluding the media module

fix #4415
2025-01-25 18:29:01 +01:00
SebLaus
952a2d99a3 Beginning of URL not needed anymore: ErrorMessage: cURL error Could not resolve host: www.bundestag.dehttps: 6 (https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://www.bundestag.dehttps://www.bundestag.de/parlament/praesidium/parteienfinanzierung/fundstellen50000/2025/2025-inhalt-1032412 (#4420) 2025-01-25 18:28:36 +01:00
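The doubled host in the error message above (`https://www.bundestag.dehttps://www.bundestag.de/...`) is what happens when a base URL is prepended to a link that is already absolute. The commit simply stops prepending it. A generic guard, sketched here in Python with a hypothetical `absolute_url` helper, is to resolve links against the base instead of concatenating strings.

```python
from urllib.parse import urljoin

BASE = "https://www.bundestag.de"


def absolute_url(href, base=BASE):
    """Resolve href against base; already-absolute links pass through unchanged."""
    return urljoin(base, href)
```

With this approach the same code handles both relative paths such as `/parlament/...` and links that already carry the scheme and host.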
Dag
58b3cfb158 fix: drop extension requirement in feed icon url, fix #4416 (#4421) 2025-01-25 17:43:03 +01:00
Eugene Molotov
028acd0af1 [VkBridge] Unassign maintainer (#4418) 2025-01-25 17:27:36 +01:00
axor-mst
2a58f82bd8 [Formula1Bridge] API key and URL format update (#4412)
* [Formula1Bridge] API key and URL format update

* [WorldCosplayBridge] Bridge removal
2025-01-20 17:32:41 +01:00
Simon Alberny
5214581386 Fix MondeDiplo empty date (#4407) 2025-01-15 20:50:56 +01:00
Sebastian Wolf
eadea242a7 [FragDenStaatBridge] remove bridge, site provides full feed at fragdenstaat.de/artikel/feed/ (#4405) 2025-01-12 17:03:27 +01:00
Pavel Korytov
1a2c1f5bba [OllamaBridge] Add bridge (#4403)
* [OllamaBridge] Add bridge

* [OllamaBridge] Fix typo
2025-01-10 20:28:58 +01:00
vdbhb59
776a1f47f3 Update 06_Public_Hosts.md (#4401)
Updated my hosting provider & country to reflect the correct details.
2025-01-10 13:08:35 +01:00
Tone
39ecd63f72 [GolemBridge.php] changed cookie (#4399)
the cookie value changed, without the new cookie it's not possible to parse the articles
2025-01-07 23:40:55 +01:00
Pavel Korytov
0e2655fc8a [AnthropicBridge] Add Anthropic Bridge (#4398)
* [AnthropicBridge] Add Anthropic Bridge

* [AnthropicBridge] Fix lint
2025-01-06 19:10:12 +01:00
Pavel Korytov
e355276378 [EconomistWorldInBriefBridge] Update bridge (#4397)
* [EconomistWorldInBriefBridge] Fix and update bridge

* [EconomistWorldInBriefBridge] Fix lint
2025-01-06 19:08:08 +01:00
Dag
cb65125dbd feat: add section link to frontpage bridge card (#4396) 2025-01-04 20:34:36 +01:00
Dag
1d02214e12 feat: extract simple_html_dom max_file_size to config (#4395) 2025-01-04 19:43:48 +01:00
Dag
48cb7d71ed feat(telegram): add pagination fetching of messages (#4394)
* feat(telegram): add pagination fetching of messages

* docs
2025-01-04 19:00:26 +01:00
Dag
f9e9c8101e Fix 257 (#4393)
* fix(tldrtech): trim duplicate leading slashes

* fix
2025-01-03 08:41:55 +01:00
Dag
97f7df0d06 feat(feedmerge): remove duplicates based off of title too (#4392) 2025-01-03 08:17:47 +01:00
Dag
db3899f2e6 fix(legifrance): emergency repair, still semi-broken (#4391) 2025-01-03 07:23:13 +01:00
Dag
d36cd0a332 fix(ceska): item image (#4390) 2025-01-03 07:11:08 +01:00
Dag
662e0bfa95 refactor(donnons) (#4389) 2025-01-03 06:49:10 +01:00
Dag
3fc38c15a3 fix: cache 400 and 404, and refactor token auth (#4388)
* fix(cache): also cache 400 and 404 responses

* refactor(token_auth)
2025-01-03 06:19:24 +01:00
Dag
be51ba17df fix(url): disallowed wonky path (#4386) 2025-01-03 05:40:30 +01:00
Dag
c44a76ff17 refactor: remove dead code (#4385) 2025-01-03 05:04:49 +01:00
Dag
7c6d4a932c fix: upgrade hardcoded version number, fix #4382 (#4384) 2025-01-03 01:58:38 +01:00
Sebastian Wolf
45ee018a6e [MixologyBridge] add null checks for author and timestamp elements (#4383)
* [MixologyBridge] add null checks for author and timestamp elements

* [MixologyBridge] fix formatting
2025-01-03 01:43:39 +01:00
Dag
e825272987 fix(rumble): exterminate double leading slashes in item url (#4381)
Fixed for items with pub date newer than 31. jan 2025
2025-01-02 18:22:47 +01:00
Niehztog
97eebfb562 [BlizzardNewsBridge] fix BlizzardNewsBridge (#4379)
* fix BlizzardNewsBridge

* fix linter warnings

* fix linter warnings

* fix linter warnings
2025-01-02 17:44:36 +01:00
mruac
2a44a006b2 Update BlueskyBridge.php (#4367)
* Update BlueskyBridge.php

* Used human readable terms
* Include quote and reply post
* Added video support
* Replaced Youtube embed with thumbnail preview
* Added link embed preview
* Included visible alt text to images

* appease the lint

* remove unused test code

* fix unset displayName

* appease the lint
2025-01-02 17:39:07 +01:00
Sebastian Wolf
974f00cd6a [MixologyBridge] adapt to latest site changes (#4368)
* [MixologyBridge] adapt to latest site changes

* [MixologyBridge] fix category selector
2025-01-02 17:17:54 +01:00
Quentin B.
4b4d622333 [CentreFranceBridge] Update parser to handle latest website layout changes (#4372) 2025-01-02 17:14:10 +01:00
Florent V.
b4a63e7040 [EdfPrices Bridge] add HC/HP, base and EJP (#4369)
* [EdfPrices Bridge] add HC/HP, base and EJP

* [EdfPrices Bridge] lint

* [EdfPrices Bridge] fix missing variable
2025-01-02 16:45:33 +01:00
Dag
7d544f1fab feat(reddit): support video (#4380) 2025-01-02 16:33:56 +01:00
256 changed files with 7429 additions and 2745 deletions

View File

@@ -1,8 +1,21 @@
FROM rssbridge/rss-bridge:latest
RUN apt-get update && \
apt-get install --yes --no-install-recommends \
git && \
pecl install xdebug && \
pear install PHP_CodeSniffer && \
docker-php-ext-enable xdebug
COPY --chmod=755 post-create-command.sh /usr/local/bin/post-create-command
ADD https://raw.githubusercontent.com/docker-library/php/master/docker-php-ext-enable /usr/local/bin/docker-php-ext-enable
RUN chmod u+x /usr/local/bin/docker-php-ext-enable
ADD https://getcomposer.org/installer /usr/local/bin/composer-installer.php
RUN chmod u+x /usr/local/bin/composer-installer.php
RUN php /usr/local/bin/composer-installer.php --check && \
php /usr/local/bin/composer-installer.php --filename=composer --install-dir=/usr/local/bin
RUN apt-get update && \
apt-get install -y \
git \
php-dev \
make \
unzip
RUN pecl install xdebug && \
PHP_INI_DIR=/etc/php/8.2/fpm docker-php-ext-enable xdebug

View File

@@ -6,9 +6,9 @@
"vscode": {
// Set *default* container specific settings.json values on container create.
"settings": {
"php.validate.executablePath": "/usr/local/bin/php",
"phpSniffer.executablesFolder": "/usr/local/bin/",
"phpcs.executablePath": "/usr/local/bin/phpcs",
"php.validate.executablePath": "/usr/bin/php",
"phpSniffer.executablesFolder": "/root/.config/composer/vendor/bin",
"phpcs.executablePath": "/root/.config/composer/vendor/bin/phpcs",
"phpcs.lintOnType": false
},
@@ -22,6 +22,9 @@
]
}
},
"remoteEnv": {
"PATH": "${containerEnv:PATH}:/root/.config/composer/vendor/bin",
},
"forwardPorts": [3100, 9000, 9003],
"postCreateCommand": "cp .devcontainer/nginx.conf /etc/nginx/conf.d/default.conf && cp .devcontainer/xdebug.ini /usr/local/etc/php/conf.d/xdebug.ini && mkdir .vscode && cp .devcontainer/launch.json .vscode && echo '*' > whitelist.txt && chmod a+x \"$(pwd)\" && rm -rf /var/www/html && ln -s \"$(pwd)\" /var/www/html && nginx && php-fpm -D"
"postCreateCommand": "/usr/local/bin/post-create-command"
}

View File

@@ -9,7 +9,8 @@
"type": "php",
"request": "launch",
"port": 9003,
"auto": true
"auto": true,
"log": true
},
{
"name": "Launch currently open script",

View File

@@ -0,0 +1,27 @@
#!/bin/sh
cp .devcontainer/nginx.conf /etc/nginx/conf.d/default.conf
cp .devcontainer/xdebug.ini /etc/php/8.2/fpm/conf.d/xdebug.ini
# This should download some dev-dependencies, like phpunit and the PHP code sniffers
composer global require "phpunit/phpunit:^9"
composer global require "squizlabs/php_codesniffer:^3.6"
composer global require "phpcompatibility/php-compatibility:^9.3"
# We need to do this manually for running the PHPCompatibility ruleset
phpcs --config-set installed_paths /root/.config/composer/vendor/phpcompatibility/php-compatibility
mkdir -p .vscode
cp .devcontainer/launch.json .vscode
echo '*' > whitelist.txt
chmod a+x "$(pwd)"
rm -rf /var/www/html
ln -s "$(pwd)" /var/www/html
# Solves possible issue of cache directory not being accessible
chown www-data:www-data -R "$(pwd)/cache"
nginx
php-fpm8.2 -D

View File

@@ -49,9 +49,9 @@ Please describe what you expect from the bridge. Whenever possible provide sampl
- _Default limit_: 5
- [ ] Load full articles
- _Cache articles_ (articles are stored in a local cache on first request): yes
- _Cache timeout_ (max = 24 hours): 24 hours
- _Cache timeout_ : 24 hours
- [X] Balance requests (RSS-Bridge uses cached versions to reduce bandwidth usage)
- _Timeout_ (default = 5 minutes, max = 24 hours): 5 minutes
- _Timeout_ (default = 5 minutes): 5 minutes
<!--Be aware that some options might not be available for your specific request due to technical limitations!-->

.github/prtester.py
View File

@@ -21,13 +21,10 @@ class Instance:
name = ''
url = ''
def main(instances: Iterable[Instance], with_upload: bool, with_reduced_upload: bool, title: str, output_file: str):
def main(instances: Iterable[Instance], with_artifacts: bool, with_reduced_artifacts: bool, artifacts_directory: str, artifacts_base_url: str, title: str, output_file: str):
start_date = datetime.now()
prid = os.getenv('PR')
artifact_base_url = f'https://rss-bridge.github.io/rss-bridge-tests/prs/{prid}'
artifact_directory = os.getcwd()
for file in glob.glob(f'*{ARTIFACT_FILE_EXTENSION}', root_dir=artifact_directory):
for file in glob.glob(f'*{ARTIFACT_FILE_EXTENSION}', root_dir=artifacts_directory):
os.remove(file)
table_rows = []
@@ -38,10 +35,10 @@ def main(instances: Iterable[Instance], with_upload: bool, with_reduced_upload:
table_rows += testBridges(
instance=instance,
bridge_cards=bridge_cards,
with_upload=with_upload,
with_reduced_upload=with_reduced_upload,
artifact_directory=artifact_directory,
artifact_base_url=artifact_base_url) # run the main scraping code with the list of bridges
with_artifacts=with_artifacts,
with_reduced_artifacts=with_reduced_artifacts,
artifacts_directory=artifacts_directory,
artifacts_base_url=artifacts_base_url) # run the main scraping code with the list of bridges
with open(file=output_file, mode='w+', encoding='utf-8') as file:
table_rows_value = '\n'.join(sorted(table_rows))
file.write(f'''
@@ -53,7 +50,7 @@ def main(instances: Iterable[Instance], with_upload: bool, with_reduced_upload:
*last change: {start_date.strftime("%A %Y-%m-%d %H:%M:%S")}*
'''.strip())
def testBridges(instance: Instance, bridge_cards: Iterable, with_upload: bool, with_reduced_upload: bool, artifact_directory: str, artifact_base_url: str) -> Iterable:
def testBridges(instance: Instance, bridge_cards: Iterable, with_artifacts: bool, with_reduced_artifacts: bool, artifacts_directory: str, artifacts_base_url: str) -> Iterable:
instance_suffix = ''
if instance.name:
instance_suffix = f' ({instance.name})'
@@ -155,12 +152,12 @@ def testBridges(instance: Instance, bridge_cards: Iterable, with_upload: bool, w
status_is_ok = status == '';
if status_is_ok:
status = '✔️'
if with_upload and (not with_reduced_upload or not status_is_ok):
if with_artifacts and (not with_reduced_artifacts or not status_is_ok):
filename = f'{bridge_name} {form_number}{instance_suffix}{ARTIFACT_FILE_EXTENSION}'
filename = re.sub(r'[^a-z0-9 \_\-\.]', '', filename, flags=re.I).replace(' ', '_')
with open(file=f'{artifact_directory}/{filename}', mode='wb') as file:
with open(file=f'{artifacts_directory}/{filename}', mode='wb') as file:
file.write(page_text)
artifact_url = f'{artifact_base_url}/{filename}'
artifact_url = f'{artifacts_base_url}/{filename}'
table_rows.append(f'| {bridge_name} | [{form_number} {context_name}{instance_suffix}]({artifact_url}) | {status} |')
form_number += 1
return table_rows
@@ -177,8 +174,10 @@ def getFirstLine(value: str) -> str:
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--instances', nargs='+')
parser.add_argument('--no-upload', action='store_true')
parser.add_argument('--reduced-upload', action='store_true')
parser.add_argument('--no-artifacts', action='store_true')
parser.add_argument('--reduced-artifacts', action='store_true')
parser.add_argument('--artifacts-directory', default=os.getcwd())
parser.add_argument('--artifacts-base-url', default='')
parser.add_argument('--title', default='Pull request artifacts')
parser.add_argument('--output-file', default=os.getcwd() + '/comment.txt')
args = parser.parse_args()
@@ -201,8 +200,10 @@ if __name__ == '__main__':
instances.append(instance)
main(
instances=instances,
with_upload=not args.no_upload,
with_reduced_upload=args.reduced_upload and not args.no_upload,
with_artifacts=not args.no_artifacts,
with_reduced_artifacts=args.reduced_artifacts and not args.no_artifacts,
artifacts_directory=args.artifacts_directory,
artifacts_base_url=args.artifacts_base_url,
title=args.title,
output_file=args.output_file
);
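The renamed flags decide whether a result page is written to disk and what its file name is. The following is a minimal sketch of that decision and of the file-name sanitising step, lifted from the hunks above. The helper names `should_write_artifact` and `artifact_filename` are hypothetical, and the value of `ARTIFACT_FILE_EXTENSION` is an assumption because its definition lies outside this diff.

```python
import re

# Assumed value; the real constant is defined outside the hunks shown here.
ARTIFACT_FILE_EXTENSION = '.html'


def should_write_artifact(with_artifacts: bool,
                          with_reduced_artifacts: bool,
                          status_is_ok: bool) -> bool:
    """Mirror the condition guarding the file write in testBridges()."""
    # Reduced mode keeps only the pages of bridges that did not pass.
    return with_artifacts and (not with_reduced_artifacts or not status_is_ok)


def artifact_filename(bridge_name: str, form_number: int,
                      instance_suffix: str = '') -> str:
    """Build the artifact file name the same way the diff does."""
    filename = f'{bridge_name} {form_number}{instance_suffix}{ARTIFACT_FILE_EXTENSION}'
    # Drop everything except letters, digits, space, '_', '-' and '.',
    # then turn the remaining spaces into underscores.
    return re.sub(r'[^a-z0-9 \_\-\.]', '', filename, flags=re.I).replace(' ', '_')
```

With `--reduced-artifacts`, passing bridges produce no file, so the table link for those rows may point at a page that was never written.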

View File

@@ -8,7 +8,7 @@ on:
jobs:
phpcs:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
strategy:
matrix:
php-versions: ['7.4']
@@ -21,7 +21,7 @@ jobs:
- run: phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p
phpcompatibility:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
strategy:
matrix:
php-versions: ['7.4']
@@ -36,7 +36,7 @@ jobs:
- run: ~/.composer/vendor/bin/phpcs . --standard=phpcompatibility.xml --warning-severity=0 --extensions=php -p
executable_php_files_check:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
- run: |

View File

@@ -5,24 +5,29 @@ on:
branches: [ master ]
jobs:
check-bridges:
checks:
name: Check if bridges were changed
runs-on: ubuntu-latest
outputs:
BRIDGES: ${{ steps.check1.outputs.BRIDGES }}
BRIDGES: ${{ steps.check_bridges.outputs.BRIDGES }}
WITH_UPLOAD: ${{ steps.check_upload.outputs.WITH_UPLOAD }}
steps:
- name: Check number of bridges
id: check1
id: check_bridges
run: |
PR=${{github.event.number}};
PR=${{ github.event.number || 'none' }};
wget https://patch-diff.githubusercontent.com/raw/$GITHUB_REPOSITORY/pull/$PR.patch;
bridgeamount=$(cat $PR.patch | grep "\bbridges/[A-Za-z0-9]*Bridge\.php\b" | sed "s=.*\bbridges/\([A-Za-z0-9]*\)Bridge\.php\b.*=\1=g" | sort | uniq | wc -l);
echo "BRIDGES=$bridgeamount" >> "$GITHUB_OUTPUT"
- name: "Check upload token secret RSSTESTER_ACTION is set"
id: check_upload
run: |
echo "WITH_UPLOAD=$([ -n "${{ secrets.RSSTESTER_ACTION }}" ] && echo "true" || echo "false")" >> "$GITHUB_OUTPUT"
test-pr:
name: Generate HTML
runs-on: ubuntu-latest
needs: check-bridges
if: needs.check-bridges.outputs.BRIDGES > 0
needs: checks
if: needs.checks.outputs.BRIDGES > 0
env:
PYTHONUNBUFFERED: 1
# Needs additional permissions https://github.com/actions/first-interaction/issues/10#issuecomment-1041402989
@@ -34,7 +39,7 @@ jobs:
repository: ${{github.event.pull_request.head.repo.full_name}}
- name: Check out rss-bridge
run: |
PR=${{github.event.number}};
PR=${{ github.event.number || 'none' }};
wget -O requirements.txt https://raw.githubusercontent.com/$GITHUB_REPOSITORY/${{ github.event.pull_request.base.ref }}/.github/prtester-requirements.txt;
wget https://raw.githubusercontent.com/$GITHUB_REPOSITORY/${{ github.event.pull_request.base.ref }}/.github/prtester.py;
wget https://patch-diff.githubusercontent.com/raw/$GITHUB_REPOSITORY/pull/$PR.patch;
@@ -60,14 +65,12 @@ jobs:
id: testrun
run: |
mkdir results;
python prtester.py;
python prtester.py --artifacts-base-url "https://${{ github.repository_owner }}.github.io/${{ vars.ARTIFACTS_REPO || 'rss-bridge-tests' }}/prs/${{ github.event.number || 'none' }}";
body="$(cat comment.txt)";
body="${body//'%'/'%25'}";
body="${body//$'\n'/'%0A'}";
body="${body//$'\r'/'%0D'}";
echo "bodylength=${#body}" >> $GITHUB_OUTPUT
env:
PR: ${{ github.event.number }}
- name: Upload generated tests
uses: actions/upload-artifact@v4
id: upload-generated-tests
@@ -94,33 +97,31 @@ jobs:
name: Upload tests
runs-on: ubuntu-latest
needs: [checks, test-pr]
if: needs.checks.outputs.WITH_UPLOAD == 'true'
steps:
- uses: actions/checkout@v4
with:
repository: 'RSS-Bridge/rss-bridge-tests'
repository: "${{ github.repository_owner }}/${{ vars.ARTIFACTS_REPO || 'rss-bridge-tests' }}"
ref: 'main'
token: ${{ secrets.RSSTESTER_ACTION }}
- name: Setup git config
run: |
git config --global user.name "GitHub Actions"
git config --global user.email "<>"
- name: Download tests
uses: actions/download-artifact@v4
with:
name: tests
- name: Move tests
run: |
cd prs
mkdir -p ${{github.event.number}}
cd ${{github.event.number}}
DIRECTORY="$GITHUB_WORKSPACE/prs/${{ github.event.number || 'none' }}"
rm -rf $DIRECTORY
mkdir -p $DIRECTORY
cd $DIRECTORY
mv -f $GITHUB_WORKSPACE/*.html .
- name: Commit and push generated tests
run: |
export COMMIT_MESSAGE="Added tests for PR ${{github.event.number}}"
export COMMIT_MESSAGE="Added tests for PR ${{ github.event.number || 'none' }}"
git add .
git commit -m "$COMMIT_MESSAGE"
git commit -m "$COMMIT_MESSAGE" || exit 0
git push
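The `check_bridges` step counts how many distinct bridges a pull request touches by filtering the patch with `grep`, `sed`, `sort` and `uniq`. A rough Python equivalent, as a sketch only: the function name `count_changed_bridges` is hypothetical, and it assumes the patch text is already available as a string rather than downloaded with `wget`.

```python
import re

# Same pattern as the grep/sed pair in the workflow step.
BRIDGE_PATH = re.compile(r'\bbridges/([A-Za-z0-9]*)Bridge\.php\b')


def count_changed_bridges(patch_text: str) -> int:
    """Count distinct bridge names mentioned in a patch."""
    names = set()
    for line in patch_text.splitlines():
        matches = BRIDGE_PATH.findall(line)
        if matches:
            # sed's greedy leading '.*' keeps only the last match on a line.
            names.add(matches[-1])
    return len(names)
```

The `test-pr` job only runs when this count is greater than zero, so pull requests that change no bridge files skip HTML generation entirely.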

View File

@@ -8,10 +8,10 @@ on:
jobs:
phpunit8:
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
strategy:
matrix:
php-versions: ['7.4', '8.0', '8.1']
php-versions: ['7.4', '8.0', '8.1', '8.2', '8.3', '8.4']
steps:
- uses: actions/checkout@v4
- uses: shivammathur/setup-php@v2

View File

@@ -15,7 +15,7 @@
* [Astalaseven](https://github.com/Astalaseven)
* [Astyan-42](https://github.com/Astyan-42)
* [austinhuang0131](https://github.com/austinhuang0131)
* [AxorPL](https://github.com/AxorPL)
* [axor-mst](https://github.com/axor-mst)
* [ayacoo](https://github.com/ayacoo)
* [az5he6ch](https://github.com/az5he6ch)
* [b1nj](https://github.com/b1nj)
@@ -23,6 +23,7 @@
* [Binnette](https://github.com/Binnette)
* [BoboTiG](https://github.com/BoboTiG)
* [Bockiii](https://github.com/Bockiii)
* [brtsos](https://github.com/brtsos)
* [captn3m0](https://github.com/captn3m0)
* [chemel](https://github.com/chemel)
* [Chouchen](https://github.com/Chouchen)

View File

@@ -25,36 +25,39 @@ RUN set -xe && \
# php-zlib is enabled by default with PHP 8.2 in Debian 12
# for downloading libcurl-impersonate
curl \
# for patching libcurl-impersonate
patchelf \
&& \
# install curl-impersonate library
curlimpersonate_version=0.6.0 && \
curlimpersonate_version=1.0.0rc2 && \
{ \
{ \
[ $(arch) = 'aarch64' ] && \
archive="libcurl-impersonate-v${curlimpersonate_version}.aarch64-linux-gnu.tar.gz" && \
sha512sum="d04b1eabe71f3af06aa1ce99b39a49c5e1d33b636acedcd9fad163bc58156af5c3eb3f75aa706f335515791f7b9c7a6c40ffdfa47430796483ecef929abd905d" \
sha512sum="c8add80e7a0430a074edea1a11f73d03044c48e848e164af2d6f362866623e29bede207a50f18f95b1bc5ab3d33f5c31408be60a6da66b74a0d176eebe299116" \
; } \
|| { \
[ $(arch) = 'armv7l' ] && \
archive="libcurl-impersonate-v${curlimpersonate_version}.arm-linux-gnueabihf.tar.gz" && \
sha512sum="05906b4efa1a6ed8f3b716fd83d476b6eea6bfc68e3dbc5212d65a2962dcaa7bd1f938c9096a7535252b11d1d08fb93adccc633585ff8cb8cec5e58bfe969bc9" \
sha512sum="d0403ca4ad55a8d499b120e5675c7b5a0dc4946af49c933e91fc24455ffe5e122aa21ee95554612ff5d1bd6faea1556e1e1b9c821918e2644cc9bcbddc05747a" \
; } \
|| { \
[ $(arch) = 'x86_64' ] && \
archive="libcurl-impersonate-v${curlimpersonate_version}.x86_64-linux-gnu.tar.gz" && \
sha512sum="480bbe9452cd9aff2c0daaaf91f1057b3a96385f79011628a9237223757a9b0d090c59cb5982dc54ea0d07191657299ea91ca170a25ced3d7d410fcdff130ace" \
sha512sum="35cafda2b96df3218a6d8545e0947a899837ede51c90f7ef2980bd2d99dbd67199bc620000df28b186727300b8c7046d506807fb48ee0fbc068dc8ae01986339" \
; } \
} && \
curl -LO "https://github.com/lwthiker/curl-impersonate/releases/download/v${curlimpersonate_version}/${archive}" && \
curl -LO "https://github.com/lexiforest/curl-impersonate/releases/download/v${curlimpersonate_version}/${archive}" && \
echo "$sha512sum $archive" | sha512sum -c - && \
mkdir -p /usr/local/lib/curl-impersonate && \
tar xaf "$archive" -C /usr/local/lib/curl-impersonate --wildcards 'libcurl-impersonate-ff.so*' && \
tar xaf "$archive" -C /usr/local/lib/curl-impersonate && \
patchelf --set-soname libcurl.so.4 /usr/local/lib/curl-impersonate/libcurl-impersonate.so && \
rm "$archive" && \
apt-get purge --assume-yes curl && \
apt-get purge --assume-yes curl patchelf && \
rm -rf /var/lib/apt/lists/*
ENV LD_PRELOAD /usr/local/lib/curl-impersonate/libcurl-impersonate-ff.so
ENV CURL_IMPERSONATE ff91esr
ENV LD_PRELOAD /usr/local/lib/curl-impersonate/libcurl-impersonate.so
ENV CURL_IMPERSONATE chrome131
# logs should go to stdout / stderr
RUN ln -sfT /dev/stderr /var/log/nginx/error.log; \
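The Dockerfile pins each `libcurl-impersonate` archive to a SHA-512 digest and aborts the build if `sha512sum -c` fails. The same check can be sketched in Python as below. This is an illustration under stated assumptions, not part of the image build: the helper name `sha512_matches` is hypothetical.

```python
import hashlib


def sha512_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file at `path` has the expected SHA-512 digest."""
    digest = hashlib.sha512()
    with open(path, 'rb') as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

Because the digests are per architecture, bumping `curlimpersonate_version` requires replacing all three values, as this hunk does.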

View File

@@ -29,7 +29,7 @@ Requires minimum PHP 7.4.
|![Screenshot #3](/static/screenshot-3.png?raw=true)|![Screenshot #4](/static/screenshot-4.png?raw=true)|
|![Screenshot #5](/static/screenshot-5.png?raw=true)|![Screenshot #6](/static/screenshot-6.png?raw=true)|
## A subset of bridges (16/447)
## A subset of bridges (15/447)
* `CssSelectorBridge`: [Scrape out a feed using CSS selectors](https://rss-bridge.org/bridge01/#bridge-CssSelectorBridge)
* `FeedMergeBridge`: [Combine multiple feeds into one](https://rss-bridge.org/bridge01/#bridge-FeedMergeBridge)
@@ -44,10 +44,9 @@ Requires minimum PHP 7.4.
* `ThePirateBayBridge`: [Fetches torrents by search/user/category](https://rss-bridge.org/bridge01/#bridge-ThePirateBayBridge)
* `TikTokBridge`: [Fetches posts by username](https://rss-bridge.org/bridge01/#bridge-TikTokBridge)
* `TwitchBridge`: [Fetches videos from channel](https://rss-bridge.org/bridge01/#bridge-TwitchBridge)
* `VkBridge`: [Fetches posts from user/group](https://rss-bridge.org/bridge01/#bridge-VkBridge)
* `XPathBridge`: [Scrape out a feed using XPath expressions](https://rss-bridge.org/bridge01/#bridge-XPathBridge)
* `YoutubeBridge`: [Fetches videos by username/channel/playlist/search](https://rss-bridge.org/bridge01/#bridge-YoutubeBridge)
* `YouTubeCommunityTabBridge`: [Fetches posts from a channel's community tab](https://rss-bridge.org/bridge01/#bridge-YouTubeCommunityTabBridge)
* `YouTubeCommunityTabBridge`: [Fetches posts from a channel's Posts tab](https://rss-bridge.org/bridge01/#bridge-YouTubeCommunityTabBridge)
## Tutorial
@@ -72,27 +71,27 @@ useradd --shell /bin/bash --create-home rss-bridge
cd /var/www
# Create folder and change ownership
# Create folder and change its ownership to rss-bridge
mkdir rss-bridge && chown rss-bridge:rss-bridge rss-bridge/
# Become user
# Become rss-bridge
su rss-bridge
# Fetch latest master
# Clone master branch into existing folder
git clone https://github.com/RSS-Bridge/rss-bridge.git rss-bridge/
cd rss-bridge
# Copy over the default config
# Copy over the default config (OPTIONAL)
cp -v config.default.ini.php config.ini.php
# Give full permissions only to owner (rss-bridge)
chmod 700 -R ./
# Recursively give full permissions to user/owner
chmod 700 --recursive ./
# Give read and execute to others (nginx and php-fpm)
# Give read and execute to others on folder ./static
chmod o+rx ./ ./static
# Give read to others (nginx)
chmod o+r -R ./static
# Recursively give read to others on folder ./static
chmod o+r --recursive ./static
```
Nginx config:
@@ -110,17 +109,14 @@ server {
error_log /var/log/nginx/rss-bridge.error.log;
log_not_found off;
# Intentionally not setting a root folder here
# autoindex is off by default but feels good to explicitly turn off
autoindex off;
# Intentionally not setting a root folder
# Static content only served here
location /static/ {
alias /var/www/rss-bridge/static/;
}
# Pass off to php-fpm when location is exactly /
# Pass off to php-fpm only when location is EXACTLY == /
location = / {
root /var/www/rss-bridge/;
include snippets/fastcgi-php.conf;
@@ -128,12 +124,12 @@ server {
fastcgi_pass unix:/run/php/rss-bridge.sock;
}
# Reduce spam
# Reduce log noise
location = /favicon.ico {
access_log off;
}
# Reduce spam
# Reduce log noise
location = /robots.txt {
access_log off;
}
@@ -154,11 +150,11 @@ listen = /run/php/rss-bridge.sock
listen.owner = www-data
listen.group = www-data
# Create 10 workers standing by to serve requests
; Create 10 workers standing by to serve requests
pm = static
pm.max_children = 10
# Respawn worker after 500 requests (workaround for memory leaks etc.)
; Respawn worker after 500 requests (workaround for memory leaks etc.)
pm.max_requests = 500
```
@@ -325,13 +321,23 @@ The sqlite files (db, wal and shm) are not writeable.
rm cache/*
### How to create a new bridge from scratch
### How to create a completely new bridge
New code files MUST have `declare(strict_types=1);` at the top of file:
```php
<?php
declare(strict_types=1);
```
Create the new bridge in e.g. `bridges/BearBlogBridge.php`:
```php
<?php
declare(strict_types=1);
class BearBlogBridge extends BridgeAbstract
{
const NAME = 'BearBlog (bearblog.dev)';
@@ -363,14 +369,6 @@ enabled_bridges[] = TwitchBridge
enabled_bridges[] = GettrBridge
```
### How to enable debug mode
The
[debug mode](https://rss-bridge.github.io/rss-bridge/For_Developers/Debug_mode.html)
disables the majority of caching operations.
enable_debug_mode = true
### How to switch to memcached as cache backend
```
@@ -464,7 +462,6 @@ See [CONTRIBUTORS.md](CONTRIBUTORS.md)
RSS-Bridge uses caching to prevent services from banning your server for repeatedly updating feeds.
The specific cache duration can be different between bridges.
Cached files are deleted automatically after 24 hours.
RSS-Bridge allows you to take full control over which bridges are displayed to the user.
That way you can host your own RSS-Bridge service with your favorite collection of bridges!

View File

@@ -22,8 +22,8 @@ class ConnectivityAction implements ActionInterface
public function __invoke(Request $request): Response
{
if (!Debug::isEnabled()) {
return new Response('This action is only available in debug mode!', 403);
if (Configuration::getConfig('system', 'env') !== 'dev') {
return new Response('This action is only available in dev environment!', 403);
}
$bridgeName = $request->get('bridge');

View File

@@ -23,7 +23,7 @@ class DisplayAction implements ActionInterface
$noproxy = $request->get('_noproxy');
if (!$bridgeName) {
return new Response(render(__DIR__ . '/../templates/error.html.php', ['message' => 'Missing bridge parameter']), 400);
return new Response(render(__DIR__ . '/../templates/error.html.php', ['message' => 'Missing bridge name parameter']), 400);
}
$bridgeClassName = $this->bridgeFactory->createBridgeClassName($bridgeName);
if (!$bridgeClassName) {
@@ -89,12 +89,12 @@ class DisplayAction implements ActionInterface
$bridge->collectData();
$items = $bridge->getItems();
} catch (\Throwable $e) {
if ($e instanceof RateLimitException) {
// These are internally generated by bridges
$this->logger->info(sprintf('RateLimitException in DisplayAction(%s): %s', $bridge->getShortName(), create_sane_exception_message($e)));
if ($e instanceof ClientException) {
$this->logger->debug(sprintf('Exception in DisplayAction(%s): %s', $bridge->getShortName(), create_sane_exception_message($e)));
} elseif ($e instanceof RateLimitException) {
$this->logger->debug(sprintf('Exception in DisplayAction(%s): %s', $bridge->getShortName(), create_sane_exception_message($e)));
return new Response(render(__DIR__ . '/../templates/exception.html.php', ['e' => $e]), 429);
}
if ($e instanceof HttpException) {
} elseif ($e instanceof HttpException) {
if (in_array($e->getCode(), [429, 503])) {
// Log with debug, immediately reproduce and return
$this->logger->debug(sprintf('Exception in DisplayAction(%s): %s', $bridge->getShortName(), create_sane_exception_message($e)));
@@ -102,7 +102,6 @@ class DisplayAction implements ActionInterface
}
// Some other status code which we let fail normally (but don't log it)
} else {
// Log error if it's not an HttpException
$this->logger->error(sprintf('Exception in DisplayAction(%s)', $bridge->getShortName()), ['e' => $e]);
}
$errorOutput = Configuration::getConfig('error', 'output');

View File

@@ -12,7 +12,7 @@ final class FrontpageAction implements ActionInterface
public function __invoke(Request $request): Response
{
$token = $request->attribute('token');
$token = $request->getAttribute('token');
$messages = [];
$activeBridges = 0;

View File

@@ -27,6 +27,13 @@ class AO3Bridge extends BridgeAbstract
'Entire work' => 'all',
],
],
'unique' => [
'name' => 'Make separate entries for new fic chapters',
'type' => 'checkbox',
'required' => false,
'title' => 'Make separate entries for new fic chapters',
'defaultValue' => 'checked',
],
'limit' => self::LIMIT,
],
'Bookmarks' => [
@@ -118,7 +125,12 @@ class AO3Bridge extends BridgeAbstract
$chapters = $element->find('dl dd.chapters', 0);
// bookmarked series and external works do not have a chapters count
$chapters = (isset($chapters) ? $chapters->plaintext : 0);
$item['uid'] = $item['uri'] . "/$strdate/$chapters";
if ($this->getInput('unique')) {
$item['uid'] = $item['uri'] . "/$strdate/$chapters";
} else {
$item['uid'] = $item['uri'];
}
// Fetch workskin of desired chapter(s) in list
if ($this->getInput('range') && ($limit == 0 || $count++ < $limit)) {
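The new `unique` checkbox controls how the item `uid` is built. A feed reader typically treats an item with an unseen `uid` as new, so including the date and chapter count makes every chapter update surface as a separate entry. A minimal Python sketch of that choice follows; the function name `make_uid` is hypothetical, and the bridge itself is PHP.

```python
def make_uid(uri: str, strdate: str, chapters: str, unique: bool) -> str:
    """Mirror the uid selection added to AO3Bridge."""
    if unique:
        # Changes whenever the update date or chapter count changes.
        return f'{uri}/{strdate}/{chapters}'
    # Stable per work: later chapters reuse the same uid.
    return uri
```

With `unique` unchecked, the `uid` stays constant for a work, so a reader that deduplicates on it should keep a single entry per work.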

View File

@@ -71,7 +71,7 @@ class ARDAudiothekBridge extends BridgeAbstract
$pathComponents = explode('/', $path);
if (empty($pathComponents)) {
returnClientError('Path may not be empty');
throwClientException('Path may not be empty');
}
if (count($pathComponents) < 2) {
$showID = $pathComponents[0];

View File

@@ -65,7 +65,7 @@ class ARDMediathekBridge extends BridgeAbstract
$pathComponents = explode('/', $this->getInput('path'));
if (empty($pathComponents)) {
returnClientError('Path may not be empty');
throwClientException('Path may not be empty');
}
if (count($pathComponents) < 2) {
$showID = $pathComponents[0];

View File

@@ -32,8 +32,7 @@ class AirBreizhBridge extends BridgeAbstract
public function collectData()
{
$html = '';
$html = getSimpleHTMLDOM(static::URI . 'publications/?fwp_publications_thematiques=' . $this->getInput('theme'))
or returnClientError('No results for this query.');
$html = getSimpleHTMLDOM(static::URI . 'publications/?fwp_publications_thematiques=' . $this->getInput('theme'));
foreach ($html->find('article') as $article) {
$item = [];

View File

@@ -15,8 +15,8 @@ class AllegroBridge extends BridgeAbstract
],
'cookie' => [
'name' => 'The complete cookie value',
'title' => 'Paste the value of the cookie value from your browser if you want to prevent Allegro imposing rate limits',
'required' => false,
'title' => 'Paste the cookie value from your browser, otherwise 403 gets returned',
'required' => true,
],
'includeSponsoredOffers' => [
'type' => 'checkbox',
@@ -65,93 +65,56 @@ class AllegroBridge extends BridgeAbstract
$url = preg_replace('/([?&])order=[^&]+(&|$)/', '$1', $this->getInput('url'));
$url .= (parse_url($url, PHP_URL_QUERY) ? '&' : '?') . 'order=n';
$opts = [];
$html = getContents($url, [], [CURLOPT_COOKIE => $this->getInput('cookie')]);
// If a cookie is provided
if ($cookie = $this->getInput('cookie')) {
$opts[CURLOPT_COOKIE] = $cookie;
$storeData = null;
if (preg_match('/<script[^>]*>\s*(\{\s*?"__listing_StoreState".*\})\s*<\/script>/i', $html, $match)) {
$data = json_decode($match[1], true);
$storeData = $data['__listing_StoreState'] ?? null;
}
$html = getSimpleHTMLDOM($url, [], $opts);
foreach ($storeData['items']['elements'] as $elements) {
if (!array_key_exists('offerId', $elements)) {
continue;
}
if (!$this->getInput('includeSponsoredOffers') && $elements['isSponsored']) {
continue;
}
if (!$this->getInput('includePromotedOffers') && $elements['promoted']) {
continue;
}
# if no results found
if ($html->find('.mzmg_6m.m9qz_yo._6a66d_-fJr5')) {
return;
}
$results = $html->find('article[data-analytics-view-custom-context="REGULAR"]');
if ($this->getInput('includeSponsoredOffers')) {
$results = array_merge($results, $html->find('article[data-analytics-view-custom-context="SPONSORED"]'));
}
if ($this->getInput('includePromotedOffers')) {
$results = array_merge($results, $html->find('article[data-analytics-view-custom-context="PROMOTED"]'));
}
foreach ($results as $post) {
$item = [];
$item['uid'] = $elements['offerId'];
$item['uri'] = $elements['url'];
$item['title'] = $elements['alt'];
$item['uid'] = $post->{'data-analytics-view-value'};
$item_link = $post->find('a[href*="' . $item['uid'] . '"], a[href*="allegrolokalnie"]', 0);
$item['uri'] = $item_link->href;
$item['title'] = $item_link->find('img', 0)->alt;
$image = $item_link->find('img', 0)->{'data-src'} ?: $item_link->find('img', 0)->src ?? false;
$image = $elements['photos'][0]['medium'];
if ($image) {
$item['enclosures'] = [$image . '#.image'];
}
$price = $post->{'data-analytics-view-json-custom-price'};
if ($price) {
$priceDecoded = json_decode(html_entity_decode($price));
$price = $priceDecoded->amount . ' ' . $priceDecoded->currency;
$price = $elements['price']['mainPrice']['amount'];
$currency = $elements['price']['mainPrice']['currency'];
$sellerType = $elements['seller']['title'];
$item['categories'] = [$sellerType];
$description = '';
foreach ($elements['parameters'] as $parameter) {
$item['categories'] = array_merge($item['categories'], $parameter['values']);
$description .= '<dt>' . $parameter['name'] . ': ' . implode(',', $parameter['values']) . '</dt>';
}
$descriptionPatterns = ['/<\s*dt[^>]*>\b/', '/<\/dt>/', '/<\s*dd[^>]*>\b/', '/<\/dd>/'];
$descriptionReplacements = ['<span>', ':</span> ', '<strong>', '&emsp;</strong> '];
$description = $post->find('.m7er_k4.mpof_5r.mpof_z0_s', 0)->innertext;
$descriptionPretty = preg_replace($descriptionPatterns, $descriptionReplacements, $description);
$pricingExtraInfo = array_filter($post->find('.mqu1_g3.mgn2_12'), function ($node) {
return empty($node->find('.mvrt_0'));
});
$pricingExtraInfo = $pricingExtraInfo[0]->plaintext ?? '';
$offerExtraInfo = array_map(function ($node) {
return str_contains($node->plaintext, 'zapłać później') ? '' : $node->outertext;
}, $post->find('div.mpof_ki.mwdn_1.mj7a_4.mgn2_12'));
$isSmart = $post->find('img[alt="Smart!"]', 0) ?? false;
if ($isSmart) {
$pricingExtraInfo .= $isSmart->outertext;
}
$item['categories'] = [];
$parameters = $post->find('dd');
foreach ($parameters as $parameter) {
if (in_array(strtolower($parameter->innertext), ['brak', 'nie'])) {
continue;
}
$item['categories'][] = $parameter->innertext;
}
$item['content'] = $descriptionPretty
. '<div><strong>'
. $price
. '</strong></div><div>'
. implode('</div><div>', $offerExtraInfo)
. '</div><dl>'
. $pricingExtraInfo
$item['content'] = '<div><strong>'
. $price . ' ' . $currency
. '</strong></div><dl><dt>'
. $sellerType . '</dt>'
. $description
. '</dl><hr>';
$this->items[] = $item;
}
}
}

View File

@@ -57,7 +57,7 @@ class AllocineFRBridge extends BridgeAbstract
if (array_key_exists($category, $categories)) {
return static::URI . $this->getLastSeasonURI($categories[$category]);
} else {
returnClientError('Emission inconnue');
throwClientException('Emission inconnue');
}
}

View File

@@ -2,7 +2,7 @@
class AmazonPriceTrackerBridge extends BridgeAbstract
{
const MAINTAINER = 'captn3m0, sal0max';
const MAINTAINER = 'captn3m0, sal0max, bagnacauda';
const NAME = 'Amazon Price Tracker';
const URI = 'https://www.amazon.com/';
const CACHE_TIMEOUT = 3600; // 1h
@@ -13,7 +13,7 @@ class AmazonPriceTrackerBridge extends BridgeAbstract
'asin' => [
'name' => 'ASIN',
'required' => true,
'exampleValue' => 'B071GB1VMQ',
'exampleValue' => 'B0923XT6K7',
// https://stackoverflow.com/a/12827734
'pattern' => 'B[\dA-Z]{9}|\d{9}(X|\d)',
],
@@ -146,7 +146,7 @@ EOT;
{
$uri = $this->getURI();
return getSimpleHTMLDOM($uri) ?: returnServerError('Could not request Amazon.');
return getSimpleHTMLDOM($uri);
}
private function scrapePriceFromMetrics($html)
@@ -169,19 +169,23 @@ EOT;
private function scrapePriceTwister($html)
{
$str = $html->find('.twister-plus-buying-options-price-data', 0);
$json = $html->find('.twister-plus-buying-options-price-data', 0);
if ($json == null) {
return null;
}
$data = json_decode($str->innertext, true);
if (count($data) === 1) {
$data = $data[0];
$data = json_decode($json->innertext, true);
foreach ($data as $key => $value) {
$value = $value[0];
return [
'displayPrice' => $data['displayPrice'],
'currency' => $data['currency'],
'shipping' => '0',
'displayPrice' => $value['displayPrice'],
'price' => $value['priceAmount'],
'currency' => $value['currencySymbol'],
'shipping' => null,
];
}
return false;
return null;
}
private function scrapePriceGeneric($html)
@@ -206,9 +210,21 @@ EOT;
}
$priceString = str_replace(str_split(self::WHITESPACE), '', $priceDiv->plaintext);
preg_match('/(\d+\.\d{0,2})/', $priceString, $matches);
$price = null;
$priceFound = false;
// find longest repeated string
for ($offset = 0; $offset < strlen($priceString); $offset++) {
for ($length = 1; substr_count($priceString, substr($priceString, $offset, $length + 1)) >= 2; $length++) {
$priceFound = true;
}
if ($priceFound) {
$price = substr($priceString, $offset, $length);
break;
}
}
$price = $matches[0] ?? null;
$currency = str_replace($price, '', $priceString);
if ($price != null && $currency != null) {
@@ -216,7 +232,7 @@ EOT;
'price' => $price,
'displayPrice' => null,
'currency' => $currency,
'shipping' => '0'
'shipping' => null
];
}
return $default;
@@ -227,7 +243,7 @@ EOT;
$html = $this->getHtml();
$this->title = $this->getTitle($html);
$image = $this->getImage($html);
$data = $this->scrapePriceGeneric($html);
$data = $this->scrapePriceTwister($html) ?? $this->scrapePriceGeneric($html);
// render
$content = '';
@@ -236,7 +252,7 @@ EOT;
$price = sprintf('%s %s', $data['price'], $data['currency']);
}
$content .= sprintf('%s<br>Price: %s', $image, $price);
if ($data['shipping'] !== '0') {
if ($data['shipping'] !== null) {
$content .= sprintf('<br>Shipping: %s %s</br>', $data['shipping'], $data['currency']);
}
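
The "find longest repeated string" loop in the hunk above can be sketched in isolation as follows. This is a minimal illustration, not the bridge's code: the function name `firstRepeatedRun` is hypothetical, and the extra bounds check (so the candidate never grows past the end of the string) is an assumption added for safety rather than something the diff contains.

```php
<?php
declare(strict_types=1);

/**
 * Scans left to right and returns the first substring of at least two
 * characters that occurs twice or more, grown for as long as it still
 * repeats. Returns null when no such substring exists.
 * (Hypothetical helper, for illustration only.)
 */
function firstRepeatedRun(string $text): ?string
{
    $total = strlen($text);
    for ($offset = 0; $offset < $total - 1; $offset++) {
        $length = 0;
        // Grow the candidate while it fits in the string and still occurs twice.
        while (
            $offset + $length + 1 <= $total
            && substr_count($text, substr($text, $offset, $length + 1)) >= 2
        ) {
            $length++;
        }
        if ($length >= 2) {
            return substr($text, $offset, $length);
        }
    }
    return null;
}

var_dump(firstRepeatedRun('$12.99$12.99'));
var_dump(firstRepeatedRun('abc'));
```

Note that `substr_count` counts non-overlapping occurrences, so a string such as `aa` is not treated as containing a repeated two-character run.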

View File

@@ -152,7 +152,7 @@ class AnidexBridge extends BridgeAbstract
}
}
if (empty($results) && empty($this->getInput('q'))) {
returnServerError('No results from Anidex: ' . $search_url);
throwServerException('No results from Anidex: ' . $search_url);
}
//Process each item individually

View File

@@ -67,7 +67,7 @@ class AnisearchBridge extends BridgeAbstract
$trailerlink = $domarticle->find('section#trailers > div > div.swiper > ul.swiper-wrapper > li.swiper-slide > a', 0);
if (isset($trailerlink)) {
$trailersite = getSimpleHTMLDOM($baseurl . $trailerlink->href);
$trailer = $trailersite->find('div#player > iframe', 0);
$trailer = $trailersite->find('div#video > iframe', 0);
$trailer = $trailer->{'data-xsrc'};
$ytlink = <<<EOT
<br /><iframe width="560" height="315" src="$trailer" title="YouTube video player"

View File

@@ -126,7 +126,7 @@ class AnnasArchiveBridge extends BridgeAbstract
return;
}
$elements = $list->find('.w-full > .mb-4 > div');
$elements = $list->find('#aarecord-list > div');
foreach ($elements as $element) {
// stop adding entries once the partial match list starts
if (str_contains($element->innertext, 'partial match')) {

View File

@@ -52,120 +52,183 @@ class AppleAppStoreBridge extends BridgeAbstract
],
'defaultValue' => 'US',
],
'debug' => [
'name' => 'Debug Mode',
'type' => 'checkbox',
'defaultValue' => false
]
]];
const PLATFORM_MAPPING = [
'iphone' => 'ios',
'ipad' => 'ios',
'iphone' => 'ios',
'ipad' => 'ios',
'mac' => 'osx'
];
private function makeHtmlUrl($id, $country)
private $name;
private function makeHtmlUrl()
{
return 'https://apps.apple.com/' . $country . '/app/id' . $id;
$id = $this->getInput('id');
$country = $this->getInput('country');
return sprintf('https://apps.apple.com/%s/app/id%s', $country, $id);
}
private function makeJsonUrl($id, $platform, $country)
{
return "https://amp-api.apps.apple.com/v1/catalog/$country/apps/$id?platform=$platform&extend=versionHistory";
}
public function getName()
{
if (isset($this->name)) {
return $this->name . ' - AppStore Updates';
}
return parent::getName();
}
/**
* In case of some platforms, the data is present in the initial response
*/
private function getDataFromShoebox($id, $platform, $country)
{
$uri = $this->makeHtmlUrl($id, $country);
$html = getSimpleHTMLDOMCached($uri, 3600);
$script = $html->find('script[id="shoebox-ember-data-store"]', 0);
$json = json_decode($script->innertext, true);
return $json['data'];
}
private function getJWTToken($id, $platform, $country)
{
$uri = $this->makeHtmlUrl($id, $country);
$html = getSimpleHTMLDOMCached($uri, 3600);
$meta = $html->find('meta[name="web-experience-app/config/environment"]', 0);
$json = urldecode($meta->content);
$json = json_decode($json);
return $json->MEDIA_API->token;
}
private function getAppData($id, $platform, $country, $token)
{
$uri = $this->makeJsonUrl($id, $platform, $country);
$headers = [
"Authorization: Bearer $token",
'Origin: https://apps.apple.com',
];
$json = json_decode(getContents($uri, $headers), true);
return $json['data'][0];
}
/**
* Parses the version history from the data received
* @return array list of versions with details on each element
*/
private function getVersionHistory($data, $platform)
{
switch ($platform) {
case 'mac':
return $data['relationships']['platforms']['data'][0]['attributes']['versionHistory'];
default:
$os = self::PLATFORM_MAPPING[$platform];
return $data['attributes']['platformAttributes'][$os]['versionHistory'];
}
}
public function collectData()
private function makeJsonUrl()
{
$id = $this->getInput('id');
$country = $this->getInput('country');
$platform = $this->getInput('p');
switch ($platform) {
case 'mac':
$data = $this->getDataFromShoebox($id, $platform, $country);
break;
$platform_param = ($platform === 'mac') ? 'mac' : $platform;
default:
$token = $this->getJWTToken($id, $platform, $country);
$data = $this->getAppData($id, $platform, $country, $token);
return sprintf(
'https://amp-api-edge.apps.apple.com/v1/catalog/%s/apps/%s?platform=%s&extend=versionHistory',
$country,
$id,
$platform_param
);
}
public function getName()
{
if (isset($this->name)) {
return sprintf('%s - AppStore Updates', $this->name);
}
$versionHistory = $this->getVersionHistory($data, $platform);
$name = $this->name = $data['attributes']['name'];
$author = $data['attributes']['artistName'];
return parent::getName();
}
private function debugLog($message)
{
if ($this->getInput('debug')) {
$this->logger->info(sprintf('[AppleAppStoreBridge] %s', $message));
}
}
private function getHtml()
{
$url = $this->makeHtmlUrl();
$this->debugLog(sprintf('Fetching HTML from: %s', $url));
return getSimpleHTMLDOM($url);
}
private function getJWTToken()
{
$html = $this->getHtml();
$meta = $html->find('meta[name="web-experience-app/config/environment"]', 0);
if (!$meta || !isset($meta->content)) {
throw new \Exception('JWT token not found in page content');
}
$decoded_content = urldecode($meta->content);
$this->debugLog('Found meta tag content');
try {
$decoded_json = Json::decode($decoded_content);
} catch (\Exception $e) {
throw new \Exception(sprintf('Failed to parse JSON from meta tag: %s', $e->getMessage()));
}
if (!isset($decoded_json['MEDIA_API']['token'])) {
throw new \Exception('Token field not found in JSON structure');
}
$token = $decoded_json['MEDIA_API']['token'];
$this->debugLog('Successfully extracted JWT token');
return $token;
}
private function getAppData()
{
$token = $this->getJWTToken();
$url = $this->makeJsonUrl();
$this->debugLog(sprintf('Fetching data from API: %s', $url));
$headers = [
'Authorization: Bearer ' . $token,
'Origin: https://apps.apple.com',
'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];
$content = getContents($url, $headers);
try {
$json = Json::decode($content);
} catch (\Exception $e) {
throw new \Exception(sprintf('Failed to parse API response: %s', $e->getMessage()));
}
if (!isset($json['data']) || empty($json['data'])) {
throw new \Exception('No app data found in API response');
}
$this->debugLog('Successfully retrieved app data from API');
return $json['data'][0];
}
private function extractAppDetails($data)
{
if (isset($data['attributes'])) {
$this->name = $data['attributes']['name'] ?? null;
$author = $data['attributes']['artistName'] ?? null;
$this->debugLog(sprintf('Found app details in attributes: %s by %s', $this->name, $author));
return [$this->name, $author];
}
// Fallback to default values if not found
$this->name = sprintf('App %s', $this->getInput('id'));
$this->debugLog(sprintf('App details not found, using default: %s', $this->name));
return [$this->name, 'Unknown Developer'];
}
private function getVersionHistory($data)
{
$platform = $this->getInput('p');
$this->debugLog(sprintf('Extracting version history for platform: %s', $platform));
// Get the mapped platform key (ios for iPhone/iPad, osx for Mac)
$platform_key = self::PLATFORM_MAPPING[$platform] ?? $platform;
$version_history = $data['attributes']['platformAttributes'][$platform_key]['versionHistory'] ?? [];
if (empty($version_history)) {
$this->debugLog(sprintf('No version history found for %s', $platform));
}
return $version_history;
}
public function collectData()
{
$this->debugLog(sprintf('Getting data for %s app', $this->getInput('p')));
$data = $this->getAppData();
// Get app name and author using array destructuring
[$name, $author] = $this->extractAppDetails($data);
// Get version history
$version_history = $this->getVersionHistory($data);
$this->debugLog(sprintf('Found %d versions for %s', count($version_history), $name));
foreach ($version_history as $entry) {
$version = $entry['versionDisplay'] ?? 'Unknown Version';
$release_notes = $entry['releaseNotes'] ?? 'No release notes available';
$release_date = $entry['releaseDate'] ?? 'Unknown Date';
foreach ($versionHistory as $row) {
$item = [];
$item['content'] = nl2br($row['releaseNotes']);
$item['title'] = $name . ' - ' . $row['versionDisplay'];
$item['timestamp'] = $row['releaseDate'];
$item['title'] = sprintf('%s - %s', $name, $version);
$item['content'] = nl2br($release_notes) ?: 'No release notes available';
$item['timestamp'] = $release_date;
$item['author'] = $author;
$item['uri'] = $this->makeHtmlUrl($id, $country);
$item['uri'] = $this->makeHtmlUrl();
$this->items[] = $item;
}
$this->debugLog(sprintf('Successfully collected %d items', count($this->items)));
}
}
}
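
The token lookup performed by `getJWTToken()` above (URL-decode the meta tag content, parse it as JSON, read `MEDIA_API.token`) can be sketched on its own. This is a rough sketch under assumptions: `extractMediaApiToken` is a hypothetical name, the sample payload is made up, and plain `json_decode` stands in for the `Json::decode` wrapper used in the diff.

```php
<?php
declare(strict_types=1);

/**
 * Extracts MEDIA_API.token from a URL-encoded JSON string.
 * (Hypothetical helper; the sample payload below is invented.)
 */
function extractMediaApiToken(string $encodedContent): string
{
    $decoded = json_decode(urldecode($encodedContent), true);
    if (!is_array($decoded)) {
        throw new \RuntimeException('Meta content is not valid JSON');
    }
    if (!isset($decoded['MEDIA_API']['token'])) {
        throw new \RuntimeException('Token field not found in JSON structure');
    }
    return $decoded['MEDIA_API']['token'];
}

// Made-up payload shaped like the structure the bridge expects.
$sample = rawurlencode('{"MEDIA_API":{"token":"abc.def.ghi"}}');
echo extractMediaApiToken($sample), PHP_EOL;
```

Checking each level separately, as the bridge does, keeps the error message specific to the step that failed.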

View File

@@ -71,7 +71,7 @@ class AppleMusicBridge extends BridgeAbstract
$result = $json->results;
if (!is_array($result) || count($result) == 0) {
returnServerError('There is no artist with id "' . $this->getInput('artist') . '".');
throwServerException('There is no artist with id "' . $this->getInput('artist') . '".');
}
return $result;

View File

@@ -1,80 +0,0 @@
<?php
class AskfmBridge extends BridgeAbstract
{
const MAINTAINER = 'az5he6ch, logmanoriginal';
const NAME = 'Ask.fm Answers';
const URI = 'https://ask.fm/';
const CACHE_TIMEOUT = 300; //5 min
const DESCRIPTION = 'Returns answers from an Ask.fm user';
const PARAMETERS = [
'Ask.fm username' => [
'u' => [
'name' => 'Username',
'required' => true,
'exampleValue' => 'ApprovedAndReal'
]
]
];
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI());
$html = defaultLinkTo($html, self::URI);
foreach ($html->find('article.streamItem-answer') as $element) {
$item = [];
$item['uri'] = $element->find('a.streamItem_meta', 0)->href;
$question = trim($element->find('header.streamItem_header', 0)->innertext);
$item['title'] = trim(
htmlspecialchars_decode(
$element->find('header.streamItem_header', 0)->plaintext,
ENT_QUOTES
)
);
$item['timestamp'] = strtotime($element->find('time', 0)->datetime);
$var = $element->find('div.streamItem_content', 0);
$answer = trim($var->innertext ?? '');
// This probably should be cleaned up, especially for YouTube embeds
if ($visual = $element->find('div.streamItem_visual', 0)) {
$visual = $visual->innertext;
}
// Fix tracking links, also doesn't work
foreach ($element->find('a') as $link) {
if (strpos($link->href, 'l.ask.fm') !== false) {
$link->href = $link->plaintext;
}
}
$item['content'] = '<p>' . $question
. '</p><p>' . $answer
. '</p><p>' . $visual . '</p>';
$this->items[] = $item;
}
}
public function getName()
{
if (!is_null($this->getInput('u'))) {
return self::NAME . ' : ' . $this->getInput('u');
}
return parent::getName();
}
public function getURI()
{
if (!is_null($this->getInput('u'))) {
return self::URI . urlencode($this->getInput('u'));
}
return parent::getURI();
}
}

View File

@@ -66,10 +66,10 @@ class AssociatedPressNewsBridge extends BridgeAbstract
{
switch ($this->getInput('topic')) {
case 'Podcasts':
returnClientError('Podcasts topic feed is not supported');
throwClientException('Podcasts topic feed is not supported');
break;
case 'PressReleases':
returnClientError('PressReleases topic feed is not supported');
throwClientException('PressReleases topic feed is not supported');
break;
default:
$this->collectCardData();
@@ -105,13 +105,12 @@ class AssociatedPressNewsBridge extends BridgeAbstract
private function collectCardData()
{
$json = getContents($this->getTagURI())
or returnServerError('Could not request: ' . $this->getTagURI());
$json = getContents($this->getTagURI());
$tagContents = json_decode($json, true);
if (empty($tagContents['tagObjs'])) {
returnClientError('Topic not found: ' . $this->getInput('topic'));
throwClientException('Topic not found: ' . $this->getInput('topic'));
}
$this->feedName = $tagContents['tagObjs'][0]['name'];

344
bridges/AuctionetBridge.php Normal file
View File

@@ -0,0 +1,344 @@
<?php
class AuctionetBridge extends BridgeAbstract
{
const NAME = 'Auctionet';
const URI = 'https://www.auctionet.com';
const DESCRIPTION = 'Fetches info about auction objects from Auctionet (an auction platform for many European auction houses)';
const MAINTAINER = 'Qluxzz';
const PARAMETERS = [[
'category' => [
'name' => 'Category',
'type' => 'list',
'values' => [
'All categories' => '',
'Art' => [
'All' => '25-art',
'Drawings' => '119-drawings',
'Engravings & Prints' => '27-engravings-prints',
'Other' => '30-other',
'Paintings' => '28-paintings',
'Photography' => '26-photography',
'Sculptures & Bronzes' => '29-sculptures-bronzes',
],
'Asiatica' => [
'All' => '117-asiatica',
],
'Books, Maps & Manuscripts' => [
'All' => '50-books-maps-manuscripts',
'Autographs & Manuscripts' => '206-autographs-manuscripts',
'Books' => '204-books',
'Maps' => '205-maps',
'Other' => '207-other',
],
'Carpets & Textiles' => [
'All' => '35-carpets-textiles',
'Carpets' => '36-carpets',
'Textiles' => '37-textiles',
],
'Ceramics & Porcelain' => [
'All' => '9-ceramics-porcelain',
'European' => '10-european',
'Oriental' => '11-oriental',
'Rest of the world' => '12-rest-of-the-world',
'Tableware' => '210-tableware',
],
'Clocks & Watches' => [
'All' => '31-clocks-watches',
'Carriage & Miniature Clocks' => '258-carriage-miniature-clocks',
'Longcase clocks' => '32-longcase-clocks',
'Mantel clocks' => '33-mantel-clocks',
'Other clocks' => '34-other-clocks',
'Pocket & Stop Watches' => '110-pocket-stop-watches',
'Wall Clocks' => '127-wall-clocks',
'Wristwatches' => '15-wristwatches',
],
'Coins, Medals & Stamps' => [
'All' => '46-coins-medals-stamps',
'Coins' => '128-coins',
'Orders & Medals' => '135-orders-medals',
'Other' => '131-other',
'Stamps' => '136-stamps',
],
'Folk art' => [
'All' => '58-folk-art',
'Bowls & Boxes' => '121-bowls-boxes',
'Furniture' => '122-furniture',
'Other' => '123-other',
'Tools & Gears' => '120-tools-gears',
],
'Furniture' => [
'All' => '16-furniture',
'Armchairs & Chairs' => '18-armchairs-chairs',
'Chests of drawers' => '24-chests-of-drawers',
'Cupboards, Cabinets & Shelves' => '23-cupboards-cabinets-shelves',
'Dining room furniture' => '22-dining-room-furniture',
'Garden' => '21-garden',
'Other' => '17-other',
'Sofas & seatings' => '20-sofas-seatings',
'Tables' => '19-tables',
],
'Glass' => [
'All' => '6-glass',
'Art glass' => '208-art-glass',
'Other' => '8-other',
'Tableware' => '7-tableware',
'Utility glass' => '209-utility-glass',
],
'Jewellery & Gemstones' => [
'All' => '13-jewellery-gemstones',
'Alliance rings' => '113-alliance-rings',
'Bracelets' => '106-bracelets',
'Brooches & Pendants' => '107-brooches-pendants',
'Costume Jewellery' => '259-costume-jewellery',
'Cufflinks & Tie Pins' => '111-cufflinks-tie-pins',
'Ear studs' => '116-ear-studs',
'Earrings' => '115-earrings',
'Gemstones' => '48-gemstones',
'Jewellery' => '14-jewellery',
'Jewellery Suites' => '109-jewellery-suites',
'Necklace' => '104-necklace',
'Other' => '118-other',
'Rings' => '112-rings',
'Signet rings' => '105-signet-rings',
'Solitaire rings' => '114-solitaire-rings',
],
'Licence weapons' => [
'All' => '59-licence-weapons',
'Combi/Combo' => '63-combi-combo',
'Double express rifles' => '60-double-express-rifles',
'Rifles' => '61-rifles',
'Shotguns' => '62-shotguns',
],
'Lighting & Lamps' => [
'All' => '1-lighting-lamps',
'Candlesticks' => '4-candlesticks',
'Ceiling lights' => '3-ceiling-lights',
'Chandeliers' => '203-chandeliers',
'Floor lights' => '2-floor-lights',
'Other lighting' => '5-other-lighting',
'Table Lamps' => '125-table-lamps',
'Wall Lights' => '124-wall-lights',
],
'Mirrors' => [
'All' => '42-mirrors',
],
'Miscellaneous' => [
'All' => '43-miscellaneous',
'Fishing equipment' => '54-fishing-equipment',
'Miscellaneous' => '47-miscellaneous',
'Modern Tools' => '133-modern-tools',
'Modern consumer electronics' => '52-modern-consumer-electronics',
'Musical instruments' => '51-musical-instruments',
'Technica & Nautica' => '45-technica-nautica',
],
'Photo, Cameras & Lenses' => [
'All' => '57-photo-cameras-lenses',
'Cameras & accessories' => '71-cameras-accessories',
'Optics' => '66-optics',
'Other' => '72-other',
],
'Silver & Metals' => [
'All' => '38-silver-metals',
'Other metals' => '40-other-metals',
'Pewter, Brass & Copper' => '41-pewter-brass-copper',
'Silver' => '39-silver',
'Silver plated' => '213-silver-plated',
],
'Toys' => [
'All' => '44-toys',
'Comics' => '211-comics',
'Toys' => '212-toys',
],
'Tribal art' => [
'All' => '134-tribal-art',
],
'Vehicles, Boats & Parts' => [
'All' => '249-vehicles-boats-parts',
'Automobilia & Transport' => '255-automobilia-transport',
'Bicycles' => '132-bicycles',
'Boats & Accessories' => '250-boats-accessories',
'Car parts' => '253-car-parts',
'Cars' => '215-cars',
'Moped parts' => '254-moped-parts',
'Mopeds' => '216-mopeds',
'Motorcycle parts' => '252-motorcycle-parts',
'Motorcycles' => '251-motorcycles',
'Other' => '256-other',
],
'Vintage & Designer Fashion' => [
'All' => '49-vintage-designer-fashion',
],
'Weapons & Militaria' => [
'All' => '137-weapons-militaria',
'Airguns' => '257-airguns',
'Armour & Uniform' => '138-armour-uniform',
'Edged weapons' => '130-edged-weapons',
'Guns & Rifles' => '129-guns-rifles',
'Other' => '214-other',
],
'Wine, Port & Spirits' => [
'All' => '170-wine-port-spirits',
],
]
],
'sort_order' => [
'name' => 'Sort order',
'type' => 'list',
'values' => [
'Most bids' => 'bids_count_desc',
'Lowest bid' => 'bid_asc',
'Highest bid' => 'bid_desc',
'Last bid on' => 'bid_on',
'Ending soonest' => 'end_asc_active',
'Lowest estimate' => 'estimate_asc',
'Highest estimate' => 'estimate_desc',
'Recently added' => 'recent'
],
],
'country' => [
'name' => 'Country',
'type' => 'list',
'values' => [
'All' => '',
'Denmark' => 'DK',
'Finland' => 'FI',
'Germany' => 'DE',
'Spain' => 'ES',
'Sweden' => 'SE',
'United Kingdom' => 'GB'
]
],
'language' => [
'name' => 'Language',
'type' => 'list',
'values' => [
'English' => 'en',
'Español' => 'es',
'Deutsch' => 'de',
'Svenska' => 'sv',
'Dansk' => 'da',
'Suomi' => 'fi',
],
],
]];
const CACHE_TIMEOUT = 3600; // 1 hour
private $title;
public function collectData()
{
// Each page contains 48 auctions, so we fetch 10 pages
// to reduce the likelihood of missing auctions
// between feed refreshes
// Fetch first page and use that to get title
{
$url = $this->getUrl(1);
$data = getContents($url);
$this->title = $this->getDocumentTitle($data);
$this->items = array_merge($this->items, $this->parsePageData($data));
}
// Fetch remaining pages
for ($page = 2; $page <= 10; $page++) {
$url = $this->getUrl($page);
$data = getContents($url);
$this->items = array_merge($this->items, $this->parsePageData($data));
}
}
public function getName()
{
return $this->title ?: parent::getName();
}
/* HELPERS */
private function getUrl($page)
{
$category = $this->getInput('category');
$language = $this->getInput('language');
$sort_order = $this->getInput('sort_order');
$country = $this->getInput('country');
$url = self::URI . '/' . $language . '/search';
if ($category) {
$url = $url . '/' . $category;
}
$query = [];
$query['page'] = $page;
if ($sort_order) {
$query['order'] = $sort_order;
}
if ($country) {
$query['country_code'] = $country;
}
if (count($query) > 0) {
$url = $url . '?' . http_build_query($query);
}
return $url;
}
private function getDocumentTitle($data)
{
$title_elem = '<title>';
$title_elem_length = strlen($title_elem);
$title_start = strpos($data, $title_elem);
$title_end = strpos($data, '</title>', $title_start);
$title_length = $title_end - ($title_start + $title_elem_length);
$title = substr($data, $title_start + $title_elem_length, $title_length);
return $title;
}
/**
* The auction items data is included in the HTML document
* as an HTML-entity-encoded JSON structure,
* which is used to hydrate the React component for the list of auctions
*/
private function parsePageData($data)
{
$key = 'data-react-props="';
$keyLength = strlen($key);
$start = strpos($data, $key);
$end = strpos($data, '"', $start + strlen($key));
$length = $end - ($start + $keyLength);
$jsonString = substr($data, $start + $keyLength, $length);
$jsonData = json_decode(htmlspecialchars_decode($jsonString), false);
$items = [];
foreach ($jsonData->{'items'} as $item) {
$title = $item->{'longTitle'};
$relative_url = $item->{'url'};
$images = $item->{'imageUrls'};
$id = $item->{'auctionId'};
$items[] = [
'title' => $title,
'uri' => self::URI . $relative_url,
'uid' => $id,
'content' => count($images) > 0 ? "<img src='$images[0]'/><br/>$title" : $title,
'enclosures' => array_slice($images, 1),
];
}
return $items;
}
}
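
The `data-react-props` extraction described in the docblock above can be sketched as a standalone function. This is a minimal sketch, assuming the attribute value is delimited by double quotes and entity-encoded as in the bridge; `extractReactProps` and the sample markup are hypothetical, and the `false` checks on `strpos` are an addition not present in the diff.

```php
<?php
declare(strict_types=1);

/**
 * Returns the decoded JSON from the first data-react-props attribute,
 * or null when the attribute is missing or its value is not valid JSON.
 * (Hypothetical helper, for illustration only.)
 */
function extractReactProps(string $html): ?array
{
    $key = 'data-react-props="';
    $start = strpos($html, $key);
    if ($start === false) {
        return null;
    }
    $start += strlen($key);
    // Quotes inside the JSON are entity-encoded, so the next raw quote closes the attribute.
    $end = strpos($html, '"', $start);
    if ($end === false) {
        return null;
    }
    $json = htmlspecialchars_decode(substr($html, $start, $end - $start));
    $decoded = json_decode($json, true);
    return is_array($decoded) ? $decoded : null;
}

// Made-up markup shaped like the structure the bridge expects.
$html = '<div data-react-props="{&quot;items&quot;:[{&quot;auctionId&quot;:1,&quot;longTitle&quot;:&quot;Vase&quot;}]}"></div>';
$props = extractReactProps($html);
echo $props['items'][0]['longTitle'], PHP_EOL;
```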

View File

@@ -29,7 +29,7 @@ class BAEBridge extends BridgeAbstract
public function collectData()
{
$url = $this->getURI();
$html = getSimpleHTMLDOM($url) or returnClientError('No results for this query.');
$html = getSimpleHTMLDOM($url);
$annonces = $html->find('main article');
foreach ($annonces as $annonce) {

View File

@@ -94,7 +94,7 @@ class BakaUpdatesMangaReleasesBridge extends BridgeAbstract
// content is an unstructured pile of divs, ugly to parse
$cols = $html->find('div#main_content div.row > div.text');
if (!$cols) {
returnServerError('No releases');
throwServerException('No releases');
}
$rows = array_slice(

View File

@@ -123,7 +123,7 @@ class BandcampBridge extends BridgeAbstract
$json = json_decode($content);
if ($json->ok !== true) {
returnServerError('Invalid response');
throwServerException('Invalid response');
}
foreach ($json->items as $entry) {
@@ -165,7 +165,7 @@ class BandcampBridge extends BridgeAbstract
$regex = '/band_id=(\d+)/';
if (preg_match($regex, $html, $matches) == false) {
returnServerError('Unable to find band ID on: ' . $this->getURI());
throwServerException('Unable to find band ID on: ' . $this->getURI());
}
$band_id = $matches[1];
@@ -196,7 +196,7 @@ class BandcampBridge extends BridgeAbstract
case 'By album':
$regex = '/album=(\d+)/';
if (preg_match($regex, $html, $matches) == false) {
returnServerError('Unable to find album ID on: ' . $this->getURI());
throwServerException('Unable to find album ID on: ' . $this->getURI());
}
$album_id = $matches[1];

View File

@@ -93,8 +93,7 @@ class BandcampDailyBridge extends BridgeAbstract
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request: ' . $this->getURI());
$html = getSimpleHTMLDOM($this->getURI());
$html = defaultLinkTo($html, self::URI);
@@ -105,8 +104,7 @@ class BandcampDailyBridge extends BridgeAbstract
$articlePath = $article->find('a.title', 0)->href;
$articlePageHtml = getSimpleHTMLDOMCached($articlePath, 3600)
or returnServerError('Could not request: ' . $articlePath);
$articlePageHtml = getSimpleHTMLDOMCached($articlePath, 3600);
$item['uri'] = $articlePath;
$item['title'] = $articlePageHtml->find('article-title', 0)->innertext;

139
bridges/BazarakiBridge.php Normal file
View File

@@ -0,0 +1,139 @@
<?php
class BazarakiBridge extends BridgeAbstract
{
const NAME = 'Bazaraki Bridge';
const URI = 'https://bazaraki.com';
const DESCRIPTION = 'Fetch adverts from Bazaraki, a Cyprus-based classifieds website.';
const MAINTAINER = 'danwain';
const PARAMETERS = [
[
'url' => [
'name' => 'URL',
'type' => 'text',
'required' => true,
'title' => 'Enter the URL of the Bazaraki page to fetch adverts from.',
'exampleValue' => 'https://www.bazaraki.com/real-estate-for-sale/houses/?lat=0&lng=0&radius=100000',
],
'limit' => [
'name' => 'Limit',
'type' => 'number',
'required' => false,
'title' => 'Enter the number of adverts to fetch. (max 50)',
'exampleValue' => '10',
'defaultValue' => 10,
]
]
];
public function collectData()
{
$url = $this->getInput('url');
if (! str_starts_with($url, 'https://www.bazaraki.com/')) {
throw new \Exception('URL must start with https://www.bazaraki.com/');
}
$html = getSimpleHTMLDOM($url);
$i = 0;
foreach ($html->find('div.advert') as $element) {
$i++;
if ($i > $this->getInput('limit') || $i > 50) {
break;
}
$item = [];
$item['uri'] = 'https://www.bazaraki.com' . $element->find('a.advert__content-title', 0)->href;
# Get the content
$advert = getSimpleHTMLDOM($item['uri']);
$price = trim($advert->find('div.announcement-price__cost', 0)->plaintext);
$name = trim($element->find('a.advert__content-title', 0)->plaintext);
$item['title'] = $name . ' - ' . $price;
$time = trim($advert->find('span.date-meta', 0)->plaintext);
$time = str_replace('Posted: ', '', $time);
$item['content'] = $this->processAdvertContent($advert);
$item['timestamp'] = $this->convertRelativeTime($time);
$item['author'] = trim($advert->find('div.author-name', 0)->plaintext);
$item['uid'] = $advert->find('span.number-announcement', 0)->plaintext;
$this->items[] = $item;
}
}
/**
* Process the advert content to clean up HTML
*
* @param simple_html_dom $advert The SimpleHTMLDOM object for the advert page
* @return string Processed HTML content
*/
private function processAdvertContent($advert)
{
// Get the content sections
$header = $advert->find('div.announcement-content-header', 0);
$characteristics = $advert->find('div.announcement-characteristics', 0);
$description = $advert->find('div.js-description', 0);
$images = $advert->find('div.announcement__images', 0);
// Remove all favorites divs
foreach ($advert->find('div.announcement-meta__favorites') as $favorites) {
$favorites->outertext = '';
}
// Replace all <a> tags with their text content
foreach ($advert->find('a') as $a) {
$a->outertext = $a->innertext;
}
// Format the content with section headers and dividers
$formattedContent = '';
// Add header section
$formattedContent .= $header->innertext;
$formattedContent .= '<hr/>';
// Add characteristics section with header
$formattedContent .= '<h3>Details</h3>';
$formattedContent .= $characteristics->innertext;
$formattedContent .= '<hr/>';
// Add description section with header
$formattedContent .= '<h3>Description</h3>';
$formattedContent .= $description->innertext;
$formattedContent .= '<hr/>';
// Add images section with header
$formattedContent .= '<h3>Images</h3>';
$formattedContent .= $images->innertext;
return $formattedContent;
}
/**
* Convert relative time strings like "Yesterday 12:32" to proper timestamps
*
* @param string $timeString The relative time string from the website
* @return string Timestamp in a format compatible with strtotime()
*/
private function convertRelativeTime($timeString)
{
if (strpos($timeString, 'Yesterday') !== false) {
// Replace "Yesterday" with actual date
$time = str_replace('Yesterday', date('Y-m-d', strtotime('-1 day')), $timeString);
return date('Y-m-d H:i:s', strtotime($time));
} elseif (strpos($timeString, 'Today') !== false) {
// Replace "Today" with actual date
$time = str_replace('Today', date('Y-m-d'), $timeString);
return date('Y-m-d H:i:s', strtotime($time));
} else {
// For other formats, return as is and let strtotime handle it
return $timeString;
}
}
}
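
The "Yesterday"/"Today" substitution in `convertRelativeTime()` above can be sketched with the current time passed in explicitly, which makes the behaviour easy to check. This is an illustrative variant, not the bridge's method: `resolveRelativeDay` is a hypothetical name and the fixed date is made up.

```php
<?php
declare(strict_types=1);

/**
 * Replaces the words "Yesterday" and "Today" with concrete dates relative
 * to $now, leaving any other input untouched.
 * (Hypothetical helper, for illustration only.)
 */
function resolveRelativeDay(string $timeString, \DateTimeImmutable $now): string
{
    return strtr($timeString, [
        'Yesterday' => $now->modify('-1 day')->format('Y-m-d'),
        'Today' => $now->format('Y-m-d'),
    ]);
}

$now = new \DateTimeImmutable('2025-08-25 10:00:00');
echo resolveRelativeDay('Yesterday 12:32', $now), PHP_EOL;
echo resolveRelativeDay('Today 08:15', $now), PHP_EOL;
```

Injecting the reference time avoids depending on the server clock when testing.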

View File

@@ -1,6 +1,6 @@
<?php
class BlizzardNewsBridge extends XPathAbstract
class BlizzardNewsBridge extends BridgeAbstract
{
const NAME = 'Blizzard News';
const URI = 'https://news.blizzard.com';
@@ -35,33 +35,73 @@ class BlizzardNewsBridge extends XPathAbstract
];
const CACHE_TIMEOUT = 3600;
const XPATH_EXPRESSION_ITEM = '/html/body/div/div[4]/div[2]/div[2]/div/div/section/ol/li/article';
const XPATH_EXPRESSION_ITEM_TITLE = './/div/div[2]/h2';
const XPATH_EXPRESSION_ITEM_CONTENT = './/div[@class="ArticleListItem-description"]/div[@class="h6"]/text()';
const XPATH_EXPRESSION_ITEM_URI = './/a[@class="ArticleLink ArticleLink"]/@href';
const XPATH_EXPRESSION_ITEM_AUTHOR = '';
const XPATH_EXPRESSION_ITEM_TIMESTAMP = './/time[@class="ArticleListItem-footerTimestamp"]/@timestamp';
const XPATH_EXPRESSION_ITEM_ENCLOSURES = './/div[@class="ArticleListItem-image"]/@style';
const XPATH_EXPRESSION_ITEM_CATEGORIES = './/div[@class="ArticleListItem-label"]';
const SETTING_FIX_ENCODING = true;
private const PRODUCT_IDS = [
'blt525c436e4a1b0a97',
'blt54fbd3787a705054',
'blt2031aef34200656d',
'blt795c314400d7ded9',
'blt5cfc6affa3ca0638',
'blt2e50e1521bb84dc6',
'blt376fb94931906b6f',
'blt81d46fcb05ab8811',
'bltede2389c0a8885aa',
'blt24859ba8086fb294',
'blte27d02816a8ff3e1',
'blt2caca37e42f19839',
'blt90855744d00cd378',
'bltec70ad0ea4fd6d1d',
'blt500c1f8b5470bfdb'
];
private const API_PATH = '/api/news/blizzard?';
/**
* Source Web page URL (should provide either HTML or XML content)
* @return string
*/
protected function getSourceUrl()
private function getSourceUrl(): string
{
$locale = $this->getInput('locale');
if ('zh-cn' === $locale) {
return 'https://cn.news.blizzard.com';
$baseUrl = 'https://cn.news.blizzard.com' . self::API_PATH;
} else {
$baseUrl = 'https://news.blizzard.com/' . $locale . self::API_PATH;
}
return 'https://news.blizzard.com/' . $locale;
return $baseUrl . http_build_query([
'feedCxpProductIds' => self::PRODUCT_IDS
]);
}
public function collectData()
{
$feedContent = json_decode(getContents($this->getSourceUrl()), true);
foreach ($feedContent['feed']['contentItems'] as $entry) {
$properties = $entry['properties'];
$item = [];
$item['title'] = $this->filterChars($properties['title']);
$item['content'] = $this->filterChars($properties['summary']);
$item['uri'] = $properties['newsUrl'];
$item['author'] = $this->filterChars($properties['author']);
$item['timestamp'] = strtotime($properties['lastUpdated']);
$item['enclosures'] = [$properties['staticAsset']['imageUrl']];
$item['categories'] = [$this->filterChars($properties['cxpProduct']['title'])];
$this->items[] = $item;
}
}
private function filterChars($content)
{
return htmlspecialchars($content, ENT_XML1);
}
public function getIcon()
{
return <<<icon
https://blznews.akamaized.net/images/favicon-cb34a003c6f2f637ee8f4f7b406f3b9b120b918c04cabec7f03a760e708977ea9689a1c638f4396def8dce7b202cd007eae91946cc3c4a578aa8b5694226cfc6.ico
https://dfbmfbnnydoln.cloudfront.net/production/images/favicons/favicon.ba01bb119359d74970b02902472fd82e96b5aba7.ico
icon;
}
}
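
Review note: a minimal sketch of how `http_build_query` serializes the list-valued `feedCxpProductIds` parameter that the new `getSourceUrl()` appends. The helper name `buildFeedUrl` and the `en-us` locale are assumptions made for illustration and are not part of the bridge; the two IDs are taken from `PRODUCT_IDS`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring the URL construction in getSourceUrl().
function buildFeedUrl(string $locale, array $productIds): string
{
    $apiPath = '/api/news/blizzard?';
    $baseUrl = $locale === 'zh-cn'
        ? 'https://cn.news.blizzard.com' . $apiPath
        : 'https://news.blizzard.com/' . $locale . $apiPath;

    // A list value expands into indexed keys (name[0], name[1], ...),
    // and the square brackets are percent-encoded as %5B and %5D.
    // The separator is passed explicitly so the result does not depend on php.ini.
    return $baseUrl . http_build_query(['feedCxpProductIds' => $productIds], '', '&');
}

echo buildFeedUrl('en-us', ['blt525c436e4a1b0a97', 'blt54fbd3787a705054']), PHP_EOL;
```

With the full `PRODUCT_IDS` list, the query string would carry fifteen indexed parameters in the same pattern. Whether the API also accepts the unindexed form is not something this diff shows.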


@@ -2,10 +2,12 @@
class BlueskyBridge extends BridgeAbstract
{
const NAME = 'Bluesky';
//Initial PR by [RSSBridge contributors](https://github.com/RSS-Bridge/rss-bridge/issues/4058).
//Modified from [©DIYgod and contributors at RSSHub](https://github.com/DIYgod/RSSHub/tree/master/lib/routes/bsky), MIT License
const NAME = 'Bluesky Bridge';
const URI = 'https://bsky.app';
const DESCRIPTION = 'Fetches posts from Bluesky';
const MAINTAINER = 'Code modified from rsshub (TonyRL https://github.com/TonyRL) and expanded';
const MAINTAINER = 'mruac';
const PARAMETERS = [
[
'data_source' => [
@@ -17,24 +19,39 @@ class BlueskyBridge extends BridgeAbstract
],
'title' => 'Select the type of data source to fetch from Bluesky.'
],
'handle' => [
'name' => 'User Handle',
'user_id' => [
'name' => 'User Handle or DID',
'type' => 'text',
'required' => true,
'exampleValue' => 'jackdodo.bsky.social',
'title' => 'Handle found in URL'
'exampleValue' => 'did:plc:z72i7hdynmk6r22z27h6tvur',
'title' => 'ATProto / Bsky.app handle or DID'
],
'filter' => [
'name' => 'Filter',
'feed_filter' => [
'name' => 'Feed type',
'type' => 'list',
'defaultValue' => 'posts_and_author_threads',
'values' => [
'posts_and_author_threads' => 'posts_and_author_threads',
'posts_with_replies' => 'posts_with_replies',
'posts_no_replies' => 'posts_no_replies',
'posts_with_media' => 'posts_with_media',
],
'title' => 'Combinations of post/repost types to include in response.'
'Posts feed' => 'posts_and_author_threads',
'All posts and replies' => 'posts_with_replies',
'Root posts only' => 'posts_no_replies',
'Media only' => 'posts_with_media',
]
],
'include_reposts' => [
'name' => 'Include Reposts?',
'type' => 'checkbox',
'defaultValue' => 'checked'
],
'include_reply_context' => [
'name' => 'Include Reply context?',
'type' => 'checkbox'
],
'verbose_title' => [
'name' => 'Use verbose feed item titles?',
'type' => 'checkbox'
]
]
];
@@ -44,7 +61,11 @@ class BlueskyBridge extends BridgeAbstract
public function getName()
{
if (isset($this->profile)) {
return sprintf('%s (@%s) - Bluesky', $this->profile['displayName'], $this->profile['handle']);
if ($this->profile['handle'] === 'handle.invalid') {
return sprintf('Bluesky - %s', $this->profile['displayName']);
} else {
return sprintf('Bluesky - %s (@%s)', $this->profile['displayName'], $this->profile['handle']);
}
}
return parent::getName();
}
@@ -52,7 +73,11 @@ class BlueskyBridge extends BridgeAbstract
public function getURI()
{
if (isset($this->profile)) {
return self::URI . '/profile/' . $this->profile['handle'];
if ($this->profile['handle'] === 'handle.invalid') {
return self::URI . '/profile/' . $this->profile['did'];
} else {
return self::URI . '/profile/' . $this->profile['handle'];
}
}
return parent::getURI();
}
@@ -77,117 +102,388 @@ class BlueskyBridge extends BridgeAbstract
{
$description = '';
$externalUri = $external['uri'];
$externalTitle = htmlspecialchars($external['title'], ENT_QUOTES, 'UTF-8');
$externalDescription = htmlspecialchars($external['description'], ENT_QUOTES, 'UTF-8');
$externalTitle = e($external['title']);
$externalDescription = e($external['description']);
$thumb = $external['thumb'] ?? null;
if (preg_match('/youtube\.com\/watch\?v=([^\&\?\/]+)/', $externalUri, $id) || preg_match('/youtu\.be\/([^\&\?\/]+)/', $externalUri, $id)) {
$videoId = $id[1];
$description .= "<p>External Link: <a href=\"$externalUri\">$externalTitle</a></p>";
$description .= "<iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/$videoId\" frameborder=\"0\" allowfullscreen></iframe>";
if (preg_match('/http(|s):\/\/media\.tenor\.com/', $externalUri)) {
//tenor gif embed
$tenorInterstitial = str_replace('media.tenor.com', 'media1.tenor.com/m', $externalUri);
$description .= "<figure><a href=\"$tenorInterstitial\"><img src=\"$externalUri\"/></a><figcaption>$externalTitle</figcaption></figure>";
} else {
$description .= "<p>External Link: <a href=\"$externalUri\">$externalTitle</a></p>";
$description .= "<p>$externalDescription</p>";
if ($thumb) {
$thumbUrl = 'https://cdn.bsky.app/img/feed_thumbnail/plain/' . $did . '/' . $thumb['ref']['$link'] . '@jpeg';
$description .= "<p><a href=\"$externalUri\"><img src=\"$thumbUrl\" alt=\"External Thumbnail\" /></a></p>";
}
//link embed preview
$host = parse_url($externalUri)['host'];
$thumbDesc = $thumb ? ('<img src="https://cdn.bsky.app/img/feed_thumbnail/plain/' . $did . '/' . $thumb['ref']['$link'] . '@jpeg"/>') : '';
$externalDescription = strlen($externalDescription) > 0 ? "<figcaption>($host) $externalDescription</figcaption>" : '';
$description .= '<br><blockquote><b><a href="' . $externalUri . '">' . $externalTitle . '</a></b>';
$description .= '<figure>' . $thumbDesc . $externalDescription . '</figure></blockquote>';
}
return $description;
}
private function textToDescription($text)
private function textToDescription($record)
{
$text = nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
$text = preg_replace('/(https?:\/\/[^\s]+)/i', '<a href="$1">$1</a>', $text);
if (isset($record['value'])) {
$record = $record['value'];
}
$text = $record['text'];
$text_copy = $text;
$text = nl2br(e($text));
if (isset($record['facets'])) {
$facets = $record['facets'];
foreach ($facets as $facet) {
if ($facet['features'][0]['$type'] === 'app.bsky.richtext.facet#link') {
$substring = substr($text_copy, $facet['index']['byteStart'], $facet['index']['byteEnd'] - $facet['index']['byteStart']);
$text = str_replace($substring, '<a href="' . $facet['features'][0]['uri'] . '">' . $substring . '</a>', $text);
}
}
}
return $text;
}
public function collectData()
{
$handle = $this->getInput('handle');
$filter = $this->getInput('filter') ?: 'posts_and_author_threads';
$user_id = $this->getInput('user_id');
$handle_match = preg_match('/(?:[a-zA-Z]*\.)+([a-zA-Z](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)/', $user_id, $handle_res); //gets the TLD in $handle_res[1]
$did_match = preg_match('/did:plc:[a-z2-7]{24}/', $user_id); //https://github.com/did-method-plc/did-method-plc#identifier-syntax
$exclude = ['alt', 'arpa', 'example', 'internal', 'invalid', 'local', 'localhost', 'onion']; //https://en.wikipedia.org/wiki/Top-level_domain#Reserved_domains
if ($handle_match == true && !in_array($handle_res[1], $exclude, true)) {
//valid bsky handle
$did = $this->resolveHandle($user_id);
} elseif ($did_match == true) {
//valid DID
$did = $user_id;
} else {
throwClientException('Invalid ATproto handle or DID provided.');
}
$filter = $this->getInput('feed_filter') ?: 'posts_and_author_threads';
$replyContext = $this->getInput('include_reply_context');
$did = $this->resolveHandle($handle);
$this->profile = $this->getProfile($did);
$authorFeed = $this->getAuthorFeed($did, $filter);
foreach ($authorFeed['feed'] as $post) {
$postRecord = $post['post']['record'];
$item = [];
$item['uri'] = self::URI . '/profile/' . $post['post']['author']['handle'] . '/post/' . explode('app.bsky.feed.post/', $post['post']['uri'])[1];
$item['title'] = strtok($post['post']['record']['text'], "\n");
$item['timestamp'] = strtotime($post['post']['record']['createdAt']);
$item['author'] = $this->profile['displayName'];
$item['uri'] = self::URI . '/profile/' . $this->fallbackAuthor($post['post']['author'], 'url') . '/post/' . explode('app.bsky.feed.post/', $post['post']['uri'])[1];
$item['title'] = $this->getInput('verbose_title') ? $this->generateVerboseTitle($post) : strtok($postRecord['text'], "\n");
$item['timestamp'] = strtotime($postRecord['createdAt']);
$item['author'] = $this->fallbackAuthor($post['post']['author'], 'display');
$description = $this->textToDescription($post['post']['record']['text']);
$postAuthorDID = $post['post']['author']['did'];
$postAuthorHandle = $post['post']['author']['handle'] !== 'handle.invalid' ? '<i>@' . $post['post']['author']['handle'] . '</i> ' : '';
$postDisplayName = $post['post']['author']['displayName'] ?? '';
$postDisplayName = e($postDisplayName);
$postUri = $item['uri'];
// Retrieve DID for constructing image URLs
$authorDid = $post['post']['author']['did'];
$url = explode('/', $post['post']['uri']);
$this->logger->debug('https://bsky.app/profile/' . $url[2] . '/post/' . $url[4]);
if (isset($post['post']['record']['embed']['$type']) && $post['post']['record']['embed']['$type'] === 'app.bsky.embed.external') {
$description .= $this->parseExternal($post['post']['record']['embed']['external'], $authorDid);
}
$description = '';
$description .= '<p>';
//post
$description .= $this->getPostDescription(
$postDisplayName,
$postAuthorHandle,
$postUri,
$postRecord,
'post'
);
if (isset($post['post']['record']['embed']['$type']) && $post['post']['record']['embed']['$type'] === 'app.bsky.embed.video') {
$thumbnail = $post['post']['embed']['thumbnail'] ?? null;
if ($thumbnail) {
$itemUri = self::URI . '/profile/' . $post['post']['author']['handle'] . '/post/' . explode('app.bsky.feed.post/', $post['post']['uri'])[1];
$description .= "<p><a href=\"$itemUri\"><img src=\"$thumbnail\" alt=\"Video Thumbnail\" /></a></p>";
}
}
if (isset($post['post']['record']['embed']['$type']) && $post['post']['record']['embed']['$type'] === 'app.bsky.embed.recordWithMedia#view') {
$thumbnail = $post['post']['embed']['media']['thumbnail'] ?? null;
$playlist = $post['post']['embed']['media']['playlist'] ?? null;
if ($thumbnail) {
$description .= "<p><video controls poster=\"$thumbnail\">";
$description .= "<source src=\"$playlist\" type=\"application/x-mpegURL\">";
$description .= 'Video source not supported</video></p>';
}
}
if (!empty($post['post']['record']['embed']['images'])) {
foreach ($post['post']['record']['embed']['images'] as $image) {
$linkRef = $image['image']['ref']['$link'];
$thumbnailUrl = $this->resolveThumbnailUrl($authorDid, $linkRef);
$fullsizeUrl = $this->resolveFullsizeUrl($authorDid, $linkRef);
$description .= "<br /><br /><a href=\"$fullsizeUrl\"><img src=\"$thumbnailUrl\" alt=\"Image\"></a>";
}
}
// Enhanced handling for quote posts with images
if (isset($post['post']['record']['embed']) && $post['post']['record']['embed']['$type'] === 'app.bsky.embed.record') {
$quotedRecord = $post['post']['record']['embed']['record'];
$quotedAuthor = $post['post']['embed']['record']['author']['handle'] ?? null;
$quotedDisplayName = $post['post']['embed']['record']['author']['displayName'] ?? null;
$quotedText = $post['post']['embed']['record']['value']['text'] ?? null;
if ($quotedAuthor && isset($quotedRecord['uri'])) {
$parts = explode('/', $quotedRecord['uri']);
$quotedPostId = end($parts);
$quotedPostUri = self::URI . '/profile/' . $quotedAuthor . '/post/' . $quotedPostId;
if (isset($postRecord['embed']['$type'])) {
//post link embed
if ($postRecord['embed']['$type'] === 'app.bsky.embed.external') {
$description .= $this->parseExternal($postRecord['embed']['external'], $postAuthorDID);
} elseif (
$postRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$postRecord['embed']['media']['$type'] === 'app.bsky.embed.external'
) {
$description .= $this->parseExternal($postRecord['embed']['media']['external'], $postAuthorDID);
}
if ($quotedText) {
$description .= '<hr /><strong>Quote from ' . htmlspecialchars($quotedDisplayName) . ' (@ ' . htmlspecialchars($quotedAuthor) . '):</strong><br />';
$description .= $this->textToDescription($quotedText);
if (isset($quotedPostUri)) {
$description .= "<p><a href=\"$quotedPostUri\">View original quote post</a></p>";
//post images
if (
$postRecord['embed']['$type'] === 'app.bsky.embed.images' ||
(
$postRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$postRecord['embed']['media']['$type'] === 'app.bsky.embed.images'
)
) {
$images = $post['post']['embed']['images'] ?? $post['post']['embed']['media']['images'];
foreach ($images as $image) {
$description .= $this->getPostImageDescription($image);
}
}
//post video
if (
$postRecord['embed']['$type'] === 'app.bsky.embed.video' ||
(
$postRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$postRecord['embed']['media']['$type'] === 'app.bsky.embed.video'
)
) {
$description .= $this->getPostVideoDescription(
$postRecord['embed']['video'] ?? $postRecord['embed']['media']['video'],
$postAuthorDID
);
}
}
$description .= '</p>';
//quote post
if (
isset($postRecord['embed']) &&
(
$postRecord['embed']['$type'] === 'app.bsky.embed.record' ||
$postRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia'
) &&
isset($post['post']['embed']['record'])
) {
$description .= '<p>';
$quotedRecord = $post['post']['embed']['record']['record'] ?? $post['post']['embed']['record'];
if (isset($quotedRecord['notFound']) && $quotedRecord['notFound']) { //deleted post
$description .= 'Quoted post deleted.';
} elseif (isset($quotedRecord['detached']) && $quotedRecord['detached']) { //detached quote
$uri_explode = explode('/', $quotedRecord['uri']);
$uri_reconstructed = self::URI . '/profile/' . $uri_explode[2] . '/post/' . $uri_explode[4];
$description .= '<a href="' . $uri_reconstructed . '">Quoted post detached.</a>';
} elseif (isset($quotedRecord['blocked']) && $quotedRecord['blocked']) { //blocked by quote author
$description .= 'Author of quoted post has blocked OP.';
} elseif (
($quotedRecord['$type'] ?? '') === 'app.bsky.feed.defs#generatorView' ||
($quotedRecord['$type'] ?? '') === 'app.bsky.graph.defs#listView'
) {
$description .= $this->getListFeedDescription($quotedRecord);
} elseif (
($quotedRecord['$type'] ?? '') === 'app.bsky.graph.starterpack' ||
($quotedRecord['$type'] ?? '') === 'app.bsky.graph.defs#starterPackViewBasic'
) {
$description .= $this->getStarterPackDescription($post['post']['embed']['record']);
} else {
$quotedAuthorDid = $quotedRecord['author']['did'];
$quotedDisplayName = $quotedRecord['author']['displayName'] ?? '';
$quotedDisplayName = e($quotedDisplayName);
$quotedAuthorHandle = $quotedRecord['author']['handle'] !== 'handle.invalid' ? '<i>@' . $quotedRecord['author']['handle'] . '</i>' : '';
$parts = explode('/', $quotedRecord['uri']);
$quotedPostId = end($parts);
$quotedPostUri = self::URI . '/profile/' . $this->fallbackAuthor($quotedRecord['author'], 'url') . '/post/' . $quotedPostId;
//quoted post - post
$description .= $this->getPostDescription(
$quotedDisplayName,
$quotedAuthorHandle,
$quotedPostUri,
$quotedRecord,
'quote'
);
if (isset($quotedRecord['value']['embed']['$type'])) {
//quoted post - post link embed
if ($quotedRecord['value']['embed']['$type'] === 'app.bsky.embed.external') {
$description .= $this->parseExternal($quotedRecord['value']['embed']['external'], $quotedAuthorDid);
}
//quoted post - post video
if (
$quotedRecord['value']['embed']['$type'] === 'app.bsky.embed.video' ||
(
$quotedRecord['value']['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$quotedRecord['value']['embed']['media']['$type'] === 'app.bsky.embed.video'
)
) {
$description .= $this->getPostVideoDescription(
$quotedRecord['value']['embed']['video'] ?? $quotedRecord['value']['embed']['media']['video'],
$quotedAuthorDid
);
}
//quoted post - post images
if (
$quotedRecord['value']['embed']['$type'] === 'app.bsky.embed.images' ||
(
$quotedRecord['value']['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$quotedRecord['value']['embed']['media']['$type'] === 'app.bsky.embed.images'
)
) {
foreach ($quotedRecord['embeds'] as $embed) {
if (
$embed['$type'] === 'app.bsky.embed.images#view' ||
($embed['$type'] === 'app.bsky.embed.recordWithMedia#view' && $embed['media']['$type'] === 'app.bsky.embed.images#view')
) {
$images = $embed['images'] ?? $embed['media']['images'];
foreach ($images as $image) {
$description .= $this->getPostImageDescription($image);
}
}
}
}
}
}
$description .= '</p>';
}
if (isset($post['post']['embed']['record']['value']['embed']['images'])) {
$quotedImages = $post['post']['embed']['record']['value']['embed']['images'];
foreach ($quotedImages as $image) {
$linkRef = $image['image']['ref']['$link'] ?? null;
if ($linkRef) {
$quotedAuthorDid = $post['post']['embed']['record']['author']['did'] ?? null;
$thumbnailUrl = $this->resolveThumbnailUrl($quotedAuthorDid, $linkRef);
$fullsizeUrl = $this->resolveFullsizeUrl($quotedAuthorDid, $linkRef);
$description .= "<br /><br /><a href=\"$fullsizeUrl\"><img src=\"$thumbnailUrl\" alt=\"Quoted Image\"></a>";
//reply
if ($replyContext && isset($post['reply']) && isset($post['reply']['parent'])) {
$replyPost = $post['reply']['parent'];
$description .= '<hr/>';
$description .= '<p>';
if (isset($replyPost['notFound']) && $replyPost['notFound']) { //deleted post
$description .= 'Replied to post was deleted.';
} elseif (isset($replyPost['blocked']) && $replyPost['blocked']) { //blocked by author of replied-to post
$description .= 'Author of replied to post has blocked OP.';
} else {
$replyPostRecord = $replyPost['record'];
$replyPostAuthorDID = $replyPost['author']['did'];
$replyPostAuthorHandle = $replyPost['author']['handle'] !== 'handle.invalid' ? '<i>@' . $replyPost['author']['handle'] . '</i> ' : '';
$replyPostDisplayName = $replyPost['author']['displayName'] ?? '';
$replyPostDisplayName = e($replyPostDisplayName);
$replyPostUri = self::URI . '/profile/' . $this->fallbackAuthor($replyPost['author'], 'url') . '/post/' . explode('app.bsky.feed.post/', $replyPost['uri'])[1];
// reply post
$description .= $this->getPostDescription(
$replyPostDisplayName,
$replyPostAuthorHandle,
$replyPostUri,
$replyPostRecord,
'reply'
);
if (isset($replyPostRecord['embed']['$type'])) {
//post link embed
if ($replyPostRecord['embed']['$type'] === 'app.bsky.embed.external') {
$description .= $this->parseExternal($replyPostRecord['embed']['external'], $replyPostAuthorDID);
} elseif (
$replyPostRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$replyPostRecord['embed']['media']['$type'] === 'app.bsky.embed.external'
) {
$description .= $this->parseExternal($replyPostRecord['embed']['media']['external'], $replyPostAuthorDID);
}
//post images
if (
$replyPostRecord['embed']['$type'] === 'app.bsky.embed.images' ||
(
$replyPostRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$replyPostRecord['embed']['media']['$type'] === 'app.bsky.embed.images'
)
) {
$images = $replyPost['embed']['images'] ?? $replyPost['embed']['media']['images'];
foreach ($images as $image) {
$description .= $this->getPostImageDescription($image);
}
}
//post video
if (
$replyPostRecord['embed']['$type'] === 'app.bsky.embed.video' ||
(
$replyPostRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$replyPostRecord['embed']['media']['$type'] === 'app.bsky.embed.video'
)
) {
$description .= $this->getPostVideoDescription(
$replyPostRecord['embed']['video'] ?? $replyPostRecord['embed']['media']['video'],
$replyPostAuthorDID
);
}
}
$description .= '</p>';
//quote post
if (
isset($replyPostRecord['embed']) &&
($replyPostRecord['embed']['$type'] === 'app.bsky.embed.record' || $replyPostRecord['embed']['$type'] === 'app.bsky.embed.recordWithMedia') &&
isset($replyPost['embed']['record'])
) {
$description .= '<p>';
$replyQuotedRecord = $replyPost['embed']['record']['record'] ?? $replyPost['embed']['record'];
if (isset($replyQuotedRecord['notFound']) && $replyQuotedRecord['notFound']) { //deleted post
$description .= 'Quoted post deleted.';
} elseif (isset($replyQuotedRecord['detached']) && $replyQuotedRecord['detached']) { //detached quote
$uri_explode = explode('/', $replyQuotedRecord['uri']);
$uri_reconstructed = self::URI . '/profile/' . $uri_explode[2] . '/post/' . $uri_explode[4];
$description .= '<a href="' . $uri_reconstructed . '">Quoted post detached.</a>';
} elseif (isset($replyQuotedRecord['blocked']) && $replyQuotedRecord['blocked']) { //blocked by quote author
$description .= 'Author of quoted post has blocked OP.';
} elseif (
($replyQuotedRecord['$type'] ?? '') === 'app.bsky.feed.defs#generatorView' ||
($replyQuotedRecord['$type'] ?? '') === 'app.bsky.graph.defs#listView'
) {
$description .= $this->getListFeedDescription($replyQuotedRecord);
} elseif (
($replyQuotedRecord['$type'] ?? '') === 'app.bsky.graph.starterpack' ||
($replyQuotedRecord['$type'] ?? '') === 'app.bsky.graph.defs#starterPackViewBasic'
) {
$description .= $this->getStarterPackDescription($replyPost['embed']['record']);
} else {
$quotedAuthorDid = $replyQuotedRecord['author']['did'];
$quotedDisplayName = $replyQuotedRecord['author']['displayName'] ?? '';
$quotedDisplayName = e($quotedDisplayName);
$quotedAuthorHandle = $replyQuotedRecord['author']['handle'] !== 'handle.invalid' ? '<i>@' . $replyQuotedRecord['author']['handle'] . '</i>' : '';
$parts = explode('/', $replyQuotedRecord['uri']);
$quotedPostId = end($parts);
$quotedPostUri = self::URI . '/profile/' . $this->fallbackAuthor($replyQuotedRecord['author'], 'url') . '/post/' . $quotedPostId;
//quoted post - post
$description .= $this->getPostDescription(
$quotedDisplayName,
$quotedAuthorHandle,
$quotedPostUri,
$replyQuotedRecord,
'quote'
);
if (isset($replyQuotedRecord['value']['embed']['$type'])) {
//quoted post - post link embed
if ($replyQuotedRecord['value']['embed']['$type'] === 'app.bsky.embed.external') {
$description .= $this->parseExternal($replyQuotedRecord['value']['embed']['external'], $quotedAuthorDid);
}
//quoted post - post video
if (
$replyQuotedRecord['value']['embed']['$type'] === 'app.bsky.embed.video' ||
(
$replyQuotedRecord['value']['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$replyQuotedRecord['value']['embed']['media']['$type'] === 'app.bsky.embed.video'
)
) {
$description .= $this->getPostVideoDescription(
$replyQuotedRecord['value']['embed']['video'] ?? $replyQuotedRecord['value']['embed']['media']['video'],
$quotedAuthorDid
);
}
//quoted post - post images
if (
$replyQuotedRecord['value']['embed']['$type'] === 'app.bsky.embed.images' ||
(
$replyQuotedRecord['value']['embed']['$type'] === 'app.bsky.embed.recordWithMedia' &&
$replyQuotedRecord['value']['embed']['media']['$type'] === 'app.bsky.embed.images'
)
) {
foreach ($replyQuotedRecord['embeds'] as $embed) {
if (
$embed['$type'] === 'app.bsky.embed.images#view' ||
($embed['$type'] === 'app.bsky.embed.recordWithMedia#view' && $embed['media']['$type'] === 'app.bsky.embed.images#view')
) {
$images = $embed['images'] ?? $embed['media']['images'];
foreach ($images as $image) {
$description .= $this->getPostImageDescription($image);
}
}
}
}
}
}
$description .= '</p>';
}
}
}
@@ -197,6 +493,106 @@ class BlueskyBridge extends BridgeAbstract
}
}
private function getPostVideoDescription(array $video, $authorDID)
{
//https://video.bsky.app/watch/$did/$cid/thumbnail.jpg
$videoCID = $video['ref']['$link'];
$videoMime = $video['mimeType'];
$thumbnail = "poster=\"https://video.bsky.app/watch/$authorDID/$videoCID/thumbnail.jpg\"";
$videoURL = "https://bsky.social/xrpc/com.atproto.sync.getBlob?did=$authorDID&cid=$videoCID";
return "<figure><video loop $thumbnail preload=\"none\" controls src=\"$videoURL\" type=\"$videoMime\"/></figure>";
}
private function getPostImageDescription(array $image)
{
$thumbnailUrl = $image['thumb'];
$fullsizeUrl = $image['fullsize'];
$alt = strlen($image['alt']) > 0 ? '<figcaption>' . e($image['alt']) . '</figcaption>' : '';
return "<figure><a href=\"$fullsizeUrl\"><img src=\"$thumbnailUrl\"></a>$alt</figure>";
}
private function getPostDescription(
string $postDisplayName,
string $postAuthorHandle,
string $postUri,
array $postRecord,
string $type
) {
$description = '';
if ($type === 'quote') {
// Quoted post/reply from bbb @bbb.com:
$postType = isset($postRecord['reply']) ? 'reply' : 'post';
$description .= "<a href=\"$postUri\">Quoted $postType</a> from <b>$postDisplayName</b> $postAuthorHandle:<br>";
} elseif ($type === 'reply') {
// Replying to aaa @aaa.com's post/reply:
$postType = isset($postRecord['reply']) ? 'reply' : 'post';
$description .= "Replying to <b>$postDisplayName</b> $postAuthorHandle's <a href=\"$postUri\">$postType</a>:<br>";
} else {
// aaa @aaa.com posted:
$description .= "<b>$postDisplayName</b> $postAuthorHandle <a href=\"$postUri\">posted</a>:<br>";
}
$description .= $this->textToDescription($postRecord);
return $description;
}
//Used if handle verification fails; falls back to displayName or DID depending on context.
private function fallbackAuthor($author, $reason)
{
if ($author['handle'] === 'handle.invalid') {
switch ($reason) {
case 'url':
return $author['did'];
case 'display':
$displayName = $author['displayName'] ?? '';
return e($displayName);
}
}
return $author['handle'];
}
private function generateVerboseTitle($post)
{
//use "Post by A, replying to B, quoting C" instead of post contents
$title = '';
if (isset($post['reason']) && str_contains($post['reason']['$type'], 'reasonRepost')) {
$title .= 'Repost by ' . $this->fallbackAuthor($post['reason']['by'], 'display') . ', post by ' . $this->fallbackAuthor($post['post']['author'], 'display');
} else {
$title .= 'Post by ' . $this->fallbackAuthor($post['post']['author'], 'display');
}
if (isset($post['reply'])) {
if (isset($post['reply']['parent']['blocked'])) {
$replyAuthor = 'blocked user';
} elseif (isset($post['reply']['parent']['notFound'])) {
$replyAuthor = 'deleted post';
} else {
$replyAuthor = $this->fallbackAuthor($post['reply']['parent']['author'], 'display');
}
$title .= ', replying to ' . $replyAuthor;
}
if (
isset($post['post']['embed']) &&
isset($post['post']['embed']['record']) &&
//if not starter pack, feed or list
($post['post']['embed']['record']['$type'] ?? '') !== 'app.bsky.feed.defs#generatorView' &&
($post['post']['embed']['record']['$type'] ?? '') !== 'app.bsky.graph.defs#listView' &&
($post['post']['embed']['record']['$type'] ?? '') !== 'app.bsky.graph.defs#starterPackViewBasic'
) {
if (isset($post['post']['embed']['record']['blocked'])) {
$quotedAuthor = 'blocked user';
} elseif (isset($post['post']['embed']['record']['notFound'])) {
$quotedAuthor = 'deleted post';
} elseif (isset($post['post']['embed']['record']['detached'])) {
$quotedAuthor = 'detached post';
} else {
$quotedAuthor = $this->fallbackAuthor($post['post']['embed']['record']['record']['author'] ?? $post['post']['embed']['record']['author'], 'display');
}
$title .= ', quoting ' . $quotedAuthor;
}
return $title;
}
private function resolveHandle($handle)
{
$uri = 'https://public.api.bsky.app/xrpc/com.atproto.identity.resolveHandle?handle=' . urlencode($handle);
@@ -214,17 +610,65 @@ class BlueskyBridge extends BridgeAbstract
private function getAuthorFeed($did, $filter)
{
$uri = 'https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor=' . urlencode($did) . '&filter=' . urlencode($filter) . '&limit=30';
$this->logger->debug($uri);
$response = json_decode(getContents($uri), true);
return $response;
}
private function resolveThumbnailUrl($authorDid, $linkRef)
//Embed for generated feeds and lists
private function getListFeedDescription(array $record): string
{
return 'https://cdn.bsky.app/img/feed_thumbnail/plain/' . $authorDid . '/' . $linkRef . '@jpeg';
$feedViewAvatar = isset($record['avatar']) ? '<img src="' . preg_replace('/\/img\/avatar\//', '/img/avatar_thumbnail/', $record['avatar']) . '">' : '';
$feedViewName = e($record['displayName'] ?? $record['name']);
$feedViewDescription = e($record['description'] ?? '');
$authorDisplayName = e($record['creator']['displayName']);
$authorHandle = e($record['creator']['handle']);
$likeCount = isset($record['likeCount']) ? '<br>Liked by ' . e($record['likeCount']) . ' users' : '';
preg_match('/\/([^\/]+)$/', $record['uri'], $matches);
if (($record['purpose'] ?? '') === 'app.bsky.graph.defs#modlist') {
$typeURL = '/lists/';
$typeDesc = 'moderation list';
} elseif (($record['purpose'] ?? '') === 'app.bsky.graph.defs#curatelist') {
$typeURL = '/lists/';
$typeDesc = 'list';
} else {
$typeURL = '/feed/';
$typeDesc = 'feed';
}
$uri = e('https://bsky.app/profile/' . $record['creator']['did'] . $typeURL . $matches[1]);
return <<<END
<blockquote>
<b><a href="{$uri}">{$feedViewName}</a></b><br/>
Bluesky {$typeDesc} by <b>{$authorDisplayName}</b> <i>@{$authorHandle}</i>
<figure>
{$feedViewAvatar}
<figcaption>{$feedViewDescription}{$likeCount}</figcaption>
</figure>
</blockquote>
END;
}
private function resolveFullsizeUrl($authorDid, $linkRef)
private function getStarterPackDescription(array $record): string
{
return 'https://cdn.bsky.app/img/feed_fullsize/plain/' . $authorDid . '/' . $linkRef . '@jpeg';
if (!isset($record['record'])) {
return 'Failed to get starter pack information.';
}
$starterpackRecord = $record['record'];
$starterpackName = e($starterpackRecord['name']);
$starterpackDescription = e($starterpackRecord['description']);
$creatorDisplayName = e($record['creator']['displayName']);
$creatorHandle = e($record['creator']['handle']);
preg_match('/\/([^\/]+)$/', $starterpackRecord['list'], $matches);
$uri = e('https://bsky.app/starter-pack/' . $record['creator']['did'] . '/' . $matches[1]);
return <<<END
<blockquote>
<b><a href="{$uri}">{$starterpackName}</a></b><br/>
Bluesky starter pack by <b>{$creatorDisplayName}</b> <i>@{$creatorHandle}</i><br/>
{$starterpackDescription}
</blockquote>
END;
}
}
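
Review note: a minimal sketch of the handle-or-DID classification that `collectData()` performs on `user_id`, written with a strict membership test. `classifyUserId` and `RESERVED_TLDS` are hypothetical names introduced for illustration. The anchored patterns are an assumption that is somewhat stricter than the unanchored expressions in the diff. The point of the sketch is that `array_search()` returns the index `0` for the first list entry (`alt`), and `0 == false` is true in PHP, so a loose comparison against `false` treats that entry as "not found".

```php
<?php
declare(strict_types=1);

// Reserved TLDs, copied from the $exclude list in collectData().
const RESERVED_TLDS = ['alt', 'arpa', 'example', 'internal', 'invalid', 'local', 'localhost', 'onion'];

// Hypothetical helper: returns 'did', 'handle', or null for anything else.
function classifyUserId(string $userId): ?string
{
    // did:plc identifiers are "did:plc:" followed by 24 base32 characters.
    if (preg_match('/^did:plc:[a-z2-7]{24}$/', $userId) === 1) {
        return 'did';
    }

    // Capture the last label (the TLD) of a dotted handle.
    if (preg_match('/^(?:[a-zA-Z0-9-]+\.)+([a-zA-Z][a-zA-Z0-9-]*)$/', $userId, $m) === 1) {
        // Strict in_array avoids the index-0 pitfall of array_search(...) == false.
        if (!in_array(strtolower($m[1]), RESERVED_TLDS, true)) {
            return 'handle';
        }
    }

    return null;
}

var_dump(classifyUserId('did:plc:z72i7hdynmk6r22z27h6tvur'));
var_dump(classifyUserId('jackdodo.bsky.social'));
var_dump(classifyUserId('someone.alt'));
```

A caller would resolve the handle to a DID only in the `'handle'` case, use the input directly in the `'did'` case, and reject the input otherwise.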

63
bridges/BruegelBridge.php Normal file

@@ -0,0 +1,63 @@
<?php
class BruegelBridge extends BridgeAbstract
{
const NAME = 'Bruegel';
const URI = 'https://www.bruegel.org';
const DESCRIPTION = 'European think-tank commentary and publications.';
const MAINTAINER = 'KappaPrajd';
const PARAMETERS = [
[
'category' => [
'name' => 'Category',
'type' => 'list',
'defaultValue' => '/publications',
'values' => [
'Publications' => '/publications',
'Commentary' => '/commentary'
]
]
]
];
public function getIcon()
{
return self::URI . '/themes/custom/bruegel/assets/favicon/android-icon-72x72.png';
}
public function collectData()
{
$url = self::URI . $this->getInput('category');
$html = getSimpleHTMLDOM($url);
$articles = $html->find('.c-listing__content article');
foreach ($articles as $article) {
$title = $article->find('.c-list-item__title a span', 0)->plaintext;
$content = trim($article->find('.c-list-item__description', 0)->plaintext);
$publishDate = $article->find('.c-list-item__date', 0)->plaintext;
$href = $article->find('.c-list-item__title a', 0)->getAttribute('href');
$item = [
'title' => $title,
'content' => $content,
'timestamp' => strtotime($publishDate),
'uri' => self::URI . $href,
'author' => $this->getAuthor($article),
];
$this->items[] = $item;
}
}
private function getAuthor($article)
{
$authorsElements = $article->find('.c-list-item__authors a');
$authors = array_map(function ($author) {
return $author->plaintext;
}, $authorsElements);
return join(', ', $authors);
}
}


@@ -98,7 +98,7 @@ class BugzillaBridge extends BridgeAbstract
// Array of comments is here
if (!isset($json['bugs'][$this->bugid]['comments'])) {
returnClientError('Cannot find REST endpoint');
throwClientException('Cannot find REST endpoint');
}
foreach ($json['bugs'][$this->bugid]['comments'] as $comment) {
@@ -131,7 +131,7 @@ class BugzillaBridge extends BridgeAbstract
// Array of changesets which contain an array of changes
if (!isset($json['bugs']['0']['history'])) {
returnClientError('Cannot find REST endpoint');
throwClientException('Cannot find REST endpoint');
}
foreach ($json['bugs']['0']['history'] as $changeset) {


@@ -206,7 +206,7 @@ class BukowskisBridge extends BridgeAbstract
$this->items[] = [
'title' => $title,
'uri' => $baseUrl . $relative_url,
'uid' => $lot->getAttribute('data-lot-id'),
'uid' => $relative_url,
'content' => count($images) > 0 ? "<img src='$images[0]'/><br/>$title" : $title,
'enclosures' => array_slice($images, 1),
];


@@ -26,21 +26,19 @@ TMPL;
https://www.bundestag.de/ajax/filterlist/de/parlament/praesidium/parteienfinanzierung/fundstellen50000/462002-462002
URI;
// Get the main page
$html = getSimpleHTMLDOMCached($ajaxUri, self::CACHE_TIMEOUT)
or returnServerError('Could not request AJAX list.');
$html = getSimpleHTMLDOMCached($ajaxUri, self::CACHE_TIMEOUT);
// Build the URL from the first anchor element. The list is sorted by year, descending, so the first element is the current year.
$firstAnchor = $html->find('a', 0)
or returnServerError('Could not find the proper HTML element.');
or throwServerException('Could not find the proper HTML element.');
$url = 'https://www.bundestag.de' . $firstAnchor->href;
$url = $firstAnchor->href;
// Get the actual page with the soft money donations
$html = getSimpleHTMLDOMCached($url, self::CACHE_TIMEOUT)
or returnServerError('Could not request ' . $url);
$html = getSimpleHTMLDOMCached($url, self::CACHE_TIMEOUT);
$rows = $html->find('table.table > tbody > tr')
or returnServerError('Could not find the proper HTML elements.');
or throwServerException('Could not find the proper HTML elements.');
foreach ($rows as $row) {
$item = $this->generateItemFromRow($row);


@@ -50,7 +50,7 @@ class CNETBridge extends SitemapBridge
}
if (empty($links)) {
returnClientError('Failed to retrieve article list');
throwClientException('Failed to retrieve article list');
}
foreach ($links as $article_uri) {


@@ -87,7 +87,7 @@ class CVEDetailsBridge extends BridgeAbstract
$vendor = $html->find('#contentdiv h1 > a', 0);
if ($vendor == null) {
returnServerError('Invalid Vendor ID ' . $this->getInput('vendor_id') . ' or Product ID ' . $this->getInput('product_id'));
throwServerException('Invalid Vendor ID ' . $this->getInput('vendor_id') . ' or Product ID ' . $this->getInput('product_id'));
}
$this->vendor = $vendor->innertext;


@@ -72,14 +72,14 @@ class CachetBridge extends BridgeAbstract
{
$ping = getContents(urljoin($this->getURI(), '/api/v1/ping'));
if (!$this->validatePing($ping)) {
returnClientError('Provided URI is invalid!');
throwClientException('Provided URI is invalid!');
}
$url = urljoin($this->getURI(), '/api/v1/incidents?sort=id&order=desc');
$incidents = getContents($url);
$incidents = json_decode($incidents);
if ($incidents === null) {
returnClientError('/api/v1/incidents returned no valid json');
throwClientException('/api/v1/incidents returned no valid json');
}
usort($incidents->data, function ($a, $b) {


@@ -66,7 +66,7 @@ class CarThrottleBridge extends BridgeAbstract
foreach ($categoryPage->find('div.cmg-card') as $post) {
$item = [];
$titleElement = $post->find('div.title a')[0];
$titleElement = $post->find('a.title')[0];
$post_uri = self::URI . $titleElement->getAttribute('href');
if (!isset($post_uri) || $post_uri == '') {
@@ -80,8 +80,8 @@ class CarThrottleBridge extends BridgeAbstract
$item['author'] = $this->parseAuthor($articlePage);
$articleImage = $articlePage->find('div.block-layout-field-image')[0];
$article = $articlePage->find('div.block-layout-body')[1];
$articleImage = $articlePage->find('figure')[0];
$article = $articlePage->find('div.first-column div.body')[0];
//remove ads
foreach ($article->find('aside') as $ad) {


@@ -36,7 +36,7 @@ class CastorusBridge extends BridgeAbstract
$title = $activity->find('a', 0);
if (!$title) {
returnServerError('Cannot find title!');
throwServerException('Cannot find title!');
}
return trim($title->plaintext);
@@ -48,7 +48,7 @@ class CastorusBridge extends BridgeAbstract
$url = $activity->find('a', 0);
if (!$url) {
returnServerError('Cannot find url!');
throwServerException('Cannot find url!');
}
return self::URI . $url->href;
@@ -62,7 +62,7 @@ class CastorusBridge extends BridgeAbstract
$nodes = $activity->find('*');
if (!$nodes) {
returnServerError('Cannot find nodes!');
throwServerException('Cannot find nodes!');
}
foreach ($nodes as $node) {
@@ -78,7 +78,7 @@ class CastorusBridge extends BridgeAbstract
$price = $activity->find('span', 1);
if (!$price) {
returnServerError('Cannot find price!');
throwServerException('Cannot find price!');
}
return $price->innertext;
@@ -92,13 +92,13 @@ class CastorusBridge extends BridgeAbstract
$html = getSimpleHTMLDOM(self::URI);
if (!$html) {
returnServerError('Could not load data from ' . self::URI . '!');
throwServerException('Could not load data from ' . self::URI . '!');
}
$activities = $html->find('div#activite > li');
if (!$activities) {
returnServerError('Failed to find activities!');
throwServerException('Failed to find activities!');
}
foreach ($activities as $activity) {


@@ -48,6 +48,11 @@ class CentreFranceBridge extends BridgeAbstract
]
];
private static array $monthNumberByFrenchName = [
'janvier' => 1, 'février' => 2, 'mars' => 3, 'avril' => 4, 'mai' => 5, 'juin' => 6, 'juillet' => 7,
'août' => 8, 'septembre' => 9, 'octobre' => 10, 'novembre' => 11, 'décembre' => 12
];
public function collectData()
{
$value = $this->getInput('limit');
@@ -67,15 +72,9 @@ class CentreFranceBridge extends BridgeAbstract
$newspaperUrl = 'https://www.' . $this->getInput('newspaper') . '/' . $localitySlug . '/';
$html = getSimpleHTMLDOM($newspaperUrl);
// Articles are detected through their titles
foreach ($html->find('.c-titre') as $articleTitleDOMElement) {
$articleLinkDOMElement = $articleTitleDOMElement->find('a', 0);
// Ignore articles in the « Les + partagés » block
if (strpos($articleLinkDOMElement->id, 'les_plus_partages') !== false) {
continue;
}
// Articles are detected through a standard tag
foreach ($html->find('article') as $articleDOMElement) {
$articleLinkDOMElement = $articleDOMElement->find('a', 0);
$articleURI = $articleLinkDOMElement->href;
// If the URI has already been processed, ignore it
@@ -91,7 +90,7 @@ class CentreFranceBridge extends BridgeAbstract
$articleTitle = '';
// If article is reserved for subscribers
if ($articleLinkDOMElement->find('span.premium-picto', 0)) {
if ($articleLinkDOMElement->find('span.premium-icon', 0)) {
if ($this->getInput('remove-reserved-for-subscribers-articles') === true) {
continue;
}
@@ -99,18 +98,23 @@ class CentreFranceBridge extends BridgeAbstract
$articleTitle .= '🔒 ';
}
$articleTitleDOMElement = $articleLinkDOMElement->find('span[data-tb-title]', 0);
if ($articleTitleDOMElement === null) {
continue;
}
if ($limit > 0 && count($this->items) === $limit) {
break;
}
$articleTitle .= $articleLinkDOMElement->find('span[data-tb-title]', 0)->innertext;
$articleFullURI = urljoin('https://www.' . $this->getInput('newspaper') . '/', $articleURI);
// Loop through each possible title class name
for ($i = 1; $i <= 3; $i++) {
foreach ($articleLinkDOMElement->find('.typo-card-heading-' . $i) as $articleTitleDOMElement) {
if ($articleTitleDOMElement->hasClass('font-sans')) {
continue;
}
$articleTitle .= $articleTitleDOMElement->text();
break 2;
}
}
$articleFullURI = urljoin('https://www.' . $this->getInput('newspaper') . '/', $articleURI);
$item = [
'title' => $articleTitle,
'uri' => $articleFullURI,
@@ -130,14 +134,22 @@ class CentreFranceBridge extends BridgeAbstract
'enclosures' => [],
];
$articleInformations = $html->find('.c-article-informations p');
$articleInformations = $html->find('#content hgroup > div.typo-p3 > *');
if (is_array($articleInformations) && $articleInformations !== []) {
$authorPosition = 1;
$publicationDateIndex = 0;
// Article author
$probableAuthorName = strip_tags($articleInformations[0]->innertext);
if (str_starts_with($probableAuthorName, 'Par ')) {
$publicationDateIndex = 1;
$item['author'] = substr($probableAuthorName, 4);
}
// Article publication date
if (preg_match('/(\d{2})\/(\d{2})\/(\d{4})( à (\d{2})h(\d{2}))?/', $articleInformations[0]->innertext, $articleDateParts) > 0) {
preg_match('/Publié le (\d{2}) (.+) (\d{4})( à (\d{2})h(\d{2}))?/', strip_tags($articleInformations[$publicationDateIndex]->innertext), $articleDateParts);
if ($articleDateParts !== [] && array_key_exists($articleDateParts[2], self::$monthNumberByFrenchName)) {
$articleDate = new \DateTime('midnight');
$articleDate->setDate($articleDateParts[3], $articleDateParts[2], $articleDateParts[1]);
$articleDate->setDate($articleDateParts[3], self::$monthNumberByFrenchName[$articleDateParts[2]], $articleDateParts[1]);
if (count($articleDateParts) === 7) {
$articleDate->setTime($articleDateParts[5], $articleDateParts[6]);
@@ -145,59 +157,33 @@ class CentreFranceBridge extends BridgeAbstract
$item['timestamp'] = $articleDate->getTimestamp();
}
// Article update date
if (count($articleInformations) >= 2 && preg_match('/(\d{2})\/(\d{2})\/(\d{4})( à (\d{2})h(\d{2}))?/', $articleInformations[1]->innertext, $articleDateParts) > 0) {
$authorPosition = 2;
$articleDate = new \DateTime('midnight');
$articleDate->setDate($articleDateParts[3], $articleDateParts[2], $articleDateParts[1]);
if (count($articleDateParts) === 7) {
$articleDate->setTime($articleDateParts[5], $articleDateParts[6]);
}
$item['timestamp'] = $articleDate->getTimestamp();
}
if (count($articleInformations) === ($authorPosition + 1)) {
$item['author'] = $articleInformations[$authorPosition]->innertext;
}
}
$articleContent = $html->find('.b-article .contenu > *');
if (is_array($articleContent)) {
$item['content'] = '';
foreach ($articleContent as $contentPart) {
if (in_array($contentPart->getAttribute('id'), ['cf-audio-player', 'poool-widget'], true)) {
continue;
$articleContent = $html->find('#content>div.flex+div.grid section>.z-10')[0] ?? null;
if ($articleContent instanceof \simple_html_dom_node) {
$articleHiddenParts = $articleContent->find('.ad-slot, #cf-digiteka-player');
if (is_array($articleHiddenParts)) {
foreach ($articleHiddenParts as $articleHiddenPart) {
$articleContent->removeChild($articleHiddenPart);
}
$articleHiddenParts = $contentPart->find('.bloc, .p402_hide');
if (is_array($articleHiddenParts)) {
foreach ($articleHiddenParts as $articleHiddenPart) {
$contentPart->removeChild($articleHiddenPart);
}
}
$item['content'] .= $contentPart->innertext;
}
$item['content'] = $articleContent->innertext;
}
$articleIllustration = $html->find('.photo-wrapper .photo-box img');
$articleIllustration = $html->find('#content>div.flex+div.grid section>figure>img');
if (is_array($articleIllustration) && count($articleIllustration) === 1) {
$item['enclosures'][] = $articleIllustration[0]->getAttribute('src');
}
$articleAudio = $html->find('#cf-audio-player-container audio');
$articleAudio = $html->find('audio[src^="https://api.octopus.saooti.com/"]');
if (is_array($articleAudio) && count($articleAudio) === 1) {
$item['enclosures'][] = $articleAudio[0]->getAttribute('src');
}
$articleTags = $html->find('.b-article > ul.c-tags > li > a.t-simple');
$articleTags = $html->find('#content>div.flex+div.grid section>.bg-gray-light>a.border-gray-dark');
if (is_array($articleTags)) {
$item['categories'] = array_map(static fn ($articleTag) => $articleTag->innertext, $articleTags);
$item['categories'] = array_map(static fn ($articleTag) => html_entity_decode($articleTag->innertext), $articleTags);
}
$explode = explode('_', $uri);
@@ -208,6 +194,10 @@ class CentreFranceBridge extends BridgeAbstract
$item['uid'] = $uid;
}
if (!isset($item['content'])) {
$item['content'] = '';
}
// If the article is a "grand format", we use another parsing strategy
if ($item['content'] === '' && $html->find('article') !== []) {
$articleContent = $html->find('article > section');


@@ -18,32 +18,13 @@ class CeskaTelevizeBridge extends BridgeAbstract
]
];
private function fixChars($text)
{
return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}
private function getUploadTimeFromString($string)
{
if (strpos($string, 'dnes') !== false) {
return strtotime('today');
} elseif (strpos($string, 'včera') !== false) {
return strtotime('yesterday');
} elseif (!preg_match('/(\d+).\s(\d+).(\s(\d+))?/', $string, $match)) {
returnServerError('Could not get date from Česká televize string');
}
$date = sprintf('%04d-%02d-%02d', $match[3] ?? date('Y'), $match[2], $match[1]);
return strtotime($date);
}
public function collectData()
{
$url = $this->getInput('url');
$validUrl = '/^(https:\/\/www\.ceskatelevize\.cz\/porady\/\d+-[a-z0-9-]+\/)(bonus\/)?$/';
if (!preg_match($validUrl, $url, $match)) {
returnServerError('Invalid url');
throwServerException('Invalid url');
}
$category = $match[4] ?? 'nove';
@@ -58,24 +39,42 @@ class CeskaTelevizeBridge extends BridgeAbstract
}
foreach ($html->find('#episodeListSection a[data-testid=card]') as $element) {
$itemTitle = $element->find('h3', 0);
$itemContent = $element->find('p[class^=content-]', 0);
$itemDate = $element->find('div[class^=playTime-] span, [data-testid=episode-item-broadcast] span', 0);
$itemThumbnail = $element->find('img', 0);
$itemUri = self::URI . $element->getAttribute('href');
// Remove special characters and whitespace
$cleanDate = preg_replace('/[^0-9.]/', '', $itemDate->plaintext);
$item = [
'title' => $this->fixChars($itemTitle->plaintext),
'uri' => $itemUri,
'content' => '<img src="' . $itemThumbnail->getAttribute('src') . '" /><br />'
. $this->fixChars($itemContent->plaintext),
'timestamp' => $this->getUploadTimeFromString($itemDate->plaintext)
'title' => $this->fixChars($element->find('h3', 0)->plaintext),
'uri' => self::URI . $element->getAttribute('href'),
'content' => '<img src="' . $element->find('img', 0)->getAttribute('srcset') . '" /><br />' . $this->fixChars($itemContent->plaintext),
'timestamp' => $this->getUploadTimeFromString($cleanDate),
];
$this->items[] = $item;
}
}
private function getUploadTimeFromString($string)
{
if (strpos($string, 'dnes') !== false) {
return strtotime('today');
} elseif (strpos($string, 'včera') !== false) {
return strtotime('yesterday');
} elseif (!preg_match('/(\d+).(\d+).((\d+))?/', $string, $match)) {
throwServerException('Could not get date from Česká televize string');
}
$date = sprintf('%04d-%02d-%02d', $match[3] ?? date('Y'), $match[2], $match[1]);
return strtotime($date);
}
private function fixChars($text)
{
return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}
public function getURI()
{
return $this->feedUri ?? parent::getURI();

186
bridges/ComickBridge.php Normal file

@@ -0,0 +1,186 @@
<?php
class ComickBridge extends BridgeAbstract
{
const MAINTAINER = 'phantop';
const NAME = 'Comick';
const URI = 'https://comick.io/';
const DESCRIPTION = 'Returns the latest chapters for a manga on comick.io.';
const PARAMETERS = [[
'slug' => [
'name' => 'Manga Slug',
'type' => 'text',
'required' => true,
'title' => 'The part of the URL after /comic/',
'exampleValue' => '00-kusuriya-no-hitorigoto-maomao-no-koukyuu-nazotoki-techou'
],
'lang' => [
'name' => 'Language',
'type' => 'list',
'title' => 'Language for comic (list is # of comics, descending)',
'values' => [
'English' => 'en',
'Brazilian Portuguese' => 'pt-br',
'Spanish Latin American' => 'es-la',
'Russian' => 'ru',
'Vietnamese' => 'vi',
'French' => 'fr',
'Polish' => 'pl',
'Indonesian' => 'id',
'Turkish' => 'tr',
'Italian' => 'it',
'Spanish; Castilian' => 'es',
'Ukrainian' => 'uk',
'Arabic' => 'ar',
'Hong Kong (Traditional Chinese)' => 'zh-hk',
'Hungarian' => 'hu',
'Chinese' => 'zh',
'German' => 'de',
'Korean' => 'ko',
'Thai' => 'th',
'Catalan; Valencian' => 'ca',
'Bulgarian' => 'bg',
'Persian' => 'fa',
'Romanian, Moldavian, Moldovan' => 'ro',
'Czech' => 'cs',
'Mongolian' => 'mn',
'Portuguese' => 'pt',
'Hebrew (modern)' => 'he',
'Hindi' => 'hi',
'Filipino/Tagalog' => 'tl',
'Finnish' => 'fi',
'Malay' => 'ms',
'Basque' => 'eu',
'Kazakh' => 'kk',
'Serbian' => 'sr',
'Burmese' => 'my',
'Japanese' => 'ja',
'Greek, Modern' => 'el',
'Dutch' => 'nl',
'Bengali' => 'bn',
'Uzbek' => 'uz',
'Esperanto' => 'eo',
'Lithuanian' => 'lt',
'Georgian' => 'ka',
'Danish' => 'da',
'Tamil' => 'ta',
'Swedish' => 'sv',
'Belarusian' => 'be',
'Chuvash' => 'cv',
'Croatian' => 'hr',
'Latin' => 'la',
'Nepali' => 'ne',
'Urdu' => 'ur',
'Galician' => 'gl',
'Norwegian' => 'no',
'Albanian' => 'sq',
'Irish' => 'ga',
'Javanese' => 'jv',
'Telugu' => 'te',
'Slovene' => 'sl',
'Estonian' => 'et',
'Azerbaijani' => 'az',
'Slovak' => 'sk',
'Afrikaans' => 'af',
'Latvian' => 'lv',
],
'defaultValue' => 'en'
],
'fetch' => [
'name' => 'Fetch chapter page images',
'type' => 'list',
'title' => 'Places chapter images in feed contents. Entries will consume more bandwidth.',
'defaultValue' => 'c',
'values' => [
'None' => 'n',
'Content' => 'c',
'Enclosure' => 'e'
]
],
'limit' => [
'name' => 'Limit',
'type' => 'number',
'title' => 'Maximum number of chapters to return',
'defaultValue' => 10
]
]];
private $title;
private function getComick($url)
{
$API = 'https://api.comick.fun';
// Need a non-cURL UA, otherwise we get Cloudflare 403'd
$opts = [
CURLOPT_USERAGENT => 'rss-bridge (https://github.com/RSS-Bridge/rss-bridge)'
];
$content = getContents("$API/$url", [], $opts);
return json_decode($content, true);
}
public function collectData()
{
$slug = $this->getInput('slug');
$lang = $this->getInput('lang');
$limit = $this->getInput('limit');
$manga = $this->getComick("comic/$slug");
$hid = $manga['comic']['hid'];
$this->title = $manga['comic']['title'];
$manga = $this->getComick("comic/$hid/chapters?lang=$lang&limit=$limit");
foreach ($manga['chapters'] as $chapter) {
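$hid = $chapter['hid'];
// Start from an empty item so content and enclosures do not carry over from the previous chapter
$item = [];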
$item['author'] = implode(', ', $chapter['group_name']);
$item['timestamp'] = strtotime($chapter['created_at']);
$item['uri'] = $this->getURI() . '/' . $hid;
$item['title'] = '';
if ($chapter['vol']) {
$item['title'] .= ' Vol. ' . $chapter['vol'];
}
if ($chapter['chap']) {
$item['title'] .= ' Ch. ' . $chapter['chap'];
}
if ($chapter['title']) {
$item['title'] .= ' - ' . $chapter['title'];
}
if ($this->getInput('fetch') != 'n') {
$chapter = $this->getComick("chapter/$hid");
if (isset($chapter['chapter']['md_images'])) {
$item['content'] = '';
foreach ($chapter['chapter']['md_images'] as $image) {
$img = 'https://meo.comick.pictures/' . $image['b2key'];
if ($this->getInput('fetch') == 'c') {
$item['content'] .= '<img src="' . $img . '" />';
}
if ($this->getInput('fetch') == 'e') {
$item['enclosures'][] = $img;
}
}
}
}
$this->items[] = $item;
}
}
public function getName()
{
if ($this->title) {
return parent::getName() . ' - ' . $this->title;
}
return parent::getName();
}
public function getURI()
{
if ($this->getInput('slug')) {
return self::URI . 'comic/' . $this->getInput('slug');
}
return parent::getURI();
}
}


@@ -109,7 +109,7 @@ class CrewbayBridge extends BridgeAbstract
public function collectData()
{
$url = $this->getURI();
$html = getSimpleHTMLDOM($url) or returnClientError('No results for this query.');
$html = getSimpleHTMLDOM($url);
$annonces = $html->find('#SearchResults div.result');
$limit = 0;


@@ -217,7 +217,7 @@ class CssSelectorBridge extends BridgeAbstract
$links = $page->find($url_selector);
if (empty($links)) {
returnClientError('No results for URL selector');
throwClientException('No results for URL selector');
}
$link_to_item = [];
@@ -232,8 +232,10 @@ class CssSelectorBridge extends BridgeAbstract
continue;
}
}
$item['uri'] = $link->href;
$item['title'] = $link->plaintext;
$item['uri'] = html_entity_decode($link->href);
$item['title'] = html_entity_decode($link->plaintext);
if (isset($item['content'])) {
$item['content'] = convertLazyLoading($item['content']);
$item['content'] = defaultLinkTo($item['content'], $item['uri']);
@@ -243,13 +245,13 @@ class CssSelectorBridge extends BridgeAbstract
}
if (empty($link_to_item)) {
returnClientError('The provided URL selector matches some elements, but they do not contain links.');
throwClientException('The provided URL selector matches some elements, but they do not contain links.');
}
$links = $this->filterUrlList(array_keys($link_to_item), $url_pattern, $limit);
if (empty($links)) {
returnClientError('No results for URL pattern');
throwClientException('No results for URL pattern');
}
$items = [];
@@ -272,7 +274,7 @@ class CssSelectorBridge extends BridgeAbstract
protected function expandEntryWithSelector($entry_url, $content_selector, $content_cleanup = null, $title_cleanup = null, $title_default = null)
{
if (empty($content_selector)) {
returnClientError('Please specify a content selector');
throwClientException('Please specify a content selector');
}
$entry_html = getSimpleHTMLDOMCached($entry_url);


@@ -187,7 +187,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
// Fetch the elements from the article pages.
if ($use_article_pages) {
if (empty($article_page_content_selector)) {
returnClientError('`Article selector` is required when `Load article page` is enabled');
throwClientException('`Article selector` is required when `Load article page` is enabled');
}
foreach (array_keys($entry_elements) as $uri) {
@@ -307,7 +307,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
$entryElements = $page->find($entry_selector);
if (empty($entryElements)) {
returnClientError('No entry elements for entry selector');
throwClientException('No entry elements for entry selector');
}
// Extract URIs with the associated entry element
@@ -327,7 +327,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
}
if (empty($links_with_elements)) {
returnClientError('The provided URL selector matches some elements, but they do not
throwClientException('The provided URL selector matches some elements, but they do not
contain links.');
}
@@ -335,7 +335,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
$filtered_urls = $this->filterUrlList(array_keys($links_with_elements), $url_pattern, $limit);
if (empty($filtered_urls)) {
returnClientError('No results for URL pattern');
throwClientException('No results for URL pattern');
}
$items = [];
@@ -359,7 +359,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
$article_content = $entry_html->find($content_selector, 0);
if (is_null($article_content)) {
returnClientError('Could not get article content at URL: ' . $entry_url);
throwClientException('Could not get article content at URL: ' . $entry_url);
}
$article_content = defaultLinkTo($article_content, $entry_url);
@@ -370,7 +370,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
{
$date = date_parse_from_format($format, $timeStr);
if ($date['error_count'] != 0) {
returnClientError('Error while parsing time string');
throwClientException('Error while parsing time string');
}
$timestamp = mktime(
@@ -383,7 +383,7 @@ class CssSelectorComplexBridge extends BridgeAbstract
);
if ($timestamp == false) {
returnClientError('Error while creating timestamp');
throwClientException('Error while creating timestamp');
}
return $timestamp;


@@ -15,7 +15,7 @@ class CubariProxyBridge extends BridgeAbstract
'MangAventure' => 'mangadventure',
'MangaDex' => 'mangadex',
'MangaKatana' => 'mangakatana',
'MangaSee' => 'mangasee',
'WeebCentral' => 'weebcentral',
]
],
'series' => [


@@ -1,113 +0,0 @@
<?php
class CuriousCatBridge extends BridgeAbstract
{
const NAME = 'Curious Cat Bridge';
const URI = 'https://curiouscat.me';
const DESCRIPTION = 'Returns list of newest questions and answers for a user profile';
const MAINTAINER = 'VerifiedJoseph';
const PARAMETERS = [[
'username' => [
'name' => 'Username',
'type' => 'text',
'required' => true,
'exampleValue' => 'koethekoethe',
]
]];
const CACHE_TIMEOUT = 3600;
public function collectData()
{
$url = self::URI . '/api/v2/profile?username=' . urlencode($this->getInput('username'));
$apiJson = getContents($url);
$apiData = Json::decode($apiJson);
if (isset($apiData['error'])) {
throw new \Exception($apiData['error_code']);
}
foreach ($apiData['posts'] as $post) {
$item = [];
$item['author'] = 'Anonymous';
if ($post['senderData']['id'] !== false) {
$item['author'] = $post['senderData']['username'];
}
$item['uri'] = $this->getURI() . '/post/' . $post['id'];
$item['title'] = $this->ellipsisTitle($post['comment']);
$item['content'] = $this->processContent($post);
$item['timestamp'] = $post['timestamp'];
$this->items[] = $item;
}
}
public function getURI()
{
if (!is_null($this->getInput('username'))) {
return self::URI . '/' . $this->getInput('username');
}
return parent::getURI();
}
public function getName()
{
if (!is_null($this->getInput('username'))) {
return $this->getInput('username') . ' - Curious Cat';
}
return parent::getName();
}
private function processContent($post)
{
$author = 'Anonymous';
if ($post['senderData']['id'] !== false) {
$authorUrl = self::URI . '/' . $post['senderData']['username'];
$author = <<<EOD
<a href="{$authorUrl}">{$post['senderData']['username']}</a>
EOD;
}
$question = $this->formatUrls($post['comment']);
$answer = $this->formatUrls($post['reply']);
$content = <<<EOD
<p>{$author} asked:</p>
<blockquote>{$question}</blockquote><br/>
<p>{$post['addresseeData']['username']} answered:</p>
<blockquote>{$answer}</blockquote>
EOD;
return $content;
}
private function ellipsisTitle($text)
{
$length = 150;
if (strlen($text) > $length) {
$text = explode('<br>', wordwrap($text, $length, '<br>'));
return $text[0] . '...';
}
return $text;
}
private function formatUrls($content)
{
return preg_replace(
'/(http[s]{0,1}\:\/\/[a-zA-Z0-9.\/\?\&=\-_]{4,})/ims',
'<a target="_blank" href="$1" target="_blank">$1</a> ',
$content
);
}
}

114
bridges/CybernewsBridge.php Normal file

@@ -0,0 +1,114 @@
<?php
declare(strict_types=1);
class CybernewsBridge extends BridgeAbstract
{
const NAME = 'Cybernews';
const URI = 'https://cybernews.com';
const DESCRIPTION = 'Fetches the latest news from Cybernews';
const MAINTAINER = 'tillcash';
const CACHE_TIMEOUT = 3600; // 1 hour
const MAX_ARTICLES = 5;
public function collectData()
{
$sitemapXml = getContents(self::URI . '/news-sitemap.xml');
if (!$sitemapXml) {
throwServerException('Unable to retrieve Cybernews sitemap');
}
$sitemap = simplexml_load_string($sitemapXml, null, LIBXML_NOCDATA);
if (!$sitemap) {
throwServerException('Unable to parse Cybernews sitemap');
}
foreach ($sitemap->url as $entry) {
$url = trim((string) $entry->loc);
$lastmod = trim((string) $entry->lastmod);
if (!$url) {
continue;
}
$pathParts = explode('/', trim(parse_url($url, PHP_URL_PATH), '/'));
$category = isset($pathParts[0]) && $pathParts[0] !== '' ? $pathParts[0] : '';
// Skip non-English versions
if (in_array($category, ['nl', 'de'], true)) {
continue;
}
$namespaces = $entry->getNamespaces(true);
$title = '';
if (isset($namespaces['news'])) {
$news = $entry->children($namespaces['news'])->news;
if ($news) {
$title = trim((string) $news->title);
}
}
if (!$title) {
continue;
}
$this->items[] = [
'title' => $title,
'uri' => $url,
'uid' => $url,
'timestamp' => strtotime($lastmod),
'categories' => $category ? [$category] : [],
'content' => $this->fetchFullArticle($url),
];
if (count($this->items) >= self::MAX_ARTICLES) {
break;
}
}
}
private function fetchFullArticle(string $url): string
{
$html = getSimpleHTMLDOMCached($url);
if (!$html) {
return 'Unable to fetch article content';
}
$article = $html->find('article', 0);
if (!$article) {
return 'Unable to parse article content';
}
// Remove unnecessary elements
$removeSelectors = [
'script',
'style',
'div.links-bar',
'div.google-news-cta',
'div.a-wrapper',
'div.embed_youtube',
];
foreach ($removeSelectors as $selector) {
foreach ($article->find($selector) as $element) {
$element->outertext = '';
}
}
// Handle lazy-loaded images
foreach ($article->find('img') as $img) {
if (!empty($img->{'data-src'})) {
$img->src = $img->{'data-src'};
unset($img->{'data-src'});
}
}
return $article->innertext;
}
}


@@ -37,6 +37,26 @@ class DRKBlutspendeBridge extends FeedExpander
]
];
const OFFER_LOW_PRIORITIES = [
'Imbiss nach der Blutspende',
'Registrierung als Stammzellspender',
'Typisierung möglich!',
'Allgemeine Informationen',
'Krankenkassen belohnen Blutspender',
'Wer benötigt eigentlich eine Blutspende?',
'Win-Win-Situation für die Gesundheit!',
'Terminreservierung',
'Du möchtest das erste Mal Blut spenden?',
'Spende-Check',
'Sie haben Fragen vor Ihrer Blutspende?'
];
const IMAGE_PRIORITIES = [
'DRK',
'Imbiss',
'Obst',
];
public function collectData()
{
$limitItems = intval($this->getInput('limit_items'));
@@ -45,37 +65,116 @@ class DRKBlutspendeBridge extends FeedExpander
protected function parseItem(array $item)
{
$html = getSimpleHTMLDOM($item['uri']);
$html = getSimpleHTMLDOMCached($item['uri']);
$detailsElement = $html->find('.details', 0);
$dateElement = $detailsElement->find('.datum', 0);
$dateLines = self::explodeLines($dateElement->plaintext);
$addressElement = $detailsElement->find('.adresse', 0);
$addressLines = self::explodeLines($addressElement->plaintext);
$dateLines = self::explodeLines($detailsElement->find('.datum', 0)->plaintext);
$addressLines = self::explodeLines($detailsElement->find('.adresse', 0)->plaintext);
$infoElement = $detailsElement->find('.angebote > h4 + p', 0);
$info = $infoElement ? $infoElement->innertext : '';
$info = $infoElement ? trim($infoElement->plaintext) : '';
$imageElements = $detailsElement->find('.fotos img');
$offers = self::parseOffers($detailsElement->find('.angebote .item'));
$item['title'] = $dateLines[0] . ' ' . $dateLines[1] . ' ' . $addressLines[0] . ' - ' . $addressLines[1];
$images = self::parseImages($detailsElement->find('.fotos', 0));
usort($images, function ($imageA, $imageB): int {
list($titleA) = $imageA;
list($titleB) = $imageB;
$prioA = 0;
$prioB = 0;
foreach (self::IMAGE_PRIORITIES as $prioIndex => $prioTitleNeedle) {
if (stripos($titleA, $prioTitleNeedle) !== false) {
$prioA = $prioIndex + 1;
}
if (stripos($titleB, $prioTitleNeedle) !== false) {
$prioB = $prioIndex + 1;
}
}
return $prioA - $prioB;
});
$item['content'] = <<<HTML
<p><b>{$dateLines[0]} {$dateLines[1]}</b></p>
<p>{$addressElement->innertext}</p>
<p>{$info}</p>
$itemContent = <<<HTML
<div>
<p>
<b>{$dateLines[0]} {$dateLines[1]}</b><br>
{$addressLines[3]}
</p>
<p>
<b>{$addressLines[0]}</b><br>
{$addressLines[1]}<br>
{$addressLines[2]}
</p>
</div>
HTML;
foreach ($imageElements as $imageElement) {
$src = $imageElement->getAttribute('src');
$item['content'] .= <<<HTML
<p><img src="{$src}"></p>
if ($info) {
$itemContent .= <<<HTML
<div>
<h3>Infos</h3>
<p>{$info}</p>
</div>
HTML;
}
$majorOffers = array_filter($offers, fn($title): bool => !in_array($title, self::OFFER_LOW_PRIORITIES), ARRAY_FILTER_USE_KEY);
foreach ($majorOffers as $offerTitle => list($offerText, $offerImages)) {
$itemContent .= <<<HTML
<div>
<h3>{$offerTitle}</h3>
<p>{$offerText}</p>
HTML;
foreach ($offerImages as list($imageTitle, $imageUrl)) {
$itemContent .= <<<HTML
<figure>
<img src="{$imageUrl}">
<figcaption>{$imageTitle}</figcaption>
</figure>
HTML;
}
$itemContent .= <<<HTML
</div>
HTML;
}
if (count($images) > 0) {
$itemContent .= <<<HTML
<div>
<h3>Fotos</h3>
HTML;
foreach ($images as list($imageTitle, $imageUrl)) {
$itemContent .= <<<HTML
<figure>
<img src="{$imageUrl}">
<figcaption>{$imageTitle}</figcaption>
</figure>
HTML;
}
$itemContent .= <<<HTML
</div>
HTML;
}
$minorOffers = array_filter($offers, fn($title): bool => in_array($title, self::OFFER_LOW_PRIORITIES), ARRAY_FILTER_USE_KEY);
foreach ($minorOffers as $offerTitle => list($offerText)) {
$itemContent .= <<<HTML
<div>
<h3>{$offerTitle}</h3>
<p>{$offerText}</p>
</div>
HTML;
}
$item['title'] = $dateLines[0] . ' ' . $dateLines[1] . ' ' . $addressLines[0] . ' - ' . $addressLines[1];
$item['content'] = $itemContent;
$item['description'] = null;
$item['enclosures'] = array_map(
function ($image): string {
list($title, $url) = $image;
return $url . '#' . urlencode(str_replace(' ', '_', $title));
},
$images
);
return $item;
}
@@ -97,6 +196,67 @@ class DRKBlutspendeBridge extends FeedExpander
return self::BASE_URI . '/blutspendetermine/termine.rss?date_to=' . $dateTo . '&radius=' . $radius . '&term=' . $term;
}
private function parseImages($parentElement): array
{
$images = [];
if ($parentElement) {
$elements = $parentElement->find('a[data-lightbox]');
foreach ($elements as $i => $element) {
$url = trim($element->getAttribute('href'));
if (!$url) {
continue;
}
$title = trim($element->getAttribute('title'));
if (!$title) {
$number = $i + 1;
$title = "Foto {$number}";
}
$images[] = [$title, $url];
}
}
return $images;
}
private function parseOffers($offerElements): array
{
$offers = [];
foreach ($offerElements as $element) {
$title = self::getCleanPlainText($element->find(':is(h1,h2,h3,h4,h5,h6)', 0));
$text = trim(substr(self::getCleanPlainText($element), strlen($title)));
if (!$title || !$text) {
continue;
}
$linkElements = $element->find('a');
foreach ($linkElements as $linkElement) {
$linkText = trim($linkElement->plaintext);
$linkUrl = trim($linkElement->getAttribute('href'));
if (!$linkText || !$linkUrl) {
continue;
}
$linkHtml = <<<HTML
<a href="{$linkUrl}" target="_blank">{$linkText}</a>
HTML;
$text = str_replace($linkText, $linkHtml, $text);
}
$offers[$title] = [$text, self::parseImages($element)];
}
return $offers;
}
private function getCleanPlainText($htmlElement): string
{
return $htmlElement ? trim(preg_replace('/\s+/', ' ', html_entity_decode($htmlElement->plaintext))) : '';
}
/**
* Returns an array of strings, each of which is a substring of string formed by splitting it on boundaries formed by line breaks.
*/
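The whitespace clean-up in `getCleanPlainText` can be tried in isolation. The following is a minimal sketch, assuming a plain string as input instead of a simple_html_dom node; the function name `cleanPlainText` is hypothetical and not part of the bridge.

```php
<?php
// Hypothetical stand-alone variant of getCleanPlainText(): it takes a string
// (or null) instead of a simple_html_dom node.
function cleanPlainText(?string $text): string
{
    if ($text === null) {
        return '';
    }
    // Decode HTML entities first, then collapse every whitespace run
    // (spaces, tabs, line breaks) into a single space and trim the ends.
    return trim(preg_replace('/\s+/', ' ', html_entity_decode($text)));
}

echo cleanPlainText("  Blutspende &amp;\n\t Imbiss  "), PHP_EOL;
```

Decoding before collapsing matters: entities are turned into characters first, so the whitespace pass sees the final text.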

View File

@@ -53,8 +53,7 @@ class DacksnackBridge extends BridgeAbstract
public function collectData()
{
$NEWSURL = self::URI;
$html = getSimpleHTMLDOMCached($NEWSURL, 18000) or
returnServerError('Could not request: ' . $NEWSURL);
$html = getSimpleHTMLDOMCached($NEWSURL, 18000);
foreach ($html->find('a.main-news-item') as $element) {
// Debug::log($element);
@@ -64,8 +63,7 @@ class DacksnackBridge extends BridgeAbstract
$url = self::URI . $element->getAttribute('href');
$published = $this->parseSwedishDates(trim($element->find('.published', 0)->plaintext));
$article_html = getSimpleHTMLDOMCached($url, 18000) or
returnServerError('Could not request: ' . $url);
$article_html = getSimpleHTMLDOMCached($url, 18000);
$article_content = $article_html->find('#ctl00_ContentPlaceHolder1_NewsArticleVeiw_pnlArticle', 0);
$figure = self::URI . $article_content->find('img.news-image', 0)->getAttribute('src');

View File

@@ -18,8 +18,7 @@ class DagensNyheterDirektBridge extends BridgeAbstract
{
$NEWSURL = self::BASEURL . '/ajax/direkt/';
$html = getSimpleHTMLDOM($NEWSURL) or
returnServerError('Could not request: ' . $NEWSURL);
$html = getSimpleHTMLDOM($NEWSURL);
foreach ($html->find('article') as $element) {
$link = $element->find('button', 0)->getAttribute('data-link');

View File

@@ -44,68 +44,32 @@ class DailymotionBridge extends BridgeAbstract
public function getIcon()
{
return 'https://static1-ssl.dmcdn.net/images/neon/favicons/android-icon-36x36.png.vf806ca4ed0deed812';
return 'https://static1.dmcdn.net/neon-user-ssr/prod/favicons/apple-icon-60x60.831b96ed0a8eca7f6539.png';
}
public function collectData()
{
if ($this->queriedContext === 'By username' || $this->queriedContext === 'By playlist id') {
$apiJson = getContents($this->getApiUrl());
$apiData = json_decode($apiJson, true);
$apiJson = getContents($this->getApiUrl());
$apiData = json_decode($apiJson, true);
if ($this->queriedContext === 'By playlist id') {
$this->feedName = $this->getPlaylistTitle($this->getInput('p'));
foreach ($apiData['list'] as $apiItem) {
$item = [];
$item['uri'] = $apiItem['url'];
$item['uid'] = $apiItem['id'];
$item['title'] = $apiItem['title'];
$item['timestamp'] = $apiItem['created_time'];
$item['author'] = $apiItem['owner.screenname'];
$item['content'] = '<p><a href="' . $apiItem['url'] . '">
<img src="' . $apiItem['thumbnail_url'] . '"></a></p><p>' . $apiItem['description'] . '</p>';
$item['categories'] = $apiItem['tags'];
$item['enclosures'][] = $apiItem['thumbnail_url'];
$this->items[] = $item;
}
}
if ($this->queriedContext === 'From search results') {
$html = getSimpleHTMLDOM($this->getURI());
foreach ($apiData['list'] as $apiItem) {
$item = [];
foreach ($html->find('div.media a.preview_link') as $element) {
$item = [];
$item['uri'] = $apiItem['url'];
$item['uid'] = $apiItem['id'];
$item['title'] = $apiItem['title'];
$item['timestamp'] = $apiItem['created_time'];
$item['author'] = $apiItem['owner.screenname'];
$item['content'] = '<p><a href="' . $apiItem['url'] . '">
<img src="' . $apiItem['thumbnail_url'] . '"></a></p><p>' . $apiItem['description'] . '</p>';
$item['categories'] = $apiItem['tags'];
$item['enclosures'][] = $apiItem['thumbnail_url'];
$item['id'] = str_replace('/video/', '', strtok($element->href, '_'));
$metadata = $this->getMetadata($item['id']);
if (empty($metadata)) {
continue;
}
$item['uri'] = $metadata['uri'];
$item['title'] = $metadata['title'];
$item['timestamp'] = $metadata['timestamp'];
$item['content'] = '<a href="'
. $item['uri']
. '"><img src="'
. $metadata['thumbnailUri']
. '" /></a><br><a href="'
. $item['uri']
. '">'
. $item['title']
. '</a>';
$this->items[] = $item;
if (count($this->items) >= 5) {
break;
}
}
$this->items[] = $item;
}
}
@@ -136,6 +100,7 @@ class DailymotionBridge extends BridgeAbstract
public function getURI()
{
$uri = self::URI;
switch ($this->queriedContext) {
case 'By username':
$uri .= 'user/' . urlencode($this->getInput('u'));
@@ -162,35 +127,11 @@ class DailymotionBridge extends BridgeAbstract
return $uri;
}
private function getMetadata($id)
{
$metadata = [];
$html = getSimpleHTMLDOM(self::URI . 'video/' . $id);
if (!$html) {
return $metadata;
}
$metadata['title'] = $html->find('meta[property=og:title]', 0)->getAttribute('content');
$metadata['timestamp'] = strtotime(
$html->find('meta[property=video:release_date]', 0)->getAttribute('content')
);
$metadata['thumbnailUri'] = $html->find('meta[property=og:image]', 0)->getAttribute('content');
$metadata['uri'] = $html->find('meta[property=og:url]', 0)->getAttribute('content');
return $metadata;
}
private function getPlaylistTitle($id)
{
$title = '';
$url = self::URI . 'playlist/' . $id;
$html = getSimpleHTMLDOM($url);
$title = $html->find('meta[property=og:title]', 0)->getAttribute('content');
return $title;
$apiJson = getContents($this->apiUrl . '/playlist/' . $this->getInput('p'));
$apiData = json_decode($apiJson, true);
return $apiData['name'];
}
private function getApiUrl()
@@ -204,6 +145,9 @@ class DailymotionBridge extends BridgeAbstract
return $this->apiUrl . '/playlist/' . $this->getInput('p')
. '/videos?fields=' . urlencode($this->apiFields) . '&limit=5';
break;
case 'From search results':
return $this->apiUrl . '/videos?search=' . urlencode($this->getInput('s')) . '&fields=' . urlencode($this->apiFields) . '&limit=5';
break;
}
}
}
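The search context builds its API URL from a user-supplied term. The sketch below shows one way to assemble such a URL with the term URL-encoded; the helper name `buildSearchUrl`, the base URL `https://api.example.invalid` and the field list are placeholders chosen for illustration, not values taken from the bridge.

```php
<?php
// Hypothetical helper: assembles a '/videos?search=...' URL like the
// 'From search results' case, encoding both the term and the field list.
function buildSearchUrl(string $apiUrl, string $fields, string $term, int $limit = 5): string
{
    return $apiUrl . '/videos?search=' . urlencode($term)
        . '&fields=' . urlencode($fields)
        . '&limit=' . $limit;
}

// Placeholder base URL and fields, for illustration only.
echo buildSearchUrl('https://api.example.invalid', 'id,title,url', 'cats & dogs'), PHP_EOL;
```

Encoding the term keeps spaces and characters such as `&` from being read as part of the query-string syntax.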

View File

@@ -1,28 +0,0 @@
<?php
class DansTonChatBridge extends BridgeAbstract
{
const MAINTAINER = 'Astalaseven';
const NAME = 'DansTonChat Bridge';
const URI = 'https://danstonchat.com/';
const CACHE_TIMEOUT = 21600; //6h
const DESCRIPTION = 'Returns latest quotes from DansTonChat.';
public function collectData()
{
$html = getSimpleHTMLDOM(self::URI . 'latest.html');
foreach ($html->find('div.item') as $element) {
$item = [];
$item['uri'] = $element->find('a', 0)->href;
$titleContent = $element->find('h3 a', 0);
if ($titleContent) {
$item['title'] = 'DansTonChat ' . html_entity_decode($titleContent->plaintext, ENT_QUOTES);
} else {
$item['title'] = 'DansTonChat';
}
$item['content'] = $element->find('div.item-content a', 0)->innertext;
$this->items[] = $item;
}
}
}

View File

@@ -48,6 +48,16 @@ https://www.dealabs.com/groupe/abonnements-internet?sortBy=lowest_price
Il faut alors saisir :
abonnements-internet',
],
'subgroups' => [
'name' => 'Catégorie',
'type' => 'text',
'exampleValue' => '1071',
'title' => 'Numéro du ou des catégories dans l\'URL : Il faut entrer le ou les numéros de catégories qui sont présent après "groups=" et avant tout éventuel "&"
Exemple : Si l\'URL du groupe affichées dans le navigateur est :
https://www.dealabs.com/groupe/telecommunications?groups=1071%2C1070&sortBy=new
Il faut alors saisir :
1071%2C1070',
],
'order' => [
'name' => 'Trier par',
'type' => 'list',
@@ -88,6 +98,7 @@ abonnements-internet',
'uri-group' => 'groupe/',
'uri-deal' => 'bons-plans/',
'uri-merchant' => 'search/bons-plans?merchant-id=',
'image-host' => 'https://static-pepper.dealabs.com/',
'request-error' => 'Impossible de joindre Dealabs',
'thread-error' => 'Impossible de déterminer l\'ID de la discussion. Vérifiez l\'URL que vous avez entré',
'currency' => '€',

View File

@@ -75,7 +75,7 @@ apple-icon-5c6fa9f2bce280428589c6195b7f1924206a53b782b371cfe2d02da932c8c173.png'
$html = defaultLinkTo($html, static::URI);
$articles = $html->find('div.crayons-story')
or returnServerError('Could not find articles!');
or throwServerException('Could not find articles!');
foreach ($articles as $article) {
$item = [];

View File

@@ -1,5 +1,7 @@
<?php
declare(strict_types=1);
/**
* Retourne les dons d'une recherche filtrée sur le site Donnons.org
* Example: https://donnons.org/Sport/Ile-de-France
@@ -44,58 +46,60 @@ class DonnonsBridge extends BridgeAbstract
{
$uri = $this->getPageURI($page);
$html = getSimpleHTMLDOM($uri);
$dom = getSimpleHTMLDOM($uri);
$searchDiv = $html->find('div[id=search]', 0);
$searchDiv = $dom->find('div[id=search]', 0);
if (!is_null($searchDiv)) {
$elements = $searchDiv->find('a.lst-annonce');
foreach ($elements as $element) {
$item = [];
if (! $searchDiv) {
return;
}
// Lien vers le don
$item['uri'] = self::URI . $element->href;
// Id de l'objet
$item['uid'] = $element->getAttribute('data-id');
$elements = $searchDiv->find('a.lst-annonce');
foreach ($elements as $element) {
$item = [];
// Grab info from json
$jsonString = $element->find('script', 0)->innertext;
$json = json_decode($jsonString, true);
// Lien vers le don
$item['uri'] = self::URI . $element->href;
// Id de l'objet
$item['uid'] = $element->getAttribute('data-id');
$name = $json['name'];
$category = $json['category'];
$date = $json['availabilityStarts'];
$description = $json['description'];
$city = $json['availableAtOrFrom']['address']['addressLocality'];
$region = $json['availableAtOrFrom']['address']['addressRegion'];
// Grab info from json
$jsonString = $element->find('script', 0)->innertext;
$json = json_decode($jsonString, true);
// Grab info from HTML
$imageSrc = $element->find('img.ima-center', 0)->getAttribute('src');
// Use large image instead of small one
$imageSrc = str_replace('/xs/', '/lg/', $imageSrc);
$image = self::URI . $imageSrc;
$author = $element->find('div.avatar-holder', 0)->plaintext;
$name = $json['name'];
$category = $json['category'];
$date = $json['availabilityStarts'];
$description = $json['description'];
$city = $json['availableAtOrFrom']['address']['addressLocality'];
$region = $json['availableAtOrFrom']['address']['addressRegion'];
$content = '
<img style="margin-right:1em;" src="' . $image . '">
<div>
<h1>' . $name . '</h1>
<p>' . $description . '</p>
<p>Lieu : <b>' . $city . '</b> - ' . $region . '</p>
<p>Par : ' . $author . '</p>
<p>Date : ' . $date . '</p>
</div>
';
// Grab info from HTML
$imageSrc = $element->find('img.ima-center', 0)->getAttribute('src');
// Use large image instead of small one
$imageSrc = str_replace('/xs/', '/lg/', $imageSrc);
$image = self::URI . $imageSrc;
$author = $element->find('div.avatar-holder', 0)->plaintext;
// Titre du don
$item['title'] = '[' . $category . '] ' . $name;
$item['timestamp'] = $date;
$item['author'] = $author;
$item['content'] = $content;
$item['enclosures'] = [$image];
$content = '
<img style="margin-right:1em;" src="' . $image . '">
<div>
<h1>' . $name . '</h1>
<p>' . $description . '</p>
<p>Lieu : <b>' . $city . '</b> - ' . $region . '</p>
<p>Par : ' . $author . '</p>
<p>Date : ' . $date . '</p>
</div>
';
$this->items[] = $item;
}
// Titre du don
$item['title'] = '[' . $category . '] ' . $name;
$item['timestamp'] = $date;
$item['author'] = $author;
$item['content'] = $content;
$item['enclosures'] = [$image];
$this->items[] = $item;
}
}
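Each listing carries its metadata in an embedded JSON script, which the bridge decodes with `json_decode`. The sketch below illustrates that step on a made-up payload; the helper name `donationFromJson` and the sample values are hypothetical, and the `??` fallbacks are an addition that the bridge itself does not use.

```php
<?php
// Made-up payload using the same keys the bridge reads.
$jsonString = <<<'JSON'
{
  "name": "Velo enfant",
  "category": "Sport",
  "availabilityStarts": "2025-08-01",
  "description": "Bon etat",
  "availableAtOrFrom": {
    "address": {"addressLocality": "Paris", "addressRegion": "Ile-de-France"}
  }
}
JSON;

// Hypothetical helper: returns null when the payload is not valid JSON,
// and falls back to empty strings for missing keys.
function donationFromJson(string $jsonString): ?array
{
    $json = json_decode($jsonString, true);
    if (!is_array($json)) {
        return null;
    }
    $address = $json['availableAtOrFrom']['address'] ?? [];
    return [
        'title' => '[' . ($json['category'] ?? '') . '] ' . ($json['name'] ?? ''),
        'timestamp' => $json['availabilityStarts'] ?? null,
        'city' => $address['addressLocality'] ?? '',
        'region' => $address['addressRegion'] ?? '',
    ];
}

$donation = donationFromJson($jsonString);
echo $donation['title'], PHP_EOL;
```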

View File

@@ -204,13 +204,13 @@ class Drive2ruBridge extends BridgeAbstract
break;
case 'Бортжурналы (По модели или марке)':
if (!preg_match('/^https:\/\/www.drive2.ru\/experience/', $this->getInput('url'))) {
returnServerError('Invalid url');
throwServerException('Invalid url');
}
$this->getLogbooksContent($this->getInput('url'));
break;
case 'Личные блоги':
if (!preg_match('/^[a-zA-Z0-9-]{3,16}$/', $this->getInput('username'))) {
returnServerError('Invalid username');
throwServerException('Invalid username');
}
$this->getUserContent('https://www.drive2.ru/users/' . $this->getInput('username'));
break;

View File

@@ -41,6 +41,12 @@ class EconomistWorldInBriefBridge extends BridgeAbstract
'quote' => [
'name' => 'Include the quote of the day',
'type' => 'checkbox'
],
'mergeEverything' => [
'name' => 'Merge everything into one entry',
'type' => 'checkbox',
'defaultValue' => false,
'title' => 'Whether to merge all the stories into one entry'
]
]
];
@@ -61,7 +67,7 @@ class EconomistWorldInBriefBridge extends BridgeAbstract
}
$html = getSimpleHTMLDOM(self::URI, $headers);
$gobbets = $html->find('p[data-component="the-world-in-brief-paragraph"]');
if ($this->getInput('splitGobbets') == 1) {
if ($this->getInput('splitGobbets') == 1 && !$this->getInput('mergeEverything')) {
$this->splitGobbets($gobbets);
} else {
$this->mergeGobbets($gobbets);
@@ -77,6 +83,9 @@ class EconomistWorldInBriefBridge extends BridgeAbstract
$quote = $html->find('blockquote[data-test-id="inspirational-quote"]', 0);
$this->addQuote($quote);
}
if ($this->getInput('mergeEverything') == 1) {
$this->mergeEverything();
}
}
private function splitGobbets($gobbets)
@@ -131,6 +140,9 @@ class EconomistWorldInBriefBridge extends BridgeAbstract
if ($element->tag != 'div') {
continue;
}
if ($element->find('._newsletterContentPromo', 0) != null) {
continue;
}
$image = $element->find('figure', 0);
$title = $element->find('h3', 0)->plaintext;
$content = $element->find('h3', 0)->parent();
@@ -165,4 +177,35 @@ class EconomistWorldInBriefBridge extends BridgeAbstract
'uid' => 'quote-' . $today->format('U')
];
}
private function mergeEverything()
{
$today = new Datetime();
$today->setTime(0, 0, 0, 0);
$contents = '';
foreach ($this->items as $item) {
$header = null;
if (str_contains($item['uid'], 'story-')) {
$header = $item['title'];
} elseif (str_contains($item['uid'], 'quote-')) {
$header = 'Quote of the day';
} elseif (str_contains($item['uid'], 'world-in-brief-')) {
$header = 'World in brief';
}
if ($header != null) {
$contents .= "<h2>{$header}</h2>";
}
$contents .= $item['content'];
}
$item = [
'uri' => self::URI,
'title' => 'The Economist World in Brief ' . $today->format('d.m.Y'),
'content' => $contents,
'timestamp' => $today->format('U'),
'uid' => 'world-in-brief-merged' . $today->format('U')
];
$this->items = [$item];
}
}
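The merging logic in `mergeEverything` picks a heading per item from its `uid` and concatenates the contents. A minimal sketch of that idea on plain arrays, assuming PHP 8 for `str_contains`; the function names `headerForItem` and `mergeContents` are hypothetical.

```php
<?php
// Hypothetical helper: chooses the heading for one item from its uid.
function headerForItem(array $item): ?string
{
    if (str_contains($item['uid'], 'story-')) {
        return $item['title'];
    }
    if (str_contains($item['uid'], 'quote-')) {
        return 'Quote of the day';
    }
    if (str_contains($item['uid'], 'world-in-brief-')) {
        return 'World in brief';
    }
    return null;
}

// Hypothetical helper: concatenates all items into one HTML string,
// prefixing each with an <h2> when a heading could be determined.
function mergeContents(array $items): string
{
    $contents = '';
    foreach ($items as $item) {
        $header = headerForItem($item);
        if ($header !== null && $header !== '') {
            $contents .= "<h2>{$header}</h2>";
        }
        $contents .= $item['content'];
    }
    return $contents;
}

$merged = mergeContents([
    ['uid' => 'world-in-brief-1', 'title' => 'ignored', 'content' => '<p>a</p>'],
    ['uid' => 'story-2', 'title' => 'Story', 'content' => '<p>b</p>'],
    ['uid' => 'other', 'title' => 'ignored', 'content' => '<p>c</p>'],
]);
echo $merged, PHP_EOL;
```

Items whose `uid` matches none of the prefixes are still included, just without a heading.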

View File

@@ -12,8 +12,28 @@ class EdfPricesBridge extends BridgeAbstract
'contract' => [
'name' => 'Choisir un contrat',
'type' => 'list',
// we can add later HCHP, EJP, base
'values' => ['Tempo' => '/energie/edf/tarifs/tempo'],
// we can add later more option prices
'values' => [
'Base' => '/energie/edf/tarifs/tarif-bleu#base',
'HPHC' => '/energie/edf/tarifs/tarif-bleu#hphc',
'EJP' => '/energie/edf/tarifs/tarif-bleu#ejp',
'Tempo' => '/energie/edf/tarifs/tempo'
],
],
'power' => [
'name' => 'Choisir une puissance',
'type' => 'list',
'values' => [
'3 kVA' => 3,
'6 kVA' => 6,
'9 kVA' => 9,
'12 kVA' => 12,
'15 kVA' => 15,
'18 kVA' => 18,
'24 kVA' => 24,
'30 kVA' => 30,
'36 kVA' => 36
]
]
]
];
@@ -24,36 +44,20 @@ class EdfPricesBridge extends BridgeAbstract
* @param string $contractUri
* @return void
*/
private function tempo(simple_html_dom $html, string $contractUri): void
private function tempo(simple_html_dom $html, string $contractUri, int $power): void
{
// current color and next
$daysDom = $html->find('#calendrier', 0)->nextSibling()->find('.card--ejp');
if ($daysDom && count($daysDom) === 2) {
foreach ($daysDom as $dayDom) {
$day = trim($dayDom->find('.card__title', 0)->innertext) . '/' . (new \DateTime('now'))->format(('Y'));
$dayColor = $dayDom->find('.card-ejp__icon span', 0)->innertext;
$text = $day . ' - ' . $dayColor;
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
}
}
// colors
$ulDom = $html->find('#tarif-de-l-offre-tempo-edf-template-date-now-y', 0)->nextSibling()->nextSibling()->nextSibling();
$elementsDom = $ulDom->find('li');
if ($elementsDom && count($elementsDom) === 3) {
// price per kWh is same for all powers
foreach ($elementsDom as $elementDom) {
$item = [];
$matches = [];
preg_match_all('/Jour (.*) : Heures (.*) : (.*)&nbsp;€ \/ Heures (.*) : (.*)&nbsp;€/um', $elementDom->innertext, $matches, PREG_SET_ORDER, 0);
// for tempo contract we have 2x3 colors
if ($matches && count($matches[0]) === 6) {
for ($i = 0; $i < 2; $i++) {
$text = 'Jour ' . $matches[0][1] . ' - Heures ' . $matches[0][2 + 2 * $i] . ' : ' . $matches[0][3 + 2 * $i] . '€';
@@ -69,26 +73,158 @@ class EdfPricesBridge extends BridgeAbstract
}
}
// powers
$ulPowerContract = $ulDom->nextSibling()->nextSibling();
$elementsPowerContractDom = $ulPowerContract->find('li');
if ($elementsPowerContractDom && count($elementsPowerContractDom) === 4) {
foreach ($elementsPowerContractDom as $elementPowerContractDom) {
// add subscription power info
$tablePrices = $ulDom->nextSibling()->nextSibling()->nextSibling()->find('.table--responsive', 0);
$this->addSubscriptionPowerInfo($tablePrices, $contractUri, $power, 7);
}
/**
* @param simple_html_dom $html
* @param string $contractUri
* @param int $power
* @return void
*/
private function base(simple_html_dom $html, string $contractUri, int $power): void
{
$tablePrices = $html
->find('#grille-tarifaire-et-prix-du-kwh-du-tarif-reglemente-edf-en-option-base', 0)
->nextSibling()
->nextSibling();
$prices = $tablePrices->find('.table tbody tr');
// price per kWh is same for all powers
if ($prices && count($prices) === 9) {
$item = [];
$text = 'Base : ' . $prices[0]->children(2);
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
}
$this->addSubscriptionPowerInfo($tablePrices, $contractUri, $power, 9);
}
/**
* @param simple_html_dom $html
* @param string $contractUri
* @param int $power
* @return void
*/
private function hphc(simple_html_dom $html, string $contractUri, int $power): void
{
$tablePrices = $html
->find('#grille-tarifaire-et-prix-du-kwh-du-tarif-reglemente-edf-en-option-heures-pleines-heures-creuses', 0)
->nextSibling()
->nextSibling();
$prices = $tablePrices->find('.table tbody tr');
// price per kWh is same for all powers
if ($prices && count($prices) === 8) {
$values = ['HC', 'HP'];
foreach ($values as $key => $value) {
$item = [];
$matches = [];
preg_match_all('/(.*) kVA : (.*) €/um', $elementPowerContractDom->innertext, $matches, PREG_SET_ORDER, 0);
$text = $values[$key] . ' : ' . $prices[0]->children($key + 2);
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
if ($matches && count($matches[0]) === 3) {
$text = $matches[0][1] . ' kVA : ' . $matches[0][2] . '€';
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
}
}
$this->items[] = $item;
$this->addSubscriptionPowerInfo($tablePrices, $contractUri, $power, 8);
}
/**
* @param simple_html_dom $html
* @param string $contractUri
* @param int $power
* @return void
*/
private function ejp(simple_html_dom $html, string $contractUri, int $power): void
{
$tablePrices = $html
->find('#ejp', 0)
->nextSibling()
->nextSibling()
->nextSibling()
->nextSibling()
->nextSibling();
$prices = $tablePrices->find('.table tbody tr');
// price per kWh is same for all powers
if ($prices && count($prices) === 5) {
$values = ['Non EJP', 'EJP'];
foreach ($values as $key => $value) {
$item = [];
$text = $values[$key] . ' : ' . $prices[0]->children($key + 2);
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
}
}
$this->addSubscriptionPowerInfo($tablePrices, $contractUri, $power, 5);
}
private function addSubscriptionPowerInfo(simple_html_dom_node $tablePrices, string $contractUri, int $power, int $numberOfPrices): void
{
$prices = $tablePrices->find('.table tbody tr');
// 8 contracts for tempo: 6, 9, 12, 15, 18, 24, 30 and 36 kVA
// 9 contracts for base: 3, 6, 9, 12, 15, 18, 24, 30 and 36 kVA
// 8 contracts for HPHC: 6, 9, 12, 15, 18, 24, 30 and 36 kVA
// 5 contracts for EJP: 9, 12, 15, 18 and 36 kVA
if ($prices && count($prices) === $numberOfPrices) {
$powerFound = false;
foreach ($prices as $price) {
$powerText = $price->firstChild()->firstChild()->innertext;
$powerValue = (int)substr($powerText, 0, strpos($powerText, ' kVA'));
if ($powerValue !== $power) {
continue;
}
$item = [];
$text = $powerText . ' : ' . $price->children(1) . '/an';
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
$powerFound = true;
break;
}
if (!$powerFound) {
$item = [];
$text = 'Pas de tarif abonnement pour cette puissance et ce contrat';
$item['uri'] = self::URI . $contractUri;
$item['title'] = $text;
$item['author'] = self::MAINTAINER;
$item['content'] = $text;
$item['uid'] = hash('sha256', $item['title']);
$this->items[] = $item;
}
}
}
@@ -97,10 +233,23 @@ class EdfPricesBridge extends BridgeAbstract
{
$contract = $this->getKey('contract');
$contractUri = $this->getInput('contract');
$power = $this->getInput('power');
$html = getSimpleHTMLDOM(self::URI . $contractUri);
if ($contract === 'Tempo') {
$this->tempo($html, $contractUri);
$this->tempo($html, $contractUri, $power);
}
if ($contract === 'Base') {
$this->base($html, $contractUri, $power);
}
if ($contract === 'HPHC') {
$this->hphc($html, $contractUri, $power);
}
if ($contract === 'EJP') {
$this->ejp($html, $contractUri, $power);
}
}
}
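The power lookup in `addSubscriptionPowerInfo` reads the number in front of ` kVA` and compares it with the selected power. The sketch below reproduces that lookup on plain arrays; the function names and the price strings are hypothetical, and the prices are made-up sample data rather than EDF tariffs.

```php
<?php
// Hypothetical helper: extracts the integer in front of ' kVA', or null
// when the marker is missing.
function parsePowerValue(string $powerText): ?int
{
    $pos = strpos($powerText, ' kVA');
    if ($pos === false) {
        return null;
    }
    return (int) substr($powerText, 0, $pos);
}

// Hypothetical helper: $rows is a list of [powerText, priceText] pairs.
// Returns the text for the matching power, or the fallback message.
function describeSubscription(array $rows, int $power): string
{
    foreach ($rows as [$powerText, $priceText]) {
        if (parsePowerValue($powerText) === $power) {
            return $powerText . ' : ' . $priceText . '/an';
        }
    }
    return 'Pas de tarif abonnement pour cette puissance et ce contrat';
}

// Made-up sample rows, for illustration only.
$rows = [['6 kVA', '100 €'], ['9 kVA', '200 €']];
echo describeSubscription($rows, 9), PHP_EOL;
```

Returning null for a missing marker is a choice made in this sketch; in the bridge, `strpos` returning `false` would lead to an empty substring and a power value of 0.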

View File

@@ -0,0 +1,656 @@
<?php
/**
* Downloads the news, events and promotions pages from the ARGOS website (https://www.argos.cz/),
* parses them, extracts key information about each article (title, link, date, description, images),
* and exposes it as feed items.
*/
class ElektroARGOSBridge extends BridgeAbstract
{
const NAME = 'Elektro ARGOS';
const URI = 'https://www.argos.cz/';
const DESCRIPTION = 'News, events and promotions on ARGOS electro shop - www.argos.cz - Czech Republic';
const MAINTAINER = 'pprenghyorg';
const CACHE_TIMEOUT = 86400;
// Supported contexts: news and articles, events, topics and promos
const PARAMETERS = [
'News and articles' => [],
'Events' => [],
'Topics and Promos' => []
];
/**
* Fetches and processes data based on the selected context.
*
* This function retrieves the HTML content for the selected context's URI,
* resolves relative links within the content, and then delegates the data
* extraction to the matching method: `collectNews`, `collectEvents` or `collectTopic`.
*/
public function collectData()
{
$html = getSimpleHTMLDOMCached($this->getURI(), self::CACHE_TIMEOUT);
defaultLinkTo($html, static::URI);
// Router
switch ($this->queriedContext) {
case 'News and articles':
$this->collectNews($html);
break;
case 'Events':
$this->collectEvents($html);
break;
case 'Topics and Promos':
$this->collectTopic($html);
break;
}
}
/**
* Returns the URI for the selected context.
*
* @return string The URI.
*/
public function getURI()
{
$uri = static::URI;
// URI Router
switch ($this->queriedContext) {
case 'News and articles':
$uri .= 'akce/nabidka/';
break;
case 'Events':
$uri .= 'pobocka-praha-hostivar/akce/udalosti/';
break;
case 'Topics and Promos':
$uri .= 'pobocka-praha-hostivar/akce/temata/';
break;
}
return $uri;
}
/**
* Returns the keyword URL map for the bridge.
*
* @return array The keyword URL map.
*/
public function getKeywordUrlMap()
{
// Get the keyword URL map from the class constant
$keywordUrlMap = static::KEYWORDURLMAP;
// returns the keyword URL map
return $keywordUrlMap;
}
/**
* Returns the name for the bridge.
*
* @return string The Name.
*/
public function getName()
{
$name = static::NAME;
$name .= ($this->queriedContext) ? ' - ' . $this->queriedContext : '';
switch ($this->queriedContext) {
case 'News and articles':
break;
case 'Events':
break;
case 'Topics and Promos':
break;
}
return $name;
}
/**
* Parse most used date formats
*
* Basically strtotime doesn't convert dates correctly due to formats
* being hard to interpret. So we use the DateTime object, manually
* fixing dates and times (set to 00:00:00.000).
*
* We don't know the timezone, so just assume +00:00 (or whatever
* DateTime chooses)
*/
private function fixDate($date)
{
$df = $this->parseDateTimeFromString($date);
return date_format($df, 'U');
}
/**
* Extracts the images from the article.
*
* @param object $article The article object.
* @return array An array of image URLs.
*/
private function extractImages($article)
{
// Notice: We can have zero or more images (though it should mostly be 1)
$elements = $article->find('img');
$images = [];
foreach ($elements as $img) {
$images[] = $img->src;
}
return $images;
}
// region Weekly offer
/**
* Collects the URI, title and images of each news article from the HTML and adds them as feed items.
*
* @param object $html The HTML object.
* @return void
*/
private function collectNews($html)
{
// Check if page contains articles and split by class
$articles = $html->find('.com-news-feature-prerex') or
throwServerException('No articles found! Layout might have changed!');
// Articles loop
foreach ($articles as $article) {
$item = [];
// Add URI
$item['uri'] = $this->extractNewsUri($article);
// echo $item['uri'] . '<BR>';
// Add title
$item['title'] = $this->extractNewsTitle($article);
// echo $item['title'] . '<BR>';
$item['enclosures'] = $this->extractImages($article);
// Add to rss query
$this->items[] = $item;
}
}
/**
* Collects the URI, title, content, timestamp and images of each event from the HTML and adds them as feed items.
*
* @param object $html The HTML object.
* @return void
*/
private function collectEvents($html)
{
// Check if page contains articles and split by class
$articles = $html->find('.com-news-common-prerex') or
throwServerException('No articles found! Layout might have changed!');
// Articles loop
foreach ($articles as $article) {
$item = [];
// Add URI
$item['uri'] = $this->extractEventUri($article);
// Add title
$item['title'] = $this->extractEventTitle($article);
// Add content
$item['content'] = $this->extractEventDescription($article);
// Parse time
$newsDate = $this->extractDate($article);
// Remove prefix
$newsDate = str_replace('zveřejněno: ', '', $newsDate);
// Fix date
$item['timestamp'] = $this->fixDate($newsDate);
// Add images
$item['enclosures'] = $this->extractImages($article);
// Add to rss query
$this->items[] = $item;
}
}
/**
* Collects the URI, title, content, timestamp and images of each topic or promo from the HTML and adds them as feed items.
*
* @param object $html The HTML object.
* @return void
*/
private function collectTopic($html)
{
// Check if page contains articles and split by class
$articles = $html->find('.com-news-common-prerex') or
throwServerException('No articles found! Layout might have changed!');
// Articles loop
foreach ($articles as $article) {
$item = [];
// Add URI
$item['uri'] = $this->extractEventUri($article);
// Add title
$item['title'] = $this->extractEventTitle($article);
// Add content
$item['content'] = $this->extractEventDescription($article);
// Parse time
$newsDate = $this->extractDate($article);
// Remove prefix
$newsDate = str_replace('zveřejněno: ', '', $newsDate);
// Fix date
$item['timestamp'] = $this->fixDate($newsDate);
// Add images
$item['enclosures'] = $this->extractImages($article);
// Add to rss query
$this->items[] = $item;
}
}
/**
* Extracts the URI of the news article.
*
* @param object $article The article object.
* @return string The URI of the news article.
*/
private function extractEventUri($article)
{
return $article->href;
}
/**
* Extracts the URI of the news article.
*
* @param object $article The article object.
* @return string The URI of the news article.
*/
private function extractNewsUri($article)
{
// Return URI of the article
$element = $article->find('a', 0) or
throwServerException('Anchor not found!');
return $element->href;
}
/**
* Extracts the URI of the news article.
*
* @param object $article The article object.
* @return string The URI of the news article.
*/
private function extractLetterUri($article)
{
// Return URI of the article
$element = $article->find('a.ws-btn', 0);
// Element empty check
if ($element == null) {
return '';
}
return $element->href;
}
/**
* Extracts the date of the news article.
*
* @param object $article The article object.
* @return string The date of the news article.
*/
private function extractDate($article)
{
// Check if date is set
$element = $article->find('div.com-news-common-prerex__date', 0) or
throwServerException('Date not found!');
return $element->plaintext;
}
/**
* Extracts the description of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription($article)
{
// Extract description
$element = $article->find('ul.ws-product-information__piece-description', 0)->find('li', 0) or
throwServerException('Description not found!');
return $element->innertext;
}
/**
* Extracts the description of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription1($article)
{
// Extract description
$element = $article->find('div.ws-product-price-validity', 0)->find('div', 0) or
throwServerException('Description not found!');
return $element->innertext;
}
/**
* Extracts the description of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription2($article)
{
// Extract description
$element = $article->find('div.ws-product-price-validity', 0)->find('div', 1) or
throwServerException('Description not found!');
return $element->innertext;
}
/**
* Extracts the description of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription3($article)
{
// Extract description
$element = $article->find('div.ws-product-badge-text', 0);
// Return the inner text when the element exists, otherwise an empty string,
// so that the function always returns a string
if ($element != null) {
return $element->innertext;
} else {
return '';
}
}
/**
* Extracts the description of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription4($article)
{
// Extract description
$element = $article->find('div.ws-product-price-type__value', 0);
return $element->innertext;
}
/**
* Extracts the price label of the news article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription5($article)
{
// Extract description
$element = $article->find('div.ws-product-price-type__label', 0);
return $element->innertext;
}
/**
* Extracts the value of the second price type, or an empty string if there is none.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractNewsDescription6($article)
{
// Extract description
$element = $article->find('div.ws-product-price', 0)->find('div.ws-product-price-type', 1);
// Element empty check
if ($element == null) {
return '';
}
// Not null, so we can safely access the element
$element = $element->find('div.ws-product-price-type__value', 0);
return $element->innertext;
}
/**
* Extracts the description of the event article.
*
* @param object $article The article object.
* @return string The description of the news article.
*/
private function extractEventDescription($article)
{
// Extract description
$element = $article->find('.com-news-common-prerex__text', 0);
return $element->innertext;
}
/**
* Extracts the title of the news article.
*
* @param object $article The article object.
* @return string The title of the news article.
*/
private function extractNewsTitle($article)
{
// Extract title
$element = $article->find('img', 0) or
throwServerException('Title not found!');
return $element->alt;
}
/**
* Extracts the title of the event article.
*
* @param object $article The article object.
* @return string The title of the news article.
*/
private function extractEventTitle($article)
{
// Extract title
$element = $article->find('div.com-news-common-prerex__right-box', 0)->find('h3', 0)
or throwServerException('Title not found!');
return $element->plaintext;
}
/**
* Extracts the description of the letter article.
*
* @param object $article The article object.
* @return object|null The first link element of the letter article, or null if there is none.
*/
private function extractLetterDescription($article)
{
// Extract description
$element = $article->find('a', 0);
return $element;
}
/**
* It attempts to recognize the date/time format in a string and create a DateTime object.
*
* It goes through the list of defined formats and tries to apply them to the input string.
* Returns the first successfully parsed DateTime object that matches the entire string.
*
* @param string $dateString A string potentially containing a date and/or time.
* @return DateTime|null A DateTime object if successfully recognized and parsed, otherwise null.
*/
private function parseDateTimeFromString(string $dateString): ?DateTime
{
// List of common formats - YOU CAN AND SHOULD EXPAND IT according to expected inputs!
// Order may matter if the formats are ambiguous.
// It is recommended to give more specific formats (with time, full year) before more general ones.
$possibleFormats = [
// Czech formats (day.month.year)
'd.m.Y H:i:s', // 10.04.2025 10:57:47
'j.n.Y H:i:s', // 10.4.2025 10:57:47
'd. m. Y H:i:s', // 10. 04. 2025 10:57:47
'j. n. Y H:i:s', // 10. 4. 2025 10:57:47
'd.m.Y H:i', // 10.04.2025 10:57
'j.n.Y H:i', // 10.4.2025 10:57
'd. m. Y H:i', // 10. 04. 2025 10:57
'j. n. Y H:i', // 10. 4. 2025 10:57
'd.m.Y', // 10.04.2025
'j.n.Y', // 10.4.2025
'd. m. Y', // 10. 04. 2025
'j. n. Y', // 10. 4. 2025
// ISO 8601 and international formats (year-month-day)
'Y-m-d H:i:s', // 2025-04-10 10:57:47
'Y-m-d H:i', // 2025-04-10 10:57
'Y-m-d', // 2025-04-10
'YmdHis', // 20250410105747
'Ymd', // 20250410
// American formats (month/day/year) - beware of ambiguity!
'm/d/Y H:i:s', // 04/10/2025 10:57:47
'n/j/Y H:i:s', // 4/10/2025 10:57:47
'm/d/Y H:i', // 04/10/2025 10:57
'n/j/Y H:i', // 4/10/2025 10:57
'm/d/Y', // 04/10/2025
'n/j/Y', // 4/10/2025
// Standard formats (including time zone)
DateTime::ATOM, // e.g. 2025-04-10T10:57:47+02:00
DateTime::RFC3339, // e.g. 2025-04-10T10:57:47+02:00
DateTime::RFC3339_EXTENDED, // e.g. 2025-04-10T10:57:47.123+02:00
DateTime::RFC2822, // e.g. Thu, 10 Apr 2025 10:57:47 +0200
DateTime::ISO8601, // e.g. 2025-04-10T10:57:47+0200
'Y-m-d\TH:i:sP', // ISO 8601 with 'T' separator
'Y-m-d\TH:i:s.uP', // ISO 8601 with microseconds
// You can add more formats as needed...
// e.g. 'd-M-Y' (10-Apr-2025) - English month abbreviations
// e.g. 'j. F Y' (10. April 2025) - English month names
];
// Note: DateTime::createFromFormat() only understands English month/day names (F, M, l, D),
// regardless of setlocale(). Czech names (e.g. "10. dubna 2025") must be translated before parsing.
foreach ($possibleFormats as $format) {
// We will try to create a DateTime object from the given format
$dateTime = DateTime::createFromFormat($format, $dateString);
// We check that the parsing was successful AND ALSO
// that there were no errors or warnings during the parsing.
// This is important to ensure that the format matches the ENTIRE string.
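if ($dateTime !== false) {
// getLastErrors() returns false on PHP 8.2+ when there is nothing to report,
// and an array with zeroed counters on older versions.
$errors = DateTime::getLastErrors();
if ($errors === false || ($errors['warning_count'] === 0 && $errors['error_count'] === 0)) {
// Success! We found a valid format for the entire string.
return $dateTime;
}
}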
}
// If no format matches or parsing failed
return null;
}
/**
* Finds values from an associative array whose keys are substrings of a given text.
*
* The function iterates through the `$map` associative array. For each key,
* it checks if that key exists as a substring within the input `$text`.
* If found, the corresponding value from the map is added to the result array.
* The search is case-sensitive and treats special characters literally.
*
* @param string $text The input text string to search within.
* @param array $map An associative array (key => value). Keys from this array will be searched for in `$text`.
* @return array An array of values whose corresponding keys were found as substrings in `$text`. Returns an empty array if no keys are found.
*/
private function findValuesByKeySubstring(string $text, array $map): array
{
$foundValues = []; // Initialize array for found values
// Iterate through each key => value pair in the map
foreach ($map as $key => $value) {
// Use strpos(), which finds the position of the first occurrence of a substring.
// Returns the position (including 0) or `false` if the substring is not found.
// We use `!== false` to correctly handle the case where the key starts at position 0.
// Cast the key to string, because PHP converts numeric array keys to ints.
// `strpos` treats special characters in the key and text literally.
if (strpos($text, (string) $key) !== false) {
// If the key was found in the text, add its corresponding value to the result array
$foundValues[] = $value;
}
}
// Return the array of found values
return $foundValues;
}
/**
* Removes Czech diacritics from a given string.
*
* This function replaces Czech characters with their ASCII equivalents.
* For example, 'á' becomes 'a', 'č' becomes 'c', etc.
*
* @param string $text The input string with Czech diacritics.
* @return string The string with Czech diacritics removed.
*/
private function removeCzechDiacritics(string $text): string
{
$czech = [
'á', 'č', 'ď', 'é', 'ě', 'í', 'ň', 'ó', 'ř', 'š', 'ť', 'ú', 'ů', 'ý', 'ž',
'Á', 'Č', 'Ď', 'É', 'Ě', 'Í', 'Ň', 'Ó', 'Ř', 'Š', 'Ť', 'Ú', 'Ů', 'Ý', 'Ž'
];
$ascii = [
'a', 'c', 'd', 'e', 'e', 'i', 'n', 'o', 'r', 's', 't', 'u', 'u', 'y', 'z',
'A', 'C', 'D', 'E', 'E', 'I', 'N', 'O', 'R', 'S', 'T', 'U', 'U', 'Y', 'Z'
];
return str_replace($czech, $ascii, $text);
}
// endregion
/**
* Creates a title from the last part of a URI by replacing unwanted characters with spaces.
*
* @param string $uri The URI to derive the title from.
* @return string The cleaned title with its first letter in uppercase.
*/
private function formatTitleFromURI(string $uri): string
{
// get last part of the URI
$title = basename($uri);
// Pattern: /[^\p{L}\p{N}]+/u
// [^...] - Match any character NOT in the set
// \p{L} - Any Unicode letter (including 'é', 'ü', 'ñ', etc.)
// \p{N} - Any Unicode number (0-9 and other numeric characters)
// + - Match one or more occurrences of the preceding pattern consecutively
// /u - Unicode modifier, essential for \p{} constructs
$pattern = '/[^\p{L}\p{N}]+/u';
$replacement = ' '; // Replace with a single space
// Replace each run of other characters with a single space and trim the result
$title = trim((string) preg_replace($pattern, $replacement, $title));
// First letter to uppercase
return ucfirst($title);
}
}
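
The format-probing approach described in the `parseDateTimeFromString` doc comment can be sketched as a standalone function. This is only an illustration under stated assumptions: `parseWithFormats` is a hypothetical name, and it uses `DateTimeImmutable` rather than the bridge's `DateTime`. The check on `getLastErrors()` covers both the array it returns on older PHP versions and the `false` it returns on PHP 8.2 and later when nothing went wrong.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): returns the first format
 * that parses the ENTIRE input without errors or warnings, otherwise null.
 */
function parseWithFormats(string $input, array $formats): ?DateTimeImmutable
{
    foreach ($formats as $format) {
        $parsed = DateTimeImmutable::createFromFormat($format, $input);
        $errors = DateTimeImmutable::getLastErrors();
        // PHP >= 8.2 returns false when there is nothing to report;
        // older versions always return an array with zeroed counters.
        $clean = $errors === false
            || ($errors['warning_count'] === 0 && $errors['error_count'] === 0);
        if ($parsed !== false && $clean) {
            return $parsed;
        }
    }
    return null;
}
```

Rejecting warnings as well as errors matters for inputs such as `31.02.2025`, which would otherwise be accepted and silently rolled over into March.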


@@ -27,7 +27,7 @@ class EpicGamesFreeBridge extends BridgeAbstract
'Türkçe' => 'tr',
'简体中文' => 'zh-CN',
'繁體中文' => 'zh-Hant',
],
],
'title' => 'Language for game information',
'defaultValue' => 'en-US',
],
@@ -51,16 +51,28 @@ class EpicGamesFreeBridge extends BridgeAbstract
$data = $json['data']['Catalog']['searchStore']['elements'];
foreach ($data as $element) {
if (!isset($element['promotions']['promotionalOffers'][0])) {
$promo = $element['promotions']['promotionalOffers'][0]['promotionalOffers'][0] ?? false;
if (
!$promo ||
$promo['discountSetting']['discountType'] !== 'PERCENTAGE' ||
$promo['discountSetting']['discountPercentage'] !== 0
) {
continue;
}
$slug = $element['catalogNs']['mappings'][0]['pageSlug'] ?? null;
if ($slug !== null) {
$uri = parent::getURI() . $this->getInput('locale') . '/p/' . $slug;
} else {
// slug not found, show the root promos page
$uri = parent::getURI() . $this->getInput('locale') . '/free-games';
}
$item = [
'author' => $element['seller']['name'],
'content' => $element['description'],
'enclosures' => array_map(fn($item) => $item['url'], $element['keyImages']),
'timestamp' => strtotime($element['promotions']['promotionalOffers'][0]['promotionalOffers'][0]['startDate']),
'timestamp' => strtotime($promo['startDate']),
'title' => $element['title'],
'url' => parent::getURI() . $this->getInput('locale') . '/p/' . $element['urlSlug'],
'uri' => $uri,
];
$this->items[] = $item;
}
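
The promotion check in this hunk can be isolated into a small function, which makes the "free game" rule easier to test. A minimal sketch, assuming the array shape visible in the hunk; `findFreePromotion` is a hypothetical name, and reading a `PERCENTAGE` discount of `0` as "free" follows the hunk's own condition rather than any documented API contract.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): returns the first active
 * promotional offer if it makes the game free, otherwise null.
 */
function findFreePromotion(array $element): ?array
{
    // The null coalescing operator tolerates missing or null intermediate keys.
    $promo = $element['promotions']['promotionalOffers'][0]['promotionalOffers'][0] ?? null;
    if ($promo === null) {
        return null;
    }
    $setting = $promo['discountSetting'] ?? [];
    // Assumption taken from the hunk: PERCENTAGE with value 0 marks a free game.
    $isFree = ($setting['discountType'] ?? null) === 'PERCENTAGE'
        && ($setting['discountPercentage'] ?? null) === 0;
    return $isFree ? $promo : null;
}
```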


@@ -36,6 +36,9 @@ class ExplosmBridge extends BridgeAbstract
$html = getSimpleHTMLDOM($url);
$element = $html->find('[class*=ComicImage]', 0);
if (!$element) {
break; // stop if the comic element was not found
}
$date = $element->find('[class^=Author__Right] p', 0)->plaintext;
$author = str_replace('by ', '', $element->find('[class^=Author__Right] p', 1)->plaintext);
$image = $element->find('img', 0)->src;


@@ -85,13 +85,13 @@ class FB2Bridge extends BridgeAbstract
$pageInfo = $this->getPageInfos($page, $cookies);
if ($pageInfo['userId'] === null) {
returnClientError(
throwClientException(
<<<EOD
Unable to get the page id. You should consider getting the ID by hand, then importing it into FB2Bridge
EOD
);
} elseif ($pageInfo['userId'] == -1) {
returnClientError(
throwClientException(
<<<EOD
This page is not accessible without being logged in.
EOD


@@ -1,72 +0,0 @@
<?php
class FDroidBridge extends BridgeAbstract
{
const MAINTAINER = 'Mitsukarenai';
const NAME = 'F-Droid Bridge';
const URI = 'https://f-droid.org/';
const CACHE_TIMEOUT = 60 * 60 * 4; // 4 hours
const DESCRIPTION = 'Returns latest added/updated apps on the open-source Android apps repository F-Droid';
const PARAMETERS = [ [
'u' => [
'name' => 'Widget selection',
'type' => 'list',
'values' => [
'Latest added apps' => 'added',
'Latest updated apps' => 'updated'
]
]
]];
public function getIcon()
{
return self::URI . 'assets/favicon.ico';
}
private function getTimestamp($url)
{
$curlOptions = [
CURLOPT_CUSTOMREQUEST => 'HEAD',
CURLOPT_NOBODY => true,
];
$response = getContents($url, [], $curlOptions, true);
$lastModified = $response->getHeader('last-modified');
$timestamp = strtotime($lastModified ?? 'today');
return $timestamp;
}
public function collectData()
{
$url = self::URI;
$html = getSimpleHTMLDOM($url);
// targeting the corresponding widget based on user selection
// "updated" is the 5th widget on the page, "added" is the 6th
switch ($this->getInput('u')) {
case 'updated':
$html_widget = $html->find('div.sidebar-widget', 5);
break;
default:
$html_widget = $html->find('div.sidebar-widget', 6);
break;
}
// and now extracting app info from the selected widget (and yeah turns out icons are of heterogeneous sizes)
foreach ($html_widget->find('a') as $element) {
$item = [];
$item['uri'] = self::URI . $element->href;
$item['title'] = $element->find('h4', 0)->plaintext;
$item['icon'] = $element->find('img', 0)->src;
$item['timestamp'] = $this->getTimestamp($item['icon']);
$item['summary'] = $element->find('span.package-summary', 0)->plaintext;
$item['content'] = '
<a href="' . $item['uri'] . '">
<img alt="" style="max-height:128px" src="' . $item['icon'] . '">
</a><br>' . $item['summary'];
$this->items[] = $item;
}
}
}

bridges/FabBridge.php

@@ -0,0 +1,43 @@
<?php
class FabBridge extends BridgeAbstract
{
const NAME = 'Epic Games Fab.com';
const URI = 'https://www.fab.com';
const DESCRIPTION = 'Limited-Time Free Game Engine Assets';
const MAINTAINER = 'thefranke';
const CACHE_TIMEOUT = 86400;
public function collectData()
{
$url = static::URI . '/i/listings/search?is_discounted=1&is_free=1';
$header = [
'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:139.0) Gecko/20100101 Firefox/139.0',
'Accept: application/json, text/plain, */*',
'Accept-Language: en',
'Accept-Encoding: gzip, deflate, br, zstd',
'Referer: ' . static::URI
];
$json = getContents($url, $header);
$json = json_decode($json);
foreach ($json->results as $item) {
$thumbnail = $item->thumbnails[0]->mediaUrl;
$itemurl = static::URI . '/listings/' . $item->uid;
$itemapiurl = static::URI . '/i/listings/' . $item->uid;
$itemjson = getContents($itemapiurl, $header);
$itemjson = json_decode($itemjson);
$this->items[] = [
'title' => $item->title,
'author' => $item->user->sellerName,
'uri' => $itemurl,
'timestamp' => strtotime($item->lastUpdatedAt),
'content' => '<a href="' . $itemurl . '"><img src="' . $thumbnail . '"></a>' . $itemjson->description,
];
}
}
}


@@ -155,7 +155,7 @@ class FacebookBridge extends BridgeAbstract
break;
default:
returnClientError('Unknown context: "' . $this->queriedContext . '"!');
throwClientException('Unknown context: "' . $this->queriedContext . '"!');
}
$limit = $this->getInput('limit') ?: -1;
@@ -184,7 +184,7 @@ class FacebookBridge extends BridgeAbstract
$html = getSimpleHTMLDOM($touchURI, $header);
if (!$this->isPublicGroup($html)) {
returnClientError('This group is not public! RSS-Bridge only supports public groups!');
throwClientException('This group is not public! RSS-Bridge only supports public groups!');
}
defaultLinkTo($html, substr(self::URI, 0, strlen(self::URI) - 1));
@@ -192,7 +192,7 @@ class FacebookBridge extends BridgeAbstract
$this->groupName = $this->extractGroupName($html);
$posts = $html->find('div.story_body_container')
or returnServerError('Failed finding posts!');
or throwServerException('Failed finding posts!');
foreach ($posts as $post) {
$item = [];
@@ -224,7 +224,7 @@ class FacebookBridge extends BridgeAbstract
return explode('/', $urlparts['path'])[2];
} elseif (strpos($group, '/') !== false) {
returnClientError('The group you provided is invalid: ' . $group);
throwClientException('The group you provided is invalid: ' . $group);
} else {
return $group;
}
@@ -246,7 +246,7 @@ class FacebookBridge extends BridgeAbstract
$provided_host !== $facebook_host
&& 'www.' . $provided_host !== $facebook_host
) {
returnClientError('The host you provided is invalid! Received "'
throwClientException('The host you provided is invalid! Received "'
. $provided_host
. '", expected "'
. $facebook_host
@@ -268,7 +268,7 @@ class FacebookBridge extends BridgeAbstract
private function extractGroupName($html)
{
$ogtitle = $html->find('._de1', 0)
or returnServerError('Unable to find group title!');
or throwServerException('Unable to find group title!');
return html_entity_decode($ogtitle->plaintext, ENT_QUOTES);
}
@@ -276,7 +276,7 @@ class FacebookBridge extends BridgeAbstract
private function extractGroupPostURI($post)
{
$elements = $post->find('a')
or returnServerError('Unable to find URI!');
or throwServerException('Unable to find URI!');
foreach ($elements as $anchor) {
// Find the one that is a permalink
@@ -292,7 +292,7 @@ class FacebookBridge extends BridgeAbstract
private function extractGroupPostContent($post)
{
$content = $post->find('div._5rgt', 0)
or returnServerError('Unable to find user content!');
or throwServerException('Unable to find user content!');
$context_text = $content->innertext;
if ($content->next_sibling() !== null) {
@@ -304,7 +304,7 @@ class FacebookBridge extends BridgeAbstract
private function extractGroupPostAuthor($post)
{
$element = $post->find('h3 a', 0)
or returnServerError('Unable to find author information!');
or throwServerException('Unable to find author information!');
return $element->plaintext;
}
@@ -334,7 +334,7 @@ class FacebookBridge extends BridgeAbstract
private function extractGroupPostTitle($post)
{
$element = $post->find('h3', 0)
or returnServerError('Unable to find title!');
or throwServerException('Unable to find title!');
if (strpos($element->plaintext, 'shared') === false) {
$content = strip_tags($this->extractGroupPostContent($post));
@@ -370,14 +370,14 @@ class FacebookBridge extends BridgeAbstract
!array_key_exists('path', $urlparts)
|| $urlparts['path'] === '/'
) {
returnClientError('The URL you provided doesn\'t contain the user name!');
throwClientException('The URL you provided doesn\'t contain the user name!');
}
return explode('/', $urlparts['path'])[1];
} else {
// First character cannot be a forward slash
if (strpos($user, '/') === 0) {
returnClientError('Remove leading slash "/" from the username!');
throwClientException('Remove leading slash "/" from the username!');
}
return $user;
@@ -572,7 +572,7 @@ EOD;
$loginForm = $html->find('._585r', 0);
if ($loginForm != null) {
returnServerError('You must be logged in to view this page. This is not supported by RSS-Bridge.');
throwServerException('You must be logged in to view this page. This is not supported by RSS-Bridge.');
}
$mainColumn = $html->find('#pagelet_timeline_main_column');


@@ -37,13 +37,14 @@ class FallGuysBridge extends BridgeAbstract
public function collectData()
{
$html = getSimpleHTMLDOM(self::getURI());
$newsData = self::requestJsonData(self::getURI(), false);
$data = json_decode($html->find('#__NEXT_DATA__', 0)->innertext);
foreach ($newsData->props->pageProps->newsList as $newsItem) {
$newsItemUrl = self::getURI() . '/' . $newsItem->slug;
$newsItemTitle = $newsItem->header->title;
foreach ($data->props->pageProps->newsList as $newsItem) {
$headerDescription = property_exists($newsItem->header, 'description') ? $newsItem->header->description : '';
$headerImage = $newsItem->header->image->src;
$headerImage = $newsItem->newsLandingConfig->options[0]->image->src->url;
$contentImages = [$headerImage];
@@ -52,67 +53,79 @@ class FallGuysBridge extends BridgeAbstract
<p><img src="{$headerImage}"></p>
HTML;
foreach ($newsItem->content->items as $contentItem) {
if (property_exists($contentItem, 'articleCopy')) {
if (property_exists($contentItem->articleCopy, 'title')) {
$title = $contentItem->articleCopy->title;
try {
$newsItemData = self::requestJsonData($newsItemUrl, true);
} catch (\Exception $e) {
$this->logger->error(sprintf('Failed to request data for news item "%s" (%s)', $newsItemTitle, $newsItemUrl), ['e' => $e]);
$newsItemData = null;
}
if (!$newsItemData) {
$this->logger->error(sprintf('Failed to parse json data for news item "%s" (%s)', $newsItemTitle, $newsItemUrl));
} else {
foreach ($newsItemData->props->pageProps->pageData->content->items as $contentItem) {
if (property_exists($contentItem, 'articleCopy')) {
if (property_exists($contentItem->articleCopy, 'title')) {
$title = $contentItem->articleCopy->title;
$content .= <<<HTML
<h2>{$title}</h2>
HTML;
}
$text = $contentItem->articleCopy->copy;
$content .= <<<HTML
<h2>{$title}</h2>
<p>{$text}</p>
HTML;
}
} elseif (property_exists($contentItem, 'articleImage')) {
$image = $contentItem->articleImage->imageSrc;
$text = $contentItem->articleCopy->copy;
if ($image != $headerImage) {
$contentImages[] = $image;
$content .= <<<HTML
<p>{$text}</p>
HTML;
} elseif (property_exists($contentItem, 'articleImage')) {
$image = $contentItem->articleImage->imageSrc;
$content .= <<<HTML
<p><img src="{$image}"></p>
HTML;
}
} elseif (property_exists($contentItem, 'embeddedVideo')) {
$mediaOptions = $contentItem->embeddedVideo->mediaOptions;
$mainContentOptions = $contentItem->embeddedVideo->mainContentOptions;
if ($image != $headerImage) {
$contentImages[] = $image;
if (count($mediaOptions) == count($mainContentOptions)) {
for ($i = 0; $i < count($mediaOptions); $i++) {
if (property_exists($mediaOptions[$i], 'youtubeVideo')) {
$videoUrl = 'https://youtu.be/' . $mediaOptions[$i]->youtubeVideo->contentId;
$image = $mainContentOptions[$i]->image->src ?? '';
$content .= <<<HTML
<p><img src="{$image}"></p>
HTML;
}
} elseif (property_exists($contentItem, 'embeddedVideo')) {
$mediaOptions = $contentItem->embeddedVideo->mediaOptions;
$mainContentOptions = $contentItem->embeddedVideo->mainContentOptions;
$content .= '<p>';
if (count($mediaOptions) == count($mainContentOptions)) {
for ($i = 0; $i < count($mediaOptions); $i++) {
if (property_exists($mediaOptions[$i], 'youtubeVideo')) {
$videoUrl = 'https://youtu.be/' . $mediaOptions[$i]->youtubeVideo->contentId;
$image = $mainContentOptions[$i]->image->src ?? '';
if ($image != $headerImage) {
$contentImages[] = $image;
$content .= '<p>';
if ($image != $headerImage) {
$contentImages[] = $image;
$content .= <<<HTML
<a href="{$videoUrl}"><img src="{$image}"></a><br>
HTML;
}
$content .= <<<HTML
<a href="{$videoUrl}"><img src="{$image}"></a><br>
<i>(Video: <a href="{$videoUrl}">{$videoUrl}</a>)</i>
HTML;
$content .= '</p>';
}
$content .= <<<HTML
<i>(Video: <a href="{$videoUrl}">{$videoUrl}</a>)</i>
HTML;
$content .= '</p>';
}
}
} else {
$this->logger->warning(sprintf('Unsupported content item in news item "%s" (%s)', $newsItemTitle, $newsItemUrl));
}
}
}
$item = [
'uid' => $newsItem->_id,
'uri' => self::getURI() . '/' . $newsItem->_slug,
'title' => $newsItem->_title,
'timestamp' => $newsItem->lastModified,
'uid' => $newsItem->id,
'uri' => $newsItemUrl,
'title' => $newsItemTitle,
'timestamp' => $newsItem->activeDate,
'content' => $content,
'enclosures' => $contentImages,
];
@@ -131,4 +144,12 @@ class FallGuysBridge extends BridgeAbstract
{
return self::BASE_URI . '/favicon.ico';
}
private function requestJsonData(string $url, bool $useCache)
{
$html = $useCache ? getSimpleHTMLDOMCached($url) : getSimpleHTMLDOM($url);
$jsonElement = $html->find('#__NEXT_DATA__', 0);
// Avoid passing null to json_decode(), which is deprecated since PHP 8.1
return $jsonElement ? json_decode($jsonElement->innertext) : null;
}
}
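
The `requestJsonData` helper reads the JSON that Next.js embeds in a `<script id="__NEXT_DATA__">` tag. The same idea can be sketched without the RSS-Bridge helpers, using PHP's bundled DOM extension. `extractNextData` is a hypothetical name, and the sketch assumes the markup has already been fetched as a string and that the payload decodes to a JSON object.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): decodes the JSON payload of
 * <script id="__NEXT_DATA__">. Returns null if the tag is missing or the
 * content is not a JSON object.
 */
function extractNextData(string $html): ?stdClass
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }
    $dom = new DOMDocument();
    // Real-world markup is rarely valid; collect libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $script = $xpath->query('//script[@id="__NEXT_DATA__"]')->item(0);
    if ($script === null) {
        return null;
    }
    $data = json_decode($script->textContent);
    return $data instanceof stdClass ? $data : null;
}
```

Returning null for every failure mode lets the caller handle a missing tag and malformed JSON with a single check, which matches how the hunk tests `!$newsItemData`.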


@@ -0,0 +1,95 @@
<?php
class FanaticalBridge extends BridgeAbstract
{
const NAME = 'Fanatical';
const MAINTAINER = 'phantop';
const URI = 'https://www.fanatical.com/en/';
const DESCRIPTION = 'Returns bundles from Fanatical.';
const PARAMETERS = [[
'type' => [
'name' => 'Bundle type',
'type' => 'list',
'defaultValue' => 'all',
'values' => [
'All' => 'all',
'Books' => 'book-',
'ELearning' => 'elearning-',
'Games' => '',
'Software' => 'software-',
]
]
]];
const IMGURL = 'https://fanatical.imgix.net/product/original/';
public function collectData()
{
$api = 'https://www.fanatical.com/api/all/en';
$json = json_decode(getContents($api), true)['pickandmix'];
$type = $this->getInput('type');
foreach ($json as $element) {
if ($type != 'all') {
if ($element['type'] != $type . 'bundle') {
continue;
}
}
$item = [
'categories' => [$element['type']],
'content' => '<ul>',
'enclosures' => [self::IMGURL . $element['cover_image']],
'timestamp' => $element['valid_from'],
'title' => $element['name'],
'uri' => parent::getURI() . 'pick-and-mix/' . $element['slug'],
];
$slugs = [];
foreach ($element['products'] as $product) {
$slug = $product['slug'];
if (in_array($slug, $slugs)) {
continue;
}
$slugs[] = $slug;
$uri = parent::getURI() . 'game/' . $slug;
$item['content'] .= '<li><a href="' . $uri . '">' . $product['name'] . '</a></li>';
$item['enclosures'][] = self::IMGURL . $product['cover'];
}
foreach ($element['tiers'] as $tier) {
$count = $tier['quantity'];
$price = round($tier['price']['USD'] / 100, 2);
$per = round($price / $count, 2);
$item['categories'][] = "$count at $per for $price total";
}
$item['content'] .= '</ul>';
$this->items[] = $item;
}
}
public function getName()
{
$name = parent::getName();
$name .= $this->getKey('type') ? ' - ' . $this->getKey('type') : '';
return $name;
}
public function getURI()
{
$uri = parent::getURI();
$type = $this->getKey('type');
if ($type) {
$uri .= 'bundle/';
if ($type != 'All') {
$uri .= strtolower($type);
}
}
return $uri;
}
public function getIcon()
{
return 'https://cdn.fanatical.com/production/icons/fanatical-icon-android-chrome-192x192.png';
}
}
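
The tier loop in `collectData` turns a price into a per-item price and a category label. A small sketch of that arithmetic follows; `describeTier` is a hypothetical name, and it assumes (as the division by 100 in the bridge suggests) that the API reports prices in cents and that the quantity is never zero.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): builds the tier label
 * from a quantity and a price given in cents.
 */
function describeTier(int $quantity, int $priceInCents): string
{
    $price = round($priceInCents / 100, 2); // cents to currency units
    $per = round($price / $quantity, 2);    // price per item, rounded to cents
    return "$quantity at $per for $price total";
}

echo describeTier(3, 499), "\n"; // 3 at 1.66 for 4.99 total
echo describeTier(5, 1000), "\n"; // 5 at 2 for 10 total
```

Because the floats are interpolated directly, trailing zeros are dropped (`10` rather than `10.00`); `number_format($price, 2)` would keep them if a fixed layout is preferred.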


@@ -40,7 +40,7 @@ class FeedExpanderExampleBridge extends FeedExpander
parent::collectExpandableDatas('http://segfault.linuxmint.com/feed/atom/');
break;
default:
returnClientError('Unknown version ' . $this->getInput('version') . '!');
throwClientException('Unknown version ' . $this->getInput('version') . '!');
}
}
}


@@ -6,8 +6,10 @@ class FeedMergeBridge extends FeedExpander
const NAME = 'FeedMerge';
const URI = 'https://github.com/RSS-Bridge/rss-bridge';
const DESCRIPTION = <<<'TEXT'
This bridge merges two or more feeds into a single feed. Max 10 items are fetched from each feed.
TEXT;
This bridge merges two or more feeds into a single feed. <br>
Max 10 latest items are fetched from each individual feed. <br>
Items with identical url or title are considered duplicates (and are removed). <br>
TEXT;
const PARAMETERS = [
[
@@ -36,11 +38,11 @@ TEXT;
];
/**
* todo: Consider a strategy which produces a shorter feed url
* TODO: Consider a strategy which produces a shorter feed url
*/
public function collectData()
{
$limit = (int)($this->getInput('limit') ?: 10);
$limit = (int)($this->getInput('limit') ?: 99);
$feeds = [
$this->getInput('feed_1'),
$this->getInput('feed_2'),
@@ -61,7 +63,7 @@ TEXT;
if (count($feeds) > 1) {
// Allow one or more feeds to fail
try {
$this->collectExpandableDatas($feed);
$this->collectExpandableDatas($feed, 10);
} catch (HttpException $e) {
$this->logger->warning(sprintf('Exception in FeedMergeBridge: %s', create_sane_exception_message($e)));
// This feed item might be spammy. Consider dropping it.
@@ -80,31 +82,48 @@ TEXT;
throw $e;
}
} else {
$this->collectExpandableDatas($feed);
$this->collectExpandableDatas($feed, 10);
}
}
// If $this->items is empty, we should consider throwing an exception here
// Sort by timestamp descending
// Sort by timestamp, uri, title in descending order
usort($this->items, function ($a, $b) {
$t1 = $a['timestamp'] ?? $a['uri'] ?? $a['title'];
$t2 = $b['timestamp'] ?? $b['uri'] ?? $b['title'];
return $t2 <=> $t1;
});
// Remove duplicates by using url as unique key
// Remove duplicates by url
$items = [];
foreach ($this->items as $item) {
$index = $item['uri'] ?? null;
if ($index) {
// Overwrite duplicates
$items[$index] = $item;
$uri = $item['uri'] ?? null;
if ($uri) {
// Insert or override the existing duplicate
$items[$uri] = $item;
} else {
// The item doesn't have a uri!
$items[] = $item;
}
}
$this->items = array_slice(array_values($items), 0, $limit);
$this->items = array_values($items);
// Remove duplicates by title
$items = [];
foreach ($this->items as $item) {
$title = $item['title'] ?? null;
if ($title) {
// Insert or override the existing duplicate
$items[$title] = $item;
} else {
// The item doesn't have a title!
$items[] = $item;
}
}
$this->items = array_values($items);
$this->items = array_slice($this->items, 0, $limit);
}
public function getIcon()
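
The sort, de-duplicate and limit steps of `collectData` can be sketched as two standalone functions. `dedupeBy` and `mergeItems` are hypothetical names. One deliberate deviation from the hunk: keys are prefixed with `k:` so that a numeric-looking uri or title cannot collide with the integer keys of appended items.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): keeps one item per distinct
 * value of $key. Items without that key are always kept. A later duplicate
 * overwrites the earlier one but stays at the earlier position.
 */
function dedupeBy(array $items, string $key): array
{
    $unique = [];
    foreach ($items as $item) {
        $value = $item[$key] ?? null;
        if ($value) {
            $unique['k:' . $value] = $item; // insert or override the duplicate
        } else {
            $unique[] = $item; // no value to compare, keep as-is
        }
    }
    return array_values($unique);
}

/**
 * Hypothetical helper: sort descending, drop duplicates by uri and by title,
 * then cut the list to $limit items.
 */
function mergeItems(array $items, int $limit): array
{
    usort($items, function (array $a, array $b): int {
        $t1 = $a['timestamp'] ?? $a['uri'] ?? $a['title'] ?? '';
        $t2 = $b['timestamp'] ?? $b['uri'] ?? $b['title'] ?? '';
        return $t2 <=> $t1;
    });
    $items = dedupeBy($items, 'uri');
    $items = dedupeBy($items, 'title');
    return array_slice($items, 0, $limit);
}
```

Because the list is sorted newest first and a later duplicate overwrites the earlier entry, this approach keeps the older of two duplicates in the newer one's position, so the result is not guaranteed to stay strictly ordered by timestamp.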


@@ -22,19 +22,9 @@ class FinanzflussBridge extends BridgeAbstract
$domarticle = getSimpleHTMLDOM($url);
$content = $domarticle->find('div.content', 0);
//get header-image and set absolute src
//get header-image
$headerimage = $domarticle->find('div.article-header-image', 0);
$headerimageimg = $headerimage->find('img[src]', 0);
$src = $headerimageimg->src;
$headerimageimg->src = $baseurl . $src;
$headerimageimg->srcset = $baseurl . $src;
//set absolute src for all img
foreach ($content->find('img[src]') as $img) {
$src = $img->src;
$img->src = $baseurl . $src;
$img->srcset = $baseurl . $src;
}
//remove unwanted stuff
foreach ($content->find('div.newsletter-signup') as $element) {


@@ -60,7 +60,7 @@ class FindACrewBridge extends BridgeAbstract
CURLOPT_POSTFIELDS => http_build_query($data) . "\n"
];
$html = getSimpleHTMLDOM($url, $header, $opts) or returnClientError('No results for this query.');
$html = getSimpleHTMLDOM($url, $header, $opts);
$annonces = $html->find('.css_SrhRst');
$limit = $this->getInput('limit') ?? 10;


@@ -58,13 +58,13 @@ class FirefoxAddonsBridge extends BridgeAbstract
}
$item['content'] = <<<EOD
<strong>Release Notes</strong>
<p><strong>Release Notes</strong></p>
<p>{$releaseNotes}</p>
<strong>Compatibility</strong>
<p><strong>Compatibility</strong></p>
<p>{$compatibility}</p>
<strong>License</strong>
<p><strong>License</strong></p>
<p>{$license}</p>
<strong>Download</strong>
<p><strong>Download</strong></p>
<p><a href="{$downloadlink}">{$xpiFilename}</a> ($size)</p>
EOD;


@@ -1,52 +0,0 @@
<?php
class FirstLookMediaTechBridge extends BridgeAbstract
{
const NAME = 'First Look Media - Technology';
const URI = 'https://tech.firstlook.media';
const DESCRIPTION = 'First Look Media Technology page';
const MAINTAINER = 'somini';
const PARAMETERS = [
[
'projects' => [
'type' => 'checkbox',
'name' => 'Include Projects?',
]
]
];
public function collectData()
{
$html = getSimpleHTMLDOM(self::URI);
if ($this->getInput('projects')) {
$top_projects = $html->find('.PromoList-ul', 0);
foreach ($top_projects->find('li.PromoList-item') as $element) {
$item = [];
$item_uri = $element->find('a', 0);
$item['uri'] = $item_uri->href;
$item['title'] = strip_tags($item_uri->innertext);
$item['content'] = $element->find('div > div', 0);
$this->items[] = $item;
}
}
$top_articles = $html->find('.PromoList-ul', 1);
foreach ($top_articles->find('li.PromoList-item') as $element) {
$item = [];
$item_left = $element->find('div > div', 0);
$item_date = $element->find('.PromoList-date', 0);
$item['timestamp'] = strtotime($item_date->innertext);
$item_date->outertext = ''; /* Remove */
$item['author'] = $item_left->innertext;
$item_uri = $element->find('a', 0);
$item['uri'] = self::URI . $item_uri->href;
$item['title'] = strip_tags($item_uri);
$this->items[] = $item;
}
}
}


@@ -112,7 +112,7 @@ class FlickrBridge extends BridgeAbstract
break;
default:
returnClientError('Invalid context: ' . $this->queriedContext);
throwClientException('Invalid context: ' . $this->queriedContext);
}
$model_json = $this->extractJsonModel($html);


@@ -5,13 +5,13 @@ class Formula1Bridge extends BridgeAbstract
const NAME = 'Formula1 Bridge';
const URI = 'https://formula1.com/';
const DESCRIPTION = 'Returns latest official Formula 1 news';
const MAINTAINER = 'AxorPL';
const MAINTAINER = 'axor-mst';
const API_KEY = 'qPgPPRJyGCIPxFT3el4MF7thXHyJCzAP';
const API_KEY = 'xZ7AOODSjiQadLsIYWefQrpCSQVDbHGC';
const API_URL = 'https://api.formula1.com/v1/editorial/articles?limit=%u';
const ARTICLE_AUTHOR = 'Formula 1';
const ARTICLE_URL = 'https://formula1.com/en/latest/article.%s.%s.html';
const ARTICLE_URL = 'https://formula1.com/en/latest/article/%s.%s';
const LIMIT_MIN = 1;
const LIMIT_DEFAULT = 10;
@@ -36,9 +36,13 @@ class Formula1Bridge extends BridgeAbstract
$limit = min(self::LIMIT_MAX, max(self::LIMIT_MIN, $limit));
$url = sprintf(self::API_URL, $limit);
$json = json_decode(getContents($url, ['apikey: ' . self::API_KEY]));
$json = json_decode(getContents($url, [
'Accept: application/json',
'apikey: ' . self::API_KEY,
'locale: en'
]));
if (property_exists($json, 'error')) {
returnServerError($json->message);
throwServerException($json->message);
}
$list = $json->items;
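
The hunk clamps the requested limit with `min(self::LIMIT_MAX, max(self::LIMIT_MIN, $limit))` before building the API URL. As a minimal sketch, the same clamp can be written as a standalone function; `clampLimit` is a hypothetical name, and the bounds are passed in because the value of `LIMIT_MAX` is not shown in this excerpt.

```php
<?php
/**
 * Hypothetical helper (not part of the bridge): forces $limit into the
 * inclusive range [$min, $max].
 */
function clampLimit(int $limit, int $min, int $max): int
{
    // max() raises values below the lower bound, min() caps values above the upper bound.
    return min($max, max($min, $limit));
}
```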


@@ -1,78 +0,0 @@
<?php
class FragDenStaatBridge extends BridgeAbstract
{
const MAINTAINER = 'swofl';
const NAME = 'FragDenStaat';
const URI = 'https://fragdenstaat.de';
const CACHE_TIMEOUT = 2 * 60 * 60; // 2h
const DESCRIPTION = 'Get latest blog posts from FragDenStaat Exklusiv';
const PARAMETERS = [ [
'qLimit' => [
'name' => 'Query Limit',
'title' => 'Amount of articles to query',
'type' => 'number',
'defaultValue' => 5,
],
] ];
protected function parseTeaser($teaser)
{
$result = [];
$header = $teaser->find('h3 > a', 0);
$result['title'] = $header->plaintext;
$result['uri'] = static::URI . $header->href;
$result['enclosures'] = [];
$result['enclosures'][] = $teaser->find('img', 0)->src;
$result['uid'] = hash('sha256', $result['title']);
$result['timestamp'] = strtotime($teaser->find('time', 0)->getAttribute('datetime'));
return $result;
}
public function collectData()
{
$html = getSimpleHTMLDOM(self::URI . '/artikel/exklusiv/');
$queryLimit = (int) $this->getInput('qLimit');
if ($queryLimit > 12) {
$queryLimit = 12;
}
$teasers = [];
$teaserElements = $html->find('article');
for ($i = 0; $i < $queryLimit; $i++) {
array_push($teasers, $this->parseTeaser($teaserElements[$i]));
}
foreach ($teasers as $article) {
$articleHtml = getSimpleHTMLDOMCached($article['uri'], static::CACHE_TIMEOUT * 6);
$articleCore = $articleHtml->find('article.blog-article', 0);
$content = '';
$lead = $articleCore->find('div.lead > p', 0)->innertext;
$content .= '<h2>' . $lead . '</h2>';
foreach ($articleCore->find('div.blog-content > p, div.blog-content > h3') as $paragraph) {
$content .= $paragraph->outertext;
}
$article['content'] = '<img src="' . $article['enclosures'][0] . '"/>' . $content;
$article['author'] = '';
foreach ($articleCore->find('a[rel="author"]') as $author) {
$article['author'] .= $author->innertext . ', ';
}
$article['author'] = rtrim($article['author'], ', ');
$this->items[] = $article;
}
}
}

@@ -3,7 +3,8 @@
class FreeTelechargerBridge extends BridgeAbstract
{
const NAME = 'Free-Telecharger';
const URI = 'https://www.free-telecharger.art/';
const URI = 'https://www.free-telecharger.fun/';
const ALTERNATEURI = 'https://www.free-telecharger.com/';
const DESCRIPTION = 'Suivi de série sur Free-Telecharger';
const MAINTAINER = 'sysadminstory';
const PARAMETERS = [
@@ -12,19 +13,19 @@ class FreeTelechargerBridge extends BridgeAbstract
'name' => 'URL de la série',
'type' => 'text',
'required' => true,
'title' => 'URL d\'une série sans le https://www.free-telecharger.art/',
'title' => 'URL d\'une série sans le https://www.free-telecharger.fun/',
'pattern' => 'series.*\.html',
'exampleValue' => 'series-vf-hd/151432-wolf-saison-1-complete-web-dl-720p.html'
],
]
];
const CACHE_TIMEOUT = 3600;
private string $showTitle;
private string $showTechDetails;
private string $showTitle = '';
private string $showTechDetails = '';
public function collectData()
{
$html = getSimpleHTMLDOM(self::URI . $this->getInput('url'));
$html = getSimpleHTMLDOM(self::ALTERNATEURI . $this->getInput('url'));
// Find all block content of the page
$blocks = $html->find('div[class=block1]');

@@ -40,7 +40,7 @@ class FunkBridge extends BridgeAbstract
}
break;
default:
returnServerError('Unknown context!');
throwServerException('Unknown context!');
}
}

@@ -920,7 +920,9 @@ class FurAffinityBridge extends BridgeAbstract
break;
}
$item = [];
$item = [
'categories' => [],
];
$submissionURL = $figure->find('b u a', 0)->href;
$imgURL = $figure->find('b u a img', 0)->src;
@@ -936,8 +938,7 @@ class FurAffinityBridge extends BridgeAbstract
if ($this->getInput('full') === true) {
$submissionHTML = $this->getFASimpleHTMLDOM($submissionURL, $cache);
if (!$this->isHiddenSubmission($submissionHTML)) {
$stats = $submissionHTML->find('.stats-container', 0);
$popupDate = $stats->find('.popup_date', 0);
$popupDate = $submissionHTML->find('section .popup_date', 0);
if ($popupDate) {
$item['timestamp'] = strtotime($popupDate->title);
}
@@ -947,9 +948,10 @@ class FurAffinityBridge extends BridgeAbstract
$item['enclosures'] = [$var->href];
}
foreach ($stats->find('#keywords a') as $keyword) {
foreach ($submissionHTML->find('.tags-row .tags a') as $keyword) {
$item['categories'][] = $keyword->plaintext;
}
$item['categories'] = array_filter($item['categories']);
$previewSrc = $submissionHTML->find('#submissionImg', 0);
if ($previewSrc) {

@@ -34,8 +34,7 @@ class FurAffinityUserBridge extends BridgeAbstract
$url = self::URI . '/gallery/' . $this->getInput('searchUsername');
$html = getSimpleHTMLDOM($url, [], $opt)
or returnServerError('Could not load the user\'s gallery page.');
$html = getSimpleHTMLDOM($url, [], $opt);
$submissions = $html->find('section[id=gallery-gallery]', 0)->find('figure');
foreach ($submissions as $submission) {

@@ -9,20 +9,19 @@ class GOGBridge extends BridgeAbstract
public function collectData()
{
$values = getContents('https://www.gog.com/games/ajax/filtered?limit=25&sort=new');
$values = getContents('https://catalog.gog.com/v1/catalog?limit=48&order=desc%3AstoreReleaseDate');
$decodedValues = json_decode($values);
$limit = 0;
foreach ($decodedValues->products as $game) {
$item = [];
$item['author'] = $game->developer . ' / ' . $game->publisher;
$item['author'] = implode(', ', $game->developers) . ' / ' . implode(', ', $game->publishers);
$item['title'] = $game->title;
$item['id'] = $game->id;
$item['uri'] = self::URI . $game->url;
$item['uri'] = $game->storeLink;
$item['content'] = $this->buildGameContentPage($game);
$item['timestamp'] = $game->globalReleaseDate;
foreach ($game->gallery as $image) {
foreach ($game->screenshots as $image) {
$item['enclosures'][] = $image . '.jpg';
}
@@ -42,18 +41,10 @@ class GOGBridge extends BridgeAbstract
$gameDescriptionValue = json_decode($gameDescriptionText);
$content = 'Genres: ';
$content .= implode(', ', $game->genres);
$content .= implode(', ', array_column($game->genres, 'name'));
$content .= '<br />Supported Platforms: ';
if ($game->worksOn->Windows) {
$content .= 'Windows ';
}
if ($game->worksOn->Mac) {
$content .= 'Mac ';
}
if ($game->worksOn->Linux) {
$content .= 'Linux ';
}
$content .= implode(', ', $game->operatingSystems);
$content .= '<br />' . $gameDescriptionValue->description->full;

@@ -56,7 +56,7 @@ class GitHubGistBridge extends BridgeAbstract
$html = defaultLinkTo($html, $this->getURI());
$fileinfo = $html->find('[class~="file-info"]', 0)
or returnServerError('Could not find file info!');
or throwServerException('Could not find file info!');
$this->filename = $fileinfo->plaintext;
@@ -68,18 +68,18 @@ class GitHubGistBridge extends BridgeAbstract
foreach ($comments as $comment) {
$uri = $comment->find('a[href*=#gistcomment]', 0)
or returnServerError('Could not find comment anchor!');
or throwServerException('Could not find comment anchor!');
$title = $comment->find('h3', 0);
$datetime = $comment->find('[datetime]', 0)
or returnServerError('Could not find comment datetime!');
or throwServerException('Could not find comment datetime!');
$author = $comment->find('a.author', 0)
or returnServerError('Could not find author name!');
or throwServerException('Could not find author name!');
$message = $comment->find('[class~="comment-body"]', 0)
or returnServerError('Could not find comment body!');
or throwServerException('Could not find comment body!');
$item = [];

@@ -155,8 +155,7 @@ class GiteaBridge extends BridgeAbstract
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request ' . $this->getURI());
$html = getSimpleHTMLDOM($this->getURI());
$html = defaultLinkTo($html, $this->getURI());
$this->title = $html->find('[property="og:title"]', 0)->content;
@@ -189,7 +188,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectReleasesData($html)
{
$releases = $html->find('#release-list > li')
or returnServerError('Unable to find releases');
or throwServerException('Unable to find releases');
foreach ($releases as $release) {
$this->items[] = [
@@ -204,7 +203,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectTagsData($html)
{
$tags = $html->find('table#tags-table > tbody > tr')
or returnServerError('Unable to find tags');
or throwServerException('Unable to find tags');
foreach ($tags as $tag) {
$this->items[] = [
@@ -217,7 +216,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectCommitsData($html)
{
$commits = $html->find('#commits-table tbody tr')
or returnServerError('Unable to find commits');
or throwServerException('Unable to find commits');
foreach ($commits as $commit) {
$this->items[] = [
@@ -233,7 +232,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectIssuesData($html)
{
$issues = $html->find('.issue.list li')
or returnServerError('Unable to find issues');
or throwServerException('Unable to find issues');
foreach ($issues as $issue) {
$uri = $issue->find('a', 0)->href;
@@ -246,8 +245,7 @@ class GiteaBridge extends BridgeAbstract
];
if ($this->getInput('include_description')) {
$issue_html = getSimpleHTMLDOMCached($uri, 3600)
or returnServerError('Unable to load issue description');
$issue_html = getSimpleHTMLDOMCached($uri, 3600);
$issue_html = defaultLinkTo($issue_html, $uri);
@@ -261,7 +259,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectSingleIssueOrPrData($html)
{
$comments = $html->find('.comment')
or returnServerError('Unable to find comments');
or throwServerException('Unable to find comments');
foreach ($comments as $comment) {
if (
@@ -295,7 +293,7 @@ class GiteaBridge extends BridgeAbstract
protected function collectPullRequestsData($html)
{
$issues = $html->find('.issue.list li')
or returnServerError('Unable to find pull requests');
or throwServerException('Unable to find pull requests');
foreach ($issues as $issue) {
$uri = $issue->find('a', 0)->href;
@@ -308,8 +306,7 @@ class GiteaBridge extends BridgeAbstract
];
if ($this->getInput('include_description')) {
$issue_html = getSimpleHTMLDOMCached($uri, 3600)
or returnServerError('Unable to load issue description');
$issue_html = getSimpleHTMLDOMCached($uri, 3600);
$issue_html = defaultLinkTo($issue_html, $uri);

@@ -192,16 +192,22 @@ class GithubIssueBridge extends BridgeAbstract
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI());
$url = $this->getURI();
$html = getSimpleHTMLDOM($url);
switch ($this->queriedContext) {
case static::BRIDGE_OPTIONS[1]: // Issue comments
$this->items = $this->extractIssueComments($html);
break;
case static::BRIDGE_OPTIONS[0]: // Project Issues
foreach ($html->find('.js-active-navigation-container .js-navigation-item') as $issue) {
$info = $issue->find('.opened-by', 0);
// PRs
$issues = $html->find('.js-active-navigation-container .js-navigation-item');
if (!$issues) {
// Issues
$issues = $html->find('.IssueRow-module__row--XmR1f');
}
foreach ($issues as $issue) {
preg_match('/\/([0-9]+)$/', $issue->find('a', 0)->href, $match);
$issueNbr = $match[1];
@@ -211,6 +217,7 @@ class GithubIssueBridge extends BridgeAbstract
if ($this->getInput('c')) {
$uri = static::URI . $this->getInput('u')
. '/' . $this->getInput('p') . '/' . static::URL_PATH . '/' . $issueNbr;
$issue = getSimpleHTMLDOMCached($uri, static::CACHE_TIMEOUT);
if ($issue) {
$this->items = array_merge(
@@ -222,24 +229,34 @@ class GithubIssueBridge extends BridgeAbstract
$item['content'] = 'Can not extract comments from ' . $uri;
}
$item['author'] = $info->find('a', 0)->plaintext;
$item['timestamp'] = strtotime(
$info->find('relative-time', 0)->getAttribute('datetime')
);
$item['title'] = html_entity_decode(
$issue->find('.js-navigation-open', 0)->plaintext,
ENT_QUOTES,
'UTF-8'
);
$item['author'] = $issue->find('a', 1)->plaintext;
$comment_count = 0;
if ($span = $issue->find('a[aria-label*="comment"] span', 0)) {
$comment_count = $span->plaintext;
$time = $issue->find('relative-time', 0);
$datetime = $time->getAttribute('datetime');
if ($datetime) {
$item['timestamp'] = strtotime($datetime);
}
$item['content'] .= "\n" . 'Comments: ' . $comment_count;
$item['title'] = '';
# Works for PRs
$title = $issue->find('a.Link--primary', 0);
if ($title) {
$item['title'] = html_entity_decode($title->plaintext, ENT_QUOTES, 'UTF-8');
}
$title2 = $issue->find('h3 a', 0);
if ($title2) {
$item['title'] = html_entity_decode($title2->plaintext, ENT_QUOTES, 'UTF-8');
}
//$comment_count = 0;
//if ($span = $issue->find('a[aria-label*="comment"] span', 0)) {
// $comment_count = $span->plaintext;
//}
//$item['content'] .= "\n" . 'Comments: ' . $comment_count;
$item['uri'] = self::URI
. trim($issue->find('.js-navigation-open', 0)->getAttribute('href'), '/');
. trim($issue->find('a', 0)->getAttribute('href'), '/');
$this->items[] = $item;
}
break;

@@ -98,7 +98,7 @@ class GlassdoorBridge extends BridgeAbstract
private function collectBlogData($html, $limit)
{
$posts = $html->find('div.post')
or returnServerError('Unable to find blog posts!');
or throwServerException('Unable to find blog posts!');
foreach ($posts as $post) {
$item = [];
@@ -121,7 +121,7 @@ class GlassdoorBridge extends BridgeAbstract
private function collectReviewData($html, $limit)
{
$reviews = $html->find('#ReviewsFeed li[id^="empReview]')
or returnServerError('Unable to find reviews!');
or throwServerException('Unable to find reviews!');
foreach ($reviews as $review) {
$item = [];
@@ -163,7 +163,7 @@ class GlassdoorBridge extends BridgeAbstract
FILTER_FLAG_PATH_REQUIRED
)
) {
returnClientError('The specified URL is invalid!');
throwClientException('The specified URL is invalid!');
}
$uri = filter_var($uri, FILTER_SANITIZE_URL);
@@ -189,7 +189,7 @@ class GlassdoorBridge extends BridgeAbstract
];
if (!in_array($parts[1], $allowed_strings)) {
returnClientError('Please specify a URL pointing to the companies review page!');
throwClientException('Please specify a URL pointing to the companies review page!');
}
return $uri;

@@ -28,7 +28,7 @@ class GlowficBridge extends BridgeAbstract
public function collectData()
{
$url = $this->getAPIURI();
$metadata = get_headers($url . '/replies', true) or returnClientError('Post did not return reply headers.');
$metadata = get_headers($url . '/replies', true);
$metadata['Last-Page'] = ceil($metadata['Total'] / $metadata['Per-Page']);
if (
!is_null($this->getInput('start_page')) &&

@@ -2,7 +2,8 @@
class GoComicsBridge extends BridgeAbstract
{
const MAINTAINER = 'sky';
const MAINTAINER = 'TReKiE';
//const MAINTAINER = 'sky';
const NAME = 'GoComics Unofficial RSS';
const URI = 'https://www.gocomics.com/';
const CACHE_TIMEOUT = 21600; // 6h
@@ -13,32 +14,61 @@ class GoComicsBridge extends BridgeAbstract
'type' => 'text',
'exampleValue' => 'heartofthecity',
'required' => true
],
'date-in-title' => [
'name' => 'Add date and full name to each day\'s title',
'type' => 'checkbox',
'title' => 'Adds the date and the full name into the title of each day\'s comic',
],
'limit' => [
'name' => 'Limit',
'type' => 'number',
'title' => 'The number of recent comics to get',
'defaultValue' => 5
]
]];
public function collectData()
{
$html = getSimpleHTMLDOM($this->getURI());
$link = $this->getURI();
$landingpage = getSimpleHTMLDOM($link);
$element = $landingpage->find('div[data-post-url]', 0);
if ($element) {
$link = $element->getAttribute('data-post-url');
} else { // fallback for comics without data-post-url (assumes daily comic)
$nextcomiclink = $landingpage->find('a[class*="ComicNavigation_controls__button_previous__"]', 0)->href;
preg_match('/(\d{4}\/\d{2}\/\d{2})/', $nextcomiclink, $nclmatches);
if (!empty($nclmatches[1])) {
$nextdate = new DateTime($nclmatches[1]);
$nextdate = $nextdate->modify('+1 day')->format('Y/m/d');
$link = $link . '/' . $nextdate;
} else {
throw new \Exception('Could not find the first comic URL. Please create a new GitHub issue.');
}
}
//Get info from first page
$author = preg_replace('/By /', '', $html->find('.media-subheading', 0)->plaintext);
for ($i = 0; $i < $this->getInput('limit'); $i++) {
$html = getSimpleHTMLDOMCached($link, 86400);
$imagelink = $html->find('meta[property="og:image"]', 0)->content;
$parts = explode('/', $link);
$date = DateTime::createFromFormat('Y/m/d', implode('/', array_slice($parts, -3)));
$title = $html->find('meta[property="og:title"]', 0)->content;
preg_match('/by (.*?) for/', $title, $authormatches);
$author = $authormatches[1] ?? 'GoComics';
$link = self::URI . $html->find('.gc-deck--cta-0', 0)->find('a', 0)->href;
for ($i = 0; $i < 5; $i++) {
$item = [];
$page = getSimpleHTMLDOM($link);
$imagelink = $page->find('.comic.container', 0)->getAttribute('data-image');
$date = explode('/', $link);
$item['id'] = $imagelink;
$item['uri'] = $link;
$item['author'] = $author;
$item['title'] = 'GoComics ' . $this->getInput('comicname');
$item['timestamp'] = DateTime::createFromFormat('Ymd', $date[5] . $date[6] . $date[7])->getTimestamp();
if ($this->getInput('date-in-title') === true) {
$item['title'] = $title;
}
$item['timestamp'] = $date->setTime(0, 0, 0)->getTimestamp();
$item['content'] = '<img src="' . $imagelink . '" />';
$link = self::URI . $page->find('.js-previous-comic', 0)->href;
$link = rtrim(self::URI, '/') . $html->find('a[class*="ComicNavigation_controls__button_previous__"]', 0)->href;
$this->items[] = $item;
}
}

@@ -141,7 +141,7 @@ class GogsBridge extends BridgeAbstract
protected function collectCommitsData($html)
{
$commits = $html->find('#commits-table tbody tr')
or returnServerError('Unable to find commits');
or throwServerException('Unable to find commits');
foreach ($commits as $commit) {
$this->items[] = [
@@ -157,7 +157,7 @@ class GogsBridge extends BridgeAbstract
protected function collectIssuesData($html)
{
$issues = $html->find('.issue.list li')
or returnServerError('Unable to find issues');
or throwServerException('Unable to find issues');
foreach ($issues as $issue) {
$uri = $issue->find('a', 0)->href;
@@ -171,8 +171,7 @@ class GogsBridge extends BridgeAbstract
];
if ($this->getInput('include_description')) {
$issue_html = getSimpleHTMLDOMCached($uri, 3600)
or returnServerError('Unable to load issue description');
$issue_html = getSimpleHTMLDOMCached($uri, 3600);
$issue_html = defaultLinkTo($issue_html, $uri);
@@ -186,7 +185,7 @@ class GogsBridge extends BridgeAbstract
protected function collectSingleIssueData($html)
{
$comments = $html->find('.comments .comment')
or returnServerError('Unable to find comments');
or throwServerException('Unable to find comments');
foreach ($comments as $comment) {
$this->items[] = [
@@ -204,7 +203,7 @@ class GogsBridge extends BridgeAbstract
protected function collectReleasesData($html)
{
$releases = $html->find('#release-list li')
or returnServerError('Unable to find releases');
or throwServerException('Unable to find releases');
foreach ($releases as $release) {
$this->items[] = [

@@ -53,7 +53,7 @@ class GolemBridge extends FeedExpander
]
]];
const LIMIT = 5;
const HEADERS = ['Cookie: golem_consent20=simple|220101;'];
const HEADERS = ['Cookie: golem_consent20=simple|250101;'];
public function collectData()
{
@@ -82,7 +82,7 @@ class GolemBridge extends FeedExpander
// URI without RSS feed reference
$item['uri'] = $articlePage->find('head meta[name="twitter:url"]', 0)->content;
$categories = $articlePage->find('ul.tags__list li');
$categories = $articlePage->find('div.go-tag-list__tags a.go-tag');
foreach ($categories as $category) {
$trimmedcategories[] = trim(html_entity_decode($category->plaintext));
}
@@ -131,28 +131,35 @@ class GolemBridge extends FeedExpander
// delete known bad elements
foreach (
$article->find('div[id*="adtile"], #job-market, #seminars, iframe,
div.gbox_affiliate, div.toc') as $bad
$article->find('div[id*="adtile"], #job-market, #seminars, iframe, .go-article-header__title, .go-article-header__kicker,
.gbox_affiliate, div.toc, .go-button-bar, .go-alink-list, .go-teaser-block, .go-vh') as $bad
) {
$bad->remove();
}
// reload html, as remove() is buggy
$article = str_get_html($article->outertext);
// Add multipage headers, but only if they are different to the article header
$firstHeader = $page->find('.table-jtoc td', 0);
if (isset($firstHeader)) {
$firstHeader = html_entity_decode($firstHeader->title);
}
$multipageHeader = $article->find('header.paged-cluster-header h1', 0);
if (isset($multipageHeader) && $multipageHeader->plaintext !== $firstHeader) {
$item .= $multipageHeader;
}
$header = $article->find('header', 0);
foreach ($header->find('p, figure') as $element) {
$item .= $element;
}
$content = $article->find('div.formatted', 0);
// full image quality
foreach ($content->find('img[data-src-full][src*="."]') as $img) {
foreach ($article->find('img[data-src-full][src*="."]') as $img) {
$img->src = $img->getAttribute('data-src-full');
}
foreach ($content->find('p, h1, h2, h3, img[src*="."], iframe, video') as $element) {
foreach ($article->find('div.go-article-header__intro, p, h1, h2, h3, pre, img[src*="."], div[class*="golem_tablediv"], iframe, video') as $element) {
$item .= $element;
}

@@ -109,7 +109,7 @@ class GoogleScholarBridge extends BridgeAbstract
case 'user':
$userId = $this->getInput('userId');
$uri = self::URI . '/citations?hl=en&view_op=list_works&sortby=pubdate&user=' . $userId;
$html = getSimpleHTMLDOM($uri) or returnServerError('Could not fetch Google Scholar data.');
$html = getSimpleHTMLDOM($uri);
$publications = $html->find('tr[class="gsc_a_tr"]');
@@ -184,7 +184,7 @@ class GoogleScholarBridge extends BridgeAbstract
$uri .= $sortBy ? '&scisbd=1' : '';
$uri .= $numResults ? '&num=' . $numResults : '';
$html = getSimpleHTMLDOM($uri) or returnServerError('Could not fetch Google Scholar data.');
$html = getSimpleHTMLDOM($uri);
$publications = $html->find('div[class="gs_r gs_or gs_scl"]');

@@ -26,7 +26,7 @@ class GoogleSearchBridge extends BridgeAbstract
// todo: wrap this in try..catch because 429 too many requests happens a lot
$dom = getSimpleHTMLDOM($this->getURI(), ['Accept-language: en-US']);
if (!$dom) {
returnServerError('No results for this query.');
throwServerException('No results for this query.');
}
$result = $dom->find('div[id=res]', 0);

@@ -1,6 +1,6 @@
<?php
class GovTrackBridge extends BridgeAbstract
class GovTrackBridge extends FeedExpander
{
const NAME = 'GovTrack';
const MAINTAINER = 'phantop';
@@ -18,64 +18,50 @@ class GovTrackBridge extends BridgeAbstract
'Major Legislative Activity' => 'major-bill-activity',
'New Bills and Resolutions' => 'introduced-bills',
'New Laws' => 'enacted-bills',
'Posts from Us' => 'posts'
]
],
'limit' => self::LIMIT
'News from Us' => 'posts'
]
],
'limit' => self::LIMIT
]];
public function collectData()
{
$html = getSimpleHTMLDOMCached($this->getURI());
if ($this->getInput('feed') != 'posts') {
$this->collectEvent($html);
return;
}
$html = defaultLinkTo($html, parent::getURI());
$limit = $this->getInput('limit') ?? 10;
foreach ($html->find('section') as $element) {
if (--$limit == 0) {
break;
}
$info = explode(' ', $element->find('p', 0)->innertext);
$item = [
'categories' => [implode(' ', array_slice($info, 4))],
'timestamp' => strtotime(implode(' ', array_slice($info, 0, 3))),
'title' => $element->find('a', 0)->innertext,
'uri' => $element->find('a', 0)->href,
];
$html = getSimpleHTMLDOMCached($item['uri']);
$html = defaultLinkTo($html, parent::getURI());
$content = $html->find('#content .col-md', 1);
$info = explode(' by ', $content->find('p', 0)->plaintext);
$content->removeChild($content->firstChild());
$item['author'] = implode(' ', array_slice($info, 1));
$item['content'] = $content->innertext;
$this->items[] = $item;
$limit = $this->getInput('limit') ?? 15;
if ($this->getInput('feed') == 'posts') {
$this->collectExpandableDatas($this->getURI() . '.rss', $limit);
} else {
$this->collectEvent($this->getURI(), $limit);
}
}
private function collectEvent($html)
protected function parseItem(array $item)
{
$opt = [];
preg_match('/"csrfmiddlewaretoken" value="(.*)"/', $html, $opt);
$html = getSimpleHTMLDOMCached($item['uri']);
$html = defaultLinkTo($html, parent::getURI());
$item['categories'] = [$html->find('.breadcrumb-item', 1)->plaintext];
$content = $html->find('#content .col-md', 1);
$item['author'] = explode(' by ', $content->firstChild()->plaintext)[1];
$content->removeChild($content->firstChild());
$item['content'] = $content->innertext;
return $item;
}
private function collectEvent($uri, $limit)
{
$html = getSimpleHTMLDOMCached($uri);
preg_match('/"csrfmiddlewaretoken" value="(.*)"/', $html, $preg);
$header = [
"cookie: csrftoken=$opt[1]",
"x-csrftoken: $opt[1]",
"cookie: csrftoken=$preg[1]",
"x-csrftoken: $preg[1]",
'referer: ' . parent::getURI(),
];
preg_match('/var selected_feed = "(.*)";/', $html, $opt);
$post = [
'count' => $this->getInput('limit') ?? 20,
'feed' => $opt[1]
];
$opt = [ CURLOPT_POSTFIELDS => $post ];
preg_match('/var selected_feed = "(.*)";/', $html, $preg);
$opt = [ CURLOPT_POSTFIELDS => [
'count' => $limit,
'feed' => $preg[1]
]];
$html = getContents(parent::getURI() . 'events/_load_events', $header, $opt);
$html = defaultLinkTo(str_get_html($html), parent::getURI());
@@ -83,10 +69,10 @@ class GovTrackBridge extends BridgeAbstract
foreach ($html->find('.tracked_event') as $event) {
$bill = $event->find('.event_title a, .event_body a', 0);
$date = explode(' ', $event->find('.event_date', 0)->plaintext);
preg_match('/Sponsor:(.*)\n/', $event->plaintext, $opt);
preg_match('/Sponsor:(.*)\n/', $event->plaintext, $preg);
$item = [
'author' => $opt[1] ?? '',
'author' => $preg[1] ?? '',
'content' => $event->find('td', 1)->innertext,
'enclosures' => [$event->find('img', 0)->src],
'timestamp' => strtotime(implode(' ', array_slice($date, 2))),
@@ -115,10 +101,10 @@ class GovTrackBridge extends BridgeAbstract
public function getURI()
{
if ($this->getInput('feed') != 'posts') {
$url = parent::getURI() . 'events/' . $this->getInput('feed');
} else {
if ($this->getInput('feed') == 'posts') {
$url = parent::getURI() . $this->getInput('feed');
} else {
$url = parent::getURI() . 'events/' . $this->getInput('feed');
}
return $url;
}

Some files were not shown because too many files have changed in this diff.