mirror of https://github.com/RSS-Bridge/rss-bridge.git synced 2025-08-17 22:02:09 +02:00

Compare commits

63 Commits

Author SHA1 Message Date
Quentin Delmas
b397a42876 version: Bump to 2018-09-09 2018-09-09 21:00:10 +01:00
Corentin Garcia
111c45d010 [GithubSearchBridge] Fix content parsing, add tags if present (#803)
* [GithubSearchBridge] Fix content parsing, add tags if present

* [GithubSearchBridge] Add categories (from tags)
2018-09-09 20:30:29 +01:00
Corentin Garcia
55b36b0455 [DauphineLibereBridge] Use https, fix content parsing (fix issue #780) (#811) 2018-09-09 20:23:59 +01:00
ORelio
de8cee6a1c Catching up | [Main] Debug mode, parse utils, MIME | [Bridges] Add/Improve 20 bridges (#802)
* Debug mode improvements

 - Improve debug warning message
 - Restore error reporting in debug mode
 - Fix 'notice' messages for unset fields

* Add parsing utility functions

html.php
 - extractFromDelimiters
 - stripWithDelimiters
 - stripRecursiveHTMLSection
 - markdownToHtml (partial)

bridges
 - remove now-duplicate functions
 - call functions from html.php instead

* [Anidex] New bridge

Anime torrent tracker

* [Anime-Ultime] Restore thumbnail

* [CNET] Recreate bridge

Full rewrite as the previous one was broken

* [Dilbert] Minor URI fix

Use new self::URI property

* [EstCeQuonMetEnProd] Fix content extraction

Bridge was broken

* [Facebook] Fix "SpSonsSoriSsés" label

... which was taking space in item title

* [Futura-Sciences] Use HTTPS, More cleanup

Use HTTPS as FS now offers HTTPS
Clean additional useless HTML elements

* [GBATemp] Multiple fixes

- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached

* [JapanExpo] Fix bridge, HTTPS, thumbnails

- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures

* [LeMondeInformatique] Fix bridge, HTTPS

- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue

* [Nextgov] Fix content extraction

- Restore thumbnail and use small image
- Field extraction fixes

* [NextInpact] Add categories and filtering by type

- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, as many brief articles are published all at once

* [NyaaTorrents] New bridge

Anime torrent tracker

* [Releases3DS] Cache content, restore thumbnail

- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure

* [TheHackerNews] Fix bridge

 - Fix content extraction including article body
 - Restore thumbnail as enclosure

* [WeLiveSecurity] HTTPS, Fix content extraction

- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body

* [WordPress] Reduce timeout, more content selectors

- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup

* [YGGTorrent] Increase limit, use cache

- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached

* [ZDNet] Rewrite with FeedExpander

- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body

* [Main] Handle MIME type for enclosures

Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).

One can force enclosure type using #.ext anchor (hacky, needs improving)

* [FeedExpander] Improve field extraction

- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages

* [Pull] Coding style fixes for #802

* [Pull] Implementing changes for #802

 - Fix coding style issues with str append
 - Remove useless CACHE_TIMEOUT
 - Use count() instead of $limit
 - Use defaultLinkTo() + handle strings
 - Use http_build_query()
 - Fix missing </em>
 - Remove error_reporting(0)
 - warning CSS (@LogMANOriginal)
 - Fix typo in FeedExpander comment

* [Main] More documentation for markdownToHtml

See #802 for more details
2018-09-09 20:20:13 +01:00
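
The extension-based MIME detection for enclosures described in de8cee6a1c could look roughly like the sketch below. This is not the code from the commit: the function name `guessEnclosureMimeType`, the extension table, and the `application/octet-stream` fallback are assumptions made for illustration. Only the two ideas from the commit message are reproduced, namely guessing the type from the file extension and letting a trailing `#.ext` anchor override it.

```php
<?php
// Hypothetical helper, not part of RSS-Bridge: guess a MIME type for an
// enclosure URL without fetching its content.
function guessEnclosureMimeType(string $url): string {
    // Illustrative subset; a real table would be much longer.
    $types = array(
        'jpg' => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png' => 'image/png',
        'gif' => 'image/gif',
        'mp3' => 'audio/mpeg',
        'mp4' => 'video/mp4',
    );

    // A "#.ext" anchor takes precedence over the extension in the path.
    $fragment = parse_url($url, PHP_URL_FRAGMENT);
    if (is_string($fragment) && strpos($fragment, '.') === 0) {
        $ext = (string)substr($fragment, 1);
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        $ext = is_string($path) ? pathinfo($path, PATHINFO_EXTENSION) : '';
    }

    $ext = strtolower($ext);

    // Assumed fallback for unknown extensions.
    return isset($types[$ext]) ? $types[$ext] : 'application/octet-stream';
}
```

With this sketch, `guessEnclosureMimeType('https://example.com/image?id=42#.png')` resolves to `image/png` even though the path itself carries no extension, which is the situation the `#.ext` anchor is meant for.
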
Quentin Delmas
123fce4394 [ForGifsBridge] Fix permissions of ForGifsBridge 2018-09-09 17:34:36 +01:00
Quentin Delmas
a3f99c9c3f [GOGBridge] Added bridge for GOG.com 2018-09-09 17:32:36 +01:00
Eugene Molotov
bf30ad127c [FacebookBridge] Removes query string from post links
* [FacebookBridge] Removes query string from post links
2018-09-09 16:31:15 +01:00
logmanoriginal
37f84196b7 [GooglePlusPostBridge] Fix title is empty if content is too short
The bridge would generate empty titles if the content is longer than
50 characters, but doesn't have further spaces in it. With this commit
the title is correctly generated based on the contents, taking missing
spaces into account.

References #786
2018-09-08 17:07:57 +02:00
Corentin Garcia
44764f7182 [GrandComicsDatabaseBridge] Fix links in content (#804) 2018-09-08 11:12:27 +01:00
Antoine Cadoret
19f294d71d Add fields to leboncoin bridge (#783)
* [LeBonCoinBridge] Add fields to LeBonCoinBridge
2018-08-31 14:34:41 +01:00
Teromene
b0e33e4e01 Update LeBonCoinBridge to use the site's API (#795)
* Update LeBonCoinBridge to use the site's API
2018-08-28 14:20:02 +01:00
Eugene Molotov
558fa50a2a [core] Enabled debug mode before including core files (#790) 2018-08-25 20:02:47 +01:00
Eugene Molotov
ffb8b82c73 [FileCache] resetting cached file stat result to have correct getTime() result (#792)
* [FileCache] resetting cached file stat result to have correct getTime() result
2018-08-25 20:00:51 +01:00
Eugene Molotov
422c125d8e [core] Returning 304 http code when returning cached data (#793) 2018-08-25 20:00:38 +01:00
Quentin Delmas
059656c370 Fix phpcs. 2018-08-22 16:25:08 +01:00
Quentin Delmas
9fc1e97efe Avoid bot exclusion. 2018-08-22 16:21:39 +01:00
Quentin Delmas
be3620acb7 Add extension check for the "json" extension. 2018-08-22 16:21:20 +01:00
LogMANOriginal
16c0a61232 [README] Add a "Deploy to Cloud" button for Docker
Adds a button to deploy RSS-Bridge to the Docker Cloud as described here: https://docs.docker.com/docker-cloud/apps/deploy-to-cloud-btn/
2018-08-21 18:40:39 +02:00
Walter Barrett
704a87ad97 Icons: Allow Bridge-specified icons (#788) 2018-08-21 17:46:47 +02:00
sysadminstory
c4cccfe0f3 [LesJoiesDuCode] Switch to HTTPS and remove author (#787)
The website now offers HTTPS, so the bridge was switched to it.
The post author is not displayed anymore on the homepage, so it has been
removed.
2018-08-21 17:41:56 +02:00
Marcin C
d07deb0930 css: Modern look for RSS-Bridge (#781) 2018-08-21 17:22:46 +02:00
Piranhaplant
e7dab5d351 Fixed timestamp on Pixiv bridge (#785) 2018-08-18 16:54:24 -03:00
logmanoriginal
ad82d50bbd [CNETBridge] Remove bridge
CNET now provides public feeds at https://www.cnet.com/rss/

References #775
2018-08-12 11:02:44 +02:00
logmanoriginal
c305c1ded7 [BlaguesDeMerdeBridge] Adjust to layout changes
References #767
2018-08-10 21:08:47 +02:00
logmanoriginal
f14a5bd771 [CADBridge] Remove bridge
https://cad-comic.com/ now provides feeds at

- https://cad-comic.com/feed (rss)
- https://cad-comic.com/feed/atom (atom)

Thus multiple alternatives are available to choose from, making this
bridge obsolete:

- FilterBridge (using one of the feeds above)
- WordPressBridge (on the main site)
- One of the two available feeds

References #752
2018-08-10 19:53:32 +02:00
logmanoriginal
a20d5f9af0 tests: reuse RssBridge.php instead of implementing a custom solution 2018-08-10 15:33:32 +02:00
logmanoriginal
ee28b124e0 [DanbooruBridge] Fix bridge
This commit fixes an issue caused by self closing tags not supported
by simplehtmldom (<source>).

Adds a monkey patch to extend simplehtmldom with the ability to detect
that particular tag. Most of the code added is copied directly from
simplehtmldom (see vendor/simplehtmldom) with adjustments to account
for RSS-Bridge formatting.

Related to: https://sourceforge.net/p/simplehtmldom/bugs/83/

Notice: The tag itself is valid according to Mozilla:

The HTML <picture> element serves as a container for zero or more
<source> elements and one <img> element to provide versions of an
image for different display device scenarios. The browser will
consider each of the child <source> elements and select one
corresponding to the best match found; if no matches are found
among the <source> elements, the file specified by the <img>
element's src attribute is selected. The selected image is then
presented in the space occupied by the <img> element.

-- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture

References #753
2018-08-09 21:55:43 +02:00
LogMANOriginal
7dee3a175a [index] Add '?action=list' to list bridges (#493)
Adds a new action '?action=list' to return a list of bridges as JSON formatted text. Each bridge provides the following information:

- status (active/inactive)
- uri
- name
- parameters
- maintainer
- description

For inactive bridges only the status is returned.
Bridges that cannot be instantiated are considered inactive.
2018-08-09 19:14:10 +02:00
logmanoriginal
5fea9fc1f5 bridges: Fix bridges failing unit test 2018-08-09 17:04:16 +02:00
logmanoriginal
6bceb2b2db [tests] Add unit test for bridge implementation
Adds unit test for bridge implementations:

- Custom functions must be in protected or private scope
- getName() must return a valid string (non-empty)
- getURI() must return a valid URI
- Each bridge must define constants for NAME, URI, DESCRIPTION and
  MAINTAINER. CACHE_TIMEOUT and PARAMETERS are optional.

The unit test is written for PHPUnit 6.x and will automatically be
tested by Travis-CI for PHP 7.0 (see .travis.yml).

Remarks:

Unit tests for bridge data were scrapped in #378 for complexity
reasons (tests would have to be maintained for each bridge). This
unit test, however, is written for testing all bridges without
taking specific implementation details into account.
2018-08-09 17:04:16 +02:00
Eugene Molotov
df81fa62d1 [VkBridge] Video attachment fixes (#766)
* use defaultLinkTo
* remove duplicate video links
* remove line ending before "Reposted" label
* return newline before reposted string
* remove comments
* use video links that won't require login
* set title if video has no title
2018-08-09 17:02:36 +02:00
Eugene Molotov
f8c6400373 [HtmlFormat] Hide "Categories" label, if array of categories is empty (#765) 2018-08-09 16:46:53 +02:00
logmanoriginal
de7622ebbf version: Bump to 2018-08-07 2018-08-07 18:37:38 +02:00
logmanoriginal
09c9d015b4 [ForGifsBridge] Add new bridge 2018-08-04 23:42:58 +02:00
logmanoriginal
3a496e3b18 [FilterBridge] Add option to build title from content
Adds a new option '&title_from_content=on' to build the title for feed
items from the feeds content. The title is generated from the first
whitespace after 50 characters of the content or the entire content if
the total size is lower than 50 characters.

References #587
2018-08-04 20:46:59 +02:00
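
The title rule described in 3a496e3b18 (cut at the first whitespace after 50 characters, otherwise use the entire content) can be sketched as follows. The function name `titleFromContent` and the use of `strip_tags` are assumptions for illustration and are not taken from the bridge itself. The last branch reflects the case fixed in 37f84196b7, where long content without any further whitespace must not yield an empty title.

```php
<?php
// Hypothetical helper, not the FilterBridge implementation.
// Lengths and offsets are counted in bytes, not multibyte characters.
function titleFromContent(string $content, int $minLength = 50): string {
    $text = trim(strip_tags($content));

    // Short content is used as the title unchanged.
    if (strlen($text) <= $minLength) {
        return $text;
    }

    // Look for the first whitespace at or after offset $minLength.
    if (preg_match('/\s/', $text, $match, PREG_OFFSET_CAPTURE, $minLength) === 1) {
        return substr($text, 0, $match[0][1]);
    }

    // No further whitespace: fall back to the whole content.
    return $text;
}
```
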
Eugene Molotov
df58f5bbdb [core] Add urljoin (#756)
Adds php-urljoin from https://github.com/fluffy-critter/php-urljoin to replace the custom implementation of 'defaultLinkTo'
2018-08-02 06:31:56 +02:00
logmanoriginal
9d0452d11b [.travis] Use composer for HHVM
This fixes the HHVM build failing because pear doesn't exist in HHVM.
2018-08-01 19:37:10 +02:00
sublimz
f92ac49947 [LeBonCoinBridge] Add cities support (#751) 2018-08-01 17:25:18 +02:00
Benasse
a574fa15ac [YGGTorrentBridge] Order search result by publish date (#762) 2018-07-31 21:46:10 +02:00
Nemo
8f9a385b4d [AmazonPriceTrackerBridge] Improve Amazon scraper logic (#761)
- Now works on all websites, and even with products
  with multiple prices
- Closes #750
2018-07-31 21:44:37 +02:00
logmanoriginal
53bdfa3bf0 [GooglePlusPostBridge] Skip posts without message 2018-07-31 19:15:09 +02:00
logmanoriginal
53278b2eed [GooglePlusPostBridge] Add option to include image in content
References #600
2018-07-31 19:09:12 +02:00
logmanoriginal
5f3c55b808 [GooglePlusPostBridge] General cleanup 2018-07-31 18:55:35 +02:00
logmanoriginal
fb79a67370 [GooglePlusPostBridge] Normalize static::URI usage
This commit fixes a few things related to static::URI

1) Remove trailing slash from the URI to simplify using 'defaultLinkTo'
2) Use static::URI instead of self::URI for consistency
3) Remove custom implementation of 'defaultLinkTo'
2018-07-31 18:29:14 +02:00
logmanoriginal
3c4e12ceba [GooglePlusPostBridge] Add images to enclosures
Images are collected for each post and added to enclosures. Images or
animations from lh3.googleusercontent.com are specifically handled in
order to return the animated version of the gif and the original sized
image (this is normally taken care of by JS in the browser).
2018-07-31 18:18:22 +02:00
logmanoriginal
0d1923c52f [GitHubGistBridge] Add new bridge
Adds a new bridge for https://gist.github.com

The bridge generates feeds for comments on a particular gist based on
the gist ID or full URI. For better readability the general behavior
of code sections is manually restored with the original CSS styles
from GitHub.
2018-07-29 16:31:47 +02:00
logmanoriginal
ce896b4247 [SkimfeedBridge] Add new bridge
New bridge for Skimfeed: https://skimfeed.com

Generates feeds for all features of Skimfeed:

- News (the ones displayed on the front page)
- Hot topics ("What's Hot" section on the front page)
- Tech news (preconfigured feeds in the menu bar)
- Custom feeds (using the configuration system of Skimfeed), see
https://skimfeed.com/custom.php

The number of items returned by the bridge can be limited for all
categories ('&limit=...'). This parameter is optional, all categories
are unlimited by default!

Authors are added with HTML anchors in order to allow quick navigation
to source channels.

The bridge ships with developer tools to auto-generate lists in the
future (especially useful for 'Tech news'!)

References #748
2018-07-27 23:18:32 +02:00
sysadminstory
a4b2d88dbe [DealabsBridge] Follow website change (#758) 2018-07-25 20:02:31 +02:00
logmanoriginal
65ec04ea98 [contents] Remove superfluous debug log from getContents
References #757
2018-07-25 19:56:46 +02:00
logmanoriginal
afb4de318b [FlickrBridge] Fix missing scheme for image URLs
References #754
2018-07-23 20:14:46 +02:00
Eugene Molotov
43bb17f995 [VkBridge] Converting hashtags to categories (#755)
* [VkBridge] Converting hashtags to categories
2018-07-22 16:43:00 +02:00
logmanoriginal
bae7a5879f [FlickrBridge] Fixed broken bridge
Follows changes in the JSON data and selects images for the
content (320x240 or bigger) and enclosure (largest version). All of
the data is now extracted from the JSON data instead of parsing the
DOM.

References #754
2018-07-22 14:06:04 +02:00
LogMANOriginal
bd760cbcee [README] Add docker build status 2018-07-21 21:59:48 +02:00
LogMANOriginal
cd20b4476f [README] Add label for latest release 2018-07-21 21:54:46 +02:00
LogMANOriginal
d83f2f285b Separate index and bridge card generating code into separate classes (#734)
[html] Generate index and bridge cards using separate classes

Move HTML generating code from 'index.php' to 'Index.php', separating components into static functions.

Move HTML generation code for bridge cards from 'html.php' to 'BridgeCard.php', separating components into static functions.
2018-07-21 18:15:07 +02:00
logmanoriginal
15e6d77569 [FierPandaBridge] Fix bridge
This bridge now returns all articles from the front page, following
layout changes in the past.

References #679
2018-07-21 18:07:03 +02:00
logmanoriginal
f97d2ef254 [Torrent9Bridge] Remove bridge
The site moved from www.torrent9.pe to www.t9.pe and is now protected
by Cloudflare challenges, making it inaccessible to RSS-Bridge.
2018-07-21 17:45:22 +02:00
logmanoriginal
91ae2a23d7 [CpasbienBridge] Remove bridge
Removing this bridge for two reasons:

1) The service moved from www.cpasbien.cm to www.torrents9.blue,
changing the layout in the process (incompatible).

2) The new site is permanently protected by Cloudflare IUAM, making
it inaccessible by RSS-Bridge.

While it would certainly be possible to rewrite the bridge to work
with the new layout, the site is still inaccessible.

References #605
2018-07-21 17:43:29 +02:00
logmanoriginal
066ef1d7db [contents] Add Cloudflare challenge detection
Adds detection for servers responding with Cloudflare challenges,
throwing a server error if detected:

"The server responded with a Cloudflare challenge, which is not
supported by RSS-Bridge! If this error persists longer than a week,
please consider opening an issue on GitHub!"

This is intended to help maintainers identify broken bridges
for sites with Cloudflare enabled permanently. It doesn't circumvent
the protection in any form or shape!

The Cloudflare challenge is detected by analyzing the last response
header received from the server. If the HTTP Code is not 200 (OK)
and the server name contains 'cloudflare' ('Server: cloudflare'),
RSS-Bridge assumes the server responded with a challenge.

The header parsing is based on https://stackoverflow.com/a/18682872
2018-07-21 17:43:29 +02:00
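
The detection rule from 066ef1d7db (status code other than 200 and a 'Server' header containing 'cloudflare') can be sketched as below. The function names `parseResponseHeader` and `isCloudflareChallenge` and the array-of-lines input format are assumptions for illustration, and the header parsing is a simplified stand-in for the approach linked in the commit message.

```php
<?php
// Hypothetical helper: turn raw header lines into a map with lower-cased
// header names, plus 'http_code' taken from the status line.
function parseResponseHeader(array $headerLines): array {
    $headers = array();
    foreach ($headerLines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        } elseif (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1) {
            $headers['http_code'] = (int)$m[1];
        }
    }
    return $headers;
}

// Hypothetical check following the rule stated in the commit message.
// A response without a parsable status line counts as "not 200" here.
function isCloudflareChallenge(array $headerLines): bool {
    $headers = parseResponseHeader($headerLines);
    $code = isset($headers['http_code']) ? $headers['http_code'] : 0;
    $server = isset($headers['server']) ? $headers['server'] : '';

    return $code !== 200 && stripos($server, 'cloudflare') !== false;
}
```

A caller would pass only the last response header block, since redirects can produce several blocks for one request.
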
LogMANOriginal
4facbf32e3 [InstructableBridge] Add new bridge (#724)
This commit adds a new bridge for http://www.instructables.com. This bridge
currently supports fetching content by category (all categories available 200+),
using available filters (featured, recent, popular, views, contest winners).
2018-07-21 15:25:13 +02:00
logmanoriginal
6bd76af326 [YoutubeBridge] Add duration limits for all modes
Adds duration limits (minimum duration, maximum duration) for all
modes (user/id/playlist/search). Duration limits are optional, so
existing subscriptions don't break.

The limits are specified by two separate parameters, each of which
is optional:

- `&duration_min=` (minimum duration in minutes, default: -1)
- `&duration_max=` (maximum duration in minutes, default: INF)

If duration limits are specified in either user, id or playlist mode,
the bridge defaults to fetching data from HTML instead of XML feeds,
which requires more bandwidth and takes longer, because each video is
loaded individually!

References #670
2018-07-21 14:33:07 +02:00
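
The optional duration limits from 6bd76af326 amount to a range check with open defaults. The sketch below assumes the video duration is available in seconds and that both bounds are inclusive; neither detail is stated in the commit message, and the function name `isWithinDurationLimits` is hypothetical. The defaults mirror the documented ones (-1 and INF, both in minutes), so an unfiltered subscription keeps every video.

```php
<?php
// Hypothetical helper, not the YoutubeBridge implementation.
// $minMinutes and $maxMinutes correspond to duration_min and duration_max.
function isWithinDurationLimits(
    int $durationSeconds,
    float $minMinutes = -1.0,
    float $maxMinutes = INF
): bool {
    $minSeconds = $minMinutes * 60;
    $maxSeconds = $maxMinutes * 60; // INF stays INF

    return $durationSeconds >= $minSeconds && $durationSeconds <= $maxSeconds;
}
```
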
logmanoriginal
caa622ffec [search] Support searching by URI
Adds matching for URIs to the search bar, using the format
<scheme>://<host>/<path>

Searching by URI scheme is also supported:

"http://"  (returns all bridges with 'http'  scheme)
"https://" (returns all bridges with 'https' scheme)

The following examples are equivalent and will return both of the
Facebook bridges (FacebookBridge and FB2Bridge):

"https://www.facebook.com/facebook"
"https://www.facebook.com/facebook?..."
"https://www.facebook.com"
"http://www.facebook.com"
"https://facebook.com"
"http://facebook.com"
"facebook.com"
"facebook"

Notice: When the URI scheme is omitted, the search algorithm falls back
to regex matching. Searching for "www.facebook.com" doesn't work, as it
is missing the scheme and doesn't match via regex!

Omitting the 'www.', however, does work. This was a design decision because
some bridges specify their URI with 'www.' and others without.

A search term can still be specified in the browser URL using parameter
'q' => '?q=searchterm'.

References #743
2018-07-20 22:44:13 +02:00
teromene
c4d489f018 Add URI to ElloBridge elements. 2018-07-19 17:07:54 +02:00
72 changed files with 4640 additions and 1713 deletions

@@ -3,12 +3,26 @@ sudo: false
language: php
install:
- pear channel-update pear.php.net
- pear install PHP_CodeSniffer
- if [[ $TRAVIS_PHP_VERSION == "hhvm" ]]; then
composer global require squizlabs/PHP_CodeSniffer;
else
pear channel-update pear.php.net;
pear install PHP_CodeSniffer;
fi
- if [[ $TRAVIS_PHP_VERSION == "7.0" ]]; then
composer global require phpunit/phpunit ^6;
fi
script:
- phpenv rehash
- phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p
- if [[ $TRAVIS_PHP_VERSION == "hhvm" ]]; then
/home/travis/.composer/vendor/bin/phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p;
else
phpcs . --standard=phpcs.xml --warning-severity=0 --extensions=php -p;
fi
- if [[ $TRAVIS_PHP_VERSION == "7.0" ]]; then
phpunit --configuration=phpunit.xml --include-path=lib/;
fi
matrix:
fast_finish: true

@@ -1,6 +1,6 @@
rss-bridge
===
[![LICENSE](https://img.shields.io/badge/license-UNLICENSE-blue.svg)](UNLICENSE) [![Build Status](https://travis-ci.org/RSS-Bridge/rss-bridge.svg?branch=master)](https://travis-ci.org/RSS-Bridge/rss-bridge)
[![LICENSE](https://img.shields.io/badge/license-UNLICENSE-blue.svg)](UNLICENSE) [![GitHub release](https://img.shields.io/github/release/rss-bridge/rss-bridge.svg)](https://github.com/rss-bridge/rss-bridge/releases/latest) [![Build Status](https://travis-ci.org/RSS-Bridge/rss-bridge.svg?branch=master)](https://travis-ci.org/RSS-Bridge/rss-bridge) [![Docker Build Status](https://img.shields.io/docker/build/rssbridge/rss-bridge.svg)](https://hub.docker.com/r/rssbridge/rss-bridge/)
rss-bridge is a PHP project capable of generating ATOM feeds for websites which don't have one.
@@ -68,6 +68,7 @@ New bridges are disabled by default, so make sure to check regularly what's new
Deploy
===
[![Deploy on Scalingo](https://cdn.scalingo.com/deploy/button.svg)](https://my.scalingo.com/deploy?source=https://github.com/sebsauvage/rss-bridge)
[![Deploy to Docker Cloud](https://files.cloud.docker.com/images/deploy-to-dockercloud.svg)](https://cloud.docker.com/stack/deploy/?repo=https://github.com/rss-bridge/rss-bridge)
Authors
===

@@ -92,6 +92,14 @@ class AmazonPriceTrackerBridge extends BridgeAbstract {
}
}
private function parseDynamicImage($attribute) {
$json = json_decode(html_entity_decode($attribute), true);
if ($json and count($json) > 0) {
return array_keys($json)[0];
}
}
/**
* Returns a generated image tag for the product
*/
@@ -99,11 +107,15 @@ class AmazonPriceTrackerBridge extends BridgeAbstract {
$imageSrc = $html->find('#main-image-container img', 0);
if ($imageSrc) {
$imageSrc = $imageSrc ? $imageSrc->getAttribute('data-old-hires') : '';
return <<<EOT
<img width="300" style="max-width:300;max-height:300" src="$imageSrc" alt="{$this->title}" />
EOT;
$hiresImage = $imageSrc->getAttribute('data-old-hires');
$dynamicImageAttribute = $imageSrc->getAttribute('data-a-dynamic-image');
$image = $hiresImage ?: $this->parseDynamicImage($dynamicImageAttribute);
}
$image = $image ?: 'https://placekitten.com/200/300';
return <<<EOT
<img width="300" style="max-width:300;max-height:300" src="$image" alt="{$this->title}" />
EOT;
}
/**
@@ -116,6 +128,39 @@ EOT;
return getSimpleHTMLDOM($uri) ?: returnServerError('Could not request Amazon.');
}
private function scrapePriceFromMetrics($html) {
$asinData = $html->find('#cerberus-data-metrics', 0);
// <div id="cerberus-data-metrics" style="display: none;"
// data-asin="B00WTHJ5SU" data-asin-price="14.99" data-asin-shipping="0"
// data-asin-currency-code="USD" data-substitute-count="-1" ... />
if ($asinData) {
return [
'price' => $asinData->getAttribute('data-asin-price'),
'currency' => $asinData->getAttribute('data-asin-currency-code'),
'shipping' => $asinData->getAttribute('data-asin-shipping')
];
}
return false;
}
private function scrapePriceGeneric($html) {
$priceDiv = $html->find('span.offer-price', 0) ?: $html->find('.a-color-price', 0);
preg_match('/^\s*([A-Z]{3}|£|\$)\s?([\d.,]+)\s*$/', $priceDiv->plaintext, $matches);
if (count($matches) === 3) {
return [
'price' => $matches[2],
'currency' => $matches[1],
'shipping' => '0'
];
}
return false;
}
/**
* Scrape method for Amazon product page
* @return [type] [description]
@@ -125,23 +170,16 @@ EOT;
$this->title = $this->getTitle($html);
$imageTag = $this->getImage($html);
$asinData = $html->find('#cerberus-data-metrics', 0);
// <div id="cerberus-data-metrics" style="display: none;"
// data-asin="B00WTHJ5SU" data-asin-price="14.99" data-asin-shipping="0"
// data-asin-currency-code="USD" data-substitute-count="-1" ... />
$currency = $asinData->getAttribute('data-asin-currency-code');
$shipping = $asinData->getAttribute('data-asin-shipping');
$price = $asinData->getAttribute('data-asin-price');
$data = $this->scrapePriceFromMetrics($html) ?: $this->scrapePriceGeneric($html);
$item = array(
'title' => $this->title,
'uri' => $this->getURI(),
'content' => "$imageTag<br/>Price: $price $currency",
'content' => "$imageTag<br/>Price: {$data['price']} {$data['currency']}",
);
if ($shipping !== '0') {
$item['content'] .= "<br>Shipping: $shipping $currency</br>";
if ($data['shipping'] !== '0') {
$item['content'] .= "<br>Shipping: {$data['shipping']} {$data['currency']}</br>";
}
$this->items[] = $item;

bridges/AnidexBridge.php Normal file

@@ -0,0 +1,207 @@
<?php
class AnidexBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'Anidex';
const URI = 'https://anidex.info/';
const DESCRIPTION = 'Returns the newest torrents, with optional search criteria.';
const PARAMETERS = array(
array(
'id' => array(
'name' => 'Category',
'type' => 'list',
'values' => array(
'All categories' => '0',
'Anime' => '1,2,3',
'Anime - Sub' => '1',
'Anime - Raw' => '2',
'Anime - Dub' => '3',
'Live Action' => '4,5',
'Live Action - Sub' => '4',
'Live Action - Raw' => '5',
'Light Novel' => '6',
'Manga' => '7,8',
'Manga - Translated' => '7',
'Manga - Raw' => '8',
'Music' => '9,10,11',
'Music - Lossy' => '9',
'Music - Lossless' => '10',
'Music - Video' => '11',
'Games' => '12',
'Applications' => '13',
'Pictures' => '14',
'Adult Video' => '15',
'Other' => '16'
)
),
'lang_id' => array(
'name' => 'Language',
'type' => 'list',
'values' => array(
'All languages' => '0',
'English' => '1',
'Japanese' => '2',
'Polish' => '3',
'Serbo-Croatian' => '4',
'Dutch' => '5',
'Italian' => '6',
'Russian' => '7',
'German' => '8',
'Hungarian' => '9',
'French' => '10',
'Finnish' => '11',
'Vietnamese' => '12',
'Greek' => '13',
'Bulgarian' => '14',
'Spanish (Spain)' => '15',
'Portuguese (Brazil)' => '16',
'Portuguese (Portugal)' => '17',
'Swedish' => '18',
'Arabic' => '19',
'Danish' => '20',
'Chinese (Simplified)' => '21',
'Bengali' => '22',
'Romanian' => '23',
'Czech' => '24',
'Mongolian' => '25',
'Turkish' => '26',
'Indonesian' => '27',
'Korean' => '28',
'Spanish (LATAM)' => '29',
'Persian' => '30',
'Malaysian' => '31'
)
),
'group_id' => array(
'name' => 'Group ID',
'type' => 'number'
),
'r' => array(
'name' => 'Hide Remakes',
'type' => 'checkbox'
),
'b' => array(
'name' => 'Only Batches',
'type' => 'checkbox'
),
'a' => array(
'name' => 'Only Authorized',
'type' => 'checkbox'
),
'q' => array(
'name' => 'Keyword',
'description' => 'Keyword(s)',
'type' => 'text'
),
'h' => array(
'name' => 'Adult content',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide +18' => '1',
'Only +18' => '2'
)
)
)
);
public function collectData() {
// Build Search URL from user-provided parameters
$search_url = self::URI . '?s=upload_timestamp&o=desc';
foreach (array('id', 'lang_id', 'group_id') as $param_name) {
$param = $this->getInput($param_name);
if (!empty($param) && intval($param) != 0 && ctype_digit(str_replace(',', '', $param))) {
$search_url .= '&' . $param_name . '=' . $param;
}
}
foreach (array('r', 'b', 'a') as $param_name) {
$param = $this->getInput($param_name);
if (!empty($param) && boolval($param)) {
$search_url .= '&' . $param_name . '=1';
}
}
$query = $this->getInput('q');
if (!empty($query)) {
$search_url .= '&q=' . urlencode($query);
}
$opt = array();
$h = $this->getInput('h');
if (!empty($h) && intval($h) != 0 && ctype_digit($h)) {
$opt[CURLOPT_COOKIE] = 'anidex_h_toggle=' . $h;
}
// Retrieve torrent listing from search results, which does not contain torrent description
$html = getSimpleHTMLDOM($search_url, array(), $opt)
or returnServerError('Could not request Anidex: ' . $search_url);
$links = $html->find('a');
$results = array();
foreach ($links as $link)
if (strpos($link->href, '/torrent/') === 0 && !in_array($link->href, $results))
$results[] = $link->href;
if (empty($results) && empty($this->getInput('q')))
returnServerError('No results from Anidex: '.$search_url);
//Process each item individually
foreach ($results as $element) {
//Limit total amount of requests
if(count($this->items) >= 20) {
break;
}
$torrent_id = str_replace('/torrent/', '', $element);
//Ignore entries without valid torrent ID
if ($torrent_id != 0 && ctype_digit($torrent_id)) {
//Retrieve data for this torrent ID
$item_uri = self::URI . 'torrent/'.$torrent_id;
//Retrieve full description from torrent page
if ($item_html = getSimpleHTMLDOMCached($item_uri)) {
//Retrieve data from page contents
$item_title = str_replace(' (Torrent) - AniDex ', '', $item_html->find('title', 0)->plaintext);
$item_desc = $item_html->find('div.panel-body', 0);
$item_author = trim($item_html->find('span.fa-user', 0)->parent()->plaintext);
$item_date = strtotime(trim($item_html->find('span.fa-clock', 0)->parent()->plaintext));
$item_image = $this->getURI() . 'images/user_logos/default.png';
//Check for description-less torrent and optionally extract image
$desc_title_found = false;
foreach ($item_html->find('h3.panel-title') as $h3) {
if (strpos($h3, 'Description') !== false) {
$desc_title_found = true;
break;
}
}
if ($desc_title_found) {
//Retrieve image for thumbnail or generic logo fallback
foreach ($item_desc->find('img') as $img) {
if (strpos($img->src, 'prez') === false) {
$item_image = $img->src;
break;
}
}
$item_desc = trim($item_desc->innertext);
} else {
$item_desc = '<em>No description.</em>';
}
//Build and add final item
$item = array();
$item['uri'] = $item_uri;
$item['title'] = $item_title;
$item['author'] = $item_author;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_desc;
$this->items[] = $item;
}
}
$element = null;
}
$results = null;
}
}

@@ -5,7 +5,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
const NAME = 'Anime-Ultime';
const URI = 'http://www.anime-ultime.net/';
const CACHE_TIMEOUT = 10800; // 3h
const DESCRIPTION = 'Returns the 10 newest releases posted on Anime-Ultime';
const DESCRIPTION = 'Returns the newest releases posted on Anime-Ultime.';
const PARAMETERS = array( array(
'type' => array(
'name' => 'Type',
@@ -65,6 +65,13 @@ class AnimeUltimeBridge extends BridgeAbstract {
$item_link_element = $release->find('td', 0)->find('a', 0);
$item_uri = self::URI . $item_link_element->href;
$item_name = html_entity_decode($item_link_element->plaintext);
$item_image = self::URI . substr(
$item_link_element->onmouseover,
37,
strpos($item_link_element->onmouseover, ' ', 37) - 37
);
$item_episode = html_entity_decode(
str_pad(
$release->find('td', 1)->plaintext,
@@ -79,8 +86,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
if(!empty($item_uri)) {
// Retrieve description from description page and
// convert relative image src info absolute image src
// Retrieve description from description page
$html_item = getContents($item_uri)
or returnServerError('Could not request Anime-Ultime: ' . $item_uri);
$item_description = substr(
@@ -91,10 +97,9 @@ class AnimeUltimeBridge extends BridgeAbstract {
0,
strpos($item_description, '<div id="table">')
);
$item_description = str_replace(
'src="images', 'src="' . self::URI . 'images',
$item_description
);
// Convert relative image src into absolute image src, remove line breaks
$item_description = defaultLinkTo($item_description, self::URI);
$item_description = str_replace("\r", '', $item_description);
$item_description = str_replace("\n", '', $item_description);
$item_description = utf8_encode($item_description);
@@ -105,6 +110,7 @@ class AnimeUltimeBridge extends BridgeAbstract {
$item['title'] = $item_name . ' ' . $item_type . ' ' . $item_episode;
$item['author'] = $item_fansub;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_description;
$this->items[] = $item;
$processedOK++;

@@ -1,31 +1,43 @@
<?php
class BlaguesDeMerdeBridge extends BridgeAbstract {
const MAINTAINER = 'superbaillot.net';
const MAINTAINER = 'superbaillot.net, logmanoriginal';
const NAME = 'Blagues De Merde';
const URI = 'http://www.blaguesdemerde.fr/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Blagues De Merde';
public function collectData(){
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request BDM.');
foreach($html->find('article.joke_contener') as $element) {
$item = array();
$temp = $element->find('a');
foreach($html->find('div.blague') as $element) {
$item = array();
$item['uri'] = static::URI . '#' . $element->id;
$item['author'] = $element->find('div[class="blague-footer"] p strong', 0)->plaintext;
// Let the title be everything up to the first <br>
$item['title'] = trim(explode("\n", $element->find('div.text', 0)->plaintext)[0]);
$item['content'] = strip_tags($element->find('div.text', 0));
// timestamp is part of:
// <p>Par <strong>{author}</strong> le {date} dans <strong>{category}</strong></p>
preg_match(
'/.+le(.+)dans.*/',
$element->find('div[class="blague-footer"]', 0)->plaintext,
$matches
);
$item['timestamp'] = strtotime($matches[1]);
$this->items[] = $item;
if(isset($temp[2])) {
$item['content'] = trim($element->find('div.joke_text_contener', 0)->innertext);
$uri = $temp[2]->href;
$item['uri'] = $uri;
$item['title'] = substr($uri, (strrpos($uri, '/') + 1));
$date = $element->find('li.bdm_date', 0)->innertext;
$time = mktime(0, 0, 0, substr($date, 3, 2), substr($date, 0, 2), substr($date, 6, 4));
$item['timestamp'] = $time;
$item['author'] = $element->find('li.bdm_pseudo', 0)->innertext;
$this->items[] = $item;
}
}
}
}


@@ -1,45 +0,0 @@
<?php
class CADBridge extends FeedExpander {
const MAINTAINER = 'nyutag';
const NAME = 'CAD Bridge';
const URI = 'http://www.cad-comic.com/';
const CACHE_TIMEOUT = 7200; //2h
const DESCRIPTION = 'Returns the newest articles.';
public function collectData(){
$this->collectExpandableDatas('http://cdn2.cad-comic.com/rss.xml', 10);
}
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = $this->extractCADContent($item['uri']);
return $item;
}
private function extractCADContent($url) {
$html3 = getSimpleHTMLDOMCached($url);
// The request might fail due to missing https support or wrong URL
if($html3 == false)
return 'Daily comic not released yet';
$htmlpart = explode('/', $url);
switch ($htmlpart[3]) {
case 'cad':
preg_match_all('/http:\/\/cdn2\.cad-comic\.com\/comics\/cad-\S*png/', $html3, $url2);
break;
case 'sillies':
preg_match_all('/http:\/\/cdn2\.cad-comic\.com\/comics\/sillies-\S*gif/', $html3, $url2);
break;
default:
return 'Daily comic not released yet';
}
$img = implode($url2[0]);
$html3->clear();
unset($html3);
if ($img == '')
return 'Daily comic not released yet';
return '<img src="' . $img . '"/>';
}
}


@@ -3,91 +3,107 @@ class CNETBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'CNET News';
const URI = 'http://www.cnet.com/';
const CACHE_TIMEOUT = 1800; // 30min
const DESCRIPTION = 'Returns the newest articles. <br /> You may specify a
topic found in some section URLs, else all topics are selected.';
const PARAMETERS = array( array(
'topic' => array(
'name' => 'Topic name'
const URI = 'https://www.cnet.com/';
const CACHE_TIMEOUT = 3600; // 1h
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array(
array(
'topic' => array(
'name' => 'Topic',
'type' => 'list',
'values' => array(
'All articles' => '',
'Apple' => 'apple',
'Google' => 'google',
'Microsoft' => 'tags-microsoft',
'Computers' => 'topics-computers',
'Mobile' => 'topics-mobile',
'Sci-Tech' => 'topics-sci-tech',
'Security' => 'topics-security',
'Internet' => 'topics-internet',
'Tech Industry' => 'topics-tech-industry'
)
)
)
));
);
public function collectData(){
private function cleanArticle($article_html) {
$offset_p = strpos($article_html, '<p>');
$offset_figure = strpos($article_html, '<figure');
$offset = ($offset_figure < $offset_p ? $offset_figure : $offset_p);
$article_html = substr($article_html, $offset);
$article_html = str_replace('href="/', 'href="' . self::URI, $article_html);
$article_html = str_replace(' height="0"', '', $article_html);
$article_html = str_replace('<noscript>', '', $article_html);
$article_html = str_replace('</noscript>', '', $article_html);
$article_html = stripWithDelimiters($article_html, '<a class="clickToEnlarge', '</a>');
$article_html = stripWithDelimiters($article_html, '<span class="nowPlaying', '</span>');
$article_html = stripWithDelimiters($article_html, '<span class="duration', '</span>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = stripWithDelimiters($article_html, '<svg', '</svg>');
return $article_html;
}
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
public function collectData() {
// Retrieve and check user input
$topic = str_replace('-', '/', $this->getInput('topic'));
if (!empty($topic) && (substr_count($topic, '/') > 1 || !ctype_alpha(str_replace('/', '', $topic))))
returnClientError('Invalid topic: ' . $topic);
// Retrieve webpage
$pageUrl = self::URI . (empty($topic) ? 'news/' : $topic.'/');
$html = getSimpleHTMLDOM($pageUrl)
or returnServerError('Could not request CNET: '.$pageUrl);
// Process articles
foreach($html->find('div.assetBody, div.riverPost') as $element) {
if(count($this->items) >= 10) {
break;
}
return false;
}
$article_title = trim($element->find('h2, h3', 0)->plaintext);
$article_uri = self::URI . substr($element->find('a', 0)->href, 1);
$article_thumbnail = $element->parent()->find('img[src]', 0)->src;
$article_timestamp = strtotime($element->find('time.assetTime, div.timeAgo', 0)->plaintext);
$article_author = trim($element->find('a[rel=author], a.name', 0)->plaintext);
$article_content = '<p><b>' . trim($element->find('p.dek', 0)->plaintext) . '</b></p>';
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
if (is_null($article_thumbnail))
$article_thumbnail = extractFromDelimiters($element->innertext, '<img src="', '"');
return $string;
}
if (!empty($article_title) && !empty($article_uri) && strpos($article_uri, self::URI . 'news/') !== false) {
function cleanArticle($article_html){
$article_html = '<p>' . substr($article_html, strpos($article_html, '<p>') + 3);
$article_html = stripWithDelimiters($article_html, '<span class="credit">', '</span>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = stripWithDelimiters($article_html, '<div class="shortcode related-links', '</div>');
$article_html = stripWithDelimiters($article_html, '<a class="clickToEnlarge">', '</a>');
return $article_html;
}
$article_html = getSimpleHTMLDOMCached($article_uri) or $article_html = null;
$pageUrl = self::URI . (empty($this->getInput('topic')) ? '' : 'topics/' . $this->getInput('topic') . '/');
$html = getSimpleHTMLDOM($pageUrl) or returnServerError('Could not request CNET: ' . $pageUrl);
$limit = 0;
if (!is_null($article_html)) {
foreach($html->find('div.assetBody') as $element) {
if($limit < 8) {
$article_title = trim($element->find('h2', 0)->plaintext);
$article_uri = self::URI . ($element->find('a', 0)->href);
$article_timestamp = strtotime($element->find('time.assetTime', 0)->plaintext);
$article_author = trim($element->find('a[rel=author]', 0)->plaintext);
if (empty($article_thumbnail))
$article_thumbnail = $article_html->find('div.originalImage', 0);
if (empty($article_thumbnail))
$article_thumbnail = $article_html->find('span.imageContainer', 0);
if (is_object($article_thumbnail))
$article_thumbnail = $article_thumbnail->find('img', 0)->src;
if(!empty($article_title) && !empty($article_uri) && strpos($article_uri, '/news/') !== false) {
$article_html = getSimpleHTMLDOM($article_uri)
or returnServerError('Could not request CNET: ' . $article_uri);
$article_content = trim(
cleanArticle(
$article_content .= trim(
$this->cleanArticle(
extractFromDelimiters(
$article_html,
'<div class="articleContent',
'<footer>'
$article_html, '<article', '<footer'
)
)
);
$item = array();
$item['uri'] = $article_uri;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['timestamp'] = $article_timestamp;
$item['content'] = $article_content;
$this->items[] = $item;
$limit++;
}
$item = array();
$item['uri'] = $article_uri;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['timestamp'] = $article_timestamp;
$item['enclosures'] = array($article_thumbnail);
$item['content'] = $article_content;
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('topic'))) {
$topic = $this->getInput('topic');
return 'CNET News Bridge' . (empty($topic) ? '' : ' - ' . $topic);
}
return parent::getName();
}
}
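The two delimiter helpers that this diff removes from CNETBridge (and calls as shared functions instead) can be tried on their own. The following is a minimal standalone sketch: the function bodies are copied from the removed lines above, while the sample HTML string is made up for illustration.

```php
<?php
// Copied from the removed lines in this diff: returns the text between
// the first occurrence of $start and the next occurrence of $end.
function extractFromDelimiters($string, $start, $end) {
    if (strpos($string, $start) !== false) {
        $section_retrieved = substr($string, strpos($string, $start) + strlen($start));
        $section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
        return $section_retrieved;
    }
    return false;
}

// Copied from the removed lines in this diff: removes every section that
// begins with $start and ends with $end (both delimiters included).
function stripWithDelimiters($string, $start, $end) {
    while (strpos($string, $start) !== false) {
        $section_to_remove = substr($string, strpos($string, $start));
        $section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
        $string = str_replace($section_to_remove, '', $string);
    }
    return $string;
}

// Made-up sample input, not real CNET markup.
$html = '<article><p>Hello</p><script>track();</script><p>World</p><footer>x</footer>';

$body = extractFromDelimiters($html, '<article', '<footer');
$clean = stripWithDelimiters($body, '<script', '</script>');

echo $clean, "\n";
```

Because the start delimiter is `<article` without the closing bracket, the extracted text begins with a stray `>`. That is presumably why cleanArticle() then cuts everything before the first `<p>` or `<figure`.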


@@ -26,7 +26,7 @@ class ContainerLinuxReleasesBridge extends BridgeAbstract {
]
];
public function getReleaseFeed($jsonUrl) {
private function getReleaseFeed($jsonUrl) {
$json = getContents($jsonUrl)
or returnServerError('Could not request Core OS Website.');
return json_decode($json, true);


@@ -1,74 +0,0 @@
<?php
class CpasbienBridge extends BridgeAbstract {
const MAINTAINER = 'lagaisse';
const NAME = 'Cpasbien Bridge';
const URI = 'http://www.cpasbien.cm';
const CACHE_TIMEOUT = 86400; // 24h
const DESCRIPTION = 'Returns latest torrents from a request query';
const PARAMETERS = array( array(
'q' => array(
'name' => 'Search',
'required' => true,
'title' => 'Type your search'
)
));
public function collectData(){
$request = str_replace(' ', '-', trim($this->getInput('q')));
$html = getSimpleHTMLDOM(self::URI . '/recherche/' . urlencode($request) . '.html')
or returnServerError('No results for this query.');
foreach($html->find('#gauche', 0)->find('div') as $episode) {
if($episode->getAttribute('class') == 'ligne0'
|| $episode->getAttribute('class') == 'ligne1') {
$urlepisode = $episode->find('a', 0)->getAttribute('href');
$htmlepisode = getSimpleHTMLDOMCached($urlepisode, 86400 * 366 * 30);
$item = array();
$item['author'] = $episode->find('a', 0)->text();
$item['title'] = $episode->find('a', 0)->text();
$item['pubdate'] = $this->getCachedDate($urlepisode);
$textefiche = $htmlepisode->find('#textefiche', 0)->find('p', 1);
if(isset($textefiche)) {
$item['content'] = $textefiche->text();
} else {
$p = $htmlepisode->find('#textefiche', 0)->find('p');
if(!empty($p)) {
$item['content'] = $htmlepisode->find('#textefiche', 0)->find('p', 0)->text();
}
}
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['uri'] = self::URI . $htmlepisode->find('#telecharger', 0)->getAttribute('href');
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('q'))) {
return $this->getInput('q') . ' : ' . self::NAME;
}
return parent::getName();
}
private function getCachedDate($url){
debugMessage('getting pubdate from url ' . $url . '');
// Initialize cache
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR . '/pages');
$params = [$url];
$cache->setParameters($params);
// Get cachefile timestamp
$time = $cache->getTime();
return ($time !== false ? $time : time());
}
}


@@ -1,7 +1,7 @@
<?php
class DanbooruBridge extends BridgeAbstract {
const MAINTAINER = 'mitsukarenai';
const MAINTAINER = 'mitsukarenai, logmanoriginal';
const NAME = 'Danbooru';
const URI = 'http://donmai.us/';
const CACHE_TIMEOUT = 1800; // 30min
@@ -57,11 +57,80 @@ class DanbooruBridge extends BridgeAbstract {
}
public function collectData(){
$html = getSimpleHTMLDOM($this->getFullURI())
$content = getContents($this->getFullURI())
or returnServerError('Could not request ' . $this->getName());
$html = Fix_Simple_Html_Dom::str_get_html($content);
foreach($html->find(static::PATHTODATA) as $element) {
$this->items[] = $this->getItemFromElement($element);
}
}
}
/**
* This class is a monkey patch that 'extends' simplehtmldom to recognize <source>
* tags (HTML5) as self-closing tags. This patch should be removed once
* simplehtmldom is fixed. The same issue seems to affect other tags as well:
* https://sourceforge.net/p/simplehtmldom/bugs/83/
*
* The tag itself is valid according to Mozilla:
*
* The HTML <picture> element serves as a container for zero or more <source>
* elements and one <img> element to provide versions of an image for different
* display device scenarios. The browser will consider each of the child <source>
* elements and select one corresponding to the best match found; if no matches
* are found among the <source> elements, the file specified by the <img>
* element's src attribute is selected. The selected image is then presented in
* the space occupied by the <img> element.
*
* -- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture
*
* Notice: This class uses parts of the original simplehtmldom, adjusted to pass
* the guidelines of RSS-Bridge (formatting)
*/
final class Fix_Simple_Html_Dom extends simple_html_dom {
/* copy from simple_html_dom, added 'source' at the end */
protected $self_closing_tags = array(
'img' => 1,
'br' => 1,
'input' => 1,
'meta' => 1,
'link' => 1,
'hr' => 1,
'base' => 1,
'embed' => 1,
'spacer' => 1,
'source' => 1
);
/* copy from simplehtmldom, changed 'simple_html_dom' to 'Fix_Simple_Html_Dom' */
public static function str_get_html($str,
$lowercase = true,
$forceTagsClosed = true,
$target_charset = DEFAULT_TARGET_CHARSET,
$stripRN = true,
$defaultBRText = DEFAULT_BR_TEXT,
$defaultSpanText = DEFAULT_SPAN_TEXT)
{
$dom = new Fix_Simple_Html_Dom(null,
$lowercase,
$forceTagsClosed,
$target_charset,
$stripRN,
$defaultBRText,
$defaultSpanText);
if (empty($str) || strlen($str) > MAX_FILE_SIZE) {
$dom->clear();
return false;
}
$dom->load($str, $lowercase, $stripRN);
return $dom;
}
}


@@ -3,7 +3,7 @@ class DauphineLibereBridge extends FeedExpander {
const MAINTAINER = 'qwertygc';
const NAME = 'Dauphine Bridge';
const URI = 'http://www.ledauphine.com/';
const URI = 'https://www.ledauphine.com/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'Returns the newest articles.';
@@ -49,8 +49,9 @@ class DauphineLibereBridge extends FeedExpander {
private function extractContent($url){
$html2 = getSimpleHTMLDOMCached($url);
$text = $html2->find('div.column', 0)->innertext;
$text = preg_replace('@<script[^>]*?>.*?</script>@si', '', $text);
return $text;
foreach ($html2->find('.noprint, link, script, iframe, .shareTool, .contentInfo') as $remove) {
$remove->outertext = '';
}
return $html2->find('div.content', 0)->innertext;
}
}


@@ -157,7 +157,7 @@ class PepperBridgeAbstract extends BridgeAbstract {
/**
* Get the Deal data from the chosen group in the chosen order
*/
public function collectDataGroup()
protected function collectDataGroup()
{
$group = $this->getInput('group');
@@ -171,7 +171,7 @@ class PepperBridgeAbstract extends BridgeAbstract {
/**
* Get the Deal data from the chosen keywords and parameters
*/
public function collectDataKeywords()
protected function collectDataKeywords()
{
$q = $this->getInput('q');
$hide_expired = $this->getInput('hide_expired');
@@ -199,7 +199,7 @@ class PepperBridgeAbstract extends BridgeAbstract {
/**
* Get the Deal data using the given URL
*/
public function collectDeals($url){
protected function collectDeals($url){
$html = getSimpleHTMLDOM($url)
or returnServerError($this->i8n('request-error'));
$list = $html->find('article[id]');
@@ -241,9 +241,7 @@ class PepperBridgeAbstract extends BridgeAbstract {
' ', /* Notice this is a space! */
array(
'cept-description-container',
'overflow--wrap-break',
'size--all-s',
'size--fromW3-m'
'overflow--wrap-break'
)
);
@@ -577,7 +575,7 @@ class PepperBridgeAbstract extends BridgeAbstract {
* the "$lang" class variable in the local class
* @return various the local content needed
*/
public function i8n($key)
protected function i8n($key)
{
if (array_key_exists($key, $this->lang)) {
return $this->lang[$key];


@@ -9,8 +9,8 @@ class DilbertBridge extends BridgeAbstract {
public function collectData(){
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request Dilbert: ' . $this->getURI());
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request Dilbert: ' . self::URI);
foreach($html->find('section.comic-item') as $element) {


@@ -45,9 +45,10 @@ class ElloBridge extends BridgeAbstract {
$item = array();
$item['author'] = $this->getUsername($post, $postData);
$item['timestamp'] = strtotime($post->created_at);
$item['title'] = $this->findText($post->summary);
$item['title'] = strip_tags($this->findText($post->summary));
$item['content'] = $this->getPostContent($post->body);
$item['enclosures'] = $this->getEnclosures($post, $postData);
$item['uri'] = self::URI . $item['author'] . '/post/' . $post->token;
$content = $post->body;
$this->items[] = $item;
@@ -57,7 +58,7 @@ class ElloBridge extends BridgeAbstract {
}
public function findText($path) {
private function findText($path) {
foreach($path as $summaryElement) {
@@ -71,7 +72,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getPostContent($path) {
private function getPostContent($path) {
$content = '';
foreach($path as $summaryElement) {
@@ -92,7 +93,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getEnclosures($post, $postData) {
private function getEnclosures($post, $postData) {
$assets = [];
foreach($post->links->assets as $asset) {
@@ -108,7 +109,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getUsername($post, $postData) {
private function getUsername($post, $postData) {
foreach($postData->linked->users as $user) {
if($user->id == $post->links->author->id) {
@@ -118,7 +119,7 @@ class ElloBridge extends BridgeAbstract {
}
public function getAPIKey() {
private function getAPIKey() {
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR);
$cache->setParameters(['key']);


@@ -7,19 +7,9 @@ class EstCeQuonMetEnProdBridge extends BridgeAbstract {
const CACHE_TIMEOUT = 21600; // 6h
const DESCRIPTION = 'Should we put a website in production today? (French)';
public function collectData(){
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request EstCeQuonMetEnProd: ' . $this->getURI());
public function collectData() {
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request EstCeQuonMetEnProd: ' . self::URI);
$item = array();
$item['uri'] = $this->getURI() . '#' . date('Y-m-d');
@@ -28,8 +18,8 @@ class EstCeQuonMetEnProdBridge extends BridgeAbstract {
$item['timestamp'] = strtotime('today midnight');
$item['content'] = str_replace(
'src="/',
'src="' . $this->getURI(),
trim(extractFromDelimiters($html->outertext, '<body role="document">', '<br /><br />'))
'src="' . self::URI,
trim(extractFromDelimiters($html->outertext, '<body role="document">', '<div id="share'))
);
$this->items[] = $item;


@@ -275,7 +275,4 @@ EOD;
return 'http://facebook.com';
}
public function getCacheDuration(){
return 60 * 60 * 3; // 3 hours
}
}


@@ -253,17 +253,6 @@ class FacebookBridge extends BridgeAbstract {
private function collectUserData(){
//Extract a string using start and end delimiters
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
//Utility function for cleaning a Facebook link
$unescape_fb_link = function($matches){
if(is_array($matches) && count($matches) > 1) {
@@ -430,10 +419,12 @@ EOD;
if(isset($element)) {
defaultLinkTo($element, self::URI);
$author = str_replace(' | Facebook', '', $html->find('title#pageTitle', 0)->innertext);
$profilePic = 'https://graph.facebook.com/'
. $this->getInput('u')
. '/picture?width=200&amp;height=200';
. '/picture?width=200&amp;height=200#.image';
$this->authorName = $author;
@@ -489,6 +480,12 @@ EOD;
'',
$content);
//Remove "SpSonsSoriSsés"
$content = preg_replace(
'/(?iU)<a [^>]+ href="#" role="link" [^>}]+>.+<\/a>/iU',
'',
$content);
//Remove html nodes, keep only img, links, basic formatting
$content = strip_tags($content, '<a><img><i><u><br><p>');
@@ -536,7 +533,11 @@ EOD;
if(strlen($title) > 64)
$title = substr($title, 0, strpos(wordwrap($title, 64), "\n")) . '...';
$uri = self::URI . $post->find('abbr')[0]->parent()->getAttribute('href');
$uri = $post->find('abbr')[0]->parent()->getAttribute('href');
if (false !== strpos($uri, '?')) {
$uri = substr($uri, 0, strpos($uri, '?'));
}
//Build and add final item
$item['uri'] = htmlspecialchars_decode($uri);
@@ -544,6 +545,9 @@ EOD;
$item['title'] = $title;
$item['author'] = $author;
$item['timestamp'] = $date;
if(strpos($item['content'], '<img') === false)
$item['enclosures'] = array($profilePic);
$this->items[] = $item;
}
}


@@ -8,17 +8,22 @@ class FierPandaBridge extends BridgeAbstract {
const DESCRIPTION = 'Returns latest articles from Fier Panda.';
public function collectData(){
$html = getSimpleHTMLDOM(self::URI)
or returnServerError('Could not request Fier Panda.');
foreach($html->find('div.container-content article') as $element) {
defaultLinkTo($html, static::URI);
foreach($html->find('article') as $article) {
$item = array();
$item['uri'] = $this->getURI() . $element->find('a', 0)->href;
$item['title'] = trim($element->find('h1 a', 0)->innertext);
// Remove the link at the end of the article
$element->find('p a', 0)->outertext = '';
$item['content'] = $element->find('p', 0)->innertext;
$item['uri'] = $article->find('a', 0)->href;
$item['title'] = $article->find('a', 0)->title;
$this->items[] = $item;
}
}
}


@@ -6,6 +6,7 @@ class FilterBridge extends FeedExpander {
const NAME = 'Filter';
const CACHE_TIMEOUT = 3600; // 1h
const DESCRIPTION = 'Filters a feed of your choice';
const URI = 'https://github.com/rss-bridge/rss-bridge';
const PARAMETERS = array(array(
'url' => array(
@@ -26,11 +27,34 @@ class FilterBridge extends FeedExpander {
),
'defaultValue' => 'permit',
),
'title_from_content' => array(
'name' => 'Generate title from content',
'type' => 'checkbox',
'required' => false,
)
));
protected function parseItem($newItem){
$item = parent::parseItem($newItem);
if($this->getInput('title_from_content') && array_key_exists('content', $item)) {
$content = str_get_html($item['content']);
$pos = strpos($item['content'], ' ', 50);
$item['title'] = substr(
$content->plaintext,
0,
$pos
);
if(strlen($content->plaintext) >= $pos) {
$item['title'] .= '...';
}
}
switch(true) {
case $this->getFilterType() === 'permit':
if (preg_match($this->getFilter(), $item['title'])) {


@@ -30,30 +30,76 @@ class FlickrBridge extends BridgeAbstract {
'title' => 'Insert username (as shown in the address bar)',
'exampleValue' => 'flickr'
)
),
)
);
public function collectData(){
switch($this->queriedContext) {
case 'Explore':
$key = 'photos';
$filter = 'photo-lite-models';
$html = getSimpleHTMLDOM(self::URI . 'explore')
or returnServerError('Could not request Flickr.');
break;
case 'By keyword':
$key = 'photos';
$filter = 'photo-lite-models';
$html = getSimpleHTMLDOM(self::URI . 'search/?q=' . urlencode($this->getInput('q')) . '&s=rec')
or returnServerError('No results for this query.');
break;
case 'By username':
$key = 'photoPageList';
$filter = 'photo-models';
$html = getSimpleHTMLDOM(self::URI . 'photos/' . urlencode($this->getInput('u')))
or returnServerError('Requested username can\'t be found.');
break;
default:
returnClientError('Invalid context: ' . $this->queriedContext);
}
$model_json = $this->extractJsonModel($html);
$photo_models = $this->getPhotoModels($model_json, $filter);
foreach($photo_models as $model) {
$item = array();
/* Author name depends on scope. On a keyword search the
* author is part of the picture data. On a username search
* the author is part of the owner data.
*/
if(array_key_exists('username', $model)) {
$item['author'] = $model['username'];
} elseif (array_key_exists('owner', reset($model_json)[0])) {
$item['author'] = reset($model_json)[0]['owner']['username'];
}
$item['title'] = (array_key_exists('title', $model) ? $model['title'] : 'Untitled');
$item['uri'] = self::URI . 'photo.gne?id=' . $model['id'];
$description = (array_key_exists('description', $model) ? $model['description'] : '');
$item['content'] = '<a href="'
. $item['uri']
. '"><img src="'
. $this->extractContentImage($model)
. '" style="max-width: 640px; max-height: 480px;"/></a><br><p>'
. $description
. '</p>';
$item['enclosures'] = $this->extractEnclosures($model);
$this->items[] = $item;
}
}
private function extractJsonModel($html) {
// Find SCRIPT containing JSON data
$model = $html->find('.modelExport', 0);
$model_text = $model->innertext;
@@ -62,59 +108,79 @@ class FlickrBridge extends BridgeAbstract {
$start = strpos($model_text, 'modelExport:') + strlen('modelExport:');
$end = strpos($model_text, 'auth:') - strlen('auth:');
// Dissect JSON data and remove trailing comma
// Extract JSON data, remove trailing comma
$model_text = trim(substr($model_text, $start, $end - $start));
$model_text = substr($model_text, 0, strlen($model_text) - 1);
$model_json = json_decode($model_text, true);
return json_decode($model_text, true);
foreach($html->find('.photo-list-photo-view') as $element) {
// Get the styles
$style = explode(';', $element->style);
// Get the background-image style
$backgroundImage = explode(':', end($style));
// URI type : url(//cX.staticflickr.com/X/XXXXX/XXXXXXXXX.jpg)
$imageURI = trim(str_replace(['url(', ')'], '', end($backgroundImage)));
// Get the image ID
$imageURIs = explode('_', basename($imageURI));
$imageID = reset($imageURIs);
// Use JSON data to build items
foreach(reset($model_json)[0][$key]['_data'] as $element) {
if($element['id'] === $imageID) {
$item = array();
/* Author name depends on scope. On a keyword search the
* author is part of the picture data. On a username search
* the author is part of the owner data.
*/
if(array_key_exists('username', $element)) {
$item['author'] = $element['username'];
} elseif (array_key_exists('owner', reset($model_json)[0])) {
$item['author'] = reset($model_json)[0]['owner']['username'];
}
$item['title'] = (array_key_exists('title', $element) ? $element['title'] : 'Untitled');
$item['uri'] = self::URI . 'photo.gne?id=' . $imageID;
$description = (array_key_exists('description', $element) ? $element['description'] : '');
$item['content'] = '<a href="'
. $item['uri']
. '"><img src="'
. $imageURI
. '" /></a><br><p>'
. $description
. '</p>';
$this->items[] = $item;
break;
}
}
}
}
private function getPhotoModels($json, $filter) {
// The JSON model contains a "legend" array, where each element contains
// the path to an element in the "main" object
$photo_models = array();
foreach($json['legend'] as $legend) {
$photo_model = $json['main'];
foreach($legend as $element) { // Traverse tree
$photo_model = $photo_model[$element];
}
// We are only interested in content
if($photo_model['_flickrModelRegistry'] === $filter) {
$photo_models[] = $photo_model;
}
}
return $photo_models;
}
private function extractEnclosures($model) {
$areas = array();
foreach($model['sizes'] as $size) {
$areas[$size['width'] * $size['height']] = $size['url'];
}
// Select by area (the array key); max($areas) would compare the URL strings
return array($this->fixURL($areas[max(array_keys($areas))]));
}
private function extractContentImage($model) {
$areas = array();
$limit = 320 * 240;
foreach($model['sizes'] as $size) {
$image_area = $size['width'] * $size['height'];
if($image_area >= $limit) {
$areas[$image_area] = $size['url'];
}
}
// Select by area (the array key); min($areas) would compare the URL strings
return $this->fixURL($areas[min(array_keys($areas))]);
}
private function fixURL($url) {
// For some reason the image URLs don't include the protocol (https)
if(strpos($url, '//') === 0) {
$url = 'https:' . $url;
}
return $url;
}
}
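The legend lookup in getPhotoModels() can be exercised without Flickr. The following is a minimal sketch, not the bridge's code: the function name `resolveLegendPaths` and the sample data are hypothetical, and the only assumption carried over from the diff is that each legend entry is a list of keys leading into the `main` object.

```php
<?php
// Hypothetical helper: resolves every legend path against the 'main' tree
// and keeps only the models whose registry name equals $filter.
function resolveLegendPaths(array $json, string $filter): array {
    $models = array();

    foreach ($json['legend'] as $path) {
        $node = $json['main'];

        foreach ($path as $key) { // walk one level down per key
            $node = $node[$key];
        }

        if (is_array($node)
            && isset($node['_flickrModelRegistry'])
            && $node['_flickrModelRegistry'] === $filter) {
            $models[] = $node;
        }
    }

    return $models;
}

// Made-up sample data shaped like the structure described in the comments.
$json = array(
    'legend' => array(
        array('photos', '_data', 0),
        array('photos', '_data', 1),
        array('owner'),
    ),
    'main' => array(
        'photos' => array('_data' => array(
            array('_flickrModelRegistry' => 'photo-lite-models', 'id' => '1'),
            array('_flickrModelRegistry' => 'photo-lite-models', 'id' => '2'),
        )),
        'owner' => array('_flickrModelRegistry' => 'person-models', 'id' => '9'),
    ),
);

$photos = resolveLegendPaths($json, 'photo-lite-models');

echo count($photos), "\n";
```

Filtering on the registry name is what lets the same traversal serve the 'photo-lite-models' and 'photo-models' contexts used in collectData().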

41
bridges/ForGifsBridge.php Normal file

@@ -0,0 +1,41 @@
<?php
class ForGifsBridge extends FeedExpander {
const MAINTAINER = 'logmanoriginal';
const NAME = 'forgifs Bridge';
const URI = 'https://forgifs.com';
const DESCRIPTION = 'Returns the forgifs feed with actual gifs instead of images';
public function collectData() {
$this->collectExpandableDatas('https://forgifs.com/gallery/srss/7');
}
protected function parseItem($feedItem) {
$item = parent::parseItem($feedItem);
$content = str_get_html($item['content']);
$img = $content->find('img', 0);
$poster = $img->src;
// The actual gif is the same path but its id must be decremented by one.
// Example:
// http://forgifs.com/gallery/d/279419-2/Reporter-videobombed-shoulder-checks.gif
// http://forgifs.com/gallery/d/279418-2/Reporter-videobombed-shoulder-checks.gif
// Notice how this changes ----------^
// Now let's extract that number and do some math
// Notice: Technically we could also load the content page but that would
// require unnecessary traffic. As long as it works...
$num = substr($img->src, 29, 6);
$num -= 1;
$img->src = substr_replace($img->src, $num, 29, strlen($num));
$img->width = 'auto';
$img->height = 'auto';
$item['content'] = $content;
return $item;
}
}
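The fixed offset of 29 used in parseItem() equals the length of `http://forgifs.com/gallery/d/`, so it only lines up for URLs with exactly that prefix. As an alternative, and not what the bridge does, the id could be located with a regular expression. The sketch below assumes the id is the run of digits directly after `/gallery/d/`; the helper name `posterToGifUrl` is hypothetical.

```php
<?php
// Hypothetical helper: decrements the numeric id that follows '/gallery/d/'.
// Assumes the id is a run of digits directly after that path segment.
function posterToGifUrl(string $posterUrl): string {
    return preg_replace_callback(
        '#(/gallery/d/)(\d+)#',
        function (array $m) {
            return $m[1] . ((int)$m[2] - 1);
        },
        $posterUrl,
        1 // replace the first match only
    );
}

// Example URL taken from the comment in the diff.
$poster = 'http://forgifs.com/gallery/d/279419-2/Reporter-videobombed-shoulder-checks.gif';
$gif = posterToGifUrl($poster);

echo $gif, "\n";
```

The same call leaves the rest of the URL untouched, so it would behave identically if the feed switched to `https://` sources.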


@@ -3,7 +3,7 @@ class FuturaSciencesBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'Futura-Sciences Bridge';
const URI = 'http://www.futura-sciences.com/';
const URI = 'https://www.futura-sciences.com/';
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array( array(
@@ -90,42 +90,11 @@ class FuturaSciencesBridge extends FeedExpander {
or returnServerError('Could not request Futura-Sciences: ' . $item['uri']);
$item['content'] = $this->extractArticleContent($article);
$author = $this->extractAuthor($article);
$item['author'] = empty($author) ? $item['author'] : $author;
if (!empty($author))
$item['author'] = $author;
return $item;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
} return $string;
}
private function stripRecursiveHTMLSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr($string, $section_start, $section_end - $section_start + $close_tag_length);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while ($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
private function extractArticleContent($article){
$contents = $article->find('section.article-text-classic', 0)->innertext;
$headline = trim($article->find('p.description', 0)->plaintext);
@@ -137,6 +106,7 @@ class FuturaSciencesBridge extends FeedExpander {
'<div class="sharebar2',
'<div class="diaporamafullscreen"',
'<div class="module social-button',
'<div class="module social-share',
'<div style="margin-bottom:10px;" class="noprint"',
'<div class="ficheprevnext',
'<div class="bar noprint',
@@ -148,16 +118,17 @@ class FuturaSciencesBridge extends FeedExpander {
'<div id="forumcomments',
'<div ng-if="active"'
) as $div_start) {
$contents = $this->stripRecursiveHTMLSection($contents, 'div', $div_start);
$contents = stripRecursiveHTMLSection($contents, 'div', $div_start);
}
$contents = $this->stripWithDelimiters($contents, '<hr ', '/>');
$contents = $this->stripWithDelimiters($contents, '<p class="content-date', '</p>');
$contents = $this->stripWithDelimiters($contents, '<h1 class="content-title', '</h1>');
$contents = $this->stripWithDelimiters($contents, 'fs:definition="', '"');
$contents = $this->stripWithDelimiters($contents, 'fs:xt:clicktype="', '"');
$contents = $this->stripWithDelimiters($contents, 'fs:xt:clickname="', '"');
$contents = $this->stripWithDelimiters($contents, '<script ', '</script>');
$contents = stripWithDelimiters($contents, '<hr ', '/>');
$contents = stripWithDelimiters($contents, '<p class="content-date', '</p>');
$contents = stripWithDelimiters($contents, '<h1 class="content-title', '</h1>');
$contents = stripWithDelimiters($contents, 'fs:definition="', '"');
$contents = stripWithDelimiters($contents, 'fs:xt:clicktype="', '"');
$contents = stripWithDelimiters($contents, 'fs:xt:clickname="', '"');
$contents = stripWithDelimiters($contents, '<section class="module-toretain module-propal-nl', '</section>');
$contents = stripWithDelimiters($contents, '<script ', '</script>');
return $headline . trim($contents);
}


@@ -20,50 +20,58 @@ class GBAtempBridge extends BridgeAbstract {
)
));
private function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function buildItem($uri, $title, $author, $timestamp, $content){
private function buildItem($uri, $title, $author, $timestamp, $thumbnail, $content){
$item = array();
$item['uri'] = $uri;
$item['title'] = $title;
$item['author'] = $author;
$item['timestamp'] = $timestamp;
$item['content'] = $content;
if (!empty($thumbnail)) {
$item['enclosures'] = array($thumbnail);
}
return $item;
}
private function cleanupPostContent($content, $site_url){
$content = str_replace(':arrow:', '&#x27a4;', $content);
$content = str_replace('href="attachments/', 'href="'.$site_url.'attachments/', $content);
$content = $this->stripWithDelimiters($content, '<script', '</script>');
$content = stripWithDelimiters($content, '<script', '</script>');
return $content;
}
private function findItemDate($item){
$time = 0;
$dateField = $item->find('abbr.DateTime', 0);
if (is_object($dateField)) {
$time = intval(
extractFromDelimiters(
$dateField->outertext,
'data-time="',
'"'
)
);
} else {
$dateField = $item->find('span.DateTime', 0);
$time = DateTime::createFromFormat(
'M j, Y \a\t g:i A',
extractFromDelimiters(
$dateField->outertext,
'title="',
'"'
)
)->getTimestamp();
}
return $time;
}
private function fetchPostContent($uri, $site_url){
$html = getSimpleHTMLDOM($uri);
$html = getSimpleHTMLDOMCached($uri);
if(!$html) {
return 'Could not request GBAtemp ' . $uri;
return 'Could not request GBAtemp: ' . $uri;
}
$content = $html->find('div.messageContent', 0)->innertext;
$content = $html->find('div.messageContent, blockquote.baseHtml', 0)->innertext;
return $this->cleanupPostContent($content, $site_url);
}
@@ -76,70 +84,56 @@ class GBAtempBridge extends BridgeAbstract {
case 'N':
foreach($html->find('li[class=news_item full]') as $newsItem) {
$url = self::URI . $newsItem->find('a', 0)->href;
$time = intval(
$this->extractFromDelimiters(
$newsItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$img = $this->getURI() . $newsItem->find('img', 0)->src . '#.image';
$time = $this->findItemDate($newsItem);
$author = $newsItem->find('a.username', 0)->plaintext;
$title = $newsItem->find('a', 1)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, $img, $content);
unset($newsItem); // Some items are heavy; freeing them proactively helps save memory
}
break;
case 'R':
foreach($html->find('li.portal_review') as $reviewItem) {
$url = self::URI . $reviewItem->find('a', 0)->href;
$img = $this->getURI() . extractFromDelimiters($reviewItem->find('a', 0)->style, 'image:url(', ')');
$title = $reviewItem->find('span.review_title', 0)->plaintext;
$content = getSimpleHTMLDOM($url)
or returnServerError('Could not request GBAtemp: ' . $url);
$author = $content->find('a.username', 0)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$content->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($content);
$intro = '<p><b>' . ($content->find('div#review_intro', 0)->plaintext) . '</b></p>';
$review = $content->find('div#review_main', 0)->innertext;
$subheader = '<p><b>' . $content->find('div.review_subheader', 0)->plaintext . '</b></p>';
$procons = $content->find('table.review_procons', 0)->outertext;
$scores = $content->find('table.reviewscores', 0)->outertext;
$content = $this->cleanupPostContent($intro . $review . $subheader . $procons . $scores, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, $img, $content);
unset($reviewItem); // Free up memory
}
break;
case 'T':
foreach($html->find('li.portal-tutorial') as $tutorialItem) {
$url = self::URI . $tutorialItem->find('a', 0)->href;
$title = $tutorialItem->find('a', 0)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$tutorialItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($tutorialItem);
$author = $tutorialItem->find('a.username', 0)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, null, $content);
unset($tutorialItem); // Free up memory
}
break;
case 'F':
foreach($html->find('li.rc_item') as $postItem) {
$url = self::URI . $postItem->find('a', 1)->href;
$title = $postItem->find('a', 1)->plaintext;
$time = intval(
$this->extractFromDelimiters(
$postItem->find('abbr.DateTime', 0)->outertext,
'data-time="',
'"'
)
);
$time = $this->findItemDate($postItem);
$author = $postItem->find('a.username', 0)->plaintext;
$content = $this->fetchPostContent($url, self::URI);
$this->items[] = $this->buildItem($url, $title, $author, $time, $content);
$this->items[] = $this->buildItem($url, $title, $author, $time, null, $content);
unset($postItem); // Free up memory
}
break;
}
}

bridges/GOGBridge.php Normal file

@@ -0,0 +1,66 @@
<?php
class GOGBridge extends BridgeAbstract {
const NAME = 'GOGBridge';
const MAINTAINER = 'teromene';
const URI = 'https://gog.com';
const DESCRIPTION = 'Returns the latest releases from GOG.com';
public function collectData() {
$values = getContents('https://www.gog.com/games/ajax/filtered?limit=25&sort=new') or
die('Unable to get the news pages from GOG !');
$decodedValues = json_decode($values);
$limit = 0;
foreach($decodedValues->products as $game) {
$item = array();
$item['author'] = $game->developer . ' / ' . $game->publisher;
$item['title'] = $game->title;
$item['id'] = $game->id;
$item['uri'] = self::URI . $game->url;
$item['content'] = $this->buildGameContentPage($game);
$item['timestamp'] = $game->globalReleaseDate;
foreach($game->gallery as $image) {
$item['enclosures'][] = $image . '.jpg';
}
$this->items[] = $item;
$limit += 1;
if($limit == 10) break;
}
}
private function buildGameContentPage($game) {
$gameDescriptionText = getContents('https://api.gog.com/products/' . $game->id . '?expand=description') or
die('Unable to get game description from GOG !');
$gameDescriptionValue = json_decode($gameDescriptionText);
$content = 'Genres: ';
$content .= implode(', ', $game->genres);
$content .= '<br />Supported Platforms: ';
if($game->worksOn->Windows) {
$content .= 'Windows ';
}
if($game->worksOn->Mac) {
$content .= 'Mac ';
}
if($game->worksOn->Linux) {
$content .= 'Linux ';
}
$content .= '<br />' . $gameDescriptionValue->description->full;
return $content;
}
}


@@ -0,0 +1,164 @@
<?php
class GitHubGistBridge extends BridgeAbstract {
const NAME = 'GitHubGist comment bridge';
const URI = 'https://gist.github.com';
const DESCRIPTION = 'Generates feeds for Gist comments';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 3600;
const PARAMETERS = array(array(
'id' => array(
'name' => 'Gist',
'type' => 'text',
'required' => true,
'title' => 'Insert Gist ID or URI',
'exampleValue' => '2646763, https://gist.github.com/2646763'
)
));
private $filename;
public function getURI() {
$id = $this->getInput('id') ?: '';
$urlpath = parse_url($id, PHP_URL_PATH);
if($urlpath) {
$components = explode('/', $urlpath);
$id = end($components);
}
return static::URI . '/' . $id;
}
public function getName() {
return $this->filename ? $this->filename . ' - ' . static::NAME : static::NAME;
}
public function collectData() {
$html = getSimpleHTMLDOM($this->getURI(),
null,
null,
true,
true,
DEFAULT_TARGET_CHARSET,
false, // Do NOT remove line breaks
DEFAULT_BR_TEXT,
DEFAULT_SPAN_TEXT)
or returnServerError('Could not request ' . $this->getURI());
$html = defaultLinkTo($html, static::URI);
$fileinfo = $html->find('[class="file-info"]', 0)
or returnServerError('Could not find file info!');
$this->filename = $fileinfo->plaintext;
$comments = $html->find('div[class="timeline-comment-wrapper"]');
if(is_null($comments)) { // no comments yet
return;
}
foreach($comments as $comment) {
$uri = $comment->find('a[href^=#gistcomment]', 0)
or returnServerError('Could not find comment anchor!');
$title = $comment->find('div[class="unminimized-comment"] h3[class="timeline-comment-header-text"]', 0)
or returnServerError('Could not find comment header text!');
$datetime = $comment->find('[datetime]', 0)
or returnServerError('Could not find comment datetime!');
$author = $comment->find('a.author', 0)
or returnServerError('Could not find author name!');
$message = $comment->find('[class="comment-body"]', 0)
or returnServerError('Could not find comment body!');
$item = array();
$item['uri'] = $this->getURI() . $uri->href;
$item['title'] = str_replace('commented', 'commented on', $title->plaintext);
$item['timestamp'] = strtotime($datetime->datetime);
$item['author'] = '<a href="' . $author->href . '">' . $author->plaintext . '</a>';
$item['content'] = $this->fixContent($message);
// $item['enclosures'] = array();
// $item['categories'] = array();
$this->items[] = $item;
}
}
/** Removes all unnecessary tags and adds formatting */
private function fixContent($content){
// Restore code (inside <pre />) highlighting
foreach($content->find('pre') as $pre) {
$pre->style = <<<EOD
padding: 16px;
overflow: auto;
font-size: 85%;
line-height: 1.45;
background-color: #f6f8fa;
border-radius: 3px;
word-wrap: normal;
box-sizing: border-box;
margin-bottom: 16px;
EOD;
$code = $pre->find('code', 0);
if($code) {
$code->style = <<<EOD
white-space: pre;
word-break: normal;
EOD;
}
}
// find <code /> not inside <pre /> (`inline-code`)
foreach($content->find('code') as $code) {
if($code->parent()->tag === 'pre') {
continue;
}
$code->style = <<<EOD
background-color: rgba(27,31,35,0.05);
padding: 0.2em 0.4em;
border-radius: 3px;
EOD;
}
// restore text spacing
foreach($content->find('p') as $p) {
$p->style = 'margin-bottom: 16px;';
}
// Remove unnecessary tags
$content = strip_tags(
$content->innertext,
'<p><a><img><ol><ul><li><table><tr><th><td><strong><pre><code><br><hr><h>'
);
return $content;
}
}


@@ -34,13 +34,29 @@ class GithubSearchBridge extends BridgeAbstract {
$title = $element->find('h3', 0)->plaintext;
$item['title'] = $title;
if (count($element->find('p')) == 2) {
$content = $element->find('p', 0)->innertext;
// Description
if (count($element->find('p.d-inline-block')) != 0) {
$content = $element->find('p.d-inline-block', 0)->innertext;
} else {
$content = '';
$content = 'No description';
}
$item['content'] = $content;
// Tags
$content = $content . '<br />';
$tags = $element->find('a.topic-tag');
$tags_array = array();
if (count($tags) != 0) {
$content = $content . 'Tags : ';
foreach($tags as $tag_element) {
$tag_link = 'https://github.com' . $tag_element->href;
$tag_name = trim($tag_element->innertext);
$content = $content . '<a href="' . $tag_link . '">' . $tag_name . '</a> ';
array_push($tags_array, $tag_name);
}
}
$item['categories'] = $tags_array;
$item['content'] = $content;
$date = $element->find('relative-time', 0)->datetime;
$item['timestamp'] = strtotime($date);


@@ -1,12 +1,12 @@
<?php
class GooglePlusPostBridge extends BridgeAbstract{
protected $_title;
protected $_url;
private $title;
private $url;
const MAINTAINER = 'Grummfy';
const MAINTAINER = 'Grummfy, logmanoriginal';
const NAME = 'Google Plus Post Bridge';
const URI = 'https://plus.google.com/';
const URI = 'https://plus.google.com';
const CACHE_TIMEOUT = 600; //10min
const DESCRIPTION = 'Returns user public post (without API).';
@@ -14,10 +14,16 @@ class GooglePlusPostBridge extends BridgeAbstract{
'username' => array(
'name' => 'username or Id',
'required' => true
),
'include_media' => array(
'name' => 'Include media',
'type' => 'checkbox',
'title' => 'Enable to include media in the feed content'
)
));
public function collectData(){
$username = $this->getInput('username');
// Usernames start with a + if it's not an ID
@@ -25,22 +31,20 @@ class GooglePlusPostBridge extends BridgeAbstract{
$username = '+' . $username;
}
// get content parsed
$html = getSimpleHTMLDOMCached(self::URI . urlencode($username) . '/posts')
$html = getSimpleHTMLDOM(static::URI . '/' . urlencode($username) . '/posts')
or returnServerError('No results for this query.');
// get title, url, ... there is a lot of interesting stuff in meta
$this->_title = $html->find('meta[property=og:title]', 0)->getAttribute('content');
$this->_url = $html->find('meta[property=og:url]', 0)->getAttribute('content');
$html = defaultLinkTo($html, static::URI);
$this->title = $html->find('meta[property=og:title]', 0)->getAttribute('content');
$this->url = $html->find('meta[property=og:url]', 0)->getAttribute('content');
// I don't even know where to start with this disgusting HTML...
foreach($html->find('div[jsname=WsjYwc]') as $post) {
$item = array();
$item['author'] = $item['fullname'] = $post->find('div div div div a', 0)->innertext;
$item['id'] = $post->find('div div div', 0)->getAttribute('id');
$item['avatar'] = $post->find('div img', 0)->src;
$item['uri'] = self::URI . $post->find('div div div a', 1)->href;
$item['author'] = $post->find('div div div div a', 0)->innertext;
$item['uri'] = $post->find('div div div a', 1)->href;
$timestamp = $post->find('a.qXj2He span', 0);
@@ -51,61 +55,151 @@ class GooglePlusPostBridge extends BridgeAbstract{
$timestamp->getAttribute('aria-label')));
}
// hashtag to treat : https://plus.google.com/explore/tag
// $hashtags = array();
// foreach($post->find('a.d-s') as $hashtag){
// $hashtags[trim($hashtag->plaintext)] = self::URI . $hashtag->href;
// }
$message = $post->find('div[jsname=EjRJtf]', 0);
$item['content'] = '';
// Empty messages are not supported right now
if(!$message) {
continue;
}
// avatar display
$item['content'] .= '<div style="float:left; margin: 0 0.5em 0.5em 0;"><a href="'
. self::URI
. urlencode($this->getInput('username'));
$item['content'] .= '"><img align="top" alt="'
$item['content'] = '<div style="float: left; padding: 0 10px 10px 0;"><a href="'
. $this->url
. '"><img align="top" alt="'
. $item['author']
. '" src="'
. $item['avatar']
. '" /></a></div>';
. $post->find('div img', 0)->src
. '" /></a></div><div>'
. trim(strip_tags($message, '<a><p><div><img>'))
. '</div>';
$content = $post->find('div[jsname=EjRJtf]', 0);
// extract plaintext
$item['content_simple'] = $content->plaintext;
$item['title'] = substr($item['content_simple'], 0, 72) . '...';
// XXX: ugly, but I don't know a better way to do this;
// str_replace on links doesn't work as expected and requires too many checks
foreach($content->find('a') as $link) {
$hasHttp = strpos($link->href, 'http');
$hasDoubleSlash = strpos($link->href, '//');
if((!$hasHttp && !$hasDoubleSlash)
|| (false !== $hasHttp && strpos($link->href, 'http') != 0)
|| (false === $hasHttp && false !== $hasDoubleSlash && $hasDoubleSlash != 0)) {
// skip bad links, e.g. for some hashtags or other stuff
if(strpos($link->href, '/') == 0) {
$link->href = substr($link->href, 1);
}
$link->href = self::URI . $link->href;
}
// Make title at least 50 characters long, but don't add '...' if it is shorter!
if(strlen($message->plaintext) > 50) {
$end = strpos($message->plaintext, ' ', 50) ?: strlen($message->plaintext);
} else {
$end = strlen($message->plaintext);
}
$content = $content->innertext;
$item['content'] .= '<div style="margin-top: -1.5em">' . $content . '</div>';
$item['content'] = trim(strip_tags($item['content'], '<a><p><div><img>'));
if(strlen(substr($message->plaintext, 0, $end)) === strlen($message->plaintext)) {
$item['title'] = $message->plaintext;
} else {
$item['title'] = substr($message->plaintext, 0, $end) . '...';
}
$media = $post->find('[jsname="MTOxpb"]', 0);
if($media) {
$item['enclosures'] = array();
foreach($media->find('img') as $img) {
$item['enclosures'][] = $this->fixImage($img)->src;
}
if($this->getInput('include_media') === true && count($item['enclosures']) > 0) {
$item['content'] .= '<div style="clear: both;"><a href="'
. $item['enclosures'][0]
. '"><img src="'
. $item['enclosures'][0]
. '" /></a></div>';
}
}
// Add custom parameters (only useful for JSON or Plaintext)
$item['fullname'] = $item['author'];
$item['avatar'] = $post->find('div img', 0)->src;
$item['id'] = $post->find('div div div', 0)->getAttribute('id');
$item['content_simple'] = $message->plaintext;
$this->items[] = $item;
}
}
public function getName(){
return $this->_title ?: 'Google Plus Post Bridge';
return $this->title ?: 'Google Plus Post Bridge';
}
public function getURI(){
return $this->_url ?: parent::getURI();
return $this->url ?: parent::getURI();
}
private function fixImage($img) {
// There are certain images like .gif which link to a static picture and
// get replaced dynamically via JS in the browser. If we want the "real"
// image we need to account for that.
$urlparts = parse_url($img->src);
if(array_key_exists('host', $urlparts)) {
// For some reason some URIs don't contain the scheme, assume https
if(!array_key_exists('scheme', $urlparts)) {
$urlparts['scheme'] = 'https';
}
$pathelements = explode('/', $urlparts['path']);
switch($urlparts['host']) {
case 'lh3.googleusercontent.com':
if(pathinfo(end($pathelements), PATHINFO_EXTENSION)) {
// The second to last element of the path specifies the
// image format. The URL is still valid if we remove it.
unset($pathelements[count($pathelements) - 2]);
} elseif(strrpos(end($pathelements), '=') !== false) {
// Some images go through a proxy. For those images they
// add size information after an equals sign.
// Example: '=w530-h298-n'. Again this can safely be
// removed to get the original image.
$pathelements[count($pathelements) - 1] = substr(
end($pathelements),
0,
strrpos(end($pathelements), '=')
);
}
break;
}
$urlparts['path'] = implode('/', $pathelements);
}
$img->src = $this->build_url($urlparts);
return $img;
}
/**
* From: https://gist.github.com/Ellrion/f51ba0d40ae1d62eeae44fd1adf7b704
* slightly adjusted to work with PHP < 7.0
* @param array $parts
* @return string
*/
private function build_url(array $parts)
{
$scheme = isset($parts['scheme']) ? ($parts['scheme'] . '://') : '';
$host = isset($parts['host']) ? $parts['host'] : '';
$port = isset($parts['port']) ? (':' . $parts['port']) : '';
$user = isset($parts['user']) ? $parts['user'] : '';
$pass = isset($parts['pass']) ? (':' . $parts['pass']) : '';
$pass = ($user || $pass) ? ($pass . '@') : '';
$path = isset($parts['path']) ? $parts['path'] : '';
$query = isset($parts['query']) ? ('?' . $parts['query']) : '';
$fragment = isset($parts['fragment']) ? ('#' . $parts['fragment']) : '';
return implode('', [$scheme, $user, $pass, $host, $port, $path, $query, $fragment]);
}
}


@@ -49,6 +49,7 @@ class GrandComicsDatabaseBridge extends BridgeAbstract {
}
// Build final item
$content = str_replace('href="/', 'href="' . static::URI, $content);
$item = array();
$item['title'] = $seriesName . ' - ' . $key_date;
$item['timestamp'] = strtotime($key_date);


@@ -0,0 +1,370 @@
<?php
/**
* This class implements a bridge for http://www.instructables.com, supporting
* general feeds and feeds by category. Instructables doesn't support HTTPS as
* of now (23.06.2018), so all connections are insecure!
*
* Remarks:
* - For some reason it is very important to have the category URI end with a
* slash, otherwise the site defaults to the main category (i.e. Technology)!
* If you need to update the categories list, enable the 'listCategories'
* function (see comments below) and run the bridge with format=Html (see page
* source)
*/
class InstructablesBridge extends BridgeAbstract {
const NAME = 'Instructables Bridge';
const URI = 'http://www.instructables.com';
const DESCRIPTION = 'Returns general feeds and feeds by category';
const MAINTAINER = 'logmanoriginal';
const PARAMETERS = array(
'Category' => array(
'category' => array(
'name' => 'Category',
'type' => 'list',
'required' => true,
'values' => array(
'Play' => array(
'All' => '/play/',
'KNEX' => '/play/knex/',
'Offbeat' => '/play/offbeat/',
'Lego' => '/play/lego/',
'Airsoft' => '/play/airsoft/',
'Card Games' => '/play/card-games/',
'Guitars' => '/play/guitars/',
'Instruments' => '/play/instruments/',
'Magic Tricks' => '/play/magic-tricks/',
'Minecraft' => '/play/minecraft/',
'Music' => '/play/music/',
'Nerf' => '/play/nerf/',
'Nintendo' => '/play/nintendo/',
'Office Supplies' => '/play/office-supplies/',
'Paintball' => '/play/paintball/',
'Paper Airplanes' => '/play/paper-airplanes/',
'Party Tricks' => '/play/party-tricks/',
'PlayStation' => '/play/playstation/',
'Pranks and Humor' => '/play/pranks-and-humor/',
'Puzzles' => '/play/puzzles/',
'Siege Engines' => '/play/siege-engines/',
'Sports' => '/play/sports/',
'Table Top' => '/play/table-top/',
'Toys' => '/play/toys/',
'Video Games' => '/play/video-games/',
'Wii' => '/play/wii/',
'Xbox' => '/play/xbox/',
'Yo-Yo' => '/play/yo-yo/',
),
'Craft' => array(
'All' => '/craft/',
'Art' => '/craft/art/',
'Sewing' => '/craft/sewing/',
'Paper' => '/craft/paper/',
'Jewelry' => '/craft/jewelry/',
'Fashion' => '/craft/fashion/',
'Books & Journals' => '/craft/books-and-journals/',
'Cards' => '/craft/cards/',
'Clay' => '/craft/clay/',
'Duct Tape' => '/craft/duct-tape/',
'Embroidery' => '/craft/embroidery/',
'Felt' => '/craft/felt/',
'Fiber Arts' => '/craft/fiber-arts/',
'Gifts & Wrapping' => '/craft/gifts-and-wrapping/',
'Knitting & Crocheting' => '/craft/knitting-and-crocheting/',
'Leather' => '/craft/leather/',
'Mason Jars' => '/craft/mason-jars/',
'No-Sew' => '/craft/no-sew/',
'Parties & Weddings' => '/craft/parties-and-weddings/',
'Print Making' => '/craft/print-making/',
'Soap' => '/craft/soap/',
'Wallets' => '/craft/wallets/',
),
'Technology' => array(
'All' => '/technology/',
'Electronics' => '/technology/electronics/',
'Arduino' => '/technology/arduino/',
'Photography' => '/technology/photography/',
'Leds' => '/technology/leds/',
'Science' => '/technology/science/',
'Reuse' => '/technology/reuse/',
'Apple' => '/technology/apple/',
'Computers' => '/technology/computers/',
'3D Printing' => '/technology/3D-Printing/',
'Robots' => '/technology/robots/',
'Art' => '/technology/art/',
'Assistive Tech' => '/technology/assistive-technology/',
'Audio' => '/technology/audio/',
'Clocks' => '/technology/clocks/',
'CNC' => '/technology/cnc/',
'Digital Graphics' => '/technology/digital-graphics/',
'Gadgets' => '/technology/gadgets/',
'Kits' => '/technology/kits/',
'Laptops' => '/technology/laptops/',
'Lasers' => '/technology/lasers/',
'Linux' => '/technology/linux/',
'Microcontrollers' => '/technology/microcontrollers/',
'Microsoft' => '/technology/microsoft/',
'Mobile' => '/technology/mobile/',
'Raspberry Pi' => '/technology/raspberry-pi/',
'Remote Control' => '/technology/remote-control/',
'Sensors' => '/technology/sensors/',
'Software' => '/technology/software/',
'Soldering' => '/technology/soldering/',
'Speakers' => '/technology/speakers/',
'Steampunk' => '/technology/steampunk/',
'Tools' => '/technology/tools/',
'USB' => '/technology/usb/',
'Wearables' => '/technology/wearables/',
'Websites' => '/technology/websites/',
'Wireless' => '/technology/wireless/',
),
'Workshop' => array(
'All' => '/workshop/',
'Woodworking' => '/workshop/woodworking/',
'Tools' => '/workshop/tools/',
'Gardening' => '/workshop/gardening/',
'Cars' => '/workshop/cars/',
'Metalworking' => '/workshop/metalworking/',
'Cardboard' => '/workshop/cardboard/',
'Electric Vehicles' => '/workshop/electric-vehicles/',
'Energy' => '/workshop/energy/',
'Furniture' => '/workshop/furniture/',
'Home Improvement' => '/workshop/home-improvement/',
'Home Theater' => '/workshop/home-theater/',
'Hydroponics' => '/workshop/hydroponics/',
'Laser Cutting' => '/workshop/laser-cutting/',
'Lighting' => '/workshop/lighting/',
'Molds & Casting' => '/workshop/molds-and-casting/',
'Motorcycles' => '/workshop/motorcycles/',
'Organizing' => '/workshop/organizing/',
'Pallets' => '/workshop/pallets/',
'Repair' => '/workshop/repair/',
'Shelves' => '/workshop/shelves/',
'Solar' => '/workshop/solar/',
'Workbenches' => '/workshop/workbenches/',
),
'Home' => array(
'All' => '/home/',
'Halloween' => '/home/halloween/',
'Decorating' => '/home/decorating/',
'Organizing' => '/home/organizing/',
'Pets' => '/home/pets/',
'Life Hacks' => '/home/life-hacks/',
'Beauty' => '/home/beauty/',
'Christmas' => '/home/christmas/',
'Cleaning' => '/home/cleaning/',
'Education' => '/home/education/',
'Finances' => '/home/finances/',
'Gardening' => '/home/gardening/',
'Green' => '/home/green/',
'Health' => '/home/health/',
'Hiding Places' => '/home/hiding-places/',
'Holidays' => '/home/holidays/',
'Homesteading' => '/home/homesteading/',
'Kids' => '/home/kids/',
'Kitchen' => '/home/kitchen/',
'Life Skills' => '/home/life-skills/',
'Parenting' => '/home/parenting/',
'Pest Control' => '/home/pest-control/',
'Relationships' => '/home/relationships/',
'Reuse' => '/home/reuse/',
'Travel' => '/home/travel/',
),
'Outside' => array(
'All' => '/outside/',
'Bikes' => '/outside/bikes/',
'Survival' => '/outside/survival/',
'Backyard' => '/outside/backyard/',
'Beach' => '/outside/beach/',
'Birding' => '/outside/birding/',
'Boats' => '/outside/boats/',
'Camping' => '/outside/camping/',
'Climbing' => '/outside/climbing/',
'Fire' => '/outside/fire/',
'Fishing' => '/outside/fishing/',
'Hunting' => '/outside/hunting/',
'Kites' => '/outside/kites/',
'Knives' => '/outside/knives/',
'Knots' => '/outside/knots/',
'Paracord' => '/outside/paracord/',
'Rockets' => '/outside/rockets/',
'Skateboarding' => '/outside/skateboarding/',
'Snow' => '/outside/snow/',
'Water' => '/outside/water/',
),
'Food' => array(
'All' => '/food/',
'Dessert' => '/food/dessert/',
'Snacks & Appetizers' => '/food/snacks-and-appetizers/',
'Bacon' => '/food/bacon/',
'BBQ & Grilling' => '/food/bbq-and-grilling/',
'Beverages' => '/food/beverages/',
'Bread' => '/food/bread/',
'Breakfast' => '/food/breakfast/',
'Cake' => '/food/cake/',
'Candy' => '/food/candy/',
'Canning & Preserves' => '/food/canning-and-preserves/',
'Cocktails & Mocktails' => '/food/cocktails-and-mocktails/',
'Coffee' => '/food/coffee/',
'Cookies' => '/food/cookies/',
'Cupcakes' => '/food/cupcakes/',
'Homebrew' => '/food/homebrew/',
'Main Course' => '/food/main-course/',
'Pasta' => '/food/pasta/',
'Pie' => '/food/pie/',
'Pizza' => '/food/pizza/',
'Salad' => '/food/salad/',
'Sandwiches' => '/food/sandwiches/',
'Soups & Stews' => '/food/soups-and-stews/',
'Vegetarian & Vegan' => '/food/vegetarian-and-vegan/',
),
'Costumes' => array(
'All' => '/costumes/',
'Props' => '/costumes/props-and-accessories/',
'Animals' => '/costumes/animals/',
'Comics' => '/costumes/comics/',
'Fantasy' => '/costumes/fantasy/',
'For Kids' => '/costumes/for-kids/',
'For Pets' => '/costumes/for-pets/',
'Funny' => '/costumes/funny/',
'Games' => '/costumes/games/',
'Historic & Futuristic' => '/costumes/historic-and-futuristic/',
'Makeup' => '/costumes/makeup/',
'Masks' => '/costumes/masks/',
'Scary' => '/costumes/scary/',
'TV & Movies' => '/costumes/tv-and-movies/',
'Weapons & Armor' => '/costumes/weapons-and-armor/',
)
),
'title' => 'Select your category (required)',
'defaultValue' => 'Technology'
),
'filter' => array(
'name' => 'Filter',
'type' => 'list',
'required' => true,
'values' => array(
'Featured' => ' ',
'Recent' => 'recent/',
'Popular' => 'popular/',
'Views' => 'views/',
'Contest Winners' => 'winners/'
),
'title' => 'Select a filter',
'defaultValue' => 'Featured'
)
)
);
private $uri;
public function collectData() {
// Enable the following line to get the category list (dev mode)
// $this->listCategories();
$this->uri = static::URI;
switch($this->queriedContext) {
case 'Category': $this->uri .= $this->getInput('category') . $this->getInput('filter');
}
$html = getSimpleHTMLDOM($this->uri)
or returnServerError('Error loading category ' . $this->uri);
foreach($html->find('ul.explore-covers-list li') as $cover) {
$item = array();
$item['uri'] = static::URI . $cover->find('a.cover-image', 0)->href;
$item['title'] = $cover->find('.title', 0)->innertext;
$item['author'] = $this->getCategoryAuthor($cover);
$item['content'] = '<a href='
. $item['uri']
. '><img src='
. $cover->find('a.cover-image img', 0)->src
. '></a>';
$image = str_replace('.RECTANGLE1', '.LARGE', $cover->find('a.cover-image img', 0)->src);
$item['enclosures'] = [$image];
$this->items[] = $item;
}
}
public function getName() {
if(!is_null($this->getInput('category'))
&& !is_null($this->getInput('filter'))) {
foreach(self::PARAMETERS[$this->queriedContext]['category']['values'] as $key => $value) {
$subcategory = array_search($this->getInput('category'), $value);
if($subcategory !== false)
break;
}
$filter = array_search(
$this->getInput('filter'),
self::PARAMETERS[$this->queriedContext]['filter']['values']
);
return $subcategory . ' (' . $filter . ') - ' . static::NAME;
}
return parent::getName();
}
public function getURI() {
if(!is_null($this->getInput('category'))
&& !is_null($this->getInput('filter'))) {
return $this->uri;
}
return parent::getURI();
}
/**
* Returns a list of categories for development purposes (used to build the
* parameters list)
*/
private function listCategories(){
// Use arbitrary category to receive full list
$html = getSimpleHTMLDOM(self::URI . '/technology/');
foreach($html->find('.channel a') as $channel) {
$name = html_entity_decode(trim($channel->innertext));
// Remove unwanted entities
$name = str_replace("'", '', $name);
$name = str_replace('&#39;', '', $name);
$uri = $channel->href;
$category = explode('/', $uri)[1];
if(!isset($categories)
|| !array_key_exists($category, $categories)
|| !in_array($uri, $categories[$category]))
$categories[$category][$name] = $uri;
}
// Build PHP array manually
foreach($categories as $key => $value) {
$name = ucfirst($key);
echo "'{$name}' => array(\n";
echo "\t'All' => '/{$key}/',\n";
foreach($value as $name => $uri) {
echo "\t'{$name}' => '{$uri}',\n";
}
echo "),\n";
}
die;
}
/**
* Returns the author as anchor for a given cover.
*/
private function getCategoryAuthor($cover) {
return '<a href='
. static::URI . $cover->find('span.author a', 0)->href
. '>'
. $cover->find('span.author a', 0)->innertext
. '</a>';
}
}


@@ -3,7 +3,7 @@ class JapanExpoBridge extends BridgeAbstract {
const MAINTAINER = 'Ginko';
const NAME = 'Japan Expo Actualités';
const URI = 'http://www.japan-expo-paris.com/fr/actualites';
const URI = 'https://www.japan-expo-paris.com/fr/actualites';
const CACHE_TIMEOUT = 14400; // 4h
const DESCRIPTION = 'Returns most recent entries from Japan Expo actualités.';
const PARAMETERS = array( array(
@@ -51,7 +51,7 @@ class JapanExpoBridge extends BridgeAbstract {
foreach($html->find('a._tile2') as $element) {
$url = $element->href;
$thumbnail = 'http://s.japan-expo.com/katana/images/JES049/paris.png';
$thumbnail = 'https://s.japan-expo.com/katana/images/JES049/paris.png';
preg_match('/url\(([^)]+)\)/', $element->find('img.rspvimgset', 0)->style, $img_search_result);
if(count($img_search_result) >= 2)
@@ -62,7 +62,8 @@ class JapanExpoBridge extends BridgeAbstract {
break;
}
$article_html = getSimpleHTMLDOMCached('Could not request JapanExpo: ' . $url);
$article_html = getSimpleHTMLDOMCached($url)
or returnServerError('Could not request JapanExpo: ' . $url);
$header = $article_html->find('header.pageHeadBox', 0);
$timestamp = strtotime($header->find('time', 0)->datetime);
$title_html = $header->find('div.section', 0)->next_sibling();
@@ -92,6 +93,7 @@ class JapanExpoBridge extends BridgeAbstract {
$item['uri'] = $url;
$item['title'] = $title;
$item['timestamp'] = $timestamp;
$item['enclosures'] = array($thumbnail);
$item['content'] = $content;
$this->items[] = $item;
$count++;


@@ -8,8 +8,8 @@ class LeBonCoinBridge extends BridgeAbstract {
const PARAMETERS = array(
array(
'k' => array('name' => 'Mot Clé'),
'r' => array(
'keywords' => array('name' => 'Mots-Clés'),
'region' => array(
'name' => 'Région',
'type' => 'list',
'values' => array(
@@ -42,7 +42,114 @@ class LeBonCoinBridge extends BridgeAbstract {
'Réunion' => '26'
)
),
'c' => array(
'department' => array(
'name' => 'Département',
'type' => 'list',
'values' => array(
'' => '',
'Ain' => '1',
'Aisne' => '2',
'Allier' => '3',
'Alpes-de-Haute-Provence' => '4',
'Hautes-Alpes' => '5',
'Alpes-Maritimes' => '6',
'Ardèche' => '7',
'Ardennes' => '8',
'Ariège' => '9',
'Aube' => '10',
'Aude' => '11',
'Aveyron' => '12',
'Bouches-du-Rhône' => '13',
'Calvados' => '14',
'Cantal' => '15',
'Charente' => '16',
'Charente-Maritime' => '17',
'Cher' => '18',
'Corrèze' => '19',
'Corse-du-Sud' => '2A',
'Haute-Corse' => '2B',
'Côte-d\'Or' => '21',
'Côtes-d\'Armor' => '22',
'Creuse' => '23',
'Dordogne' => '24',
'Doubs' => '25',
'Drôme' => '26',
'Eure' => '27',
'Eure-et-Loir' => '28',
'Finistère' => '29',
'Gard' => '30',
'Haute-Garonne' => '31',
'Gers' => '32',
'Gironde' => '33',
'Hérault' => '34',
'Ille-et-Vilaine' => '35',
'Indre' => '36',
'Indre-et-Loire' => '37',
'Isère' => '38',
'Jura' => '39',
'Landes' => '40',
'Loir-et-Cher' => '41',
'Loire' => '42',
'Haute-Loire' => '43',
'Loire-Atlantique' => '44',
'Loiret' => '45',
'Lot' => '46',
'Lot-et-Garonne' => '47',
'Lozère' => '48',
'Maine-et-Loire' => '49',
'Manche' => '50',
'Marne' => '51',
'Haute-Marne' => '52',
'Mayenne' => '53',
'Meurthe-et-Moselle' => '54',
'Meuse' => '55',
'Morbihan' => '56',
'Moselle' => '57',
'Nièvre' => '58',
'Nord' => '59',
'Oise' => '60',
'Orne' => '61',
'Pas-de-Calais' => '62',
'Puy-de-Dôme' => '63',
'Pyrénées-Atlantiques' => '64',
'Hautes-Pyrénées' => '65',
'Pyrénées-Orientales' => '66',
'Bas-Rhin' => '67',
'Haut-Rhin' => '68',
'Rhône' => '69',
'Haute-Saône' => '70',
'Saône-et-Loire' => '71',
'Sarthe' => '72',
'Savoie' => '73',
'Haute-Savoie' => '74',
'Paris' => '75',
'Seine-Maritime' => '76',
'Seine-et-Marne' => '77',
'Yvelines' => '78',
'Deux-Sèvres' => '79',
'Somme' => '80',
'Tarn' => '81',
'Tarn-et-Garonne' => '82',
'Var' => '83',
'Vaucluse' => '84',
'Vendée' => '85',
'Vienne' => '86',
'Haute-Vienne' => '87',
'Vosges' => '88',
'Yonne' => '89',
'Territoire de Belfort' => '90',
'Essonne' => '91',
'Hauts-de-Seine' => '92',
'Seine-Saint-Denis' => '93',
'Val-de-Marne' => '94',
'Val-d\'Oise' => '95'
)
),
'cities' => array(
'name' => 'Villes',
'title' => 'Codes postaux séparés par des virgules'
),
'category' => array(
'name' => 'Catégorie',
'type' => 'list',
'values' => array(
@@ -51,7 +158,7 @@ class LeBonCoinBridge extends BridgeAbstract {
'Emploi et recrutement' => '71',
'Offres d\'emploi et jobs' => '33'
),
'VEHICULES' => array(
'VÉHICULES' => array(
'Tous' => '1',
'Voitures' => '2',
'Motos' => '3',
@@ -78,7 +185,7 @@ class LeBonCoinBridge extends BridgeAbstract {
'Hôtels' => '69',
'Hébergements insolites' => '70'
),
'MULTIMEDIA' => array(
'MULTIMÉDIA' => array(
'Tous' => '14',
'Informatique' => '15',
'Consoles & Jeux vidéo' => '43',
@@ -98,7 +205,7 @@ class LeBonCoinBridge extends BridgeAbstract {
'Jeux & Jouets' => '41',
'Vins & Gastronomie' => '48'
),
'MATERIEL PROFESSIONNEL' => array(
'MATÉRIEL PROFESSIONNEL' => array(
'Tous' => '56',
'Matériel Agricole' => '57',
'Transport - Manutention' => '58',
@@ -114,14 +221,14 @@ class LeBonCoinBridge extends BridgeAbstract {
'Tous' => '31',
'Prestations de services' => '34',
'Billetterie' => '35',
'Evénements' => '49',
'Événements' => '49',
'Cours particuliers' => '36',
'Covoiturage' => '65'
),
'MAISON' => array(
'Tous' => '18',
'Ameublement' => '19',
'Electroménager' => '20',
'Électroménager' => '20',
'Arts de la table' => '45',
'Décoration' => '39',
'Linge de maison' => '46',
@@ -131,53 +238,145 @@ class LeBonCoinBridge extends BridgeAbstract {
'Chaussures' => '53',
'Accessoires & Bagagerie' => '47',
'Montres & Bijoux' => '42',
'Equipement bébé' => '23',
'Équipement bébé' => '23',
'Vêtements bébé' => '54',
),
'AUTRES' => '37'
)
),
'o' => array(
'pricemin' => array(
'name' => 'Prix min',
'type' => 'number'
),
'pricemax' => array(
'name' => 'Prix max',
'type' => 'number'
),
'estate' => array(
'name' => 'Type de bien',
'type' => 'list',
'values' => array(
'' => '',
'Maison' => '1',
'Appartement' => '2',
'Terrain' => '3',
'Parking' => '4',
'Autre' => '5'
)
),
'roomsmin' => array(
'name' => 'Pièces min',
'type' => 'number'
),
'roomsmax' => array(
'name' => 'Pièces max',
'type' => 'number'
),
'squaremin' => array(
'name' => 'Surface min',
'type' => 'number'
),
'squaremax' => array(
'name' => 'Surface max',
'type' => 'number'
),
'mileagemin' => array(
'name' => 'Kilométrage min',
'type' => 'number'
),
'mileagemax' => array(
'name' => 'Kilométrage max',
'type' => 'number'
),
'yearmin' => array(
'name' => 'Année min',
'type' => 'number'
),
'yearmax' => array(
'name' => 'Année max',
'type' => 'number'
),
'cubiccapacitymin' => array(
'name' => 'Cylindrée min',
'type' => 'number'
),
'cubiccapacitymax' => array(
'name' => 'Cylindrée max',
'type' => 'number'
),
'fuel' => array(
'name' => 'Énergie',
'type' => 'list',
'values' => array(
'' => '',
'Essence' => '1',
'Diesel' => '2',
'GPL' => '3',
'Électrique' => '4',
'Hybride' => '6',
'Autre' => '5'
)
),
'owner' => array(
'name' => 'Vendeur',
'type' => 'list',
'values' => array(
'Tous' => '',
'Particuliers' => 'private',
'Professionnels' => 'pro',
'Professionnels' => 'pro'
)
)
)
);
public function collectData(){
public static $LBC_API_KEY = 'ba0c2dad52b3ec';
$params = array(
'text' => $this->getInput('k'),
'region' => $this->getInput('r'),
'category' => $this->getInput('c'),
'owner_type' => $this->getInput('o'),
);
private function getRange($field, $range_min, $range_max){
$url = self::URI . 'recherche/?' . http_build_query($params);
$html = getContents($url)
or returnServerError('Could not request LeBonCoin. Tried: ' . $url);
if(!preg_match('/^<script>window.FLUX_STATE[^\r\n]*/m', $html, $matches)) {
returnServerError('Could not parse JSON in page content.');
if(!is_null($range_min)
&& !is_null($range_max)
&& $range_min > $range_max) {
returnClientError('Min-' . $field . ' must be lower than max-' . $field . '.');
}
$clean_match = str_replace(
array('</script>', '<script>window.FLUX_STATE = '),
array('', ''),
$matches[0]
);
$json = json_decode($clean_match);
if(!is_null($range_min)
&& is_null($range_max)) {
returnClientError('Max-' . $field . ' is needed when min-' . $field . ' is set (range).');
}
if($json->adSearch->data->total === 0) {
return array(
'min' => $range_min,
'max' => $range_max
);
}
public function collectData(){
$url = 'https://api.leboncoin.fr/finder/search/';
$data = $this->buildRequestJson();
$header = array(
'Content-Type: application/json',
'Content-Length: ' . strlen($data),
'api_key: ' . self::$LBC_API_KEY
);
$opts = array(
CURLOPT_CUSTOMREQUEST => 'POST',
CURLOPT_POSTFIELDS => $data
);
$content = getContents($url, $header, $opts)
or returnServerError('Could not request LeBonCoin. Tried: ' . $url);
$json = json_decode($content);
if($json->total === 0) {
return;
}
foreach($json->adSearch->data->ads as $element) {
foreach($json->ads as $element) {
$item['title'] = $element->subject;
$item['content'] = $element->body;
@@ -219,4 +418,121 @@ class LeBonCoinBridge extends BridgeAbstract {
$this->items[] = $item;
}
}
private function buildRequestJson() {
$requestJson = new StdClass();
$requestJson->owner_type = $this->getInput('owner');
$requestJson->filters->location = array();
$requestJson->filters->keywords = array(
'text' => $this->getInput('keywords')
);
if($this->getInput('region') != '') {
$requestJson->filters->location['regions'] = [$this->getInput('region')];
}
if($this->getInput('department') != '') {
$requestJson->filters->location['departments'] = [$this->getInput('department')];
}
if($this->getInput('cities') != '') {
$requestJson->filters->location['city_zipcodes'] = array();
foreach (explode(',', $this->getInput('cities')) as $zipcode) {
$requestJson->filters->location['city_zipcodes'][] = array(
'zipcode' => trim($zipcode)
);
}
}
$requestJson->filters->category = array(
'id' => $this->getInput('category')
);
if($this->getInput('pricemin') != ''
|| $this->getInput('pricemax') != '') {
$requestJson->filters->ranges->price = $this->getRange(
'price',
$this->getInput('pricemin'),
$this->getInput('pricemax')
);
}
if($this->getInput('estate') != '') {
$requestJson->filters->enums['real_estate_type'] = [$this->getInput('estate')];
}
if($this->getInput('roomsmin') != ''
|| $this->getInput('roomsmax') != '') {
$requestJson->filters->ranges->rooms = $this->getRange(
'rooms',
$this->getInput('roomsmin'),
$this->getInput('roomsmax')
);
}
if($this->getInput('squaremin') != ''
|| $this->getInput('squaremax') != '') {
$requestJson->filters->ranges->square = $this->getRange(
'square',
$this->getInput('squaremin'),
$this->getInput('squaremax')
);
}
if($this->getInput('mileagemin') != ''
|| $this->getInput('mileagemax') != '') {
$requestJson->filters->ranges->mileage = $this->getRange(
'mileage',
$this->getInput('mileagemin'),
$this->getInput('mileagemax')
);
}
if($this->getInput('yearmin') != ''
|| $this->getInput('yearmax') != '') {
$requestJson->filters->ranges->regdate = $this->getRange(
'year',
$this->getInput('yearmin'),
$this->getInput('yearmax')
);
}
if($this->getInput('cubiccapacitymin') != ''
|| $this->getInput('cubiccapacitymax') != '') {
$requestJson->filters->ranges->cubic_capacity = $this->getRange(
'cubic_capacity',
$this->getInput('cubiccapacitymin'),
$this->getInput('cubiccapacitymax')
);
}
if($this->getInput('fuel') != '') {
$requestJson->filters->enums['fuel'] = [$this->getInput('fuel')];
}
$requestJson->limit = 30;
return json_encode($requestJson);
}
}


@@ -3,8 +3,7 @@ class LeMondeInformatiqueBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'Le Monde Informatique';
const URI = 'http://www.lemondeinformatique.fr/';
const CACHE_TIMEOUT = 1800; // 30min
const URI = 'https://www.lemondeinformatique.fr/';
const DESCRIPTION = 'Returns the newest articles.';
public function collectData(){
@@ -15,30 +14,26 @@ class LeMondeInformatiqueBridge extends FeedExpander {
$item = parent::parseItem($newsItem);
$article_html = getSimpleHTMLDOMCached($item['uri'])
or returnServerError('Could not request LeMondeInformatique: ' . $item['uri']);
$item['content'] = $this->cleanArticle($article_html->find('div#article', 0)->innertext);
$item['title'] = $article_html->find('h1.cleanprint-title', 0)->plaintext;
//Deduce thumbnail URL from article image URL
$item['enclosures'] = array(
str_replace(
'/grande/',
'/petite/',
$article_html->find('.article-image', 0)->find('img', 0)->src
)
);
//No response header sets the encoding, explicit conversion is needed or subsequent xml_encode() will fail
$item['content'] = utf8_encode($this->cleanArticle($article_html->find('div.col-primary', 0)->innertext));
$item['author'] = utf8_encode($article_html->find('div.author-infos', 0)->find('b', 0)->plaintext);
return $item;
}
private function stripCDATA($string){
$string = str_replace('<![CDATA[', '', $string);
$string = str_replace(']]>', '', $string);
return $string;
}
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function cleanArticle($article_html){
$article_html = $this->stripWithDelimiters($article_html, '<script', '</script>');
$article_html = $this->stripWithDelimiters($article_html, '<h1 class="cleanprint-title"', '</h1>');
$article_html = stripWithDelimiters($article_html, '<script', '</script>');
$article_html = explode('<p class="contact-error', $article_html)[0] . '</div>';
return $article_html;
}
}


@@ -3,7 +3,7 @@ class LesJoiesDuCodeBridge extends BridgeAbstract {
const MAINTAINER = 'superbaillot.net';
const NAME = 'Les Joies Du Code';
const URI = 'http://lesjoiesducode.fr/';
const URI = 'https://lesjoiesducode.fr/';
const CACHE_TIMEOUT = 7200; // 2h
const DESCRIPTION = 'LesJoiesDuCode';
@@ -27,15 +27,7 @@ class LesJoiesDuCodeBridge extends BridgeAbstract {
}
$content = $temp->innertext;
$auteur = $temp->find('i', 0);
$pos = strpos($auteur->innertext, 'by');
if($pos > 0) {
$auteur = trim(str_replace('*/', '', substr($auteur->innertext, ($pos + 2))));
$item['author'] = $auteur;
}
$item['content'] .= trim($content);
$item['content'] = trim($content);
$item['uri'] = $url;
$item['title'] = trim($titre);


@@ -6,16 +6,6 @@ class NeuviemeArtBridge extends FeedExpander {
const URI = 'http://www.9emeart.fr/';
const DESCRIPTION = 'Returns the newest articles.';
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
protected function parseItem($item){
$item = parent::parseItem($item);
@@ -34,16 +24,16 @@ class NeuviemeArtBridge extends FeedExpander {
}
$article_content = '';
if($article_image) {
if ($article_image) {
$article_content = '<p><img src="' . $article_image . '" /></p>';
}
$article_content .= str_replace(
'src="/', 'src="' . self::URI,
$article_html->find('div.newsGenerique_con', 0)->innertext
);
$article_content = $this->stripWithDelimiters($article_content, '<script', '</script>');
$article_content = $this->stripWithDelimiters($article_content, '<style', '</style>');
$article_content = $this->stripWithDelimiters($article_content, '<link', '>');
$article_content = stripWithDelimiters($article_content, '<script', '</script>');
$article_content = stripWithDelimiters($article_content, '<style', '</style>');
$article_content = stripWithDelimiters($article_content, '<link', '>');
$item['content'] = $article_content;


@@ -6,29 +6,105 @@ class NextInpactBridge extends FeedExpander {
const URI = 'https://www.nextinpact.com/';
const DESCRIPTION = 'Returns the newest articles.';
const PARAMETERS = array( array(
'feed' => array(
'name' => 'Feed',
'type' => 'list',
'values' => array(
'Tous nos articles' => 'news',
'Nos contenus en accès libre' => 'acces-libre',
'Blog' => 'blog',
'Bons plans' => 'bonsplans'
)
),
'filter_premium' => array(
'name' => 'Premium',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide Premium' => '1',
'Only Premium' => '2'
)
),
'filter_brief' => array(
'name' => 'Brief',
'type' => 'list',
'values' => array(
'No filter' => '0',
'Hide Brief' => '1',
'Only Brief' => '2'
)
)
));
public function collectData(){
$this->collectExpandableDatas(self::URI . 'rss/news.xml', 10);
$feed = $this->getInput('feed');
if (empty($feed))
$feed = 'news';
$this->collectExpandableDatas(self::URI . 'rss/' . $feed . '.xml');
}
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = $this->extractContent($item['uri']);
$item['content'] = $this->extractContent($item, $item['uri']);
if (is_null($item['content']))
return null; //Filtered article
return $item;
}
private function extractContent($url){
$html2 = getSimpleHTMLDOMCached($url);
$text = '<p><em>'
. $html2->find('span.sub_title', 0)->innertext
. '</em></p><p><img src="'
. $html2->find('div.container_main_image_article', 0)->find('img.dedicated', 0)->src
. '" alt="-" /></p><div>'
. $html2->find('div[itemprop=articleBody]', 0)->innertext
. '</div>';
private function extractContent($item, $url){
$html = getSimpleHTMLDOMCached($url);
if (!is_object($html))
return 'Failed to request NextInpact: ' . $url;
foreach(array(
'filter_premium' => 'h2.title_reserve_article',
'filter_brief' => 'div.brief-inner-content'
) as $param_name => $selector) {
$param_val = intval($this->getInput($param_name));
if ($param_val != 0) {
$element_present = is_object($html->find($selector, 0));
$element_wanted = ($param_val == 2);
if ($element_present != $element_wanted) {
return null; //Filter article
}
}
}
if (is_object($html->find('div[itemprop=articleBody], div.brief-inner-content', 0))) {
$subtitle = $html->find('span.sub_title, div.brief-head', 0);
if(is_object($subtitle) && $subtitle->plaintext !== $item['title']) {
$subtitle = '<p><em>' . $subtitle->plaintext . '</em></p>';
} else {
$subtitle = '';
}
$postimg = $html->find(
'div.container_main_image_article, div.image-brief-container, div.image-brief-side-container', 0
);
if(is_object($postimg)) {
$postimg = '<p><img src="'
. $postimg->find('img.dedicated', 0)->src
. '" alt="-" /></p>';
} else {
$postimg = '';
}
$text = $subtitle
. $postimg
. $html->find('div[itemprop=articleBody], div.brief-inner-content', 0)->outertext;
} else {
$text = $item['content']
. '<p><em>Failed to retrieve full article content</em></p>';
}
$premium_article = $html->find('h2.title_reserve_article', 0);
if (is_object($premium_article)) {
$text .= '<p><em>' . $premium_article->innertext . '</em></p>';
}
$premium_article = $html2->find('h2.title_reserve_article', 0);
if (is_object($premium_article))
$text = $text . '<p><em>' . $premium_article->innertext . '</em></p>';
return $text;
}
}


@@ -32,43 +32,39 @@ class NextgovBridge extends FeedExpander {
protected function parseItem($newsItem){
$item = parent::parseItem($newsItem);
$item['content'] = '';
$article_thumbnail = 'https://cdn.nextgov.com/nextgov/images/logo.png';
$item['content'] = '<p><b>' . $item['content'] . '</b></p>';
$namespaces = $newsItem->getNamespaces(true);
if(isset($namespaces['media'])) {
$media = $newsItem->children($namespaces['media']);
if(isset($media->content)) {
$attributes = $media->content->attributes();
$item['content'] = '<img src="' . $attributes['url'] . '">';
$item['content'] = '<p><img src="' . $attributes['url'] . '"></p>' . $item['content'];
$article_thumbnail = str_replace(
'large.jpg',
'small.jpg',
strval($attributes['url'])
);
}
}
$item['enclosures'] = array($article_thumbnail);
$item['content'] .= $this->extractContent($item['uri']);
return $item;
}
private function stripWithDelimiters($string, $start, $end){
while (strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
private function extractContent($url){
$article = getSimpleHTMLDOMCached($url)
or returnServerError('Could not request Nextgov: ' . $url);
$article = getSimpleHTMLDOMCached($url);
$contents = $article->find('div.wysiwyg', 0)->innertext;
$contents = $this->stripWithDelimiters($contents, '<div class="ad-container">', '</div>');
$contents = $this->stripWithDelimiters($contents, '<div', '</div>'); //ad outer div
return $this->stripWithDelimiters($contents, '<script', '</script>');
$contents = ($article_thumbnail == '' ? '' : '<p><img src="' . $article_thumbnail . '" /></p>')
. '<p><b>'
. $article_subtitle
. '</b></p>'
. trim($contents);
if (!is_object($article))
return 'Could not request Nextgov: ' . $url;
$contents = $article->find('div.wysiwyg', 0);
$contents->find('svg.content-tombstone', 0)->outertext = '';
$contents = $contents->innertext;
$contents = stripWithDelimiters($contents, '<div class="ad-container">', '</div>');
$contents = stripWithDelimiters($contents, '<div', '</div>'); //ad outer div
return trim(stripWithDelimiters($contents, '<script', '</script>'));
}
}


@@ -0,0 +1,127 @@
<?php
class NyaaTorrentsBridge extends BridgeAbstract {
const MAINTAINER = 'ORelio';
const NAME = 'NyaaTorrents';
const URI = 'https://nyaa.si/';
const DESCRIPTION = 'Returns the newest torrents, with optional search criteria.';
const PARAMETERS = array(
array(
'f' => array(
'name' => 'Filter',
'type' => 'list',
'values' => array(
'No filter' => '0',
'No remakes' => '1',
'Trusted only' => '2'
)
),
'c' => array(
'name' => 'Category',
'type' => 'list',
'values' => array(
'All categories' => '0_0',
'Anime' => '1_0',
'Anime - AMV' => '1_1',
'Anime - English' => '1_2',
'Anime - Non-English' => '1_3',
'Anime - Raw' => '1_4',
'Audio' => '2_0',
'Audio - Lossless' => '2_1',
'Audio - Lossy' => '2_2',
'Literature' => '3_0',
'Literature - English' => '3_1',
'Literature - Non-English' => '3_2',
'Literature - Raw' => '3_3',
'Live Action' => '4_0',
'Live Action - English' => '4_1',
'Live Action - Idol/PV' => '4_2',
'Live Action - Non-English' => '4_3',
'Live Action - Raw' => '4_4',
'Pictures' => '5_0',
'Pictures - Graphics' => '5_1',
'Pictures - Photos' => '5_2',
'Software' => '6_0',
'Software - Apps' => '6_1',
'Software - Games' => '6_2',
)
),
'q' => array(
'name' => 'Keyword',
'description' => 'Keyword(s)',
'type' => 'text'
)
)
);
public function collectData() {
// Build Search URL from user-provided parameters
$search_url = self::URI . '?s=id&o=desc&'
. http_build_query(array(
'f' => $this->getInput('f'),
'c' => $this->getInput('c'),
'q' => $this->getInput('q')
));
// Retrieve torrent listing from search results, which does not contain torrent description
$html = getSimpleHTMLDOM($search_url)
or returnServerError('Could not request Nyaa: ' . $search_url);
$links = $html->find('a');
$results = array();
foreach ($links as $link)
if (strpos($link->href, '/view/') === 0 && !in_array($link->href, $results))
$results[] = $link->href;
if (empty($results) && empty($this->getInput('q')))
returnServerError('No results from Nyaa: ' . $search_url, 500);
//Process each item individually
foreach ($results as $element) {
//Limit total amount of requests
if(count($this->items) >= 20) {
break;
}
$torrent_id = str_replace('/view/', '', $element);
//Ignore entries without valid torrent ID
if ($torrent_id != 0 && ctype_digit($torrent_id)) {
//Retrieve data for this torrent ID
$item_uri = self::URI . 'view/' . $torrent_id;
//Retrieve full description from torrent page
if ($item_html = getSimpleHTMLDOMCached($item_uri)) {
//Retrieve data from page contents
$item_title = str_replace(' :: Nyaa', '', $item_html->find('title', 0)->plaintext);
$item_desc = str_get_html(markdownToHtml($item_html->find('#torrent-description', 0)->innertext));
$item_author = extractFromDelimiters($item_html->outertext, 'href="/user/', '"');
$item_date = intval(extractFromDelimiters($item_html->outertext, 'data-timestamp="', '"'));
//Retrieve image for thumbnail or generic logo fallback
$item_image = $this->getURI() . 'static/img/avatar/default.png';
foreach ($item_desc->find('img') as $img) {
if (strpos($img->src, 'prez') === false) {
$item_image = $img->src;
break;
}
}
//Build and add final item
$item = array();
$item['uri'] = $item_uri;
$item['title'] = $item_title;
$item['author'] = $item_author;
$item['timestamp'] = $item_date;
$item['enclosures'] = array($item_image);
$item['content'] = $item_desc;
$this->items[] = $item;
}
}
$element = null;
}
$results = null;
}
}


@@ -40,7 +40,8 @@ class PixivBridge extends BridgeAbstract {
preg_match_all($timeRegex, $result['url'], $dt, PREG_SET_ORDER, 0);
$elementDate = DateTime::createFromFormat('YmdHis',
$dt[0][1] . $dt[0][2] . $dt[0][3] . $dt[0][4] . $dt[0][5] . $dt[0][6]);
$dt[0][1] . $dt[0][2] . $dt[0][3] . $dt[0][4] . $dt[0][5] . $dt[0][6],
new DateTimeZone('Asia/Tokyo'));
$item['timestamp'] = $elementDate->getTimestamp();
$item['content'] = "<img src='" . $this->cacheImage($result['url'], $item['id']) . "' />";
@@ -48,7 +49,7 @@ class PixivBridge extends BridgeAbstract {
}
}
public function cacheImage($url, $illustId) {
private function cacheImage($url, $illustId) {
$url = str_replace('_master1200', '', $url);
$url = str_replace('c/240x240/img-master/', 'img-original/', $url);


@@ -9,16 +9,6 @@ class Releases3DSBridge extends BridgeAbstract {
public function collectData(){
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
function typeToString($type){
switch($type) {
case 1: return '3DS Game';
@@ -76,8 +66,8 @@ class Releases3DSBridge extends BridgeAbstract {
$ignDate = time();
$ignCoverArt = '';
$ignSearchUrl = 'http://www.ign.com/search?q=' . urlencode($name);
if($ignResult = getSimpleHTMLDOM($ignSearchUrl)) {
$ignSearchUrl = 'https://www.ign.com/search?q=' . urlencode($name);
if($ignResult = getSimpleHTMLDOMCached($ignSearchUrl)) {
$ignCoverArt = $ignResult->find('div.search-item-media', 0)->find('img', 0)->src;
$ignDesc = $ignResult->find('div.search-item-description', 0)->plaintext;
$ignLink = $ignResult->find('div.search-item-sub-title', 0)->find('a', 1)->href;
@@ -127,6 +117,7 @@ class Releases3DSBridge extends BridgeAbstract {
$item['title'] = $name;
$item['author'] = $publisher;
$item['timestamp'] = $ignDate;
$item['enclosures'] = array($ignCoverArt);
$item['uri'] = empty($ignLink) ? $searchLinkDuckDuckGo : $ignLink;
$item['content'] = $ignDescription . $releaseDescription . $releaseSearchLinks;
$this->items[] = $item;

825
bridges/SkimfeedBridge.php Normal file

@@ -0,0 +1,825 @@
<?php
class SkimfeedBridge extends BridgeAbstract {
const CONTEXT_NEWS_BOX = 'News box';
const CONTEXT_HOT_TOPICS = 'Hot topics';
const CONTEXT_TECH_NEWS = 'Tech news';
const CONTEXT_CUSTOM = 'Custom feed';
const NAME = 'Skimfeed Bridge';
const URI = 'https://skimfeed.com';
const DESCRIPTION = 'Returns feeds from Skimfeed, also supports custom feeds!';
const MAINTAINER = 'logmanoriginal';
const CACHE_TIMEOUT = 3600;
const PARAMETERS = array(
self::CONTEXT_NEWS_BOX => array( // auto-generated (see below)
'box_channel' => array(
'name' => 'Channel',
'type' => 'list',
'required' => true,
'title' => 'Select your channel',
'values' => array(
'Hacker News' => '/news/hacker-news.html',
'QZ' => '/news/qz.html',
'The Verge' => '/news/the-verge.html',
'Slashdot' => '/news/slashdot.html',
'Lifehacker' => '/news/lifehacker.html',
'Gizmag' => '/news/gizmag.html',
'Fast Company' => '/news/fast-company.html',
'Engadget' => '/news/engadget.html',
'Wired' => '/news/wired.html',
'MakeUseOf' => '/news/makeuseof.html',
'Techcrunch' => '/news/techcrunch.html',
'Apple Insider' => '/news/apple-insider.html',
'ArsTechnica' => '/news/arstechnica.html',
'Tech in Asia' => '/news/tech-in-asia.html',
'FastCoExist' => '/news/fastcoexist.html',
'Digital Trends' => '/news/digital-trends.html',
'AnandTech' => '/news/anandtech.html',
'How to Geek' => '/news/how-to-geek.html',
'Geek' => '/news/geek.html',
'BBC Technology' => '/news/bbc-technology.html',
'Extreme Tech' => '/news/extreme-tech.html',
'Packet Storm Sec' => '/news/packet-storm-sec.html',
'MedGadget' => '/news/medgadget.html',
'Design' => '/news/design.html',
'The Next Web' => '/news/the-next-web.html',
'Bit-Tech' => '/news/bit-tech.html',
'Next Big Future' => '/news/next-big-future.html',
'A VC' => '/news/a-vc.html',
'Copyblogger' => '/news/copyblogger.html',
'Smashing Mag' => '/news/smashing-mag.html',
'Continuations' => '/news/continuations.html',
'Cult of Mac' => '/news/cult-of-mac.html',
'SecuriTeam' => '/news/securiteam.html',
'The Tech Block' => '/news/the-tech-block.html',
'BetaBeat' => '/news/betabeat.html',
'PC Mag' => '/news/pc-mag.html',
'Venture Beat' => '/news/venture-beat.html',
'ReadWriteWeb' => '/news/readwriteweb.html',
'High Scalability' => '/news/high-scalability.html',
)
)
),
self::CONTEXT_HOT_TOPICS => array(),
self::CONTEXT_TECH_NEWS => array( // auto-generated (see below)
'tech_channel' => array(
'name' => 'Tech channel',
'type' => 'list',
'required' => true,
'title' => 'Select your tech channel',
'values' => array(
'Agg' => array(
'Reddit' => '/news/reddit.html',
'Tech Insider' => '/news/tech-insider.html',
'Digg' => '/news/digg.html',
'Meta Filter' => '/news/meta-filter.html',
'Fark' => '/news/fark.html',
'Mashable' => '/news/mashable.html',
'Ad Week' => '/news/ad-week.html',
'The Chive' => '/news/the-chive.html',
'BoingBoing' => '/news/boingboing.html',
'Vice' => '/news/vice.html',
'ClientsFromHell' => '/news/clientsfromhell.html',
'How Stuff Works' => '/news/how-stuff-works.html',
'Buzzfeed' => '/news/buzzfeed.html',
'BoingBoing' => '/news/boingboing.html',
'Cracked' => '/news/cracked.html',
'Weird News' => '/news/weird-news.html',
'ITOTD' => '/news/itotd.html',
'Metafilter' => '/news/metafilter.html',
'TheOnion' => '/news/theonion.html',
),
'Cars' => array(
'Reddit Cars' => '/news/reddit-cars.html',
'NYT Auto' => '/news/nyt-auto.html',
'Truth About Cars' => '/news/truth-about-cars.html',
'AutoBlog' => '/news/autoblog.html',
'AutoSpies' => '/news/autospies.html',
'Autoweek' => '/news/autoweek.html',
'The Garage' => '/news/the-garage.html',
'Car and Driver' => '/news/car-and-driver.html',
'EGM Car Tech' => '/news/egm-car-tech.html',
'Top Gear' => '/news/top-gear.html',
'eGarage' => '/news/egarage.html',
),
'Comics' => array(
'Penny Arcade' => '/news/penny-arcade.html',
'XKCD' => '/news/xkcd.html',
'Channelate' => '/news/channelate.html',
'Savage Chicken' => '/news/savage-chicken.html',
'Dinosaur Comics' => '/news/dinosaur-comics.html',
'Explosm' => '/news/explosm.html',
'PoorlyDLines' => '/news/poorlydlines.html',
'Moonbeard' => '/news/moonbeard.html',
'Nedroid' => '/news/nedroid.html',
),
'Design' => array(
'FastCoCreate' => '/news/fastcocreate.html',
'Dezeen' => '/news/dezeen.html',
'Design Boom' => '/news/design-boom.html',
'Mmminimal' => '/news/mmminimal.html',
'We Heart' => '/news/we-heart.html',
'CreativeBloq' => '/news/creativebloq.html',
'TheDSGNblog' => '/news/thedsgnblog.html',
'Grainedit' => '/news/grainedit.html',
),
'Football' => array(
'Mail Football' => '/news/mail-football.html',
'Yahoo Football' => '/news/yahoo-football.html',
'FourFourTwo' => '/news/fourfourtwo.html',
'Goal' => '/news/goal.html',
'BBC Football' => '/news/bbc-football.html',
'TalkSport' => '/news/talksport.html',
'101 Great Goals' => '/news/101-great-goals.html',
'Who Scored' => '/news/who-scored.html',
'Football365 Champ' => '/news/football365-champ.html',
'Football365 Premier' => '/news/football365-premier.html',
'BleacherReport' => '/news/bleacherreport.html',
),
'Gaming' => array(
'Polygon' => '/news/polygon.html',
'Gamespot' => '/news/gamespot.html',
'RockPaperShotgun' => '/news/rockpapershotgun.html',
'VG247' => '/news/vg247.html',
'IGN' => '/news/ign.html',
'Reddit Games' => '/news/reddit-games.html',
'TouchArcade' => '/news/toucharcade.html',
'GamesRadar' => '/news/gamesradar.html',
'Siliconera' => '/news/siliconera.html',
'Reddit GameDeals' => '/news/reddit-gamedeals.html',
'Joystiq' => '/news/joystiq.html',
'GameInformer' => '/news/gameinformer.html',
'PSN Blog' => '/news/psn-blog.html',
'Reddit GamerNews' => '/news/reddit-gamernews.html',
'Steam' => '/news/steam.html',
'DualShockers' => '/news/dualshockers.html',
'ShackNews' => '/news/shacknews.html',
'CheapAssGamer' => '/news/cheapassgamer.html',
'Eurogamer' => '/news/eurogamer.html',
'Major Nelson' => '/news/major-nelson.html',
'Reddit Truegaming' => '/news/reddit-truegaming.html',
'GameTrailers' => '/news/gametrailers.html',
'GamaSutra' => '/news/gamasutra.html',
'USGamer' => '/news/usgamer.html',
'Shoryuken' => '/news/shoryuken.html',
'Destructoid' => '/news/destructoid.html',
'ArsGaming' => '/news/arsgaming.html',
'XBOX Blog' => '/news/xbox-blog.html',
'GiantBomb' => '/news/giantbomb.html',
'VideoGamer' => '/news/videogamer.html',
'Pocket Tactics' => '/news/pocket-tactics.html',
'WiredGaming' => '/news/wiredgaming.html',
'AllGamesBeta' => '/news/allgamesbeta.html',
'OnGamers' => '/news/ongamers.html',
'Reddit GameBundles' => '/news/reddit-gamebundles.html',
'Kotaku' => '/news/kotaku.html',
'PCGamer' => '/news/pcgamer.html',
),
'Investing' => array(
'Seeking Alpha' => '/news/seeking-alpha.html',
'BBC Business' => '/news/bbc-business.html',
'Harvard Biz' => '/news/harvard-biz.html',
'Market Watch' => '/news/market-watch.html',
'Investor Place' => '/news/investor-place.html',
'Money Week' => '/news/money-week.html',
'Moneybeat' => '/news/moneybeat.html',
'Dealbook' => '/news/dealbook.html',
'Economist Business' => '/news/economist-business.html',
'Economist' => '/news/economist.html',
'Economist CN' => '/news/economist-cn.html',
),
'Long' => array(
'The Atlantic' => '/news/the-atlantic.html',
'Reddit Long' => '/news/reddit-long.html',
'Paris Review' => '/news/paris-review.html',
'New Yorker' => '/news/new-yorker.html',
'LongForm' => '/news/longform.html',
'LongReads' => '/news/longreads.html',
'The Browser' => '/news/the-browser.html',
'The Feature' => '/news/the-feature.html',
),
'MMA' => array(
'MMA Weekly' => '/news/mma-weekly.html',
'MMAFighting' => '/news/mmafighting.html',
'Reddit MMA' => '/news/reddit-mma.html',
'Sherdog Articles' => '/news/sherdog-articles.html',
'FightLand Vice' => '/news/fightland-vice.html',
'Sherdog Forum' => '/news/sherdog-forum.html',
'MMA Junkie' => '/news/mma-junkie.html',
'Sherdog MMA Video' => '/news/sherdog-mma-video.html',
'BloodyElbow' => '/news/bloodyelbow.html',
'CageWriter' => '/news/cagewriter.html',
'Sherdog News' => '/news/sherdog-news.html',
'MMAForum' => '/news/mmaforum.html',
'MMA Junkie Radio' => '/news/mma-junkie-radio.html',
'UFC News' => '/news/ufc-news.html',
'FightLinker' => '/news/fightlinker.html',
'Bodybuilding MMA' => '/news/bodybuilding-mma.html',
'BleacherReport MMA' => '/news/bleacherreport-mma.html',
'FiveOuncesofPain' => '/news/fiveouncesofpain.html',
'Sherdog Pictures' => '/news/sherdog-pictures.html',
'CagePotato' => '/news/cagepotato.html',
'Sherdog Radio' => '/news/sherdog-radio.html',
'ProMMARadio' => '/news/prommaradio.html',
),
'Mobile' => array(
'Macrumors' => '/news/macrumors.html',
'Android Police' => '/news/android-police.html',
'GSM Arena' => '/news/gsm-arena.html',
'DigiTrend Mobile' => '/news/digitrend-mobile.html',
'Mobile Nation' => '/news/mobile-nation.html',
'TechRadar' => '/news/techradar.html',
'ZDNET Mobile' => '/news/zdnet-mobile.html',
'MacWorld' => '/news/macworld.html',
'Android Dev Blog' => '/news/android-dev-blog.html',
),
'News' => array(
'Daily Mail' => '/news/daily-mail.html',
'Business Insider' => '/news/business-insider.html',
'The Guardian' => '/news/the-guardian.html',
'Fox' => '/news/fox.html',
'BBC World' => '/news/bbc-world.html',
'MSNBC' => '/news/msnbc.html',
'ABC News' => '/news/abc-news.html',
'Al Jazeera' => '/news/al-jazeera.html',
'Business Insider India' => '/news/business-insider-india.html',
'Observer' => '/news/observer.html',
'NYT Tech' => '/news/nyt-tech.html',
'NYT World' => '/news/nyt-world.html',
'CNN' => '/news/cnn.html',
'Japan Times' => '/news/japan-times.html',
'WorldCrunch' => '/news/worldcrunch.html',
'Pro publica' => '/news/pro-publica.html',
'OZY' => '/news/ozy.html',
'Times of India' => '/news/times-of-india.html',
'The Australian' => '/news/the-australian.html',
'Harpers' => '/news/harpers.html',
'Moscow Times' => '/news/moscow-times.html',
'The Times' => '/news/the-times.html',
'Reuters Tech' => '/news/reuters-tech.html',
),
'Politics' => array(
'FreeRepublic' => '/news/freerepublic.html',
'Salon' => '/news/salon.html',
'DrudgeReport' => '/news/drudgereport.html',
'TheHill' => '/news/thehill.html',
'TheBlaze' => '/news/theblaze.html',
'InfoWars' => '/news/infowars.html',
'New Republic' => '/news/new-republic.html',
'WashTimes' => '/news/washtimes.html',
'RealCleanPol' => '/news/realcleanpol.html',
'Fact Check' => '/news/fact-check.html',
'DailyKos' => '/news/dailykos.html',
'NewsMax' => '/news/newsmax.html',
'Politico' => '/news/politico.html',
'Michelle Malkin' => '/news/michelle-malkin.html',
),
'Reddit' => array(
'R Movies' => '/news/r-movies.html',
'R News' => '/news/r-news.html',
'Futurology' => '/news/futurology.html',
'R All' => '/news/r-all.html',
'R Music' => '/news/r-music.html',
'R Askscience' => '/news/r-askscience.html',
'R Technology' => '/news/r-technology.html',
'R Bestof' => '/news/r-bestof.html',
'R Askreddit' => '/news/r-askreddit.html',
'R Worldnews' => '/news/r-worldnews.html',
'R Explainlikeimfive' => '/news/r-explainlikeimfive.html',
'R Iama' => '/news/r-iama.html',
),
'Science' => array(
'PhysOrg' => '/news/physorg.html',
'Hack-a-day' => '/news/hack-a-day.html',
'Reddit Science' => '/news/reddit-science.html',
'Stats Blog' => '/news/stats-blog.html',
'Flowing Data' => '/news/flowing-data.html',
'Eureka Alert' => '/news/eureka-alert.html',
'Robotics BizRev' => '/news/robotics-bizrev.html',
'Planet big Data' => '/news/planet-big-data.html',
'Makezine' => '/news/makezine.html',
'MIT Tech' => '/news/mit-tech.html',
'R Bloggers' => '/news/r-bloggers.html',
'DataIsBeautiful' => '/news/dataisbeautiful.html',
'Ted Videos' => '/news/ted-videos.html',
'Advanced Science' => '/news/advanced-science.html',
'Robotiq' => '/news/robotiq.html',
'Science Daily' => '/news/science-daily.html',
'IEEE Robotics' => '/news/ieee-robotics.html',
'PSFK' => '/news/psfk.html',
'Discover Magazine' => '/news/discover-magazine.html',
'DataTau' => '/news/datatau.html',
'RoboHub' => '/news/robohub.html',
'Discovery' => '/news/discovery.html',
'Smart Data' => '/news/smart-data.html',
'Whats Big Data' => '/news/whats-big-data.html',
),
'Tech' => array(
'Hacker News' => '/news/hacker-news.html',
'The Verge' => '/news/the-verge.html',
'Lifehacker' => '/news/lifehacker.html',
'Fast Company' => '/news/fast-company.html',
'ArsTechnica' => '/news/arstechnica.html',
'MakeUseOf' => '/news/makeuseof.html',
'FastCoExist' => '/news/fastcoexist.html',
'How to Geek' => '/news/how-to-geek.html',
'The Next Web' => '/news/the-next-web.html',
'Engadget' => '/news/engadget.html',
'Gizmag' => '/news/gizmag.html',
'QZ' => '/news/qz.html',
'Wired' => '/news/wired.html',
'Techcrunch' => '/news/techcrunch.html',
'Slashdot' => '/news/slashdot.html',
'Extreme Tech' => '/news/extreme-tech.html',
'AnandTech' => '/news/anandtech.html',
'Digital Trends' => '/news/digital-trends.html',
'Next Big Future' => '/news/next-big-future.html',
'Apple Insider' => '/news/apple-insider.html',
'Geek' => '/news/geek.html',
'BBC Technology' => '/news/bbc-technology.html',
'Bit-Tech' => '/news/bit-tech.html',
'Packet Storm Sec' => '/news/packet-storm-sec.html',
'Design' => '/news/design.html',
'High Scalability' => '/news/high-scalability.html',
'Smashing Mag' => '/news/smashing-mag.html',
'The Tech Block' => '/news/the-tech-block.html',
'A VC' => '/news/a-vc.html',
'Tech in Asia' => '/news/tech-in-asia.html',
'ReadWriteWeb' => '/news/readwriteweb.html',
'PC Mag' => '/news/pc-mag.html',
'Continuations' => '/news/continuations.html',
'Copyblogger' => '/news/copyblogger.html',
'Cult of Mac' => '/news/cult-of-mac.html',
'BetaBeat' => '/news/betabeat.html',
'MedGadget' => '/news/medgadget.html',
'SecuriTeam' => '/news/securiteam.html',
'Venture Beat' => '/news/venture-beat.html',
),
'Trend' => array(
'Trend Hunter' => '/news/trend-hunter.html',
'ApartmentT' => '/news/apartmentt.html',
'GQ' => '/news/gq.html',
'Digital Trends' => '/news/digital-trends.html',
'Cool Hunting' => '/news/cool-hunting.html',
'FastCoDesign' => '/news/fastcodesign.html',
'TC Startups' => '/news/tc-startups.html',
'Killer Startups' => '/news/killer-startups.html',
'DigiInfo' => '/news/digiinfo.html',
'New Startups' => '/news/new-startups.html',
'DigiTrends' => '/news/digitrends.html',
),
'Watches' => array(
'Hodinkee' => '/news/hodinkee.html',
'Quill and Pad' => '/news/quill-and-pad.html',
'Monochrome' => '/news/monochrome.html',
'Deployant' => '/news/deployant.html',
'Watches by SJX' => '/news/watches-by-sjx.html',
'Fratello Watches' => '/news/fratello-watches.html',
'A Blog to Watch' => '/news/a-blog-to-watch.html',
'Wound for Life' => '/news/wound-for-life.html',
'Watch Paper' => '/news/watch-paper.html',
'Watch Report' => '/news/watch-report.html',
'Perpetuelle' => '/news/perpetuelle.html',
),
'Youtube' => array(
'LinusTechTips' => '/news/linustechtips.html',
'MetalJesusRocks' => '/news/metaljesusrocks.html',
'TotalBiscuit' => '/news/totalbiscuit.html',
'DexBonus' => '/news/dexbonus.html',
'Lon Siedman' => '/news/lon-siedman.html',
'MKBHD' => '/news/mkbhd.html',
'Terry A Davis' => '/news/terry-a-davis.html',
'HappyConsole' => '/news/happyconsole.html',
'Austin Evans' => '/news/austin-evans.html',
'NCIX' => '/news/ncix.html',
),
)
),
),
self::CONTEXT_CUSTOM => array(
'config' => array(
'name' => 'Configuration',
'type' => 'text',
'required' => true,
'title' => 'Enter feed numbers from Skimfeed!',
'exampleValue' => '5,8,2,l,p,9,23'
)
),
'global' => array(
'limit' => array(
'name' => 'Limit',
'type' => 'number',
'title' => 'Limits the number of returned items in the feed',
'exampleValue' => 10
)
)
);
public function getURI() {
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$channel = $this->getInput('box_channel');
if($channel) {
return static::URI . $channel;
}
break;
case self::CONTEXT_HOT_TOPICS:
return static::URI;
case self::CONTEXT_TECH_NEWS:
$channel = $this->getInput('tech_channel');
if($channel) {
return static::URI . $channel;
}
break;
case self::CONTEXT_CUSTOM:
$config = $this->getInput('config');
return static::URI . '/custom.php?f=' . urlencode($config);
}
return parent::getURI();
}
public function getName() {
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$channel = $this->getInput('box_channel');
$title = array_search(
$channel,
static::PARAMETERS[self::CONTEXT_NEWS_BOX]['box_channel']['values']
);
return $title . ' - ' . static::NAME;
case self::CONTEXT_HOT_TOPICS:
return 'Hot topics - ' . static::NAME;
case self::CONTEXT_TECH_NEWS:
$channel = $this->getInput('tech_channel');
$titles = array();
foreach(static::PARAMETERS[self::CONTEXT_TECH_NEWS]['tech_channel']['values'] as $ch) {
$titles = array_merge($titles, $ch);
}
$title = array_search($channel, $titles);
return $title . ' - ' . static::NAME;
case self::CONTEXT_CUSTOM:
return 'Custom - ' . static::NAME;
}
return parent::getName();
}
public function collectData() {
// Uncomment one of the following lines to export the parameter lists
// $this->exportBoxChannels(); die;
// $this->exportTechChannels(); die;
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Request to ' . $this->getURI() . ' failed!');
defaultLinkTo($html, static::URI);
switch($this->queriedContext) {
case self::CONTEXT_NEWS_BOX:
$author = array_search(
$this->getInput('box_channel'),
static::PARAMETERS[self::CONTEXT_NEWS_BOX]['box_channel']['values']
);
$author = '<a href="'
. $this->getURI()
. '">'
. $author
. '</a>';
$this->extractFeed($html, $author);
break;
case self::CONTEXT_HOT_TOPICS:
$this->extractHotTopics($html);
break;
case self::CONTEXT_TECH_NEWS:
$authors = array();
foreach(static::PARAMETERS[self::CONTEXT_TECH_NEWS]['tech_channel']['values'] as $ch) {
$authors = array_merge($authors, $ch);
}
$author = '<a href="'
. $this->getURI()
. '">'
. array_search($this->getInput('tech_channel'), $authors)
. '</a>';
$this->extractFeed($html, $author);
break;
case self::CONTEXT_CUSTOM:
$this->extractCustomFeed($html);
break;
}
}
private function extractFeed($html, $author) {
$articles = $html->find('li')
or returnServerError('Could not find articles!');
if(count($articles) === 1
&& stristr($articles[0]->plaintext, 'Nothing new in the last 48 hours')) {
return; // Nothing to show
}
$limit = $this->getInput('limit') ?: -1;
foreach($articles as $article) {
$anchor = $article->find('a', 0)
or returnServerError('Could not find anchor!');
$item = array();
$item['uri'] = $this->getTarget($anchor);
$item['title'] = trim($anchor->plaintext);
// The timestamp is encoded as relative time (max. the last 48 hours)
// like this: "- 7 hours". It should always be at the end of the article:
$age = substr($article->plaintext, strrpos($article->plaintext, '-'));
$item['timestamp'] = strtotime($age);
$item['author'] = $author;
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit) {
return;
}
}
}
private function extractHotTopics($html) {
$topics = $html->find('#popbox ul li')
or returnServerError('Could not find topics!');
$limit = $this->getInput('limit') ?: -1;
foreach($topics as $topic) {
$anchor = $topic->find('a', 0)
or returnServerError('Could not find anchor!');
$item = array();
$item['uri'] = $this->getTarget($anchor);
$item['title'] = $anchor->title;
$this->items[] = $item;
if($limit > 0 && count($this->items) >= $limit) {
return;
}
}
}
private function extractCustomFeed($html) {
$boxes = $html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$author = '<a href="' . $anchor->href . '">' . trim($anchor->plaintext) . '</a>';
$uri = $anchor->href;
$box_html = getSimpleHTMLDOM($uri)
or returnServerError('Could not load custom feed!');
$this->extractFeed($box_html, $author);
}
}
private function getTarget($anchor) {
// Anchors link to Skimfeed rather than to the article itself. The target
// URI is encoded in the query string of that link via '&u=<URI>':
$query = parse_url($anchor->href, PHP_URL_QUERY);
foreach(explode('&', $query) as $parameter) {
list($key, $value) = explode('=', $parameter);
if($key !== 'u') {
continue;
}
return urldecode($value);
}
}
/**
* dev-mode!
* Requires '&format=Html'
*
* Returns the 'box_channel' array from the source site
*/
private function exportBoxChannels() {
$html = getSimpleHTMLDOMCached(static::URI)
or returnServerError('No contents received from Skimfeed!');
if(!$this->isCompatible($html)) {
returnServerError('Skimfeed version is not compatible!');
}
$boxes = $html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
// begin of 'box_channel' list
$message = <<<EOD
'box_channel' => array(
'name' => 'Channel',
'type' => 'list',
'required' => true,
'title' => 'Select your channel',
'values' => array(
EOD;
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$title = trim($anchor->plaintext);
$uri = $anchor->href;
// add value
$message .= "\t\t'{$title}' => '{$uri}', \n";
}
// end of 'box_channel' list
$message .= <<<EOD
)
),
EOD;
echo <<<EOD
<!DOCTYPE html>
<html>
<body>
<code style="white-space: pre-wrap;">{$message}</code>
</body>
</html>
EOD;
}
/**
* dev-mode!
* Requires '&format=Html'
*
* Returns the 'tech_channel' array from the source site
*/
private function exportTechChannels() {
$html = getSimpleHTMLDOMCached(static::URI)
or returnServerError('No contents received from Skimfeed!');
if(!$this->isCompatible($html)) {
returnServerError('Skimfeed version is not compatible!');
}
$channels = $html->find('#menubar a')
or returnServerError('Could not find channels!');
// begin of 'tech_channel' list
$message = <<<EOD
'tech_channel' => array(
'name' => 'Tech channel',
'type' => 'list',
'required' => true,
'title' => 'Select your tech channel',
'values' => array(
EOD;
foreach($channels as $channel) {
if($channel->href === '#'
|| $channel->class === 'homelink'
|| $channel->plaintext === 'Twitter'
|| $channel->plaintext === 'Weather'
|| $channel->plaintext === '+Custom') {
continue;
}
$title = trim($channel->plaintext);
$uri = '/' . $channel->href;
$message .= "\t\t'{$title}' => array(\n";
$channel_html = getSimpleHTMLDOMCached(static::URI . $uri)
or returnServerError('Could not load tech channel ' . $channel->plaintext . '!');
$boxes = $channel_html->find('#boxx .boxes')
or returnServerError('Could not find boxes!');
foreach($boxes as $box) {
$anchor = $box->find('span.boxtitles a', 0)
or returnServerError('Could not find box anchor!');
$boxtitle = trim($anchor->plaintext);
$boxuri = $anchor->href;
$message .= "\t\t\t'{$boxtitle}' => '{$boxuri}', \n";
}
$message .= "\t\t),\n";
}
// end of 'tech_channel' list
$message .= <<<EOD
)
),
EOD;
echo <<<EOD
<!DOCTYPE html>
<html>
<body>
<code style="white-space: pre-wrap;">{$message}</code>
</body>
</html>
EOD;
}
/**
* Checks if the reported Skimfeed version is compatible
*/
private function isCompatible($html) {
$title = $html->find('title', 0);
if(!$title) {
return false;
}
if($title->plaintext === 'Skimfeed V5.5 - Tech News') {
return true;
}
return false;
}
}

@@ -31,7 +31,7 @@ class SupInfoBridge extends BridgeAbstract {
}
}
public function fetchArticle($link) {
private function fetchArticle($link) {
$articleHTML = getSimpleHTMLDOM(self::URI . $link)
or returnServerError('Unable to fetch article !');

@@ -8,67 +8,66 @@ class TheHackerNewsBridge extends BridgeAbstract {
public function collectData(){
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
function stripRecursiveHtmlSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr(
$string,
$section_start,
$section_end - $section_start + $close_tag_length
);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
$html = getSimpleHTMLDOM($this->getURI())
or returnServerError('Could not request TheHackerNews: ' . $this->getURI());
$limit = 0;
foreach($html->find('article') as $element) {
foreach($html->find('div.body-post') as $element) {
if($limit < 5) {
$article_url = $element->find('a.entry-title', 0)->href;
$article_author = trim($element->find('span.vcard', 0)->plaintext);
$article_title = $element->find('a.entry-title', 0)->plaintext;
$article_timestamp = strtotime($element->find('span.updated', 0)->plaintext);
$article = getSimpleHTMLDOM($article_url)
or returnServerError('Could not request TheHackerNews: ' . $article_url);
$article_url = $element->find('a.story-link', 0)->href;
$article_author = trim($element->find('i.fa-user', 0)->parent()->plaintext);
$article_title = $element->find('h2.home-title', 0)->plaintext;
$contents = $article->find('div.articlebodyonly', 0)->innertext;
$contents = stripRecursiveHtmlSection($contents, 'div', '<div class=\'clear\'');
$contents = stripWithDelimiters($contents, '<script', '</script>');
//Date without time
$article_timestamp = strtotime(
extractFromDelimiters(
$element->find('i.fa-calendar', 0)->parent()->outertext,
'</i>',
'<span>'
)
);
//Article thumbnail in lazy-loading image
if (is_object($element->find('img[data-echo]', 0))) {
$article_thumbnail = array(
extractFromDelimiters(
$element->find('img[data-echo]', 0)->outertext,
"data-echo='",
"'"
)
);
} else {
$article_thumbnail = array();
}
if ($article = getSimpleHTMLDOMCached($article_url)) {
//Article body
$contents = $article->find('div.articlebody', 0)->innertext;
$contents = stripRecursiveHtmlSection($contents, 'div', '<div class="ad_');
$contents = stripWithDelimiters($contents, 'id="google_ads', '</iframe>');
$contents = stripWithDelimiters($contents, '<script', '</script>');
//Date with time
if (is_object($article->find('meta[itemprop=dateModified]', 0))) {
$article_timestamp = strtotime(
extractFromDelimiters(
$article->find('meta[itemprop=dateModified]', 0)->outertext,
"content='",
"'"
)
);
}
} else {
$contents = 'Could not request TheHackerNews: ' . $article_url;
}
$item = array();
$item['uri'] = $article_url;
$item['title'] = $article_title;
$item['author'] = $article_author;
$item['enclosures'] = $article_thumbnail;
$item['timestamp'] = $article_timestamp;
$item['content'] = trim($contents);
$this->items[] = $item;

@@ -1,102 +0,0 @@
<?php
class Torrent9Bridge extends BridgeAbstract {
const MAINTAINER = 'lagaisse';
const NAME = 'Torrent9 Bridge';
const URI = 'http://www.torrent9.pe';
const CACHE_TIMEOUT = 86400; // 24h = 86400s
const DESCRIPTION = 'Returns latest torrents';
const PAGE_SERIES = 'torrents_series';
const PAGE_SERIES_VOSTFR = 'torrents_series_vostfr';
const PAGE_SERIES_FR = 'torrents_series_french';
const PARAMETERS = array(
'From search' => array(
'q' => array(
'name' => 'Search',
'required' => true,
'title' => 'Type your search'
)
),
'By page' => array(
'page' => array(
'name' => 'Page',
'type' => 'list',
'required' => false,
'values' => array(
'Series' => self::PAGE_SERIES,
'Series VOST' => self::PAGE_SERIES_VOSTFR,
'Series FR' => self::PAGE_SERIES_FR,
),
'defaultValue' => self::PAGE_SERIES
)
)
);
public function collectData(){
if($this->queriedContext === 'From search') {
$request = str_replace(' ', '-', trim($this->getInput('q')));
$page = self::URI . '/search_torrent/' . urlencode($request) . '.html';
} else {
$request = $this->getInput('page');
$page = self::URI . '/' . $request . '.html';
}
$html = getSimpleHTMLDOM($page)
or returnServerError('No results for this query.');
foreach($html->find('table', 0)->find('tr') as $episode) {
if($episode->parent->tag == 'tbody') {
$urlepisode = self::URI . $episode->find('a', 0)->getAttribute('href');
//30 years = forever
$htmlepisode = getSimpleHTMLDOMCached($urlepisode, 86400 * 366 * 30);
$item = array();
$item['author'] = $episode->find('a', 0)->text();
$item['title'] = $episode->find('a', 0)->text();
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['pubdate'] = $this->getCachedDate($urlepisode);
$textefiche = $htmlepisode->find('.movie-information', 0)->find('p', 1);
if(isset($textefiche)) {
$item['content'] = $textefiche->text();
} else {
$p = $htmlepisode->find('.movie-information', 0)->find('p');
if(!empty($p)) {
$item['content'] = $htmlepisode->find('.movie-information', 0)->find('p', 0)->text();
}
}
$item['id'] = $episode->find('a', 0)->getAttribute('href');
$item['uri'] = self::URI . $htmlepisode->find('.download', 0)->getAttribute('href');
$this->items[] = $item;
}
}
}
public function getName(){
if(!is_null($this->getInput('q'))) {
return $this->getInput('q') . ' : ' . self::NAME;
}
return parent::getName();
}
private function getCachedDate($url){
debugMessage('getting pubdate from url ' . $url . '');
// Initialize cache
$cache = Cache::create('FileCache');
$cache->setPath(CACHE_DIR . '/pages');
$params = [$url];
$cache->setParameters($params);
// Get cachefile timestamp
$time = $cache->getTime();
return ($time !== false ? $time : time());
}
}

@@ -17,8 +17,14 @@ class VkBridge extends BridgeAbstract
)
);
protected $videos = array();
protected $pageName;
protected function getAccessToken()
{
return 'c8071613517c155c6cfbd2a059b2718e9c37b89094c4766834969dda75f657a2c1cbb49bab4c5e649f1db';
}
public function getURI()
{
if (!is_null($this->getInput('u'))) {
@@ -51,11 +57,20 @@ class VkBridge extends BridgeAbstract
$pageName = $pageName->plaintext;
$this->pageName = htmlspecialchars_decode($pageName);
}
foreach ($html->find('div.replies') as $comment_block) {
$comment_block->outertext = '';
}
$html->load($html->save());
$pinned_post_item = null;
$last_post_id = 0;
foreach ($html->find('.post') as $post) {
defaultLinkTo($post, self::URI);
$post_videos = array();
$is_pinned_post = false;
if (strpos($post->getAttribute('class'), 'post_fixed') !== false) {
$is_pinned_post = true;
@@ -114,7 +129,7 @@ class VkBridge extends BridgeAbstract
}
$article_title = $article->find($article_title_selector, 0)->innertext;
$article_author = $article->find($article_author_selector, 0)->innertext;
$article_link = self::URI . ltrim($article->getAttribute('href'), '/');
$article_link = $article->getAttribute('href');
$article_img_element_style = $article->find($article_thumb_selector, 0)->getAttribute('style');
preg_match('/background-image: url\((.*)\)/', $article_img_element_style, $matches);
if (count($matches) > 0) {
@@ -126,20 +141,22 @@ class VkBridge extends BridgeAbstract
// get video on post
$video = $post->find('div.post_video_desc', 0);
$main_video_link = '';
if (is_object($video)) {
$video_title = $video->find('div.post_video_title', 0)->plaintext;
$video_link = self::URI . ltrim( $video->find('a.lnk', 0)->getAttribute('href'), '/' );
$content_suffix .= "<br>Video: <a href='$video_link'>$video_title</a>";
$video_link = $video->find('a.lnk', 0)->getAttribute('href');
$this->appendVideo($video_title, $video_link, $content_suffix, $post_videos);
$video->outertext = '';
$main_video_link = $video_link;
}
// get all other videos
foreach($post->find('a.page_post_thumb_video') as $a) {
$video_title = $a->getAttribute('aria-label');
$video_title = htmlspecialchars_decode($a->getAttribute('aria-label'));
$temp = explode(' ', $video_title, 2);
if (count($temp) > 1) $video_title = $temp[1];
$video_link = self::URI . ltrim( $a->getAttribute('href'), '/' );
$content_suffix .= "<br>Video: <a href='$video_link'>$video_title</a>";
$video_link = $a->getAttribute('href');
if ($video_link != $main_video_link) $this->appendVideo($video_title, $video_link, $content_suffix, $post_videos);
$a->outertext = '';
}
@@ -155,14 +172,14 @@ class VkBridge extends BridgeAbstract
foreach($post->find('.page_album_wrap') as $el) {
$a = $el->find('.page_album_link', 0);
$album_title = $a->find('.page_album_title_text', 0)->getAttribute('title');
$album_link = self::URI . ltrim($a->getAttribute('href'), '/');
$album_link = $a->getAttribute('href');
$el->outertext = '';
$content_suffix .= "<br>Album: <a href='$album_link'>$album_title</a>";
}
// get photo documents
foreach($post->find('a.page_doc_photo_href') as $a) {
$doc_link = self::URI . ltrim($a->getAttribute('href'), '/');
$doc_link = $a->getAttribute('href');
$doc_gif_label_element = $a->find('.page_gif_label', 0);
$doc_title_element = $a->find('.doc_label', 0);
@@ -188,7 +205,7 @@ class VkBridge extends BridgeAbstract
if (is_object($doc_title_element)) {
$doc_title = $doc_title_element->innertext;
$doc_link = self::URI . ltrim($doc_title_element->getAttribute('href'), '/');
$doc_link = $doc_title_element->getAttribute('href');
$content_suffix .= "<br>Doc: <a href='$doc_link'>$doc_title</a>";
} else {
@@ -228,20 +245,29 @@ class VkBridge extends BridgeAbstract
$item = array();
$item['content'] = strip_tags(backgroundToImg($post->find('div.wall_text', 0)->innertext), '<br><img>');
$item['content'] .= $content_suffix;
$item['categories'] = array();
// get post hashtags
foreach($post->find('a') as $a) {
$href = $a->getAttribute('href');
$prefix = '/feed?section=search&q=%23';
$innertext = $a->innertext;
if ($href && substr($href, 0, strlen($prefix)) === $prefix) {
$item['categories'][] = urldecode(substr($href, strlen($prefix)));
} else if (substr($innertext, 0, 1) == '#') {
$item['categories'][] = $innertext;
}
}
// get post link
$post_link = $post->find('a.post_link', 0)->getAttribute('href');
preg_match('/wall-?\d+_(\d+)/', $post_link, $preg_match_result);
$item['post_id'] = intval($preg_match_result[1]);
if (substr(self::URI, -1) == '/') {
$post_link = self::URI . ltrim($post_link, '/');
} else {
$post_link = self::URI . $post_link;
}
$item['uri'] = $post_link;
$item['timestamp'] = $this->getTime($post);
$item['title'] = $this->getTitle($item['content']);
$item['author'] = $post_author;
$item['videos'] = $post_videos;
if ($is_pinned_post) {
// do not append it now
$pinned_post_item = $item;
@@ -252,16 +278,18 @@ class VkBridge extends BridgeAbstract
}
if (is_null($pinned_post_item)) {
return;
} else if (count($this->items) == 0) {
$this->items[] = $pinned_post_item;
} else if ($last_post_id < $pinned_post_item['post_id']) {
$this->items[] = $pinned_post_item;
usort($this->items, function ($item1, $item2) {
return $item2['post_id'] - $item1['post_id'];
});
if (!is_null($pinned_post_item)) {
if (count($this->items) == 0) {
$this->items[] = $pinned_post_item;
} else if ($last_post_id < $pinned_post_item['post_id']) {
$this->items[] = $pinned_post_item;
usort($this->items, function ($item1, $item2) {
return $item2['post_id'] - $item1['post_id'];
});
}
}
$this->getCleanVideoLinks();
}
private function getPhoto($a) {
@@ -326,7 +354,7 @@ class VkBridge extends BridgeAbstract
}
public function getContents()
private function getContents()
{
ini_set('user-agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0');
@@ -335,5 +363,51 @@ class VkBridge extends BridgeAbstract
return getContents($this->getURI(), $header);
}
protected function appendVideo($video_title, $video_link, &$content_suffix, array &$post_videos)
{
if (!$video_title) $video_title = '(empty)';
preg_match('/video([0-9-]+_[0-9]+)/', $video_link, $preg_match_result);
if (count($preg_match_result) > 1) {
$video_id = $preg_match_result[1];
$this->videos[ $video_id ] = array(
'url' => $video_link,
'title' => $video_title,
);
$post_videos[] = $video_id;
} else {
$content_suffix .= '<br>Video: <a href="'.htmlspecialchars($video_link).'">'.$video_title.'</a>';
}
}
protected function getCleanVideoLinks() {
$result = $this->api('video.get', array(
'videos' => implode(',', array_keys($this->videos)),
'count' => 200
));
if (isset($result['error'])) return;
foreach($result['response']['items'] as $item) {
$video_id = strval($item['owner_id']).'_'.strval($item['id']);
$this->videos[$video_id]['url'] = $item['player'];
}
foreach($this->items as &$item) {
foreach($item['videos'] as $video_id) {
$video_link = $this->videos[$video_id]['url'];
$video_title = $this->videos[$video_id]['title'];
$item['content'] .= '<br>Video: <a href="'.htmlspecialchars($video_link).'">'.$video_title.'</a>';
}
unset($item['videos']);
}
}
protected function api($method, array $params)
{
$params['v'] = '5.80';
$params['access_token'] = $this->getAccessToken();
return json_decode( getContents('https://api.vk.com/method/'.$method.'?'.http_build_query($params)), true );
}
}


@@ -3,37 +3,24 @@ class WeLiveSecurityBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'We Live Security';
const URI = 'http://www.welivesecurity.com/';
const URI = 'https://www.welivesecurity.com/';
const DESCRIPTION = 'Returns the newest articles.';
private function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
protected function parseItem($item){
$item = parent::parseItem($item);
$article_html = getSimpleHTMLDOMCached($item['uri']);
if(!$article_html) {
$item['content'] .= '<p>Could not request ' . $this->getName() . ': ' . $item['uri'] . '</p>';
$item['content'] .= '<p><em>Could not request ' . $this->getName() . ': ' . $item['uri'] . '</em></p>';
return $item;
}
$article_content = $article_html->find('div.wlistingsingletext', 0)->innertext;
$article_content = $this->stripWithDelimiters($article_content, '<script', '</script>');
$article_content = '<p><b>'
. $item['content']
. '</b></p>'
. trim($article_content);
$item['content'] = $article_content;
$article_content = $article_html->find('div.formatted', 0)->innertext;
$article_content = stripWithDelimiters($article_content, '<script', '</script>');
$article_content = stripRecursiveHTMLSection($article_content, 'div', '<div class="comments');
$article_content = stripRecursiveHTMLSection($article_content, 'div', '<div class="similar-articles');
$article_content = stripRecursiveHTMLSection($article_content, 'span', '<span class="meta');
$item['content'] = trim($article_content);
return $item;
}
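
The hunk above replaces the bridge's private `stripWithDelimiters` method with a shared helper of the same name. As a rough illustration of the technique (remove every section running from a start marker to the next end marker), here is a minimal sketch. The name `stripBetween` is hypothetical and this is not the project's `html.php` code; unlike the removed method, it stops when a section is unterminated.

```php
<?php
// Hypothetical helper for illustration only, not RSS-Bridge's implementation.
// Removes every section that begins with $start and ends with the next $end.
function stripBetween($string, $start, $end) {
    $offset = 0;
    while (($from = strpos($string, $start, $offset)) !== false) {
        $to = strpos($string, $end, $from + strlen($start));
        if ($to === false) {
            break; // unterminated section: leave the remainder untouched
        }
        // Keep what precedes the section and what follows its end marker.
        $string = substr($string, 0, $from) . substr($string, $to + strlen($end));
        $offset = $from; // resume scanning where the section used to start
    }
    return $string;
}

$html = '<p>a</p><script>x()</script><p>b</p><script src="y"></script>';
echo stripBetween($html, '<script', '</script>'), PHP_EOL;
```

Cutting by offsets instead of `str_replace` means identical text elsewhere in the string is not removed as a side effect.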


@@ -3,8 +3,7 @@ class WordPressBridge extends FeedExpander {
const MAINTAINER = 'aledeg';
const NAME = 'Wordpress Bridge';
const URI = 'https://wordpress.org/';
const CACHE_TIMEOUT = 10800; // 3h
const DESCRIPTION = 'Returns the newest full posts of a Wordpress powered website';
const DESCRIPTION = 'Returns the newest full posts of a WordPress powered website';
const PARAMETERS = array( array(
'url' => array(
@@ -13,8 +12,8 @@ class WordPressBridge extends FeedExpander {
)
));
private function clearContent($content){
$content = preg_replace('/<script[^>]*>[^<]*<\/script>/', '', $content);
private function cleanContent($content){
$content = stripWithDelimiters($content, '<script', '</script>');
$content = preg_replace('/<div class="wpa".*/', '', $content);
$content = preg_replace('/<form.*\/form>/', '', $content);
return $content;
@@ -27,6 +26,10 @@ class WordPressBridge extends FeedExpander {
$article = null;
switch(true) {
case !is_null($article_html->find('[itemprop=articleBody]', 0)):
// highest priority content div
$article = $article_html->find('[itemprop=articleBody]', 0);
break;
case !is_null($article_html->find('article', 0)):
// most common content div
$article = $article_html->find('article', 0);
@@ -39,15 +42,37 @@ class WordPressBridge extends FeedExpander {
// another common content div
$article = $article_html->find('.post-content', 0);
break;
case !is_null($article_html->find('.post', 0)):
// for old WordPress themes without HTML5
$article = $article_html->find('.post', 0);
break;
}
foreach ($article->find('h1.entry-title') as $title)
if ($title->plaintext == $item['title'])
$title->outertext = '';
$article_image = $article_html->find('img.wp-post-image', 0);
if(!empty($item['content']) && (!is_object($article_image) || empty($article_image->src))) {
$article_image = str_get_html($item['content'])->find('img.wp-post-image', 0);
}
if(is_object($article_image) && !empty($article_image->src)) {
if(empty($article_image->getAttribute('data-lazy-src'))) {
$article_image = $article_image->src;
} else {
$article_image = $article_image->getAttribute('data-lazy-src');
}
$mime_type = getMimeType($article_image);
if (strpos($mime_type, 'image') === false)
$article_image .= '#.image'; // force image
if (empty($item['enclosures']))
$item['enclosures'] = array($article_image);
else
$item['enclosures'] = array_merge($item['enclosures'], array($article_image));
}
if(!is_null($article)) {
$item['content'] = $this->clearContent($article->innertext);
$item['content'] = $this->cleanContent($article->innertext);
}
return $item;
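
The enclosure handling in this hunk appends a single image URL to a list that may or may not exist yet. A minimal sketch of that step, assuming a hypothetical helper `addEnclosure` that is not part of the bridge:

```php
<?php
// Hypothetical helper for illustration: append one enclosure URL to an item.
function addEnclosure(array $item, $url) {
    if (empty($item['enclosures'])) {
        $item['enclosures'] = array($url);
    } else {
        // array_merge() expects arrays, so the single URL is wrapped first.
        $item['enclosures'] = array_merge($item['enclosures'], array($url));
    }
    return $item;
}

$item = addEnclosure(array('title' => 'post'), 'https://example.org/a.png');
$item = addEnclosure($item, 'https://example.org/b.png');
print_r($item['enclosures']);
```

The URLs are placeholders. Wrapping the string in `array()` keeps both branches producing a flat list of URLs.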


@@ -101,7 +101,7 @@ class YGGTorrentBridge extends BridgeAbstract {
. $category
. '&sub_category='
. $subcategory
. '&do=search')
. '&do=search&order=desc&sort=publish_date')
or returnServerError('Unable to query Yggtorrent !');
$count = 0;
@@ -110,8 +110,8 @@ class YGGTorrentBridge extends BridgeAbstract {
foreach($results->find('tr') as $row) {
$count++;
if($count == 1) continue;
if($count == 12) break;
if($count == 1) continue; // Skip table header
if($count == 22) break; // Stop processing after 21 items (20 + 1 table header)
$item = array();
$item['timestamp'] = $row->find('.hidden', 1)->plaintext;
$item['title'] = $row->find('a', 1)->plaintext;
@@ -127,7 +127,7 @@ class YGGTorrentBridge extends BridgeAbstract {
}
public function collectTorrentData($url) {
private function collectTorrentData($url) {
//For weird reason, the link we get can be invalid, we fix it.
$url_full = explode('/', $url);
@@ -135,7 +135,7 @@ class YGGTorrentBridge extends BridgeAbstract {
$url_full[5] = urlencode($url_full[5]);
$url_full[6] = urlencode($url_full[6]);
$url = implode('/', $url_full);
$page = getSimpleHTMLDOM($url) or returnServerError('Unable to query Yggtorrent page !');
$page = getSimpleHTMLDOMCached($url) or returnServerError('Unable to query Yggtorrent page !');
$author = $page->find('.informations', 0)->find('a', 4)->plaintext;
$content = $page->find('.default', 1);
return array('author' => $author, 'content' => $content);


@@ -45,9 +45,25 @@ class YoutubeBridge extends BridgeAbstract {
'type' => 'number',
'exampleValue' => 1
)
),
'global' => array(
'duration_min' => array(
'name' => 'min. duration (minutes)',
'type' => 'number',
'title' => 'Minimum duration for the video in minutes',
'exampleValue' => 5
),
'duration_max' => array(
'name' => 'max. duration (minutes)',
'type' => 'number',
'title' => 'Maximum duration for the video in minutes',
'exampleValue' => 10
)
)
);
private $feedName = '';
private function ytBridgeQueryVideoInfo($vid, &$author, &$desc, &$time){
$html = $this->ytGetSimpleHTMLDOM(self::URI . "watch?v=$vid");
@@ -113,6 +129,17 @@ class YoutubeBridge extends BridgeAbstract {
private function ytBridgeParseHtmlListing($html, $element_selector, $title_selector, $add_parsed_items = true) {
$limit = $add_parsed_items ? 10 : INF;
$count = 0;
$duration_min = $this->getInput('duration_min') ?: -1;
$duration_min = $duration_min * 60;
$duration_max = $this->getInput('duration_max') ?: INF;
$duration_max = $duration_max * 60;
if($duration_max < $duration_min) {
returnClientError('Max duration must be greater than min duration!');
}
foreach($html->find($element_selector) as $element) {
if($count < $limit) {
$author = '';
@@ -121,6 +148,20 @@ class YoutubeBridge extends BridgeAbstract {
$vid = str_replace('/watch?v=', '', $element->find('a', 0)->href);
$vid = substr($vid, 0, strpos($vid, '&') ?: strlen($vid));
$title = $this->ytBridgeFixTitle($element->find($title_selector, 0)->plaintext);
// The duration comes in one of the formats:
// hh:mm:ss / mm:ss / m:ss
// 01:03:30 / 15:06 / 1:24
$durationText = trim($element->find('span[class="video-time"]', 0)->plaintext);
$durationText = preg_replace('/^([\d]{1,2})\:([\d]{2})$/', '00:$1:$2', $durationText);
sscanf($durationText, '%d:%d:%d', $hours, $minutes, $seconds);
$duration = $hours * 3600 + $minutes * 60 + $seconds;
if($duration < $duration_min || $duration > $duration_max) {
continue;
}
if($title != '[Private Video]' && strpos($vid, 'googleads') === false) {
if ($add_parsed_items) {
$this->ytBridgeQueryVideoInfo($vid, $author, $desc, $time);
@@ -168,7 +209,7 @@ class YoutubeBridge extends BridgeAbstract {
}
if(!empty($url_feed) && !empty($url_listing)) {
if($xml = $this->ytGetSimpleHTMLDOM($url_feed)) {
if(!$this->skipFeeds() && $xml = $this->ytGetSimpleHTMLDOM($url_feed)) {
$this->ytBridgeParseXmlFeed($xml);
} elseif($html = $this->ytGetSimpleHTMLDOM($url_listing)) {
$this->ytBridgeParseHtmlListing($html, 'li.channels-content-item', 'h3');
@@ -182,7 +223,7 @@ class YoutubeBridge extends BridgeAbstract {
$html = $this->ytGetSimpleHTMLDOM($url_listing)
or returnServerError("Could not request YouTube. Tried:\n - $url_listing");
$item_count = $this->ytBridgeParseHtmlListing($html, 'tr.pl-video', '.pl-video-title a', false);
if ($item_count <= 15 && ($xml = $this->ytGetSimpleHTMLDOM($url_feed))) {
if ($item_count <= 15 && !$this->skipFeeds() && ($xml = $this->ytGetSimpleHTMLDOM($url_feed))) {
$this->ytBridgeParseXmlFeed($xml);
} else {
$this->ytBridgeParseHtmlListing($html, 'tr.pl-video', '.pl-video-title a');
@@ -215,6 +256,10 @@ class YoutubeBridge extends BridgeAbstract {
}
}
private function skipFeeds() {
return ($this->getInput('duration_min') || $this->getInput('duration_max'));
}
public function getName(){
// Name depends on queriedContext:
switch($this->queriedContext) {
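
The duration filter in this diff converts a label such as `hh:mm:ss`, `mm:ss` or `m:ss` into seconds and compares it with bounds given in minutes. A minimal sketch of the same idea, assuming two hypothetical helpers (`parseDurationSeconds`, `withinDuration`) that are not part of the bridge:

```php
<?php
// Hypothetical helpers for illustration only.

// Convert "hh:mm:ss", "mm:ss" or "m:ss" into a number of seconds.
function parseDurationSeconds($text) {
    $seconds = 0;
    foreach (explode(':', trim($text)) as $part) {
        // Each field to the right is worth 1/60 of the previous one.
        $seconds = $seconds * 60 + (int)$part;
    }
    return $seconds;
}

// Bounds are in minutes; an empty bound means "no limit" on that side.
function withinDuration($seconds, $minMinutes, $maxMinutes) {
    $min = $minMinutes ? $minMinutes * 60 : -1;
    $max = $maxMinutes ? $maxMinutes * 60 : INF;
    return $seconds >= $min && $seconds <= $max;
}

foreach (array('01:03:30', '15:06', '1:24') as $label) {
    echo $label, ' => ', parseDurationSeconds($label), PHP_EOL;
}
```

Folding the fields from left to right avoids having to pad short labels to three fields before parsing.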


@@ -1,9 +1,9 @@
<?php
class ZDNetBridge extends BridgeAbstract {
class ZDNetBridge extends FeedExpander {
const MAINTAINER = 'ORelio';
const NAME = 'ZDNet Bridge';
const URI = 'http://www.zdnet.com/';
const URI = 'https://www.zdnet.com/';
const DESCRIPTION = 'Technology News, Analysis, Comments and Product Reviews for IT Professionals.';
//http://www.zdnet.com/zdnet.opml
@@ -160,143 +160,42 @@ class ZDNetBridge extends BridgeAbstract {
));
public function collectData(){
function stripCdata($string){
$string = str_replace('<![CDATA[', '', $string);
$string = str_replace(']]>', '', $string);
return trim($string);
}
function extractFromDelimiters($string, $start, $end){
if(strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
}
return false;
}
function stripWithDelimiters($string, $start, $end){
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
function stripRecursiveHtmlSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr(
$string,
$section_start,
$section_end - $section_start + $close_tag_length
);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while ($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
$baseUri = self::URI;
$baseUri = static::URI;
$feed = $this->getInput('feed');
if(strpos($feed, 'downloads!') !== false) {
$feed = str_replace('downloads!', '', $feed);
$baseUri = str_replace('www.', 'downloads.', $baseUri);
}
$url = $baseUri . trim($feed, '/') . '/rss.xml';
$html = getSimpleHTMLDOM($url)
or returnServerError('Could not request ZDNet: ' . $url);
$limit = 0;
$this->collectExpandableDatas($url);
}
foreach($html->find('item') as $element) {
if($limit < 10) {
$article_url = preg_replace(
'/([^#]+)#ftag=.*/',
'$1',
stripCdata(extractFromDelimiters($element->innertext, '<link>', '</link>'))
);
protected function parseItem($item){
$item = parent::parseItem($item);
$article_author = stripCdata(extractFromDelimiters($element->innertext, 'role="author">', '<'));
$article_title = stripCdata($element->find('title', 0)->plaintext);
$article_subtitle = stripCdata($element->find('description', 0)->plaintext);
$article_timestamp = strtotime(stripCdata($element->find('pubDate', 0)->plaintext));
$article = getSimpleHTMLDOM($article_url)
or returnServerError('Could not request ZDNet: ' . $article_url);
$article = getSimpleHTMLDOMCached($item['uri']);
if(!$article)
returnServerError('Could not request ZDNet: ' . $item['uri']);
if(!empty($article_author)) {
$author = $article_author;
} else {
$author = $article->find('meta[name=author]', 0);
if(is_object($author)) {
$author = $author->content;
} else {
$author = 'ZDNet';
}
}
$thumbnail = $article->find('meta[itemprop=image]', 0);
if(is_object($thumbnail)) {
$thumbnail = $thumbnail->content;
} else {
$thumbnail = '';
}
$contents = $article->find('article', 0)->innertext;
foreach(array(
'<div class="shareBar"',
'<div class="shortcodeGalleryWrapper"',
'<div class="relatedContent',
'<div class="downloadNow',
'<div data-shortcode',
'<div id="sharethrough',
'<div id="inpage-video'
) as $div_start) {
$contents = stripRecursiveHtmlSection($contents, 'div', $div_start);
}
$contents = stripWithDelimiters($contents, '<script', '</script>');
$contents = stripWithDelimiters($contents, '<meta itemprop="image"', '>');
$contents = trim(stripWithDelimiters($contents, '<section class="sharethrough-top', '</section>'));
$content_img = strpos($contents, '<img'); //Look for first image
if (($content_img !== false && $content_img < 512) || $thumbnail == '') {
$content_img = ''; //Image already present on article beginning or no thumbnail
} else {
$content_img = '<p><img src="'.$thumbnail.'" /></p>'; //Include thumbnail
}
$contents = $content_img
. '<p><b>'
. $article_subtitle
. '</b></p>'
. $contents;
$item = array();
$item['author'] = $author;
$item['uri'] = $article_url;
$item['title'] = $article_title;
$item['timestamp'] = $article_timestamp;
$item['content'] = $contents;
$this->items[] = $item;
$limit++;
}
$contents = $article->find('article', 0)->innertext;
foreach(array(
'<div class="shareBar"',
'<div class="shortcodeGalleryWrapper"',
'<div class="relatedContent',
'<div class="downloadNow',
'<div data-shortcode',
'<div id="sharethrough',
'<div id="inpage-video'
) as $div_start) {
$contents = stripRecursiveHtmlSection($contents, 'div', $div_start);
}
$contents = stripWithDelimiters($contents, '<script', '</script>');
$contents = stripWithDelimiters($contents, '<meta itemprop="image"', '>');
$contents = stripWithDelimiters($contents, '<svg class="svg-symbol', '</svg>');
$contents = trim(stripWithDelimiters($contents, '<section class="sharethrough-top', '</section>'));
$item['content'] = $contents;
return $item;
}
}


@@ -27,6 +27,7 @@ class FileCache implements CacheInterface {
public function getTime(){
$cacheFile = $this->getCacheFile();
clearstatcache(false, $cacheFile);
if(file_exists($cacheFile)) {
return filemtime($cacheFile);
}


@@ -18,7 +18,11 @@ class AtomFormat extends FormatAbstract{
$uri = !empty($extraInfos['uri']) ? $extraInfos['uri'] : 'https://github.com/RSS-Bridge/rss-bridge';
$uriparts = parse_url($uri);
$icon = $this->xml_encode($uriparts['scheme'] . '://' . $uriparts['host'] .'/favicon.ico');
if(!empty($extraInfos['icon'])) {
$icon = $extraInfos['icon'];
} else {
$icon = $this->xml_encode($uriparts['scheme'] . '://' . $uriparts['host'] .'/favicon.ico');
}
$uri = $this->xml_encode($uri);
@@ -35,7 +39,7 @@ class AtomFormat extends FormatAbstract{
foreach($item['enclosures'] as $enclosure) {
$entryEnclosures .= '<link rel="enclosure" href="'
. $this->xml_encode($enclosure)
. '"/>'
. '" type="' . getMimeType($enclosure) . '" />'
. PHP_EOL;
}
}


@@ -48,7 +48,7 @@ class HtmlFormat extends FormatAbstract {
}
$entryCategories = '';
if(isset($item['categories'])) {
if(isset($item['categories']) && count($item['categories']) > 0) {
$entryCategories = '<div class="categories"><p>Categories:</p>';
foreach($item['categories'] as $category) {


@@ -37,7 +37,7 @@ class MrssFormat extends FormatAbstract {
if(isset($item['enclosures'])) {
$entryEnclosures .= '<enclosure url="'
. $this->xml_encode($item['enclosures'][0])
. '"/>';
. '" type="' . getMimeType($item['enclosures'][0]) . '" />';
if(count($item['enclosures']) > 1) {
$entryEnclosures .= PHP_EOL;
@@ -45,7 +45,7 @@ class MrssFormat extends FormatAbstract {
Some media files might not be shown to you. Consider using the ATOM format instead!';
foreach($item['enclosures'] as $enclosure) {
$entryEnclosures .= '<atom:link rel="enclosure" href="'
. $enclosure . '" />'
. $enclosure . '" type="' . getMimeType($enclosure) . '" />'
. PHP_EOL;
}
}

182
index.php

@@ -1,4 +1,31 @@
<?php
/*
Create a file named 'DEBUG' for enabling debug mode.
For further security, you may put whitelisted IP addresses in the file,
one IP per line. Empty file allows anyone(!).
Debugging allows displaying PHP error messages and bypasses the cache: this
can allow a malicious client to retrieve data about your server and hammer
a provider through your rss-bridge instance.
*/
if(file_exists('DEBUG')) {
$debug_whitelist = trim(file_get_contents('DEBUG'));
$debug_enabled = empty($debug_whitelist)
|| in_array($_SERVER['REMOTE_ADDR'],
explode("\n", str_replace("\r", '', $debug_whitelist)
)
);
if($debug_enabled) {
ini_set('display_errors', '1');
error_reporting(E_ALL);
define('DEBUG', true);
if (empty($debug_whitelist)) {
define('DEBUG_INSECURE', true);
}
}
}
require_once __DIR__ . '/lib/RssBridge.php';
define('PHP_VERSION_REQUIRED', '5.6.0');
@@ -15,34 +42,16 @@ Configuration::loadConfiguration();
Authentication::showPromptIfNeeded();
date_default_timezone_set('UTC');
error_reporting(0);
/*
Move the CLI arguments to the $_GET array, in order to be able to use
rss-bridge from the command line
*/
parse_str(implode('&', array_slice($argv, 1)), $cliArgs);
$params = array_merge($_GET, $cliArgs);
/*
Create a file named 'DEBUG' for enabling debug mode.
For further security, you may put whitelisted IP addresses in the file,
one IP per line. Empty file allows anyone(!).
Debugging allows displaying PHP error messages and bypasses the cache: this
can allow a malicious client to retrieve data about your server and hammer
a provider through your rss-bridge instance.
*/
if(file_exists('DEBUG')) {
$debug_whitelist = trim(file_get_contents('DEBUG'));
$debug_enabled = empty($debug_whitelist)
|| in_array($_SERVER['REMOTE_ADDR'], explode("\n", $debug_whitelist));
if($debug_enabled) {
ini_set('display_errors', '1');
error_reporting(E_ALL);
define('DEBUG', true);
}
if (isset($argv)) {
parse_str(implode('&', array_slice($argv, 1)), $cliArgs);
$params = array_merge($_GET, $cliArgs);
} else {
$params = $_GET;
}
// FIXME : beta test UA spoofing, please report any blacklisting by PHP-fopen-unfriendly websites
@@ -95,10 +104,51 @@ try {
$whitelist_selection = array_map('strtolower', $whitelist_selection);
}
$showInactive = filter_input(INPUT_GET, 'show_inactive', FILTER_VALIDATE_BOOLEAN);
$action = array_key_exists('action', $params) ? $params['action'] : null;
$bridge = array_key_exists('bridge', $params) ? $params['bridge'] : null;
if($action === 'display' && !empty($bridge)) {
// Return list of bridges as JSON formatted text
if($action === 'list') {
$list = new StdClass();
$list->bridges = array();
$list->total = 0;
foreach(Bridge::listBridges() as $bridgeName) {
$bridge = Bridge::create($bridgeName);
if($bridge === false) { // Broken bridge, show as inactive
$list->bridges[$bridgeName] = array(
'status' => 'inactive'
);
continue;
}
$status = Bridge::isWhitelisted($whitelist_selection, strtolower($bridgeName)) ? 'active' : 'inactive';
$list->bridges[$bridgeName] = array(
'status' => $status,
'uri' => $bridge->getURI(),
'name' => $bridge->getName(),
'icon' => $bridge->getIcon(),
'parameters' => $bridge->getParameters(),
'maintainer' => $bridge->getMaintainer(),
'description' => $bridge->getDescription()
);
}
$list->total = count($list->bridges);
header('Content-Type: application/json');
echo json_encode($list, JSON_PRETTY_PRINT);
} elseif($action === 'display' && !empty($bridge)) {
// DEPRECATED: 'nameBridge' scheme is replaced by 'name' in bridge parameter values
// this is to keep compatibility until further complete removal
if(($pos = strpos($bridge, 'Bridge')) === (strlen($bridge) - strlen('Bridge'))) {
@@ -154,6 +204,7 @@ try {
try {
$bridge->setCache($cache);
$bridge->setCacheTimeout($cache_timeout);
$bridge->dieIfNotModified();
$bridge->setDatas($params);
} catch(Error $e) {
http_response_code($e->getCode());
@@ -170,6 +221,7 @@ try {
$format = Format::create($format);
$format->setItems($bridge->getItems());
$format->setExtraInfos($bridge->getExtraInfos());
$format->setLastModified($bridge->getCacheTime());
$format->display();
} catch(Error $e) {
http_response_code($e->getCode());
@@ -180,8 +232,8 @@ try {
header('Content-Type: text/html');
die(buildBridgeException($e, $bridge));
}
die;
} else {
echo BridgeList::create($whitelist_selection, $showInactive);
}
} catch(HttpException $e) {
http_response_code($e->getCode());
@@ -190,81 +242,3 @@ try {
} catch(\Exception $e) {
die($e->getMessage());
}
$formats = Format::searchInformation();
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="description" content="Rss-bridge" />
<title>RSS-Bridge</title>
<link href="static/style.css" rel="stylesheet">
<script src="static/search.js"></script>
<script src="static/select.js"></script>
<noscript>
<style>
.searchbar {
display: none;
}
</style>
</noscript>
</head>
<body onload="search()">
<?php
$status = '';
if(defined('DEBUG') && DEBUG === true) {
$status .= 'debug mode active';
}
$query = filter_input(INPUT_GET, 'q');
echo <<<EOD
<header>
<h1>RSS-Bridge</h1>
<h2>·Reconnecting the Web·</h2>
<p class="status">{$status}</p>
</header>
<section class="searchbar">
<h3>Search</h3>
<input type="text" name="searchfield"
id="searchfield" placeholder="Enter the bridge you want to search for"
onchange="search()" onkeyup="search()" value="{$query}">
</section>
EOD;
$activeFoundBridgeCount = 0;
$showInactive = filter_input(INPUT_GET, 'show_inactive', FILTER_VALIDATE_BOOLEAN);
$inactiveBridges = '';
$bridgeList = Bridge::listBridges();
foreach($bridgeList as $bridgeName) {
if(Bridge::isWhitelisted($whitelist_selection, strtolower($bridgeName))) {
echo displayBridgeCard($bridgeName, $formats);
$activeFoundBridgeCount++;
} elseif($showInactive) {
// inactive bridges
$inactiveBridges .= displayBridgeCard($bridgeName, $formats, false) . PHP_EOL;
}
}
echo $inactiveBridges;
?>
<section class="footer">
<a href="https://github.com/RSS-Bridge/rss-bridge">RSS-Bridge ~ Public Domain</a><br />
<p class="version"> <?= Configuration::getVersion() ?> </p>
<?= $activeFoundBridgeCount; ?>/<?= count($bridgeList) ?> active bridges. <br />
<?php
if($activeFoundBridgeCount !== count($bridgeList)) {
// FIXME: This should be done in pure CSS
if(!$showInactive)
echo '<a href="?show_inactive=1"><button class="small">Show inactive bridges</button></a><br />';
else
echo '<a href="?show_inactive=0"><button class="small">Hide inactive bridges</button></a><br />';
}
?>
</section>
</body>
</html>
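
The debug block moved to the top of `index.php` decides whether the requesting address appears in the `DEBUG` file, one IP per line, with an empty file allowing everyone. A minimal sketch of that decision as a pure function, assuming a hypothetical name `isDebugAllowed` and taking the file content as a string argument so it can be exercised without touching the filesystem:

```php
<?php
// Hypothetical function for illustration; $whitelistText stands in for the
// content of the DEBUG file and $remoteAddr for $_SERVER['REMOTE_ADDR'].
function isDebugAllowed($whitelistText, $remoteAddr) {
    $whitelistText = trim($whitelistText);
    if ($whitelistText === '') {
        return true; // empty file: debug mode is open to any client
    }
    // \R matches LF, CRLF and CR, so Windows line endings need no special case.
    $allowed = array_map('trim', preg_split('/\R/', $whitelistText));
    return in_array($remoteAddr, $allowed, true);
}

var_dump(isDebugAllowed("127.0.0.1\r\n192.168.0.2\r\n", '192.168.0.2'));
var_dump(isDebugAllowed("127.0.0.1\n", '10.0.0.1'));
```

The diff itself strips `"\r"` before splitting on `"\n"`; splitting on `\R` is an alternative that reaches the same result for CRLF files.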


@@ -246,6 +246,15 @@ abstract class BridgeAbstract implements BridgeInterface {
return static::NAME;
}
public function getIcon(){
// Return cached icon when bridge is using cached data
if(isset($this->extraInfos)) {
return $this->extraInfos['icon'];
}
return '';
}
public function getParameters(){
return static::PARAMETERS;
}
@@ -262,7 +271,8 @@ abstract class BridgeAbstract implements BridgeInterface {
public function getExtraInfos(){
return array(
'name' => $this->getName(),
'uri' => $this->getURI()
'uri' => $this->getURI(),
'icon' => $this->getIcon()
);
}
@@ -282,4 +292,27 @@ abstract class BridgeAbstract implements BridgeInterface {
public function getCacheTimeout(){
return isset($this->cacheTimeout) ? $this->cacheTimeout : static::CACHE_TIMEOUT;
}
public function getCacheTime(){
return !is_null($this->cache) ? $this->cache->getTime() : false;
}
public function dieIfNotModified(){
if ((defined('DEBUG') && DEBUG === true)) return; // disabled in debug mode
$if_modified_since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : false;
if (!$if_modified_since) return; // If-Modified-Since value is required
$last_modified = $this->getCacheTime();
if (!$last_modified) return; // did not detect cache time
if (time() - $this->getCacheTimeout() > $last_modified) return; // cache timeout
$last_modified = (gmdate('D, d M Y H:i:s ', $last_modified) . 'GMT');
if ($if_modified_since == $last_modified) {
header('HTTP/1.1 304 Not Modified');
die();
}
}
}
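
`dieIfNotModified()` answers with `304 Not Modified` only when the client's `If-Modified-Since` header equals the cache time formatted as an HTTP date and the cache has not yet expired. A minimal sketch of that comparison as a side-effect-free function, assuming a hypothetical name `isNotModified` with the current time passed in explicitly:

```php
<?php
// Hypothetical function for illustration; it mirrors the checks in the diff
// without sending headers or terminating the script.
function isNotModified($ifModifiedSince, $cacheTime, $cacheTimeout, $now) {
    if (!$ifModifiedSince || !$cacheTime) {
        return false; // header missing or cache time unknown
    }
    if ($now - $cacheTimeout > $cacheTime) {
        return false; // cache expired: the feed must be rebuilt
    }
    // HTTP dates are always expressed in GMT.
    $lastModified = gmdate('D, d M Y H:i:s ', $cacheTime) . 'GMT';
    return $ifModifiedSince === $lastModified;
}

$cacheTime = 86400; // one day after the Unix epoch
$header = gmdate('D, d M Y H:i:s ', $cacheTime) . 'GMT';
var_dump(isNotModified($header, $cacheTime, 3600, $cacheTime + 60));
var_dump(isNotModified($header, $cacheTime, 3600, $cacheTime + 7200));
```

Comparing the formatted strings, as the diff does, only matches clients that echo back the exact `Last-Modified` value they received.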

260
lib/BridgeCard.php Normal file

@@ -0,0 +1,260 @@
<?php
final class BridgeCard {
private static function buildFormatButtons($formats) {
$buttons = '';
foreach($formats as $name) {
$buttons .= '<button type="submit" name="format" value="'
. $name
. '">'
. $name
. '</button>'
. PHP_EOL;
}
return $buttons;
}
private static function getFormHeader($bridgeName, $isHttps = false) {
$form = <<<EOD
<form method="GET" action="?">
<input type="hidden" name="action" value="display" />
<input type="hidden" name="bridge" value="{$bridgeName}" />
EOD;
if(!$isHttps) {
$form .= '<div class="secure-warning">Warning :
This bridge is not fetching its content through a secure connection</div>';
}
return $form;
}
private static function getForm($bridgeName,
$formats,
$isActive = false,
$isHttps = false,
$parameterName = '',
$parameters = array()) {
$form = BridgeCard::getFormHeader($bridgeName, $isHttps);
foreach($parameters as $id => $inputEntry) {
if(!isset($inputEntry['exampleValue']))
$inputEntry['exampleValue'] = '';
if(!isset($inputEntry['defaultValue']))
$inputEntry['defaultValue'] = '';
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode($parameterName)
. '-'
. urlencode($id);
$form .= '<label for="'
. $idArg
. '">'
. filter_var($inputEntry['name'], FILTER_SANITIZE_STRING)
. ' : </label>'
. PHP_EOL;
if(!isset($inputEntry['type']) || $inputEntry['type'] === 'text') {
$form .= BridgeCard::getTextInput($inputEntry, $idArg, $id);
} elseif($inputEntry['type'] === 'number') {
$form .= BridgeCard::getNumberInput($inputEntry, $idArg, $id);
} else if($inputEntry['type'] === 'list') {
$form .= BridgeCard::getListInput($inputEntry, $idArg, $id);
} elseif($inputEntry['type'] === 'checkbox') {
$form .= BridgeCard::getCheckboxInput($inputEntry, $idArg, $id);
}
}
if($isActive) {
$form .= BridgeCard::buildFormatButtons($formats);
} else {
$form .= '<span style="font-weight: bold;">Inactive</span>';
}
return $form . '</form>' . PHP_EOL;
}
private static function getInputAttributes($entry) {
$retVal = '';
if(isset($entry['required']) && $entry['required'] === true)
$retVal .= ' required';
if(isset($entry['pattern']))
$retVal .= ' pattern="' . $entry['pattern'] . '"';
if(isset($entry['title']))
$retVal .= ' title="' . filter_var($entry['title'], FILTER_SANITIZE_STRING) . '"';
return $retVal;
}
private static function getTextInput($entry, $id, $name) {
return '<input '
. BridgeCard::getInputAttributes($entry)
. ' id="'
. $id
. '" type="text" value="'
. filter_var($entry['defaultValue'], FILTER_SANITIZE_STRING)
. '" placeholder="'
. filter_var($entry['exampleValue'], FILTER_SANITIZE_STRING)
. '" name="'
. $name
. '" /><br>'
. PHP_EOL;
}
private static function getNumberInput($entry, $id, $name) {
return '<input '
. BridgeCard::getInputAttributes($entry)
. ' id="'
. $id
. '" type="number" value="'
. filter_var($entry['defaultValue'], FILTER_SANITIZE_NUMBER_INT)
. '" placeholder="'
. filter_var($entry['exampleValue'], FILTER_SANITIZE_NUMBER_INT)
. '" name="'
. $name
. '" /><br>'
. PHP_EOL;
}
private static function getListInput($entry, $id, $name) {
$list = '<select '
. BridgeCard::getInputAttributes($entry)
. ' id="'
. $id
. '" name="'
. $name
. '" >';
foreach($entry['values'] as $name => $value) {
if(is_array($value)) {
$list .= '<optgroup label="' . htmlentities($name) . '">';
foreach($value as $subname => $subvalue) {
if($entry['defaultValue'] === $subname
|| $entry['defaultValue'] === $subvalue) {
$list .= '<option value="'
. $subvalue
. '" selected>'
. $subname
. '</option>';
} else {
$list .= '<option value="'
. $subvalue
. '">'
. $subname
. '</option>';
}
}
$list .= '</optgroup>';
} else {
if($entry['defaultValue'] === $name
|| $entry['defaultValue'] === $value) {
$list .= '<option value="'
. $value
. '" selected>'
. $name
. '</option>';
} else {
$list .= '<option value="'
. $value
. '">'
. $name
. '</option>';
}
}
}
$list .= '</select><br>';
return $list;
}
private static function getCheckboxInput($entry, $id, $name) {
return '<input '
. BridgeCard::getInputAttributes($entry)
. ' id="'
. $id
. '" type="checkbox" name="'
. $name
. '" '
. ($entry['defaultValue'] === 'checked' ? 'checked' : '')
. ' /><br>'
. PHP_EOL;
}
static function displayBridgeCard($bridgeName, $formats, $isActive = true){
$bridge = Bridge::create($bridgeName);
if($bridge == false)
return '';
$isHttps = strpos($bridge->getURI(), 'https') === 0;
$uri = $bridge->getURI();
$name = $bridge->getName();
$icon = $bridge->getIcon();
$description = $bridge->getDescription();
$parameters = $bridge->getParameters();
if(defined('PROXY_URL') && PROXY_BYBRIDGE) {
$parameters['global']['_noproxy'] = array(
'name' => 'Disable proxy (' . ((defined('PROXY_NAME') && PROXY_NAME) ? PROXY_NAME : PROXY_URL) . ')',
'type' => 'checkbox'
);
}
if(CUSTOM_CACHE_TIMEOUT) {
$parameters['global']['_cache_timeout'] = array(
'name' => 'Cache timeout in seconds',
'type' => 'number',
'defaultValue' => $bridge->getCacheTimeout()
);
}
$card = <<<CARD
<section id="bridge-{$bridgeName}" data-ref="{$bridgeName}">
<h2><a href="{$uri}">{$name}</a></h2>
<p class="description">{$description}</p>
<input type="checkbox" class="showmore-box" id="showmore-{$bridgeName}" />
<label class="showmore" for="showmore-{$bridgeName}">Show more</label>
CARD;
// If we don't have any parameter for the bridge, we print a generic form to load it.
if(count($parameters) === 0
|| count($parameters) === 1 && array_key_exists('global', $parameters)) {
$card .= BridgeCard::getForm($bridgeName, $formats, $isActive, $isHttps);
} else {
foreach($parameters as $parameterName => $parameter) {
if(!is_numeric($parameterName) && $parameterName === 'global')
continue;
if(array_key_exists('global', $parameters))
$parameter = array_merge($parameter, $parameters['global']);
if(!is_numeric($parameterName))
$card .= '<h5>' . $parameterName . '</h5>' . PHP_EOL;
$card .= BridgeCard::getForm($bridgeName, $formats, $isActive, $isHttps, $parameterName, $parameter);
}
}
$card .= '<label class="showless" for="showmore-' . $bridgeName . '">Show less</label>';
$card .= '<p class="maintainer">' . $bridge->getMaintainer() . '</p>';
$card .= '</section>';
return $card;
}
}
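
`getCheckboxInput()` is meant to emit a `checked` attribute when the parameter's default value is `'checked'`. A minimal sketch of that decision, assuming a hypothetical helper `checkedAttribute`; it also shows what PHP's short ternary `?:` returns for a boolean comparison, which matters when the result is concatenated into HTML:

```php
<?php
// Hypothetical helper for illustration: returns the attribute text or ''.
function checkedAttribute($defaultValue) {
    return $defaultValue === 'checked' ? 'checked' : '';
}

// The short ternary returns its left operand when that operand is truthy,
// so a comparison yields the boolean true, which concatenates as "1".
$shortTernary = ('checked' === 'checked') ?: '';

echo '<input type="checkbox" ' . checkedAttribute('checked') . ' />', PHP_EOL;
var_dump($shortTernary);
```

Using the full ternary makes the emitted attribute explicit instead of relying on boolean-to-string conversion.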


@@ -48,6 +48,13 @@ interface BridgeInterface {
*/
public function getName();
/**
* Returns the bridge icon
*
* @return string Bridge icon
*/
public function getIcon();
/**
* Returns the bridge parameters
*

136
lib/BridgeList.php Normal file

@@ -0,0 +1,136 @@
<?php
final class BridgeList {
private static function getHead() {
return <<<EOD
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="description" content="RSS-Bridge" />
<title>RSS-Bridge</title>
<link href="static/style.css" rel="stylesheet">
<script src="static/search.js"></script>
<script src="static/select.js"></script>
<noscript>
<style>
.searchbar {
display: none;
}
</style>
</noscript>
</head>
EOD;
}
private static function getBridges($whitelist, $showInactive, &$totalBridges, &$totalActiveBridges) {
$body = '';
$totalActiveBridges = 0;
$inactiveBridges = '';
$bridgeList = Bridge::listBridges();
$formats = Format::searchInformation();
$totalBridges = count($bridgeList);
foreach($bridgeList as $bridgeName) {
if(Bridge::isWhitelisted($whitelist, strtolower($bridgeName))) {
$body .= BridgeCard::displayBridgeCard($bridgeName, $formats);
$totalActiveBridges++;
} elseif($showInactive) {
// inactive bridges
$inactiveBridges .= BridgeCard::displayBridgeCard($bridgeName, $formats, false) . PHP_EOL;
}
}
$body .= $inactiveBridges;
return $body;
}
private static function getHeader() {
$warning = '';
if(defined('DEBUG') && DEBUG === true) {
if(defined('DEBUG_INSECURE') && DEBUG_INSECURE === true) {
$warning .= <<<EOD
<section class="critical-warning">Warning : Debug mode is active from any location,
make sure only you can access RSS-Bridge.</section>
EOD;
} else {
$warning .= <<<EOD
<section class="warning">Warning : Debug mode is active from your IP address,
your requests will bypass the cache.</section>
EOD;
}
}
return <<<EOD
<header>
<h1>RSS-Bridge</h1>
<h2>Reconnecting the Web</h2>
{$warning}
</header>
EOD;
}
private static function getSearchbar() {
$query = filter_input(INPUT_GET, 'q');
return <<<EOD
<section class="searchbar">
<h3>Search</h3>
<input type="text" name="searchfield"
id="searchfield" placeholder="Enter the bridge you want to search for"
onchange="search()" onkeyup="search()" value="{$query}">
</section>
EOD;
}
private static function getFooter($totalBridges, $totalActiveBridges, $showInactive) {
$version = Configuration::getVersion();
$inactive = '';
if($totalActiveBridges !== $totalBridges) {
if(!$showInactive) {
$inactive = '<a href="?show_inactive=1"><button class="small">Show inactive bridges</button></a><br>';
} else {
$inactive = '<a href="?show_inactive=0"><button class="small">Hide inactive bridges</button></a><br>';
}
}
return <<<EOD
<section class="footer">
<a href="https://github.com/rss-bridge/rss-bridge">RSS-Bridge ~ Public Domain</a><br>
<p class="version">{$version}</p>
{$totalActiveBridges}/{$totalBridges} active bridges.<br>
{$inactive}
</section>
EOD;
}
static function create($whitelist, $showInactive = true) {
$totalBridges = 0;
$totalActiveBridges = 0;
return '<!DOCTYPE html><html lang="en">'
. BridgeList::getHead()
. '<body onload="search()">'
. BridgeList::getHeader()
. BridgeList::getSearchbar()
. BridgeList::getBridges($whitelist, $showInactive, $totalBridges, $totalActiveBridges)
. BridgeList::getFooter($totalBridges, $totalActiveBridges, $showInactive)
. '</body></html>';
}
}

@@ -1,7 +1,7 @@
<?php
class Configuration {
public static $VERSION = '2018-07-17';
public static $VERSION = '2018-09-09';
public static $config = null;
@@ -27,6 +27,9 @@ class Configuration {
if(!extension_loaded('curl'))
die('"curl" extension not loaded. Please check "php.ini"');
if(!extension_loaded('json'))
die('"json" extension not loaded. Please check "php.ini"');
// Check cache folder permissions (write permissions required)
if(!is_writable(CACHE_DIR))
die('RSS-Bridge does not have write permissions for ' . CACHE_DIR . '!');

@@ -115,12 +115,27 @@ abstract class FeedExpander extends BridgeAbstract {
}
protected function parseATOMItem($feedItem){
$item = array();
// Some ATOM entries also contain RSS 2.0 fields
$item = $this->parseRSS_2_0_Item($feedItem);
if(isset($feedItem->id)) $item['uri'] = (string)$feedItem->id;
if(isset($feedItem->title)) $item['title'] = (string)$feedItem->title;
if(isset($feedItem->updated)) $item['timestamp'] = strtotime((string)$feedItem->updated);
if(isset($feedItem->author)) $item['author'] = (string)$feedItem->author->name;
if(isset($feedItem->content)) $item['content'] = (string)$feedItem->content;
//When "link" field is present, URL is more reliable than "id" field
if (count($feedItem->link) === 1) {
$this->uri = (string)$feedItem->link[0]['href'];
} else {
foreach($feedItem->link as $link) {
if(strtolower($link['rel']) === 'alternate') {
$item['uri'] = (string)$link['href'];
break;
}
}
}
return $item;
}
@@ -130,6 +145,7 @@ abstract class FeedExpander extends BridgeAbstract {
if(isset($feedItem->title)) $item['title'] = (string)$feedItem->title;
// rss 0.91 doesn't support timestamps
// rss 0.91 doesn't support authors
// rss 0.91 doesn't support enclosures
if(isset($feedItem->description)) $item['content'] = (string)$feedItem->description;
return $item;
}
@@ -154,11 +170,17 @@ abstract class FeedExpander extends BridgeAbstract {
$namespaces = $feedItem->getNamespaces(true);
if(isset($namespaces['dc'])) $dc = $feedItem->children($namespaces['dc']);
if(isset($namespaces['media'])) $media = $feedItem->children($namespaces['media']);
if(isset($feedItem->guid)) {
foreach($feedItem->guid->attributes() as $attribute => $value) {
if($attribute === 'isPermaLink'
&& ($value === 'true' || filter_var($feedItem->guid, FILTER_VALIDATE_URL))) {
&& ($value === 'true' || (
filter_var($feedItem->guid, FILTER_VALIDATE_URL)
&& !filter_var($item['uri'], FILTER_VALIDATE_URL)
)
)
) {
$item['uri'] = (string)$feedItem->guid;
break;
}
@@ -170,11 +192,21 @@ abstract class FeedExpander extends BridgeAbstract {
} elseif(isset($dc->date)) {
$item['timestamp'] = strtotime((string)$dc->date);
}
if(isset($feedItem->author)) {
$item['author'] = (string)$feedItem->author;
} elseif (isset($feedItem->creator)) {
$item['author'] = (string)$feedItem->creator;
} elseif(isset($dc->creator)) {
$item['author'] = (string)$dc->creator;
} elseif(isset($media->credit)) {
$item['author'] = (string)$media->credit;
}
if(isset($feedItem->enclosure) && !empty($feedItem->enclosure['url'])) {
$item['enclosures'] = array((string)$feedItem->enclosure['url']);
}
return $item;
}
@@ -199,10 +231,14 @@ abstract class FeedExpander extends BridgeAbstract {
}
public function getURI(){
return $this->uri ?: parent::getURI();
return !empty($this->uri) ? $this->uri : parent::getURI();
}
public function getName(){
return $this->name ?: parent::getName();
return !empty($this->name) ? $this->name : parent::getName();
}
public function getIcon(){
return !empty($this->icon) ? $this->icon : parent::getIcon();
}
}
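
The Atom link handling in the `parseATOMItem` hunk above can be illustrated in isolation. The following is a minimal sketch, not code from the commit: `pickAtomEntryUri` and the sample entries are hypothetical. It also assumes that both branches are meant to choose the item URI, whereas the single-link branch in the diff assigns `$this->uri` rather than `$item['uri']`.

```php
<?php
// Hypothetical helper (not part of RSS-Bridge): choose an item URI from an Atom <entry>.
// Falls back to <id>, prefers a lone <link>, otherwise the first rel="alternate" link.
function pickAtomEntryUri(SimpleXMLElement $entry) {
	$uri = isset($entry->id) ? (string)$entry->id : '';

	if (count($entry->link) === 1) {
		// A single link is taken as-is, whatever its rel attribute says
		$uri = (string)$entry->link[0]['href'];
	} else {
		foreach ($entry->link as $link) {
			if (strtolower((string)$link['rel']) === 'alternate') {
				$uri = (string)$link['href'];
				break;
			}
		}
	}

	return $uri;
}

// Sample entries are made up for illustration
$multi = simplexml_load_string(
	'<entry><id>tag:example.org,2018:1</id>'
	. '<link rel="self" href="https://example.org/feed/1.atom"/>'
	. '<link rel="alternate" href="https://example.org/posts/1"/></entry>'
);
$single = simplexml_load_string(
	'<entry><id>tag:example.org,2018:2</id><link href="https://example.org/only"/></entry>'
);
$none = simplexml_load_string('<entry><id>tag:example.org,2018:3</id></entry>');

$multiUri = pickAtomEntryUri($multi);
$singleUri = pickAtomEntryUri($single);
$noneUri = pickAtomEntryUri($none);

echo $multiUri . PHP_EOL . $singleUri . PHP_EOL . $noneUri . PHP_EOL;
```

With no `<link>` element at all, the sketch keeps the `<id>` value, which matches the order of assignments in the diff.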

@@ -7,6 +7,7 @@ abstract class FormatAbstract implements FormatInterface {
$contentType,
$charset,
$items,
$lastModified,
$extraInfos;
public function setCharset($charset){
@@ -27,11 +28,18 @@ abstract class FormatAbstract implements FormatInterface {
return $this;
}
public function setLastModified($lastModified){
$this->lastModified = $lastModified;
}
protected function callContentType(){
header('Content-Type: ' . $this->contentType);
}
public function display(){
if ($this->lastModified) {
header('Last-Modified: ' . gmdate('D, d M Y H:i:s ', $this->lastModified) . 'GMT');
}
echo $this->stringify();
return $this;
@@ -51,12 +59,12 @@ abstract class FormatAbstract implements FormatInterface {
}
/**
* Define common informations can be required by formats and set default value for unknow values
* Define common informations can be required by formats and set default value for unknown values
* @param array $extraInfos array with known information (no merging is performed!)
* @return this
*/
public function setExtraInfos(array $extraInfos = array()){
foreach(array('name', 'uri') as $infoName) {
foreach(array('name', 'uri', 'icon') as $infoName) {
if(!isset($extraInfos[$infoName])) {
$extraInfos[$infoName] = '';
}
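
The `Last-Modified` header built in `display()` above uses `gmdate` with an HTTP-date pattern. As a small sketch of what that string looks like for a given Unix timestamp (the helper name `formatLastModified` and the sample timestamp are assumptions made for illustration, not part of the commit):

```php
<?php
// Hypothetical helper mirroring the header string assembled in display()
function formatLastModified($timestamp) {
	// gmdate() always formats in UTC, so the literal 'GMT' suffix is accurate
	return 'Last-Modified: ' . gmdate('D, d M Y H:i:s ', $timestamp) . 'GMT';
}

$line = formatLastModified(1536523210);
$epoch = formatLastModified(0);

echo $line . PHP_EOL;  // Last-Modified: Sun, 09 Sep 2018 20:00:10 GMT
echo $epoch . PHP_EOL; // Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
```

Note that `display()` guards the header with `if ($this->lastModified)`, so a timestamp of `0` would not be sent by the code in the diff; it appears here only to show the format.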

@@ -16,6 +16,8 @@ require __DIR__ . '/FeedExpander.php';
require __DIR__ . '/Cache.php';
require __DIR__ . '/Authentication.php';
require __DIR__ . '/Configuration.php';
require __DIR__ . '/BridgeCard.php';
require __DIR__ . '/BridgeList.php';
require __DIR__ . '/validation.php';
require __DIR__ . '/html.php';
@@ -32,6 +34,17 @@ if(!file_exists($vendorLibSimpleHtmlDom)) {
}
require_once $vendorLibSimpleHtmlDom;
$vendorLibPhpUrlJoin = __DIR__ . PATH_VENDOR . '/php-urljoin/src/urljoin.php';
if(!file_exists($vendorLibPhpUrlJoin)) {
throw new \HttpException('"php-urljoin" library is missing.
Get it from https://github.com/fluffy-critter/php-urljoin and place the script "urljoin.php" in '
. substr(PATH_VENDOR, 4)
. '/php-urljoin/src/',
500);
}
require_once $vendorLibPhpUrlJoin;
/* Example use
require_once __DIR__ . '/lib/RssBridge.php';
@@ -49,6 +62,7 @@ require_once $vendorLibSimpleHtmlDom;
->setExtraInfos(array(
'name' => $bridge->getName(),
'uri' => $bridge->getURI(),
'icon' => $bridge->getIcon(),
))
->display();

@@ -21,15 +21,34 @@ function getContents($url, $header = array(), $opts = array()){
curl_setopt($ch, CURLOPT_PROXY, PROXY_URL);
}
$content = curl_exec($ch);
// We always want the response header as part of the data!
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
$curlError = curl_error($ch);
$curlErrno = curl_errno($ch);
curl_close($ch);
if($content === false)
if($data === false)
debugMessage('Can\'t download ' . $url . ' cURL error: ' . $curlError . ' (' . $curlErrno . ')');
return $content;
$headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$header = substr($data, 0, $headerSize);
$headers = parseResponseHeader($header);
$finalHeader = end($headers);
if(array_key_exists('http_code', $finalHeader)
&& strpos($finalHeader['http_code'], '200') === false
&& array_key_exists('Server', $finalHeader)
&& strpos($finalHeader['Server'], 'cloudflare') !== false) {
returnServerError(<<< EOD
The server responded with a Cloudflare challenge, which is not supported by RSS-Bridge!<br>
If this error persists longer than a week, please consider opening an issue on GitHub!
EOD
);
}
curl_close($ch);
return substr($data, $headerSize);
}
function getSimpleHTMLDOM($url,
@@ -98,3 +117,88 @@ $defaultSpanText = DEFAULT_SPAN_TEXT){
$defaultBRText,
$defaultSpanText);
}
/**
* Parses the provided response header into an associative array
*
* Based on https://stackoverflow.com/a/18682872
*/
function parseResponseHeader($header) {
$headers = array();
$requests = explode("\r\n\r\n", trim($header));
foreach ($requests as $request) {
$header = array();
foreach (explode("\r\n", $request) as $i => $line) {
if($i === 0) {
$header['http_code'] = $line;
} else {
list ($key, $value) = explode(': ', $line);
$header[$key] = $value;
}
}
$headers[] = $header;
}
return $headers;
}
/**
* Determine MIME type from URL/Path file extension
* Remark: The built-in mime_content_type function and the fileinfo extension require fetching remote content
* Remark: A bridge can hint for a MIME type by appending #.ext to a URL, e.g. #.image
* Based on https://stackoverflow.com/a/1147952
*/
function getMimeType($url) {
static $mime = null;
if (is_null($mime)) {
$mime = array(
'jpg' => 'image/jpeg',
'gif' => 'image/gif',
'png' => 'image/png',
'image' => 'image/*'
);
if (is_file('/etc/mime.types')) {
$file = fopen('/etc/mime.types', 'r');
while(($line = fgets($file)) !== false) {
$line = trim(preg_replace('/#.*/', '', $line));
if(!$line)
continue;
$parts = preg_split('/\s+/', $line);
if(count($parts) == 1)
continue;
$type = array_shift($parts);
foreach($parts as $part)
$mime[$part] = $type;
}
fclose($file);
}
}
if (strpos($url, '?') !== false) {
$url_temp = substr($url, 0, strpos($url, '?'));
if (strpos($url, '#') !== false) {
$anchor = substr($url, strpos($url, '#'));
$url_temp .= $anchor;
}
$url = $url_temp;
}
$ext = strtolower(pathinfo($url, PATHINFO_EXTENSION));
if (!empty($mime[$ext])) {
return $mime[$ext];
}
return 'application/octet-stream';
}
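
To see how `parseResponseHeader` and the Cloudflare check in `getContents` fit together, here is a minimal standalone sketch. `parseResponseHeader` is copied from the diff above so the example runs on its own; `looksLikeCloudflareBlock` is a hypothetical wrapper around the same condition, and the raw header strings are invented sample data.

```php
<?php
// Copied from the diff above so this example is self-contained
function parseResponseHeader($header) {
	$headers = array();
	$requests = explode("\r\n\r\n", trim($header));

	foreach ($requests as $request) {
		$header = array();
		foreach (explode("\r\n", $request) as $i => $line) {
			if ($i === 0) {
				$header['http_code'] = $line; // status line, e.g. "HTTP/1.1 200 OK"
			} else {
				list($key, $value) = explode(': ', $line);
				$header[$key] = $value;
			}
		}
		$headers[] = $header;
	}

	return $headers;
}

// Hypothetical wrapper around the condition used in getContents()
function looksLikeCloudflareBlock(array $finalHeader) {
	return array_key_exists('http_code', $finalHeader)
		&& strpos($finalHeader['http_code'], '200') === false
		&& array_key_exists('Server', $finalHeader)
		&& strpos($finalHeader['Server'], 'cloudflare') !== false;
}

// Invented sample: a redirect followed by a non-200 response served by Cloudflare
$raw = "HTTP/1.1 301 Moved Permanently\r\nLocation: https://example.org/\r\n\r\n"
	. "HTTP/1.1 503 Service Temporarily Unavailable\r\nServer: cloudflare\r\n\r\n";

$headers = parseResponseHeader($raw);
$finalHeader = end($headers); // only the last response in the chain is inspected
$blocked = looksLikeCloudflareBlock($finalHeader);

// Invented sample: a normal 200 response that merely passes through Cloudflare
$okHeaders = parseResponseHeader("HTTP/1.1 200 OK\r\nServer: cloudflare\r\n\r\n");
$okBlocked = looksLikeCloudflareBlock(end($okHeaders));

var_dump(count($headers), $blocked, $okBlocked);
```

Because `array_key_exists` is case-sensitive, a response whose header name is spelled `server` in lowercase would not match this check.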

@@ -20,7 +20,7 @@ function debugMessage($text){
$calling = $backtrace[2];
$message = $calling['file'] . ':'
. $calling['line'] . ' class '
. $calling['class'] . '->'
. (isset($calling['class']) ? $calling['class'] : '<no-class>') . '->'
. $calling['function'] . ' - '
. $text;

@@ -1,304 +1,4 @@
<?php
function displayBridgeCard($bridgeName, $formats, $isActive = true){
$getHelperButtonsFormat = function($formats){
$buttons = '';
foreach($formats as $name) {
$buttons .= '<button type="submit" name="format" value="'
. $name
. '">'
. $name
. '</button>'
. PHP_EOL;
}
return $buttons;
};
$getFormHeader = function($bridgeName){
return <<<EOD
<form method="GET" action="?">
<input type="hidden" name="action" value="display" />
<input type="hidden" name="bridge" value="{$bridgeName}" />
EOD;
};
$bridge = Bridge::create($bridgeName);
if($bridge == false)
return '';
$HTTPSWarning = '';
if(strpos($bridge->getURI(), 'https') !== 0) {
$HTTPSWarning = '<div class="secure-warning">Warning :
This bridge is not fetching its content through a secure connection</div>';
}
$name = '<a href="' . $bridge->getURI() . '">' . $bridge->getName() . '</a>';
$description = $bridge->getDescription();
$card = <<<CARD
<section id="bridge-{$bridgeName}" data-ref="{$bridgeName}">
<h2>{$name}</h2>
<p class="description">
{$description}
</p>
<input type="checkbox" class="showmore-box" id="showmore-{$bridgeName}" />
<label class="showmore" for="showmore-{$bridgeName}">Show more</label>
CARD;
// If we don't have any parameter for the bridge, we print a generic form to load it.
if(count($bridge->getParameters()) == 0) {
$card .= $getFormHeader($bridgeName);
$card .= $HTTPSWarning;
if($isActive) {
if(defined('PROXY_URL') && PROXY_BYBRIDGE) {
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode('proxyoff')
. '-'
. urlencode('_noproxy');
$card .= '<input id="'
. $idArg
. '" type="checkbox" name="_noproxy" />'
. PHP_EOL;
$card .= '<label for="'
. $idArg
. '">Disable proxy ('
. ((defined('PROXY_NAME') && PROXY_NAME) ? PROXY_NAME : PROXY_URL)
. ')</label><br />'
. PHP_EOL;
} if(CUSTOM_CACHE_TIMEOUT) {
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode('_cache_timeout');
$card .= '<label for="'
. $idArg
. '">Cache timeout in seconds : </label>'
. PHP_EOL;
$card .= '<input id="'
. $idArg
. '" type="number" value="'
. $bridge->getCacheTimeout()
. '" name="_cache_timeout" /><br />'
. PHP_EOL;
}
$card .= $getHelperButtonsFormat($formats);
} else {
$card .= '<span style="font-weight: bold;">Inactive</span>';
}
$card .= '</form>' . PHP_EOL;
}
$hasGlobalParameter = array_key_exists('global', $bridge->getParameters());
if($hasGlobalParameter) {
$globalParameters = $bridge->getParameters()['global'];
}
foreach($bridge->getParameters() as $parameterName => $parameter) {
if(!is_numeric($parameterName) && $parameterName == 'global')
continue;
if($hasGlobalParameter)
$parameter = array_merge($parameter, $globalParameters);
if(!is_numeric($parameterName))
$card .= '<h5>' . $parameterName . '</h5>' . PHP_EOL;
$card .= $getFormHeader($bridgeName);
$card .= $HTTPSWarning;
foreach($parameter as $id => $inputEntry) {
$additionalInfoString = '';
if(isset($inputEntry['required']) && $inputEntry['required'] === true)
$additionalInfoString .= ' required';
if(isset($inputEntry['pattern']))
$additionalInfoString .= ' pattern="' . $inputEntry['pattern'] . '"';
if(isset($inputEntry['title']))
$additionalInfoString .= ' title="' . $inputEntry['title'] . '"';
if(!isset($inputEntry['exampleValue']))
$inputEntry['exampleValue'] = '';
if(!isset($inputEntry['defaultValue']))
$inputEntry['defaultValue'] = '';
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode($parameterName)
. '-'
. urlencode($id);
$card .= '<label for="'
. $idArg
. '">'
. $inputEntry['name']
. ' : </label>'
. PHP_EOL;
if(!isset($inputEntry['type']) || $inputEntry['type'] == 'text') {
$card .= '<input '
. $additionalInfoString
. ' id="'
. $idArg
. '" type="text" value="'
. $inputEntry['defaultValue']
. '" placeholder="'
. $inputEntry['exampleValue']
. '" name="'
. $id
. '" /><br />'
. PHP_EOL;
} elseif($inputEntry['type'] == 'number') {
$card .= '<input '
. $additionalInfoString
. ' id="'
. $idArg
. '" type="number" value="'
. $inputEntry['defaultValue']
. '" placeholder="'
. $inputEntry['exampleValue']
. '" name="'
. $id
. '" /><br />'
. PHP_EOL;
} else if($inputEntry['type'] == 'list') {
$card .= '<select '
. $additionalInfoString
. ' id="'
. $idArg
. '" name="'
. $id
. '" >';
foreach($inputEntry['values'] as $name => $value) {
if(is_array($value)) {
$card .= '<optgroup label="' . htmlentities($name) . '">';
foreach($value as $subname => $subvalue) {
if($inputEntry['defaultValue'] === $subname
|| $inputEntry['defaultValue'] === $subvalue) {
$card .= '<option value="'
. $subvalue
. '" selected>'
. $subname
. '</option>';
} else {
$card .= '<option value="'
. $subvalue
. '">'
. $subname
. '</option>';
}
}
$card .= '</optgroup>';
} else {
if($inputEntry['defaultValue'] === $name
|| $inputEntry['defaultValue'] === $value) {
$card .= '<option value="'
. $value
. '" selected>'
. $name
. '</option>';
} else {
$card .= '<option value="'
. $value
. '">'
. $name
. '</option>';
}
}
}
$card .= '</select><br >';
} elseif($inputEntry['type'] == 'checkbox') {
if($inputEntry['defaultValue'] === 'checked')
$card .= '<input '
. $additionalInfoString
. ' id="'
. $idArg
. '" type="checkbox" name="'
. $id
. '" checked /><br />'
. PHP_EOL;
else
$card .= '<input '
. $additionalInfoString
. ' id="'
. $idArg
. '" type="checkbox" name="'
. $id
. '" /><br />'
. PHP_EOL;
}
}
if($isActive) {
if(defined('PROXY_URL') && PROXY_BYBRIDGE) {
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode('proxyoff')
. '-'
. urlencode('_noproxy');
$card .= '<input id="'
. $idArg
. '" type="checkbox" name="_noproxy" />'
. PHP_EOL;
$card .= '<label for="'
. $idArg
. '">Disable proxy ('
. ((defined('PROXY_NAME') && PROXY_NAME) ? PROXY_NAME : PROXY_URL)
. ')</label><br />'
. PHP_EOL;
} if(CUSTOM_CACHE_TIMEOUT) {
$idArg = 'arg-'
. urlencode($bridgeName)
. '-'
. urlencode('_cache_timeout');
$card .= '<label for="'
. $idArg
. '">Cache timeout in seconds : </label>'
. PHP_EOL;
$card .= '<input id="'
. $idArg
. '" type="number" value="'
. $bridge->getCacheTimeout()
. '" name="_cache_timeout" /><br />'
. PHP_EOL;
}
$card .= $getHelperButtonsFormat($formats);
} else {
$card .= '<span style="font-weight: bold;">Inactive</span>';
}
$card .= '</form>' . PHP_EOL;
}
$card .= '<label class="showless" for="showmore-' . $bridgeName . '">Show less</label>';
$card .= '<p class="maintainer">' . $bridge->getMaintainer() . '</p>';
$card .= '</section>';
return $card;
}
function sanitize($textToSanitize,
$removedTags = array('script', 'iframe', 'input', 'form'),
$keptAttributes = array('title', 'href', 'src'),
@@ -340,21 +40,137 @@ function backgroundToImg($htmlContent) {
}
/**
* Convert relative links in HTML into absolute links
* @param $content HTML content to fix. Supports HTML objects or string objects
* @param $server full URL to the page containing relative links
* @return content with fixed URLs, as HTML object or string depending on input type
*/
function defaultLinkTo($content, $server){
$string_convert = false;
if (is_string($content)) {
$string_convert = true;
$content = str_get_html($content);
}
foreach($content->find('img') as $image) {
if(strpos($image->src, 'http') === false
&& strpos($image->src, '//') === false
&& strpos($image->src, 'data:') === false)
$image->src = $server . $image->src;
$image->src = urljoin($server, $image->src);
}
foreach($content->find('a') as $anchor) {
if(strpos($anchor->href, 'http') === false
&& strpos($anchor->href, '//') === false
&& strpos($anchor->href, '#') !== 0
&& strpos($anchor->href, '?') !== 0)
$anchor->href = $server . $anchor->href;
$anchor->href = urljoin($server, $anchor->href);
}
if ($string_convert) {
$content = $content->outertext;
}
return $content;
}
/**
* Extract the first part of a string matching the specified start and end delimiters
* @param $string input string, e.g. '<div>Post author: John Doe</div>'
* @param $start start delimiter, e.g. 'author: '
* @param $end end delimiter, e.g. '<'
* @return extracted string, e.g. 'John Doe', or false if the delimiters were not found.
*/
function extractFromDelimiters($string, $start, $end) {
if (strpos($string, $start) !== false) {
$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
return $section_retrieved;
} return false;
}
/**
* Remove one or more part(s) of a string using start and end delimiters
* @param $string input string, e.g. 'foo<script>superscript()</script>bar'
* @param $start start delimiter, e.g. '<script'
* @param $end end delimiter, e.g. '</script>'
* @return cleaned string, e.g. 'foobar'
*/
function stripWithDelimiters($string, $start, $end) {
while(strpos($string, $start) !== false) {
$section_to_remove = substr($string, strpos($string, $start));
$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
$string = str_replace($section_to_remove, '', $string);
}
return $string;
}
/**
* Remove HTML sections containing one or more sections using the same HTML tag
* @param $string input string, e.g. 'foo<div class="ads"><div>ads</div>ads</div>bar'
* @param $tag_name name of the HTML tag, e.g. 'div'
* @param $tag_start start of the HTML tag to remove, e.g. '<div class="ads">'
* @return cleaned string, e.g. 'foobar'
*/
function stripRecursiveHTMLSection($string, $tag_name, $tag_start){
$open_tag = '<' . $tag_name;
$close_tag = '</' . $tag_name . '>';
$close_tag_length = strlen($close_tag);
if(strpos($tag_start, $open_tag) === 0) {
while(strpos($string, $tag_start) !== false) {
$max_recursion = 100;
$section_to_remove = null;
$section_start = strpos($string, $tag_start);
$search_offset = $section_start;
do {
$max_recursion--;
$section_end = strpos($string, $close_tag, $search_offset);
$search_offset = $section_end + $close_tag_length;
$section_to_remove = substr($string, $section_start, $section_end - $section_start + $close_tag_length);
$open_tag_count = substr_count($section_to_remove, $open_tag);
$close_tag_count = substr_count($section_to_remove, $close_tag);
} while ($open_tag_count > $close_tag_count && $max_recursion > 0);
$string = str_replace($section_to_remove, '', $string);
}
}
return $string;
}
/**
* Convert Markdown tags into HTML tags. Only a subset of the Markdown syntax is implemented.
* @param $string input string in Markdown format
* @return output string in HTML format
*/
function markdownToHtml($string) {
//For more details about how these regex work:
// https://github.com/RSS-Bridge/rss-bridge/pull/802#discussion_r216138702
// Images: https://regex101.com/r/JW9Evr/1
// Links: https://regex101.com/r/eRGVe7/1
// Bold: https://regex101.com/r/2p40Y0/1
// Italic: https://regex101.com/r/xJkET9/1
// Separator: https://regex101.com/r/ZBEqFP/1
// Plain URL: https://regex101.com/r/2JHYwb/1
// Site name: https://regex101.com/r/qIuKYE/1
$string = preg_replace('/\!\[([^\]]+)\]\(([^\) ]+)(?: [^\)]+)?\)/', '<img src="$2" alt="$1" />', $string);
$string = preg_replace('/\[([^\]]+)\]\(([^\)]+)\)/', '<a href="$2">$1</a>', $string);
$string = preg_replace('/\*\*(.*)\*\*/U', '<b>$1</b>', $string);
$string = preg_replace('/\*(.*)\*/U', '<i>$1</i>', $string);
$string = preg_replace('/__(.*)__/U', '<b>$1</b>', $string);
$string = preg_replace('/_(.*)_/U', '<i>$1</i>', $string);
$string = preg_replace('/[-]{6,99}/', '<hr />', $string);
$string = str_replace('&#10;', '<br />', $string);
$string = preg_replace('/([^"])(https?:\/\/[^ "<]+)([^"])/', '$1<a href="$2">$2</a>$3', $string.' ');
$string = preg_replace('/([^"\/])(www\.[^ "<]+)([^"])/', '$1<a href="http://$2">$2</a>$3', $string.' ');
//As the regex are not perfect, we need to fix <i> and </i> that are introduced in URLs
// Fixup regex <i>: https://regex101.com/r/NTRPf6/1
// Fixup regex </i>: https://regex101.com/r/aNklRp/1
$count = 1;
while($count > 0) {
$string = preg_replace('/ (src|href)="([^"]+)<i>([^"]+)"/U', ' $1="$2_$3"', $string, -1, $count);
}
$count = 1;
while($count > 0) {
$string = preg_replace('/ (src|href)="([^"]+)<\/i>([^"]+)"/U', ' $1="$2_$3"', $string, -1, $count);
}
return '<div>' . trim($string) . '</div>';
}
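
The docblock examples for the new delimiter helpers can be checked with a short standalone script. This is only a sketch: the two functions are copied from the diff above so the script runs by itself (a real bridge would call the copies in `html.php`), and the input that has no byline is an invented sample.

```php
<?php
// Copied from the diff above so this example is self-contained
function extractFromDelimiters($string, $start, $end) {
	if (strpos($string, $start) !== false) {
		$section_retrieved = substr($string, strpos($string, $start) + strlen($start));
		$section_retrieved = substr($section_retrieved, 0, strpos($section_retrieved, $end));
		return $section_retrieved;
	}
	return false;
}

// Copied from the diff above so this example is self-contained
function stripWithDelimiters($string, $start, $end) {
	while (strpos($string, $start) !== false) {
		$section_to_remove = substr($string, strpos($string, $start));
		$section_to_remove = substr($section_to_remove, 0, strpos($section_to_remove, $end) + strlen($end));
		$string = str_replace($section_to_remove, '', $string);
	}
	return $string;
}

// Inputs taken from the docblocks
$author = extractFromDelimiters('<div>Post author: John Doe</div>', 'author: ', '<');
$clean = stripWithDelimiters('foo<script>superscript()</script>bar', '<script', '</script>');

// Invented input without the start delimiter: the function reports false
$missing = extractFromDelimiters('<div>No byline here</div>', 'author: ', '<');

var_dump($author);  // string(8) "John Doe"
var_dump($clean);   // string(6) "foobar"
var_dump($missing); // bool(false)
```

Since `extractFromDelimiters` returns `false` when the start delimiter is absent, callers should compare the result with `=== false` rather than rely on truthiness, because an empty extracted string is also falsy.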

16
phpunit.xml Normal file
@@ -0,0 +1,16 @@
<phpunit
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://schema.phpunit.de/4.5/phpunit.xsd"
colors="true"
processIsolation="false"
timeoutForSmallTests="1"
timeoutForMediumTests="1"
timeoutForLargeTests="6" >
<testsuites>
<testsuite name="Standard test suite">
<file>tests/BridgeImplementationTest.php</file>
</testsuite>
</testsuites>
</phpunit>

@@ -1,119 +1,87 @@
html, body, div, span, applet, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, a, abbr, acronym, address, big, cite, code, del, dfn, em, img, ins, kbd, q, s, samp, small, strike, strong, sub, sup, tt, var, b, u, i, center, dl, dt, dd, ol, ul, li, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, tr, th, td, article, aside, canvas, details, figcaption, figure, footer, header, hgroup, menu, nav, section, summary, time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
vertical-align: baseline;
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section {
display: block;
article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section {
display: block;
}
/* Let's go for the actual style */
body {
background-color: #EEEEEE;
font-family: 'Noto Sans';
body {
background-color: #f0f0f0;
font-family: -apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol";
}
section {
a, a:link, a:visited {
color: #2196F3;
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
img {
max-width: 100%;
}
/* Section */
section {
background-color: #FFFFFF;
width: 90%;
width: 60%;
margin: 30px auto;
padding: 10px 15px;
box-shadow: 0px 1px 2px rgba(0,0,0, 0.25);
padding: 15px 15px;
text-align: center;
box-shadow: 0 6px 15px rgba(0, 0, 0, 0.09);
border-radius: 4px;
}
section > h2 {
section > h2 {
font-size: 200%;
font-weight: bold;
text-align: center;
}
h1.pagetitle {
h1.pagetitle {
margin: 40px 0 20px;
font-size: 300%;
font-weight: bold;
text-align: center;
color: #2196F3;
}
h1.pagetitle > a {
h1.pagetitle > a {
color: #2196F3;
}
a.backlink, a.backlink:link, a.backlink:visited, a.itemtitle, a.itemtitle:link, a.itemtitle:visited {
a.backlink, a.backlink:link, a.backlink:visited, a.itemtitle, a.itemtitle:link, a.itemtitle:visited {
color: #2196F3;
}
.buttons {
.buttons {
text-align: center;
}
section > div.content, section > div.categories,
section > div.content, section > div.attachments {
section > div.content, section > div.attachments {
padding: 10px;
}
section > div.categories > li.category,
section > div.attachments > li.enclosure {
section > div.attachments > li.enclosure {
list-style-type: circle;
list-style-position: inside;
}
section > time, section > p.author {
section > time, section > p.author {
color: #888;
font-size: 80%;
padding: 10px;
}
button.backbutton, button.rss-feed {
line-height: 1em;
button {
line-height: 1.9em;
color: #FFF;
font-weight: bold;
vertical-align: middle;
padding: 6px 12px;
margin: 12px auto 0px;
box-shadow: 0px 1px 2px rgba(0, 0, 0, 0.3);
border-radius: 2px;
border-radius: 4px;
border: 1px solid transparent;
width: 200px;
background: #2196F3 none repeat scroll 0% 0%;
cursor: pointer;
margin: 10px;
}
img {
max-width: 100%;
width: 200px;
}
button:hover {
background: #49afff;
}

@@ -3,20 +3,53 @@ function search() {
var searchTerm = document.getElementById('searchfield').value;
var searchableElements = document.getElementsByTagName('section');
var regexMatch = new RegExp(searchTerm, "i");
var regexMatch = new RegExp(searchTerm, 'i');
// Attempt to create anchor from search term (will default to 'localhost' on failure)
var searchTermUri = document.createElement('a');
searchTermUri.href = searchTerm;
if(searchTermUri.hostname == 'localhost') {
searchTermUri = null;
} else {
// Ignore "www."
if(searchTermUri.hostname.indexOf('www.') === 0) {
searchTermUri.hostname = searchTermUri.hostname.substr(4);
}
}
for(var i = 0; i < searchableElements.length; i++) {
var textValue = searchableElements[i].getAttribute('data-ref');
if(textValue != null) {
var anchors = searchableElements[i].getElementsByTagName('a');
if(textValue.match(regexMatch) == null && searchableElements[i].style.display != "none") {
if(anchors != null && anchors.length > 0) {
searchableElements[i].style.display = "none";
var uriValue = anchors[0]; // First anchor is bridge URI
} else if(textValue.match(regexMatch) != null) {
// Ignore "www."
if(uriValue.hostname.indexOf('www.') === 0) {
uriValue.hostname = uriValue.hostname.substr(4);
}
searchableElements[i].style.display = "block";
}
if(textValue != null || uriValue != null) {
if(textValue.match(regexMatch) != null ||
uriValue.hostname.match(regexMatch) ||
searchTermUri != null &&
uriValue.hostname != 'localhost' && (
uriValue.href.match(regexMatch) != null ||
uriValue.hostname == searchTermUri.hostname)) {
searchableElements[i].style.display = 'block';
} else {
searchableElements[i].style.display = 'none';
}

@@ -1,310 +1,217 @@
html, body, div, span, applet, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, a, abbr, acronym, address, big, cite, code, del, dfn, em, img, ins, kbd, q, s, samp, small, strike, strong, sub, sup, tt, var, b, u, i, center, dl, dt, dd, ol, ul, li, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, tr, th, td, article, aside, canvas, details, figcaption, figure, footer, header, hgroup, menu, nav, section, summary, time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section {
display: block;
article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section {
display: block;
}
/* Let's go for the actual style */
body {
background-color: #EEEEEE;
font-family: 'Noto Sans';
}
header {
text-shadow:0 5px 6px rgba(150,150,150,0.69);
text-align: center;
color: #1182DB;
}
header > h1 {
font-size: 300%;
}
header > h2 {
margin-left: 1em;
font-size: 120%;
}
header > p.status {
font-weight: bold;
margin: 1em;
color: red;
}
input[type="text"] {
background-color: white;
color: #404552;
border: 0px;
border-bottom: 2px solid #2196F3;
font-size: 1.1em;
margin-left: 8px;
padding-left: 4px;
}
.searchbar {
width: 50%;
margin: auto;
}
.searchbar input[type="text"] {
width: 100%;
margin: auto;
font-size: 1.4em;
text-align: center;
}
.searchbar input[type="text"]::placeholder {
text-align: center;
}
.searchbar input[type="text"]:focus::-webkit-input-placeholder {
opacity: 0;
}
.searchbar input[type="text"]:focus::-moz-placeholder {
opacity: 0;
}
.searchbar input[type="text"]:focus:-moz-placeholder {
opacity: 0;
}
.searchbar input[type="text"]:focus:-ms-input-placeholder {
opacity: 0;
}
.searchbar > h3 {
font-size: 150%;
font-weight: bold;
color: #1182DB;
}
section {
background-color: #FFFFFF;
width: 80%;
margin: 30px auto;
padding: 10px 15px;
text-align: center;
box-shadow: 0px 1px 2px rgba(0,0,0, 0.25);
}
section.footer {
opacity: 0.5;
}
section.footer:hover {
opacity: 1;
}
section.footer .version {
font-size: 80%;
}
section > h2 {
font-size: 200%;
font-weight: bold;
body {
background-color: #f0f0f0;
font-family: -apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol";
}
a, a:link, a:visited {
color: #2196F3;
text-decoration: none;
}
button {
a:hover {
text-decoration: underline;
}
/* Header */
header {
margin-top: 40px;
text-align: center;
color: #1182DB;
}
header > h1 {
font-size: 500%;
font-weight: bold;
}
header > h2 {
margin-left: 1em;
font-size: 200%;
}
header > section.warning {
width: 40%;
background-color: #ffc600;
color: #5f5f5f;
}
header > section.critical-warning {
width: 40%;
background-color: #cf3e3e;
font-weight: bold;
color: white;
}
/* Input boxes */
input[type="text"] {
background-color: white;
color: #404552;
border: 1px solid #dedede;
margin-left: 8px;
margin-bottom: 10px;
padding: 5px 10px;
}
input[type="text"]:focus {
outline: none;
border-color: #888;
}
.searchbar {
width: 40%;
margin: 40px auto 100px;
}
.searchbar input[type="text"] {
width: 90%;
margin: auto;
font-size: 1.1em;
text-align: center;
margin-bottom: 10px;
}
.searchbar input[type="text"]::placeholder {
text-align: center;
}
.searchbar input[type="text"]:focus::-webkit-input-placeholder { opacity: 0; }
.searchbar input[type="text"]:focus::-moz-placeholder { opacity: 0; }
.searchbar input[type="text"]:focus:-moz-placeholder { opacity: 0; }
.searchbar input[type="text"]:focus:-ms-input-placeholder { opacity: 0; }
.searchbar > h3 {
font-size: 200%;
font-weight: bold;
color: #1182DB;
margin-bottom: 10px;
}
/* Section */
section {
background-color: #FFFFFF;
width: 60%;
margin: 30px auto;
padding: 15px 15px;
text-align: center;
box-shadow: 0 6px 15px rgba(0, 0, 0, 0.09);
border-radius: 4px;
}
section.footer {
opacity: 0.5;
}
section.footer:hover {
opacity: 1;
}
section > h2 {
font-size: 200%;
font-weight: bold;
}
/* Buttons */
button {
line-height: 1.9em;
color: #FFF;
font-weight: bold;
vertical-align: middle;
padding: 6px 12px;
margin: 12px auto 0px;
box-shadow: 0px 1px 2px rgba(0, 0, 0, 0.3);
border-radius: 4px;
border: 1px solid transparent;
min-width: 140px;
background: #2196F3 none repeat scroll 0% 0%;
cursor: pointer;
width: calc(20% - 4px);
}
button.small {
width: auto;
line-height: 1.2em;
}
button:hover {
background: #49afff;
}
.description {
margin: 10px;
text-decoration: underline;
}
h5 {
margin: 20px;
font-weight: bold;
}
form {
margin-bottom: 6px;
}
.maintainer {
color: #888888;
font-size: 70%;
text-align: right;
}
.secure-warning {
background-color: #ffc600;
color: #5f5f5f;
box-shadow: 0px 1px 2px rgba(0, 0, 0, 0.3);
border-radius: 2px;
border: 1px solid transparent;
width: 80%;
margin: auto;
margin-bottom: 6px;
}
form {
display: none;
}
select {
padding: 5px 10px;
margin-left: 8px;
}
h5 {
display: none;
}
/* Show more / less */
.showmore-box {
display: none;
}
.showmore, .showless {
color: #888888;
cursor: pointer;
}
.showmore:hover, .showless:hover {
color: #000;
cursor: pointer;
}
.showmore-box:checked ~ .showmore {
display: none;
}
.showmore-box:not(:checked) ~ .showless {
display: none;
}
.showmore-box:checked ~ form, .showmore-box:checked ~ h5 {
display: block;
}
/* Additional styles for error pages */
.exception-message {
background-color: #c00000;
color: #FFFFFF;
font-weight: bold;
box-shadow: 0px 1px 2px rgba(0, 0, 0, 0.3);
border-radius: 2px;
border: 1px solid transparent;
width: 80%;
margin: auto;
margin-bottom: 6px;
}
.advice {
margin-left: auto;
margin-right: auto;
display: table;
}
.advice > li {
text-align: left;
}


@@ -0,0 +1,191 @@
<?php
use PHPUnit\Framework\TestCase;
use PHPUnit\Framework\TestResult;
use PHPUnit\Framework\AssertionFailedError;
require_once(__DIR__ . '/../lib/RssBridge.php');
Bridge::setDir(__DIR__ . '/../bridges/');
/**
* This class checks bridges for implementation details:
*
* - A bridge must not implement public functions other than the ones specified
* by the bridge interfaces. Custom functions must be defined in private or
* protected scope.
* - getName() must return a valid string (non-empty)
* - getURI() must return a valid URI
* - A bridge must define constants for NAME, URI, DESCRIPTION and MAINTAINER,
* CACHE_TIMEOUT and PARAMETERS are optional
*/
final class BridgeImplementationTest extends TestCase {
private function CheckBridgePublicFunctions($bridgeName){
$parent_methods = array();
if(in_array('BridgeInterface', class_parents($bridgeName))) {
$parent_methods = array_merge($parent_methods, get_class_methods('BridgeInterface'));
}
if(in_array('BridgeAbstract', class_parents($bridgeName))) {
$parent_methods = array_merge($parent_methods, get_class_methods('BridgeAbstract'));
}
if(in_array('FeedExpander', class_parents($bridgeName))) {
$parent_methods = array_merge($parent_methods, get_class_methods('FeedExpander'));
}
// Collect the public methods the bridge adds beyond its parent classes
$methods = array_diff(get_class_methods($bridgeName), $parent_methods);
$method_names = implode(', ', $methods);
$errmsg = $bridgeName
. ' implements additional public method(s): '
. $method_names
. '! Custom functions must be defined in private or protected scope!';
$this->assertEmpty($method_names, $errmsg);
}
// Checks whether the getName function returns an empty value
private function CheckBridgeGetNameDefaultValue($bridgeName){
if(in_array('BridgeAbstract', class_parents($bridgeName))) { // Is bridge
if(!$this->isFunctionMemberOf($bridgeName, 'getName'))
return;
$bridge = new $bridgeName();
$abstract = new BridgeAbstractTest();
$message = $bridgeName . ': \'getName\' must return a valid name!';
$this->assertNotEmpty(trim($bridge->getName()), $message);
}
}
// Checks whether the getURI function returns empty or default values
private function CheckBridgeGetURIDefaultValue($bridgeName){
if(in_array('BridgeAbstract', class_parents($bridgeName))) { // Is bridge
if(!$this->isFunctionMemberOf($bridgeName, 'getURI'))
return;
$bridge = new $bridgeName();
$abstract = new BridgeAbstractTest();
$message = $bridgeName . ': \'getURI\' must return a valid URI!';
$this->assertNotEmpty(trim($bridge->getURI()), $message);
}
}
private function CheckBridgePublicConstants($bridgeName){
// Assertion only works for BridgeAbstract!
if(in_array('BridgeAbstract', class_parents($bridgeName))) {
$ref = new ReflectionClass($bridgeName);
$constants = $ref->getConstants();
$ref = new ReflectionClass('BridgeAbstract');
$parent_constants = $ref->getConstants();
foreach($parent_constants as $key => $value) {
$this->assertArrayHasKey($key, $constants, 'Constant ' . $key . ' missing in ' . $bridgeName);
// Skip optional constants
if($key !== 'PARAMETERS' && $key !== 'CACHE_TIMEOUT') {
$this->assertNotEquals($value, $constants[$key], 'Constant ' . $key . ' still has its default value in ' . $bridgeName);
}
}
}
}
private function isFunctionMemberOf($bridgeName, $functionName){
$bridgeReflector = new ReflectionClass($bridgeName);
$bridgeMethods = $bridgeReflector->getMethods();
foreach($bridgeMethods as $method) {
if($method->name === $functionName && $method->class === $bridgeReflector->name) {
return true;
}
}
return false;
}
public function testBridgeImplementation($bridgeName){
require_once('bridges/' . $bridgeName . '.php');
$this->CheckBridgePublicFunctions($bridgeName);
$this->CheckBridgePublicConstants($bridgeName);
$this->CheckBridgeGetNameDefaultValue($bridgeName);
$this->CheckBridgeGetURIDefaultValue($bridgeName);
}
public function count() {
return count(Bridge::listBridges());
}
public function run(TestResult $result = null) {
if ($result === null) {
$result = new TestResult;
}
foreach (Bridge::listBridges() as $bridge) {
$bridge .= 'Bridge';
$result->startTest($this);
PHP_Timer::start();
$stopTime = null;
try {
$this->testBridgeImplementation($bridge);
} catch (AssertionFailedError $e) {
$stopTime = PHP_Timer::stop();
$result->addFailure($this, $e, $stopTime);
} catch (Exception $e) {
$stopTime = PHP_Timer::stop();
$result->addError($this, $e, $stopTime);
}
if ($stopTime === null) {
$stopTime = PHP_Timer::stop();
}
$result->endTest($this, $stopTime);
}
return $result;
}
}
class BridgeAbstractTest extends BridgeAbstract {
public function collectData(){}
}

vendor/php-urljoin/src/urljoin.php vendored Normal file

@@ -0,0 +1,131 @@
<?php
/*
A spiritual port of Python's urlparse.urljoin() function to PHP. Why this isn't in the standard library is anyone's guess.
Author: fluffy, http://beesbuzz.biz/
Latest version at: https://github.com/plaidfluff/php-urljoin
*/
function urljoin($base, $rel) {
if (!$base) {
return $rel;
}
if (!$rel) {
return $base;
}
$uses_relative = array('', 'ftp', 'http', 'gopher', 'nntp', 'imap',
'wais', 'file', 'https', 'shttp', 'mms',
'prospero', 'rtsp', 'rtspu', 'sftp',
'svn', 'svn+ssh', 'ws', 'wss');
$pbase = parse_url($base);
$prel = parse_url($rel);
if (array_key_exists('path', $pbase) && $pbase['path'] === '/') {
unset($pbase['path']);
}
if (isset($prel['scheme'])) {
// A base URL without a scheme cannot match, so return the absolute $rel as-is
if (!isset($pbase['scheme']) || $prel['scheme'] != $pbase['scheme'] || !in_array($prel['scheme'], $uses_relative)) {
return $rel;
}
}
$merged = array_merge($pbase, $prel);
// Handle relative paths:
// 'path/to/file.ext'
// './path/to/file.ext'
if (array_key_exists('path', $prel) && substr($prel['path'], 0, 1) != '/') {
// Normalize: './path/to/file.ext' => 'path/to/file.ext'
if (substr($prel['path'], 0, 2) === './') {
$prel['path'] = substr($prel['path'], 2);
}
if (array_key_exists('path', $pbase)) {
$dir = preg_replace('@/[^/]*$@', '', $pbase['path']);
$merged['path'] = $dir . '/' . $prel['path'];
} else {
$merged['path'] = '/' . $prel['path'];
}
}
if(array_key_exists('path', $merged)) {
// Get the path components, and remove the initial empty one
$pathParts = explode('/', $merged['path']);
array_shift($pathParts);
$path = [];
$prevPart = '';
foreach ($pathParts as $part) {
if ($part == '..' && count($path) > 0) {
// Cancel out the parent directory (if there's a parent to cancel)
$parent = array_pop($path);
// But if it was also a parent directory, leave it in
if ($parent == '..') {
array_push($path, $parent);
array_push($path, $part);
}
} else if ($prevPart != '' || ($part != '.' && $part != '')) {
// Don't include empty or current-directory components
if ($part == '.') {
$part = '';
}
array_push($path, $part);
}
$prevPart = $part;
}
$merged['path'] = '/' . implode('/', $path);
}
$ret = '';
if (isset($merged['scheme'])) {
$ret .= $merged['scheme'] . ':';
}
if (isset($merged['scheme']) || isset($merged['host'])) {
$ret .= '//';
}
if (isset($prel['host'])) {
$hostSource = $prel;
} else {
$hostSource = $pbase;
}
// username, password, and port are associated with the hostname, not merged
if (isset($hostSource['host'])) {
if (isset($hostSource['user'])) {
$ret .= $hostSource['user'];
if (isset($hostSource['pass'])) {
$ret .= ':' . $hostSource['pass'];
}
$ret .= '@';
}
$ret .= $hostSource['host'];
if (isset($hostSource['port'])) {
$ret .= ':' . $hostSource['port'];
}
}
if (isset($merged['path'])) {
$ret .= $merged['path'];
}
if (isset($prel['query'])) {
$ret .= '?' . $prel['query'];
}
if (isset($prel['fragment'])) {
$ret .= '#' . $prel['fragment'];
}
return $ret;
}