From 54f615f1d3e2e82f8581a9c33e8cc9ce559f9089 Mon Sep 17 00:00:00 2001
From: "Edward Z. Yang"

+Embedding YouTube Videos
+
+Clients like their YouTube videos. It gives them a warm fuzzy feeling when
+they see a neat little embedded video player on their websites that can play
+the latest clips from their documentary "Fido and the Bones of Spring".
+All joking aside, the ability to embed YouTube videos or other active
+content in their pages is something that a lot of people like.
+
+This is a bad idea. The moment you embed anything untrusted,
+you will definitely be slammed by all manner of nasties that can be
+embedded in things from your run-of-the-mill Flash movie to
+Quicktime movies. Even img tags, which HTML Purifier allows by default,
+can be dangerous. Be distrustful of anything that tells a browser to load
+content from another website automatically.
+
+Luckily for us, however, whitelisting saves the day. Sure, letting users
+include any old random Flash file could be dangerous, but if it's
+from a specific website, it probably is okay. If no amount of pleading will
+convince the people upstairs that they should settle for just linking
+to their movies, you may find this technique very useful.
+
+Sample
+
+Below is custom code that allows users to embed
+YouTube videos. This is not favoritism: this trick can easily be adapted for
+other forms of embeddable content.
+
+Usually, websites like YouTube give us boilerplate code that you can insert
+into your documents. YouTube's code goes like this:
+
+<object width="425" height="350">
+  <param name="movie" value="http://www.youtube.com/v/AyPzM5WK8ys" />
+  <param name="wmode" value="transparent" />
+  <embed src="http://www.youtube.com/v/AyPzM5WK8ys"
+         type="application/x-shockwave-flash"
+         wmode="transparent" width="425" height="350" />
+</object>
+
+There are two things to note about this code:
+
+1. <embed> is not recognized by W3C, so if you want
+   standards-compliant code, you'll have to get rid of it.
+2. The code is the same for every video except for the identifier
+   (AyPzM5WK8ys in this sample), which selects the movie.
+
+What point 2 means is that if we have code like
+<span class="youtube-embed">AyPzM5WK8ys</span>, your
+application can reconstruct the full object from this small snippet that
+passes through HTML Purifier unharmed.
+
+<?php
+
+class HTMLPurifierX_PreserveYouTube extends HTMLPurifier
+{
+    function purify($html, $config = null) {
+        // Swap each YouTube <object> for a harmless placeholder span.
+        // The "s" modifier lets "." match newlines in multi-line embed code.
+        $pre_regex = '#<object[^>]+>.+?'.
+            'http://www\.youtube\.com/v/([A-Za-z0-9\-_]+).+?</object>#s';
+        $pre_replace = '<span class="youtube-embed">\1</span>';
+        $html = preg_replace($pre_regex, $pre_replace, $html);
+        $html = parent::purify($html, $config);
+        // Rebuild the player; only video ID characters may match here.
+        $post_regex = '#<span class="youtube-embed">([A-Za-z0-9\-_]+)</span>#';
+        $post_replace = '<object width="425" height="350" '.
+            'data="http://www.youtube.com/v/\1">'.
+            '<param name="movie" value="http://www.youtube.com/v/\1"></param>'.
+            '<param name="wmode" value="transparent"></param>'.
+            '<!--[if IE]>'.
+            '<embed src="http://www.youtube.com/v/\1" '.
+            'type="application/x-shockwave-flash" '.
+            'wmode="transparent" width="425" height="350" />'.
+            '<![endif]-->'.
+            '</object>';
+        $html = preg_replace($post_regex, $post_replace, $html);
+        return $html;
+    }
+}
+
+$purifier = new HTMLPurifierX_PreserveYouTube();
+$html_still_with_youtube = $purifier->purify($html_with_youtube);
+
+?>
+
+There is a bit going on here, so let's explain.
+
+1. The class name starts with HTMLPurifierX because it's
+   userspace code. Don't use HTMLPurifier in front of your
+   class, since it might clobber another class in the library.
+2. To use it, change new HTMLPurifier to new
+   HTMLPurifierX_PreserveYouTube. There are other ways to go about
+   doing this: if you were calling a function that wrapped HTML Purifier,
+   you could paste the PHP right there. If you wanted to be really
+   fancy, you could make a decorator for HTMLPurifier.
+
+Warning
+
+There are a number of possible problems with the code above, depending
+on how you look at it.
+
+Cannot change width and height
+
+The width and height of the final YouTube movie cannot be adjusted. This
+is because I am lazy. If you really insist on letting users change the size
+of the movie, what you need to do is package up the attributes inside the
+span tag (along with the movie ID). It gets complicated though: a malicious
+user can specify an outrageously large height and width and attempt to crash
+the user's operating system/browser. You need to cap it, either by limiting
+the number of digits allowed in the regex or by using a callback to check the
+number.
+
+Trusts media's host's security
+
+By allowing this code onto our website, we are trusting that YouTube has
+tech-savvy enough people not to allow their users to inject malicious
+code into the Flash files. An exploit on YouTube means an exploit on your
+site. Even though YouTube is run by the reputable Google, it doesn't
+mean they are invulnerable.
+You're putting a certain measure of the job on an external provider (just as
+you have by entrusting your user input to HTML Purifier), and
+it is important that you are cognizant of the risk.
+
+Poorly written adaptations compromise security
+
+This should go without saying, but if you're going to adapt this code
+for Google Video or the like, make sure you do it right. It's
+extremely easy to allow a character too many in the final section and
+suddenly you're introducing XSS into HTML Purifier's XSS-free output. HTML
+Purifier may be well written, but it cannot guard against vulnerabilities
+introduced after it has finished.
+
+Future plans
+
+It would probably be a good idea if this code was added to the core
+library. Look out for the inclusion of this into the core as a decorator
+or the like.
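The text mentions a decorator for HTMLPurifier as a fancier alternative to subclassing. The following is a minimal sketch of that idea, not the library's own implementation. The names YouTubePreservingDecorator and PassThroughPurifier are hypothetical, and the stand-in purifier exists only so the example runs without the library; in practice you would pass in a real HTMLPurifier instance. Hyphen and underscore are allowed in the ID on the assumption that video IDs may contain them.

```php
<?php
// Hypothetical decorator: wraps any object exposing purify($html, $config)
// instead of extending HTMLPurifier directly.
class YouTubePreservingDecorator
{
    private $inner;

    public function __construct($inner)
    {
        $this->inner = $inner;
    }

    public function purify($html, $config = null)
    {
        $id = '[A-Za-z0-9\-_]+';
        // Step 1: replace each YouTube <object> with a placeholder span.
        $html = preg_replace(
            '#<object[^>]+>.+?http://www\.youtube\.com/v/(' . $id . ').+?</object>#s',
            '<span class="youtube-embed">\1</span>',
            $html
        );
        // Step 2: let the wrapped purifier do its normal work.
        $html = $this->inner->purify($html, $config);
        // Step 3: rebuild a player from each surviving placeholder.
        return preg_replace(
            '#<span class="youtube-embed">(' . $id . ')</span>#',
            '<object width="425" height="350" data="http://www.youtube.com/v/\1">'
            . '<param name="movie" value="http://www.youtube.com/v/\1"></param>'
            . '</object>',
            $html
        );
    }
}

// Stand-in for demonstration only (an assumption, not HTML Purifier itself):
// it strips every tag except <span> and <p>.
class PassThroughPurifier
{
    public function purify($html, $config = null)
    {
        return strip_tags($html, '<span><p>');
    }
}
```

Because the decorator only relies on a purify() method, the calling code does not need to know which concrete purifier sits underneath.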
+
+
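The warning about width and height suggests capping the values either with a digit limit in the regex or with a callback. The sketch below combines both. The placeholder format WIDTHxHEIGHT:ID, the function names, and the MAX_WIDTH and MAX_HEIGHT limits are all assumptions made for illustration; none of them come from the original code.

```php
<?php
// Hypothetical limits; pick values that suit your layout.
const MAX_WIDTH = 640;
const MAX_HEIGHT = 480;

// Callback for preg_replace_callback():
// $m[1] = width, $m[2] = height, $m[3] = video ID.
function rebuild_youtube_object(array $m)
{
    // Clamp both dimensions into the range 1..MAX.
    $width  = max(1, min((int) $m[1], MAX_WIDTH));
    $height = max(1, min((int) $m[2], MAX_HEIGHT));
    $url    = 'http://www.youtube.com/v/' . $m[3];
    return '<object width="' . $width . '" height="' . $height . '" data="' . $url . '">'
         . '<param name="movie" value="' . $url . '"></param>'
         . '<param name="wmode" value="transparent"></param>'
         . '</object>';
}

// Assumed placeholder format (not from the original code): WIDTHxHEIGHT:ID.
// \d{1,4} limits the digits, so longer numbers never reach the callback
// and the placeholder is left untouched.
function restore_sized_youtube($html)
{
    return preg_replace_callback(
        '#<span class="youtube-embed">(\d{1,4})x(\d{1,4}):([A-Za-z0-9\-_]+)</span>#',
        'rebuild_youtube_object',
        $html
    );
}
```

The digit limit keeps absurd input out of the callback entirely, while the clamp handles values that are syntactically fine but still too large.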
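The warning about poorly written adaptations says that allowing "a character too many" in the final replacement reintroduces XSS. The sketch below illustrates the difference between a strict and a loose capture group. Both function names are hypothetical, restore_loose() is a deliberately unsafe counterexample, and the demonstration assumes that quote characters in text content survive purification.

```php
<?php
// Strict: only characters that can appear in a video ID are captured.
function restore_strict($html)
{
    return preg_replace(
        '#<span class="youtube-embed">([A-Za-z0-9\-_]+)</span>#',
        '<object data="http://www.youtube.com/v/\1"></object>',
        $html
    );
}

// Deliberately unsafe counterexample: "[^<]+" also accepts quotes and
// spaces, so the captured text can break out of the data attribute.
function restore_loose($html)
{
    return preg_replace(
        '#<span class="youtube-embed">([^<]+)</span>#',
        '<object data="http://www.youtube.com/v/\1"></object>',
        $html
    );
}

// A placeholder a malicious user might type directly into their input.
$payload = '<span class="youtube-embed">x" onmouseover="alert(1)</span>';

$safe   = restore_strict($payload); // placeholder is left as an inert span
$unsafe = restore_loose($payload);  // an event handler attribute is injected
```

When adapting the technique to another provider, the capture group in the final replacement is the part to review most carefully, since it runs after HTML Purifier has finished.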