From 56117f23e1f65581919b2acc547be589bbf8bb79 Mon Sep 17 00:00:00 2001
From: Goran Rakic
Date: Tue, 10 Jul 2012 23:23:52 +0200
Subject: [PATCH 1/2] Issue #62: Improve filtering section

---
 _posts/07-04-01-Data-Filtering.md | 37 +++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/_posts/07-04-01-Data-Filtering.md b/_posts/07-04-01-Data-Filtering.md
index 2741d49..9fe7bf5 100644
--- a/_posts/07-04-01-Data-Filtering.md
+++ b/_posts/07-04-01-Data-Filtering.md
@@ -4,9 +4,27 @@ isChild: true
 
 ## Data Filtering
 
-Never ever (ever) trust foreign input introduced to your PHP code. That leads to dangerous places. Instead, always sanitize and validate foreign input before trusting and using it in your code.
+Never ever (ever) trust foreign input introduced to your PHP code. Always sanitize and validate
+foreign input before using it in code.
 
-PHP provides the `filter_var` and `filter_input` functions to help you do this. These two functions can sanitize text and validate formats (e.g. email addresses).
+PHP functions `filter_var` and `filter_input` can sanitize text and validate text formats (e.g.
+email addresses).
+
+Foreign input comes in many different ways. HTML form data provided by the users is straight
+forward. But most of HTTP request data, data from foreign web services, both uploaded and downloaded
+files and much else are too. While foreign input may be stored, combined and accessed later, it is
+still foreign input. Every time you process, output, concatenate or include some data you should ask
+yourself if the data is filtered properly and can it be trusted.
+
+Filtering is tailored to the specific data usage. When including foreign input into the HTML page,
+one way to protect from Cross-Site Scripting (XSS) attack is to sanitize by removing all HTML tags
+in the input. But when using the same foreign input as a shell command argument removing HTML is
+pointless, and the built-in `escapeshellarg` function may be used for sanitization. Or input may be
+used as a concatenated filepath part, allowing only number or nothing, which can be done with
+validation.
+
+For performance, you can store filtered data and have it ready for usage next time. Just remember
+that data filtered for one kind of the output may not be sufficiently filtered for the other.
 
 * [Learn about data filtering][1]
 * [Learn about `filter_var`][4]
@@ -14,13 +32,23 @@ PHP provides the `filter_var` and `filter_input` functions to help you do this.
 
 ### Sanitization
 
-Sanitization removes (or escapes) illegal or unsafe characters from foreign input. For example, you should sanitize foreign input before including the input in HTML or inserting it into a raw SQL query. When you use bound parameters with [PDO](#databases), it will sanitize the input for you.
+Sanitization removes (or escapes) illegal or unsafe characters from foreign input.
+
+For example, you should sanitize foreign input before including the input in HTML or inserting it
+into a raw SQL query. When you use bound parameters with [PDO](#databases), it will
+sanitize the input for you.
+
+Sometimes it is required to allow some safe HTML tags in the input when including it in the HTML
+page. This is very hard to do and many avoid it by using other more restricted formattings like
+Markdown or BBCode, although whitelisting libraries like [HTML Purifier][html-purifier] exists for
+this reason.
 
 [See Sanitization Filters][2]
 
 ### Validation
 
-Validation ensures that foreign input is what you expect. For example, you may want to validate an email address, a phone number, or age when processing a registration submission.
+Validation ensures that foreign input is what you expect. For example, you may want to validate an
+email address, a phone number, or age when processing a registration submission.
 
 [See Validation Filters][3]
 
@@ -29,3 +57,4 @@ Validation ensures that foreign input is what you expect. For example, you may w
 [3]: http://www.php.net/manual/en/filter.filters.validate.php
 [4]: http://php.net/manual/en/function.filter-var.php
 [5]: http://www.php.net/manual/en/function.filter-input.php
+[html-purifier]: http://htmlpurifier.org/

From 18ffa45e23c5b47e8db74a04daabe0db91700900 Mon Sep 17 00:00:00 2001
From: Goran Rakic
Date: Thu, 12 Jul 2012 12:58:50 +0200
Subject: [PATCH 2/2] Rewrite 'many different ways'

---
 _posts/07-04-01-Data-Filtering.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/_posts/07-04-01-Data-Filtering.md b/_posts/07-04-01-Data-Filtering.md
index 9fe7bf5..d3ee2d1 100644
--- a/_posts/07-04-01-Data-Filtering.md
+++ b/_posts/07-04-01-Data-Filtering.md
@@ -10,11 +10,11 @@ foreign input before using it in code.
 PHP functions `filter_var` and `filter_input` can sanitize text and validate text formats (e.g.
 email addresses).
 
-Foreign input comes in many different ways. HTML form data provided by the users is straight
-forward. But most of HTTP request data, data from foreign web services, both uploaded and downloaded
-files and much else are too. While foreign input may be stored, combined and accessed later, it is
-still foreign input. Every time you process, output, concatenate or include some data you should ask
-yourself if the data is filtered properly and can it be trusted.
+Foreign input is not just the HTML form data submitted by the user. Most of HTTP request data, data
+from foreign web services, both uploaded and downloaded files and much else are foreign inputs too.
+While foreign input can be stored, combined and accessed later, it is still a foreign input. Every
+time you process, output, concatenate or include some data in your code you should ask yourself if
+the data is filtered properly and can it be trusted.
 
 Filtering is tailored to the specific data usage. When including foreign input into the HTML page,
 one way to protect from Cross-Site Scripting (XSS) attack is to sanitize by removing all HTML tags
@@ -39,7 +39,7 @@ into a raw SQL query. When you use bound parameters with [PDO](#databases), it w
 sanitize the input for you.
 
 Sometimes it is required to allow some safe HTML tags in the input when including it in the HTML
-page. This is very hard to do and many avoid it by using other more restricted formattings like
+page. This is very hard to do and many avoid it by using other more restricted formatting like
 Markdown or BBCode, although whitelisting libraries like [HTML Purifier][html-purifier] exists for
 this reason.
 
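
The point that filtering depends on how the data is used could be illustrated with a short example. The following is only a minimal sketch and is not part of the patch: the variable names and sample values are hypothetical, and it pairs `strip_tags` with `htmlspecialchars` as one possible way to remove HTML tags, which the section text does not prescribe.

```php
<?php
// Hypothetical foreign input; in a real application this might come from
// $_POST, $_GET, an uploaded file, or a remote web service.
$email   = 'user@example.com';
$comment = '<script>alert(1)</script>Hello';
$page    = '42';
$file    = 'report 2012.txt';

// Validation: filter_var() returns the filtered value on success, false on failure.
$validEmail = filter_var($email, FILTER_VALIDATE_EMAIL);

// Validation for a file path part: allow only an integer, or nothing at all.
$validPage = filter_var($page, FILTER_VALIDATE_INT);
$path = ($validPage === false) ? null : "pages/{$validPage}.html";

// Sanitization for HTML output: remove tags, then escape what is left.
// (Assumption: strip_tags + htmlspecialchars is this sketch's choice.)
$htmlSafe = htmlspecialchars(strip_tags($comment), ENT_QUOTES, 'UTF-8');

// Sanitization for a shell argument: removing HTML would be pointless here,
// so quote the value with the built-in escapeshellarg() instead.
$shellSafe = escapeshellarg($file);
```

The same `$comment` value would need different treatment if it were later passed to a shell command, which is why data filtered for one kind of output should not be assumed safe for another.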