Last Modified: 12.09.2014

The sanitizer is a tool that analyzes HTML code entered by a user. Its main task is to prevent potentially dangerous code from being embedded or displayed in HTML.

The sanitizer comes in handy wherever a user can enter arbitrary HTML, for example in a visual editor or when pasting text from MS Word. In addition to checking the entered code, the sanitizer also partially monitors markup validity; in particular, it closes unclosed tags.
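
To make "closes unclosed tags" concrete, here is a minimal standalone sketch of the idea. It is not the sanitizer's own code: the function name closeOpenTags() and the short list of void elements are assumptions made for illustration, and the sanitizer's actual balancing logic may differ.

```php
<?php
/**
 * Appends closing tags for elements that are still open at the end of the
 * input. Hypothetical helper for illustration only, not part of CBXSanitizer.
 */
function closeOpenTags(string $html): string
{
    $void = array('br', 'hr', 'img'); // elements that never take a closing tag
    $open = array();                  // stack of currently open tag names

    preg_match_all('~<(/?)([a-z][a-z0-9]*)[^<>]*>~i', $html, $tags, PREG_SET_ORDER);
    foreach ($tags as $t) {
        $name = strtolower($t[2]);
        if (in_array($name, $void, true)) {
            continue;
        }
        if ($t[1] === '') {
            $open[] = $name; // opening tag: push onto the stack
        } elseif (in_array($name, $open, true)) {
            // Closing tag: pop the stack back to the matching opener.
            while (array_pop($open) !== $name) {
            }
        }
    }

    // Close whatever is still open, innermost element first.
    while ($open) {
        $html .= '</' . array_pop($open) . '>';
    }
    return $html;
}

echo closeOpenTags('<p>Unfinished <b>bold<br>'), PHP_EOL;
// prints: <p>Unfinished <b>bold<br></b></p>
```

The sketch only appends closers at the end of the text; it does not repair tags that are closed in the wrong order.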

How to Filter Text

To strip undesirable HTML tags from user-entered text with the sanitizer, proceed as follows:

$Sanitizer = new CBXSanitizer;

// Whitelist: tag name => list of attributes allowed on that tag
$Sanitizer->AddTags(array(
    'a'  => array('href', 'id', 'style', 'alt' /* ... */),
    'br' => array(),
    // ...
));

$pureHtml = $Sanitizer->SanitizeHtml($html);

The sanitizer filters out all tags and attributes that are not on the whitelist built by the AddTags() method.

The sanitizer includes three preset filtration levels:
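
The whitelist principle can be illustrated with a small standalone sketch. This is not how CBXSanitizer is implemented: the function filterByWhiteList() is a hypothetical name, the sketch assumes attributes are written with double quotes, and it chooses to escape rejected markup so that it is displayed as text.

```php
<?php
/**
 * Illustrative whitelist filter; a teaching sketch, not CBXSanitizer itself.
 *
 * @param string $html      Untrusted markup.
 * @param array  $whiteList Tag name => list of allowed attribute names.
 */
function filterByWhiteList(string $html, array $whiteList): string
{
    // Either a well-formed tag with double-quoted attributes, or a stray angle bracket.
    $pattern = '~<(/?)([a-z][a-z0-9]*)((?:\s+[a-z-]+="[^"<>]*")*)\s*/?>|[<>]~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($whiteList): string {
            $tag = isset($m[2]) ? strtolower($m[2]) : '';

            // Stray brackets and tags outside the whitelist are escaped to plain text.
            if ($tag === '' || !array_key_exists($tag, $whiteList)) {
                return htmlspecialchars($m[0], ENT_QUOTES);
            }
            if ($m[1] === '/') {
                return '</' . $tag . '>';
            }

            // Rebuild the opening tag from whitelisted attributes only.
            $kept = '';
            preg_match_all('~([a-z-]+)="([^"<>]*)"~i', $m[3] ?? '', $attrs, PREG_SET_ORDER);
            foreach ($attrs as $attr) {
                $name = strtolower($attr[1]);
                if (in_array($name, $whiteList[$tag], true)) {
                    $kept .= ' ' . $name . '="' . $attr[2] . '"';
                }
            }
            return '<' . $tag . $kept . '>';
        },
        $html
    );
}

$whiteList = array(
    'a'  => array('href', 'title'),
    'br' => array(),
);
$dirty = '<a href="/docs" onclick="steal()">docs</a><br><script>alert(1)</script>';
$clean = filterByWhiteList($dirty, $whiteList);
echo $clean, PHP_EOL;
// prints: <a href="/docs">docs</a><br>&lt;script&gt;alert(1)&lt;/script&gt;
```

The sketch does not inspect attribute values, so a whitelisted href could still carry a javascript: URL. For real input, rely on the sanitizer rather than on a regular expression like this one.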

SECURE_LEVEL_HIGH (high level) includes the following list:

$arTags = array(
    'b'          => array(),
    'br'         => array(),
    'big'        => array(),
    'blockquote' => array(),
    'code'       => array(),
    'del'        => array(),
    'dt'         => array(),
    'dd'         => array(),
    'font'       => array(),
    'h1'         => array(),
    'h2'         => array(),
    'h3'         => array(),
    'h4'         => array(),
    'h5'         => array(),
    'h6'         => array(),
    'hr'         => array(),
    'i'          => array(),
    'ins'        => array(),
    'li'         => array(),
    'ol'         => array(),
    'p'          => array(),
    'small'      => array(),
    's'          => array(),
    'sub'        => array(),
    'sup'        => array(),
    'strong'     => array(),
    'pre'        => array(),
    'u'          => array(),
    'ul'         => array()
);

SECURE_LEVEL_MIDDLE (middle level) includes:

$arTags = array(
    'a'          => array('href', 'title', 'name', 'alt'),
    'b'          => array(),
    'br'         => array(),
    'big'        => array(),
    'blockquote' => array('title'),
    'code'       => array(),
    'caption'    => array(),
    'del'        => array('title'),
    'dt'         => array(),
    'dd'         => array(),
    'font'       => array('color', 'size'),
    'color'      => array(),
    'h1'         => array(),
    'h2'         => array(),
    'h3'         => array(),
    'h4'         => array(),
    'h5'         => array(),
    'h6'         => array(),
    'hr'         => array(),
    'i'          => array(),
    'img'        => array('src', 'alt', 'height', 'width', 'title'),
    'ins'        => array('title'),
    'li'         => array(),
    'ol'         => array(),
    'p'          => array(),
    'pre'        => array(),
    's'          => array(),
    'small'      => array(),
    'strong'     => array(),
    'sub'        => array(),
    'sup'        => array(),
    'table'      => array('border', 'width'),
    'tbody'      => array('align', 'valign'),
    'td'         => array('width', 'height', 'align', 'valign'),
    'tfoot'      => array('align', 'valign'),
    'th'         => array('width', 'height'),
    'thead'      => array('align', 'valign'),
    'tr'         => array('align', 'valign'),
    'u'          => array(),
    'ul'         => array()
);

SECURE_LEVEL_LOW (low level) includes:

$arTags = array(
    'a'          => array('href', 'title', 'name', 'style', 'id', 'class', 'shape', 'coords', 'alt', 'target'),
    'b'          => array('style', 'id', 'class'),
    'br'         => array('style', 'id', 'class'),
    'big'        => array('style', 'id', 'class'),
    'blockquote' => array('title', 'style', 'id', 'class'),
    'caption'    => array('style', 'id', 'class'),
    'code'       => array('style', 'id', 'class'),
    'del'        => array('title', 'style', 'id', 'class'),
    'div'        => array('title', 'style', 'id', 'class', 'align'),
    'dt'         => array('style', 'id', 'class'),
    'dd'         => array('style', 'id', 'class'),
    'font'       => array('color', 'size', 'face', 'style', 'id', 'class'),
    'h1'         => array('style', 'id', 'class', 'align'),
    'h2'         => array('style', 'id', 'class', 'align'),
    'h3'         => array('style', 'id', 'class', 'align'),
    'h4'         => array('style', 'id', 'class', 'align'),
    'h5'         => array('style', 'id', 'class', 'align'),
    'h6'         => array('style', 'id', 'class', 'align'),
    'hr'         => array('style', 'id', 'class'),
    'i'          => array('style', 'id', 'class'),
    'img'        => array('src', 'alt', 'height', 'width', 'title'),
    'ins'        => array('title', 'style', 'id', 'class'),
    'li'         => array('style', 'id', 'class'),
    'map'        => array('shape', 'coords', 'href', 'alt', 'title', 'style', 'id', 'class', 'name'),
    'ol'         => array('style', 'id', 'class'),
    'p'          => array('style', 'id', 'class', 'align'),
    'pre'        => array('style', 'id', 'class'),
    's'          => array('style', 'id', 'class'),
    'small'      => array('style', 'id', 'class'),
    'strong'     => array('style', 'id', 'class'),
    'span'       => array('title', 'style', 'id', 'class', 'align'),
    'sub'        => array('style', 'id', 'class'),
    'sup'        => array('style', 'id', 'class'),
    'table'      => array('border', 'width', 'style', 'id', 'class', 'cellspacing', 'cellpadding'),
    'tbody'      => array('align', 'valign', 'style', 'id', 'class'),
    'td'         => array('width', 'height', 'style', 'id', 'class', 'align', 'valign', 'colspan', 'rowspan'),
    'tfoot'      => array('align', 'valign', 'style', 'id', 'class', 'align', 'valign'),
    'th'         => array('width', 'height', 'style', 'id', 'class', 'colspan', 'rowspan'),
    'thead'      => array('align', 'valign', 'style', 'id', 'class'),
    'tr'         => array('align', 'valign', 'style', 'id', 'class'),
    'u'          => array('style', 'id', 'class'),
    'ul'         => array('style', 'id', 'class')
);

To use the sanitizer with one of the preset levels, set the level before filtering:

$Sanitizer = new CBXSanitizer;

$Sanitizer->SetLevel(CBXSanitizer::SECURE_LEVEL_MIDDLE);

$pureHtml = $Sanitizer->SanitizeHtml($html);

The methods of the CBXSanitizer class are used to work with the sanitizer.




Courses developed by Bitrix24