The sanitizer is a tool that analyzes HTML code entered by a user. Its main task is to prevent potentially dangerous code from being embedded or displayed in HTML.
The sanitizer is useful wherever users can enter arbitrary HTML, for example in a visual editor or when pasting text from MS Word. Besides checking the entered code, the sanitizer also partially enforces markup validity: in particular, it closes unclosed tags.
How to Filter Text
If user-entered text containing HTML must be stripped of undesirable tags, use the sanitizer as follows:
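// Create the sanitizer and register the allowed tags with their allowed attributes.
$Sanitizer = new CBXSanitizer;
$Sanitizer->AddTags(array(
    'a' => array('href', 'id', 'style', 'alt' /* ... */),
    'br' => array(), // tag allowed, no attributes
    // ...
));
// $html holds the raw user input; the result keeps only whitelisted markup.
$pureHtml = $Sanitizer->SanitizeHtml($html);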
The sanitizer filters out every tag and attribute that is not on the whitelist built with AddTags().
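To make the whitelist idea concrete, here is a minimal, self-contained sketch of whitelist filtering built on PHP's DOM extension. It is not how CBXSanitizer is implemented: sketchWhitelistFilter() and sketchFilterChildren() are hypothetical helpers introduced only for illustration, and the choice to drop a disallowed element together with its contents is an assumption of this sketch. The whitelist uses the same array shape as AddTags(): tag name => list of allowed attributes.

```php
<?php
/**
 * Illustrative whitelist filter (hypothetical helper, NOT CBXSanitizer itself).
 * $allowedTags has the same shape as the AddTags() argument.
 * Attribute values are not validated here (e.g. "javascript:" URLs in href),
 * so this sketch must not be used as a real security filter.
 */
function sketchWhitelistFilter(string $html, array $allowedTags): string
{
    $previous = libxml_use_internal_errors(true); // tolerate sloppy user markup
    $doc = new DOMDocument();
    // The XML declaration hints UTF-8; the <div> wrapper gives us a known root.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    sketchFilterChildren($root, $allowedTags);

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

function sketchFilterChildren(DOMNode $node, array $allowedTags): void
{
    // Copy the live node list first, because nodes are removed while iterating.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMText) {
            continue; // plain text is always kept
        }
        if (!($child instanceof DOMElement)) {
            $node->removeChild($child); // comments and other non-element nodes
            continue;
        }
        $tag = strtolower($child->nodeName);
        if (!array_key_exists($tag, $allowedTags)) {
            // Assumption of this sketch: drop the element with its contents.
            $node->removeChild($child);
            continue;
        }
        // Remove every attribute that is not whitelisted for this tag.
        foreach (iterator_to_array($child->attributes) as $attr) {
            if (!in_array(strtolower($attr->nodeName), $allowedTags[$tag], true)) {
                $child->removeAttributeNode($attr);
            }
        }
        sketchFilterChildren($child, $allowedTags);
    }
}

// Usage: only <p> and <b> are allowed, both without attributes.
$sketchOutput = sketchWhitelistFilter(
    '<p onclick="evil()">Hello <b>world</b><script>alert(1)</script>',
    array('p' => array(), 'b' => array())
);
echo $sketchOutput, "\n";
```

In this usage the onclick attribute and the script element should be gone, and the unclosed p tag should come back closed, because the DOM parser repairs the markup on load. A production sanitizer also has to validate attribute values, which this sketch deliberately leaves out.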
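The sanitizer includes three preset filtering levels. In each list, an empty array means the tag is allowed without any attributes.
SECURE_LEVEL_HIGH (high level, the strictest) allows only the following tags, all without attributes: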
$arTags = array(
'b' => array(),
'br' => array(),
'big' => array(),
'blockquote' => array(),
'code' => array(),
'del' => array(),
'dt' => array(),
'dd' => array(),
'font' => array(),
'h1' => array(),
'h2' => array(),
'h3' => array(),
'h4' => array(),
'h5' => array(),
'h6' => array(),
'hr' => array(),
'i' => array(),
'ins' => array(),
'li' => array(),
'ol' => array(),
'p' => array(),
'small' => array(),
's' => array(),
'sub' => array(),
'sup' => array(),
'strong' => array(),
'pre' => array(),
'u' => array(),
'ul' => array()
);
SECURE_LEVEL_MIDDLE (middle level) allows the following tags and attributes:
$arTags = array(
'a' => array('href', 'title','name','alt'),
'b' => array(),
'br' => array(),
'big' => array(),
'blockquote' => array('title'),
'code' => array(),
'caption' => array(),
'del' => array('title'),
'dt' => array(),
'dd' => array(),
'font' => array('color','size'),
'color' => array(),
'h1' => array(),
'h2' => array(),
'h3' => array(),
'h4' => array(),
'h5' => array(),
'h6' => array(),
'hr' => array(),
'i' => array(),
'img' => array('src','alt','height','width','title'),
'ins' => array('title'),
'li' => array(),
'ol' => array(),
'p' => array(),
'pre' => array(),
's' => array(),
'small' => array(),
'strong' => array(),
'sub' => array(),
'sup' => array(),
'table' => array('border','width'),
'tbody' => array('align','valign'),
'td' => array('width','height','align','valign'),
'tfoot' => array('align','valign'),
'th' => array('width','height'),
'thead' => array('align','valign'),
'tr' => array('align','valign'),
'u' => array(),
'ul' => array()
);
SECURE_LEVEL_LOW (low level, the most permissive) allows the following tags and attributes:
$arTags = array(
'a' => array('href', 'title','name','style','id','class','shape','coords','alt','target'),
'b' => array('style','id','class'),
'br' => array('style','id','class'),
'big' => array('style','id','class'),
'blockquote' => array('title','style','id','class'),
'caption' => array('style','id','class'),
'code' => array('style','id','class'),
'del' => array('title','style','id','class'),
'div' => array('title','style','id','class','align'),
'dt' => array('style','id','class'),
'dd' => array('style','id','class'),
'font' => array('color','size','face','style','id','class'),
'h1' => array('style','id','class','align'),
'h2' => array('style','id','class','align'),
'h3' => array('style','id','class','align'),
'h4' => array('style','id','class','align'),
'h5' => array('style','id','class','align'),
'h6' => array('style','id','class','align'),
'hr' => array('style','id','class'),
'i' => array('style','id','class'),
'img' => array('src','alt','height','width','title'),
'ins' => array('title','style','id','class'),
'li' => array('style','id','class'),
'map' => array('shape','coords','href','alt','title','style','id','class','name'),
'ol' => array('style','id','class'),
'p' => array('style','id','class','align'),
'pre' => array('style','id','class'),
's' => array('style','id','class'),
'small' => array('style','id','class'),
'strong' => array('style','id','class'),
'span' => array('title','style','id','class','align'),
'sub' => array('style','id','class'),
'sup' => array('style','id','class'),
'table' => array('border','width','style','id','class','cellspacing','cellpadding'),
'tbody' => array('align','valign','style','id','class'),
'td' => array('width','height','style','id','class','align','valign','colspan','rowspan'),
'tfoot' => array('align','valign','style','id','class'),
'th' => array('width','height','style','id','class','colspan','rowspan'),
'thead' => array('align','valign','style','id','class'),
'tr' => array('align','valign','style','id','class'),
'u' => array('style','id','class'),
'ul' => array('style','id','class')
);
To use the sanitizer with a preset level instead of a custom whitelist:
$Sanitizer = new CBXSanitizer;
$Sanitizer->SetLevel(CBXSanitizer::SECURE_LEVEL_MIDDLE);
$pureHtml = $Sanitizer->SanitizeHtml($html);
All sanitizer functionality is accessed through the methods of the CBXSanitizer class.