I know this is a common question, but everything I found seems to remove whitespace.
I'm looking for a regular expression that will strip unprintable characters WITHOUT changing any whitespace. This is a function that all user input will be filtered through, which means every character you could normally type on a keyboard is valid. For example, the accented characters you see in Spanish are valid. Basically, anything that can be displayed using the UTF-8 charset should pass through.
Because this is SQL Server, I don't think the "SET NAMES UTF8" approach will work.
Here's what I have.
function stripNonPrintable($input)
{
    return preg_replace('/[\x00\x08\x0B\x0C\x0E-\x1F]/', '', $input);
}
Try something like this:
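function stripNonPrintable($input) {
    // str_replace() matches literal strings, not ranges, so each control
    // character is listed individually. Double quotes (or chr()) are needed
    // so that PHP produces the actual bytes rather than the text "\x00".
    $bad = array_merge(
        array("\x00", "\x08", "\x0B", "\x0C"),
        array_map('chr', range(0x0E, 0x1F))
    );
    return str_replace($bad, '', $input);
}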
You could always escape the whitespace first:
function stripNonPrintable($input)
{
    // Swap whitespace for placeholders so the stripping step cannot touch it.
    $input = preg_replace('/ /', '%%%%SPACE%%%%%%', $input);
    $input = preg_replace('/\t/', '%%%%TAB%%%%%%', $input);
    $input = preg_replace('/\n/', '%%%%NEWLINE%%%%%%', $input);
    // Strip the control characters.
    $input = preg_replace('/[\x00\x08\x0B\x0C\x0E-\x1F]/', '', $input);
    // Restore the original whitespace.
    $input = str_replace('%%%%SPACE%%%%%%', ' ', $input);
    $input = str_replace('%%%%TAB%%%%%%', "\t", $input);
    $input = str_replace('%%%%NEWLINE%%%%%%', "\n", $input);
    return $input;
}
Not elegant, but it works.
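Another option, offered only as a sketch: if the input can be assumed to be valid UTF-8, a Unicode-aware pattern can target control characters by category instead of by byte range. The function name stripControlChars is hypothetical, and the choice to keep only tab, LF, and CR is an assumption about what "whitespace" should mean here. Note that this removes more than the pattern in the question does: everything in \x00-\x1F other than those three, plus DEL (\x7F) and the C1 controls (U+0080 to U+009F).

```php
<?php
// Hypothetical helper (the name is illustrative, not from the thread).
// Removes Unicode control characters (category Cc) except tab, LF, and CR.
function stripControlChars(string $input): string
{
    // [^\P{Cc}\t\n\r] reads as "not (a non-control character, tab, LF, or CR)",
    // which leaves exactly the control characters other than those three.
    $result = preg_replace('/[^\P{Cc}\t\n\r]/u', '', $input);

    // With the /u modifier, preg_replace() returns null for invalid UTF-8,
    // so that case is reported instead of silently returning nothing.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $result;
}
```

For instance, stripControlChars("caf\xC3\xA9\x00\tok") should drop the NUL byte while keeping the accented character and the tab. Ordinary spaces are never matched because they are not in the Cc category.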