I'm currently using this script to sanitize a block of text...
function rseo_sanitize($s) {
    $result = preg_replace("/[^a-zA-Z0-9'-]+/", "", html_entity_decode($s, ENT_QUOTES));
    return $result;
}
I'd like to add support for a collection of special characters such as ñ, á, é, í, ó, ú, etc.
How can I integrate those (and the larger collection of Spanish characters) into the preg_replace?
You can use /\pL+/u to match any run of Unicode letters.
PCRE has no Unicode property that covers only Spanish letters, but you could try:
/[^\p{Latin}0-9'-]+/u
This matches any letter in the Latin script, which as far as I know includes all the accented letters of the ISO Latin-1 charset and more. That encompasses other European languages, not just Spanish. Otherwise you would really have to list the desired letters individually.
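As a rough sketch of how that pattern could slot into the function from the question: the name rseo_sanitize_latin is made up for illustration, and passing 'UTF-8' to html_entity_decode is an assumption that the input is UTF-8 (the /u modifier needs valid UTF-8, and preg_replace returns NULL otherwise).

```php
<?php
// Hypothetical variant of rseo_sanitize(); the function name is illustrative only.
function rseo_sanitize_latin($s) {
    // Assumption: input is UTF-8, so entities such as &ntilde; decode to UTF-8 bytes.
    $decoded = html_entity_decode($s, ENT_QUOTES, 'UTF-8');
    // Remove everything except Latin-script letters, ASCII digits, apostrophes and hyphens.
    // The u modifier makes PCRE treat both pattern and subject as UTF-8.
    return preg_replace("/[^\p{Latin}0-9'-]+/u", "", $decoded);
}

echo rseo_sanitize_latin("Espa&ntilde;a &amp; ol&eacute;!"), "\n"; // prints "Españaolé"
```

The accented letters survive while the space, ampersand and exclamation mark are stripped, which is the same behaviour as the original function extended to non-ASCII Latin letters.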
You should use \w along with the u modifier.
Example:
/[^\w]+/u
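One thing to keep in mind: \w also matches the underscore, and the pattern above no longer keeps the apostrophe and hyphen that the original function allowed. A minimal sketch that adds them back, assuming UTF-8 input and a PCRE build with Unicode property support (the function name rseo_sanitize_word is hypothetical):

```php
<?php
// Hypothetical helper, not part of the original question.
function rseo_sanitize_word($s) {
    // Assumption: input is UTF-8.
    $decoded = html_entity_decode($s, ENT_QUOTES, 'UTF-8');
    // With the u modifier, \w covers Unicode letters and digits plus the underscore;
    // the apostrophe and hyphen are kept explicitly, as in the original pattern.
    return preg_replace("/[^\w'-]+/u", "", $decoded);
}

echo rseo_sanitize_word("ni&ntilde;o_1 don't re-do!"), "\n"; // prints "niño_1don'tre-do"
```

If underscores should be stripped as well, a class built from \pL and \pN instead of \w would be the closer match.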