Can anyone help me understand the use of preg_replace()
in the following line:
if ( isset( $data['title'] ) ) $this->title = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] );
// $title is the property that stores the title of a blog post
Here is the full constructor code where the properties are set:
public function __construct( $data=array() ) {
    if ( isset( $data['id'] ) ) $this->id = (int) $data['id'];
    if ( isset( $data['publicationDate'] ) ) $this->publicationDate = (int) $data['publicationDate'];
    if ( isset( $data['title'] ) ) $this->title = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] );
    if ( isset( $data['summary'] ) ) $this->summary = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['summary'] );
    if ( isset( $data['content'] ) ) $this->content = $data['content'];
}
I am unable to understand the usage of preg_replace
and why it is needed here. Please help me understand this. Thanks in advance.
It looks like whoever wrote that got a little carried away with escaping characters; most of those characters do not need to be escaped inside a character class. This regex could be rewritten as
/[^.,\-_'"@?!:$ a-zA-Z0-9()]/ <-- note that only the dash needs to be escaped here
This pattern matches any single character that is NOT one of the following:
.,-_'"@?!:$ a-zA-Z0-9()
This is because the opening caret ^
in the character class makes this a negated class.
Here the intended usage is to remove every character that is not in that list by replacing it with an empty string. So if you passed in something like
$data['title'] = 'Here is a bad symbol >';
then $this->title would become:
'Here is a bad symbol '
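As a rough, standalone sketch of that behaviour (the function name sanitizeTitle is made up for illustration and is not part of the original class; the scalar type hints assume PHP 7 or later), the simplified pattern can be tried outside the constructor like this:

```php
<?php
// Hypothetical helper, not from the original class: strips every character
// that is not in the allowed list, using the simplified pattern.
function sanitizeTitle(string $input): string
{
    // Single-quoted PHP string: \' is a PHP escape for the quote itself,
    // while \- reaches the regex engine unchanged and escapes the dash.
    return preg_replace('/[^.,\-_\'"@?!:$ a-zA-Z0-9()]/', '', $input);
}

// The '>' is not in the allowed list, so it is removed; the space before it stays.
var_dump(sanitizeTitle('Here is a bad symbol >'));
```

Note that the trailing space survives because a space is one of the allowed characters.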
preg_replace looks for a search pattern and replaces every matching substring with the given value, so preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] )
looks for certain characters and replaces them with an empty string.
The pattern /[^...]/ means "any character that is NOT in this list", i.e. this code removes every character that is neither alphanumeric nor one of the listed punctuation characters. So text like a#b becomes just ab, but a?b stays a?b, because ? is one of the "allowed" characters.
Don't be confused by the many \ symbols. They are only there for escaping, so the list of allowed characters is simply .,-_'"@?!:$ a-zA-Z0-9()
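To check this for yourself, here is a small sketch that runs the pattern from the question next to a version with the unnecessary backslashes removed. The variable names are only illustrative, and the two sample strings are the ones mentioned above:

```php
<?php
// Pattern exactly as written in the question (double-quoted PHP string).
$original = "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/";

// Same character class with only the dash escaped for the regex engine
// (single-quoted PHP string, so \' is just PHP's escape for the quote).
$simplified = '/[^.,\-_\'"@?!:$ a-zA-Z0-9()]/';

foreach (['a#b', 'a?b'] as $text) {
    $a = preg_replace($original, '', $text);
    $b = preg_replace($simplified, '', $text);
    // Report the cleaned text and whether both patterns agree on it.
    echo $text, ' => ', $a, ($a === $b ? ' (same with both patterns)' : ' (patterns differ)'), "\n";
}
```

For these two inputs both patterns give the same result: # is stripped and ? is kept.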