Usage of preg_replace() - php

Can anyone help me understand the use of preg_replace() in the following line:

if ( isset( $data['title'] ) ) $this->title = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] );
 // $title is the property that stores the title of a blog post

Here is the full constructor code where the properties are set:

public function __construct( $data=array() ) {
    if ( isset( $data['id'] ) ) $this->id = (int) $data['id'];
    if ( isset( $data['publicationDate'] ) ) $this->publicationDate = (int) $data['publicationDate'];
    if ( isset( $data['title'] ) ) $this->title = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] );
    if ( isset( $data['summary'] ) ) $this->summary = preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['summary'] );
    if ( isset( $data['content'] ) ) $this->content = $data['content'];
  }

I am unable to understand the usage of preg_replace and why it is needed here. Please help me understand this. Thanks in advance.

It looks like whoever wrote that got a little carried away with escaping: most of those characters do not need to be escaped inside a character class. The regex could be rewritten as

/[^.,\-_'"@?!:$ a-zA-Z0-9()]/ <-- note that only the dash needs to be escaped here

This basically says that the pattern matches any character that is NOT one of the following:

.,-_'"@?!:$ a-zA-Z0-9()

This is because the opening caret ^ in the character class makes it a negated class.

The intended usage here is to strip every character that is not in that list by replacing it with an empty string. So if you had something like

$data['title'] = 'Here is a bad symbol >';

$this->title would become:

'Here is a bad symbol '
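
As a rough check of the points above, the sketch below runs both the original, heavily escaped pattern and the simplified one against the sample string. The variable names ($original, $simplified, $input) are only illustrative and do not come from the original class. The simplified pattern is written as a single-quoted PHP string, so the inner ' is escaped for PHP's sake, not for the regex.

```php
<?php
// Pattern exactly as written in the constructor (double-quoted PHP string).
$original = "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/";

// Simplified pattern: inside the character class only the dash is escaped.
// The \' here is PHP string escaping; the regex engine just sees '.
$simplified = '/[^.,\-_\'"@?!:$ a-zA-Z0-9()]/';

$input = 'Here is a bad symbol >';

// Every character that is NOT in the list is replaced with an empty string.
$a = preg_replace($original, '', $input);
$b = preg_replace($simplified, '', $input);

var_dump($a === 'Here is a bad symbol '); // bool(true): only ">" was removed
var_dump($a === $b);                      // bool(true): both patterns behave the same
```

Both calls should give the same result for this input, which suggests the extra backslashes in the original pattern are harmless but unnecessary.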

preg_replace searches for a pattern and replaces every matching substring with the given value, so preg_replace ( "/[^\.\,\-\_\'\"\@\?\!\:\$ a-zA-Z0-9()]/", "", $data['title'] ) finds the characters matched by the pattern and replaces them with an empty string.

The pattern /[^...]/ means "any character that is NOT in this list", i.e. this code replaces every character other than letters, digits, space and the listed punctuation with an empty string. So text like a#b becomes ab, but a?b stays a?b because ? is one of the allowed characters.

Don't be confused by the many \ symbols: they are there only for escaping, so the list of allowed characters is basically .,-_'"@?!:$ a-zA-Z0-9()
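
The a#b and a?b cases can be tried with a small script like the one below. It is only a sketch: $pattern, $samples and $cleaned are made-up names, and the third sample string is an extra illustration that is not part of the original question.

```php
<?php
// Allowed characters: . , - _ ' " @ ? ! : $ space, letters, digits and parentheses.
// Anything else is matched by the negated class and removed.
$pattern = '/[^.,\-_\'"@?!:$ a-zA-Z0-9()]/';

$samples = ['a#b', 'a?b', 'Hello <b>World</b>!'];

$cleaned = [];
foreach ($samples as $text) {
    // Replace every disallowed character with an empty string.
    $cleaned[$text] = preg_replace($pattern, '', $text);
    echo $text, ' => ', $cleaned[$text], PHP_EOL;
}
// a#b => ab
// a?b => a?b
// Hello <b>World</b>! => Hello bWorldb!
```

Note that the characters <, > and / are removed but the letters around them stay, so the last sample keeps the stray "b" from each tag.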