Algorithm for sanitizing MySQL data

Let's say I have a MySQL table of 100,000 records with 2 columns: title and description. There's also a table containing all the bad words that need to be sanitized.

For example, let's say the title column contains the string "Fuck this" and the profanity table says that the string "Fuck" should be replaced with "F***".

I currently have a brute-force implementation, but it is way too slow. It checks every single substring of the sentence and compares it with every single string in the profanity table.

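public function sanitizeSiteProfanity($word, $replacement)
{
    // result_array() returns a plain PHP array of rows, so iterate over it directly.
    // Assumes top_sites has an `id` primary key (not shown in the original).
    $rows = $this->_ci->db->select('id, title, description')->get('top_sites')->result_array();
    foreach ($rows as $row)
    {
        // str_replace() returns the new string; it does not modify its argument.
        $title = str_replace($word, $replacement, $row['title']);
        $description = str_replace($word, $replacement, $row['description']);
        if ($title !== $row['title'] || $description !== $row['description'])
        {
            $this->_ci->db->where('id', $row['id'])
                ->update('top_sites', array('title' => $title, 'description' => $description));
        }
    }
}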

Is there a faster method to sanitize all the substrings?

I don't know if there is a fast way to sanitize the data. It seems that you have to loop through all the words for the replacement, because one title could have multiple offensive words.
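If the replacement stays in PHP, the loop over all the words can at least be collapsed into one call per string. The following is only a minimal sketch: `sanitizeText` and `$map` are hypothetical names, and the word-to-replacement map is assumed to have been loaded from the profanity table beforehand.

```php
<?php
// Hypothetical helper: $map is word => replacement, e.g. loaded once from the profanity table.
function sanitizeText(string $text, array $map): string
{
    // str_ireplace() accepts arrays and applies each search/replace pair in order,
    // ignoring case.
    return str_ireplace(array_keys($map), array_values($map), $text);
}

$map = ['Fuck' => 'F***'];
echo sanitizeText('Fuck this', $map), "\n"; // prints "F*** this"
```

Because the pairs are applied one after another, a later word can also match text produced by an earlier replacement, so the order of the map may matter. This is still substring matching, so it does not stop at word boundaries.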

If you are looking for complete words, a full-text index and match ... against should speed things up. Essentially, you would set up a loop over the words and run the following for each one:

update top_sites
    set title = replace(title, 'Fuck', 'F***')
    where match (title) against ('Fuck' in boolean mode);

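You would need to put this in a stored procedure loop and have a full-text index on title. The match() itself should be quite fast, so this would probably speed up the current process significantly. Note that replace() is case-sensitive, while match() with the default collation is not, so a row containing "fuck" would be matched but left unchanged.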

The best way to optimize this is to delegate the replacement step to the database and let MySQL do the heavy lifting. You'll need to use MySQL's built-in REPLACE() function. The (not-so-big) drawback is that you'll need to write explicit SQL instead of using CodeIgniter's query builder.
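As a rough sketch of what that explicit SQL could look like, the helper below builds one UPDATE per word with `?` placeholders. `buildReplaceUpdate` is a hypothetical name; the table and column names are taken from the question, and the statement is meant to be passed to CodeIgniter's `query()` with bindings.

```php
<?php
// Hypothetical helper: returns the SQL text plus the values for its "?" placeholders.
function buildReplaceUpdate(string $word, string $replacement): array
{
    // REPLACE(str, from_str, to_str) is MySQL's string function, not the REPLACE statement.
    $sql = 'UPDATE top_sites '
         . 'SET title = REPLACE(title, ?, ?), '
         . 'description = REPLACE(description, ?, ?)';
    return [$sql, [$word, $replacement, $word, $replacement]];
}

list($sql, $bindings) = buildReplaceUpdate('Fuck', 'F***');
// Inside the class from the question, this would be run as:
// $this->_ci->db->query($sql, $bindings);
```

This way the 100,000 rows never travel to PHP; only one statement per profanity word is sent. Keep in mind that MySQL's REPLACE() matches case-sensitively, so "fuck" and "Fuck" would each need their own entry.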