What would be the best way to mark certain keywords in HTML code?
As an example, I have this HTML:
$text = '
<h1>Lorem Ipsum</h1>
<p>Lorem ipsum dolor sit āmet, consetetur sadipscing elitr, sed diam nonumy<br>
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
</p>
<p><img src="test.jpg" alt="Lorem Ipsum">
<p>At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren,<br>
one, nò sea takimata 1 sanctus est <a href="#" title="Lorem Ipsum">Lorem ipsum</a> dolor sit amet. Lörem ipsum dolor sit amet,<br>
consetetur sadipscing elitr, sed diam lorem ipsum nonumy eirmod tempor invidunt ut labore et<br>
dolore magna aliquyam erat, sed diam voluptua.
</p>
';
I would like to highlight the keyword "Lorem Ipsum" like this: <span class="tooltip">Lorem Ipsum</span>
Since the keywords come from a database, one keyword can be contained in another, so the same text could be matched twice:
$keywords = ['Lorem Ipsum', 'Lorem'];
In this case there should be only one marker, because I don't want code like this:
<span class="tooltip"><span class="tooltip">Lorem</span> Ipsum</span>
Also, all tag attributes such as title and alt should be ignored. The same should apply to links, because I don't want a double function like hover and click. So the marked result should look like this:
$text = '
<h1><span class="tooltip">Lorem Ipsum</span></h1>
<p><span class="tooltip">Lorem ipsum</span> dolor sit āmet, consetetur sadipscing elitr, sed diam nonumy<br>
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
</p>
<p><img src="test.jpg" alt="Lorem Ipsum">
<p>At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren,<br>
one, nò sea takimata 1 sanctus est <a href="#" title="Lorem Ipsum">Lorem ipsum</a> dolor sit amet. Lörem ipsum dolor sit amet,<br>
consetetur sadipscing elitr, sed diam <span class="tooltip">lorem ipsum</span> nonumy eirmod tempor invidunt ut labore et<br>
dolore magna aliquyam erat, sed diam voluptua.
</p>
';
As you can see, the keyword Lorem Ipsum should also match lorem ipsum in lowercase.
I would like to know the most promising way to solve this problem. Both PHP and JavaScript would be possible. Could someone help me with an approach? Has anyone had to solve this problem before?
This is similar to a bad-words filter in PHP; searching Google for that term turns up plenty of examples.
Such a function checks whether your text contains any of the words and replaces each of them with asterisks. That is very close to what you are looking for, and you can follow the same approach.
Check whether your text contains a word from the array:
A. Yes, it does: strip all HTML tags from that part of the text, for example with strip_tags("Hello <b>world!</b>");, and then replace the word with whatever you want or just wrap it in <mark>.
B. No, it does not: continue with the next word.
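The check described in the steps above could be sketched in JavaScript roughly like this. It is only a sketch: stripTags and containsKeyword are made-up helper names, and the regex-based tag stripping is a naive stand-in for PHP's strip_tags that assumes there is no ">" inside an attribute value.

```javascript
// Naive tag stripper, for illustration only (assumption: no ">" inside
// attribute values). A real HTML parser is more robust.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Returns true if the keyword occurs in the visible text, ignoring case.
// Text inside attributes such as alt or title is removed with the tags.
function containsKeyword(html, keyword) {
  return stripTags(html).toLowerCase().includes(keyword.toLowerCase());
}

const html = '<p><img src="test.jpg" alt="Lorem Ipsum"> dolor sit amet</p>';
console.log(containsKeyword(html, "Lorem Ipsum")); // false
console.log(containsKeyword("<h1>Lorem Ipsum</h1>", "lorem ipsum")); // true
```

Because the tags are removed before the comparison, a keyword that appears only in an alt or title attribute is not reported as found.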
If the upper or lower case of your string is a problem, you can normalize it with these functions:
lcfirst('January'); // january
ucfirst('January'); // January
ucwords('a title without caps'); // A Title Without Caps
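Instead of changing the case of the text, the match itself can be made case-insensitive. The following JavaScript sketch builds one pattern for all keywords and sorts them longest first, so that "Lorem Ipsum" wins over "Lorem" and nothing is matched twice. The helper names escapeRegExp and buildKeywordRegex are made up for this example, and the Unicode-aware boundaries assume an engine with lookbehind support (current browsers and Node.js).

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a single case-insensitive pattern for all keywords (hypothetical helper).
function buildKeywordRegex(keywords) {
  const alternatives = [...keywords]
    .sort((a, b) => b.length - a.length) // longest keyword first
    .map(escapeRegExp)
    .join("|");
  // The lookarounds act as Unicode-aware word boundaries: no letter or
  // digit may directly precede or follow the match.
  return new RegExp(
    `(?<![\\p{L}\\p{N}])(?:${alternatives})(?![\\p{L}\\p{N}])`,
    "giu"
  );
}

const re = buildKeywordRegex(["Lorem", "Lorem Ipsum"]);
console.log("Lorem Ipsum, lorem ipsum and Lörem ipsum".match(re));
// [ 'Lorem Ipsum', 'lorem ipsum' ]
```

"Lörem ipsum" is not matched because ö and o are different letters, which agrees with the expected result in the question.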
Example: this function replaces each word from the array with asterisks.
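function filterwords($text){
    // Longer keywords come first so 'Lorem Ipsum' is handled before 'Lorem'.
    $filterWords = array('Lorem Ipsum','Lorem','Else');
    foreach ($filterWords as $word) {
        // The /e modifier was removed in PHP 7, so use a callback instead.
        // preg_quote() escapes any regex special characters in the keyword.
        $text = preg_replace_callback('/\b'.preg_quote($word, '/').'\b/i', function ($m) {
            return str_repeat('*', strlen($m[0]));
        }, $text);
    }
    return $text;
}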
Usage
echo filterwords('
<h1><mark>Lorem Ipsum</mark></h1>
<p><mark>Lorem ipsum</mark> dolor sit āmet, consetetur sadipscing elitr, sed diam nonumy<br>
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
</p>
<p><img src="test.jpg" alt="Lorem Ipsum">
<p>At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren,<br>
one, nò sea takimata 1 sanctus est <a href="#" title="Lorem Ipsum">Lorem ipsum</a> dolor sit amet. Lörem ipsum dolor sit amet,<br>
consetetur sadipscing elitr, sed diam <mark>lorem ipsum</mark> nonumy eirmod tempor invidunt ut labore et<br>
dolore magna aliquyam erat, sed diam voluptua.
</p>
');
Update: if you also want to highlight the words, you can do this with JavaScript:
function highlight(text) {
  var inputText = document.getElementById("inputText");
  var innerHTML = inputText.innerHTML;
  // Note: indexOf is case-sensitive and finds only the first occurrence.
  var index = innerHTML.indexOf(text);
  if (index >= 0) {
    // Wrap the match in a span that the .highlight CSS rule styles.
    innerHTML = innerHTML.substring(0, index) + "<span class='highlight'>" + innerHTML.substring(index, index + text.length) + "</span>" + innerHTML.substring(index + text.length);
    inputText.innerHTML = innerHTML;
  }
}
.highlight {
  background-color: yellow;
}
<button onclick="highlight('fox')">Highlight</button>
<div id="inputText">
The fox went over the fence
</div>
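The snippet above searches innerHTML directly, so it would also hit keywords inside attributes such as alt or title. A possible way around this, sketched below under some assumptions, is to split the HTML string into tags and text, leave the tags untouched, and skip text inside links. The function name markKeywords is made up for this example, the tag splitting is regex-based (it assumes no ">" inside attribute values), and it is not a full HTML parser.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap keywords in text content only. Tags (including their attributes)
// and text inside <a> elements are returned unchanged.
function markKeywords(html, keywords) {
  if (keywords.length === 0) return html;
  const alternatives = [...keywords]
    .sort((a, b) => b.length - a.length) // longest first: no nested markers
    .map(escapeRegExp)
    .join("|");
  const keywordRe = new RegExp(
    `(?<![\\p{L}\\p{N}])(?:${alternatives})(?![\\p{L}\\p{N}])`,
    "giu"
  );
  let linkDepth = 0;
  // The capture group keeps the tags in the result, so the array
  // alternates: text, tag, text, tag, ... (odd indexes are tags).
  return html
    .split(/(<[^>]*>)/)
    .map((part, i) => {
      if (i % 2 === 1) {
        if (/^<a[\s>]/i.test(part)) linkDepth++;
        else if (/^<\/a\s*>/i.test(part)) linkDepth = Math.max(0, linkDepth - 1);
        return part;
      }
      if (linkDepth > 0) return part; // do not mark link text
      return part.replace(keywordRe, '<span class="tooltip">$&</span>');
    })
    .join("");
}

const input =
  '<h1>Lorem Ipsum</h1><p><img src="test.jpg" alt="Lorem Ipsum"> est ' +
  '<a href="#" title="Lorem Ipsum">Lorem ipsum</a> sed diam lorem ipsum</p>';
console.log(markKeywords(input, ["Lorem Ipsum", "Lorem"]));
```

With this input only the heading and the last "lorem ipsum" are wrapped; the alt and title attributes and the link text stay as they are. In a browser, the same idea could be applied to the text nodes of the DOM instead of an HTML string.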