I scraped HTML from websites using simplehtmldom_1_5. After scraping, I want to wrap every piece of text in a <p> tag, giving each <p> tag a different id, as explained below.
Suppose this is the scraped data:
<div class="maincontainer">
<div class="first">
first text
</div>
<div class="second">
second text
</div>
<div class="third">
third text
</div>
<div class="fourth">
fourth text
</div>
fifth string
</div>
And I want a result like this:
<div class="maincontainer">
<div class="first">
<p id="1">first text </p>
</div>
<div class="second">
<p id="2">second text </p>
</div>
<div class="third">
<p id="3">third text </p>
</div>
<div class="fourth">
<p id="4">fourth text </p>
</div>
<p id="5"> fifth string </p>
</div>
I want to do this during scraping, not after scraping.
You can use jQuery like this:
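$(document).ready(function () {
    // Running id. The .each() index cannot be used here because it also
    // counts container divs that are skipped, so the ids would start at 2.
    var count = 0;
    $('div').each(function () {
        var $this = $(this);
        // Only wrap leaf divs, i.e. divs with none of these elements inside.
        // Loose text directly inside a container (like "fifth string") is not wrapped.
        if (!$this.find('div, span, img, ul, a').length) {
            var elData = $this.html();
            if ($.trim(elData) !== '') {
                count += 1;
                $this.html('<p id="' + count + '">' + elData + '</p>');
            }
        }
    });
});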
This may not be the most efficient way to do it, but here is a working jsFiddle:
http://jsfiddle.net/Diabl0570/FhZZQ/1/
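// jQuery
$(function () {
    var count = 1;
    // Wrap the contents of every div nested inside .maincontainer in a numbered <p>
    // (the question asks for <p>, so <p> is used here rather than <span>).
    $("div.maincontainer div").each(function () {
        var html = $(this).html();
        $(this).html("<p id='" + count + "'>" + html + "</p>");
        count = count + 1;
    });
});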
Using Perl, you could try the code below. I named the input file "xml.xml". I suppose something similar would work in PHP.
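#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', 'xml.xml' or die "Cannot open xml.xml: $!";
my $i = 0;    # running id for the generated <p> tags
while (<$fh>) {
    # Lines that start with a tag, and blank lines, are printed unchanged.
    if (/^\s*</ || /^\s*$/) { print; next }
    ++$i;
    # Wrap the bare text in a numbered <p>, keeping the leading indentation.
    s{^(\s*)(.*)$}{$1<p id="$i">$2</p>};
    redo;    # re-run the loop body on the modified line so it gets printed
}
close $fh;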
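For what it's worth, the same line-by-line idea as the Perl snippet above can be sketched in plain JavaScript and run under Node, without a browser or jQuery. This is only a sketch under assumptions: the function name wrapTextLines and the sample string are made up for illustration, and it assumes every tag and every run of text sits on its own line, as in the sample from the question. For arbitrary HTML a real parser would be safer.

```javascript
// Sketch only: wraps each bare text line in <p id="N">...</p>.
// Assumes tags and text never share a line (true for the sample in the question).
function wrapTextLines(html) {
  let count = 0; // running id, incremented once per wrapped line
  return html
    .split("\n")
    .map((line) => {
      const trimmed = line.trim();
      // Blank lines and lines that start with a tag pass through unchanged.
      if (trimmed === "" || trimmed.startsWith("<")) {
        return line;
      }
      count += 1;
      const indent = line.match(/^\s*/)[0]; // keep the original indentation
      return `${indent}<p id="${count}">${trimmed}</p>`;
    })
    .join("\n");
}

// Hypothetical sample input, shortened from the question.
const sample = [
  '<div class="maincontainer">',
  '<div class="first">',
  "first text",
  "</div>",
  "fifth string",
  "</div>",
].join("\n");

console.log(wrapTextLines(sample));
```

Unlike the two jQuery snippets, this also numbers loose text that sits directly inside the outer container (the "fifth string" case), because it looks at lines rather than at div elements.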