I'm currently learning how to scrape websites with the Simple HTML DOM parser (simple_html_dom.php) and I've run into a problem. The page (URL in the code below) has a table with 3 columns,
'Sl No.', 'Code' and 'Course Name', with many links listed under them. When I execute the code below:
<?php
include 'simple_html_dom.php';
$html = file_get_html('http://www.sitttrkerala.ac.in/index.php?r=site%2Fdiploma-syllabus-courses&prog=CM');
foreach($html->find('td') as $element){
echo htmlspecialchars($element);
echo "<br>";
}
?>
The above code returns only:
<td style="text-align:center">Sl No.</td>
<td class="style1">Code</td>
<td class="style1">Course Name</td>
There are multiple links under those td's and not even one is being displayed. I also tried
echo htmlspecialchars($element->href); // nothing is displayed
I think the DOM returned does not contain those links. See this code:
<?php
include 'simple_html_dom.php';
$html = file_get_html('http://www.sitttrkerala.ac.in/index.php?r=site%2Fdiploma-syllabus-courses&prog=CM');
echo ($html);
?>
And the output returned is:
Government of Kerala
Department of Technical Education
Login
SITTTR Login
Member Login
Govt. Logo
State Institute of Technical Teachers' Training & Research, Kalamassery
HMT Junction, Kalamassery - 683 104, Phone: 0484-2542355,
Fax: 0484-2542355, E-mail: jd_cdc@yahoo.com, sitttr@gmail.com
Main Menu
Home
About Us
Vision & Mission
Joint Director's Desk
Officer's & Staff
RTI
Contact
Institutions
Polytechnic Colleges
Government
Aided
Self-Financing
IHRD
Government Commercial Institutes
Government Institute of Fashion Designing
Technical High Schools
Academic
Courses
Diploma Programmes
Diploma Programmes (Evening)
Syllabus
Diploma - Revision 2015
Diploma - Revision 2010
Diploma - Revision 2006
Diploma - Model Question Papers
Diploma - Lab Manual
Academic Calendar
Academic Calendar - Diploma
Training
More »
Notifications
Orders
Downloads
Photo Gallery
Important Links
News & Events
Site Map
Feedback
Disclaimer
DIPLOMA SYLLABUS
Sl No. Code Course Name // empty; there should be multiple links under it
Important Links
Home
Sitemap
Disclaimer
Contact Information
State Institute of Technical Teachers Training & Research
HMT Junction, Kalamassery - 683 104
Phone: 0484-2542355, Fax: 0484-2542355
E-mail: jd_cdc@yahoo.com, sitttr@gmail.com
Website: http://www.sitttrkerala.ac.in
Contact Us | Sitemap | Disclaimer | RTI
Why are the links not being returned? I tried cURL too, but that failed as well. I've been stuck on this for the past 2 weeks.
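One thing worth ruling out is the lookup itself: a td element has no href attribute of its own, so any links have to be found as a elements nested inside the cells. Below is a minimal sketch of that check using PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom. The $sample markup and the linksInCells() helper are hypothetical, made up for illustration; they are not taken from the real page.

```php
<?php
// Hypothetical sample markup standing in for the fetched page: a header row
// plus one data row whose cells contain links.
$sample = '<table>'
    . '<tr><td>Sl No.</td><td>Code</td><td>Course Name</td></tr>'
    . '<tr><td>1</td><td><a href="/course/1001">1001</a></td>'
    . '<td><a href="/course/1001">Sample Course</a></td></tr>'
    . '</table>';

/**
 * Return the href of every <a> nested inside a <td> in the given HTML string.
 * (Hypothetical helper, not part of any library.)
 */
function linksInCells(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them;
    // real-world HTML is rarely fully valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $hrefs = [];
    // Select anchors at any depth below a table cell.
    foreach ($xpath->query('//td//a[@href]') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }
    return $hrefs;
}

$hrefs = linksInCells($sample);
echo count($hrefs) . " link(s) found inside table cells\n";
foreach ($hrefs as $href) {
    echo $href . "\n";
}
```

If the same check reports zero links when given the string actually fetched from the site, then the rows are presumably missing from the HTML the server sends (for example, they might be added afterwards by JavaScript), although I have not confirmed that for this page.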