I'm a novice programmer who has just started learning PHP, and I'm trying to write my own web scraper. I've searched extensively but can't find a solution.
I created a form that lets users submit a query; the script then scrapes images from Pinterest and displays the top hits. However, the first time the page loads after a query has been submitted I get: "Notice: Undefined offset: 0 in C:\xampp\htdocs\domwebcrawler.php on line 27" (also lines 28 and 29). After some number of refreshes, the page eventually loads with the pictures.
These line numbers refer to the code below.
HTML/PHP
<html>
<head>
<link type="text/css" href="domwebcrawler.css" rel="stylesheet" media="all" />
</head>
<body>
<form action="<?php echo $_SERVER['PHP_SELF'] ?>" method="get">
<input type="text" name="searchquery"> <input type="submit"> <br>
What do you want to search today?
<?php
include 'simple_html_dom.php';
$dom = new simple_html_dom();
@$query = $_GET["searchquery"];
if (!empty($query)) {
$dom->load_file('http://pinterest.com/search/pins/?q=' . urlencode($query));
$images= $dom->find('.PinHolder img');
$descriptions = $dom->find('.description');
$repins = $dom->find('.RepinsCount');
?>
<div class="js-masonry" data-masonry-options='{"itemSelector": ".pins", "columnWidth":10}'>
<?php
for ($i=0; $i< 20 ; $i++) {
echo '<div class="pins">';
if($images[$i])
echo '<div class="pinimg">' . $images[$i] . '</div>';
if($descriptions[$i])
echo '<div class="description">'. $descriptions[$i] . '</div>';
if($repins[$i])
echo '<div class="repin_count">' . $repins[$i] . '</div>';
echo '</div>';
};
};
?>
</div>
</body>
<script src="masonry.js"></script>
<script src="jquery.js" type="text/javascript"></script>
<script src="jquery.lazyload.js" type="text/javascript"></script>
</html>
CSS
.pins {
padding: 1%;
margin:1%;
border:solid 3px black;
width: 200px;
}
.pinimg img{
width:100%;
}
.description, .repin_count {
text-align: center;
}
I think it might have something to do with the page loading before all the scraped content has loaded, but I'm not sure!
All help (& criticism of inefficient code) is welcome!
Warm regards
Your code is assuming there is always something in the [0] spot of $images, $descriptions, and $repins.
Use isset to avoid that message:
if(isset($images[$i]))
echo '<div class="pinimg">' . $images[$i] . '</div>';
if(isset($descriptions[$i]))
echo '<div class="description">'. $descriptions[$i] . '</div>';
if(isset($repins[$i]))
echo '<div class="repin_count">' . $repins[$i] . '</div>';
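The isset check can be tried in isolation. Below is a minimal sketch that uses plain arrays in place of the simple_html_dom results, so it runs without the library or a network request. The helper name render_field and the sample data are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper: returns the wrapped markup for one field,
// or an empty string when index $i does not exist in $items.
function render_field(array $items, int $i, string $class): string
{
    if (!isset($items[$i])) {
        return '';
    }
    return '<div class="' . $class . '">' . $items[$i] . '</div>';
}

// Sample data standing in for what $dom->find(...) might return.
$images = ['<img src="a.jpg">'];
$empty = [];

echo render_field($images, 0, 'pinimg'), "\n";
// Index 0 is missing here, but isset prevents the "Undefined offset" notice.
echo render_field($empty, 0, 'pinimg'), "\n";
```

When the array is empty, as it appears to be on the failing page loads, the helper simply returns an empty string instead of triggering the notice.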
@Amal was right on point. Just make sure to add isset. Each of the results you retrieve is an array, and you are accessing an index of that array that doesn't exist.
<?php
include 'simple_html_dom.php';
$dom = new simple_html_dom();
$query = "html";
// Debug helper: print the type of $var and, for arrays, the number of elements.
function print_type($var){
echo gettype($var);
echo "<br>";
if (is_array($var)){
echo sizeof($var);
}
echo "<br>";
}
if (!empty($query)) {
$dom->load_file('http://pinterest.com/search/pins/?q=' . urlencode($query));
$images= $dom->find('.PinHolder img');
print_type($images);
$descriptions = $dom->find('.description');
print_type($descriptions);
$repins = $dom->find('.RepinsCount');
print_type($repins);
?>
<html>
<head>
<!-- put some info here-->
<title>Pinterest parser</title>
</head>
<body>
<?php
for ($i=0; $i< 20 ; $i++) {
?>
<div class="pins">
<?php
if(isset($images[$i])){
?>
<div class="pinimg">
<?php
echo $images[$i]
?>
</div> <!-- end pinimg -->
<?php
}
?>
<?php
if(isset($descriptions[$i])){
?>
<div class="description">
<?php
echo $descriptions[$i]
?>
</div><!-- end description -->
<?php
}
?>
<?php
if(isset($repins[$i])){
?>
<div class="repin_count">
<?php
echo $repins[$i]
?>
</div> <!-- end repin_count -->
<?php
}
?>
</div><!-- end pins-->
<?php
}
}
?>
</body>
</html>
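The loop above always runs 20 times, even when fewer results (or none) came back. One option, sketched here under the assumption that the three results are ordinary arrays, is to derive the loop bound from their sizes. The helper name pin_count is hypothetical.

```php
<?php
// Hypothetical helper: number of pins to render, capped at $max.
// Uses the largest of the three arrays so no available item is skipped;
// the isset checks are still needed because the arrays may differ in length.
function pin_count(array $images, array $descriptions, array $repins, int $max = 20): int
{
    return min($max, max(count($images), count($descriptions), count($repins)));
}

// Sample data standing in for the scraped results.
$images = ['<img src="a.jpg">', '<img src="b.jpg">'];
$descriptions = ['First pin'];
$repins = [];

$limit = pin_count($images, $descriptions, $repins);
for ($i = 0; $i < $limit; $i++) {
    echo 'pin ', $i, "\n";
}
```

With this bound, a fetch that returns no matching elements produces no empty pin containers at all.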