I have a PHP page with a text input where the user is supposed to paste the remote URL of an image, which I then have to store on the server and display back to the user. The problem is that I don't trust users to always provide a proper image URL, and I don't want them to upload a PDF or some other file, or a huge file of a few GB. I can check the extension, but that isn't very helpful. I hear I can check the MIME type, but I don't know how to open the file once, run all the validations (MIME type, file size) in one go, and then copy the file over. Moreover, since the file will be served pretty much as it is (with a minor name change), I would like to know whether it is possible to make sure the file doesn't contain an injected virus or other problematic code.
Any suggestions appreciated.
Well, there are really multiple things that can be done here. I would suggest using cURL as your mechanism for transferring the file (rather than file_get_contents() or similar). The reason is that you can first send a HEAD request against the resource to get just the header information before committing to actually downloading it. From the headers, you should be able to evaluate the file name, file size, MIME type, etc. Note that NONE of this information should be trusted, since the remote server can send whatever it likes, but it at least gives you a sanity check before committing to the download.
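The HEAD-based sanity check described above could look roughly like the sketch below. The function names `headersLookAcceptable()` and `fetchHeadInfo()`, the timeouts, and the redirect limit are all assumptions made for illustration; only the cURL options and `curl_getinfo()` constants are real API.

```php
<?php
// Hypothetical helper: decide from the (untrusted) header values whether
// a download is worth attempting at all.
function headersLookAcceptable(?string $contentType, float $contentLength, int $maxBytes): bool
{
    // Content-Type may carry parameters, e.g. "image/png; charset=binary".
    $isImage = $contentType !== null && stripos(trim($contentType), 'image/') === 0;
    // cURL reports -1 when no Content-Length was sent; an unknown size is
    // not rejected here and has to be enforced during the download instead.
    $sizeOk = $contentLength < 0 || $contentLength <= $maxBytes;
    return $isImage && $sizeOk;
}

// Hypothetical helper: send a HEAD request and return the two header
// values of interest, or null if the request failed.
function fetchHeadInfo(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD: headers only, no body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,     // assumed limit
        CURLOPT_CONNECTTIMEOUT => 10,    // assumed timeouts, in seconds
        CURLOPT_TIMEOUT        => 20,
        CURLOPT_FAILONERROR    => true,  // treat HTTP >= 400 as failure
    ]);
    if (curl_exec($ch) === false) {
        return null;
    }
    return [
        'type'   => curl_getinfo($ch, CURLINFO_CONTENT_TYPE) ?: null,
        'length' => (float) curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD),
    ];
}
```

A caller would pass the two values from `fetchHeadInfo()` into `headersLookAcceptable()` together with its own size limit, and only then start the real download.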
Once you have done the sanity check, you can download the file into a local sandbox directory. This should not be a web-accessible directory. You could use exif_imagetype() to determine whether the file is indeed an image of a type you are interested in.
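As a rough sketch of that check, the helper below returns the detected type only if it is on a whitelist. The name `detectAllowedImageType()` and the default whitelist are assumptions; the fallback to `getimagesize()` is an addition of mine for installations where the exif extension is not loaded.

```php
<?php
// Hypothetical helper: return the detected IMAGETYPE_* constant if it is
// on the whitelist, or null otherwise.
function detectAllowedImageType(
    string $path,
    array $allowed = [IMAGETYPE_GIF, IMAGETYPE_JPEG, IMAGETYPE_PNG]
): ?int {
    // exif_imagetype() inspects the leading signature bytes of the file.
    // If the exif extension is missing, getimagesize() reports the same
    // IMAGETYPE_* constant at index 2 of its result.
    $type = function_exists('exif_imagetype')
        ? @exif_imagetype($path)
        : (@getimagesize($path)[2] ?? false);

    if ($type === false || !in_array($type, $allowed, true)) {
        return null;
    }
    return $type;
}
```

Note that a matching signature only shows that the file starts like an image; it says nothing about what follows the header.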
Assuming this all looks good, I would do the last bit of cleanup and renaming with the GD library (perhaps using the imagecreatefrom*() functions to create the final image from the temporary download file).
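A minimal sketch of that re-encoding step, assuming the GD extension is available. The helper name `reencodeAsPng()` and the choice of PNG as the output format are illustrative; `imagecreatefromstring()` is used here as the member of the imagecreatefrom*() family that detects the input format by itself.

```php
<?php
// Hypothetical helper: decode the downloaded file with GD and write a
// freshly encoded PNG, so only the decoded pixel data reaches the output.
function reencodeAsPng(string $sourcePath, string $destPath): bool
{
    $data = @file_get_contents($sourcePath);
    if ($data === false || $data === '') {
        return false;
    }
    // Returns false (with a warning, suppressed here) when the data is not
    // in an image format that GD recognizes.
    $image = @imagecreatefromstring($data);
    if ($image === false) {
        return false;
    }
    return imagepng($image, $destPath);
}
```

Because the output is built from the decoded bitmap, anything appended to or embedded in the original file is not copied over. The cost is that animation in GIFs and metadata such as EXIF are lost.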
You can use exif_imagetype() to see if it's an image.
If you want to be as sure as you reasonably can be that it's not malware or something weird, it's a good idea to load the image with the GD library and save it again via GD, so that no dangerous code is carried over into the stored file.
With cURL you have no problem with HTTPS, and you can store the file locally and check it.
Here is code that checks the Content-Type for an image and then checks the downloaded file with exif_imagetype() (enable the php_mbstring and php_exif extensions).
$url = 'https://www.google.com/images/icons/ui/doodle_plus/doodle_plus_google_logo_on_grey.gif';
$userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// Keep certificate verification on; switching it off allows man-in-the-middle attacks.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
// First pass: HEAD request, headers only.
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_exec($ch);

if (!curl_errno($ch))
{
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    if (is_string($type) && stripos($type, 'image/') === 0)
    {
        // Second pass: switch back to GET and stream the body into a temp file.
        // Newer libcurl versions stay on HEAD unless CURLOPT_HTTPGET is set explicitly.
        curl_setopt($ch, CURLOPT_NOBODY, false);
        curl_setopt($ch, CURLOPT_HTTPGET, true);
        curl_setopt($ch, CURLOPT_HEADER, false);
        $filename = tempnam('/path/to/store/file/', 'prefix');
        $fp = fopen($filename, 'wb');
        curl_setopt($ch, CURLOPT_FILE, $fp);
        curl_exec($ch);
        fclose($fp);
        // Check the actual bytes, not just what the server claimed.
        if (!curl_errno($ch) && exif_imagetype($filename) !== false)
        {
            echo "100% IMAGE!";
            // take it! (move or re-encode the file here, before it is deleted below)
        }
        unlink($filename);
    }
}
curl_close($ch);
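The code above does not limit how much data is written to disk, and the question specifically worries about multi-gigabyte files. Since the Content-Length header cannot be trusted, one option is to enforce the cap during the transfer with CURLOPT_WRITEFUNCTION instead of CURLOPT_FILE. The factory name `makeCappedWriter()` and the 5 MB limit in the usage comment are assumptions for illustration.

```php
<?php
// Hypothetical factory: returns a CURLOPT_WRITEFUNCTION callback that
// writes each chunk to $fp and aborts once more than $maxBytes arrived.
function makeCappedWriter($fp, int $maxBytes): callable
{
    $received = 0;
    return function ($ch, string $chunk) use ($fp, $maxBytes, &$received): int {
        $received += strlen($chunk);
        if ($received > $maxBytes) {
            // Returning a count different from strlen($chunk) makes cURL
            // abort the transfer with a write error.
            return 0;
        }
        $written = fwrite($fp, $chunk);
        return $written === false ? 0 : $written;
    };
}

// Usage with the handle and file pointer from the code above:
// $fp = fopen($filename, 'wb');
// curl_setopt($ch, CURLOPT_WRITEFUNCTION, makeCappedWriter($fp, 5 * 1024 * 1024));
// curl_exec($ch); // curl_errno($ch) is non-zero if the cap was hit
```

When the cap is hit, the partially written temp file should be deleted rather than inspected.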