For a project, I need to pull out the value of a query parameter ('v') from an HTML page generated by me.
The HTML page contains links like the following, surrounded by a lot of unrelated data:
/watch?v=blablabla&list=blablabla&index=7&feature=blablabla
/watch?v=blablabla&list=blablabla&index=8&feature=blablabla
The task is to retrieve the values of 'v' and store them under categories in an XML file.
Try using a regular expression with preg_match_all:
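// file() returns an array of lines; preg_match_all() needs a single string.
$html = file_get_contents('path/file.html');
preg_match_all("/\/watch\?v=([a-z0-9]+)&list=[a-z0-9]*&index=[0-9]*/i", $html, $matches);
// $matches[1] now holds the captured 'v' values.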
I'm not sure what the URLs will look like, so the regexp may have to be adjusted to fit them.
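If the parameters after 'v' vary, one option is a looser pattern that only anchors on /watch?v= and captures up to the next delimiter. This is a rough sketch, not a definitive pattern: the sample $html string is made up for illustration, and the set of delimiter characters is an assumption about what can follow the value.

```php
<?php
// Hypothetical sample input standing in for the real page contents.
$html = '<a href="/watch?v=abc123&list=PL1&index=7&feature=x">one</a> junk '
      . '<a href="/watch?v=Def-45_6&amp;list=PL1&amp;index=8">two</a>';

// Capture everything after "/watch?v=" up to the next "&", quote,
// angle bracket, or whitespace character.
preg_match_all('~/watch\?v=([^&"\'\s<>]+)~i', $html, $matches);

// $matches[1] is the list of captured values, in document order.
$ids = $matches[1];
print_r($ids);
```

Because the capture stops at the first "&", it works whether the page writes the separator as "&" or as "&amp;".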
Try http://gskinner.com/RegExr/ to fine-tune your expression.
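For the XML part of the question, here is a minimal sketch using PHP's SimpleXMLElement. The question does not say what the categories are, so the element names (videos, category, video) and the category name "uncategorized" are placeholders, and $ids stands in for the values captured by preg_match_all.

```php
<?php
// Hypothetical IDs, e.g. the capture group returned by preg_match_all.
$ids = ['abc123', 'Def-45_6'];

// Element and attribute names here are assumptions, not a required schema.
$xml = new SimpleXMLElement('<videos/>');
$category = $xml->addChild('category');
$category->addAttribute('name', 'uncategorized');

// Add one <video> element per captured value.
foreach ($ids as $id) {
    $category->addChild('video', $id);
}

// asXML() with no argument returns the document as a string;
// pass a file path instead to write it to disk.
$output = $xml->asXML();
echo $output;
```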