Here is the HTML:
<article class="module_article featured">
<a title="Exclusive: Strictly's Vincent Simone welcomes baby boy" href="h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/"><h1 class="article_title">Exclusive: Strictly's Vincent Simone welcomes baby boy</h1></a> <a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<p>HELLO! Online can exclusively reveal that Strictly Come Dancing professional Vincent...</p>
</article>
<article class="module_article featured">
<a title="Exclusive: Strictly's Vincent Simone welcomes baby boy" href="h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/"><h1 class="article_title">Exclusive: Strictly's Vincent Simone welcomes baby boy</h1></a> <a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<p>HELLO! Online can exclusively reveal that Strictly Come Dancing professional Vincent...</p>
</article>
Here is my XPATH:
$articleLinks = $finder->query('article[contains(@class,"module_article")]//@href');
As you can see, it's grabbing both hrefs. I need the first one only.
Use this XPath expression:
(/article[contains(@class,"module_article")]//@href)[1]
Output:
h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/
Update (as per the last edit), to get the first link of each article:
/article[contains(@class,"module_article")]/a[1]/@href
DEMO Example:
<foo>
<a href='#1'>1</a>
<bar>
<a href='#2'>2</a>
</bar>
</foo>
<foo>
<a href='#3'>3</a>
<baz> <a href='#4'>4</a> </baz>
</foo>
XPath:
/foo/a[1]/@href
Output:
#1
#3
To retrieve the href of the first <a> in each article:
$finder->query('article[contains(@class,"module_article")]/a[1]/@href')
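To see how the three expressions discussed here differ, here is a minimal sketch using PHP's DOMDocument and DOMXPath. The markup is a simplified stand-in for the HTML in the question, and the paths (/first/, /second/, and so on) and the helper name hrefs() are made up for illustration. It also assumes the page is loaded with loadHTML(), which wraps the content in html and body elements, so the sketch uses //article rather than a path starting at the root.

```php
<?php
// Hypothetical markup modelled on the question; all paths are placeholders.
$html = <<<'HTML'
<html><body>
<article class="module_article featured">
<a href="/first/">Title one</a>
<a href="/second/"><img src="/one.jpg"></a>
</article>
<article class="module_article featured">
<a href="/third/">Title two</a>
<a href="/fourth/"><img src="/two.jpg"></a>
</article>
</body></html>
HTML;

// <article> is unknown to libxml's HTML parser, so collect warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$finder = new DOMXPath($doc);

// Illustrative helper: run an XPath query and return the attribute values.
function hrefs(DOMXPath $finder, string $expression): array
{
    $values = [];
    foreach ($finder->query($expression) as $attribute) {
        $values[] = $attribute->nodeValue;
    }
    return $values;
}

$base = '//article[contains(@class,"module_article")]';

// Every href under every matching article (the behaviour in the question).
$all = hrefs($finder, $base . '//@href');

// Parenthesised form: only the very first href in the whole document.
$firstOverall = hrefs($finder, '(' . $base . '//@href)[1]');

// a[1]: the href of the first <a> child of each article.
$firstPerArticle = hrefs($finder, $base . '/a[1]/@href');

print_r($firstPerArticle);
```

With this sample, $all holds four values, $firstOverall holds only /first/, and $firstPerArticle holds /first/ and /third/. The last form is probably what you want when the page lists several articles.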