Regex to extract the first 3 match instances in PHP [closed]

I have a full HTML file in a PHP variable and would like to extract the first 3 values in the HTML that are formatted as q?s=XXX, q?s=XX, or q?s=XXXX (where the X's are a stock symbol).

The $html variable contains:

<a name='mkt-movers' class='anchor'><\/a><h2 class='Fz-l Fw-200 Mend-4 D-i'>Market Movers<\/h2><\/div><div class=\"bd\">\t<div class=\"dropdown rapid-nf Fw-200 Bdrs\">
            <form class=\"SelectBox SelectBoxNoBorder\">
                <div class=\"SelectBox-Pick\">
                    <span class=\"SelectBox-Text\">U.S. Composite<\/span>
\t\t    <i class='Icon'>&#xe002;<\/i>
                <\/div>

                <select data-plugin=\"selectbox\"  class='Start-0' name='selectBox' >
\t\t    <option value=\"0\" selected=\"selected\" class=\"Selected\">U.S. Composite<\/option><option value=\"1\" >Nasdaq<\/option><option value=\"2\" >NYSE Market<\/option><option value=\"3\" >NYSE<\/option>
                <\/select>
                <noscript>
                    <Btn type=\"submit\" class=\"Hidden\">Select<\/Btn>
                <\/noscript>
            <\/form>
\t<\/div><div class=\"content\"><div class=\"mod-85ac7b2b-640f-323f-a1c1-00b2f4865d18 mod active\"><div id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18\" class=\"yom-mod yom-app yom-data yfi-table wp yfi-mmovers fin-glass-disabled\">
\t<a name=\"mkt-movers\" class=\"anchor\"><\/a>
    <div class=\"hd\">
        <h2 class=\"Fw-200 Fz-l M-0\"><\/h2>
    <\/div>
    <div class=\"bd yom-tabview\">
            <ul role=\"tablist\" data-plugin='tabpanel' class='FinTabs Mb-10'>
                <li class=\"Grid-U Mend-8 FinTab-Item Selected rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" >Most Actives<\/a>
                <\/li>
                <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\" >% Gainers<\/a>
                <\/li>
                <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\" >% Losers<\/a>
                <\/li>
            <\/ul>
\t<div class=\"yfi-panelcontainer yui3-tabview-panel\">
            <div role=\"tabpanel\" id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" class=\" Selected\" data-start=\"0\" data-count=\"10\" data-content=\"mostactive\" >
        \t<div class=\"original\">
                
        <table summary=\"1\" class=\"yom-data col-8 phatable\" >
          <caption><\/caption>
          <colgroup><col><col><col><col><col><col><col><col><\/colgroup>
          <thead>
            <tr>
                <th id=\"table-31-0-0\" class=\"symbol  txt-color\" scope=\"col\"><span>Symbol<\/span><\/th>
                <th id=\"table-31-0-1\" class=\"name  txt-color\" scope=\"col\"><span>Company Name<\/span><\/th>
          

I want to extract the first 3 stock symbols in the large full HTML string above. I.e. output = "BAC", "GE", "MSFT".

Note - stock symbols could be 1, 2, 3 or 4 characters long.

Any ideas to get this would be appreciated - thanks!!

This should work, try:

if (preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $html, $out))
{
    // The matches are in [0] (whole pattern)
    echo "<pre>"; print_r($out[0]); echo "</pre>";

    // If you need first 3
    #$out[0] = array_slice($out[0],0,3);
    #echo "<pre>"; print_r($out[0]); echo "</pre>";

    // If you need them unique:
    $out[0] = array_unique($out[0]);
    echo "<pre>"; print_r($out[0]); echo "</pre>";

} else {
    echo "FAIL";
}

I changed the pattern a bit so that it also matches stock symbols like those in this list: ~(?<=q\?s=)[-A-Z.]{1,5}~

  • It looks behind for q?s=
  • If found, it matches 1 to 5 characters from the set A-Z, dot (.), and hyphen (-)
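
To see the lookbehind in isolation, here is a minimal sketch run against a short made-up string. The $sample markup and the BRK-B link are assumptions for illustration only; the real page may look different.

```php
<?php
// Hypothetical sample input; the real page markup may differ.
$sample = '<a href="/q?s=BAC">BAC</a> <a href="/q?s=GE">GE</a> '
        . '<a href="/q?s=MSFT">MSFT</a> <a href="/q?s=BRK-B">BRK-B</a>';

// (?<=q\?s=) asserts that "q?s=" precedes the match without consuming it,
// so $found[0] holds only the symbols themselves.
preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $sample, $found);

// Keep only the first three matches.
$firstThree = array_slice($found[0], 0, 3);
print_r($firstThree);
```

Because the lookbehind does not consume q?s=, no capture group is needed, and array_slice trims the result to the first three symbols.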

This should do it.

preg_match_all("/q\?s=([A-Za-z\.]{1,5})/", $html, $matches);

// $matches[1] holds the captured symbols; print at most the first 3
$count = min(3, count($matches[1]));
for ($i = 0; $i < $count; $i++) {
    echo $matches[1][$i], "\n";
}

This matches every occurrence in your HTML string. You then loop over the first three captured values. Note: the text captured by the parentheses is stored in $matches[1] (one entry per match), while $matches[0] contains the text that matched the full pattern.
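
As a rough illustration of how preg_match_all lays out its result with the default PREG_PATTERN_ORDER flag, here is a small sketch. The two-link $sample string and the $full and $symbols names are made up for the example.

```php
<?php
// Hypothetical input, used only to show the array layout.
$sample = 'q?s=BAC q?s=GE';

preg_match_all('/q\?s=([A-Za-z.]{1,5})/', $sample, $m);

$full    = $m[0]; // text matched by the whole pattern, one entry per match
$symbols = $m[1]; // text captured by the first parenthesised group

print_r($full);
print_r($symbols);
```

So the index on $matches selects the group, and the second index selects the occurrence.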

Here's the documentation about preg_match: http://us2.php.net/preg_match