I am trying to split a string into terms in PHP using preg_split. I need to extract normal words (\w), but also currency terms (including the currency symbol) and numeric terms (including commas and decimal points). Can anyone help me out? I cannot seem to create a valid regex for preg_split that achieves this. Thanks.
Why not use preg_match_all() instead of preg_split()?
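$str = '"1.545" "$143" "$13.43" "1.5b" "hello" "G9"'
. ' This is a test sentence, with some. 123. numbers'
. ' 456.78 and punctuation! signs.';
// Optional leading "$", digits, optional decimal part.
// Note: thousands separators such as 1,000 are not handled by this pattern.
$digitsPattern = '\$?\d+(\.\d+)?';
// Any run of letters and digits.
$wordsPattern = '[[:alnum:]]+';
// Try the numeric pattern first, then fall back to plain words.
preg_match_all('/('.$digitsPattern.'|'.$wordsPattern.')/i', $str, $matches);
// $matches[0] holds the full text of every match.
print_r($matches[0]);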
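Since the question also asks for commas inside numbers, here is a minimal sketch of one way the pattern above might be extended. The function name extractTerms is hypothetical, and the sketch assumes UTF-8 input, thousands groups of exactly three digits, and that any Unicode currency symbol (\p{Sc}) should count, none of which the original answer states.

```php
<?php
// Hypothetical helper (not part of the answer above): extracts currency,
// numeric, and plain word terms from a UTF-8 string.
function extractTerms(string $text): array
{
    // Optional currency symbol, digits, optional ",ddd" thousands groups,
    // optional decimal part.
    $number = '\p{Sc}?\d+(?:,\d{3})*(?:\.\d+)?';
    // Fallback: any run of word characters.
    $word = '\w+';

    // The number alternative is tried first so "$1,250.50" stays whole.
    preg_match_all('/' . $number . '|' . $word . '/u', $text, $matches);

    return $matches[0];
}

print_r(extractTerms('Paid $1,250.50 for 3 items, or €99 each.'));
// Matches: Paid, $1,250.50, for, 3, items, or, €99, each
```

As with the original pattern, a mixed token such as "1.5b" would still come back as two terms ("1.5" and "b"), so this is only a starting point.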
Would splitting on whitespace solve your problem? For example, with the pattern "/\s+/".
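A minimal sketch of the whitespace approach, assuming the sample string below (it is made up for illustration). It uses the PREG_SPLIT_NO_EMPTY flag so that leading or trailing whitespace does not produce empty terms.

```php
<?php
// Sample input, invented for this sketch.
$str = ' Big brown fox - $20.25, sold! ';

// Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops the empty strings
// that the leading and trailing spaces would otherwise produce.
$terms = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($terms);
// $terms is ['Big', 'brown', 'fox', '-', '$20.25,', 'sold!']
```

Note that punctuation stays attached to the terms ("$20.25," and "sold!"), so this approach would probably need a follow-up step such as trimming punctuation from each term.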
What about using preg_match_all() to match each word with the pattern [\S]+\b? You then get an array with the words in it.
For example, if $str is 'Big brown fox - $20.25', the call below will return:
preg_match_all('/[\S]+\b/', $str, $matches);
$matches[0] = array(
    0 => 'Big',
    1 => 'brown',
    2 => 'fox',
    3 => '$20.25'
);