I have the following regex:
(.*?)( EUR)\1*
and on the string 2 mm; EUR
it matches 2 mm;
and EUR
but the string 2 mm;
matches nothing!?
But why? I thought the *
meant zero or more times? Can you help me? Thank you!
It would have been better to say what result you want.
If you want a regex that matches "2 mm; EUR" or "2 mm;", that implies you want to match a string that starts with some kind of number (perhaps a length in millimeters), ends with a ";", and is optionally followed by the string " EUR".
If that is what you want, your regex should contain a ";" and mark " EUR" with a "?" (0 or 1 times):
([
]+ .*?);( EUR)?
Yes, you are right: *
means 'zero or more times'. What you seem to be missing is \1
. It is a backreference: it means 'the text captured by the first capturing group'.
Your regular expression (.*?)( EUR)\1*
therefore means:
any string, followed by the four characters EUR
(a leading space plus EUR), followed by zero or more repetitions of the text captured by the first group.
If the string is 2 mm; EUR2 mm;2 mm;
, (.*?)
will match 2 mm;
, ( EUR)
will match EUR
, and \1*
will match 2 mm;2 mm;
.
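To check this reading of the pattern, here is a minimal sketch using Python's re module. The question does not say which regex engine is in use, so Python is an assumption here, and the describe helper is a hypothetical name introduced only for illustration.

```python
import re

# The pattern from the question: a lazy group, then " EUR",
# then zero or more repetitions of whatever group 1 captured.
pattern = re.compile(r"(.*?)( EUR)\1*")

def describe(text):
    """Return (whole match, group 1, group 2), or None if nothing matches."""
    m = pattern.search(text)
    if m is None:
        return None
    return m.group(0), m.group(1), m.group(2)

print(describe("2 mm; EUR2 mm;2 mm;"))  # \1* consumes both trailing copies
print(describe("2 mm; EUR"))            # \1* matches zero times
print(describe("2 mm;"))                # no " EUR", so no match at all
```

The last call returns None because ( EUR) itself is mandatory in this pattern. The trailing * applies only to \1, so it cannot make the " EUR" part optional.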
Now that you understand the error, the correct expression is easier to find. Just remove \1
.
(.*?)( EUR)*
will match anything, followed by zero or more occurrences of EUR
.
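One caveat is worth seeing in practice. Because every part of this pattern is allowed to match nothing, an unanchored search can succeed with an empty match at the very start of the string. The sketch below assumes Python's re module (the engine is not stated in the question); searched and full_groups are hypothetical helper names.

```python
import re

# Every part of this pattern may match the empty string.
unanchored = re.compile(r"(.*?)( EUR)*")

def searched(text):
    """Text matched by an unanchored search (this pattern always matches)."""
    return unanchored.search(text).group(0)

def full_groups(text):
    """Groups when the pattern is forced to cover the whole string."""
    m = unanchored.fullmatch(text)
    return m.groups() if m else None

print(repr(searched("2 mm; EUR")))  # the lazy group stops immediately
print(full_groups("2 mm; EUR"))     # forced to span the whole string
print(full_groups("2 mm;"))         # group 2 is None when " EUR" is absent
```

With search, the lazy (.*?) is satisfied by the empty string and ( EUR)* by zero repetitions, so the match is empty. Forcing the pattern to cover the whole input, as fullmatch does here, is what makes the groups useful.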
^(.*?)( EUR.*)?$
will match anything before ' EUR', or the whole string if ' EUR' is missing. Note that the start and end anchors (^ and $) were added to make sure the whole string is captured when there is no EUR.
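As a quick check of the anchored version, here is a small sketch, again assuming Python's re module since the engine is not named in the question. The before_eur function is a hypothetical helper, and the third input is a made-up test string.

```python
import re

# Anchored pattern from the answer: group 1 is everything before " EUR",
# or the whole string when " EUR" does not occur.
anchored = re.compile(r"^(.*?)( EUR.*)?$")

def before_eur(text):
    """Return the part of text before ' EUR', or all of it if ' EUR' is absent."""
    m = anchored.match(text)
    return m.group(1) if m else None

print(before_eur("2 mm; EUR"))           # "2 mm;"
print(before_eur("2 mm;"))               # "2 mm;"
print(before_eur("2 mm; EUR per unit"))  # "2 mm;"
```

The $ anchor is what forces the lazy (.*?) to keep growing until either " EUR" or the end of the string is reached.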