I am writing a regular expression to validate an input string, which is a line-separated list of sizes ([width]x[height]).
Valid input example:
300x200
50x80
100x100
The regular expression I initially came up with is (https://regex101.com/r/H9JDjA/1):
^(\d+x\d+[\r\n|\r|\n]*)+$
This regular expression matches my input but also matches this invalid input (size can't be 100x100x200):
300x200
50x80
100x100x200
Adding a word boundary at the end seems to have fixed this issue:
^(\d+x\d+[\r\n|\r|\n]*\b)+$
My questions:
How can I validate input that has multiple trailing newline characters? The following doesn't work for some inputs like this:
500x500 100x100 384384
^(\d+x\d+[\r\n|\r|\n]\b)+|[\r\n|\r|\n]$
Isolate the problem using this target: 100x100x200
For now, forget about the anchors in the regex.
The minimum regex is \d+x\d+ since it only has to be satisfied once for a match to take place.
The maximum is something like this: \d+x\d+ (?: (?: \r?\n | \r )* \d+x\d+ )*
Since \r?\n | \r is optional, it can be reduced to this: \d+x\d+ (?: \d+x\d+ )*
The result, when applied to the target string, is that only the leading 100x100 of 100x100x200 matches.
But since you've anchored the regex with ^ and $, it is forced to break up the middle 100 to make the whole string match:
100x10 from \d+x\d+
0x200 from (?: \d+x\d+ )*
So, that is why the first regex seemingly matches 100x100x200.
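To see the split concretely, here is a small JavaScript sketch. The variable names are mine, and the two-group pattern is only an illustration standing in for two iterations of the repeated group; it is not part of the original regex.

```javascript
// Illustrative only: two explicit groups mimic two iterations of the repeated group.
const target = "100x100x200";
const split = /^(\d+x\d+)(\d+x\d+)$/.exec(target);
console.log(split[1]); // 100x10
console.log(split[2]); // 0x200

// The question's pattern: a repeated capture group keeps only its last iteration.
const original = /^(\d+x\d+[\r\n|\r|\n]*)+$/.exec(target);
console.log(original[1]); // 0x200
```

The engine first tries 100x100 greedily, finds it cannot continue at x200, and gives back one digit so the remainder can be consumed as a second "size".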
To avoid all of that, just require a line break between them, and
make the trailing linebreaks optional (if you need to validate the whole
string, otherwise leave it and the end anchor off).
^\d+x\d+(?:(?:\r?\n|\r)+\d+x\d+)*(?:\r?\n|\r)*$
A better view of it
^
\d+ x \d+
(?:
     (?: \r? \n | \r )+
     \d+ x \d+
)*
(?: \r? \n | \r )*
$
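A quick way to check this pattern is to wrap it in a small helper and run it against the inputs discussed in the question. This is a minimal sketch; the names SIZE_LIST and isValidSizeList are hypothetical, and the test strings are my own renderings of the question's examples.

```javascript
// Hypothetical helper; the pattern is the one given above, written without free spacing.
const SIZE_LIST = /^\d+x\d+(?:(?:\r?\n|\r)+\d+x\d+)*(?:\r?\n|\r)*$/;

function isValidSizeList(text) {
  return SIZE_LIST.test(text);
}

console.log(isValidSizeList("300x200\n50x80\n100x100"));     // true
console.log(isValidSizeList("300x200\n50x80\n100x100x200")); // false
console.log(isValidSizeList("300x200\r\n50x80\n\n\n"));      // true (trailing line breaks allowed)
console.log(isValidSizeList("500x500\n100x100\n384384"));    // false (last line has no x)
```

Because a line break is now required between sizes, the engine can no longer split the middle 100 to rescue 100x100x200.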
Try this regex out
^[0-9]{1,4}x[0-9]{1,4}|[(\r\n|\r|\n)]+$
It'll match these inputs.
1x1
10x10
100x100
2000x2938
but not this
100x100x200
Your initial regular expression "fails" because of the +:
^(\d+x\d+[\r\n|\r|\n]*)+$
-----------------------^ here
Your parenthesized pattern (\d+x\d+[\r\n|\r|\n]*) says: match one or more digits, followed by an "x", followed by one or more digits, followed by zero or more newlines. The + after that says to match one or more repetitions of the entire parenthesized pattern, which means that for an input like 100x200x300 the engine can backtrack and match 100x20 and then 0x300, so it looks like it matches the entire line.
If you're simply trying to extract dimensions from a newline-separated string, I would use the following regular expression with a multiline flag:
^(\d+x\d+)$
https://regex101.com/r/H9JDjA/2
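As a rough sketch of how that expression behaves in JavaScript (the sample input and variable names are assumptions of mine), the g flag collects every match and the m flag lets ^ and $ anchor at each line:

```javascript
// Sample input is made up for illustration; it includes one malformed line and a blank line.
const input = "300x200\n50x80\n100x100x200\n\n640x480\n";

// match() with the g flag returns the full matches, or null when there are none.
const sizes = input.match(/^(\d+x\d+)$/gm) || [];
console.log(sizes); // [ '300x200', '50x80', '640x480' ]
```

Note that this extracts the well-formed lines and silently skips 100x100x200; it does not tell you that the input as a whole was invalid.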
Side note: In your expression, [\r\n|\r|\n] is actually saying match any one instance of \r, \n, |, \r, |, or \n (i.e. it's quite redundant, and you probably aren't meaning to match |). If you want to match a sequential set of any combination of \r or \n, you can simply use [\r\n]+.
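A short JavaScript check of that point (the sample strings and variable names are mine): inside a character class the pipe is a literal character, not alternation.

```javascript
// The class from the question accepts a literal pipe character.
const matchesPipe = /[\r\n|\r|\n]/.test("|");
console.log(matchesPipe); // true

// The simplified class matches only carriage returns and line feeds.
const simplifiedMatchesPipe = /[\r\n]+/.test("|");
console.log(simplifiedMatchesPipe); // false

// It still treats any run of \r and \n as a single separator.
const parts = "300x200\r\n\r\n50x80\n100x100".split(/[\r\n]+/);
console.log(parts); // [ '300x200', '50x80', '100x100' ]
```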
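You can use the multiline modifier, which should make life easier:
var input = "\n\
300x200x400\n\
50x80\n\
\n\
\n\
300x200\n\
50x80\n\
100x100x200x100\n";
// The m (multiline) flag makes ^ match at the start of every line; g returns all matches.
// There is no $ here, so only the leading [width]x[height] of each line is captured.
var allSizes = input.match(/^\d+x\d+/gm) || []; // match() returns null when nothing matches
for (var size of allSizes)
    console.log(size);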
Prints:
300x200
50x80
300x200
50x80
100x100