I need to use a regular expression to validate a string (written in Spanish) that represents a full date. I don't need to check whether the string is an actual valid date (leap years, etc.).
The string looks like this:
23 de septiembre del 2003
23 de septiembre de 1965
If the year is greater than 2000, the word 'del' is used before the year; otherwise the word 'de' is used.
I did my research and found how to get the first 2 digits:
$pattern = '/([0-9]+)/';
...then I got lost on how to put it all together.
Help!
/\b\d{1,2} de [a-z]+ (de 1\d{3}|del 2\d{3})/i
Explanation:
\b ... requires a word boundary; since the following character is a digit
(and thus a word character), this will only match if the date is at the
start of the string or is preceded by a character that is not a letter,
not a digit and not an underscore
\d{1,2} ... one or two digits
de ... literally "de"
[a-z]+ ... any letter from a-z, one or more times
(de 1\d{3} ... literally "de" followed by "1" and 3 more digits
| ... or
del 2\d{3}) ... literally "del" followed by "2" and 3 more digits
i ... make the whole thing case-insensitive (you can omit this if needed)
Also note that all spaces in the regex are matched literally, just like any other character.
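To try the pattern out, here is a minimal sketch in Python (an assumption on my part; the question does not say which language is in use, and the pattern itself carries over unchanged). The names DATE_RE and looks_like_date are made up for illustration, and re.IGNORECASE stands in for the trailing i flag.

```python
import re

# Same pattern as above, written as a Python raw string.
# re.IGNORECASE plays the role of the trailing "i" flag.
DATE_RE = re.compile(r"\b\d{1,2} de [a-z]+ (de 1\d{3}|del 2\d{3})", re.IGNORECASE)


def looks_like_date(text):
    """Hypothetical helper: True if the pattern is found anywhere in text."""
    return DATE_RE.search(text) is not None


print(looks_like_date("23 de septiembre del 2003"))  # True
print(looks_like_date("23 de septiembre de 1965"))   # True
print(looks_like_date("23 de septiembre de 2003"))   # False: "de" before a 2xxx year

# The pattern is not anchored at the end, so search() accepts trailing text.
# fullmatch() is one way to require that the whole string is the date.
print(DATE_RE.fullmatch("23 de septiembre del 2003 y algo") is None)  # True
```

Whether to use search() or fullmatch() depends on whether the date may be embedded in a longer string or must be the entire input.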
Alternatively, instead of [a-z]+
you could specify a list of valid months like
/\b\d{1,2} de (...|septiembre|...) (de 1\d{3}|del 2\d{3})/i
(replace ... with more month names and use |
to separate them)
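One way to avoid typing the alternation by hand is to build it from a list. The sketch below uses Python, and the MONTHS list is my own assumption about which month names should be accepted (the standard twelve, lowercase); MONTH_DATE_RE and has_named_month_date are hypothetical names.

```python
import re

# Assumed list of accepted month names; adjust it to your needs.
MONTHS = [
    "enero", "febrero", "marzo", "abril", "mayo", "junio",
    "julio", "agosto", "septiembre", "octubre", "noviembre", "diciembre",
]

# Join the names with "|" to form the (...|septiembre|...) group.
MONTH_DATE_RE = re.compile(
    r"\b\d{1,2} de (" + "|".join(MONTHS) + r") (de 1\d{3}|del 2\d{3})",
    re.IGNORECASE,
)


def has_named_month_date(text):
    """Hypothetical helper: True if a date with a listed month is found."""
    return MONTH_DATE_RE.search(text) is not None


print(has_named_month_date("23 de septiembre del 2003"))  # True
print(has_named_month_date("1 de Enero de 1999"))         # True
print(has_named_month_date("23 de septembre del 2003"))   # False: misspelled month
```

The month names here contain no regex metacharacters, so they can be joined directly; if the list could ever contain such characters, passing each name through re.escape first would be safer.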