I need to match a variety of datestamps in a text string:
$date_day_pattern = '0[1-9]|[12][0-9]|3[01]';
$date_month_pattern = '0[1-9]|1[0-2]';
$date_year_pattern = '[12][0-9]|20[12][0-9]';
$date_pattern = "(?<!\d)($date_day_pattern)[^\d]?($date_month_pattern)[^\d]?($date_year_pattern)(?!\d)";
preg_match("/$date_pattern/m", $input, $matches);
This works when matching:
01-05-2015
01-05-15
01052015
010515
But I also need to match datestamps where the day/month isn't zero-filled. In that case the datestamp must have a separator between day, month and year:
1-5-2015
The pattern must not match:
152015
You could add an additional alternative to each of the first two expressions (day and month, respectively):
[1-9](?=\D)
(?<=\D)[1-9](?=\D)
The patterns then become:
$date_day_pattern = '0[1-9]|[12][0-9]|3[01]|[1-9](?=\D)';
$date_month_pattern = '0[1-9]|1[0-2]|(?<=\D)[1-9](?=\D)';
$date_year_pattern = '[12][0-9]|20[12][0-9]';
$date_pattern = "(?<!\d)($date_day_pattern)\D?($date_month_pattern)\D?($date_year_pattern)(?!\d)";
The above will even match weird but unambiguous strings like:
2-0116
... but it will not allow:
201-16
Note that \D is equivalent to [^\d].
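As a quick sanity check, the patterns above can be run against the sample strings from the question. This is only a minimal sketch: the has_datestamp() helper and the two sample lists are my own illustrative additions, not part of the original code.

```php
<?php
// Patterns from the answer above, assembled into one expression.
$date_day_pattern   = '0[1-9]|[12][0-9]|3[01]|[1-9](?=\D)';
$date_month_pattern = '0[1-9]|1[0-2]|(?<=\D)[1-9](?=\D)';
$date_year_pattern  = '[12][0-9]|20[12][0-9]';
$date_pattern = "(?<!\d)($date_day_pattern)\D?($date_month_pattern)\D?($date_year_pattern)(?!\d)";

// Hypothetical helper: true if $input contains a datestamp.
function has_datestamp(string $input, string $pattern): bool
{
    return preg_match("/$pattern/", $input) === 1;
}

// Sample strings taken from the question and from this answer.
$should_match     = ['01-05-2015', '01-05-15', '01052015', '010515', '1-5-2015', '2-0116'];
$should_not_match = ['152015', '201-16'];

foreach (array_merge($should_match, $should_not_match) as $sample) {
    printf("%-12s %s\n", $sample, has_datestamp($sample, $date_pattern) ? 'match' : 'no match');
}
```

The first six strings should be reported as a match and the last two as no match.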
You can use the conditional subpattern feature for this, i.e. if a leading 0 is matched, make the separator optional; otherwise make it mandatory.
$date_pattern =
'(?<!\d)((0)?[1-9]|[12][0-9]|3[01])(?(2)\D?|\D)(0?[1-9]|1[0-2])(?(2)\D?|\D)([12][0-9]|20[12][0-9])(?!\d)';
(0)? captures the optional leading zero of the day as group #2. (?(2)\D?|\D) makes \D optional if group #2 is matched; otherwise \D (a non-digit) is required.
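Here is a small sketch of how the conditional pattern behaves on the question's samples. The sample list and the output format are my own assumptions for illustration.

```php
<?php
// Pattern from the answer above. Group 2 is the optional leading zero of the day.
$date_pattern = '(?<!\d)((0)?[1-9]|[12][0-9]|3[01])(?(2)\D?|\D)(0?[1-9]|1[0-2])(?(2)\D?|\D)([12][0-9]|20[12][0-9])(?!\d)';

$samples = ['01-05-2015', '01052015', '010515', '1-5-2015', '152015', '15052015'];

foreach ($samples as $sample) {
    if (preg_match("/$date_pattern/", $sample, $m) === 1) {
        // $m[1] = day, $m[2] = leading zero of the day (if any), $m[3] = month, $m[4] = year
        printf("%-12s day=%s month=%s year=%s\n", $sample, $m[1], $m[3], $m[4]);
    } else {
        printf("%-12s no match\n", $sample);
    }
}
```

The first four strings match and 152015 is rejected, as required. Note that the condition only tests the leading zero of the day, so a day from 10 to 31 never sets group #2: 15052015 is rejected by this pattern, while 15-05-2015 matches.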