How can I check whether a string contains a word that mixes Latin and Cyrillic characters? For example:
$str = 'This is test string'; //ok
$str = 'This is тест string'; //ok
$str = 'This is тестstring'; // <-- fail, how to detect this?
More examples:
$str = 'This is тест_123 string'; //ok
$str = 'This is {тест}_string'; //fail
$str = 'Абвгabcd'; //fail
$str = 'Абвг_abcd'; //fail
$str = 'Абвг abcd'; //ok
$str = 'This sentence has русское word'; //ok
$str = 'This has splittedкириллицаletters word'; //fail
I found a solution; it passes all of the tests above:
$result = preg_match_all('/\S*[а-яА-Я]\S*[a-zA-Z]\S*|\S*[a-zA-Z]\S*[а-яА-Я]\S*/u', $str, $matches); // "u" makes PCRE read the Cyrillic ranges as UTF-8 characters, not bytes
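To make it easier to check against the examples from the question, the pattern can be wrapped in a small function. This is only a sketch: the name hasMixedScriptWord is made up for illustration, and it assumes the input is valid UTF-8 and that the u modifier is used so the Cyrillic ranges are treated as code points.

```php
<?php
// Hypothetical helper (not from the original post): true when any
// whitespace-delimited word contains both a Cyrillic and a Latin letter.
function hasMixedScriptWord(string $str): bool
{
    $pattern = '/\S*[а-яА-Я]\S*[a-zA-Z]\S*|\S*[a-zA-Z]\S*[а-яА-Я]\S*/u';
    return preg_match($pattern, $str) === 1;
}

// Examples taken from the question, with the expected verdict.
$examples = [
    'This is test string'                    => false,
    'This is тест string'                    => false,
    'This is тестstring'                     => true,
    'This is тест_123 string'                => false,
    'This is {тест}_string'                  => true,
    'Абвг_abcd'                              => true,
    'Абвг abcd'                              => false,
    'This has splittedкириллицаletters word' => true,
];

foreach ($examples as $str => $expected) {
    $actual = hasMixedScriptWord($str);
    echo ($actual === $expected ? 'pass' : 'FAIL'), "  ", $str, "\n";
}
```

Because \S* never crosses whitespace, a Cyrillic word and a Latin word separated by a space are not reported, which is what the "ok" cases require.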
The following returns 0 for no match and 1 for a match. You need to add any special characters that are not allowed next to Cyrillic letters to the [a-z] class, such as [a-z}{]:
$result = preg_match('/([a-z]\p{Cyrillic})|(\p{Cyrillic}[a-z])/iu', $str, $matches);
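A quick way to see the effect of extending the character class is to run both variants over a few of the question's examples. This is a sketch only; the variable names are made up for illustration. Note that this pattern looks for a Latin character directly next to a Cyrillic one, so a separator that is not in the class (such as the underscore here) is not detected.

```php
<?php
// Pattern from the answer: a Latin letter directly adjacent to a Cyrillic one.
$strict = '/([a-z]\p{Cyrillic})|(\p{Cyrillic}[a-z])/iu';
// Variant with the braces added to the class, as the answer suggests.
$withBraces = '/([a-z}{]\p{Cyrillic})|(\p{Cyrillic}[a-z}{])/iu';

$samples = [
    'This is тест string',
    'This is тестstring',
    'This is {тест}_string',
    'Абвг_abcd',
];

// Columns: result of $strict, result of $withBraces, input string.
foreach ($samples as $s) {
    printf("%d %d  %s\n", preg_match($strict, $s), preg_match($withBraces, $s), $s);
}
```

With these inputs only the brace example changes between the two patterns: it goes from 0 to 1 once the braces are part of the class.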
To get the words, pass $matches as the third parameter and it will be populated with the matches. To get more than one match:
preg_match_all('/([a-z]\p{Cyrillic})|(\p{Cyrillic}[a-z])/iu', $str, $matches);
To do the opposite and find the good words:
preg_match_all('/([a-z]\s+\p{Cyrillic})|(\p{Cyrillic}\s+[a-z])/iu', $str, $matches);
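The adjacency patterns above capture only the pair of characters where the scripts meet, not the whole word. If the full offending words are needed, one possible approach is to split on whitespace and test each word for both scripts. This is a sketch under assumptions: findMixedScriptWords is a hypothetical name, a "word" is taken to mean any run of non-whitespace characters, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original answers): returns every
// whitespace-delimited word that contains at least one Latin letter
// and at least one Cyrillic letter.
function findMixedScriptWords(string $str): array
{
    // preg_split returns false on error (e.g. invalid UTF-8); fall back to [].
    $words = preg_split('/\s+/u', $str, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $mixed = array_filter($words, function (string $word): bool {
        return preg_match('/\p{Latin}/u', $word) === 1
            && preg_match('/\p{Cyrillic}/u', $word) === 1;
    });

    // Reindex so the result is a plain list.
    return array_values($mixed);
}

print_r(findMixedScriptWords('This has splittedкириллицаletters word and Абвг_abcd'));
```

Using \p{Latin} and \p{Cyrillic} instead of explicit ranges also covers letters outside a-z and а-я, such as ё.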