I have a strange bug on my site: I have a GET variable in the URL, ?getClass=9а
For some reason, when I output its strlen, it is one character longer than it actually is. For example:
9a strlen - 3
10a strlen - 4
11a strlen - 4
The other strange thing is what happens when I try to split it with substr using its strlen:
$classNumber=substr($_GET['getClass'],0,strlen($_GET['getClass']-1));
$classLetter=substr($_GET['getClass'], strlen($_GET['getClass']-1));
The result is like this:
9a $classNumber=9 $classLetter=а that's ok
10а $classNumber=1 $classLetter=0a that's wrong
11a $classNumber=11 $classLetter=a that's ok again.
What's wrong with it?
This correct answer was originally posted by user4035 but downvoted and deleted for some reason.
The reason for this behaviour is that you are using a Cyrillic "а", not a Latin one. It is a multibyte character, encoded as 2 bytes in UTF-8, and strlen counts bytes, not characters. To count characters you need to use the mb_strlen function:
<?php
print strlen($_GET['getClass'])."<br>";
print mb_strlen($_GET['getClass'], 'utf8');
For input: "9а" it will print:
3
2
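To see where the extra byte comes from, here is a small sketch that dumps the raw bytes of both spellings. The variable names are made up for illustration, and it assumes PHP 7 or later (for the "\u{...}" escape) with the mbstring extension enabled.

```php
<?php
// Illustration only: the same-looking letter in two alphabets.
$latin    = "9a";         // Latin "a" (U+0061), 1 byte in UTF-8
$cyrillic = "9\u{0430}";  // Cyrillic "а" (U+0430), 2 bytes in UTF-8

echo bin2hex($latin), "\n";                // 3961
echo bin2hex($cyrillic), "\n";             // 39d0b0
echo strlen($cyrillic), "\n";              // 3 (bytes)
echo mb_strlen($cyrillic, 'UTF-8'), "\n";  // 2 (characters)
```

The two strings look identical on screen, but only the byte dump shows that the second one carries the two-byte sequence d0 b0 instead of 61.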
But if you use plain ASCII, the functions will give the same result:
getClass=99
2
2
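You should also subtract 1 from the length, not from the string: strlen($_GET['getClass']-1) first converts the string to a number, subtracts 1, and then measures the length of that number. And since the value may contain multibyte characters, use mb_strlen and mb_substr, otherwise substr can cut through the middle of the Cyrillic letter:
$classNumber=mb_substr($_GET['getClass'],0,mb_strlen($_GET['getClass'],'utf8')-1,'utf8');
$classLetter=mb_substr($_GET['getClass'],mb_strlen($_GET['getClass'],'utf8')-1,null,'utf8');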
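If you do this split in more than one place, it may be tidier to wrap it in a function. The following is only a sketch: splitClass is a hypothetical helper name, it assumes the input is UTF-8 and always ends in exactly one letter, and it relies on mb_substr accepting negative offsets that count characters from the end.

```php
<?php
// Hypothetical helper, not part of the original code.
// Splits a value such as "10а" into the number and the trailing letter.
function splitClass(string $class): array
{
    // A negative length or offset counts from the end, in characters.
    $number = mb_substr($class, 0, -1, 'UTF-8');    // everything but the last character
    $letter = mb_substr($class, -1, null, 'UTF-8'); // the last character only
    return [$number, $letter];
}

[$classNumber, $classLetter] = splitClass("10\u{0430}"); // "10" followed by Cyrillic "а"
echo $classNumber, " ", $classLetter, "\n";
```

Because the offsets are counted in characters, the same call works whether the letter is Latin or Cyrillic, and no explicit length arithmetic is needed.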