There is a question about this from four years ago. One answer suggests using bytes(str).length, which still works as described: it returns the length in bytes, so it is only an estimate that overestimates the number of characters whenever the string contains multi-byte UTF-8 characters.
The exact solution posted there works in what was the latest compiler version back then (0.4.11), but it does not work on a current version (0.8.11). So this works:
pragma solidity 0.4.11;

contract utf8StringLength
{
    function utfStringLength(string str) constant
    returns (uint length)
    {
        uint i=0;
        bytes memory string_rep = bytes(str);
        while (i<string_rep.length)
        {
            if (string_rep[i]>>7==0)
                i+=1;
            else if (string_rep[i]>>5==0x6)
                i+=2;
            else if (string_rep[i]>>4==0xE)
                i+=3;
            else if (string_rep[i]>>3==0x1E)
                i+=4;
            else
                //For safety
                i+=1;
            length++;
        }
    }
}
But it fails with pragma solidity 0.8.11;. Even after some quick fixes, like changing the function header to function utfStringLength(string memory str) public pure returns (uint length), the compiler throws errors like TypeError: Operator == not compatible with types bytes1 and int_const 6 for string_rep[i]>>5==0x6. Writing string_rep[i]>>5==bytes1(0x6) or string_rep[i]>>5==bytes1(6) does not work either. Is there a better way to get the exact string length in current compiler versions than falling back to an old compiler version?
0x1E--->bytes1(uint8(0x1E)). – Ismael Jan 13 '22 at 02:40
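The type mismatch can also be sidestepped by casting the lead byte to uint8 once and doing all shifts and comparisons on the integer. The following is a minimal sketch of that approach, not a tested or audited implementation: the contract name Utf8StringLength, the local variable names and the ^0.8.11 pragma are assumptions made for illustration, and the function counts UTF-8 lead bytes without validating that the input is well-formed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.11;

// Illustrative sketch: contract and variable names are hypothetical.
contract Utf8StringLength {
    // Counts UTF-8 characters by inspecting the lead byte of each sequence.
    // Assumes well-formed UTF-8; malformed bytes are counted one at a time.
    function utfStringLength(string memory str)
        public
        pure
        returns (uint256 length)
    {
        bytes memory stringRep = bytes(str);
        uint256 i = 0;
        while (i < stringRep.length) {
            // bytes1 cannot be compared with an integer literal in 0.8.x,
            // so convert the byte to uint8 before shifting and comparing.
            uint8 lead = uint8(stringRep[i]);
            if (lead >> 7 == 0) {
                i += 1; // 0xxxxxxx: single-byte (ASCII) character
            } else if (lead >> 5 == 0x6) {
                i += 2; // 110xxxxx: two-byte sequence
            } else if (lead >> 4 == 0xE) {
                i += 3; // 1110xxxx: three-byte sequence
            } else if (lead >> 3 == 0x1E) {
                i += 4; // 11110xxx: four-byte sequence
            } else {
                i += 1; // unexpected byte: advance by one for safety
            }
            length++;
        }
    }
}
```

With this version, calling utfStringLength(unicode"héllo") should return 5, whereas bytes(unicode"héllo").length is 6, because é occupies two bytes in UTF-8.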