Unicode is a single unified character set, an industry standard for the consistent representation and manipulation of text expressed in most of the world's writing systems.
It handles practically any script and language used on this planet and supports a comprehensive set of mathematical and technical symbols.
Unicode can be implemented by different character encodings and is predominant in the internationalisation and localisation of computer software. The most commonly used encoding is UTF-8.
- UTF-8 stands for Unicode Transformation Format, 8-bit. It is a lossless, variable-length encoding of Unicode characters built from octets (8-bit units).
- UTF-8 uses 1 byte for every ASCII character (with the same code values as in the standard ASCII encoding) and up to 4 bytes for other characters.
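
The byte counts described above can be checked with a minimal Python sketch. The helper name `utf8_length` and the sample characters are illustrative choices, not part of any standard; only the built-in `str.encode` and `bytes.decode` methods are used.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode the given text."""
    return len(ch.encode("utf-8"))


# Illustrative samples: one character from each UTF-8 length class.
samples = ["A", "é", "€", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) [{hex_bytes}]")

# ASCII characters keep their ASCII byte values in UTF-8.
ascii_matches = "A".encode("utf-8") == "A".encode("ascii")

# The encoding is lossless: decoding returns the original text.
round_trip_ok = all(ch.encode("utf-8").decode("utf-8") == ch for ch in samples)
```

Under these assumptions, the ASCII letter takes 1 byte, the accented Latin letter 2, the euro sign 3, and the emoji (which lies outside the Basic Multilingual Plane) 4.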