Encoding utility

UTF-8 Inspector

Type or paste any string and see how UTF-8 encodes it: codepoints (U+xxxx), byte counts per character, BOM detection, per-character byte breakdown. Useful when an APDU or NDEF record is misrendering and you need to know if it’s a UTF-8, UTF-16, or BMPString issue.
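A rough sketch of the triage described above, not the tool's own code: check raw bytes for a BOM, then see what they decode to under each candidate encoding. The function names are hypothetical, and `utf-16-be` stands in for BMPString here on the assumption that the payload is big-endian two-byte units.

```python
import codecs

# BOM signatures, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM that prefixes data, or None (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def candidate_decodings(data: bytes) -> dict:
    """Decode data under each candidate encoding; None marks a failed decode."""
    results = {}
    for encoding in ("utf-8", "utf-16-be", "utf-16-le"):
        try:
            results[encoding] = data.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


if __name__ == "__main__":
    payload = bytes.fromhex("00480069")  # example bytes, big-endian two-byte units
    print(detect_bom(payload))
    print(candidate_decodings(payload))
```

A decode that succeeds is not proof of the right encoding, since many byte sequences are valid under more than one; comparing the candidates side by side is what usually reveals the mismatch.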

Client-side · Per-char view · BOM aware

All processing runs locally.

About UTF-8

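UTF-8 encodes each Unicode codepoint in 1–4 bytes. ASCII (U+0000–U+007F) is 1 byte; U+0080–U+07FF, which covers accented Latin letters, Greek, and Cyrillic, is 2 bytes; the rest of the BMP (U+0800–U+FFFF), including CJK, is 3 bytes; supplementary-plane (astral) characters (U+10000–U+10FFFF), including most emoji, are 4 bytes.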
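The byte counts can be checked with a few lines of Python. This is a minimal sketch of a per-character breakdown, not the inspector's implementation; `utf8_breakdown` is a hypothetical name, and the sample string was picked to hit each of the four lengths.

```python
def utf8_breakdown(text):
    """Return (char, codepoint label, byte count, hex bytes) for each character."""
    rows = []
    for ch in text:
        encoded = ch.encode("utf-8")  # 1 to 4 bytes, depending on the codepoint
        rows.append((ch, f"U+{ord(ch):04X}", len(encoded), encoded.hex(" ").upper()))
    return rows


if __name__ == "__main__":
    # One sample per length: ASCII, accented Latin, CJK, emoji.
    for ch, codepoint, count, hex_bytes in utf8_breakdown("Aé中😀"):
        print(f"{codepoint:8} {count} byte(s)  {hex_bytes}")
```

Iterating over a Python `str` yields codepoints rather than UTF-16 code units, so an astral character such as U+1F600 is reported once with its 4 bytes.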

Spec

RFC 3629 (UTF-8).

When you really wanted ASCII

Use the ASCII converter for plain ASCII inputs.

ASCII ↔ HEX →

Smart-card use

On Java Card, ASN.1 UTF8String values (universal tag 0x0C) show up inside BER-TLV structures; see the TLV parser.

TLV parser →