Last week, I decided to buy a new web domain with my remaining web cash. I had always thought it would be nice to have a short, easy-to-remember domain name to use as a URL shortener. Something like cedric.com came to mind, but unfortunately it was already registered. So I wondered what would happen if I bought cédric.com, with the accented “é”. Without doing any prior research, and elated by the idea of a brand new domain name, I headed to GoDaddy and found that cédric.com was indeed available.
After proceeding to checkout, here’s what happened:
- In the address bar of my browser, I typed cédric.com and, to my surprise, the address was converted to xn--cdric-bsa.com
This was totally unanticipated, and it didn’t look nice at all. Certainly not cool enough to be used as a URL shortener. Only then did I decide to look into it, and here is what I found:
- Domain names are traditionally restricted to a small subset of ASCII (letters, digits and hyphens), which, unlike Unicode, does not include accented characters or non-Latin scripts.
- So, for Internationalized Domain Names (IDN), labels containing non-ASCII characters are encoded using Punycode and prefixed with “xn--”.
- This is why my domain cédric.com got translated as xn--cdric-bsa.com
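The conversion above can be reproduced with a short Python sketch. It relies on the standard library's built-in "idna" and "punycode" codecs; the helper names `to_ascii_domain` and `to_unicode_domain` are my own, made up for illustration, and a browser or registrar may apply newer IDNA rules than Python's codec does, so treat this as an approximation rather than exactly what GoDaddy or the browser runs.

```python
# Minimal sketch: convert between the Unicode and ASCII (Punycode) forms
# of a domain name using Python's built-in codecs.
# The function names below are hypothetical helpers, not a standard API.

def to_ascii_domain(domain: str) -> str:
    """Return the ASCII form of a domain, e.g. what the address bar shows."""
    # The "idna" codec encodes each dot-separated label with Punycode
    # and adds the "xn--" prefix to labels that contain non-ASCII characters.
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Return the Unicode form of an ASCII (xn--) domain."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii_domain("cédric.com"))           # xn--cdric-bsa.com
    print(to_unicode_domain("xn--cdric-bsa.com"))  # cédric.com
    # Raw Punycode of a single label, without the "xn--" prefix:
    print("cédric".encode("punycode"))             # b'cdric-bsa'
```

Running something like this before checkout would have shown me the ASCII form of the domain in advance. A plain ASCII name such as cedric.com passes through unchanged.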
So, the moral of the story: never, I repeat, never buy a domain name with non-ASCII characters unless you are happy to see it displayed as one of these odd-looking “xn--” names.
“Learning is experience, everything else is just information” – Albert Einstein