A few days ago I needed to convert the Unicode code of an emoji into the character itself, for example 1f601 into the corresponding “smiley face”, but I could not find a C# method that turns 1f601 into text. Nothing I tried with Encoding.Unicode gave the right result. In the end I pasted the emoji characters directly into the source, where Visual Studio displayed them correctly, used them as-is without any conversion, and let the matter drop.
Today, while working on a Markdown editor, I tested the encoding again because of the earlier GFM issue. I did not find much reliable information, but I did find plenty of emoji-to-Unicode comparison tables, such as https://apps.timwhitlock.info/emoji/tables/unicode. Let's take the smiley face https://apps.timwhitlock.info/unicode/inspect/hex/1F601 as the test subject.
1. Converting emoji characters to their encoding
【C#】
Encoding.UTF32.GetBytes("😁") -> ["1", "f6", "1", "0"] (little-endian; read in reverse, the bytes are 00 01 f6 01, i.e. 0x1f601)
【js】
"😁".codePointAt(0).toString(16) -> 1f601
【java】
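import java.nio.charset.Charset;

public class EmojiToBytes {
    public static void main(String[] args) {
        // Java's "UTF-32" charset encodes big-endian and writes no BOM
        byte[] bytes = "😁".getBytes(Charset.forName("UTF-32"));
        System.out.println(getBytesCode(bytes)); // \x0\x1\xf6\x1
    }

    // Format each byte as an unpadded hex value prefixed with a literal backslash and x
    private static String getBytesCode(byte[] bytes) {
        StringBuilder code = new StringBuilder();
        for (byte b : bytes) {
            code.append("\\x").append(Integer.toHexString(b & 0xff));
        }
        return code.toString();
    }
}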
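The UTF-32 results are consistent: each one gives the code point 1f601, although C# returns the bytes in little-endian order and Java in big-endian order.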
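The JS one-liner has a close Java counterpart that skips the byte array entirely. The following is a minimal sketch; the class name EmojiCodePoint and the helper firstCodePointHex are made up for illustration, and the emoji is written as its two UTF-16 escapes so the result does not depend on the source file encoding.

```java
class EmojiCodePoint {
    // Return the hex code point of the first character in text.
    // codePointAt reads a surrogate pair as a single code point.
    static String firstCodePointHex(String text) {
        return Integer.toHexString(text.codePointAt(0));
    }

    public static void main(String[] args) {
        String smiley = "\uD83D\uDE01"; // the smiley face from the tables above
        System.out.println(firstCodePointHex(smiley)); // 1f601
    }
}
```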
【C#】
Encoding.UTF8.GetBytes("😁") -> ["f0", "9f", "98", "81"]
【js】
encodeURIComponent("😁") -> %F0%9F%98%81
The UTF-8 results are consistent: both give the bytes f0 9f 98 81.
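For completeness, the same UTF-8 bytes can be produced in Java. This is only a sketch: the class EmojiUtf8 and the helper utf8Percent are hypothetical names, and unlike encodeURIComponent the helper percent-escapes every byte, including plain ASCII.

```java
import java.nio.charset.StandardCharsets;

class EmojiUtf8 {
    // Format every UTF-8 byte of text as a percent-escape with two uppercase hex digits.
    static String utf8Percent(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            out.append(String.format("%%%02X", b & 0xff));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String smiley = "\uD83D\uDE01"; // the smiley face, U+1F601
        System.out.println(utf8Percent(smiley)); // %F0%9F%98%81
    }
}
```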
2. Converting an encoding back to an emoji character
【js】
String.fromCodePoint(0x1f601) -> "😁" (the argument is the code point, which is the same number as the UTF-32 value)
【java】
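import java.nio.charset.Charset;

public class CodeToEmoji {
    public static void main(String[] args) {
        String emojiName = "1f601"; // hex code point; UTF-32 stores it in 4 bytes
        int emojiCode = Integer.parseInt(emojiName, 16);
        byte[] emojiBytes = int2bytes(emojiCode);
        // Without a BOM, Java decodes "UTF-32" as big-endian
        String emojiChar = new String(emojiBytes, Charset.forName("UTF-32"));
        System.out.println(emojiChar);
    }

    // Pack an int into 4 bytes, most significant byte first (big-endian)
    public static byte[] int2bytes(int num) {
        byte[] result = new byte[4];
        result[0] = (byte) ((num >>> 24) & 0xff);
        result[1] = (byte) ((num >>> 16) & 0xff);
        result[2] = (byte) ((num >>> 8) & 0xff);
        result[3] = (byte) (num & 0xff);
        return result;
    }
}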
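The round trip through a UTF-32 byte array is not strictly necessary in Java, because Character.toChars turns a code point directly into its UTF-16 code units. A minimal sketch follows; the class EmojiFromCode and the helper fromHexCodePoint are illustrative names, not part of any library.

```java
class EmojiFromCode {
    // Parse a hex code point such as "1f601" and build the matching string.
    // Character.toChars returns one char for BMP code points and a surrogate
    // pair for code points above U+FFFF; it throws IllegalArgumentException
    // for values that are not valid code points.
    static String fromHexCodePoint(String hex) {
        int codePoint = Integer.parseInt(hex, 16);
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String smiley = fromHexCodePoint("1f601");
        System.out.println(smiley.length());                         // 2
        System.out.println(smiley.codePointCount(0, smiley.length())); // 1
    }
}
```

The two printed numbers show why a single 16-bit character value is not enough here: the string holds one code point but two UTF-16 code units.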
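C# example: converting between Chinese characters and Unicode escapes
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class UnicodeConverter
{
    /// <summary>
    /// Convert a string to Unicode escapes
    /// </summary>
    /// <param name="source">Source string</param>
    /// <returns>Unicode-escaped string</returns>
    public static string String2Unicode(string source)
    {
        // Encoding.Unicode is UTF-16 little-endian: two bytes per code unit, low byte first
        byte[] bytes = Encoding.Unicode.GetBytes(source);
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < bytes.Length; i += 2)
        {
            // Emit the high byte first, each as two lowercase hex digits
            stringBuilder.AppendFormat("\\u{0}{1}", bytes[i + 1].ToString("x2"), bytes[i].ToString("x2"));
        }
        return stringBuilder.ToString();
    }

    /// <summary>
    /// Convert Unicode escapes back to a string
    /// </summary>
    /// <param name="source">Unicode-escaped string</param>
    /// <returns>Normal string</returns>
    public static string Unicode2String(string source)
    {
        // Replace each escape with the UTF-16 code unit it names
        return new Regex(@"\\u([0-9A-F]{4})", RegexOptions.IgnoreCase | RegexOptions.Compiled).Replace(
            source, x => Convert.ToChar(Convert.ToUInt16(x.Groups[1].Value, 16)).ToString());
    }
}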
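Because this conversion works on UTF-16 code units, a Chinese character such as 中 becomes one escape (\u4e2d), while an emoji above U+FFFF becomes a surrogate pair: for 1f601, subtracting 0x10000 leaves 0xf601, whose top ten bits give 0xd800 + 0x3d = 0xd83d and whose low ten bits give 0xdc00 + 0x201 = 0xde01. The Java sketch below mirrors the same idea so the pair can be checked; the class UnicodeEscapes and its two helpers are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeEscapes {
    // Write every UTF-16 code unit as a backslash, the letter u and four hex digits.
    static String toEscapes(String source) {
        StringBuilder out = new StringBuilder();
        for (char unit : source.toCharArray()) {
            out.append(String.format("\\u%04x", (int) unit));
        }
        return out.toString();
    }

    // Replace each four-digit escape with its code unit; other text is copied unchanged.
    static String fromEscapes(String source) {
        Matcher m = Pattern.compile("\\\\u([0-9a-fA-F]{4})").matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(source, last, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String smiley = new String(Character.toChars(0x1f601));
        System.out.println(toEscapes(smiley)); // two escapes: d83d then de01
        System.out.println(fromEscapes(toEscapes(smiley)).equals(smiley)); // true
    }
}
```

Decoding the two escapes one code unit at a time still yields the emoji, because the adjacent high and low surrogates form a valid pair once they sit next to each other in the string.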
References:
https://www.jianshu.com/p/8a416537deb3
https://blog.csdn.net/a19881029/article/details/13511729
https://apps.timwhitlock.info/emoji/tables/unicode