How do I get the decimal value of a Unicode character in C#?

Problem description

How do I get the numeric value of a Unicode character in C#?

For example, given the Tamil character அ (U+0B85), the output should be 2949 (i.e. 0x0B85).

See also

  • C++: How to get decimal value of a unicode character in c++
  • Java: How can I get a Unicode character's code?

Multi code-point characters

Some characters require multiple code points. In these examples, every UTF-16 code unit is still in the Basic Multilingual Plane:

  • U+0072 U+0327 U+030C (3 code points rendered as one character)
  • U+0072 U+0338 U+0327 U+0316 U+0317 U+0300 U+0301 U+0302 U+0308 U+0360 (10 code points rendered as one character)

The larger point is that one "character" can require many Unicode code points, and therefore more than one, two, or three UTF-16 code units. In C#, where strings are UTF-16, that means more than one char; a single character can require 17 char values.

My question was about converting a char into its UTF-16 code unit value. Even if an entire string of 17 char values represents only one "character", I still want to know how to convert each UTF-16 code unit into a numeric value.

e.g.

String s = "அ";

int i = Unicode(s[0]);

Where Unicode returns the integer value, as defined by the Unicode standard, for the first character of the input expression.

Solution

It's basically the same as Java. If you've got it as a char, you can just convert to int implicitly:

char c = '\u0b85';

// Implicit conversion: char is basically a 16-bit unsigned integer
int x = c;
Console.WriteLine(x); // Prints 2949

If you've got it as part of a string, just get that single character first:

string text = GetText();
int x = text[2]; // Or whatever...
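
To get the value of every UTF-16 code unit in a string, the same conversion can be applied to each char in turn. The following is a minimal sketch, not a definitive implementation; the class name CodeUnitDemo and the helper CodeUnitValues are made up for this illustration.

```csharp
using System;
using System.Linq;

public static class CodeUnitDemo
{
    // Hypothetical helper: returns the numeric value of every UTF-16
    // code unit (every char) in the string, in order.
    public static int[] CodeUnitValues(string text)
    {
        return text.Select(c => (int)c).ToArray();
    }

    public static void Main()
    {
        // 'r' followed by two combining marks: displayed as one character,
        // but stored as three char values.
        string text = "r\u0327\u030C";

        foreach (int value in CodeUnitValues(text))
        {
            Console.WriteLine($"{value} (0x{value:X4})");
        }
        // Prints:
        // 114 (0x0072)
        // 807 (0x0327)
        // 780 (0x030C)
    }
}
```

Each value here is a code unit, not necessarily a whole code point, so a character outside the Basic Multilingual Plane would show up as two surrogate values.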

Note that characters not in the basic multilingual plane will be represented as two UTF-16 code units. There is support in .NET for finding the full Unicode code point, but it's not simple.
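
One way to get full code points is char.ConvertToUtf32, which combines a surrogate pair into a single value. This is only a sketch under stated assumptions: the class name CodePointDemo and the helper CodePoints are hypothetical, and the input is assumed to be well-formed UTF-16 (an unpaired surrogate makes ConvertToUtf32 throw an ArgumentException).

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDemo
{
    // Hypothetical helper: yields each Unicode code point in the string,
    // combining a surrogate pair into a single value.
    public static IEnumerable<int> CodePoints(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            // Reads one code point starting at index i; throws
            // ArgumentException if a surrogate is unpaired.
            int codePoint = char.ConvertToUtf32(text, i);
            yield return codePoint;

            // The code point used two char values, so skip the low surrogate.
            if (char.IsHighSurrogate(text[i]))
            {
                i++;
            }
        }
    }

    public static void Main()
    {
        // U+0B85 is in the BMP; U+1F600 needs a surrogate pair (D83D DE00).
        string text = "\u0B85\U0001F600";
        Console.WriteLine(text.Length); // Prints 3

        foreach (int cp in CodePoints(text))
        {
            Console.WriteLine($"{cp} (U+{cp:X4})");
        }
        // Prints:
        // 2949 (U+0B85)
        // 128512 (U+1F600)
    }
}
```

On .NET Core 3.0 and later, string.EnumerateRunes() offers a similar iteration over System.Text.Rune values; the loop above is used here because it also works on .NET Framework.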
