Unicode Encoding in C# Examples

In C#, Unicode encoding matters whenever an application handles text in multiple languages or exchanges text with other platforms. The System.Text namespace provides several classes for Unicode encodings, including UnicodeEncoding for UTF-16, UTF8Encoding for UTF-8, and UTF32Encoding for UTF-32. This article offers practical examples of using UnicodeEncoding in C#.

Understanding Unicode and UTF-16

Unicode is a universal character encoding standard that assigns a unique code point to every character across languages and scripts. UTF-16 is an encoding form of Unicode built on 16-bit code units: characters in the Basic Multilingual Plane take one code unit (two bytes), while supplementary characters, such as most emoji, take a surrogate pair of two code units (four bytes).
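To make the two-byte versus four-byte distinction concrete, here is a minimal sketch that counts the UTF-16 bytes of a few sample strings. The class name Utf16SizeDemo and the helper Utf16ByteCount are illustrative names chosen for this example, not part of .NET.

```csharp
using System;
using System.Text;

public static class Utf16SizeDemo
{
    // Hypothetical helper: number of bytes the text occupies when encoded as UTF-16.
    public static int Utf16ByteCount(string text) => Encoding.Unicode.GetByteCount(text);

    public static void Main()
    {
        string emoji = "\U0001F600"; // U+1F600, a supplementary character

        Console.WriteLine(Utf16ByteCount("A"));              // 2 (one code unit)
        Console.WriteLine(Utf16ByteCount("世"));             // 2 (one code unit)
        Console.WriteLine(Utf16ByteCount(emoji));            // 4 (surrogate pair)

        // A C# string counts UTF-16 code units, so the emoji has Length 2.
        Console.WriteLine(emoji.Length);                     // 2
        Console.WriteLine(char.IsSurrogatePair(emoji, 0));   // True
    }
}
```

Because string.Length counts code units rather than characters, code that indexes into text containing supplementary characters should check for surrogate pairs before splitting it.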

UnicodeEncoding Class

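The UnicodeEncoding class handles UTF-16 encoding in .NET and supports both big-endian and little-endian byte order. The parameterless constructor produces a little-endian encoding that supplies a byte order mark (BOM), which matches the static Encoding.Unicode property.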
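The effect of byte order can be seen by encoding the same character both ways and inspecting the byte order mark each encoding reports. This is a small sketch; ByteOrderDemo and ToHex are names made up for the example.

```csharp
using System;
using System.Text;

public static class ByteOrderDemo
{
    // Hypothetical helper: formats bytes as space-separated uppercase hex, e.g. "FF FE".
    public static string ToHex(byte[] bytes) => BitConverter.ToString(bytes).Replace("-", " ");

    public static void Main()
    {
        var littleEndian = new UnicodeEncoding();          // little-endian, BOM enabled
        var bigEndian = new UnicodeEncoding(true, true);   // big-endian, BOM enabled

        // 'A' is U+0041; only the order of the two bytes differs.
        Console.WriteLine(ToHex(littleEndian.GetBytes("A")));   // 41 00
        Console.WriteLine(ToHex(bigEndian.GetBytes("A")));      // 00 41

        // GetBytes does not prepend the BOM; GetPreamble returns it separately.
        Console.WriteLine(ToHex(littleEndian.GetPreamble()));   // FF FE
        Console.WriteLine(ToHex(bigEndian.GetPreamble()));      // FE FF
    }
}
```

Note that the BOM only appears when something writes the preamble explicitly, as stream and file writers typically do at the start of a new file.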

Encoding and Decoding Examples

Example 1: Encode String to UTF-16 Bytes

The following example converts a string to UTF-16 bytes and prints them in hexadecimal:

using System;
using System.Text;

public class UnicodeEncodingExample
{
    public static void Main()
    {
        // Initialize the string to encode
        string originalText = "Hello, 世界!";

        // Create an instance of UnicodeEncoding
        UnicodeEncoding unicode = new UnicodeEncoding();

        // Encode the string to a byte array
        byte[] encodedBytes = unicode.GetBytes(originalText);

        // Display the encoded bytes
        Console.WriteLine("Encoded bytes:");
        foreach (byte b in encodedBytes)
        {
            Console.Write($"{b:X2} ");
        }
    }
}

Example 2: Decode UTF-16 Bytes to String

The following example decodes UTF-16 bytes back into a string:

using System;
using System.Text;

public class UnicodeDecodingExample
{
    public static void Main()
    {
        // UTF-16 little-endian bytes for "Hello, 世界!" (世 = U+4E16, 界 = U+754C)
        byte[] encodedBytes = { 72, 0, 101, 0, 108, 0, 108, 0, 111, 0, 44, 0, 32, 0, 22, 78, 76, 117, 33, 0 };

        // Create an instance of UnicodeEncoding
        UnicodeEncoding unicode = new UnicodeEncoding();

        // Decode the byte array back to a string
        string decodedText = unicode.GetString(encodedBytes);

        // Display the decoded string
        Console.WriteLine("Decoded string: " + decodedText);
    }
}
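Decoding as shown above silently tolerates malformed input. As a hedged sketch of a stricter approach, the three-argument UnicodeEncoding constructor accepts a throwOnInvalidBytes flag; the wrapper TryDecodeStrict and the class StrictDecodingDemo are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text;

public static class StrictDecodingDemo
{
    // Hypothetical helper: decodes UTF-16 LE bytes and reports malformed input
    // by returning false instead of throwing.
    public static bool TryDecodeStrict(byte[] bytes, out string text)
    {
        // Arguments: bigEndian = false, byteOrderMark = false, throwOnInvalidBytes = true.
        var strict = new UnicodeEncoding(false, false, true);
        try
        {
            text = strict.GetString(bytes);
            return true;
        }
        catch (ArgumentException) // DecoderFallbackException derives from ArgumentException
        {
            text = null;
            return false;
        }
    }

    public static void Main()
    {
        byte[] valid = { 0x16, 0x4E, 0x4C, 0x75 };       // "世界"
        byte[] malformed = { 0x00, 0xD8, 0x41, 0x00 };   // lone high surrogate U+D800, then "A"

        Console.WriteLine(TryDecodeStrict(valid, out _));       // True
        Console.WriteLine(TryDecodeStrict(malformed, out _));   // False

        // The default instance does not throw; it substitutes U+FFFD for the bad sequence.
        string lenient = new UnicodeEncoding().GetString(malformed);
        Console.WriteLine(lenient.IndexOf('\uFFFD') >= 0);      // True
    }
}
```

Whether to throw or substitute depends on the application: strict decoding suits validation of untrusted input, while replacement suits display of text that may be slightly damaged.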

Example 3: Big-Endian Unicode Encoding

When a protocol or file format requires big-endian byte order, pass true as the first constructor argument:

using System;
using System.Text;

public class BigEndianUnicodeEncodingExample
{
    public static void Main()
    {
        // Create a big-endian UnicodeEncoding that also supplies a byte order mark
        UnicodeEncoding bigEndianUnicode = new UnicodeEncoding(bigEndian: true, byteOrderMark: true);

        // Encode a string
        string text = "Bonjour, monde!";
        byte[] encodedBytes = bigEndianUnicode.GetBytes(text);

        // Display the encoded bytes in big-endian order
        Console.WriteLine("Big-endian encoded bytes:");
        foreach (byte b in encodedBytes)
        {
            Console.Write($"{b:X2} ");
        }
    }
}

Practical Applications

  • Internationalization: Ensure that text data can handle multiple languages and scripts.
  • Data Exchange: Facilitate text exchange between applications with different encoding standards.
  • File I/O: Read and write files in formats that support Unicode characters.
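For the file I/O case, a minimal sketch might write a string as UTF-16 and read it back. The class Utf16FileDemo, the helper RoundTrip, and the file name utf16-demo.txt are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf16FileDemo
{
    // Hypothetical helper: writes text as UTF-16 little-endian, then reads it back.
    public static string RoundTrip(string path, string text)
    {
        File.WriteAllText(path, text, Encoding.Unicode);
        return File.ReadAllText(path, Encoding.Unicode);
    }

    public static void Main()
    {
        // Illustrative temp file; any writable path would do.
        string path = Path.Combine(Path.GetTempPath(), "utf16-demo.txt");
        string original = "Hello, 世界!";

        string restored = RoundTrip(path, original);
        Console.WriteLine(restored == original);   // True

        File.Delete(path);
    }
}
```

Passing the encoding explicitly on both the write and the read avoids relying on defaults, which helps when the file is later consumed by another application.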

Conclusion

Unicode encoding in C# is essential for working with diverse text data. By understanding and implementing UnicodeEncoding correctly, developers can manage globalized text applications effectively.

