C# string

Summary: in this tutorial, you’ll learn about the C# string type and the basic string operations.

Introduction to the C# string

C# uses the string keyword to represent the string type. The string keyword is an alias for the System.String type, so string and String are equivalent.

Declare a string

The following example declares a string variable without initializing it:

string message;

After declaring the string variable, you can assign it a string literal. To form a string literal, you place the string text inside double quotes ("..."). For example:

message = "Hi";

The following example declares and initializes the string using one statement:

string message = "Hi";

To create a zero-length string, you can use String.Empty like this:

string message = String.Empty;

It’s equivalent to the following:

string message = "";
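As a quick check of this equivalence, the sketch below compares the two forms. It also calls String.IsNullOrEmpty(), which this tutorial does not otherwise cover; it is a standard .NET method that returns true for both null and zero-length strings. The variable names are only illustrative.

```csharp
using System;

string fromField = String.Empty;
string fromLiteral = "";

// Both forms produce a zero-length string with the same value.
Console.WriteLine(fromField == fromLiteral);          // True
Console.WriteLine(fromField.Length);                  // 0

// IsNullOrEmpty treats null and "" alike.
Console.WriteLine(String.IsNullOrEmpty(fromLiteral)); // True
```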

Get the length of a string

A string has a Length property that returns the number of characters (char values) in the string. To access the Length property, you use the dot operator (.) like this:

string message = "Hello";

Console.WriteLine(message.Length);

Output:

5

Concatenate two strings

To concatenate two strings into one, you use the + operator. For example:

string message = "Good" + " Morning";

Console.WriteLine(message);

Output:

Good Morning

To append a string to another, you can also use the += operator. For example:

string message = "Good";
message += " Morning!";

Console.WriteLine(message);

Output:

Good Morning!

The String type also provides the Join() method, which concatenates two or more strings into a single string using a separator.

Besides the + operator, you can use the Concat() method to concatenate two or more strings into one.
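The following minimal sketch shows both methods side by side. The sample values and variable names are made up for illustration.

```csharp
using System;

// Join inserts the separator between the values.
string csv = String.Join(", ", "red", "green", "blue");
Console.WriteLine(csv);      // red, green, blue

// Concat glues the values together with no separator.
string greeting = String.Concat("Good", " ", "Morning!");
Console.WriteLine(greeting); // Good Morning!
```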

C# string is immutable

C# strings are immutable: any operation that appears to change a string actually produces a new string. For example:

string message = "C#";
message += " string";

Console.WriteLine(message);

Output:

C# string

In this example:

  • First, define the message string variable and initialize it to the string literal "C#".
  • Second, concatenate the message string variable with another string literal " string".
  • Third, show the message string to the console.

When concatenating the message with the " string", C# doesn’t change the original string message but creates a new string that holds the concatenated string.
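One way to observe this is to keep a second variable that refers to the original string before appending. The sketch below is only an illustration; the variable name original is an assumption, not part of the example above.

```csharp
using System;

string original = "C#";
string message = original;   // both variables refer to the same string object

message += " string";        // creates a new string and assigns it to message

Console.WriteLine(original); // C#
Console.WriteLine(message);  // C# string

// The two variables now refer to different objects.
Console.WriteLine(Object.ReferenceEquals(original, message)); // False
```

The original string is still "C#" after the append, because += built a new string rather than modifying the existing one.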

Accessing individual characters

Internally, C# stores a string as a collection of read-only characters. To access an individual character in a string, you use the square bracket notation [] with an index:

s[index]

The first character has an index of 0. The second character has an index of 1, and so on. For example:

string message = "Hello";

Console.WriteLine(message[0]); // H

Output:

H

Because a string is immutable, you can only read individual characters from it.

The following example results in a compilation error because it attempts to change the first character of a string:

string name = "Jill";
name[0] = 'B'; // compile-time error: the string indexer is read-only
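To get a string with a different first character, you have to build a new string instead. The sketch below shows two possible approaches using the standard Substring() and ToCharArray() methods, which this tutorial has not covered; the variable names are illustrative.

```csharp
using System;

string name = "Jill";

// Option 1: combine a new first character with the rest of the string.
string viaSubstring = "B" + name.Substring(1);

// Option 2: copy the characters into an array, which is writable.
char[] chars = name.ToCharArray();
chars[0] = 'B';
string viaArray = new string(chars);

Console.WriteLine(viaSubstring); // Bill
Console.WriteLine(viaArray);     // Bill
Console.WriteLine(name);         // Jill (unchanged)
```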

Escape sequences

A string literal can contain special characters, such as tabs and newlines, written with a backslash (\). These are called escape sequences. For example:

string header = "id\tname";

Console.WriteLine(header);

Output:

id      name

The header string uses the \t escape sequence for the tab character, so the console output has a tab between id and name.
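Escape sequences can be combined in one literal. The following sketch adds the \n (newline) escape sequence to the same idea; the second row of data is made up for illustration.

```csharp
using System;

// \t inserts a tab; \n starts a new line.
string table = "id\tname\n1\tJoe";

// Prints two lines, each with a tab between the columns.
Console.WriteLine(table);

// Each escape sequence counts as a single character.
Console.WriteLine(table.Length); // 13
```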

If a string literal contains double quotes, you need to use the backslash character \ to escape them. For example:

string message = "\"C# is awesome\". They said.";
Console.WriteLine(message);

Output:

"C# is awesome". They said.

In this example, the literal string contains two double quotes:

"C# is awesome". They said.

Therefore, we use the backslash character (\) to escape each of them:

"\"C# is awesome\". They said."

If a string contains the backslash character as a literal character, you need to use another backslash character to escape it like this:

string path = "C:\\users\\";

Console.WriteLine(path);

Output:

C:\users\

In this example, the directory path “C:\users\” string contains the backslashes. Therefore, we need to escape them using backslashes.

Verbatim string

If a string contains backslashes, you can escape them using backslashes. But double backslashes make the string difficult to read.

To fix this, you can turn a literal string into a verbatim string by prefixing it with the @ symbol. A verbatim string disables escape sequences, so a backslash is just a backslash. For example:

string path = @"C:\users\";

Console.WriteLine(path);

Output:

C:\users\

Because verbatim strings preserve newline characters as part of the string text, you can use them to create multiline strings. For example:

string content = @"I'm a multiline
string that spans multiple
lines";

Console.WriteLine(content);

Output:

I'm a multiline
string that spans multiple
lines
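Because the backslash has no special meaning in a verbatim string, \" cannot be used to embed a double quote. Instead, you write two double quotes (""). The sketch below compares a verbatim string with the equivalent regular string; the sentence itself is just sample text.

```csharp
using System;

// Inside a verbatim string, "" stands for a single double quote.
string verbatim = @"She said ""C:\users\"" is the path.";

// The same text as a regular string needs \" and \\.
string regular = "She said \"C:\\users\\\" is the path.";

Console.WriteLine(verbatim);            // She said "C:\users\" is the path.
Console.WriteLine(verbatim == regular); // True
```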

Interpolated string

Suppose you have a variable called name:

string name = "Joe";

And you want to embed the variable in a literal string.

To do that, you prefix the literal string with the $ character and place the variable inside the curly braces {}:

string name = "Joe";
string greeting = $"Hello {name}!";

Console.WriteLine(greeting);

Output:

Hello Joe!

A literal string with the prefix $ is an interpolated string.

When encountering the $ prefix, the compiler generates code that replaces {name} with the value of the name variable at run time. This feature is called string interpolation.
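The braces are not limited to plain variables. As a minimal sketch, the following example places expressions inside the braces and doubles the braces to output them literally; the variables a and b are introduced here only for illustration.

```csharp
using System;

int a = 2;
int b = 3;
string name = "Joe";

// Any expression can appear between the braces.
string sum = $"{a} + {b} = {a + b}";
string shout = $"Hello {name.ToUpperInvariant()}!";

// Double the braces to output a literal { or }.
string braces = $"{{name}} is replaced by {name}";

Console.WriteLine(sum);    // 2 + 3 = 5
Console.WriteLine(shout);  // Hello JOE!
Console.WriteLine(braces); // {name} is replaced by Joe
```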

UTF-8 strings

The web uses UTF-8 as the character encoding. Each character takes 1 to 4 bytes.

But in .NET, the string type uses UTF-16. It means that each character takes at least 2 bytes.
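To make the size difference concrete, the sketch below compares the Length of a few strings (counted in 2-byte UTF-16 code units) with the number of bytes their UTF-8 form needs. It relies on Encoding.UTF8.GetByteCount() from System.Text; the sample characters are arbitrary choices.

```csharp
using System;
using System.Text;

string ascii = "A";           // U+0041
string accented = "\u00E9";   // U+00E9 (e with acute accent)
string emoji = "\U0001F600";  // U+1F600, outside the Basic Multilingual Plane

// Length counts UTF-16 code units (char values), 2 bytes each.
Console.WriteLine(ascii.Length);    // 1
Console.WriteLine(accented.Length); // 1
Console.WriteLine(emoji.Length);    // 2 (a surrogate pair)

// GetByteCount reports how many bytes the UTF-8 form needs.
Console.WriteLine(Encoding.UTF8.GetByteCount(ascii));    // 1
Console.WriteLine(Encoding.UTF8.GetByteCount(accented)); // 2
Console.WriteLine(Encoding.UTF8.GetByteCount(emoji));    // 4
```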

If you use C# to process characters for the web, you need to convert UTF-16 to UTF-8.

Note that if you use ASP.NET Core, the framework does the conversion for you automatically.

To convert a string in UTF-16 to UTF-8, you use the following:

var utf8 = Encoding.UTF8.GetBytes("Hello WWW");

This manual conversion adds overhead and can slow down the program.

To solve this issue, C# 11 introduced UTF-8 string literals. A UTF-8 string literal has the u8 suffix, like this:

var utf8 = "Hello WWW"u8;

The UTF-8 string literal syntax is not only more elegant but also more efficient than converting a string from UTF-16 to UTF-8 at run time. Note that the type of a u8 literal is ReadOnlySpan<byte>, not string.
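The following sketch, which assumes a project that targets C# 11 or later, compares the two approaches and checks that they produce the same bytes. The variable names are illustrative.

```csharp
using System;
using System.Text;

// Run-time conversion: allocates a new byte array on each call.
byte[] converted = Encoding.UTF8.GetBytes("Hello WWW");

// C# 11 UTF-8 literal: the bytes are produced at compile time.
ReadOnlySpan<byte> literal = "Hello WWW"u8;

ReadOnlySpan<byte> convertedSpan = converted;
Console.WriteLine(literal.Length);                       // 9
Console.WriteLine(literal.SequenceEqual(convertedSpan)); // True

// ToArray() copies the bytes if a byte[] is required.
byte[] copy = literal.ToArray();
Console.WriteLine(copy.Length);                          // 9
```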

Note that C# 11 also introduced the concept of raw strings that we will cover in another tutorial.

Summary

  • C# uses the string keyword to represent the string type.
  • The string keyword is an alias for the System.String type. Therefore, string and String are the same.
  • C# strings are immutable.
  • Use the Length property to get the length of the string.
  • Use the + operator to concatenate two strings and return a new string.
  • Use the square bracket with an index to access an individual character in a string.
  • Use a verbatim string with the @ prefix to disable the escape character so that backslashes have no special meaning.
  • Use an interpolated string with the $ prefix to embed a variable in a literal string.
  • Use the u8 suffix to create a string literal with UTF-8 encoding.