Golang String: The Complete Guide

Strings in Go work differently than in many other programming languages: a string is a read-only sequence of bytes, not an array of characters.

Golang Strings

A Go string is a read-only slice of bytes. Create a string by enclosing its contents in double quotes (" "), not single quotes (' '). UTF-8 encodes each character in 1 byte (ASCII compatible) to 4 bytes. Because a character may need up to 4 bytes, Go represents a single character (code point) with the rune type, which is an alias for int32 (4 bytes).

A code unit is the fixed-size group of bits that an encoding uses as its basic building block.

So, UTF-8 uses 8-bit code units and UTF-16 uses 16-bit code units, which means UTF-8 needs a minimum of 8 bits, or 1 byte, to represent a character.

A code point is a numerical value that identifies a character. It is represented by one or more code units, depending on the encoding.

As UTF-8 is compatible with ASCII, all ASCII characters are represented in a single byte (8 bits); hence, UTF-8 needs only 1 code unit to represent them.
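To make the relationship between code points and code units concrete, here is a minimal sketch that uses utf8.RuneLen from the standard unicode/utf8 package to report how many 1-byte code units UTF-8 needs for a few sample characters. The helper name encodedLen and the sample characters are chosen only for illustration.

```go
// codeunits.go

package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen reports how many UTF-8 code units (bytes) are needed
// to encode the code point r. The name is hypothetical.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample each for the 1-, 2-, 3- and 4-byte cases.
	for _, r := range []rune{'A', 'Δ', '€', '😀'} {
		fmt.Printf("%c U+%04X needs %d byte(s)\n", r, r, encodedLen(r))
	}
}
```

The ASCII letter needs a single code unit, while the other three characters need 2, 3, and 4 code units respectively.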

For the standard Go compiler, the internal structure of any string type is declared like the following.

type _string struct {
	elements *byte // underlying bytes
	len      int   // number of bytes
}

From the declaration, we know that a string is a byte sequence wrapper. Therefore, we can view a string as an (element-immutable) byte slice.

In Go, a byte is a built-in alias of type uint8. Let’s see the example of strings.

// hello.go

package main

import (
	"fmt"
)

func main() {
	name := "Baby Yoda"
	fmt.Println(name)
}

Output

➜  hello go run hello.go
Baby Yoda
➜  hello

In the above code, we have defined the name as a string and then used the Println() function to print the String in the console.

Go source files are UTF-8, so string literals are UTF-8 encoded by default, although a string value itself may hold arbitrary bytes.

As UTF-8 supports the ASCII character set, you don’t need to worry about encoding in most cases.

Golang string length

To find the length of a string in Go, use the len() function. len() is a built-in Go function that returns the number of bytes in the string, not the number of characters. A space counts as one byte.

// hello.go

package main

import (
	"fmt"
)

func main() {
	name := "Baby Yoda"
	fmt.Println("String length of Baby Yoda is: ", len(name))
}

Output

➜  hello go run hello.go
String length of Baby Yoda is:  9
➜  hello

len() counts the space as well, so the output is 9, not 8.

All characters in the string "Baby Yoda" are ASCII characters, so each one occupies a single byte in memory and the byte count equals the character count.

len() is a general-purpose built-in that also works on arrays, slices, maps, and channels; it is not specific to strings.
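For text that is not pure ASCII, the byte count and the character count differ. The following sketch compares len() with utf8.RuneCountInString from the standard unicode/utf8 package. The helper name counts and the second sample string are assumptions made for this illustration.

```go
// count.go

package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the byte length and the number of code points in s.
// The name is hypothetical.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "\u00f1" is the letter ñ, which takes 2 bytes in UTF-8.
	for _, s := range []string{"Baby Yoda", "Se\u00f1or Yoda"} {
		b, r := counts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For "Baby Yoda" both numbers are 9. For the second string, len() reports 11 bytes while the rune count is 10.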

How to access characters(bytes) of String in Go

To access the individual bytes of a string in Go, use a for loop with an index. Because a string is a slice of bytes, each byte can be read with str[i].

// hello.go

package main

import (
	"fmt"
)

func accessBytes(str string) {
	for i := 0; i < len(str); i++ {
		fmt.Printf("%c \n", str[i])
	}
}

func main() {
	name := "Baby Yoda"
	accessBytes(name)
}

Output

➜  hello go run hello.go
B
a
b
y

Y
o
d
a
➜  hello

We used a for loop to walk through the string one byte at a time and printed each byte as a character with the %c verb.

In Go, the String is, in effect, a read-only slice of bytes.

For now, think of a slice as a simple array. In the loop above, str[i] is the byte (uint8) value stored at index i of the string.

Printed with %v or %d, str[i] shows the decimal value of that byte. To see the character instead, use the %c verb in the Printf call. This works for ASCII text; a character that needs more than one byte is spread across several indexes.

If you do not know how to format strings, check out the Formatted I/O example.

The %c format specifier prints the characters of the String.
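The difference between a byte value and a character shows up as soon as the string contains a non-ASCII character. This sketch prints every byte of a sample string in decimal, in hexadecimal, and with %c. The helper name byteValues and the sample string are assumptions for this illustration.

```go
// bytes.go

package main

import "fmt"

// byteValues returns the raw byte at every index of s.
// The name is hypothetical.
func byteValues(s string) []byte {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, s[i])
	}
	return out
}

func main() {
	// Δ (U+0394) is stored as the two bytes 0xCE 0x94.
	for i, b := range byteValues("Yoda Δ") {
		fmt.Printf("index %d: decimal %d, hex %x, as char %c\n", i, b, b, b)
	}
}
```

The five ASCII bytes print as the expected characters, but the two bytes of Δ are printed as two unrelated characters, because %c treats each byte as a separate code point.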

Constructing a string from a slice of bytes in Go

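To construct a string from a slice of bytes, pass the slice to the string() conversion. The bytes are interpreted as UTF-8 encoded text; the values below are the hexadecimal byte values of "Baby Yoda".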

// hello.go

package main

import (
	"fmt"
)

func main() {
	byteSlice := []byte{0x42, 0x61, 0x62, 0x79, 0x20, 0x59, 0x6F, 0x64, 0x61}
	str := string(byteSlice)
	fmt.Println(str)
}

Output

➜  hello go run hello.go
Baby Yoda
➜  hello
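One detail worth knowing is that the string() conversion copies the bytes, so later changes to the slice do not affect a string that was already built from it. The sketch below illustrates this with decimal byte values; the helper name copyThenMutate is hypothetical.

```go
// copy.go

package main

import "fmt"

// copyThenMutate builds a string from b, overwrites the first byte of b
// with c, and builds a second string. It returns both strings.
// The name is hypothetical.
func copyThenMutate(b []byte, c byte) (before, after string) {
	before = string(b) // copies the bytes of b
	if len(b) > 0 {
		b[0] = c // changes the slice only, not the string built above
	}
	after = string(b)
	return before, after
}

func main() {
	byteSlice := []byte{66, 97, 98, 121} // decimal values of "Baby"
	before, after := copyThenMutate(byteSlice, 'M')
	fmt.Println(before, after)
}
```

The first string still reads Baby after the slice is changed, while the second one reads Maby.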

How to modify Golang strings

Go strings cannot be modified in place because they are immutable: once a string is created, its bytes cannot be changed.

See the following code.

// hello.go

package main

import (
	"fmt"
)

func mutate(str string) string {
	str[0] = 'M'
	return str
}

func main() {
	data := "Netflix"
	fmt.Println(mutate(data))
}

Output

➜  hello go run hello.go
# command-line-arguments
./hello.go:8:9: cannot assign to str[0]
➜  hello

From the output, you can see that we cannot assign a character (byte) to an index of an existing string.

We can work around this string immutability.

First, we convert the string into a slice of runes.

Then we mutate that slice with the needed changes and convert it back into a new string.

See the following code.

// hello.go

package main

import (
	"fmt"
)

func mutate(str []rune) string {
	str[0] = 'M'
	return string(str)
}

func main() {
	data := "Netflix"
	fmt.Println(mutate([]rune(data)))
}

Output

➜  hello go run hello.go
Metflix
➜  hello

In the above program, the mutate function accepts a rune slice as an argument.

It then changes the first item of the slice to 'M', converts the rune slice back to a string, and returns it.

In main, data is converted to a slice of runes and passed to mutate(). This program outputs Metflix.
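A rune slice is the safer choice for this workaround because it keeps working when the first character takes more than one byte. Here is a small sketch along the same lines; the helper name replaceFirstRune and the sample input are assumptions for illustration.

```go
// replace.go

package main

import "fmt"

// replaceFirstRune returns a new string in which the first character
// of s is replaced by r. The original string is left untouched.
// The name is hypothetical.
func replaceFirstRune(s string, r rune) string {
	rs := []rune(s)
	if len(rs) == 0 {
		return s
	}
	rs[0] = r
	return string(rs)
}

func main() {
	// Δ occupies 2 bytes, yet it is a single element of the rune slice.
	fmt.Println(replaceFirstRune("Δelta", 'D'))
	fmt.Println(replaceFirstRune("Netflix", 'M'))
}
```

With a byte slice, overwriting index 0 of the first sample would replace only half of the 2-byte character and leave invalid UTF-8 behind.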

Now, what the heck are runes in Go? Let’s discuss it in detail.

Golang runes example

Strings are a slice of bytes, as seen in this article.

When we iterate over a string with a for loop and range, Go decodes the UTF-8 bytes and yields one rune per character, together with the byte index at which that character starts.
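As a minimal sketch of a range loop over a string, the program below prints the byte offset and the rune for each character. The helper name runeOffsets and the sample string are assumptions for this illustration.

```go
// range.go

package main

import "fmt"

// runeOffsets returns the byte index at which each character of s starts.
// The name is hypothetical.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// Δ takes 2 bytes, so the character after it starts at offset 3.
	for i, r := range "aΔb" {
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, r, r)
	}
	fmt.Println(runeOffsets("aΔb"))
}
```

The offsets are 0, 1, and 3: the index skips ahead by the number of bytes each character occupies.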

In Go, a single character enclosed in single quotes is a character literal, also known as a rune literal.

Hence, any valid Unicode character within single quotes (') is a rune, and its type is int32.

See the following code.

// hello.go

package main

import (
	"fmt"
)

func main() {
	data := 'Δ'
	fmt.Printf("%x \n", data)
	fmt.Printf("%v \n", data)
	fmt.Printf("%T", data)
}

Output

➜  hello go run hello.go
394
916
int32
➜  hello

The above program prints 394, 916, and int32: the hexadecimal value, the decimal value, and the data type of the Unicode code point of Δ (U+0394).

One point to remember is that we put the symbol in single quotes (' '). If we change them to double quotes, the value becomes a string, not a rune.
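The effect of the quote style can be checked with the %T verb. In this sketch, the helper name typeName is hypothetical.

```go
// quotes.go

package main

import "fmt"

// typeName returns the Go type of v as text, using the %T verb.
// The name is hypothetical.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	r := 'Δ' // single quotes: a rune (int32)
	s := "Δ" // double quotes: a string holding the UTF-8 bytes of Δ
	fmt.Println(typeName(r))
	fmt.Println(typeName(s), len(s))
}
```

The rune is reported as int32, while the double-quoted value is a string whose length is 2, since Δ is encoded in two bytes.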

String literals in Go

Instead of double quotes, we can use the backtick (`) character to write a string literal in Go. In double quotes ("), newlines, tabs, and other special characters must be written as escape sequences. Inside backticks they can be typed as-is, and backslashes have no special meaning.

// hello.go

package main

import "fmt"

func main() {
	data := `Maeve Wiley
  and Otis Milburn`
	fmt.Println(data)
}

Output

➜  hello go run hello.go
Maeve Wiley
  and Otis Milburn
➜  hello

So, we do not need "\n" to start a new line. A raw string literal keeps the line breaks and indentation exactly as they are typed between the backticks.
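The flip side is that escape sequences are not interpreted inside backticks. The sketch below writes the same characters in both styles and compares the results; the constant names interpreted and raw are chosen only for this illustration.

```go
// literals.go

package main

import "fmt"

const (
	// Double quotes: \t and \n are escape sequences (one byte each).
	interpreted = "a\tb\n"
	// Backticks: the backslashes and letters are kept literally.
	raw = `a\tb\n`
)

func main() {
	fmt.Println(len(interpreted), len(raw)) // 4 6
	fmt.Printf("%q\n", interpreted)
	fmt.Printf("%q\n", raw)
}
```

The interpreted literal holds 4 bytes (a, tab, b, newline), whereas the raw literal holds all 6 characters exactly as typed.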

Conclusion

String values can be used as constants.

Go supports two styles of string literal: the double-quote style (interpreted string literals) and the back-quote style (raw string literals).

The zero value of a string type is the empty string, which can be written as "" (double quotes) or `` (backticks).

Strings can be concatenated with + and += operators.
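As a brief sketch of concatenation, the program below uses + and +=, and also shows strings.Builder from the standard strings package, which is a common alternative when many pieces are joined in a loop. The helper name joinWords is hypothetical.

```go
// concat.go

package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words with single spaces using strings.Builder.
// The name is hypothetical.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

func main() {
	s := "Baby"
	s += " " + "Yoda" // each + or += produces a new string value
	fmt.Println(s)
	fmt.Println(joinWords([]string{"Baby", "Yoda"}))
}
```

Both lines print Baby Yoda. Since strings are immutable, every + creates a new string, which is why a builder can be the better fit for repeated appends.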

Strings in Go are immutable. The length of a string value also can't be modified separately. An addressable string value can only be overwritten as a whole by assigning another string value to it.

The Unicode standard specifies a unique value for each character in all kinds of human languages. But the basic unit in Unicode is not a character but a code point.

Most code points correspond to a single character, but a few characters consist of several code points. Code points are represented as rune values in Go. In Go, rune is a built-in alias of type int32.
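A character made of several code points can be demonstrated with a combining accent. In this sketch, the letter é is written once as a single code point and once as the letter e followed by U+0301 (combining acute accent). The helper name codePoints is hypothetical.

```go
// codepoints.go

package main

import "fmt"

// codePoints returns the code points of s as a slice of runes.
// The name is hypothetical.
func codePoints(s string) []rune {
	return []rune(s)
}

func main() {
	precomposed := "\u00e9" // é as one code point
	decomposed := "e\u0301" // e followed by a combining acute accent

	for _, s := range []string{precomposed, decomposed} {
		fmt.Printf("%q: %d code point(s), %d byte(s)\n", s, len(codePoints(s)), len(s))
	}
}
```

Both strings display as the same character, yet the first consists of one rune and the second of two.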

You can find more about strings and functions on the official doc: strings.

That’s it.
