Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single sentence per line for Runes #2825

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 20 additions & 7 deletions concepts/runes/about.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
# About

The `rune` type in Go is an alias for `int32`. Given this underlying `int32` type, the `rune` type holds a signed 32-bit integer value. However, unlike an `int32` type, the integer value stored in a `rune` type represents a single Unicode character.
The `rune` type in Go is an alias for `int32`.
Given this underlying `int32` type, the `rune` type holds a signed 32-bit integer value.
However, unlike an `int32` type, the integer value stored in a `rune` type represents a single Unicode character.

## Unicode and Unicode Code Points

Unicode is a superset of ASCII that represents characters by assigning a unique number to every character. This unique number is called a Unicode code point. Unicode aims to represent all the world's characters including various alphabets, numbers, symbols, and even emoji as Unicode code points.
Unicode is a superset of ASCII that represents characters by assigning a unique number to every character.
This unique number is called a Unicode code point.
Unicode aims to represent all the world's characters including various alphabets, numbers, symbols, and even emoji as Unicode code points.

In Go, the `rune` type represents a single Unicode code point.

Expand All @@ -21,7 +25,9 @@ The following table contains example Unicode characters along with their Unicode

## UTF-8

UTF-8 is a variable-width character encoding that is used to encode every Unicode code point as 1, 2, 3, or 4 bytes. Since a Unicode code point can be encoded as a maximum of 4 bytes, the `rune` type needs to be able to hold up to 4 bytes of data. That is why the `rune` type is an alias for `int32` as an `int32` type is capable of holding up to 4 bytes of data.
UTF-8 is a variable-width character encoding that is used to encode every Unicode code point as 1, 2, 3, or 4 bytes.
Since a Unicode code point can be encoded as a maximum of 4 bytes, the `rune` type needs to be able to hold up to 4 bytes of data.
That is why the `rune` type is an alias for `int32` as an `int32` type is capable of holding up to 4 bytes of data.

Go source code files are encoded using UTF-8.

Expand Down Expand Up @@ -76,9 +82,14 @@ fmt.Printf("myRune Unicode character: %c\n", myRune)

## Runes and Strings

Strings in Go are encoded using UTF-8 which means they contain Unicode characters. Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes. However, runes are stored as 1, 2, 3, or 4 bytes depending on the character. Due to this, strings are really just a sequence of bytes. In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
Due to this, strings are really just a sequence of bytes.
In Go, slices are used to represent sequences and these slices can be iterated over using `range`.

Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes. In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:
Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.
In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:

```go
myString := "❗hello"
Expand All @@ -94,7 +105,8 @@ for index, char := range myString {
// Index: 7 Character: o Code Point: U+006F
```

Since runes can be stored as 1, 2, 3, or 4 bytes, the length of a string may not always equal the number of characters in the string. Use the builtin `len` function to get the length of a string in bytes and the `utf8.RuneCountInString` function to get the number of runes in a string:
Since runes can be stored as 1, 2, 3, or 4 bytes, the length of a string may not always equal the number of characters in the string.
Use the builtin `len` function to get the length of a string in bytes and the `utf8.RuneCountInString` function to get the number of runes in a string:

```go
import "unicode/utf8"
Expand All @@ -118,7 +130,8 @@ fmt.Println(myString)
// Output: exercism
```

Similarly, a string can be type converted to a slice of runes. Remember, without formatting verbs, printing a rune yields its integer (decimal) value:
Similarly, a string can be type converted to a slice of runes.
Remember, without formatting verbs, printing a rune yields its integer (decimal) value:

```go
myString := "exercism"
Expand Down
24 changes: 18 additions & 6 deletions concepts/runes/introduction.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
# Introduction

The `rune` type in Go is an alias for `int32`. Given this underlying `int32` type, the `rune` type holds a signed 32-bit integer value. However, unlike an `int32` type, the integer value stored in a `rune` type represents a single Unicode character.
The `rune` type in Go is an alias for `int32`.
Given this underlying `int32` type, the `rune` type holds a signed 32-bit integer value.
However, unlike an `int32` type, the integer value stored in a `rune` type represents a single Unicode character.

## Unicode and Unicode Code Points

Unicode is a superset of ASCII that represents characters by assigning a unique number to every character. This unique number is called a Unicode code point. Unicode aims to represent all the world's characters including various alphabets, numbers, symbols, and even emoji as Unicode code points.
Unicode is a superset of ASCII that represents characters by assigning a unique number to every character.
This unique number is called a Unicode code point.
Unicode aims to represent all the world's characters including various alphabets, numbers, symbols, and even emoji as Unicode code points.

In Go, the `rune` type represents a single Unicode code point.

Expand All @@ -21,7 +25,9 @@ The following table contains example Unicode characters along with their Unicode

## UTF-8

UTF-8 is a variable-width character encoding that is used to encode every Unicode code point as 1, 2, 3, or 4 bytes. Since a Unicode code point can be encoded as a maximum of 4 bytes, the `rune` type needs to be able to hold up to 4 bytes of data. That is why the `rune` type is an alias for `int32` as an `int32` type is capable of holding up to 4 bytes of data.
UTF-8 is a variable-width character encoding that is used to encode every Unicode code point as 1, 2, 3, or 4 bytes.
Since a Unicode code point can be encoded as a maximum of 4 bytes, the `rune` type needs to be able to hold up to 4 bytes of data.
That is why the `rune` type is an alias for `int32` as an `int32` type is capable of holding up to 4 bytes of data.

Go source code files are encoded using UTF-8.

Expand Down Expand Up @@ -67,9 +73,14 @@ fmt.Printf("myRune Unicode code point: %U\n", myRune)

## Runes and Strings

Strings in Go are encoded using UTF-8 which means they contain Unicode characters. Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes. However, runes are stored as 1, 2, 3, or 4 bytes depending on the character. Due to this, strings are really just a sequence of bytes. In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
Due to this, strings are really just a sequence of bytes.
In Go, slices are used to represent sequences and these slices can be iterated over using `range`.

Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes. In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:
Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.
In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:

```go
myString := "❗hello"
Expand All @@ -85,7 +96,8 @@ for index, char := range myString {
// Index: 7 Character: o Code Point: U+006F
```

Since runes can be stored as 1, 2, 3, or 4 bytes, the length of a string may not always equal the number of characters in the string. Use the builtin `len` function to get the length of a string in bytes and the `utf8.RuneCountInString` function to get the number of runes in a string:
Since runes can be stored as 1, 2, 3, or 4 bytes, the length of a string may not always equal the number of characters in the string.
Use the builtin `len` function to get the length of a string in bytes and the `utf8.RuneCountInString` function to get the number of runes in a string:

```go
import "unicode/utf8"
Expand Down
Loading