Elixir Strings

Welcome to a tutorial on Elixir. Here you will learn about Strings in Elixir.

In Elixir, strings are written between double quotes and are encoded in UTF-8. This differs from C and C++, where the default strings are byte-oriented and typically ASCII encoded, so a single character can take at most 256 different values. UTF-8, by contrast, can encode all 1,112,064 valid Unicode code points. Because strings use UTF-8, we can use symbols such as ö, ł, etc. directly in a string.

 

Create a String

Let’s create a string variable, by simply assigning a string to a variable, as shown below.

str = "Hello world"

To print this to your console, just call the IO.puts function and pass it the variable str:

str = "Hello world"
IO.puts(str)

And the output is:

Hello world

 

Empty Strings

We can create an empty string by using the string literal "". Check the example below.

a = ""
if String.length(a) === 0 do
   IO.puts("a is an empty string")
end

The output is:

a is an empty string

 

String Interpolation

String interpolation is a simple way of constructing a new string value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. Elixir supports string interpolation: to use a variable in a string, wrap it in curly braces and prepend the curly braces with a '#' sign. This is shown below.

x = "Justin" 
y = "My Name is #{x}"
IO.puts(y)

The above code takes the value of x and substitutes it into the string assigned to y. The output will look like this:

My Name is Justin
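
Interpolation is not limited to variable names; any expression can go inside the braces. The following is a minimal sketch of that idea. The variable names and the age value are made up for illustration.

```elixir
# Any expression can appear inside #{}, not only a variable name.
name = "Justin"
age = 30 # hypothetical value, used only for this example

greeting = "#{name} is #{age + 1} next year"
shout = "Hello, #{String.upcase(name)}!"

IO.puts(greeting) # Justin is 31 next year
IO.puts(shout)    # Hello, JUSTIN!
```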

 

String Concatenation

You already learned about string concatenation in the previous tutorial, where the '<>' operator is used to concatenate strings in Elixir. Check out the example below on how to concatenate 2 strings:

x = "Justin"
y = "Drake"
z = x <> " " <> y
IO.puts(z)

The output is:

Justin Drake

 

String Length

We can use the String.length function to obtain the length of a string. Simply pass the string as a parameter and the function returns the number of characters (graphemes) in it. Check out the example below.

IO.puts(String.length("Hello"))

The output is:

5
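
Because strings are UTF-8 encoded, the number of characters and the number of bytes can differ. The sketch below compares String.length with the built-in byte_size function. The word used is just an illustrative choice.

```elixir
# "złoty" has 5 characters, but ł needs 2 bytes in UTF-8.
word = "złoty"

chars = String.length(word) # counts graphemes
bytes = byte_size(word)     # counts bytes

IO.puts(chars) # 5
IO.puts(bytes) # 6
```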

 

Reversing a String

We can reverse a string by simply passing it to the String.reverse function. This is shown below.

IO.puts(String.reverse("Elixir"))

The output is:

rixilE

 

String Comparison

To compare 2 strings, the == or the === operator can be used. This is shown below:

var_1 = "Hello world"
var_2 = "Hello Elixir"
if var_1 === var_2 do
   IO.puts("#{var_1} and #{var_2} are the same")
else
   IO.puts("#{var_1} and #{var_2} are not the same")
end

The output is: 

Hello world and Hello Elixir are not the same
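
String comparison is case sensitive, and for strings == and === give the same answer (they differ only when comparing integers with floats). A small sketch, assuming that a case-insensitive check via String.downcase is good enough for the text being compared:

```elixir
a = "Hello"
b = "hello"

exact = a == b   # false, comparison is case sensitive
strict = a === b # false, same result as == for strings

# One simple way to ignore case: normalise both sides first.
ignoring_case = String.downcase(a) == String.downcase(b) # true

# The two operators only differ for numbers.
loose_numbers = 1 == 1.0   # true
strict_numbers = 1 === 1.0 # false

IO.inspect({exact, strict, ignoring_case, loose_numbers, strict_numbers})
# {false, false, true, true, false}
```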

 

String Matching

You already know the =~ string match operator. To check whether a string matches a regex, either the string match operator or the String.match? function can be used. Check out the example below.

IO.puts(String.match?("foo", ~r/foo/))
IO.puts(String.match?("bar", ~r/foo/))

The output is: 

true 
false

Also, this can be achieved by using the =~ operator, as shown below.

IO.puts("foo" =~ ~r/foo/)

The output is: 

true
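
The =~ operator also accepts a plain string on the right-hand side, in which case it checks whether the left string contains it. The sketch below shows both forms, plus a regex with the i (case-insensitive) option. The sample strings are arbitrary.

```elixir
regex_match = "foobar" =~ ~r/^foo/  # true, regex on the right
substring_match = "foobar" =~ "oba" # true, plain string on the right
no_match = "foobar" =~ "baz"        # false

# The i option makes the regex ignore case.
case_insensitive = String.match?("FOO", ~r/foo/i) # true

IO.inspect({regex_match, substring_match, no_match, case_insensitive})
# {true, true, false, true}
```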

 

String Functions

Elixir supports a large number of functions related to strings. The table below shows a few of the most used functions from the String module and their purpose.

Sr.No.  Function and its Purpose
1       at(string, position): This returns the grapheme at the given position of the utf8 string. If position is out of range, it returns nil
2       capitalize(string): This converts the first character in the given string to uppercase and the remainder to lowercase
3       contains?(string, contents): This checks if a string contains any of the given contents
4       downcase(string): This converts all characters in the given string to lowercase
5       ends_with?(string, suffixes): This returns true if a string ends with any of the suffixes given
6       first(string): This returns the first grapheme from a utf8 string, nil if the string is empty
7       last(string): This returns the last grapheme from a utf8 string, nil if the string is empty
8       replace(subject, pattern, replacement, options \\ []): This returns a new string created by replacing occurrences of pattern in subject with replacement
9       slice(string, start, len): This returns a substring starting at the offset start and of length len
10      split(string): This divides a string into substrings at each Unicode whitespace occurrence with leading and trailing whitespace ignored. The groups of whitespace are treated as a single occurrence. However, divisions do not occur on non-breaking whitespace
11      upcase(string): This converts all characters in the given string to uppercase
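
As a quick illustration, the sketch below calls each of the functions listed above on small sample inputs. The inputs are made up for the example; the results in the comments are what these calls return.

```elixir
at_hit = String.at("elixir", 0)                       # "e"
at_miss = String.at("elixir", 10)                     # nil
capitalized = String.capitalize("hELLO")              # "Hello"
contains = String.contains?("elixir", ["ix", "zz"])   # true
down = String.downcase("HeLLo")                       # "hello"
ends = String.ends_with?("elixir", ["ir", "xx"])      # true
first = String.first("elixir")                        # "e"
last = String.last("elixir")                          # "r"
replaced = String.replace("a,b,c", ",", "-")          # "a-b-c"
sliced = String.slice("elixir", 1, 3)                 # "lix"
parts = String.split("  foo   bar  ")                 # ["foo", "bar"]
up = String.upcase("hello")                           # "HELLO"

IO.inspect([at_hit, at_miss, capitalized, contains, down, ends])
IO.inspect([first, last, replaced, sliced, parts, up])
```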

 

Binaries

A binary is simply a sequence of bytes. Binaries are defined using << >>, as shown below.

<< 0, 1, 2, 3 >>

Interestingly, those bytes can be organized in any way, even in a sequence that does not make them a valid string. Check out the example below.

<< 239, 191, 19 >>

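Strings are also binaries, and the string concatenation operator <> is in fact a binary concatenation operator. Check out the example below. IO.inspect is used here because IO.puts would write the raw bytes instead of a readable representation.

IO.inspect(<< 0, 1 >> <> << 2, 3 >>)

The output is:

<<0, 1, 2, 3>>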

Note that in a string, a character such as ł takes up 2 bytes, since it is UTF-8 encoded.
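
To see the bytes behind a string, one common trick is to concatenate a null byte, which makes IO.inspect show the value as a raw binary. A minimal sketch using the ł character:

```elixir
s = "ł"

# ł is code point U+0142, which UTF-8 encodes as the two bytes 197 and 130.
same_bytes = s == <<197, 130>> # true
size = byte_size(s)            # 2

# Appending a null byte makes the value unprintable as text,
# so IO.inspect falls back to the byte representation.
with_null = s <> <<0>>
IO.inspect(with_null) # <<197, 130, 0>>
```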

Since each number in a binary is meant to represent a byte, a value greater than 255 is truncated. To prevent this, we can use the size modifier to specify how many bits we want that number to take. Check out the example below.

IO.inspect(<< 256 >>)             # truncated to a single byte, prints <<0>>
IO.inspect(<< 256 :: size(16) >>) # takes 16 bits (2 bytes), prints <<1, 0>>

The output is:

<<0>>
<<1, 0>>

We can also use the utf8 modifier. If the number is a valid code point, it is encoded as UTF-8 and the corresponding character is produced in the output. This is shown below.

IO.puts(<< 256 :: utf8 >>)

The output is:

Ā

In addition, the is_binary function checks whether a given value is a binary. Take note that only values whose size is a multiple of 8 bits are binaries.
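
A short sketch of is_binary on a few values, chosen here only for illustration:

```elixir
string_check = is_binary("hello")           # true, strings are binaries
bytes_check = is_binary(<<0, 1, 2>>)        # true, 3 whole bytes
wide_check = is_binary(<<256::size(16)>>)   # true, 16 bits is 2 whole bytes
nibble_check = is_binary(<<1::size(4)>>)    # false, 4 bits is not a whole byte

IO.inspect({string_check, bytes_check, wide_check, nibble_check})
# {true, true, true, false}
```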

 

Bitstrings

If we define a binary by making use of the size modifier and passing it a value that is not a multiple of 8, we end up having a bitstring rather than a binary. Check out the example below.

bs = << 1 :: size(1) >>
IO.inspect(bs)
IO.puts(is_binary(bs))
IO.puts(is_bitstring(bs))

The output is:

<<1::size(1)>>
false
true
false
true

This means that the variable bs is not a binary, but rather a bitstring. In other words, a binary is a bitstring where the number of bits is divisible by 8. Pattern matching works on binaries and on bitstrings in the same manner.
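
The sketch below shows what such pattern matching can look like. It uses the binary, size, and utf8 modifiers inside << >>. The sample values and variable names are arbitrary.

```elixir
# Split a string into its first byte and the remaining binary.
<<first_byte, rest::binary>> = "hello"
IO.inspect(first_byte) # 104, the byte value of "h"
IO.inspect(rest)       # "ello"

# Split one byte into two 4-bit halves.
<<high::size(4), low::size(4)>> = <<0xAB>>
IO.inspect({high, low}) # {10, 11}

# Match a whole UTF-8 code point instead of a single byte.
<<code_point::utf8, tail::binary>> = "ło"
IO.inspect(code_point) # 322
IO.inspect(tail)       # "o"
```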