My notes while learning Elixir

2021-02-04· Updated 2021-03-03 · christian fei

These are my notes for learning Elixir.

I started by working through the "Getting Started" guide over on elixir-lang.org.

Something more up to date than the official doc is hard to find.

In the future I'll probably also take a look at the following resources:

I also have the physical book "Programming Elixir"; I might dust it off too.

Installation

As easy as pacman -S elixir on my Manjaro/Arch machine.

Currently I'm using Elixir 1.11.0 (compiled with Erlang/OTP 23)

First things learned

Documentation

Documentation is at the heart of Elixir.

After hopping into the iex shell, the first thing I did was check the documentation for the + operator:

h Kernel.+/2

The /2 denotes the arity of the function, i.e. the number of arguments it takes.

Atoms

Atoms are Elixir's way of defining constants/"unique values"/symbols.

An atom is "a constant whose value is its own name".

The booleans false and true, as well as nil, are atoms too.

As syntactic sugar, Elixir lets you omit the leading : for these three:

:true == true
:false == false
:nil == nil

All result in a true statement.

Strings

Strings are binaries: a "contiguous sequence of bytes".

String interpolation is available with "#{var}".

The IO and String modules are useful for dealing with strings.

Single quotes and double quotes are not the same!

Single quotes denote charlists, double quotes denote strings.

Anonymous functions

Delimited by fn and end.

Anonymous functions are values: they can be bound to variables and passed around like any other value.

Anonymous functions have an arity, and are also closures.

This means they can access variables that are in scope where the function is defined.

Variable assignments inside the function do not affect the outer scope.

To call an anonymous function, you put a dot . before the parentheses.

The dot avoids confusion between calling a named function and calling an anonymous one.

Named functions can only be defined inside a Module.
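A minimal sketch of the points above, written as a script rather than an iex session. The names add, offset and add_offset are made up for illustration:

```elixir
# Bind an anonymous function to a variable; note the dot when calling it.
add = fn a, b -> a + b end
sum = add.(1, 2)

# Closures can read variables from the scope in which they are defined.
offset = 10
add_offset = fn n -> n + offset end
shifted = add_offset.(5)

# Rebinding a variable inside the function does not leak to the outside.
x = 42

inner =
  (fn ->
     x = 0
     x
   end).()

IO.inspect({sum, shifted, inner, x, is_function(add, 2)})
# prints {3, 15, 0, 42, true}
```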

Lists

Square brackets to define a list of values of any type.

[42, true]

Concatenate and subtract lists with ++/2 and --/2.

An existing list is never modified (I guess it's safe to say "any variable" is never modified).

Immutability is at the heart of Elixir. It guarantees that when you pass data around, no one will be able to mutate it in memory.

hd/1 and tl/1 are used to get the head and tail of lists.

hd returns the first element, tl the "rest" of the list.

Lists are stored as linked lists in memory, so determining a list's length and concatenating lists are linear operations.

`i`

You can use i to get more information about a variable or module.

An interesting piece of code is the following:

The list [104, 101, 108, 108, 111] contains only printable ASCII code points, so IEx prints it as a charlist:

iex> [104, 101, 108, 108, 111]
'hello'

Information about the charlist:

iex> i [104, 101, 108, 108, 111]
Term
  'hello'
Data type
  List
Description
  This is a list of integers that is printed as a sequence of characters
  delimited by single quotes because all the integers in it represent printable
  ASCII characters. Conventionally, a list of Unicode code points is known as a
  charlist and a list of ASCII characters is a subset of it.
Raw representation
  [104, 101, 108, 108, 111]
Reference modules
  List
Implemented protocols
  Collectable, Enumerable, IEx.Info, Inspect, List.Chars, String.Chars

Tuples

Can contain any value, like lists.

Accessing an element by index is fast, as is getting the tuple's size, because tuples are stored contiguously in memory.

iex> a = {:ok, 42}
iex> elem(a, 0)
:ok
iex> tuple_size(a)
2
iex> put_elem(a, 1, true)
{:ok, true}
iex> a
{:ok, 42}

Like lists, tuples are immutable.

Tuples are often used to return extra information from a function.

As a convention, functions named size run in constant time (e.g. tuple_size/1), because the value is pre-calculated.

Functions named length (e.g. length/1 for lists) are linear, because the structure has to be traversed.

Basic operators

and, or and not expect a boolean.

&&, || and ! accept values of any type.

String concatenation is done via <>.

Lists are manipulated with ++ and --.

Classic comparison operators like ==, !=, ===, <=, >=, < and > are available.

There is also a sorting order when comparing different data types:

number < atom < reference < function < port < pid < tuple < map < list < bitstring

Pattern matching

= is the match operator.

It can be used to bind a value to a variable, but it is really a match between the left-hand and the right-hand side:

iex> a = 1
iex> 1 = a
iex> 2 = a
** (MatchError) no match of right hand side value: 1

For variables, the more precise term is "binding": the match binds the value to the variable.

The match operator can be used for destructuring complex data structures, like tuples, lists etc.

At the same time, when the structure of a destructuring doesn't match (e.g. different size, different type), a MatchError is raised.

Notably, you can destructure the head and tail of lists with |:

iex> [head | tail] = [1,2,3]

The [ | ] format is also used for prepending items to list:

iex> [1 | [2,3,4]]
[1, 2, 3, 4]

^ Pin operator

Without the pin operator, a variable on the left-hand side is simply re-bound:

iex> a = 1
iex> a = 2

You can pattern match against a variable's existing value with the pin operator:

iex> a = 1
iex> ^a = 2
** (MatchError) no match of right hand side value: 2

case

Used for pattern matching against many patterns/values.

The syntax is:

case value do
  pattern1 -> ...
  pattern2 -> ...
end

Use the pin operator if you want to pattern match against the value of an existing variable.

You can use guards (when ...) to define extra conditions.

Anonymous functions can also have multiple clauses and guards, with a syntax that looks a bit unusual at first.
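A small sketch that puts the pin operator, guards and a multi-clause anonymous function together. The names expected, classify and sign are invented for this example:

```elixir
expected = 1

classify = fn value ->
  case value do
    # Pin: match against the current value of `expected` instead of rebinding it.
    ^expected -> :same_as_expected
    # Guard: an extra condition on top of the pattern.
    n when is_integer(n) and n < 0 -> :negative
    # Destructuring a tuple inside the pattern.
    {:ok, result} -> {:got, result}
    # Catch-all clause.
    _ -> :something_else
  end
end

# An anonymous function with several clauses and a guard.
sign = fn
  n when n > 0 -> :positive
  0 -> :zero
  _ -> :negative
end

IO.inspect({classify.(1), classify.(-5), classify.({:ok, "x"}), classify.("str")})
# prints {:same_as_expected, :negative, {:got, "x"}, :something_else}
IO.inspect({sign.(3), sign.(0), sign.(-3)})
# prints {:positive, :zero, :negative}
```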

cond

Used for checking several conditions: the body of the first condition that evaluates to something other than nil or false is executed.

cond do
  condition1 -> ...
  condition2 -> ...
end

Similar to else if in other languages.

if and unless

These macros can be used to check only one condition.

do/end blocks

do/end blocks can often also be written using Elixir's keyword list syntax:

if 42, do: true
if 42, do: true, else: false

Keyword lists are useful to remove verbosity when writing short blocks of code.

Binaries, strings, and charlists

Elixir uses the UTF-8 character encoding, allowing all Unicode code points.

To reveal the code point of any character, put a ? in front of it:

iex> ?a
97
iex> ??
63

Bitstrings

It's a contiguous sequence of bits in memory.

The bitstring constructor is <<>>, e.g. <<0>> builds the null byte.

You can concatenate the null byte to a string to see its binary representation:

iex> "test" <> <<0>>
<<116, 101, 115, 116, 0>>

By default each number in a bitstring is stored in 1 byte (8 bits), but you can specify the number of bits with ::n or ::size(n).
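To make the size modifier more concrete, here is a short sketch along the lines of the Getting Started guide. The variable name four_bits is my own:

```elixir
# The number 3 stored in 4 bits instead of the default 8.
four_bits = <<3::4>>

IO.inspect(bit_size(four_bits))
# prints 4
IO.inspect(is_binary(four_bits))
# prints false (4 bits is not a whole number of bytes)
IO.inspect(is_bitstring(four_bits))
# prints true

# Four single bits 0011 are the same bitstring as the number 3 in 4 bits.
IO.inspect(<<0::1, 0::1, 1::1, 1::1>> == four_bits)
# prints true

# Numbers that do not fit into the given size are truncated.
IO.inspect(<<257>> == <<1>>)
# prints true

# With 16 bits, 257 is stored as two bytes.
IO.inspect(<<257::16>>)
# prints <<1, 1>>
```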

Binaries

A binary is a bitstring where the number of bits is divisible by 8.

Pattern matching on binaries can be done with the bitstring constructor:

iex> <<0, 1, x>> = <<0, 1, 2>>
iex> x
2

Since Strings are binaries, you can pattern match on strings using the binary constructor:

iex> <<head, rest::binary>> = "test"
iex> head
116
iex> ?t
116

This matches head and rest byte by byte: head contains only the first byte of the string, which here is the whole character "t".

If you had a character that uses more than 1 byte, you need to match it with the utf8 modifier:

iex> <<head::utf8, rest::binary>> = "über"

Charlists

A charlist is a list of integers in which every integer is a valid code point.

Charlists can often be found in older Erlang libraries.

To concatenate charlists (since they are lists, not binaries) you need to use the ++ operator instead of <>.

Keyword lists

One of the two types of associative data-structures in Elixir (together with maps).

In simple terms, a keyword list is a list of two-element tuples, where the first item of each tuple is an atom representing the key.

iex> kwlist = [{:a, 1}, {:b, 2}]

A shorter syntax is [a: 1, b: 2].

Since keyword lists are, well, lists, you can use all operators that apply to lists.

Three important characteristics of keyword lists:

keys are atoms
keys are ordered as specified
keys can occur more than once

Keyword lists can make up pretty DSL's, like in Ecto:

query = from w in Weather,
      where: w.prcp > 0,
      where: w.temp < 20,
     select: w

The examples above about the shorthand of writing if/else conditions are internally converted to keyword lists:

iex> if 42, do: true, else: false

Can be rewritten as

iex> if(42, [do: true, else: false])
# or
iex> if(42, [{:do, true}, {:else, false}])

When a keyword list is the last argument of a function call, the square brackets are optional.

Pattern matching on keyword lists is generally not advised, since the match only succeeds if the number of items and their order are the same.

For manipulating keyword lists, the Keyword module is used.

In Elixir, keyword lists are generally used for passing optional values.

Maps come in handy when you need to have the guarantee that one key is associated to one value.
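As a sketch of the "optional values" use case: the module Welcome and its :greeting and :punctuation options are hypothetical, only the Keyword functions are real.

```elixir
defmodule Welcome do
  # `opts` is a keyword list of optional settings (hypothetical option names).
  def greet(name, opts \\ []) do
    greeting = Keyword.get(opts, :greeting, "Hello")
    punctuation = Keyword.get(opts, :punctuation, "!")
    greeting <> ", " <> name <> punctuation
  end
end

IO.inspect(Welcome.greet("Elixir"))
# prints "Hello, Elixir!"

# The square brackets can be omitted because the keyword list is the last argument.
IO.inspect(Welcome.greet("Elixir", greeting: "Hi", punctuation: "?"))
# prints "Hi, Elixir?"

# Keys may occur more than once; lookups return the first match.
dups = [a: 1, b: 2, a: 3]
IO.inspect(Keyword.get(dups, :a))
# prints 1
IO.inspect(Keyword.get_values(dups, :a))
# prints [1, 3]
```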

Maps

Maps allow any value as a key, and don't follow any ordering:

iex> %{:a => 1, 2 => :b}
%{2 => :b, :a => 1}

Pattern matching is useful when dealing with maps, since a subset of the map is used (e.g. you don't need to specify all potential keys of a map).

iex> %{:a => a} = %{:a => 1, 2 => :b}
iex> a
1

An empty map matches all maps!

To access map keys:

iex> map = %{:a => 1}
iex> map[:a]
1

The Map module is used to manipulate maps:


iex> Map.get(%{:a => 1}, :a)
1
iex> Map.put(%{:a => 1}, :b, 2)
%{a: 1, b: 2}
iex> Map.to_list(%{:a => 1, :b => 2})
[a: 1, b: 2]

To update a key (it has to exist) in a map, you can use the | syntax:

iex> map = %{:a => 1}
iex> %{map | :a => 2}
%{a: 2}

For convenience, if all keys of a Map are atoms, you can use a shorter syntax (like with keyword lists):

iex> map = %{a: 1, b: 2}

To access a key, you can use the . syntax:

iex> map = %{a: 1}
iex> map.a
1

Macros for updating nested data-structures (like maps in keyword lists, etc):

put_in/2
update_in/2
get_and_update_in/2
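A short sketch of these macros on a keyword list of maps, similar to the example in the Getting Started guide. The users data is made up:

```elixir
users = [
  john: %{name: "John", age: 27},
  mary: %{name: "Mary", age: 29}
]

# put_in/2 sets a nested value and returns the updated structure.
users = put_in(users[:john].age, 31)

# update_in/2 applies a function to the nested value.
users = update_in(users[:mary].age, &(&1 + 1))

# get_and_update_in/2 returns {value_to_get, updated_structure}.
{old_name, users} =
  get_and_update_in(users[:john].name, fn name -> {name, String.upcase(name)} end)

IO.inspect({users[:john].age, users[:mary].age, old_name, users[:john].name})
# prints {31, 30, "John", "JOHN"}
```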

Modules

Modules are used to group functions.

Using the defmodule macro + the def macro to define functions.

defp is used for private functions.

Compilation of a module is done with elixirc. It creates a file named Elixir.<Module>.beam which contains the bytecode for the VM.

The ebin directory usually contains the compiled bytecode.

Files meant for scripting (i.e. no bytecode is persisted) use the .exs extension.

These files can be executed directly with the elixir CLI.

The code is still compiled, but the bytecode is only loaded into memory and not written to disk.

Functions

Defined through def and defp inside a module.

Support the keyword list syntax, e.g.

def add(x, y) do
  x + y
end

# or

def add(x, y), do: x + y

Functions with the same name but a different arity are different functions; multiple clauses of the same function are distinguished by their patterns and guards.

Function capturing

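Function capturing with & turns a named function into a value that can be bound to a variable and passed around. The same operator is also a shorthand for creating anonymous functions, with &1, &2, ... standing for the arguments.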

E.g.

is_zero = &Math.zero?/1

subtract = &(&1 - 42)

Math.zero? can be captured because Elixir supports the format &Module.function/arity for functions from a module.
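For completeness, a runnable sketch of the two captures. The Math module here is an assumed minimal definition, not something that ships with Elixir:

```elixir
defmodule Math do
  # Two clauses: the literal 0 matches first, any other number falls through.
  def zero?(0), do: true
  def zero?(n) when is_number(n), do: false
end

# Capture a named function as a value.
is_zero = &Math.zero?/1

# Capture shorthand for fn x -> x - 42 end.
subtract = &(&1 - 42)

IO.inspect({is_zero.(0), is_zero.(7), subtract.(50)})
# prints {true, false, 8}

# Captured functions can be passed around like any other value.
IO.inspect(Enum.filter([0, 1, 0, 2], is_zero))
# prints [0, 0]
```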

Default arguments

Default arguments are defined with \\ inside a def clause.

A function with default arguments can have multiple clauses, but then a function head (without body) is needed for declaring the defaults.

E.g., as in the Getting Started guide:

defmodule Concat do
  # A function head declaring defaults
  def join(a, b \\ nil, sep \\ " ")

  def join(a, b, _sep) when is_nil(b) do
    a
  end

  def join(a, b, sep) do
    a <> sep <> b
  end
end

Loops and recursion

Loops in functional languages are written with recursion in mind.

This is due to the immutability of the language.

You can define a function (with appropriate guard or base condition) to recursively determine the result.
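A sketch of what such a recursive function can look like. The module name ListMath is invented; the second variant carries an accumulator so that the recursive call is the last thing the function does:

```elixir
defmodule ListMath do
  # Base condition: the sum of an empty list is 0.
  def sum([]), do: 0
  # Recursive case: add the head to the sum of the tail.
  def sum([head | tail]), do: head + sum(tail)

  # Same result with an accumulator; the function head declares the default.
  def sum_acc(list, acc \\ 0)
  def sum_acc([], acc), do: acc
  def sum_acc([head | tail], acc), do: sum_acc(tail, head + acc)
end

IO.inspect({ListMath.sum([1, 2, 3]), ListMath.sum_acc([1, 2, 3]), ListMath.sum([])})
# prints {6, 6, 0}
```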

Enum.reduce and Enum.map often come in handy instead of writing your own recursive function to do essentially the same.

Elixir supports tail call optimization: when the recursive call is the final action of a function, no new stack frame is added to the call stack.

E.g. to reduce a list or map over all the values (using function capture syntax):

iex> Enum.reduce([0, 1, 2], 0, &+/2)
3
iex> Enum.map([0, 1, 2], &(&1 * 2))
[0, 2, 4]

Enumerables

The Enum module is used to manipulate structures that implement the Enumerable protocol.

For example lists and maps are enumerables, as are many other structures in Elixir.

This is where the keyword "polymorphic" comes in: the Enum module can handle different data types.

Another important aspect is that all the functions in the Enum module are eager: each call processes the whole collection and returns the complete result.

E.g. when you chain operations over a list of 100000 numbers, every step builds a full intermediate list before the next step starts, instead of passing the numbers along one by one (-> increased memory footprint).

Streams

The Stream module enables lazy evaluation, meaning that elements are computed one by one, only when they are needed, in a composable fashion.

In general, use the Enum module until laziness is required.

Composable because you can combine multiple stream operations:

iex> odd? = &(rem(&1, 2) != 0)
iex> 1..100_000 |> Stream.map(&(&1 * 3)) |> Stream.filter(odd?)
#Stream<[enum: 1..100000, funs: [...]]>

In the case above a Stream is returned, meaning that no computation has been done yet.

The work only happens once you pass the stream to the Enum module for the final evaluation.

iex> 1..100_000 |> Stream.map(&(&1 * 3)) |> Stream.filter(odd?) |> Enum.sum
7500000000

File.stream! also returns a stream. It belongs to the Stream.resource/3 family, which guarantees that the resource is opened right before enumeration and closed afterwards.

Processes

Processes are at the heart of Elixir/Erlang. They provide the concurrency model, run in isolation, and communicate with one another through message passing. Fault tolerance and distribution are built on top of processes.

They are extremely lightweight (RAM and CPU), and are not to be confused with operating system processes or threads.

`spawn`ing new processes

spawn is used to execute a function in another process:

iex> spawn fn -> 1 + 2 end
#PID<0.42.0>

It returns a PID, which you can inspect with Process.alive?/1.

self/0 can be used to retrieve the PID of the current process.

`send`ing and `receive`ing messages

Sending and receiving messages works through a process mailbox.

The send/2 function is used to send messages to a specified PID. It is non-blocking, and you can send messages to "yourself" (to the same process through self/0).

The receive/1 macro goes through the mailbox (of the process in which it is called), looking for a message that matches one of the given patterns.

iex> send self(), {:test, 42}
{:test, 42}
iex> receive do
...>   {:test, msg} -> msg
...>   {:something, _else} -> "no match"
...> end
42

Receiving messages is blocking, thus you can specify an optional timeout using the after keyword:

iex> receive do
...>   {:test, msg}  -> msg
...> after
...>   3_000 -> "nada"
...> end
"nada"

Linking processes and reasoning

When using spawn/1 the child process is not linked to the parent process. This means that errors in the child process are not propagated: the parent keeps running as if nothing happened.

In the majority of cases you want processes to be linked; this can be done via spawn_link/1.

Linking can also be done manually via Process.link/1.

Why do we link processes?

This is due to the fault-tolerant nature of Elixir. Isolated processes never affect the state of other processes.

With links, however, you can build relationships between processes and define strategies to recover from a failure.

This is often done via supervisor processes, which are responsible for restarting failed processes and keeping the system state intact.

spawn/1 and spawn_link/1 are primitives, mostly you'll use abstractions that make use of them.
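To see a link in action without crashing the current process, you can trap exits. This is only a sketch of the primitive; in real code a supervisor does this job. The exit reason :boom is arbitrary:

```elixir
# Turn exit signals from linked processes into ordinary messages.
Process.flag(:trap_exit, true)

# The linked child exits right away with the reason :boom.
pid = spawn_link(fn -> exit(:boom) end)

result =
  receive do
    # Because we trap exits, the signal arrives as an {:EXIT, pid, reason} message.
    {:EXIT, ^pid, reason} -> {:child_exited, reason}
  after
    1_000 -> :timeout
  end

IO.inspect(result)
# prints {:child_exited, :boom}
```

Without the trap_exit flag, the exit of the child would take the linked parent down with it.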

Tasks, Agents and state

Tasks provide better error reports and process introspection.

They build upon the spawn/1 and spawn_link/1 functions.

The Task module provides the start/1 and start_link/1 functions, following the same concept as discussed above.

They return a tuple {:ok, pid} (not just the PID), which enables their use in supervision trees.

Task.async/1 and Task.await/1 are also available.

If you need a process to maintain state, you can build one on top of Task: define a start_link/0 function that relies on Task.start_link/1, then loop indefinitely in a function that continuously receives messages and responds to them (returning the state or updating it).
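The loop described above could look roughly like this. It loosely follows the key-value example from the Getting Started guide; the module name KV and the message shapes ({:put, ...}, {:get, ...}, {:kv_value, ...}) are illustrative choices, not a fixed API:

```elixir
defmodule KV do
  # Start a linked process whose state is an initially empty map.
  def start_link do
    Task.start_link(fn -> loop(%{}) end)
  end

  # The state lives in the argument of the recursive call.
  defp loop(map) do
    receive do
      {:get, key, caller} ->
        send(caller, {:kv_value, Map.get(map, key)})
        loop(map)

      {:put, key, value} ->
        loop(Map.put(map, key, value))
    end
  end
end

{:ok, pid} = KV.start_link()
send(pid, {:put, :hello, :world})
send(pid, {:get, :hello, self()})

value =
  receive do
    {:kv_value, v} -> v
  after
    1_000 -> :timeout
  end

IO.inspect(value)
# prints :world
```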

The same can be achieved using Agent, which can be seen as a wrapper/abstraction around process state.

E.g.:

iex> {:ok, pid} = Agent.start_link(fn -> %{} end)
{:ok, #PID<0.42.0>}
iex> Agent.update(pid, fn map -> Map.put(map, :hello, :world) end)
:ok
iex> Agent.get(pid, fn map -> Map.get(map, :hello) end)
:world

IO and the file system

Behind the scenes the IO module uses processes.

Thus some functions of the File module hand you a process: File.open/2, for example, returns a tuple {:ok, pid}.

The IO module can be used to interact with stdin/stdout/stderr, and generally with IO devices.

The Path module is a utility for constructing file paths independently of the underlying operating system.

The nice thing about interacting with devices using processes is that you can perform distributed message passing and do file operations across nodes.

alias, require, import and use

alias

alias, require and import are "directives" because they have lexical scope.

use is a macro to inject code.

With alias you can define aliases for a module, to avoid repetitiveness in your code.

Using the keyword list syntax, e.g. alias Math.List, as: List

Since alias is lexically scoped, you can also define aliases inside functions.

require and import

To use macros from a module, you need to require it.

import works similarly, with the difference that it also includes functions alongside macros.

import also lets you use functions from other modules without using the fully qualified name of the module.

You can also import specific functions by specifying their arity: import List, only: [duplicate: 2]

import behind the scenes calls require too, thus macros are available.

use

use defines an extension point for another module.

Meaning that the used module can inject code into the current module.

ExUnit.Case, GenServer and Supervisor are examples you'll likely encounter.

Behind the scenes the __using__/1 macro of the used module is invoked, to inject code in the same context where use is called.
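A minimal sketch of how use and __using__/1 fit together. The modules Greeting, English and Italian and the :greeting option are all made up for this example:

```elixir
defmodule Greeting do
  # Invoked at compile time by `use Greeting, opts`.
  defmacro __using__(opts) do
    greeting = Keyword.get(opts, :greeting, "Hello")

    # The quoted code is injected into the module that calls `use`.
    quote do
      def greet(name), do: unquote(greeting) <> ", " <> name
    end
  end
end

defmodule English do
  use Greeting
end

defmodule Italian do
  use Greeting, greeting: "Ciao"
end

IO.inspect({English.greet("world"), Italian.greet("world")})
# prints {"Hello, world", "Ciao, world"}
```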

Module attributes

This is an inherited feature from Erlang.

Module attributes are used to annotate a module, to pass information to a module or the virtual machine.

They are used as constants, and are defined during compilation.

For example there is the @vsn module attribute that is used for code reloading. If none is specified an MD5 checksum of the module is used to determine changes.

Most used are @moduledoc and @doc to provide documentation, which can later be recalled with the h helper functions in the IEx shell.

Module attributes can be used as classic "constants", as you can set a value once in a readable fashion at the top of the module definition. You can also call functions to assign a value to a module attribute, except functions of the same module, since they still need to be compiled.

Module attributes can also be used as storage, for example during tests. This is where "attribute accumulation" comes in handy, for example for tagging a test as external and such.

This is done via Module.register_attribute __MODULE__, :param, accumulate: true
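A sketch of both uses, a plain constant and an accumulating attribute. The module Settings and the attribute names are hypothetical:

```elixir
defmodule Settings do
  # A module attribute used as a constant; its value is fixed at compile time.
  @default_timeout 5_000

  # With accumulate: true, every assignment adds to a list instead of overwriting.
  Module.register_attribute(__MODULE__, :tag, accumulate: true)
  @tag :external
  @tag :slow

  def default_timeout, do: @default_timeout

  # Reading the attribute returns the accumulated list, most recent first.
  def tags, do: @tag
end

IO.inspect(Settings.default_timeout())
# prints 5000
IO.inspect(Settings.tags())
# prints [:slow, :external]
```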

Structs

Structs are built on top of maps, and are used to provide compile-time guarantees. You can also define default values for properties.

The macro defstruct is used inside a defmodule definition, and structs take the name of the module they are defined in:

defmodule Foo do
  defstruct bar: 42, baz: 100
end

This can now be used to create a Foo struct:

iex> %Foo{}
%Foo{bar: 42, baz: 100}
iex> %Foo{baz: 42}
%Foo{bar: 42, baz: 42}

Only fields defined in the structs are permitted.

You can also update structs with the | syntax similar to maps.

A special property __struct__ is set in the map, that defines the name of the struct.

Structs are essentially maps with a fixed set of fields. But they are "bare" maps: none of the protocols implemented for maps are available for structs (e.g. you cannot enumerate a struct, or use the struct[:key] access syntax).

Use the Map module for access and inspection.
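A sketch that ties the struct notes together. Foo is the struct from above; FooDemo is a hypothetical wrapper module, used here so the struct is already compiled when the %Foo{} syntax is expanded in a script:

```elixir
defmodule Foo do
  defstruct bar: 42, baz: 100
end

defmodule FooDemo do
  def run do
    foo = %Foo{}

    # Update syntax, same as for maps (the key has to exist).
    updated = %{foo | baz: 1}

    # Pattern matching on a struct works like on a map.
    %Foo{bar: bar} = updated

    # Map.from_struct/1 drops the __struct__ key and returns a plain map.
    {updated, bar, is_map(foo), foo.__struct__, Map.from_struct(foo)}
  end
end

IO.inspect(FooDemo.run())
# prints {%Foo{bar: 42, baz: 1}, 42, true, Foo, %{bar: 42, baz: 100}}
```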

Structs are the first introduction to the world of data polymorphism.

Protocols

Protocols are used to define a common behaviour between data types.

Protocols make a module extensible: implementations for new data types can be added from outside the module that defines the protocol.

This is where defprotocol and defimpl come in handy.

You can use defprotocol in your code, and allow other developers that use your library to implement the protocol for their own modules through defimpl.

The syntax is as follows:

defprotocol YourModule do
  @spec fun(t) :: String.t()
  def fun(value)
end

defimpl YourModule, for: Integer do
  def fun(_value), do: "integer"
end

The dispatch functionality is based on the first input of the function.

For example if you call YourModule.fun(42) the implementation for Integer is used.

You can also implement a protocol for "any" data type by using Any.

Using the @derive attribute, a struct can reuse the implementation defined for Any.

There is also @fallback_to_any true, set in the protocol definition, which makes every type without its own implementation fall back to the Any implementation.
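A small sketch of the fallback variant (it does not cover @derive). The protocol name Describable and its describe/1 function are invented for illustration:

```elixir
defprotocol Describable do
  # Types without their own implementation use the Any implementation.
  @fallback_to_any true
  def describe(value)
end

defimpl Describable, for: Integer do
  def describe(_value), do: "integer"
end

defimpl Describable, for: Any do
  def describe(_value), do: "something else"
end

# Dispatch is based on the type of the first argument.
IO.inspect(Describable.describe(42))
# prints "integer"
IO.inspect(Describable.describe("text"))
# prints "something else"
```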

Comprehensions

If you need to iterate over an Enumerable, comprehensions come in with the for keyword.

You can filter out values, manipulate them, etc.

A comprehension consists of generators, filters and collectables.

It looks something like this

for x <- generator, filter(x), do: collectable(x)

You can also use pattern matching to match values in the generator part.

A complete example:

iex> multiple_of_3? = fn(n) -> rem(n, 3) == 0 end
iex> for n <- 0..5, multiple_of_3?.(n), do: n * n
[0, 9]

There are other options like :into, :reduce etc available.
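A few more comprehension shapes as a sketch, with made-up input data: pattern matching in the generator, multiple generators, and the :into and :reduce options.

```elixir
values = [good: 1, good: 2, bad: 3, good: 4]

# Pattern matching in the generator skips entries that do not match.
squares = for {:good, n} <- values, do: n * n

# :into collects the results into something other than a list, here a map.
doubled = for {key, val} <- %{"a" => 1, "b" => 2}, into: %{}, do: {key, val * 2}

# Several generators produce all combinations.
pairs = for x <- [1, 2], y <- [:a, :b], do: {x, y}

# :reduce threads an accumulator through the iteration.
total =
  for n <- 1..4, reduce: 0 do
    acc -> acc + n
  end

IO.inspect(squares)
# prints [1, 4, 16]
IO.inspect(doubled)
# prints %{"a" => 2, "b" => 4}
IO.inspect(pairs)
# prints [{1, :a}, {1, :b}, {2, :a}, {2, :b}]
IO.inspect(total)
# prints 10
```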

Sigils

Sigils are used for working with textual representations. They start with ~ and a letter that identifies the sigil, followed by a delimiter (8 are supported: /, |, ", ', (, [, {, <).

E.g. you can define a word list, or a string with double quotes inside:

iex(1)> ~w/hello world/
["hello", "world"]
iex(2)> ~w(hello world)
["hello", "world"]
iex(3)> ~s(foo bar "test")
"foo bar \"test\""

The ~w sigil is used for holding lists of words. You can also add the modifiers c, s and a to specify the type of the elements (charlists, strings or atoms):

iex> ~w(foo bar bat)a
[:foo, :bar, :bat]

There are also Calendar sigils like Date, Time, DateTime etc.

iex> d = ~D[2019-10-31]
~D[2019-10-31]
iex> d.day
31

iex> t = ~T[23:00:07.0]
~T[23:00:07.0]
iex> t.second
7

iex> dt = ~U[2019-10-31 19:59:03Z]
~U[2019-10-31 19:59:03Z]
iex> %DateTime{minute: minute, time_zone: time_zone} = dt
~U[2019-10-31 19:59:03Z]
iex> minute
59
iex> time_zone
"Etc/UTC"

Using sigil_* you can define your custom Sigils.
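A sketch of a custom sigil, modelled on the ~i example from the Getting Started guide. MySigils and SigilDemo are hypothetical module names; the sigil is used inside a second module so it can be imported after it has been compiled:

```elixir
defmodule MySigils do
  # ~i(13) calls sigil_i("13", []); modifiers arrive as a charlist.
  def sigil_i(string, []), do: String.to_integer(string)
  # The n modifier negates the number.
  def sigil_i(string, [?n]), do: -String.to_integer(string)
end

defmodule SigilDemo do
  import MySigils

  def run, do: {~i(13), ~i(42)n}
end

IO.inspect(SigilDemo.run())
# prints {13, -42}
```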

try, catch and rescue

There are 3 error mechanisms in Elixir: errors, throws and exits.

Errors

Errors are raised (using raise) when exceptional things happen in your code. Like ArithmeticError, RuntimeError etc.

You rarely need to try / rescue from an error.

Most modules provide functions that return tuples you can match against (e.g. File.read).

If you expect something to be there (e.g. a file), you can use the function variant with ! at the end (e.g. File.read!/1)

In Elixir you try to avoid using try / rescue because we don't use errors for control flow.

In Elixir errors are taken literally and are reserved for exceptional situations. If you need such a control flow, throws are used.

Throws

Throws are rarely used. A use case is interfacing with a library that offers no other way to retrieve a value than throw and catch.
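For illustration only, here is throw/catch used to leave an iteration early, as in the Getting Started guide. In practice Enum.find/2 does the same job without a throw:

```elixir
# Stop at the first multiple of 13 by throwing it out of the iteration.
found =
  try do
    Enum.each(-50..50, fn x ->
      if rem(x, 13) == 0, do: throw(x)
    end)

    :none
  catch
    x -> x
  end

IO.inspect(found)
# prints -39

# The idiomatic alternative, no throw needed.
IO.inspect(Enum.find(-50..50, &(rem(&1, 13) == 0)))
# prints -39
```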

Exits

A process can explicitly die using exit/1, providing an exit reason.

The reason why rescuing from exceptions in Elixir is so uncommon is due to the fault-tolerant nature of Erlang.

Exit signals are usually handled by supervisors, which manage the underlying processes following a defined strategy.

"fail fast" is the main idea here.

After

after can be used to always run a part of code in a try/rescue clause. (Similar to finally).

There is also a shorthand available when defining functions, where you can omit the try part and write the function body directly.

def fun do
  ..
after
  ..
end

Else

An else block is used to run code when a try block does not raise an error.
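A sketch that combines rescue, else and after in one place. The module SafeDiv and the shape of its return values are my own choices for the example:

```elixir
defmodule SafeDiv do
  def divide(a, b) do
    try do
      a / b
    rescue
      # Dividing by zero raises an ArithmeticError.
      ArithmeticError -> {:error, :division_by_zero}
    else
      # Runs only when the try block did not raise.
      result -> {:ok, result}
    after
      # Always runs, whether or not an error was raised.
      IO.puts("divide/2 finished")
    end
  end
end

IO.inspect(SafeDiv.divide(4, 2))
# prints {:ok, 2.0}
IO.inspect(SafeDiv.divide(1, 0))
# prints {:error, :division_by_zero}
```

Each call also prints "divide/2 finished" before its result, because the after block runs first.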

Types and specs

Typespecs are a notation for declaring function signatures (specifications) and declaring custom types.

Typespecs are defined using the @spec attribute, placed before the function.

The argument types are placed in parentheses after the function name, followed by :: and the return type.

@spec foo(integer) :: integer
def foo(num) do
  ...
end

@type is used for defining custom types:

@type some_type :: integer

@type some_other_type :: %{
  text: String.t,
  amount: number
}

@typedoc is used to document custom types, just like @doc and @moduledoc document functions and modules.

Types defined in a module are available outside, through the module namespace.

SomeModule.my_custom_type

You can also make types private using @typep

Typespecs are not checked by the compiler; static analysis tools such as Dialyzer use them to find inconsistencies.
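Putting @type, @typedoc and @spec together in one small sketch. The Wallet module and its types are hypothetical:

```elixir
defmodule Wallet do
  @typedoc "An amount of money in cents"
  @type cents :: non_neg_integer

  @typedoc "A wallet is a map with an owner and a balance"
  @type t :: %{owner: String.t(), balance: cents}

  # The spec documents the signature; it does not change how the code runs.
  @spec deposit(t, cents) :: t
  def deposit(wallet, amount) do
    %{wallet | balance: wallet.balance + amount}
  end
end

IO.inspect(Wallet.deposit(%{owner: "ann", balance: 100}, 50))
# prints %{balance: 150, owner: "ann"}
```

From another module the types would be referenced as Wallet.t and Wallet.cents.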

Behaviours

Behaviours are a way to provide a public API and to ensure the functions are implemented.

They define a set of functions that need to be implemented.

They are similar to interfaces if you want.

The functions a behaviour requires are declared with @callback (following typespec definitions); a module adopts the behaviour with @behaviour and can mark its implementations with @impl.

E.g. a JSON parser, implementing the Parser behaviour:

defmodule JSONParser do
  @behaviour Parser

  @impl Parser
  def parse(str) do
  ...
  end
end

Where Parser could look like this:

defmodule Parser do
  @doc """
  Parses a string.
  """
  @callback parse(String.t) :: {:ok, term} | {:error, String.t}
end

You can also use "dynamic dispatch": a function in the behaviour module receives the implementing module as an argument and delegates to it.
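A sketch of dynamic dispatch, following the pattern from the Getting Started guide. Parser mirrors the behaviour above with an added parse!/2; IntegerParser is a made-up implementation that is easy to run:

```elixir
defmodule Parser do
  @callback parse(String.t()) :: {:ok, term} | {:error, String.t()}

  # Dynamic dispatch: the implementing module is passed in as an argument.
  def parse!(implementation, contents) do
    case implementation.parse(contents) do
      {:ok, data} -> data
      {:error, error} -> raise ArgumentError, "parsing error: #{error}"
    end
  end
end

defmodule IntegerParser do
  @behaviour Parser

  @impl Parser
  def parse(str) do
    case Integer.parse(str) do
      # Only accept input that is an integer and nothing else.
      {int, ""} -> {:ok, int}
      _ -> {:error, "not an integer: " <> str}
    end
  end
end

IO.inspect(Parser.parse!(IntegerParser, "42"))
# prints 42
IO.inspect(IntegerParser.parse("abc"))
# prints {:error, "not an integer: abc"}
```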