A detailed example for how to encrypt data in an Elixir (Phoenix v1.7) App before inserting into a database using Ecto Types

dwyl/phoenix-ecto-encryption-example

Phoenix Ecto Encryption Example


Note: we wrote this example/tutorial to understand how to do field-level encryption from first principles.
Once we solved the problem, we built a library to streamline it: fields.
We still recommend going through this example, but if you just want to get on with building your Phoenix App, use fields.


Why?

Encrypting User/Personal data stored by your Web App is essential for security/privacy.

If your app offers any personalised content or interaction that depends on "login", it is storing personal data (by definition). You might be tempted to think that data is "safe" in a database, but it's not. There is an entire ("dark") army/industry of people (cybercriminals) who target websites/apps attempting to "steal" data by compromising databases. All the time you spend building your app, they spend trying to "break" apps like yours. Don't let the people using your app be the victims of identity theft, protect their personal data! (it's both the "right" thing to do and the law ...)

What?

This example/tutorial is intended as a comprehensive answer to the question:

"How to Encrypt/Decrypt Sensitive Data in Elixir Apps Before Inserting (Saving) it into the Database?"

Technical Overview

We are not "re-inventing encryption" or using our "own algorithm"; everyone knows that's a "bad idea": https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own
We are following a battle-tested industry-standard approach and applying it to our Elixir/Phoenix App.
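We are using:

  • AES in Galois/Counter Mode (GCM) with 256 bit keys, via Erlang's :crypto module, for symmetric encryption.
  • SHA-256 for (fast, deterministic) hashing of email addresses.
  • Argon2 for (slow, salted) hashing of passwords.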

¯\_(ツ)_/¯...? Don't be "put off" if any of these terms/algorithms are unfamiliar to you;
this example is "step-by-step" and we are happy to answer/clarify any (relevant and specific) questions you have!

OWASP Cryptographic Rules?

This example/tutorial follows the Open Web Application Security Project (OWASP) Cryptographic and Password rules:

  • Use "strong approved Authenticated Encryption" based on an AES algorithm.
    • Use GCM mode of operation for symmetric key cryptographic block ciphers.
    • Keys used for encryption must be rotated at least annually.
  • Only use approved public algorithm SHA-256 or better for hashing.
  • Argon2 is the winner of the password hashing competition and should be your first choice for new applications.

See:

Who?

This example/tutorial is for any developer (or technical decision maker / "application architect")
who takes personal data protection seriously and wants a robust/reliable and "transparent" way
of encrypting data before storing it, and decrypting when it is queried.

Prerequisites?

If you are totally new to (or "rusty" on) Elixir, Phoenix or Ecto, we recommend going through our Phoenix Chat Example (Beginner's Tutorial) first: https://github.com/dwyl/phoenix-chat-example

Crypto Knowledge?

You will not need any "advanced" mathematical knowledge; we are not "inventing" our own encryption or going into the "internals" of any cyphers/algorithms/schemes.

You do not need to understand how the encryption/hashing algorithms work,
but it is useful to know the difference between encryption vs. hashing and plaintext vs. ciphertext.

The fact that the example/tutorial follows all OWASP crypto/hashing rules (see: "OWASP Cryptographic Rules?" section above), should be "enough" for most people who just want to focus on building their app and don't want to "go down the rabbit hole".

However ... We have included 30+ links in the "Useful Links" section at the end of this readme. The list includes several common questions (and answers) so if you are curious, you can learn.

Note: in the @dwyl Library we have https://www.schneier.com/books/applied_cryptography So, if you're really curious let us know!

Time Requirement?

Simply reading ("skimming") through this example will only take 15 minutes.
Following the examples on your computer (to fully understand it) will take around 1 hour
(including reading a few of the links).

Invest the time up-front to avoid the embarrassment and fines of a data breach.

How?

These are "step-by-step" instructions; don't skip any step(s).

1. Create the encryption App

In your Terminal, create a new Phoenix application called "encryption":

mix phx.new encryption

When you see Fetch and install dependencies? [Yn],
type y and press the [Enter] key to download and install the dependencies.
You should see the following in your terminal:

* running mix deps.get
* running mix deps.compile
* running cd assets && npm install && node node_modules/webpack/bin/webpack.js --mode development

We are almost there! The following steps are missing:

    $ cd encryption

Then configure your database in config/dev.exs and run:

    $ mix ecto.create

Start your Phoenix app with:

    $ mix phx.server

You can also run your app inside IEx (Interactive Elixir) as:

    $ iex -S mix phx.server

Follow the first instruction and change into the encryption directory:

cd encryption

Next create the database for the App using the command:

mix ecto.create

You should see the following output:

Compiling 13 files (.ex)
Generated encryption app
The database for Encryption.Repo has been created

2. Create the user Schema (Database Table)

In our example user database table, we are going to store 3 (primary) pieces of data.

  • name: the person's name (encrypted)
  • email: their email address (encrypted)
  • password_hash: the hashed password (so the person can login)

In addition to the 3 "primary" fields, we need one more field to store "metadata":

  • email_hash: so we can check ("lookup") if an email address is in the database without having to decrypt the email(s) stored in the DB.

Create the user schema using the following generator command:

mix phx.gen.schema User users email:binary email_hash:binary name:binary password_hash:binary

The reason we are creating the encrypted/hashed fields as :binary is that the data stored in them will be encrypted and :binary is the most efficient Ecto/SQL data type for storing encrypted data; storing it as a String would take up more bytes for the same data. i.e. wasteful without any benefit to security or performance.
see: https://dba.stackexchange.com/questions/56934/what-is-the-best-way-to-store-a-lot-of-user-encrypted-data
and: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
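
The size difference is easy to check in iex. The following is a minimal sketch, where 32 random bytes merely stand in for a piece of encrypted data: the raw binary takes 32 bytes, while its Base64 and hexadecimal text encodings take 44 and 64 bytes respectively.

```elixir
# 32 random bytes stand in for a piece of encrypted data (illustrative only).
raw = :crypto.strong_rand_bytes(32)

# Text encodings of the same bytes need more space than the raw binary.
sizes = %{
  raw: byte_size(raw),
  base64: byte_size(Base.encode64(raw)),
  hex: byte_size(Base.encode16(raw))
}

IO.inspect(sizes)
```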

Next we need to update our newly created migration file. Open priv/repo/migrations/{timestamp}_create_users.exs.

Your migration file will have a slightly different name to ours as migration files are named with a timestamp when they are created but it will be in the same location.

Update the file from:

defmodule Encryption.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users) do
      add(:email, :binary)
      add(:email_hash, :binary)
      add(:name, :binary)
      add(:password_hash, :binary)

      timestamps()
    end
  end
end

To

defmodule Encryption.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users) do
      add(:email, :binary)
      add(:email_hash, :binary)
      add(:name, :binary)
      add(:password_hash, :binary)

      timestamps()
    end

    create(unique_index(:users, [:email_hash]))
  end
end

The newly added line ensures that we will never be allowed to enter duplicate email_hash values into our database.

Run the "migration" task to create the tables in the Database:

mix ecto.migrate

Running the mix ecto.migrate command will create the users table in your encryption_dev database.
You can view this (empty) table in a PostgreSQL GUI such as pgAdmin.
[Screenshot: the users table in pgAdmin]

3. Define The 6 Functions

We need 6 functions for encrypting, decrypting, hashing and verifying the data we will be storing:

  1. Encrypt - to encrypt any personal data we want to store in the database.
  2. Decrypt - decrypt any data that needs to be viewed.
  3. Get Key - get the latest encryption/decryption key (or a specific older key where data was encrypted with a different key)
  4. Hash Email (deterministic & fast) - so that we can "lookup" an email without "decrypting". The hash of an email address should always be the same.
  5. Hash Password (pseudorandom & slow) - the output of the hash should always be different and relatively slow to compute.
  6. Verify Password - check a password against the stored password_hash to confirm that the person "logging-in" has the correct password.

The next 6 sections of the example/tutorial will walk through the creation of (and testing) these functions.

Note: If you have any questions on these functions, please ask:
github.com/dwyl/phoenix-ecto-encryption-example/issues

3.1 Encrypt

Create a file called lib/encryption/aes.ex and copy-paste (or hand-write) the following code:

defmodule Encryption.AES do
  @aad "AES256GCM" # Use AES 256 Bit Keys for Encryption.

  def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(16) # create random Initialisation Vector
    key = get_key()    # get the *latest* key in the list of encryption keys
    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)
    iv <> tag <> ciphertext # "return" iv with the cipher tag & ciphertext
  end

  defp get_key do # this is a "dummy function" we will update it in step 3.3
    <<109, 182, 30, 109, 203, 207, 35, 144, 228, 164, 106, 244, 38, 242,
    106, 19, 58, 59, 238, 69, 2, 20, 34, 252, 122, 232, 110, 145, 54,
    241, 65, 16>> # return a (hard-coded) 32 Byte / 256 bit binary to use as key.
  end
end

The encrypt/1 function for encrypting plaintext into ciphertext is quite simple; (the "body" is only 4 lines).

Let's "step through" these lines one at a time:

  • First we create a random Initialisation Vector (IV) of 16 bytes using :crypto.strong_rand_bytes(16). Because the IV is different on every call, encrypting the same plaintext twice produces different ciphertext. Having different ciphertext each time plaintext is encrypted is essential for "semantic security" whereby repeated use of the same encryption key and algorithm does not allow an "attacker" to infer relationships between segments of the encrypted message. Cryptanalysis techniques are well "beyond scope" for this example/tutorial, but we highly encourage you to check out the "Background Reading" links at the end and read up on the subject for deeper understanding.

  • Next we use the get_key/0 function to retrieve the latest encryption key so we can use it to encrypt the plaintext (the "real" get_key/0 is defined below in section 3.3).

  • Then we use the Erlang :crypto.crypto_one_time_aead/6 function to encrypt the plaintext,
    using :aes_256_gcm ("Advanced Encryption Standard Galois Counter Mode" with a 256 bit key):

    • @aad is a "module attribute" (Elixir's equivalent of a "constant") defined in aes.ex as @aad "AES256GCM".
      It is passed as the "Additional Authenticated Data": data that is covered by the authentication tag but is not itself encrypted. The value is simply a label for the encryption scheme we are using, which breaks down into 3 parts:
      • AES = Advanced Encryption Standard.
      • 256 = "256 Bit Key"
      • GCM = "Galois Counter Mode"
  • Finally we "return" the iv with the ciphertag & ciphertext, this is what we store in the database. Including the IV and ciphertag is essential for allowing decryption, without these two pieces of data, we would not be able to "reverse" the process.

Note: in addition to this encrypt/1 function, we have defined an encrypt/2 "sister" function which accepts a specific (encryption) key_id so that we can use the desired encryption key for encrypting a block of text. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".
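
To make the layout of the returned binary concrete, here is a minimal sketch that can be pasted into iex. It assumes a throwaway random key instead of get_key/0 and uses an anonymous function in place of the AES module; both are stand-ins for illustration, not the code from aes.ex.

```elixir
# Assumption: a throwaway random key, used only for this demonstration.
key = :crypto.strong_rand_bytes(32)
aad = "AES256GCM"

encrypt = fn plaintext ->
  # A fresh random IV for every call.
  iv = :crypto.strong_rand_bytes(16)

  {ciphertext, tag} =
    :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, aad, true)

  iv <> tag <> ciphertext
end

blob = encrypt.("hello")

# 16 bytes of IV + 16 bytes of tag + 5 bytes of ciphertext.
IO.inspect(byte_size(blob))

# Because the IV differs per call, the two results are not equal.
IO.inspect(encrypt.("hello") == encrypt.("hello"))
```

The ciphertext itself is as long as the plaintext; the extra 32 bytes are the IV and the tag that decryption needs.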

Test the encrypt/1 Function

Create a file called test/lib/aes_test.exs and copy-paste the following code into it:

defmodule Encryption.AESTest do
  use ExUnit.Case
  alias Encryption.AES

  test ".encrypt includes the random IV in the value" do
    <<iv::binary-16, ciphertext::binary>> = AES.encrypt("hello")

    assert byte_size(iv) == 16
    assert byte_size(ciphertext) != 0
    assert is_binary(ciphertext)
  end

  test ".encrypt does not produce the same ciphertext twice" do
    assert AES.encrypt("hello") != AES.encrypt("hello")
  end
end

Run these two tests by running the following command:

mix test test/lib/aes_test.exs

The full function definitions for AES encrypt/1 & encrypt/2 are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs

3.2 Decrypt

The decrypt function reverses the work done by encrypt; it accepts a "blob" of ciphertext (which, as you may recall, has the IV and cipher tag prepended to it) and returns the original plaintext.

In the lib/encryption/aes.ex file, copy-paste (or hand-write) the following decrypt/1 function definition:

def decrypt(ciphertext) do
  <<iv::binary-16, tag::binary-16, ciphertext::binary>> =
    ciphertext
  :crypto.crypto_one_time_aead(:aes_256_gcm, get_key(), iv, ciphertext, @aad, tag, false)
end

The first step (line) is to "split" the IV and the tag from the ciphertext using Elixir's binary pattern matching.

If you are unfamiliar with Elixir binary pattern matching syntax: <<iv::binary-16, tag::binary-16, ciphertext::binary>> read the following guide: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
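
As a small illustration of that syntax, the sketch below builds a blob with the same layout and splits it again. The byte values are arbitrary placeholders, not real cryptographic material.

```elixir
# Build a blob with the same layout as the output of encrypt/1.
# The byte values are arbitrary placeholders (illustrative only).
iv = :binary.copy(<<1>>, 16)
tag = :binary.copy(<<2>>, 16)
blob = iv <> tag <> "payload"

# Take the first 16 bytes, the next 16 bytes, and then whatever is left.
<<iv_part::binary-16, tag_part::binary-16, rest::binary>> = blob

IO.inspect(byte_size(iv_part))
IO.inspect(byte_size(tag_part))
IO.inspect(rest)
```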

The :crypto.crypto_one_time_aead(:aes_256_gcm, get_key(), iv, ciphertext, @aad, tag, false) line is very similar to the encrypt function.

The ciphertext is decrypted using crypto_one_time_aead/7, passing in the following parameters:

  • :aes_256_gcm = encryption algorithm
  • get_key() = get the encryption key used to encrypt the plaintext
  • iv = the original Initialisation Vector used to encrypt the plaintext
  • ciphertext = the encrypted data, with the IV and tag already split off
  • @aad = the same Additional Authenticated Data that was passed when encrypting
  • tag = the authentication tag that was produced when the plaintext was encrypted
  • false = decrypt (true means encrypt)

Finally return just the original plaintext.

Note: as above with the encrypt/2 function, we have defined a decrypt/2 "sister" function which accepts a specific (encryption) key_id so that we can use the desired encryption key for decrypting the ciphertext. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".
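
The tag is what makes the encryption "authenticated": if any part of the stored data is altered, decryption fails instead of returning garbage. A minimal sketch for iex, assuming a throwaway random key and an anonymous function standing in for decrypt/1:

```elixir
# Assumption: a throwaway random key, used only for this demonstration.
key = :crypto.strong_rand_bytes(32)
aad = "AES256GCM"
iv = :crypto.strong_rand_bytes(16)

{ciphertext, tag} =
  :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, "hello", aad, true)

decrypt = fn blob ->
  <<iv_part::binary-16, tag_part::binary-16, data::binary>> = blob
  :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv_part, data, aad, tag_part, false)
end

# Round trip with the untouched blob returns the plaintext.
plaintext = decrypt.(iv <> tag <> ciphertext)

# Flip one bit in the first byte of the tag to simulate tampering.
<<first_byte, rest_of_tag::binary>> = tag
bad_tag = <<:erlang.bxor(first_byte, 1), rest_of_tag::binary>>

# Authentication fails, so :crypto returns the atom :error.
tampered = decrypt.(iv <> bad_tag <> ciphertext)

IO.inspect(plaintext)
IO.inspect(tampered)
```

Code that calls decrypt on untrusted data may therefore want to handle the :error result explicitly.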

Test the decrypt/1 Function

In the test/lib/aes_test.exs add the following test:

test "decrypt/1 ciphertext that was encrypted with default key" do
  plaintext = "hello" |> AES.encrypt() |> AES.decrypt()
  assert plaintext == "hello"
end

Re-run the tests mix test test/lib/aes_test.exs and confirm they pass.

The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs (https://github.com/dwyl/phoenix-ecto-encryption-example/blob/master/test/lib/aes_test.exs)

3.3 Key rotation

Key rotation is a "best practice" that limits the amount of data an "attacker" can decrypt if the database were ever "compromised" (provided we keep the encryption keys safe, that is!). A really good guide to this is: https://cloud.google.com/kms/docs/key-rotation.

For this reason we want to "store" a key_id. The key_id indicates which encryption key was used to encrypt the data. Besides the IV and ciphertag, the key_id is also essential for allowing decryption, so we change the encrypt/1 function to preserve the key_id as well:

defmodule Encryption.AES do
  @aad "AES256GCM" # Use AES 256 Bit Keys for Encryption.

  def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(16)
    # get latest key
    key = get_key()
    # get latest ID;
    key_id = get_key_id()
    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)
    iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
  end

  defp get_key do
    get_key_id() |> get_key
  end

  defp get_key(key_id) do
    encryption_keys() |> Enum.at(key_id)
  end

  defp get_key_id do
    Enum.count(encryption_keys()) - 1
  end

  defp encryption_keys do
    Application.get_env(:encryption, Encryption.AES)[:keys]
  end
end

For the complete file containing these functions see: lib/encryption/aes.ex

For this example/demo we are using two encryption keys which are kept as an application environment variable. The values of the encryption keys are associated with the key Encryption.AES. During encryption we always use the latest (most recent) encryption key by default (get_key/0), and the corresponding key_id is fetched by get_key_id/0 and becomes part of the ciphertext.

With decrypting we now pattern match the associated key_id from the ciphertext in order to be able to decrypt with the correct encryption key.

  def decrypt(ciphertext) do
    <<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>> =
      ciphertext

    :crypto.crypto_one_time_aead(:aes_256_gcm, get_key(key_id), iv, ciphertext, @aad, tag, false)
  end

So we defined get_key twice in lib/encryption/aes.ex, as per Erlang/Elixir standard: once for each "arity" (number of "arguments"). In the first case get_key/0 assumes you want the latest Encryption Key. The second case get_key/1 lets you supply the key_id to be "looked up".

Both versions of get_key use encryption_keys/0 function to call the Application.get_env function: Application.get_env(:encryption, Encryption.AES)[:keys] specifically. For this to work we need to define the keys as an Environment Variable and make it available to our App in config.exs.
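
The whole rotation idea can be tried in isolation. The sketch below is an illustration under stated assumptions: the key list is passed in as an argument (instead of being read with Application.get_env), the keys are throwaway random bytes, and anonymous functions stand in for encrypt/1 and decrypt/1.

```elixir
# Assumption: two throwaway keys standing in for the configured key list.
keys = [:crypto.strong_rand_bytes(32), :crypto.strong_rand_bytes(32)]
aad = "AES256GCM"

# Encrypt with the newest key and record its index (key_id) in the output.
encrypt = fn plaintext, keys ->
  key_id = Enum.count(keys) - 1
  iv = :crypto.strong_rand_bytes(16)

  {ciphertext, tag} =
    :crypto.crypto_one_time_aead(:aes_256_gcm, Enum.at(keys, key_id), iv, plaintext, aad, true)

  iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
end

# Read the key_id back out of the blob and decrypt with the matching key.
decrypt = fn blob, keys ->
  <<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>> = blob
  :crypto.crypto_one_time_aead(:aes_256_gcm, Enum.at(keys, key_id), iv, ciphertext, aad, tag, false)
end

old_blob = encrypt.("hello", keys)

# "Rotate" by appending a new key; new data now uses key_id 2.
rotated_keys = keys ++ [:crypto.strong_rand_bytes(32)]
new_blob = encrypt.("hello", rotated_keys)

# The key_id sits in the 4 bytes after the 16 byte IV and 16 byte tag.
<<_::binary-32, old_id::32, _::binary>> = old_blob
<<_::binary-32, new_id::32, _::binary>> = new_blob

IO.inspect({old_id, new_id})
IO.inspect(decrypt.(old_blob, rotated_keys))
```

Note that this scheme relies on new keys only ever being appended: removing or reordering keys would change which key a stored key_id points to.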

3.4 ENCRYPTION_KEYS Environment Variable

In order for our get_key/0 and get_key/1 functions to work, they need to be able to "read" the encryption keys.

We need to "export" an Environment Variable containing a (comma-separated) list of (one or more) encryption key(s).

Copy-paste (and run) the following command in your terminal:

echo "export ENCRYPTION_KEYS='nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='" >> .env && echo ".env" >> .gitignore

For now, copy paste this command exactly as it is.
When you are deploying your own App, generate your own AES encryption key(s); see the "How To Generate AES Encryption Keys?" section below for how to do this.

Note: there are two encryption keys separated by a comma. This is to demonstrate that it's possible to use multiple keys.

We prefer to store our Encryption Keys as Environment Variables; this is consistent with the "12 Factor App" best practice: https://en.wikipedia.org/wiki/Twelve-Factor_App_methodology

Update the config/config.exs to load the environment variables from the .env file into the application. Add the following code to your config file just above import_config "#{Mix.env()}.exs":

# Read the .env file line by line and load each variable into the environment.
try do                                     # wrap in "try do"
  File.stream!("./.env")                   # in case .env file does not exist.
    |> Stream.map(&String.trim_trailing/1) # remove excess whitespace
    |> Enum.each(fn line -> line           # loop through each line
      |> String.replace("export ", "")     # remove "export" from line
      |> String.split("=", parts: 2)       # split on *first* "=" (equals sign)
      |> Enum.reduce(fn(value, key) ->     # stackoverflow.com/q/33055834/1148249
        System.put_env(key, value)         # set each environment variable
      end)
    end)
rescue
  _ -> IO.puts "no .env file found!"
end

# Set the Encryption Keys as an "Application Variable" accessible in aes.ex
config :encryption, Encryption.AES,
  keys: System.get_env("ENCRYPTION_KEYS") # get the ENCRYPTION_KEYS env variable
    |> String.replace("'", "")  # remove single-quotes around key list in .env
    |> String.split(",")        # split the CSV list of keys
    |> Enum.map(fn key -> :base64.decode(key) end) # decode the key.
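
To see what this parsing does, the same steps can be applied to a single line in iex. This is a sketch that works on a string instead of the .env file and binds the results to local variables rather than calling System.put_env; each decoded key should come out as 32 bytes (256 bits).

```elixir
# One line as it appears in the .env file created above.
line =
  "export ENCRYPTION_KEYS='nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='"

# Drop "export " and split on the *first* "=" only,
# because the Base64 values themselves end in "=".
[name, value] =
  line
  |> String.replace("export ", "")
  |> String.split("=", parts: 2)

# Strip the single quotes, split the comma-separated list, decode each key.
keys =
  value
  |> String.replace("'", "")
  |> String.split(",")
  |> Enum.map(fn key -> :base64.decode(key) end)

IO.inspect(name)
IO.inspect(Enum.map(keys, &byte_size/1))
```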

Test the get_key/0 and get_key/1 Functions?

Given that get_key/0 and get_key/1 are both defp (i.e. "private") they are not "exported" with the AES module and therefore cannot be invoked outside of the AES module.

The get_key/0 and get_key/1 are invoked by encrypt/1 and decrypt/1 and thus provided these (public) latter functions are tested adequately, the "private" functions will be too.

Re-run the tests mix test test/lib/aes_test.exs and confirm they still pass.

We also define a test in order to verify the working of key rotation. We add a new encryption key and assert (and make sure) that an encrypted value with an older encryption key will still be decrypted correctly.

  test "can still decrypt the value after adding a new encryption key" do
    encrypted_value = "hello" |> AES.encrypt()

    original_keys = Application.get_env(:encryption, Encryption.AES)[:keys]

    # add a new key
    Application.put_env(:encryption, Encryption.AES,
      keys: original_keys ++ [:crypto.strong_rand_bytes(32)]
    )

    assert "hello" == encrypted_value |> AES.decrypt()

    # rollback to the original keys
    Application.put_env(:encryption, Encryption.AES, keys: original_keys)
  end

The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex And tests are in: test/lib/aes_test.exs

4. Hash Email Address

The idea behind hashing email addresses is to allow us to perform a lookup (in the database) to check if the email has already been registered/used for the app/system.

Imagine that [email protected] has previously used your app. The SHA256 hash (encoded as base64) is: "bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="

try it for yourself in iex:

iex(1)> email = "[email protected]"
"[email protected]"
iex(2)> email_hash = :crypto.hash(:sha256, email) |> Base.encode64
"bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="

If we store the email_hash in the database, when Alex wants to log-in to the App/System, we simply perform a "lookup" in the users table:

hash  = :crypto.hash(:sha256, email) |> Base.encode64
query = "SELECT * FROM users WHERE email_hash = $1"
user  = Ecto.Adapters.SQL.query!(Encryption.Repo, query, [hash])
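
This kind of lookup only works because the hash is deterministic: the same address always produces the same digest, and a different address produces a different one. A minimal sketch for iex, using made-up addresses (assumptions for illustration only):

```elixir
# Assumption: made-up addresses used only for illustration.
hash_email = fn value -> :crypto.hash(:sha256, value) |> Base.encode64() end

first = hash_email.("alice@example.com")
second = hash_email.("alice@example.com")
other = hash_email.("bob@example.com")

# Same input, same digest; different input, different digest.
IO.inspect(first == second)
IO.inspect(first == other)
```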

Note: there's a "built-in" Ecto get_by function to perform this type of
"SELECT ... WHERE field = value" query effortlessly

4.1 Generate the SECRET_KEY_BASE

All Phoenix apps have a secret_key_base for sessions. see: https://hexdocs.pm/plug/1.13.6/Plug.Session.COOKIE.html

Run the following command to generate a new phoenix secret key:

mix phx.gen.secret

Copy-paste the output (a 64-character String) into your .env file after the "equals sign" on the line for SECRET_KEY_BASE:

export SECRET_KEY_BASE={YourSecreteKeyBaseGeneratedUsing-mix_phx.gen.secret}

Your .env file should look similar to: .env_sample

Load the secret key into your environment by typing into your terminal:

source .env

Note: We are using an .env file, but if you are using a "Cloud Platform" to deploy your app,
you could consider using their "Key Management Service" for managing encryption keys.

We now need to update our config files again. Open your config.exs file and change the following from:

  secret_key_base: "3PXN/6k6qoxqQjWFskGew4r74yp7oJ1UNF6wjvJSHjC5Y5LLIrDpWxrJ84UBphJn",
  # your secret_key_base will be different but that is fine.

To

  secret_key_base: System.get_env("SECRET_KEY_BASE"),

As mentioned above, all Phoenix applications come with a secret_key_base. Instead of using this default one, we have told our application to use the new one that we added to our .env file.

Now we need to edit our config/test.exs file. Change the following: from

config :encryption, EncryptionWeb.Endpoint,
  http: [port: 4001],
  server: false

To

config :encryption, EncryptionWeb.Endpoint,
  http: [port: 4001],
  server: false,
  secret_key_base: System.get_env("SECRET_KEY_BASE")

By adding the previous code block we will now have a secret_key_base which we will be able to use for testing.

5. Create and use HashField Custom Ecto Type

When we first created the Ecto Schema for our "user" in Step 2 (above), the generator created the lib/encryption/user.ex file with the following schema:

schema "users" do
  field :email, :binary
  field :email_hash, :binary
  field :name, :binary
  field :password_hash, :binary

  timestamps()
end

The default Ecto field types (:binary) are a good start. But we can do so much better if we define custom Ecto Types!

Ecto Custom Types are a way of automatically "pre-processing" data before inserting it into (and reading from) a database. Examples of "pre-processing" include:

  • Custom Validation e.g: phone number or address format.
  • Encrypting / Decrypting
  • Hashing

A custom type expects 6 callback functions to be implemented in the file:

  • type/0 - define the Ecto Type we want Ecto to use to store the data for our Custom Type. e.g: :integer or :binary
  • cast/1 - "typecasts" (converts) the given data to the desired type e.g: Integer to String.
  • dump/1 - performs the "processing" on the raw data before it gets "dumped" into the Ecto Native Type.
  • load/1 - called when loading data from the database and receive an Ecto native type.
  • embed_as/1 - the return value (:self or :dump) determines how the type is treated inside embeds (not used here).
  • equal?/2 - invoked to determine if changing a type's field value changes the corresponding database record.

Create a file called lib/encryption/hash_field.ex and add the following:

defmodule Encryption.HashField do
  @behaviour Ecto.Type

  def type, do: :binary

  def cast(value) do
    {:ok, to_string(value)}
  end

  def dump(value) do
    {:ok, hash(value)}
  end

  def load(value) do
    {:ok, value}
  end

  def embed_as(_), do: :self

  def equal?(value1, value2), do: value1 == value2

  def hash(value) do
    :crypto.hash(:sha256, value <> get_salt(value))
  end

  # Get/use Phoenix secret_key_base as "salt" for one-way hashing Email address
  # use the *value* to create a *unique* "salt" for each value that is hashed:
  defp get_salt(value) do
    secret_key_base =
      Application.get_env(:encryption, EncryptionWeb.Endpoint)[:secret_key_base]
    :crypto.hash(:sha256, value <> secret_key_base)
  end
end

Let's step through each of these

type/0

The best data type for storing encrypted data is :binary (it uses half the "space" of a :string for the same ciphertext).

cast/1

Cast any data type to_string before encrypting it. (the encrypted data "ciphertext" will be of :binary type)

dump/1

The hash/1 function uses Erlang's crypto library hash/2 function.

  • First we tell the hash/2 function that we want to use :sha256. "SHA 256" is the most widely used/recommended hash; it's both fast and "secure".
  • We then concatenate the value passed in to the hash/1 function (we defined) with a "salt" and hash the result. The salt comes from the get_salt/1 function, which retrieves the secret_key_base environment variable and computes a unique "salt" using the value.

We use the SHA256 one-way hash for speed. We "salt" the email address so that the hash has some level of "obfuscation", in case the DB is ever "compromised" the "attacker" still has to "compute" a "rainbow table" from scratch.
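
The effect of the salt can be checked in iex. The sketch below mirrors hash/1 and get_salt/1 as anonymous functions and assumes a placeholder secret in place of the real secret_key_base; the address is made up for illustration.

```elixir
# Assumption: a placeholder secret; in the app this is the secret_key_base.
secret_key_base = "not-a-real-secret-key-base"

get_salt = fn value -> :crypto.hash(:sha256, value <> secret_key_base) end
hash = fn value -> :crypto.hash(:sha256, value <> get_salt.(value)) end

# Assumption: a made-up address used only for illustration.
email = "alice@example.com"

salted = hash.(email)
unsalted = :crypto.hash(:sha256, email)

# Still deterministic, so it can be used for lookups ...
IO.inspect(salted == hash.(email))

# ... but no longer equal to the plain SHA256 of the address.
IO.inspect(salted == unsalted)
```

Without knowing the secret, an attacker cannot reuse a precomputed table of plain SHA256 digests of known email addresses.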

load/1

Return the hash value as it is read from the database.

embed_as/1

This callback is only of importance when the type is part of an embed. It's not used here, but required for modules adopting the Ecto.Type behaviour as of Ecto 3.2.

equal?/2

This callback is invoked when we cast changes into a changeset and want to determine whether the database record needs to be updated. We use a simple equality comparison (==) to compare the current value to the requested update. If both values are equal, there's no need to update the record.

Note: Don't forget to export your SECRET_KEY_BASE environment variable (see instructions above)

The full file containing these two functions is: lib/encryption/hash_field.ex
And the tests for the functions are: test/lib/hash_field_test.exs

First add the alias for HashField near the top of the lib/encryption/user.ex file. e.g:

alias Encryption.HashField

Next, in the lib/encryption/user.ex file, update the lines for email_hash in the users schema
from:

schema "users" do
  field :email, :binary
  field :email_hash, :binary
  field :name, :binary
  field :password_hash, :binary
  timestamps()
end

To:

schema "users" do
  field :email, :binary
  field :email_hash, HashField
  field :name, :binary
  field :password_hash, :binary

  timestamps()
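end

Then update the changeset/2 function in the same file and add the add_email_hash/1 helper, so that email_hash is set whenever the email changes:

  def changeset(%User{} = user, attrs \\ %{}) do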
    user
    |> cast(attrs, [:name, :email])
    |> validate_required([:email])
    |> add_email_hash
    |> unique_constraint(:email_hash)
  end

  defp add_email_hash(changeset) do
    if Map.has_key?(changeset.changes, :email) do
      changeset |> put_change(:email_hash, changeset.changes.email)
    else
      changeset
    end
  end

We should test this new functionality. Create the file test/lib/user_test.exs and add the following:

defmodule Encryption.UserTest do
  use Encryption.DataCase
  alias Encryption.User

  @valid_attrs %{
    name: "Max",
    email: "[email protected]",
    password: "NoCarbsBeforeMarbs"
  }

  @invalid_attrs %{}

  describe "Verify correct working of hashing" do
    setup do
      user = Repo.insert!(User.changeset(%User{}, @valid_attrs))
      {:ok, user: user, email: @valid_attrs.email}
    end

    test "inserting a user sets the :email_hash field", %{user: user} do
      assert user.email_hash == user.email
    end

    test ":email_hash field is the encrypted hash of the email", %{user: user} do
      user_from_db = User |> Repo.one()
      assert user_from_db.email_hash == Encryption.HashField.hash(user.email)
    end
  end
end

For the full user tests please see: test/user/user_test.exs

6. Create and use Hash Password Custom Ecto Type

When hashing passwords, we want to use the strongest hashing algorithm and we also want the hashed value (or "digest") to be different each time the same plaintext is hashed (unlike when hashing the email address where we want a deterministic digest).

Using argon2 makes "cracking" a password (in the event of the database being "compromised") far less likely as it uses both a CPU-bound "work-factor" and a "Memory-hard" algorithm which will significantly "slow down" the attacker.

Add the argon2 Dependency

In order to use argon2 we must add it to our mix.exs file: in the defp deps do (dependencies) section, add the following line:

{:argon2_elixir, "~> 1.3"},  # securely hashing & verifying passwords

You will need to run mix deps.get to install the dependency.

6.1 Define the hash_password/1 Function

Create a file called lib/encryption/password_field.ex in your project. The first function we need is hash_password/1:

defmodule Encryption.PasswordField do

  def hash_password(value) do
    Argon2.Base.hash_password(to_string(value),
      Argon2.Base.gen_salt(), [{:argon2_type, 2}])
  end

end

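hash_password/1 accepts a password to be hashed and invokes Argon2.Base.hash_password/3 passing in 3 arguments:

  • to_string(value) - the password, converted to a String.
  • Argon2.Base.gen_salt() - a freshly generated random salt, so that the same password never produces the same hash twice.
  • [{:argon2_type, 2}] - the options list; type 2 selects the Argon2id variant.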

6.1.1 Test the hash_password/1 Function?

In order to test the PasswordField.hash_password/1 function we use the Argon2.verify_pass function to verify a password hash.

Create a file called test/lib/password_field_test.exs and copy-paste (or hand-type) the following test:

defmodule Encryption.PasswordFieldTest do
  use ExUnit.Case
  alias Encryption.PasswordField, as: Field

  test ".verify_password checks the password against the Argon2id Hash" do
    password = "EverythingisAwesome"
    hash = Field.hash_password(password)
    verified = Argon2.verify_pass(password, hash)
    assert verified
  end

end

Run the test using the command:

mix test test/lib/password_field_test.exs

The test should pass; if not, please re-trace the steps.

6.2 Verify Password

The corresponding function to check (or "verify") the password is verify_password/2. We need to supply both the password and the stored_hash (the hash that was previously stored in the database when the person registered or updated their password). It then runs Argon2.verify_pass, which does the checking.

def verify_password(password, stored_hash) do
  Argon2.verify_pass(password, stored_hash)
end

hash_password/1 and verify_password/2 functions are defined in: lib/encryption/password_field.ex

Test for verify_password/2

To test that our verify_password/2 function works as expected, open the file: test/lib/password_field_test.exs
and add the following code to it:

test ".verify_password fails if password does NOT match hash" do
  password = "EverythingisAwesome"
  hash = Field.hash_password(password)
  verified = Field.verify_password("LordBusiness", hash)
  assert !verified
end

Run the tests: mix test test/lib/password_field_test.exs and confirm they pass.

If you get stuck, see: test/lib/password_field_test.exs

Define the other Ecto.Type behaviour functions:

defmodule Encryption.PasswordField do
  @behaviour Ecto.Type

  def type, do: :binary

  def cast(value) do
    {:ok, to_string(value)}
  end

  def dump(value) do
    {:ok, hash_password(value)}
  end

  def load(value) do
    {:ok, value}
  end

  def embed_as(_), do: :self

  def equal?(value1, value2), do: value1 == value2

  def hash_password(value) do
    Argon2.Base.hash_password(to_string(value),
      Argon2.Base.gen_salt(), [{:argon2_type, 2}])
  end

  def verify_password(password, stored_hash) do
    Argon2.verify_pass(password, stored_hash)
  end
end
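
Now that we have defined the PasswordField Custom Ecto Type, we can use it in our User Schema. Add the following line to "alias" the Type and a User in the lib/encryption/user.ex file:

alias Encryption.{HashField, PasswordField, User}

Update the line for :password_hash in the schema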
from:

schema "users" do
  field :email, :binary
  field :email_hash, HashField
  field :name, :binary
  field :password_hash, :binary

  timestamps()
end

To:

schema "users" do
  field :email, :binary
  field :email_hash, HashField
  field :name, :binary
  field :password_hash, PasswordField

  timestamps()
end

7. Create and use EncryptedField Custom Ecto Type

Create a file called lib/encryption/encrypted_field.ex and add the following:

defmodule Encryption.EncryptedField do
  alias Encryption.AES  # alias our AES encrypt & decrypt functions (3.1 & 3.2)

  @behaviour Ecto.Type  # Check this module conforms to the Ecto.Type behaviour.
  def type, do: :binary # :binary is the data type ecto uses internally

  # cast/1 simply calls to_string on the value and returns a "success" tuple
  def cast(value) do
    {:ok, to_string(value)}
  end

  # dump/1 is called when the field value is about to be written to the database
  def dump(value) do
    ciphertext = value |> to_string |> AES.encrypt
    {:ok, ciphertext} # ciphertext is :binary data
  end

  # load/1 is called when the field is loaded from the database
  def load(value) do
    {:ok, AES.decrypt(value)} # decrypted data is :string type.
  end

  # embed_as/1 dictates how the type behaves when embedded (:self or :dump)
  def embed_as(_), do: :self # preserve the type's higher level representation

  # equal?/2 is called to determine if two field values are semantically equal
  def equal?(value1, value2), do: value1 == value2
end

Let's step through each of these

type/0

The best data type for storing encrypted data is :binary (it uses half the "space" of a :string for the same ciphertext).

cast/1

Cast any data type to_string before encrypting it. (the encrypted data "ciphertext" will be of :binary type)

dump/1

Calls the AES.encrypt/1 function we defined in section 3.1 (above) so data is encrypted 'automatically' before we insert into the database.

load/1

Calls the AES.decrypt/1 function so data is 'automatically' decrypted when it is read from the database.

Note: load/1 is one of the callbacks required by the Ecto.Type behaviour (the behaviour does not define a load/2). Further reading: https://hexdocs.pm/ecto/Ecto.Type.html

embed_as/1

This callback is only of importance when the type is part of an embed. It's not used here, but required for modules adopting the Ecto.Type behaviour as of Ecto 3.2.

equal?/2

This callback is invoked when we cast changes into a changeset and want to determine whether the database record needs to be updated. We use a simple equality comparison (==) to compare the current value to the requested update. If both values are equal, there's no need to update the record.

Your encrypted_field.ex Custom Ecto Type should look like this: lib/encryption/encrypted_field.ex. Try to write the tests for the callback functions; if you get "stuck", take a look at: test/lib/encrypted_field_test.exs
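The cast/1, dump/1 and load/1 round trip can be tried out without Ecto or a database. The following is a rough sketch under stated assumptions: DemoAES is a hypothetical stand-in for our Encryption.AES module (AES-256-GCM via Erlang's :crypto, with a throwaway key), and DemoEncryptedField mirrors the callbacks above without declaring the Ecto.Type behaviour so that it runs standalone.

```elixir
defmodule DemoAES do
  # Hypothetical stand-in for Encryption.AES, for illustration only.
  @aad "AES256GCM"
  # Throwaway key generated when this module is compiled;
  # a real app loads its key(s) from configuration instead.
  @key :crypto.strong_rand_bytes(32)

  # Returns iv <> tag <> ciphertext so decrypt/1 can split them apart again.
  def encrypt(plaintext) do
    iv = :crypto.strong_rand_bytes(12)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, @key, iv, plaintext, @aad, true)

    iv <> tag <> ciphertext
  end

  # Splits off the 12-byte IV and 16-byte tag, then decrypts the remainder.
  def decrypt(<<iv::binary-12, tag::binary-16, ciphertext::binary>>) do
    :crypto.crypto_one_time_aead(:aes_256_gcm, @key, iv, ciphertext, @aad, tag, false)
  end
end

defmodule DemoEncryptedField do
  # Same shape as the EncryptedField callbacks, minus `@behaviour Ecto.Type`.
  def cast(value), do: {:ok, to_string(value)}
  def dump(value), do: {:ok, value |> to_string() |> DemoAES.encrypt()}
  def load(value), do: {:ok, DemoAES.decrypt(value)}
  def equal?(value1, value2), do: value1 == value2
end

{:ok, plaintext} = DemoEncryptedField.cast(:alex)
{:ok, ciphertext} = DemoEncryptedField.dump(plaintext)
{:ok, loaded} = DemoEncryptedField.load(ciphertext)

IO.inspect(ciphertext == plaintext, label: "stored value equals plaintext?")
IO.inspect(DemoEncryptedField.equal?(loaded, plaintext), label: "round trip ok?")
```

Because a fresh IV is generated on every call, dumping the same value twice produces two different ciphertexts. That is why we cannot look up records by an encrypted field and use the separate email_hash field for that instead.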

Now that we have defined a Custom Ecto Type EncryptedField, we can use the Type in our User Schema. Add the following line to "alias" the Type and a User in the lib/encryption/user.ex file:

alias Encryption.{HashField, PasswordField, EncryptedField, User}

Update the lines for :email and :name in the schema
from:

schema "users" do
  field :email, :binary
  field :email_hash, HashField
  field :name, :binary
  field :password_hash, PasswordField

  timestamps()
end

To:

schema "users" do
  field :email, EncryptedField
  field :email_hash, HashField
  field :name, EncryptedField
  field :password_hash, PasswordField

  timestamps()
end

8. Ensure All Tests Pass

Typically we will create a git commit (if we don't already have one) for the "known state" where the tests were passing (before starting the refactor).

The commit before refactoring the example is: https://github.com/dwyl/phoenix-ecto-encryption-example/tree/3659399ec32ca4f07f45d0552b9cf25c359a2456

The corresponding Travis-CI build for this commit is: https://travis-ci.org/dwyl/phoenix-ecto-encryption-example/jobs/379887597#L833

Note: if you are new to Travis-CI see: https://github.com/dwyl/learn-travis

Conclusion

We have gone through how to create custom Ecto Types in order to define our own functions for handling (transforming) specific types of data.

Our hope is that you have understood the flow.

We plan to extend this tutorial to include a User Interface; please "star" the repo if you would find that useful.



How To Generate AES Encryption Keys?

Encryption keys should be the appropriate length (in bits) as required by the chosen algorithm.

An AES 128-bit key can be expressed as a hexadecimal string with 32 characters.
It will require 24 characters in base64.

An AES 256-bit key can be expressed as a hexadecimal string with 64 characters.
It will require 44 characters in base64.

see: https://security.stackexchange.com/a/45334/117318
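If you want to check these lengths yourself, the following small sketch generates random keys of each size and counts the characters in each encoding. KeyLengthDemo is a hypothetical helper name used only for this illustration.

```elixir
defmodule KeyLengthDemo do
  # Generates a random key of the given size (in bits) and reports how long
  # it is as raw bytes, as a hexadecimal string and as a Base64 string.
  def lengths(bits) do
    key = :crypto.strong_rand_bytes(div(bits, 8))

    %{
      bytes: byte_size(key),
      hex: String.length(Base.encode16(key)),
      base64: String.length(Base.encode64(key))
    }
  end
end

IO.inspect(KeyLengthDemo.lengths(128), label: "AES-128")
IO.inspect(KeyLengthDemo.lengths(256), label: "AES-256")
```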

Open iex in your Terminal and paste the following line (then press enter)

:crypto.strong_rand_bytes(32) |> :base64.encode

You should see terminal output similar to the following:

[Screenshot: elixir-generate-encryption-key]

We generated 3 keys for demonstration purposes:

  • "h6pUk0ZccS0pYsibHZZ4Cd+PRO339rMA7sMz7FnmcGs="
  • "nMd/yQpR0aoasLaq1g94FL/a+A+wB44JLko47sVQXMg="
  • "L+ZVX8iheoqgqb22mUpATmMDsvVGt/foAe/0KN5uWf0="

These two Erlang functions are described in the Erlang/OTP documentation for the crypto and base64 modules.

Base64 encoding the bytes generated by strong_rand_bytes will make the output human-readable (whereas bytes are less user-friendly).
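Before a Base64-encoded key can be used for encryption it has to be decoded back into raw bytes. A minimal sketch of that step, which also checks that the result is the 32 bytes an AES 256-bit key needs, could look like this. KeyDecodeDemo and its error tuples are hypothetical names chosen for the example, not part of this project.

```elixir
defmodule KeyDecodeDemo do
  # Decodes a Base64 string and confirms it holds exactly 32 bytes (256 bits).
  def decode_key(encoded) do
    case Base.decode64(encoded) do
      {:ok, key} when byte_size(key) == 32 -> {:ok, key}
      {:ok, key} -> {:error, {:wrong_length, byte_size(key)}}
      :error -> {:error, :invalid_base64}
    end
  end
end

# One of the demonstration keys listed above (never use it for real data).
{:ok, key} = KeyDecodeDemo.decode_key("h6pUk0ZccS0pYsibHZZ4Cd+PRO339rMA7sMz7FnmcGs=")
IO.inspect(byte_size(key), label: "key size in bytes")
```

Checking the length when the app starts means a mistyped or truncated key fails early, rather than at the first encrypt call.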



Useful Links, FAQ & Background Reading

Understanding Advanced Encryption Standard (AES)

If you prefer to read, Ryo Nakao wrote an excellent post on understanding how AES encryption works: https://nakabonne.dev/posts/understanding-how-aes-encryption-works/

If you have the bandwidth and prefer a video, Computerphile (YouTube channel) has a great explanation:

[Video: aes-explanation]

Running a Single Test

To run a single test (e.g: while debugging), use the following syntax:

mix test test/user/user_test.exs:9

For more detail, please see: https://hexdocs.pm/phoenix/testing.html

Ecto Validation Error format

When Ecto changeset validation fails, for example if there is a "unique" constraint on email address (so that people cannot register twice with the same email address), Ecto returns the changeset with an errors key:

#Ecto.Changeset<
  action: :insert,
  changes: %{
    email: <<224, 124, 228, 125, 105, 102, 38, 170, 15, 199, 228, 198, 245, 189,
      82, 193, 164, 14, 182, 8, 189, 19, 231, 49, 80, 223, 84, 143, 232, 92, 96,
      156, 100, 4, 7, 162, 26, 2, 121, 32, 187, 65, 254, 50, 253, 101, 202>>,
    email_hash: <<21, 173, 0, 16, 69, 67, 184, 120, 1, 57, 56, 254, 167, 254,
      154, 78, 221, 136, 159, 193, 162, 130, 220, 43, 126, 49, 176, 236, 140,
      131, 133, 130>>,
    key_id: 1,
    name: <<2, 215, 188, 71, 109, 131, 60, 147, 219, 168, 106, 157, 224, 120,
      49, 224, 225, 181, 245, 237, 23, 68, 102, 133, 85, 62, 22, 166, 105, 51,
      239, 198, 107, 247, 32>>,
    password_hash: <<132, 220, 9, 85, 60, 135, 183, 155, 214, 215, 156, 180,
      205, 103, 189, 137, 81, 201, 37, 214, 154, 204, 185, 253, 144, 74, 222,
      80, 158, 33, 173, 254>>
  },
  errors: [email_hash: {"has already been taken", []}],
  data: #Encryption.User<>,
  valid?: false
>

The errors part is:

[email_hash: {"has already been taken", []}]

A tuple wrapped in a keyword list.

Why this construct? A changeset can have multiple errors, so they're stored as a keyword list, where the key is the field, and the value is the error tuple. The first item in the tuple is the error message, and the second is another keyword list, with additional information that we would use when mapping over the errors in order to make them more user-friendly (though here, it's empty). See the Ecto docs for add_error/4 and traverse_errors/2 for more information.

So to access the error message "has already been taken" we need some pattern-matching and list popping:

{:error, changeset} = Repo.insert User.changeset(%User{}, @valid_attrs)
{:ok, message} = Keyword.fetch(changeset.errors, :email_hash)
msg = List.first(Tuple.to_list(message))
assert "has already been taken" == msg

To see this in action run:

mix test test/user/user_test.exs:40
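The error lookup can also be written by pattern-matching the {message, opts} tuple directly, which avoids converting it to a list. This is a small sketch that runs without Ecto: it uses a plain keyword list with the same shape as changeset.errors, and ErrorDemo is a hypothetical helper name.

```elixir
defmodule ErrorDemo do
  # Returns the error message for a field, or nil if the field has no error.
  def message(errors, field) do
    case Keyword.fetch(errors, field) do
      {:ok, {message, _opts}} -> message
      :error -> nil
    end
  end
end

# Same shape as the `errors` value in the changeset shown above.
errors = [email_hash: {"has already been taken", []}]

IO.inspect(ErrorDemo.message(errors, :email_hash), label: "email_hash error")
IO.inspect(ErrorDemo.message(errors, :name), label: "name error")
```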

Stuck / Need Help?

If you get "stuck", please open an issue on GitHub: https://github.com/nelsonic/phoenix-ecto-encryption-example/issues describing the issue you are facing with as much detail as you can.



Credits

Inspiration/credit/thanks for this example goes to Daniel Berkompas @danielberkompas for his post:
https://blog.danielberkompas.com/2015/07/03/encrypting-data-with-ecto

Daniel's post is for Phoenix v0.14.0 which is quite "old" now ...
therefore a few changes/updates are required.
e.g: There are no more "Models" in Phoenix 1.3 or Ecto callbacks.

Also, his post only includes the "sample code"; it is not a complete example
and does not explain the functions & Custom Ecto Types.
This means anyone following the post needs to manually copy-paste the code ... and "figure out" the "gaps" themselves to make it work.
We prefer to include the complete "end state" of any tutorial (not just "samples")
so that anyone can git clone and run the code locally to fully understand it.

Still, props to Daniel for his post, a good intro to the topic!