Introduction
Our clients occasionally ask us to look into why a particular email that spoofed the client was not blocked by a mail server. Generally these emails are intended to impersonate a user at the company in question, and naturally our clients want to ensure that such emails are rejected by the receiving mail transfer agent (MTA). In this blog post we will discuss the various technologies that are used to authenticate email, along with their strengths and weaknesses. This post is intended for readers who wish to better understand how to protect their domain from attackers who impersonate it in email.
Note that many of the technologies discussed in this article are specified in Requests for Comments (RFCs), which will be referenced throughout the blog post where applicable.
First, it’s important to understand the way an email is normally submitted to an MTA. Below is an example conversation between a sending mail client and Amazon’s Simple Email Service (SES) using the Simple Mail Transfer Protocol (SMTP). This communication can be sent in plaintext on port 25, over implicit TLS (historically called SSL) on port 465, or upgraded to TLS with STARTTLS on port 587. In certain cases cloud providers may block these common ports, and port 2525 is used instead. Lines whose content starts with a three-digit status code (such as 220 or 250) are sent by the server, while the remaining lines are submitted by the client.
1. PROXY TCP4 74.120.248.95 10.44.15.76 14659 465
2. 220 email-smtp.amazonaws.com ESMTP SimpleEmailService-793939519 qAfLvOj6BjiiCjSnD6S
3. EHLO email-smtp.us-east-1.amazonaws.com
4. 250-email-smtp.amazonaws.com
5. 250-8BITMIME
6. 250-SIZE 10485760
7. 250-AUTH PLAIN LOGIN
8. 250 Ok
9. AUTH LOGIN
10. 334 VXNlcm5hbWU6
11. bXlfZmFrZV91c2VybmFtZQ==
12. 334 UGFzc3dvcmQ6
13. bXlfZmFrZV9wYXNzd29yZA==
14. 235 Authentication successful.
15. MAIL FROM:<test@ses-example.com>
16. 250 Ok
17. RCPT TO:<success@simulator.amazonses.com>
18. 250 Ok
19. DATA
20. 354 End data with <CR><LF>.<CR><LF>
21. MIME-Version: 1.0
22. Subject: Test message
23. From: Senior Tester <test@ses-example.com>
24. Content-Type: text/html; charset="UTF-8"
25. Content-Transfer-Encoding: quoted-printable
26. To: success@simulator.amazonses.com
27. <b>Cool email body</b>
28. .
29. 250 Ok 0000012345678e09-123a4cdc-b56c-78dd-b90e-d123be456789-000000
30. QUIT
31. 221 Bye
Lines are numbered for reference. On line 3, the client uses the EHLO command to identify itself by domain. Lines 9-14 depict a username and password being submitted to the SMTP server for authentication. Lines 15-18 specify the envelope sender and envelope recipient of the email. Finally, lines 19-29 are used to submit the data portion of the email message.
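One detail worth seeing for yourself is that the AUTH LOGIN exchange on lines 10-13 is only Base64-encoded, not encrypted, which is why the transport encryption mentioned earlier matters. The following is a minimal sketch that decodes the tokens from the example conversation; the helper name `decode_auth_token` is our own and not part of any library.

```python
import base64


def decode_auth_token(token: str) -> str:
    """Decode one Base64 token from an SMTP AUTH LOGIN exchange."""
    return base64.b64decode(token).decode("utf-8")


# Tokens copied from lines 10-13 of the example conversation.
exchange = [
    ("server", "VXNlcm5hbWU6"),
    ("client", "bXlfZmFrZV91c2VybmFtZQ=="),
    ("server", "UGFzc3dvcmQ6"),
    ("client", "bXlfZmFrZV9wYXNzd29yZA=="),
]

for sender, token in exchange:
    # Anyone who can read the connection can do the same decoding.
    print(f"{sender}: {decode_auth_token(token)}")
```

Running this prints the server prompts and the placeholder credentials in readable form, with no key or secret required.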
There are a couple of important points to take away from this handshake:
- The EHLO domain does not need to be the same as the MAIL FROM: domain.
- The AUTH portion does not need to authenticate the same user that is specified in the MAIL FROM: field.
These points will drive our discussion when it comes to understanding how these technologies work together to protect your domain from attackers who attempt to impersonate it.
There are also two potentially confusing aspects of the handshake:
- The MAIL FROM: address can be different than the From: header.
- The RCPT TO: address can be different than the To: header.
The former values (MAIL FROM and RCPT TO) form the envelope and are used to determine delivery (RFC2821), while the latter are message headers that are only presented to the user (RFC2822).
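To make the envelope/header distinction concrete, here is a minimal sketch using Python's standard `email` package. The envelope sender `bounces@mailer.example.net`, the SMTP host, and the credentials in the commented-out portion are hypothetical placeholders, and the helper names are our own.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def build_message(header_from: str, header_to: str, subject: str, body: str) -> EmailMessage:
    """Build a message whose From/To headers are what the recipient will see."""
    msg = EmailMessage()
    msg["From"] = header_from
    msg["To"] = header_to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return parseaddr(address)[1].rpartition("@")[2].lower()


# Header addresses: displayed by the recipient's mail client.
msg = build_message(
    "Senior Tester <test@ses-example.com>",
    "success@simulator.amazonses.com",
    "Test message",
    "Cool email body",
)

# Envelope addresses: used by the MTA for delivery and bounces.
# Nothing in SMTP itself forces these to match the headers above.
envelope_from = "bounces@mailer.example.net"  # hypothetical
envelope_to = ["success@simulator.amazonses.com"]

print("header From domain:", domain_of(str(msg["From"])))
print("envelope MAIL FROM domain:", domain_of(envelope_from))

# With a reachable server (host and credentials are placeholders):
#   import smtplib
#   with smtplib.SMTP("smtp.example.net", 587) as smtp:
#       smtp.starttls()
#       smtp.login("user", "password")
#       smtp.send_message(msg, from_addr=envelope_from, to_addrs=envelope_to)
```

The `from_addr` and `to_addrs` arguments become MAIL FROM and RCPT TO, while the `From` and `To` headers travel inside the DATA portion untouched.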
SPF (RFC7208)
The Sender Policy Framework (SPF) is one of the first technologies to see wide deployment for email authorization. Before this protocol, there was no restriction on what a sending host could use as a MAIL FROM value in a message or as the domain value in a HELO command. SPF provides a listing of hosts that are allowed to use a particular domain name. A receiving host can use this information to verify such authorization.
SPF uses the Domain Name System (DNS), in which an administrator of a domain can publish a DNS record that specifies which hosts can send email from the domain. SPF records are generally TXT records that are configured under the protected domain. As an example, here is the SPF record for reddit.com:
$ nslookup -type=TXT reddit.com
v=spf1 include:amazonses.com include:_spf.google.com include:mailgun.org ip4:174.129.203.189 ip4:52.205.61.79 ip4:54.172.97.247 ~all
The SPF record is evaluated left to right. The first three mechanisms are include mechanisms, each of which triggers a further lookup. In this case, three TXT lookups will be performed for amazonses.com, _spf.google.com, and mailgun.org. Each of these lookups can have further lookups of its own. The three ip4 mechanisms are IPv4 addresses that are also authorized to send mail as reddit.com. Likewise, IPv6 addresses may be specified via an ip6 mechanism. Finally, the ~all mechanism indicates a soft-fail (mark the message as spam) if none of the previous mechanisms match. Note that there are additional mechanism types beyond ip4, ip6, and include. However, the ones used in this example are the most common.
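The left-to-right structure described above can be sketched as a small parser. This is a simplified illustration rather than a complete RFC 7208 implementation: the function name `parse_spf` is our own, modifiers such as `redirect=` are skipped, and CIDR suffixes on mechanisms like `a/24` are not handled.

```python
# Qualifier prefixes and the result each one produces when its mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record: str) -> list[tuple[str, str, str]]:
    """Split an SPF record into (result_if_matched, mechanism, value) tuples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        # Modifiers look like name=value (e.g. redirect=...); skipped in this sketch.
        if "=" in term.split(":", 1)[0]:
            continue
        qualifier = "+"  # a mechanism without a prefix defaults to pass
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        parsed.append((QUALIFIERS[qualifier], name.lower(), value))
    return parsed


record = (
    "v=spf1 include:amazonses.com include:_spf.google.com include:mailgun.org "
    "ip4:174.129.203.189 ip4:52.205.61.79 ip4:54.172.97.247 ~all"
)
for result, mechanism, value in parse_spf(record):
    print(f"{mechanism:8} {value:22} -> {result} if matched")
```

Each tuple keeps the order from the record, which matters because the first mechanism that matches decides the result.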
Expanding the list of authorized domains yields the following CIDR networks:
- amazonses.com: ip4:199.255.192.0/22 ip4:199.127.232.0/22 ip4:54.240.0.0/18 ip4:69.169.224.0/20 ip4:23.249.208.0/20 ip4:23.251.224.0/19 ip4:76.223.176.0/20
- _spf.google.com: include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com
- _netblocks.google.com: ip4:35.190.247.0/24 ip4:64.233.160.0/19 ip4:66.102.0.0/20 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:74.125.0.0/16 ip4:108.177.8.0/21 ip4:173.194.0.0/16 ip4:209.85.128.0/17 ip4:216.58.192.0/19 ip4:216.239.32.0/19
- _netblocks2.google.com: ip6:2001:4860:4000::/36 ip6:2404:6800:4000::/36 ip6:2607:f8b0:4000::/36 ip6:2800:3f0:4000::/36 ip6:2a00:1450:4000::/36 ip6:2c0f:fb50:4000::/36
- _netblocks3.google.com: ip4:172.217.0.0/19 ip4:172.217.32.0/20 ip4:172.217.128.0/19 ip4:172.217.160.0/20 ip4:172.217.192.0/19 ip4:172.253.56.0/21 ip4:172.253.112.0/20 ip4:108.177.96.0/19 ip4:35.191.0.0/16 ip4:130.211.0.0/22
- mailgun.org: include:spf1.mailgun.org include:spf2.mailgun.org
- spf1.mailgun.org: ip4:104.130.122.0/23 ip4:146.20.112.0/26 ip4:141.193.32.0/ ip4:161.38.192.0/20
- spf2.mailgun.org: ip4:209.61.151.0/24 ip4:166.78.68.0/22 ip4:198.61.254.0/23 ip4:192.237.158.0/23 ip4:23.253.182.0/23 ip4:104.130.96.0/28 ip4:146.20.113.0/24 ip4:146.20.191.0/24 ip4:159.135.224.0/ ip4:69.72.32.0/20
As you can see, by including several SPF records, the effect of recursion can easily result in a scenario where many servers exist that can send mail as your domain. When adding a host to your SPF record, it’s important to understand that if an include mechanism is used, you are delegating access to the provider, as the record that you point to can be updated at any time in the future without notice. Generally, this is done so that a service provider does not need to communicate with its customers whenever netblocks are rotated in and out of commission. Another important thing to note is that the SPF specification allows a maximum of 10 DNS-querying mechanisms and modifiers (such as include) per evaluation. If more than 10 such lookups are required to fully resolve an SPF record, the receiving mail agent returns a permerror result and does not evaluate the remaining portion of the SPF record. For this reason, the first limitation of SPF is that it tends not to scale well for customers that have many integrated email systems. If you find yourself in this scenario, separate subdomains or domains can be used. For example, domain.com can be used for internal corporate email, while email.domain.com can be delegated to transactional emails that notify customers.
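To get a feel for how quickly nested includes consume the lookup budget, here is a rough sketch that counts DNS-querying terms recursively. It works on an in-memory dictionary instead of live DNS; the records are abbreviated from the expansion shown earlier and may no longer match what those domains publish. The function name `count_lookups` is our own.

```python
# Mechanisms that require a DNS query when evaluated (per RFC 7208).
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}

# Abbreviated snapshot of the records discussed in this article (not live data).
RECORDS = {
    "reddit.com": "v=spf1 include:amazonses.com include:_spf.google.com "
                  "include:mailgun.org ip4:174.129.203.189 ~all",
    "amazonses.com": "v=spf1 ip4:199.255.192.0/22 ip4:199.127.232.0/22",
    "_spf.google.com": "v=spf1 include:_netblocks.google.com "
                       "include:_netblocks2.google.com include:_netblocks3.google.com",
    "_netblocks.google.com": "v=spf1 ip4:35.190.247.0/24",
    "_netblocks2.google.com": "v=spf1 ip6:2001:4860:4000::/36",
    "_netblocks3.google.com": "v=spf1 ip4:172.217.0.0/19",
    "mailgun.org": "v=spf1 include:spf1.mailgun.org include:spf2.mailgun.org",
    "spf1.mailgun.org": "v=spf1 ip4:104.130.122.0/23",
    "spf2.mailgun.org": "v=spf1 ip4:209.61.151.0/24",
}


def count_lookups(domain: str, records: dict[str, str], path: tuple[str, ...] = ()) -> int:
    """Count DNS-querying terms needed to fully evaluate a domain's SPF record."""
    if domain in path:
        raise ValueError(f"include loop detected at {domain}")
    path = path + (domain,)
    total = 0
    for term in records[domain].split()[1:]:
        term = term.lstrip("+-~?")
        if term.startswith("redirect="):
            total += 1 + count_lookups(term.split("=", 1)[1], records, path)
            continue
        name, _, value = term.partition(":")
        name = name.split("/")[0].lower()
        if name in LOOKUP_MECHANISMS:
            total += 1
            if name == "include":
                total += count_lookups(value, records, path)
    return total


used = count_lookups("reddit.com", RECORDS)
print(f"reddit.com uses {used} of the 10 permitted lookups")
```

In this snapshot, three top-level includes already account for most of the budget, which leaves little room for adding another mail provider.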
Second, let’s discuss one common misconception about what SPF actually protects. SPF does not validate the domain in the From header. Instead, SPF validates the originating server against the domain in the Return-Path value, which is set from the envelope MAIL FROM address. The Return-Path is the address that the receiving MTA uses to notify the sending mail server of delivery problems, such as bounces, and its domain is generally the same as the EHLO domain above. Therefore, an email can pass SPF regardless of whether the From address is fake. The problem with this limitation is that the From address is generally what recipients see in their email clients. Certain mail clients (such as Google Workspace’s web interface) do display the Return-Path and From values separately if they do not match, as seen in the screenshot below.
[Screenshot: Google Workspace web interface showing a message whose From and Return-Path domains do not match]
This screenshot illustrates the aforementioned point beautifully because the From domain of gerrit-saml-84469bfdc6-r2q5g does not match the Return-Path domain, which is mail.praetorian.com, and gerrit-saml-84469bfdc6-r2q5g is not even an FQDN that can be registered on the Internet! In this example, Google Workspace shows a “via” within the From field to let the user know of the mismatch. The Return-Path is also sometimes referred to as the mailed-by header.
Finally, we’ll discuss the various SPF results. The common ones are pass, neutral, soft-fail, and hard-fail. A pass means that the IP address of the sending host is found within the SPF record, as published in DNS, and the email can generally be seen as authorized. A neutral result means that the IP is not found, and no failure policy is published within DNS either. In the example above, ~all was used to designate that all IPs that are not within the list are to be treated as a soft-failure. If the ~all was omitted, the result for an IP that was not found would have been neutral. This means that the DNS record does not indicate a strong result one way or the other, and additional methods of email authorization/authentication should be applied to these messages. A soft-failure (~all) indicates to the receiving MTA that the email message should be accepted but marked as spam, while a hard failure (-all) indicates to the receiving MTA that the email should not be accepted at all, and to send a bounce notification to the originating MTA. Note that the standard specifies that if a message fails SPF, there is no guarantee that it will be rejected. That final decision about delivery is up to the receiving MTA. Let’s analyze why this may be the case.
Consider a scenario where Alice sends an email From alice@example.com To Bob at bob@alumni.example.edu (and for simplicity, let’s assume that the Return-Path domain is also example.com, as that is what SPF is protecting). Bob has since graduated, and the inbox at alumni.example.edu simply forwards all received messages to Bob’s personal email, which is bob@example.com. When an email gets forwarded, it is important to preserve all headers in the email so that Bob can see that Alice sent the message, and so alumni.example.edu will transmit an email From alice@example.com To bob@example.com. This email will technically fail SPF, as alumni.example.edu is not authorized to send emails as example.com. From the receiving server’s perspective, a forwarded email message is indistinguishable from a maliciously spoofed email. For this reason, when an email fails SPF in any manner (soft or hard), it is often still processed with other technologies (such as DKIM).
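The SPF results and the forwarding problem can be sketched together with Python's `ipaddress` module. This is a simplified illustration: only the ip4, ip6, and all mechanisms are evaluated (anything needing DNS is skipped), the function name `check_ip` is our own, and the record and IP addresses below are hypothetical values from documentation ranges.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_ip(record: str, client_ip: str) -> str:
    """Evaluate ip4/ip6/all mechanisms left to right for a connecting IP."""
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:
        qualifier = "+"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return RESULTS[qualifier]
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return RESULTS[qualifier]
        # include, a, mx, and similar mechanisms need DNS; skipped in this sketch.
    return "neutral"  # nothing matched and no "all" mechanism was present


# Hypothetical record for Alice's domain, example.com.
example_com_spf = "v=spf1 ip4:192.0.2.0/24 -all"

alice_server = "192.0.2.10"       # hypothetical: example.com's own outbound server
alumni_forwarder = "198.51.100.25"  # hypothetical: alumni.example.edu's forwarder

print("direct from Alice's server:", check_ip(example_com_spf, alice_server))
print("forwarded by alumni server:", check_ip(example_com_spf, alumni_forwarder))
```

The forwarded copy is evaluated against example.com's record because the Return-Path is preserved, yet it arrives from an IP that example.com never authorized.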
Summary of SPF
Overall, SPF has four major weaknesses:
- A limitation of the number of lookups allowed can cause scalability issues as a company integrates additional mailing services into their domain.
- The From address is not the field that is being authorized. Rather, the domain of the Return-Path is.
- SPF does not work with email forwarding.
- For the reason above, an email that fails SPF does not immediately get marked or rejected by the receiving MTA.
These pitfalls of SPF do not imply that you should avoid configuring SPF, as it is generally good practice to do so. However, standards such as DKIM and DMARC have been created to address these weaknesses.
DKIM (RFC6376)
DKIM stands for “DomainKeys Identified Mail” and it’s a way for mail servers to cryptographically verify the authenticity of email associated with a domain. It works by affixing a digital signature to each outgoing mail. This digital signature is linked to the domain, whose public key is published in DNS. As an example, observe the following DNS lookup.
$ dig TXT praetorian._domainkey.praetorian.com
; <<>> DiG 9.10.6 <<>> TXT praetorian._domainkey.praetorian.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16687
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;praetorian._domainkey.praetorian.com. IN TXT
;; ANSWER SECTION:
praetorian._domainkey.praetorian.com. 300 IN TXT "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCAwRH/UCMpGnFnu3ezYdZihgsec7Gu4qh1n/ek8gSy5N2V+k+1L6GElTL+ZBAbBtfxfRMlMG5N48NRoftP9sabvpo9WB8IvaVcICeUeBCXmU5HJFKjdHmStNz5RhgA5w3fKAI1D8kQ0LNkfftuUjkbdFcx0yf2g3hPe5GgZmoHZwIDAQAB"
;; Query time: 39 msec
;; SERVER: 2600:1700:156:8080::1#53(2600:1700:156:8080::1)
;; WHEN: Fri Apr 23 13:25:04 CDT 2021
;; MSG SIZE  rcvd: 312
We can see that a public RSA key is listed within DNS under the praetorian.com zone with a selector of praetorian. This selector is an arbitrary piece of text that is conveyed in every outbound email via the s= tag. When an email is received by an email server, the server will determine whether a selector is specified. If so, the server will perform a DNS lookup to determine which public key should have been used to sign the email. It can then validate that the email is legitimate if the signature verifies against that key. Effectively, this is asymmetric cryptography where the private key is maintained by the outgoing email service that is authorized to send email on behalf of the domain, and the public key is published in DNS, where any relying party can perform a lookup to retrieve it. Here is an example of what DKIM looks like in an email.
Authentication-Results: mx.google.com;
    dkim=pass header.i=@praetorian.com header.s=praetorian header.b=eZiwGOyb;
    spf=pass (google.com: domain of <username>@praetorian.com designates 209.85.220.41 as permitted sender) smtp.mailfrom=<username>@praetorian.com;
    dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=praetorian.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=praetorian.com; s=praetorian;
    h=from:in-reply-to:references:mime-version:date:message-id:subject:to:cc;
    bh=AJasz3WkVZWacTHeJh2a6opM1M39BToCgg480FlqSRQ=;
    b=eZiwGOybpodUpiaAvGIb4YOtebdoNwlyJGMi/+zRE33lT2MXLoC5DWl0S2e3EQ73cUqRcbjbFOt49VONxzFZHOkYaH5fIIGfPq/cPntirJBKGKpuh5Bp7sm9td47qsG4bHfaGGO64GRe+I/4YLUo6jEQg0UXjmnEHec7tDQZLUI=
The selector chosen in this example, s=praetorian, instructs the email server to perform a lookup of praetorian._domainkey prepended to the d=praetorian.com domain, i.e. praetorian._domainkey.praetorian.com, where the public key is retrieved from DNS and is used to validate the email.
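As a rough sketch of how a verifier derives that DNS name, the snippet below splits a DKIM-Signature value into its tags and joins the selector and domain. The helper names are our own, and the signature value is shortened from the example above for readability.

```python
def parse_tags(value: str) -> dict[str, str]:
    """Split a 'tag=value; tag=value' list into a dictionary."""
    tags = {}
    for part in value.split(";"):
        name, sep, tag_value = part.partition("=")
        if sep:
            # Drop folding whitespace that may appear inside long values.
            tags[name.strip()] = "".join(tag_value.split())
    return tags


def dkim_dns_name(tags: dict[str, str]) -> str:
    """Build the DNS name that holds the public key: <s>._domainkey.<d>."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Shortened copy of the DKIM-Signature header shown above.
SIGNATURE = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=praetorian.com; s=praetorian; "
    "h=from:in-reply-to:references:mime-version:date:message-id:subject:to:cc; "
    "bh=AJasz3WkVZWacTHeJh2a6opM1M39BToCgg480FlqSRQ=; b=eZiwGOyb"
)

tags = parse_tags(SIGNATURE)
print("query TXT record at:", dkim_dns_name(tags))
print("signed headers:", tags["h"].split(":"))
```

The same `parse_tags` helper also works on the TXT record itself, since the published key uses the same tag=value layout.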
The b= value is the digital signature itself, encoded in Base64. It is computed over a hash of the headers listed in the h= tag together with the DKIM-Signature header (which includes the bh= value).
The bh= value is the hash of the canonicalized message body, computed with the hash algorithm named in the a= tag and encoded in Base64.
Therefore, DKIM, like other signature schemes, signs a hash of the message rather than the message itself. If two messages have the same hash value, they have the same signature. In this example the rsa-sha256 algorithm (a=rsa-sha256) was used, which is currently considered secure. Because the message itself is signed and IP addresses are not used for authentication, a DKIM signature remains valid when a message is forwarded, as long as the signed headers and body are not modified in transit. This addresses one of the drawbacks of SPF discussed above.
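To illustrate the body hash, here is a minimal sketch of how a bh= value could be computed for c=relaxed body canonicalization with SHA-256, following our reading of RFC 6376 section 3.4.4. The function names are our own, and the sketch ignores the optional l= body length tag.

```python
import base64
import hashlib
import re


def relaxed_body(body: bytes) -> bytes:
    """Apply 'relaxed' body canonicalization as described in RFC 6376."""
    lines = body.replace(b"\r\n", b"\n").split(b"\n")
    # Collapse runs of spaces/tabs to one space, then drop trailing whitespace.
    lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    # Ignore empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    if not lines:
        return b""
    return b"\r\n".join(lines) + b"\r\n"


def body_hash(body: bytes) -> str:
    """Return the Base64 SHA-256 hash of the canonicalized body (the bh= value)."""
    digest = hashlib.sha256(relaxed_body(body)).digest()
    return base64.b64encode(digest).decode("ascii")


original = b"Cool  email body \r\n\r\n"
reflowed = b"Cool email body\r\n"
print(body_hash(original) == body_hash(reflowed))  # whitespace-only changes survive
print(body_hash(original) == body_hash(b"Cool email body, edited\r\n"))  # edits do not
```

This is why harmless whitespace changes made by intermediate servers do not break a relaxed signature, while any change to the visible content does.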
It’s also important to note that DKIM does not provide individual message-based authentication, but only at the domain level. For example, the email server can choose to sign arbitrary messages for any username and message under its domain. DKIM does not provide assurance that the email message came from the intended username, but only from the praetorian.com domain. Depending on the use case, additional technologies, such as S/MIME or PGP, can be used to provide user-level authentication and non-repudiation of email messages. However, compared to SPF, DKIM significantly improves the security of outbound messages and should always be used when possible.
DMARC (RFC7489)
DMARC, or Domain-based Message Authentication, Reporting & Conformance, is a policy engine that is built on top of SPF and DKIM. Specifically, it instructs mail servers on how to treat messages that do not pass SPF or DKIM for the domain shown in the From header. It also defines how to report failures of such messages back to the domain administrator so they can either investigate why valid emails are failing SPF/DKIM or be aware of parties that are attempting to impersonate the domain. DMARC, like SPF and DKIM, relies on publishing a policy within DNS. Here is an example of a DMARC record:
$ dig TXT _dmarc.praetorian.com
; <<>> DiG 9.10.6 <<>> TXT _dmarc.praetorian.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2804
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_dmarc.praetorian.com. IN TXT
;; ANSWER SECTION:
_dmarc.praetorian.com. 300 IN TXT "v=DMARC1; p=reject; pct=100; rua=mailto:a6f64d11a534363@rep.dmarcanalyzer.com; fo=1;"
;; Query time: 25 msec
;; SERVER: 2600:1700:156:8080::1#53(2600:1700:156:8080::1)
;; WHEN: Fri Apr 23 14:42:50 CDT 2021
;; MSG SIZE  rcvd: 147
DMARC records consist of key-value pairs and the following are the more common options:
- p – the policy to apply to messages. In this case, reject messages that fail the DMARC policy. Other values for this are: quarantine, which means deliver the message to the spam folder, or none, which means do nothing.
- pct – the percentage of messages that the policy should affect. In this example, 100% of messages that fail the DMARC policy should be rejected. When deploying an initial DMARC policy you can start at 0%, and slowly increase to 100%.
- rua – a reporting URI for aggregate reports. In this case, the mail server is instructed to send reports to the aforementioned address. Each report includes an XML payload, which contains a summary of the messages that were received by the email server and how they fared against the DMARC policy. In this example, the metadata of messages is being sent to a service that aggregates results and allows administrators to view trends of DMARC failures over time.
- fo – failure reporting options, which specify which failures generate a per-message failure report. 0 means a report is generated only if both SPF and DKIM fail, 1 means a report is generated if either SPF or DKIM fails, d means a report is generated if DKIM fails, and s means a report is generated if SPF fails. 1 is the most verbose, as a failure of either SPF or DKIM triggers a report. Per RFC7489, these failure reports are sent to the address in the ruf tag, and fo is ignored when no ruf tag is present.
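The key-value structure above is simple to parse. The following sketch turns a DMARC record into a dictionary and fills in the defaults that RFC7489 defines for tags that are absent; the function name `parse_dmarc` is our own, and the sketch does not validate tag values.

```python
# Defaults defined by RFC 7489 for tags that are not present in the record.
DMARC_DEFAULTS = {"pct": "100", "fo": "0", "adkim": "r", "aspf": "r", "ri": "86400"}


def parse_dmarc(record: str) -> dict[str, str]:
    """Parse a DMARC TXT record into a tag dictionary with defaults applied."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    if "p" not in tags:
        raise ValueError("DMARC record has no policy (p) tag")
    return {**DMARC_DEFAULTS, **tags}


# The record returned by the dig lookup above.
record = "v=DMARC1; p=reject; pct=100; rua=mailto:a6f64d11a534363@rep.dmarcanalyzer.com; fo=1;"
policy = parse_dmarc(record)
for tag in ("p", "pct", "rua", "fo", "adkim", "aspf", "ri"):
    print(f"{tag:6} = {policy[tag]}")
```

Tags such as adkim, aspf, and ri do not appear in the published record, so the printed values for them come from the defaults.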
There are also other tags which can be present, which are not as prevalent:
- sp – the subdomain policy. If this is not present, the policy from the parent domain trickles downward. This can be used to have different DMARC policies for subdomains that can be more or less strict than the parent domain.
- adkim/aspf – strict or relaxed alignment for DKIM or SPF. Alignment means that the authenticated domain must match the domain in the From header: for SPF this is the MAIL FROM (Return-Path) domain, and for DKIM it is the d= domain of the signature. The default is “relaxed,” which means that a subdomain of example.com (say, mail.example.com) would pass relaxed alignment for example.com. However, if “strict” was used, all of the domains would need to be mail.example.com, and not example.com, as an exact string comparison would be used. In either case, if the root domain doesn’t match, such as attacker.com, it would not pass in either relaxed or strict mode.
- ri – the reporting interval. The default value is 86400 seconds, or 1 day. It may be more frequent (a smaller number), but mail servers will only report on a best-effort basis, as per the RFC.
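The difference between relaxed and strict alignment can be sketched as follows. Note the loud assumption: real implementations determine the organizational domain with the Public Suffix List, whereas this sketch naively takes the last two labels, which is wrong for suffixes such as co.uk. The function names are our own.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels (assumption, no PSL)."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_aligned(authenticated_domain: str, from_domain: str, mode: str = "r") -> bool:
    """Check DMARC identifier alignment in relaxed ('r') or strict ('s') mode."""
    auth = authenticated_domain.lower().rstrip(".")
    header = from_domain.lower().rstrip(".")
    if mode == "s":
        return auth == header  # strict: exact match required
    return org_domain(auth) == org_domain(header)  # relaxed: same organizational domain


# authenticated domain (Return-Path or DKIM d=) versus the From header domain
for auth in ("mail.example.com", "example.com", "attacker.com"):
    print(
        f"{auth:18} relaxed={is_aligned(auth, 'example.com', 'r')} "
        f"strict={is_aligned(auth, 'example.com', 's')}"
    )
```

Only the exact domain passes in strict mode, a subdomain passes in relaxed mode, and an unrelated domain passes in neither.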
DMARC gives administrators fine-grained control over email security and how messages are to be treated when they fail the policy. Praetorian’s recommendations to deploy DMARC are as follows:
- First, ensure DKIM and SPF are configured before configuring DMARC. DMARC requires one or both of these technologies to already be deployed. Once DKIM/SPF are deployed, wait at least 24-48 hours before continuing.
- Start with a very relaxed DMARC policy. Consider setting p=none, and configure an email address to receive daily reports. This will let you see reports without risk of email messages being marked as spam or rejected. Wait approximately a week before continuing in order to have a good sample size of emails to analyze.
- Quarantine a small percentage of messages with p=quarantine; pct=5. Continue to observe DMARC reports for any messages that should not be quarantined. You will see messages that have an override reason of sampled_out, which means that they were exempt from quarantine based on the pct setting in the DMARC record.
- Slowly increase the pct to 100.
- Once all messages are subject to the quarantine policy, and you’ve observed that no false positives are being quarantined, apply the strictest policy of p=reject. This will instruct mail servers to bounce messages that are failing the DMARC policy. You will continue to receive DMARC reports and should monitor trends to ensure messages from legitimate sources continue to get delivered.
One important aspect of DMARC is that it not only gives the administrator insight into message delivery at large, it also provides feedback to senders who attempt to impersonate a domain. For example, suppose an attacker attempts to impersonate praetorian.com and there is a DMARC policy instructing inbound mail servers to reject messages that do not conform to the policy. The receiving mail server will send a bounce-back message to the sender, letting them know that the email was rejected because it attempted to impersonate a domain to which the sender does not have access. An example bounce-back message may look like the following:
Unauthenticated email from praetorian.com is not accepted due to domain's DMARC policy. Please contact the administrator of praetorian.com domain if this was a legitimate mail. Please visit https://support.google.com/mail/answer/2451690 to learn about the DMARC initiative.
Finally, all of these technologies rely on trusting DNS as an authoritative source of truth. It is impossible to fully validate SPF, DKIM, or DMARC if an attacker can modify DNS responses via an “in-the-middle” attack. Therefore, DNSSEC must be deployed on a domain in order to cryptographically sign the DNS zone.
Organizations should aim to have a restrictive email policy and establish separate domains or subdomains for various streams of email. For example, use domain.com for corporate emails, mail.domain.com for automated emails, marketing.domain.com for marketing campaigns, etc. This will allow mail reputation to be effectively managed with a robust DMARC policy and prevent cross-contamination of reputation between various entities at a large organization.
In conclusion, several technologies were developed, namely SPF, DKIM, and DMARC, along with DNSSEC. These technologies work in tandem to secure a domain’s email authorization and authentication to validate the legitimacy of email sent by the domain. Indicators from these technologies allow receiving mail agents to reject, mark as spam, or approve emails for delivery into an inbox.