How to construct SPIFFE IDs

Quintessence Anx

Today, we’re going to learn about what to consider when you construct your SPIFFE IDs. Making sure that your workloads have unique, consistent SPIFFE IDs will ensure that you are in alignment with your authorization needs and are future-proofed against breaking changes, even when you change or introduce new major infrastructure pieces.

A quick recap: what is SPIFFE?

From SPIFFE’s documentation: “SPIFFE, the Secure Production Identity Framework for Everyone, is a set of open-source standards for securely identifying software systems in dynamic and heterogeneous environments.” But what does that mean, practically speaking?

In essence, it means that workloads use signed, certified identifying documents rather than opaque tokens and passcodes to validate their identity. To do this, SPIFFE uses two core concepts: the SPIFFE ID and the SPIFFE Verifiable Identity Document (SVID). Essentially, the SPIFFE ID is the “identity” and the SVID is “proof of that identity”. Other areas both in and out of tech use the same concept. For example, on the web there is the domain and then the TLS certificate that proves who had control of that domain at the time of issuance. Outside of tech, people’s identities are verified by driver’s licenses, passports, and so forth. In this case, the SPIFFE ID is verified by the SVID, which is a signed document such as a JWT or X.509 certificate. (Today we’re focusing on the SPIFFE ID - SVIDs will be the subject of a later post.)

SPIFFE ID format

SPIFFE IDs must align with the following format (as seen in the SPIFFE documentation):

spiffe://trust-domain/workload-identifier

A trust domain acts as an identity namespace and is a required component of any SPIFFE ID. Trust domain names must be unique in order to tell them apart (and to avoid naming collisions - there will be more on this later in the post). One pattern that can help ensure unique trust domains is to anchor naming on a fully qualified domain name that you own (though this is not required by the specification). For instance, using prod.example.com instead of just prod as your trust domain mitigates the risk of collision with similarly named production systems.
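To make the format concrete, here is a minimal Python sketch that splits a SPIFFE ID into its trust domain and path and rejects obviously malformed input. The function name split_spiffe_id is hypothetical, and the allowed character sets reflect my reading of the SPIFFE ID specification (lowercase letters, digits, dots, dashes, and underscores in the trust domain; letters, digits, dots, dashes, and underscores in path segments), so treat them as assumptions and check the specification before relying on them.

```python
import re
from typing import Tuple

# Assumed character sets; verify against the SPIFFE ID specification.
_TRUST_DOMAIN_RE = re.compile(r"[a-z0-9._-]+")
_PATH_SEGMENT_RE = re.compile(r"[A-Za-z0-9._-]+")
_SCHEME = "spiffe://"


def split_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Return (trust_domain, path) for a SPIFFE ID, or raise ValueError."""
    if not spiffe_id.startswith(_SCHEME):
        raise ValueError("SPIFFE ID must start with spiffe://")
    rest = spiffe_id[len(_SCHEME):]
    trust_domain, slash, path = rest.partition("/")
    if not _TRUST_DOMAIN_RE.fullmatch(trust_domain):
        raise ValueError(f"invalid trust domain: {trust_domain!r}")
    if not slash:
        # No workload identifier: the ID names only the trust domain.
        return trust_domain, ""
    for segment in path.split("/"):
        # Empty segments (from "//" or a trailing "/") fail the regex check.
        if segment in (".", "..") or not _PATH_SEGMENT_RE.fullmatch(segment):
            raise ValueError(f"invalid path segment: {segment!r}")
    return trust_domain, "/" + path
```

For example, split_spiffe_id("spiffe://prod.example.com/us-central/service-name") returns the trust domain prod.example.com and the path /us-central/service-name, while an ID with an empty path segment or a trailing slash is rejected.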

Choosing the workload identifier path for your SPIFFE IDs is vitally important. Ideally, the information in your SPIFFE ID should be such that your authorization system can make a decision based on your SPIFFE ID without making additional calls for information. Practically, to prevent SPIFFE IDs from becoming overly complex and/or from containing information they should not, there may be times when the SPIFFE ID contains most but not all of what is needed.

The ability to use what you need as your trust domain and workload identifier introduces a lot of flexibility, but for the new user can also introduce some confusion around where to start.

What should go into your SPIFFE ID

The simplest rule to follow:

Choose the simplest naming scheme that accomplishes all business needs.

This means that you should err on the side of simplicity, not complexity. When you are building your SPIFFE ID, ask yourself about your deployment needs and boundaries:

  • What restricts which services will and will not be sending data to each other?
  • Are there compliance requirements, such as PCI, that are governing service communication and data access? If so, what is the impact?

Let’s look at a quick example:

spiffe://example.com/us-central/instance-id/service-name

Is this example “correct”? Taking a look at the configuration, we can see that this is a valid SPIFFE ID as it has both a trust domain and a workload identifier. But does it follow the rule of choosing the simplest naming scheme?

The answer is “it depends”. If you have a relatively dynamic infrastructure and you’re not authorizing traffic based on the machine it’s coming from, then this ID is likely overloaded. One way you could simplify the ID is to drop the instance-id portion. This leaves you with the region and service name to do authorization over, which is itself a common pattern, e.g.:

spiffe://example.com/us-central/service-name

On the other hand, if you have certain guarantees around what services run on what machines and you need to enforce that, then the previous example might be a better fit for your specific needs.

What shouldn’t go into the SPIFFE ID

A SPIFFE ID should be focused on the inherent attributes of the workload. Avoid including any details that are transient and possible to change during the lifetime of a workload instance. You should also avoid putting any sensitive information in your SPIFFE ID. For example, the following details are best left out:

  • IP addresses or ports
  • Name of the team that owns the service
  • Dates (e.g. date of last audit or any other sensitive dates)
  • Key-value data

Although it may be tempting, for safety you should not structure your SPIFFE ID as key-value pairs. SPIFFE IDs should always be compared as strings rather than parsed, and parsing is the natural thing to do when key:value structures are present. For example, ordering becomes a sharp edge when including key:value pairs: the same pairs in a different order look like the same workload to a parser but are different strings. Another concern is interoperability - many software systems work with SPIFFE off-the-shelf, but they may use a different naming scheme, which can break parser expectations. Instead, take a “primary keys only” approach, which leaves less room for error and more room for interoperability.
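The ordering problem can be illustrated with a short Python sketch. Everything here is hypothetical: the allow-list, the is_allowed and as_pairs helpers, and the two key-value style IDs, which encode pairs as alternating path segments in no fixed order. The point is only that string comparison and parsing can disagree about whether two IDs are "the same".

```python
from typing import Dict

# Hypothetical policy: the exact caller IDs allowed to reach a service.
ALLOWED_CALLERS = frozenset({
    "spiffe://example.com/us-central/service-name",
})


def is_allowed(caller_id: str) -> bool:
    """Authorize by exact string match, without parsing the path."""
    return caller_id in ALLOWED_CALLERS


def as_pairs(spiffe_id: str) -> Dict[str, str]:
    """Naively read alternating path segments as key/value pairs.

    This is the pattern to avoid; it is shown only for contrast.
    """
    path = spiffe_id.split("/", 3)[3]
    segments = path.split("/")
    return dict(zip(segments[0::2], segments[1::2]))


# The same pairs in a different order.
kv_a = "spiffe://example.com/region/us-central/service/service-name"
kv_b = "spiffe://example.com/service/service-name/region/us-central"

print(as_pairs(kv_a) == as_pairs(kv_b))  # parsed view: True
print(kv_a == kv_b)                      # string view: False
```

A system that compares strings treats kv_a and kv_b as two different workloads, while a system that parses them treats them as one. A fixed, positional scheme such as the one in the allow-list avoids that disagreement.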

How to construct unique SPIFFE IDs

As mentioned earlier, trust domain names need to be unique in order to tell them apart. As an example, you’d likely want to avoid names like “development” and “production” as these are common concepts across teams and organizations. Things like acquisitions and partnerships should also be considered, where it’s necessary to integrate with infrastructure from a different company. One common practice is to utilize a domain name you own along with subdomains as your trust domain, like project.example.org for a particular project.

Similarly, the path component representing the workload’s identity within the trust domain should also be unique in order to tell workloads apart. It helps to know what naming scheme you will use when you are modeling your SPIFFE ID pattern. Here at SPIRL, we use the following naming scheme (which is a variant of the Istio naming scheme):

/cluster-name/ns/namespace/sa/service-account

To clarify, let's distinguish between the segments that remain constant and those that rely on the workload. Of these segments, ns and sa are constant and the others will depend on the workload. Rewriting the above so that the dependent attributes have curly braces, the SPIFFE ID now looks like:

/{{cluster-name}}/ns/{{namespace}}/sa/{{service-account}}
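As a rough illustration of how such a pattern could be filled in, here is a Python sketch that substitutes workload attributes into the path pattern above. This is not SPIRL's template feature; render_spiffe_id, the attribute values, and the allowed-character check are all assumptions made for the example. Each value is validated as a single path segment so that it cannot introduce extra segments.

```python
import re
from typing import Dict

PATH_TEMPLATE = "/{{cluster-name}}/ns/{{namespace}}/sa/{{service-account}}"

_PLACEHOLDER_RE = re.compile(r"\{\{([a-z-]+)\}\}")
# Assumed set of characters allowed in one path segment.
_SEGMENT_RE = re.compile(r"[A-Za-z0-9._-]+")


def render_spiffe_id(trust_domain: str, attributes: Dict[str, str],
                     template: str = PATH_TEMPLATE) -> str:
    """Fill each {{name}} placeholder with the matching attribute value."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        value = attributes[name]  # KeyError if an attribute is missing
        if not _SEGMENT_RE.fullmatch(value):
            # Rejects empty values and values containing "/".
            raise ValueError(f"invalid value for {name}: {value!r}")
        return value

    return "spiffe://" + trust_domain + _PLACEHOLDER_RE.sub(substitute, template)


workload = {
    "cluster-name": "cluster-1",
    "namespace": "payments",
    "service-account": "api",
}
print(render_spiffe_id("example.com", workload))
# prints spiffe://example.com/cluster-1/ns/payments/sa/api
```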

The dependent attributes in this ID can be thought of as “dynamic” and the others as “static”. With this in mind, when you are checking if your naming scheme has overlaps you should check the dynamic and static segments:

  • Static segments should only overlap if they are the same thing. In some of the above examples we used ns before the {{namespace}} tag to label it in the path. To ensure there is no collision, ns should only be used for this purpose and there should be no other instances of ns at that segment of the path.
  • Dynamic segments should not create an overlap condition.

As a couple of quick examples, if you have two static paths they should not overlap:

/foo
/bar

Dynamic segments should overlap neither with each other nor with static segments:

/foo
/{{var}}

In this latter case, there is only a risk of overlap if var can take on the value foo.
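The overlap rules above can be sketched as a small Python check. The may_overlap helper is hypothetical and deliberately conservative: it assumes a dynamic segment can take any value, so it reports a possible overlap whenever a dynamic segment lines up with anything at the same position. If you know that var can never be foo, you would need to encode that constraint yourself.

```python
from typing import List


def _segments(template: str) -> List[str]:
    """Split a path pattern such as /a/{{b}} into its segments."""
    return template.strip("/").split("/")


def _is_dynamic(segment: str) -> bool:
    """A segment written as {{name}} is treated as dynamic."""
    return segment.startswith("{{") and segment.endswith("}}")


def may_overlap(pattern_a: str, pattern_b: str) -> bool:
    """Return True if some concrete path could match both patterns."""
    seg_a, seg_b = _segments(pattern_a), _segments(pattern_b)
    if len(seg_a) != len(seg_b):
        return False  # paths of different depth never collide
    for a, b in zip(seg_a, seg_b):
        if _is_dynamic(a) or _is_dynamic(b):
            continue  # assume a dynamic segment can take any value
        if a != b:
            return False  # two different static labels at this position
    return True


print(may_overlap("/foo", "/bar"))      # False
print(may_overlap("/foo", "/{{var}}"))  # True
```

Running a check like this over every pair of patterns in a naming scheme gives a list of pairs that deserve a closer look.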

Although in this section we’re discussing these path patterns as a naming scheme, if you are using SPIRL you have the option to use dynamic SPIFFE IDs that will actually be generated by a template. If this interests you, please reach out to us via our contact form!

Concrete Example: Square

Choosing the structure of your SPIFFE ID is very much a “choose your own adventure”. For a concrete example, let’s take a look at Square. As a large organization, Square has a complex architecture. Some services and/or regions use Kubernetes - but not all. They also need to be able to separate services by business units (Payments, Cash App, Caviar, etc.) and environments. In their talk, they highlight two specific SPIFFE IDs - one for their “Service” service, which is basically a registry of all their services in that environment, and one for a foo component within that service. The resulting IDs are:

spiffe://production.squareup.com/service
spiffe://production.squareup.com/service/foo

This allows you to address the service as a whole using the prefix, while still enabling more granular control where needed. In other words, Square chose a pattern similar to what you’d see with directory trees - the “parent” level is an identifier and the “sub-level” is another identifier. This is not part of the SPIFFE specification (it’s not against it either); it is something they chose to do for convenience. As far as SPIFFE is concerned, SPIFFE IDs are all just strings. In contrast, here is Istio’s naming pattern for service account bar in namespace foo:

spiffe://example.com/ns/foo/sa/bar

In this naming convention, neither ns nor sa is its own identifier. They function as human-friendly labels for what comes next - foo and bar.
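To show what addressing a service "by prefix" in a directory-tree scheme like Square's might look like, here is a minimal Python sketch. It is not Square's implementation; is_within is a hypothetical helper, and the servicefoo ID is made up to show why the match should stop at a segment boundary.

```python
def is_within(spiffe_id: str, parent_id: str) -> bool:
    """Return True if spiffe_id is parent_id or sits beneath it.

    Appending "/" before the prefix test keeps the match on a segment
    boundary, so a sibling whose name merely starts with the parent's
    name is not treated as a child.
    """
    return spiffe_id == parent_id or spiffe_id.startswith(parent_id + "/")


parent = "spiffe://production.squareup.com/service"

print(is_within("spiffe://production.squareup.com/service", parent))      # True
print(is_within("spiffe://production.squareup.com/service/foo", parent))  # True
print(is_within("spiffe://production.squareup.com/servicefoo", parent))   # False
```

This stays a plain string operation, in keeping with the advice to compare SPIFFE IDs as strings rather than parse them.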

If you would like to watch the talk, Square's SPIFFE Community Day 2019 talk is located here. The talk and Q&A are approximately 30 minutes in total.

What to remember and next steps

SPIFFE's framework is flexible enough to allow for varied ID patterns. As we saw with Istio and Square, there are different approaches to ensuring a valid, unique SPIFFE ID pattern that aligns with authorization policies and business needs. With light planning, you can define a naming scheme that won’t need to be changed in years to come. If you would like to learn more, please feel free to reach out to us here at SPIRL via our contact form or to join the SPIFFE Community Slack for the OSS project.