API Keys, Roles and Policies

Introducing API Keys and how they differ from Access Keys

API Keys are a new resource type for Identity and Access Management, replacing Access Keys. They provide finer-grained restrictions and a more expressive API.

API Keys introduce the concept of a Role, and decouple the key itself (used for authentication) from the policy (used for authorization). A policy is attached to a role, and a role can be referenced by multiple keys. Furthermore, API Key policies offer more powerful authorization mechanisms and a wider spectrum of parameters.

Currently, both /api-key and /access-key endpoints are available as each implementation is separate from the other. Eventually, Access Keys will be migrated to an equivalent API key, at which point the old endpoints will be deprecated. Access Keys are now considered legacy and we encourage users to use API Keys and Roles.

IAM Authorization flow

The introduction of API keys decouples the authorization and authentication mechanisms. Therefore, we can split authorization through several levels or “gateways” for a single request.

Authorization Layers

The Org policy applies to any user under an organization.
The Role policy is created and managed by individual users within an organization.

A request must be authorized at both levels to be processed by the platform.
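
The two-level check can be pictured as a simple conjunction. The following is a minimal Python sketch, not the platform's implementation; the function name and its boolean inputs are hypothetical.

```python
def is_authorized(org_policy_allows: bool, role_policy_allows: bool) -> bool:
    """A request proceeds only when both layers allow it."""
    return org_policy_allows and role_policy_allows


# A permissive role policy cannot widen what the Org policy denies.
print(is_authorized(org_policy_allows=False, role_policy_allows=True))  # False
print(is_authorized(org_policy_allows=True, role_policy_allows=True))   # True
```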

IAM Roles

A Role is a way to store a user’s policy, which can be re-used across several API keys.

{
  "name": "my-new-role",
  "policy": {
    "default-service-strategy": "deny",
    "services": {
      "iam": {
        "type": "allow"
      }
    }
  }
}

In the example above, the role called "my-new-role" states that all operations are denied except for iam services, which are explicitly allowed.

After a role is created, an api-key can reference that role’s ID. Then, that api-key will have the permissions stated in that policy.

A role can be created with the create-iam-role operation (POST /iam-role). We can modify the role’s policy by calling update-iam-role-policy (PUT /iam-role/<id>:policy). Finally, to assume the role in a request, we can distribute one or more API keys created by calling create-api-key (POST /api-key).

Note

Policy updates may not take effect immediately due to some background synchronization. Generally, you should design your usage patterns with re-use of Roles in mind, and if you require frequent rotation, favor rotating API Keys rather than Roles.

API Key Policies

Incoming requests to the Exoscale API go through several layers of authorization, the last of which is defined by a policy attached to the user role. Given the right permissions, a user can create their own policies to enforce fine-grained authorization parameters. The rest of this document describes what possibilities are available when creating policies, as well as some best practices and examples.

A policy is composed of:

The Services map - a separation of the authorization logic according to which platform service the request belongs to.
The Default Service Strategy - a default top-level allow or deny decision that applies when there is no entry in the services map for the service of the incoming API operation. For example: list-zones is an operation that belongs to the compute service - if compute is not present in the services map, the default service strategy will be applied.

{
  "default-service-strategy": "allow",
  "services": {
    "compute": {...},
    "dbaas": {...}
  }
}

Requests are evaluated in the context of the platform service they belong to (Compute, SOS, DBaaS, etc.). A user can define one of the following authorization bodies for each service:

Nothing - in the absence of a specification, use the default service strategy
Allow overrides the default service strategy - allow all requests.
Deny overrides the default service strategy - deny all requests.
Rule-based - a more fine-grained approach to authorization
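
How the four cases resolve can be sketched in a few lines of Python. This is only an illustration of the description above, under the assumption that a policy is available as a plain dictionary shaped like the JSON examples in this document; the function name service_decision is hypothetical.

```python
from typing import Any, Dict, Optional


def service_decision(policy: Dict[str, Any], service: str) -> Optional[str]:
    """Return 'allow' or 'deny', or None when rules have to be evaluated."""
    entry = policy.get("services", {}).get(service)
    if entry is None:
        # Nothing specified for this service: use the default service strategy.
        return policy["default-service-strategy"]
    if entry["type"] in ("allow", "deny"):
        # Allow / Deny override the default service strategy.
        return entry["type"]
    # type == "rules": the decision comes from rule evaluation instead.
    return None


policy = {
    "default-service-strategy": "deny",
    "services": {
        "iam": {"type": "allow"},
        "compute": {"type": "rules", "rules": []},
    },
}

print(service_decision(policy, "iam"))      # allow
print(service_decision(policy, "dbaas"))    # deny
print(service_decision(policy, "compute"))  # None
```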

Rule based authorization 101

{
  "default-service-strategy": "deny",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "action": "deny",
          "expression": "resources.sks_nodepool.name in ['important-nodepool', 'foobar']"
        },
        {
          "action": "allow",
          "expression": "true"
        }
      ]
    }
  }
}

A rule-based service policy consists of a list of one or more rules:

Rules are composed of an Expression - against which the request is evaluated, and an Action - the resulting decision should the evaluated expression return TRUE.
Expressions are written in the Common Expression Language (CEL)
Rules are evaluated in order
Operator precedence is defined in the CEL language specification
If the expression of a rule matches (evaluates to true), its action becomes the authorization output - allow or deny.
If the expression of a rule does not match (evaluates to false or fails), nothing is concluded and we evaluate the next rule
If no rule matches, the request is not authorized to proceed regardless of the default service strategy.

The authorization flow processes each rule in order. The first matching rule short-circuits the authorization process, resulting in its action (either allow or deny) - equivalent to an OR condition between rules. If all rules are exhausted, the request is rejected, regardless of the default service strategy.
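
The first-match behavior described above can be sketched in Python. This is an illustrative model rather than the actual evaluator: Python callables stand in for CEL expressions, a failed lookup is treated as "the rule does not match", and the names evaluate_rules and Rule are hypothetical. The two rules mirror the nodepool policy shown earlier.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Rule = Tuple[str, Callable[[Request], bool]]


def evaluate_rules(rules: List[Rule], request: Request) -> str:
    """Return the action of the first rule whose expression is true."""
    for action, predicate in rules:
        try:
            matched = predicate(request)
        except (KeyError, TypeError):
            matched = False  # a failed lookup concludes nothing
        if matched:
            return action  # short-circuit on the first match
    return "deny"  # rules exhausted: rejected regardless of the default


rules: List[Rule] = [
    ("deny", lambda r: r["resources"]["sks_nodepool"]["name"]
        in ["important-nodepool", "foobar"]),
    ("allow", lambda r: True),
]

protected = {"resources": {"sks_nodepool": {"name": "important-nodepool"}}}
other = {"resources": {"sks_nodepool": {"name": "scratch"}}}

print(evaluate_rules(rules, protected))  # deny
print(evaluate_rules(rules, other))      # allow
print(evaluate_rules([], other))         # deny
```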

CEL bindings

A binding refers to the coupling of a variable in a CEL expression and its corresponding value at the time the expression is evaluated.

The top-level bindings are:

service - the service class in scope (ex: “sos”, “instance-pool”, etc)
zone - the zone where the call is attempted (ex: “ch-gva-2”)
now - a CEL timestamp (string, can be coerced to CEL timestamp in expressions via timestamp(now))
source_ip - the ip of the caller
api_key - the exoscale api key of the caller EXO1234...
operation - the operation performed (ex: “scale-instance-pool”)
identity - the caller’s identity. A map containing the following keys:
- identity.key - the caller’s exoscale API key EXO1234...
- identity.created - the CEL timestamp of the key’s creation
- identity.description - the key’s description (name)
- identity.org.uuid - the caller’s organization UUID
- identity.org.name - the caller’s organization name
parameters - parameters passed to the command (ex: parameters.size -> 3)
resources - a map of resource type -> resource (varies depending on the call, see IAM Reference: Operations and Resources)

The resources and parameters bindings are special: they don’t resolve to mere literals and can contain nested bindings. The key difference between parameters and resources is that the latter might contain metadata that is not in the request payload.

For example:

parameters.foo == 'bar' - restricts the call if the input foo is equal to bar.
parameters.foo.bar == 'baz' or parameters.foo == {'bar':'baz'} - restricts the call if the input object foo has a key bar equal to baz.
parameters.foo.exists(k, k.bar == 'baz') - restricts the call if the input collection foo has an object with a key bar equal to baz.
resources.elastic_ip.ip == '10.10.10.10' - restricts the call on an elastic_ip resource that has the specified IP address.
resources.instance_pool.id == "d4c1673a-a342-4a0f-b8e7-b2da4091ddfd" - restricts the call on an instance_pool that has the specified ID.

The IAM Reference: Operations and Resources contains the necessary information to map any API endpoint to its corresponding operation, parameters, and resources bindings.
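
To make the parameters examples more concrete, here is a rough Python analogue of three of the expressions above. The data is hypothetical and plain Python stands in for CEL, so this only illustrates the intended meaning and is not a statement about the evaluator itself.

```python
# Hypothetical request parameters, written as plain Python data.
object_params = {"foo": {"bar": "baz"}}
list_params = {"foo": [{"bar": "qux"}, {"bar": "baz"}]}

# parameters.foo.bar == 'baz'
field_match = object_params["foo"]["bar"] == "baz"

# parameters.foo == {'bar': 'baz'} compares the whole map, so it is
# expected to hold only when foo has no other keys.
whole_map_match = object_params["foo"] == {"bar": "baz"}
extra_key_match = {"bar": "baz", "other": 1} == {"bar": "baz"}

# parameters.foo.exists(k, k.bar == 'baz')
exists_match = any(item["bar"] == "baz" for item in list_params["foo"])

print(field_match, whole_map_match, extra_key_match, exists_match)
# True True False True
```

The field comparison is usually the more forgiving choice, since it keeps matching when the object carries additional keys.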

CEL Best Practices

Split expressions into rules

When writing a rule with more than one condition, it becomes more manageable to write several smaller rules.

This can be achieved as long as the conditions are separated by OR operators.

[
  {
    "action": "allow",
    "expression": "operation == 'list-buckets' || resources.bucket.startsWith('public-')"
  }
]

is equivalent to

[
  {
    "action": "allow",
    "expression": "operation == 'list-buckets'"
  },
  {
    "action": "allow",
    "expression": "resources.bucket.startsWith('public-')"
  }
]
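
The equivalence can be checked with a small Python sketch in which callables stand in for the CEL expressions. The helper first_match and the request shape (a flat dictionary with operation and bucket keys) are simplifications made for this illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Rule = Tuple[str, Callable[[Request], bool]]


def first_match(rules: List[Rule], request: Request) -> str:
    for action, predicate in rules:
        if predicate(request):
            return action
    return "deny"  # no rule matched


combined: List[Rule] = [
    ("allow", lambda r: r["operation"] == "list-buckets"
        or r["bucket"].startswith("public-")),
]
split: List[Rule] = [
    ("allow", lambda r: r["operation"] == "list-buckets"),
    ("allow", lambda r: r["bucket"].startswith("public-")),
]

requests = [
    {"operation": "list-buckets", "bucket": ""},
    {"operation": "get-object", "bucket": "public-assets"},
    {"operation": "get-object", "bucket": "private"},
]

decisions = [(first_match(combined, r), first_match(split, r)) for r in requests]
print(decisions)
# [('allow', 'allow'), ('allow', 'allow'), ('deny', 'deny')]
```

The split only works for OR: two conditions joined by AND must stay in the same rule, because the first matching rule decides on its own.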

Write a catch-all rule

Sometimes, we want our service to have a different default behavior than the rest, while still abiding by some rules.

Because rules are evaluated in order, we can achieve this by writing a catch-all rule at the end of the list.

This is also useful to prevent unexpected authorization failures when no rule matches the request. It is particularly useful when writing organization policies where you may wish to prevent certain calls across the organization (such as deleting an instance with a particular label, or modifying a particular security group):

{
  "default-service-strategy": "allow",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "action": "deny",
          "expression": "'foo' in resources.sks_cluster.addons"
        },
        {
          "action": "allow",
          "expression": "true"
        }
      ]
    }
  }
}

Don’t mix resources in a single rule

Every request may load a specific number of resources in order to have enough information to evaluate the rule properly. This means that not every rule will be able to evaluate every request.

For example: the expression "resources.security_group.name == 'dev-sg'" assumes that a security group resource has been loaded into the context.

Suppose the incoming request is GET /instance-pool/<id> (retrieving an instance pool). Both instance pools and security groups belong to the compute service, thus the aforementioned CEL expression will be evaluated. However, only the instance pool resource is loaded for get-instance-pool, so a CEL expression that refers to both resources.instance_pool.id and resources.security_group.id cannot result in a match for the get-instance-pool endpoint.

Even if the expression also includes an instance pool condition, as in "resources.security_group.name == 'dev-sg' && size(resources.instance_pool.instances) > 2", it will be invalid too.

Therefore we strongly recommend writing separate rules based on the resource types, and matching on the operation or a list of operations when checking resource rules.
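
The following Python sketch illustrates why a rule that mixes resource types never matches, and how scoping the rule to an operation avoids the problem. It is a simplified model: dictionary lookups stand in for CEL bindings, a missing key stands in for a resource that was not loaded, and the request contents are hypothetical.

```python
from typing import Any, Callable, Dict

Request = Dict[str, Any]


def matches(predicate: Callable[[Request], bool], request: Request) -> bool:
    """A lookup on a resource that was not loaded counts as no match."""
    try:
        return bool(predicate(request))
    except KeyError:
        return False


# get-instance-pool loads only the instance pool resource.
request: Request = {
    "operation": "get-instance-pool",
    "resources": {"instance_pool": {"id": "pool-1", "instances": ["a", "b", "c"]}},
}

# Mixes two resource types: the security group lookup fails first.
mixed = lambda r: (
    r["resources"]["security_group"]["name"] == "dev-sg"
    and len(r["resources"]["instance_pool"]["instances"]) > 2
)

# Scoped to one operation and the single resource type it loads.
scoped = lambda r: (
    r["operation"] == "get-instance-pool"
    and len(r["resources"]["instance_pool"]["instances"]) > 2
)

print(matches(mixed, request))   # False
print(matches(scoped, request))  # True
```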

CEL extensions

We currently have the following CEL functions added as extensions:

inIpRange(<string IP>, <string Range>): returns true if IP is within Range.

source_ip.inIpRange('127.0.0.0/24')

inIpRange(source_ip, '127.0.0.0/24')

resources.instance.ipv6_address.inIpRange('2001:0db8:85a3:0000:0000:0000:0000:0000/64')
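
For intuition, the membership test that inIpRange performs can be approximated with Python's standard ipaddress module. This is only a sketch of the semantics described above, not the extension's implementation; the function name in_ip_range and the lenient handling of host bits (strict=False) are assumptions.

```python
import ipaddress


def in_ip_range(ip: str, cidr: str) -> bool:
    """Return True if ip falls within the CIDR range cidr."""
    address = ipaddress.ip_address(ip)
    network = ipaddress.ip_network(cidr, strict=False)
    # An IPv4 address is never inside an IPv6 range, and vice versa.
    return address.version == network.version and address in network


print(in_ip_range("127.0.0.15", "127.0.0.0/24"))  # True
print(in_ip_range("127.0.1.1", "127.0.0.0/24"))   # False
print(in_ip_range(
    "2001:db8:85a3::1",
    "2001:0db8:85a3:0000:0000:0000:0000:0000/64",
))  # True
```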

Compute

Allow only requests that load an instance resource - eg. the resize-instance-disk operation - whose labels include "dev".

{
  "default-service-strategy": "allow",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "action": "allow",
          "expression": "!has(resources.instance)"
        },
        {
          "action": "allow",
          "expression": "'dev' in resources.instance.labels"
        }
      ]
    }
  }
}

The first rule !has(resources.instance) is needed because the second rule will fail for any request that doesn’t involve an instance, thus denying authorization.

Warning

A lookup on a non-existing binding will trigger an error within the CEL evaluator and short-circuit to a failed expression. The use of has(some.binding.foo) is very useful to avoid this situation.
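
The effect of the guard can be reproduced in a small Python model of the policy above. Dictionary lookups stand in for CEL bindings and a KeyError stands in for the evaluator error; the request contents and label values are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Rule = Tuple[str, Callable[[Request], bool]]


def evaluate(rules: List[Rule], request: Request) -> str:
    for action, predicate in rules:
        try:
            if predicate(request):
                return action
        except KeyError:
            continue  # missing binding: the rule fails, try the next one
    return "deny"


unguarded: List[Rule] = [
    ("allow", lambda r: "dev" in r["resources"]["instance"]["labels"]),
]
guarded: List[Rule] = [
    # Stand-in for !has(resources.instance)
    ("allow", lambda r: "instance" not in r["resources"]),
] + unguarded

list_zones = {"operation": "list-zones", "resources": {}}
dev_resize = {
    "operation": "resize-instance-disk",
    "resources": {"instance": {"labels": {"dev": "true"}}},
}
prod_resize = {
    "operation": "resize-instance-disk",
    "resources": {"instance": {"labels": {"env": "prod"}}},
}

print(evaluate(unguarded, list_zones))  # deny
print(evaluate(guarded, list_zones))    # allow
print(evaluate(guarded, dev_resize))    # allow
print(evaluate(guarded, prod_resize))   # deny
```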

Allow read only operations:

operation.startsWith('get-') || operation.startsWith('list-')

Allow an instance pool to be scaled to a size of 2, 3, or 4.

operation == 'scale-instance-pool' && int(parameters.size) >= 2 && int(parameters.size) <= 4

Prevent the deletion of the load balancer with the name my-nlb:

operation == 'delete-load-balancer' && resources.load_balancer.name == 'my-nlb'

Prevent the deletion of nodepools on the SKS cluster named my-sks-cluster:

operation == 'delete-sks-nodepool' && resources.sks_cluster.name == 'my-sks-cluster'

Only allow calls to create instance where public_ip_assignment has been specified and where the public_ip_assignment is set to none. CEL expressions are not fully supported via compute-legacy so we deny all calls to prevent instances being created via the legacy Compute v1 API.

{
  "default-service-strategy": "allow",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "action": "deny",
          "expression": "operation == 'create-instance' && (!has(parameters.public_ip_assignment) || parameters.public_ip_assignment != 'none')"
        },
        {
          "action": "allow",
          "expression": "true"
        }
      ]
    },
    "compute-legacy": {
      "type": "deny"
    }
  }
}

Prevent the creation and modification of resources in the zone ch-dk-2. This could also be written with the simpler rule zone == 'ch-dk-2', but that would cause visible errors in some tooling, such as the UI, which shows resources across all zones. Allowing read-only calls can therefore be useful:

{
  "default-service-strategy": "allow",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "expression": "zone == 'ch-dk-2' && !(operation.startsWith('get-') || operation.startsWith('list-'))",
          "action": "deny"
        },
        {
          "expression": "true",
          "action": "allow"
        }
      ]
    }
  }
}

Allow/deny calls from a given source IP:

source_ip == '188.61.116.99'

Allow/deny calls from a list of source IPs:

source_ip in ['188.61.126.88', '188.61.116.99']

DNS

Allow the retrieval of domains and records:

operation in ['list-dns-domains', 'get-dns-domain', 'list-dns-domain-records', 'get-dns-domain-record']

Allow the creation of domains that end with .ch:

operation == 'create-dns-domain' && parameters.unicode_name.endsWith('.ch')

Allow the creation of TXT records only (on any domain):

operation == 'create-dns-domain-record' && parameters.type == 'TXT'

Allow TXT records to be updated and deleted:

operation in ['update-dns-domain-record', 'delete-dns-domain-record'] && resources.dns_domain_record.type == 'TXT'

Allow the creation of A records on the domain my-test-domain.ch:

operation == 'create-dns-domain-record' && resources.dns_domain.unicode_name == 'my-test-domain.ch' && parameters.type == 'A'

Only allow TTLs in the range of 600 (10 minutes) to 7200 (2 hours):

operation in ['create-dns-domain-record', 'update-dns-domain-record'] && int(parameters.ttl) >= 600 && int(parameters.ttl) <= 7200
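
A range restriction can be phrased either as an allow condition (the value is inside the range) or as a deny condition (the value is outside the range). The two are complements of each other: the inside check joins its bounds with AND, while the outside check joins them with OR. The Python sketch below illustrates this with hypothetical helper names.

```python
def ttl_in_range(ttl: int) -> bool:
    """Condition for an allow rule: 600 <= ttl <= 7200."""
    return ttl >= 600 and ttl <= 7200


def ttl_out_of_range(ttl: int) -> bool:
    """Condition for a deny rule: below 600 or above 7200."""
    return ttl < 600 or ttl > 7200


# For every value exactly one of the two conditions holds.
samples = [0, 599, 600, 3600, 7200, 7201]
print([ttl_in_range(t) for t in samples])
# [False, False, True, True, True, False]
print(all(ttl_in_range(t) != ttl_out_of_range(t) for t in samples))  # True
```

Joining the two out-of-range comparisons with AND would yield a condition that can never be true, since no value is both below 600 and above 7200.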

DBaaS

Prevent the creation of PostgreSQL services that do not specify a single IP filter of 10.20.0.0/16 (note: the order of deny rules in a policy is important, as the first rule that matches will either allow or deny the request):

operation == 'create-dbaas-service-pg' && !parameters.ip_filter.exists_one(x, x == '10.20.0.0/16')
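
The exists_one macro is true when exactly one element of the list satisfies the condition. The Python sketch below models that reading in order to show which ip_filter values the deny expression above would catch; the helper names and sample filters are illustrative.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def exists_one(items: Iterable[T], predicate: Callable[[T], bool]) -> bool:
    """True when exactly one element satisfies the predicate."""
    return sum(1 for item in items if predicate(item)) == 1


def deny_matches(ip_filter: List[str]) -> bool:
    # Stand-in for: !parameters.ip_filter.exists_one(x, x == '10.20.0.0/16')
    return not exists_one(ip_filter, lambda x: x == "10.20.0.0/16")


print(deny_matches(["10.20.0.0/16"]))               # False: request proceeds
print(deny_matches(["0.0.0.0/0"]))                  # True: request is denied
print(deny_matches(["10.20.0.0/16", "0.0.0.0/0"]))  # False: extra entry passes
```

Under this reading, a filter list that contains 10.20.0.0/16 alongside other ranges is not caught by the deny expression. If the intent is to allow that range only, an additional condition on the size of the list would be needed.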

Allow the creation of PostgreSQL services with the hobbyist-2 or startup-4 plans:

operation == 'create-dbaas-service-pg' && parameters.plan in ['hobbyist-2', 'startup-4']

Prevent the creation of dbaas resources in the zone CH-DK-2:

operation.startsWith('create-dbaas-') && zone == 'ch-dk-2'

Prevent the deletion of dbaas services in the zone CH-GVA-2:

operation.startsWith('delete-dbaas-service-') && zone == 'ch-gva-2'

Allow the PostgreSQL service my-service to be retrieved:

operation == 'get-dbaas-service-pg' && parameters.name == 'my-service'

Allow only a specific user to see their Kafka secrets:

operation == 'reveal-dbaas-kafka-user-password' && parameters.username == 'a-user'

IAM

Deny requests to the IAM service. Allow all requests to other services.

{
  "default-service-strategy": "allow",
  "services": {
    "iam": {
      "type": "deny"
    }
  }
}

Deny requests to the IAM service for a specific key. Allow all requests to other services regardless of which key is used.

{
  "default-service-strategy": "allow",
  "services": {
    "iam": {
      "type": "rules",
      "rules": [
        {
          "action": "allow",
          "expression": "api_key != 'EXO123456789'"
        }
      ]
    }
  }
}

Alternatively this can be written using a deny rule, which is more practical if you want to enforce this within an organization policy:

{
  "default-service-strategy": "allow",
  "services": {
    "iam": {
      "type": "rules",
      "rules": [
        {
          "action": "deny",
          "expression": "api_key == 'EXO123456789'"
        },
        {
          "action": "allow",
          "expression": "true"
        }
      ]
    }
  }
}

Ensure keys can only be created for a particular role id:

{
  "default-service-strategy": "allow",
  "services": {
    "iam": {
      "type": "rules",
      "rules": [
        {
          "action": "deny",
          "expression": "operation == 'create-api-key' && parameters.role_id != '<my-role-id uuid>'"
        },
        {
          "action": "allow",
          "expression": "true"
        }
      ]
    }
  }
}

Prevent updates and deletion of a role named ‘my-role’:

operation in ['update-iam-role', 'update-iam-role-policy', 'delete-iam-role'] && resources.iam_role.name == 'my-role'

Prevent the creation of legacy IAM keys. create-access-key is the operation for legacy access keys, where operations were set directly on the key; create-api-key is the newer operation, which works with IAM roles.

{
  "default-service-strategy": "allow",
  "services": {
    "compute-legacy": {
      "type": "deny"
    },
    "iam": {
      "type": "rules",
      "rules": [
        {
          "expression": "operation == 'create-access-key'",
          "action": "deny"
        },
        {
          "expression": "true",
          "action": "allow"
        }
      ]
    }
  }
}

Allow keys to be created, listed, and retrieved:

{
  "default-service-strategy": "allow",
  "services": {
    "iam": {
      "type": "rules",
      "rules": [
        {
          "action": "allow",
          "expression": "operation in ['create-api-key', 'list-api-keys', 'get-api-key']"
        }
      ]
    }
  }
}

SOS

The following policy allows listing and retrieving objects on the bucket my-bucket. No other operations or buckets are allowed, meaning that the caller will not be able to list buckets or to look at the properties of a bucket using the exo storage show command.

{
  "default-service-strategy": "deny",
  "services": {
    "sos": {
      "type": "rules",
      "rules": [
        {
          "expression": "parameters.bucket == 'my-bucket' && operation in ['list-objects', 'get-object']",
          "action": "allow"
        }
      ]
    }
  }
}

A more elaborate policy is provided in the next example:

Allows listing all buckets in the organization via the Exoscale CLI (exo storage ls) or s3cmd (s3cmd ls). Other S3-compatible tooling may require additional operations to be added.
Operations relating to a bucket are restricted: only my-bucket and my-other-bucket are allowed, and attempts to list objects in other buckets will be rejected.
The last rule in the policy relates to usage via the Exoscale CLI: more operations are needed to support exo storage show sos://my-bucket.
The order of the rules is important due to the way the deny rule has been written: in order to allow listing buckets, the expression containing list-buckets must come before the deny expression.

{
  "default-service-strategy": "deny",
  "services": {
    "sos": {
      "type": "rules",
      "rules": [
        {
          "expression": "operation in ['list-sos-buckets-usage', 'list-buckets']",
          "action": "allow"
        },
        {
          "expression": "!(parameters.bucket in ['my-bucket', 'my-other-bucket'])",
          "action": "deny"
        },
        {
          "expression": "operation in ['list-objects', 'get-object']",
          "action": "allow"
        },
        {
          "expression": "operation in ['get-bucket-acl', 'get-bucket-cors', 'get-bucket-ownership-controls']",
          "action": "allow"
        }
      ]
    }
  }
}
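
To see how the rule order plays out, the policy above can be walked through with a small Python model. Callables stand in for the CEL expressions, and the evaluator also returns the index of the matching rule. One assumption is made explicit in the code: a request without a bucket parameter is treated as having no bucket (None), so the deny rule matches it.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Request = Dict[str, Any]
Rule = Tuple[str, Callable[[Request], bool]]


def bucket(request: Request) -> Optional[str]:
    # Assumption: a missing bucket parameter is modeled as None.
    return request.get("parameters", {}).get("bucket")


RULES: List[Rule] = [
    ("allow", lambda r: r["operation"] in ["list-sos-buckets-usage", "list-buckets"]),
    ("deny", lambda r: bucket(r) not in ["my-bucket", "my-other-bucket"]),
    ("allow", lambda r: r["operation"] in ["list-objects", "get-object"]),
    ("allow", lambda r: r["operation"] in [
        "get-bucket-acl", "get-bucket-cors", "get-bucket-ownership-controls"]),
]


def evaluate(rules: List[Rule], request: Request) -> Tuple[str, Optional[int]]:
    """Return (action, index of the matching rule), or ('deny', None)."""
    for index, (action, predicate) in enumerate(rules):
        if predicate(request):
            return action, index
    return "deny", None


def req(operation: str, bucket_name: Optional[str] = None) -> Request:
    params = {"bucket": bucket_name} if bucket_name else {}
    return {"operation": operation, "parameters": params}


print(evaluate(RULES, req("list-buckets")))                  # ('allow', 0)
print(evaluate(RULES, req("list-objects", "other-bucket")))  # ('deny', 1)
print(evaluate(RULES, req("get-object", "my-bucket")))       # ('allow', 2)
print(evaluate(RULES, req("put-object", "my-bucket")))       # ('deny', None)
```

In this model, moving the deny rule to the front makes list-buckets match it at index 0, which is the situation the ordering note warns about.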

Allow/deny reading/writing on a specific public prefix:

operation in ['get-object', 'put-object'] && parameters.key.startsWith('public')

Understanding a forbidden API call

Here are possible scenarios when the API returns a 403 Forbidden response.

forbidden by [role|org] policy, [compute|sos|dns|iam]: Unable to find an operation in the list defined by the policy. The message indicates that on the specific policy (at role or organization level), the performed operation is not specified with either deny or allow action. As the parser is not able to find a single match - the call is denied. The message also shows a service-class the restriction belongs to.

Note

If the goal is to restrict a key to a specific operation or resource, but at the same time allow any other type of operation, {"action": "allow", "expression": "true"} can be added to the end of the rule set.

forbidden by [role|org] policy, [compute|sos|dns|iam] - A deny rule matched. Rule index: *INDEX_NUMBER* This extended version of the previous message appears when a policy has a rule whose action is deny and which matches the performed call. The index indicates the position of the rule in the rule set, starting at 0 for the first rule.