Teleport: first impressions and implementing dynamic role mapping

I'll start with two theses:

"I think the role of cybersecurity is somewhat exaggerated." This is a quote from Ukraine's Minister of Digital Transformation, voiced around December 2019. After that, many government registries were hacked.
Security processes should not interfere with normal work. This one is my own reasoning.

And while I disagree with the first one, I very much agree with the second. That's how Teleport appeared in my infrastructure.

First, I'll share my personal first impression, then move on to the main topic: dynamic role mapping. If you'd rather skip the first part, go straight to the section on dynamic role mapping.

First impression of Teleport

Having Zero-Trust, RBAC and the principle of least privilege is better than not having them. That's why I'm completely reworking the access system architecture right now. Teleport seems to have many pros, but also quite a few cons. Let's go through everything in order.

My personal impression is still mixed. Teleport seems okay, and I found nothing comparable on the market. Yes, there are similar products, but I didn't find one that is specifically an access plane with the right set of features in a single product. The set I need: a zero-trust model, role-based access, SSO, at least partial ability to use native clients, audit logs, and ideally just-in-time access. And of course, managing all of this shouldn't be super complicated; I want the product to solve a specific task and not be overloaded with functionality. At first glance, Teleport handles this pretty well.

But there's not much to compare it to, since I found no other access planes. And there are cons, so let's talk about them. The first con is the pricing model. Where is it, guys? Why is there only a community version and a sales call? Why can't you just publish a normal price list or, at worst, answer me by email? Why insist on a call? I simply don't have time to find a free slot in my calendar to tell you about our company and listen to how great Teleport is. I've already chosen it and just want to understand whether I can afford the paid version. I don't need a half-hour call with your marketing department for that. No, I'll call, of course, but later, when I finish the task I started and find time for it. And I'll come to you with feedback.

The second con is extremely selective functionality. Okay, I understand there's no point in supporting a repository for Debian 10 when it's already EOL, but what I don't understand is the difference between Microsoft SQL Server in Azure and Microsoft SQL Server on my physical server. Even in the case of Azure, you suggest creating a separate VM and installing an agent there. What's the difficulty in doing the same with my local server? That option simply doesn't exist.

The third con is the limit on Windows RDP servers in the community version: 5. Thanks for not limiting SSH and k8s. I understand charging money for just-in-time access, for extended integrations, for additional SSO providers. I don't completely understand it, because other services in related areas charge for the cloud version and support, but okay, it's still understandable. But what did RDP sessions do to you? How are they conceptually different from SSH?

The fourth con is complexity. Well, "con" is a strong word; it's more of a general problem inherent to all similar services. You really need to read the documentation COMPLETELY before configuring anything more complex than an owner role for all resources. It's less a problem than a fact: you can't just jump in and configure everything correctly on the first try, relying on personal experience and on your ideas of how this or that functionality should work. You really need to understand it. Not that this is unusual for me, but my expectations didn't match reality.

Otherwise - everything is quite functional, not overcomplicated, and works as expected.

Dynamic role mapping to GitHub teams in the community version. At least, this is how I see its implementation right now.

By the time I started this implementation, I had spent roughly 20-25 hours working with Teleport, counting from when I created the instance and set up the server: testing, evaluation, and so on. The initial task is to distribute access to the team safely, in such a way that later I don't have to go to servers and delete SSH keys, extra users, and so on.

The infrastructure falls into 2 types: critical, which is exclusively for administrative access, and the kind that developers can be given access to if necessary. Naturally, this isn't about a single project: there will be many infrastructures. The last thing I want is to create roles for each individual project or, worse still, to run a separate cluster for each project. Separate clusters will also exist, but only in special cases, more as a privilege than as a given.

Therefore, the idea of dynamically mapping access to infrastructure based on entities received from the SSO provider doesn't look so bad. Since only GitHub is available in the community version, we'll base it on GitHub teams, which we already use.

The first step is to create and configure an auth connector. This is simple and done according to the documentation. The connector is shown below, and the mapping details follow it.

kind: github
metadata:
  name: acme-gh
spec:
  api_endpoint_url: ""
  display: GitHub
  endpoint_url: ""
  teams_to_roles:
  - organization: acme
    roles:
    - access
    - editor
    # plus any other roles giving full control over all existing resources
    team: ultimate
  - organization: acme
    roles:
    - dynamic-team-access
    - dynamic-admin-access
    team: all-company
version: v3

In the auth connector, the roles that give the most complete access to all existing resources, including Teleport administration, are mapped to the GitHub team ultimate. This is the group of people who will have absolutely every capability. Ideally, it will be the 3-5 most trusted people in the company. That avoids a single point of failure, where only one person has access to everything, and it also avoids the opposite situation, where too many people have access to everything at once.

We also see a mapping of roles dynamic-team-access and dynamic-admin-access to the all-company group. Yes, damn it, this is really a group that contains all company employees. Why? Because if a person logging into Teleport is not a member of groups explicitly specified in the auth connector - they won't be able to log in and will get an error. So, this is a kind of hack - give everyone the ability to log in, and at the same time just show an empty dashboard to those who don't have rights. Does this reduce the security level? Yes, a bit. Did I find another way to dynamically map roles to users? Within the time limit I set for myself on this task - no, I didn't find one. If Teleport employees suggest a better way to me - I'll write about it.

How this should work. The all-company group contains all company employees. But also, some employees are in other groups. They're not explicitly specified in the auth connector, but they come along with the user information. And they can be used.
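
To make this concrete, here is a sketch of how such a user might look inside Teleport after a GitHub login. It assumes the teams arrive in the github_teams trait, which is the trait the role expressions later in this post read. The user name alice and the team shop-team are made up for illustration.

```yaml
# Hypothetical user record after a GitHub SSO login (illustrative values only).
kind: user
version: v2
metadata:
  name: alice
spec:
  # Roles come from the connector mapping for the all-company team.
  roles:
  - dynamic-team-access
  - dynamic-admin-access
  traits:
    # All of the user's GitHub teams, including ones the connector never mentions.
    github_teams:
    - all-company
    - shop-team
```

The point is that shop-team appears in the traits even though the connector only maps ultimate and all-company, so roles can match against it.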

All infrastructure can be classified along the following dimensions:

project affiliation
environment type (dev or prod)
resource type (web server, database server, build server, monitoring server, etc.)
access scope (admin or team)

All this information goes into the labels of each resource. Environment type and resource type are purely informational labels: they're there for convenience and don't participate in access mapping. The project name and access scope, however, are needed. We also need to get the mapping between GitHub teams and resources from somewhere. It might seem that it could be derived from the project name by creating a GitHub team with the same name, but there will be two teams per project, project-team and project-admin, with different access levels, and Teleport's expression syntax can't concatenate two values to build a name. Therefore, admin_group and team_group labels are also added, containing the corresponding GitHub team names.

Thus, each resource will have at least 6 labels:

env: [prod|development|staging] [string]
project: [ {{ project_name }} ] [string]
scope: [admin|team] [string]
type: [string]
admin_group: [string]
team_group: [string]
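
As a rough sketch, these labels could be set as static labels in the agent's teleport.yaml on an SSH node. The project name shop, the node name, and the team names shop-admin and shop-team are assumptions made up for this example, and the settings for joining the cluster are left out.

```yaml
# teleport.yaml on a hypothetical web server of a project called "shop".
# All names and label values are illustrative, not from a real setup.
version: v3
teleport:
  nodename: shop-web-01
  # cluster join settings (proxy address, join token) omitted
ssh_service:
  enabled: true
  labels:
    env: prod                 # informational only
    type: web-server          # informational only
    project: shop             # used in access mapping
    scope: team               # admin or team
    admin_group: shop-admin   # GitHub team with admin access
    team_group: shop-team     # GitHub team with team access
```

Keeping both admin_group and team_group on every resource is what lets the roles match a GitHub team name directly, without building it from the project name.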

Next, we make one role for each access level: team and admin.

Team:

kind: role
metadata:
  description: Dynamic role to access resources based on team name
  name: dynamic-team-access
spec:
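  allow:
    # OS logins available on matching SSH nodes
    logins:
    - '{{internal.logins}}'
    - ubuntu
    - root
    # Static labels: the node must carry a project label and scope=team
    node_labels:
      project: '*'
      scope: team
    # Dynamic part: the node's team_group label must name one of the user's GitHub teams
    node_labels_expression: |
      contains(user.spec.traits["github_teams"], labels["team_group"]) && labels["scope"] == "team"
    # The same two conditions for Windows desktops (RDP)
    windows_desktop_labels:
      project: '*'
      scope: team
    windows_desktop_labels_expression: |
      contains(user.spec.traits["github_teams"], labels["team_group"]) && labels["scope"] == "team"
    windows_desktop_logins:
    - '{{internal.windows_logins}}'
    - Administrator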
version: v8

Admin:

kind: role
metadata:
  description: Dynamic role to access admin resources based on team name
  name: dynamic-admin-access
spec:
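  allow:
    # Teleport API permissions: read, list and create on every resource kind
    rules:
    - resources:
      - '*'
      verbs:
      - read
      - list
      - create
    # OS logins available on matching SSH nodes
    logins:
    - '{{internal.logins}}'
    - ubuntu
    - root
    # Static labels: any project, any scope
    node_labels:
      project: '*'
      scope: '*'
    windows_desktop_labels:
      project: '*'
      scope: '*'
    # Dynamic part: the resource's admin_group label must name one of the user's GitHub teams
    node_labels_expression: |
      contains(user.spec.traits["github_teams"], labels["admin_group"])
    windows_desktop_labels_expression: |
      contains(user.spec.traits["github_teams"], labels["admin_group"])
    windows_desktop_logins:
    - '{{internal.windows_logins}}'
    - Administrator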
version: v8

What these roles do: they give access to all SSH and RDP resources that satisfy a condition. For administrators, the condition is that the resource has an admin_group label matching the name of a GitHub team the user is a member of. For team members, it is that the resource has a team_group label matching the name of a GitHub team the user is a member of, and that the scope is team. This way, administrative project resources are separated from general ones, and dynamic access mapping is implemented for each individual user based on which teams they're a member of.
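
A small worked example of the intended outcome, written as plain YAML data rather than as a Teleport resource. It assumes two made-up users, alice (GitHub teams all-company and shop-team) and bob (all-company and shop-admin), both holding the two dynamic roles, and two made-up nodes of a project called shop. The outcomes follow from the conditions described above, not from a real cluster.

```yaml
# Illustration only: node labels and the access each hypothetical user should get.
- node: shop-web-01
  labels:
    project: shop
    scope: team
    admin_group: shop-admin
    team_group: shop-team
  access:
    alice: allowed   # dynamic-team-access: scope is team, team_group is one of her teams
    bob: allowed     # dynamic-admin-access: admin_group is one of his teams
- node: shop-db-01
  labels:
    project: shop
    scope: admin
    admin_group: shop-admin
    team_group: shop-team
  access:
    alice: denied    # scope is admin, and shop-admin is not one of her teams
    bob: allowed     # dynamic-admin-access matches regardless of scope
```

An employee who is only in all-company matches neither label expression, which is why they can log in but see no resources.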

Naturally, this demands stricter discipline in managing GitHub teams, but it greatly simplifies the process of granting access.

Of course, direct access to the resources, bypassing Teleport, will be closed.