This is part 2 in a series of blog posts dedicated to helping companies learn what it takes to achieve a Zero Trust security architecture of their own, much like Google’s BeyondCorp.
For a deeper technical dive, ScaleFT is offering an exclusive free chapter of the O’Reilly book titled Zero Trust Networks: Building Secure Systems in Untrusted Networks. Get your free copy today!
My previous post in this series covered the benefits of migrating towards a Zero Trust architecture, and the next few will focus on the steps a company should take to get there. With any significant architectural shift, it’s imperative to come prepared with the right mindset and prerequisite knowledge to ensure as seamless a transition as possible. As we’ve learned from the past decade or so, any company whose idea of moving to the cloud was to simply lift and shift their legacy applications quickly found out the hard way why that doesn’t work. Similarly, a company moving its access controls to the Zero Trust model without clearly understanding the implications will most certainly falter. Zero Trust is a more effective way to protect your resources, but it must be done right.
Google talks extensively about how they approached the BeyondCorp project in this regard, and much of their rollout involved rigorous testing and analysis. BeyondCorp, however, was the result of a seven-year project staffed by a large, dedicated team. Most companies simply can’t afford to invest that much time and energy into a project like this, even when the outcomes are so clearly desirable. Luckily, we can learn from their experience and focus on what’s really needed for a successful implementation.
The recommendations put forth here are activities you should be doing regardless, and chances are that you already are to some degree. The goal here is to gain a solid understanding of your current state to put you in a better position moving forward towards Zero Trust.
Take an inventory of all employee devices
A key principle of the Zero Trust model is factoring in the device being used when attempting to connect to a resource. As we’ve seen from recent high-profile breaches, unpatched software and unencrypted disks are low-hanging fruit for attackers, which makes enforcing device state critical for future prevention. Authenticating a device is now just as important as authenticating a user – with the combined attributes and state at the time of the request forming an identity profile that can be authorized.
This means that all employee devices must be known by the system, in much the same way that employees are known through a corporate directory. With BeyondCorp, Google mandates that all employee devices be fully managed by IT and tracks them in a Device Inventory service. Most companies are not that strict, however, and will support a BYOD policy. How can a company track all those devices to ensure they meet the access policies?
We’re focused on the planning process here, so the first step is to take an inventory of all employee devices. Simply ask your employees through a survey which devices they use and when – mobile phones, tablets, home desktops, personal laptops, etc. It doesn’t have to be exact at this stage, but what you’ll walk away with from this exercise is a better understanding of how employees work. With this information, you can decide how best to track and monitor the devices, as well as what to include within your access policies. For example, if you find that a significant portion of your employees are using Android devices from coffee shops with public wifi, you’ll likely want to make disk encryption a policy requirement when connecting to sensitive resources.
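As a rough illustration of what you might do with the survey results, here is a minimal Python sketch that tallies responses by operating system and network, then measures how many devices match a risky combination such as Android on public wifi. The field names (employee, device, os, network) and the sample responses are assumptions made up for this example; your survey will have its own shape.

```python
from collections import Counter

# Hypothetical survey responses; field names and values are illustrative only.
responses = [
    {"employee": "alice", "device": "laptop", "os": "macOS", "network": "office"},
    {"employee": "bob", "device": "phone", "os": "Android", "network": "public_wifi"},
    {"employee": "carol", "device": "tablet", "os": "Android", "network": "public_wifi"},
    {"employee": "dave", "device": "desktop", "os": "Windows", "network": "home"},
]


def summarize(responses):
    """Count surveyed devices by operating system and by network type."""
    by_os = Counter(r["os"] for r in responses)
    by_network = Counter(r["network"] for r in responses)
    return by_os, by_network


def share_on_public_wifi(responses, os_name):
    """Fraction of all surveyed devices that run os_name over public wifi."""
    if not responses:
        return 0.0
    matches = sum(
        1 for r in responses
        if r["os"] == os_name and r["network"] == "public_wifi"
    )
    return matches / len(responses)


if __name__ == "__main__":
    by_os, by_network = summarize(responses)
    print(by_os.most_common())
    print(by_network.most_common())
    print(share_on_public_wifi(responses, "Android"))
```

What counts as a "significant portion" is a judgment call for your own organization; the point is only that even a rough tally can inform which device attributes belong in your access policies.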
Whatever the results, don’t let this information overwhelm you. There are numerous fleet management and endpoint monitoring services that may eventually be part of your Zero Trust system. You don’t necessarily have to invest in either at this point, but it’s good to keep them in mind as potential architectural components during the planning and design phase.
Take an inventory of all credentials
A Zero Trust system aims to eliminate the use of static credentials entirely – instead shifting to a more dynamic environment that can issue ephemeral credentials limited in scope and time to each individual request. This is a key benefit to the architecture itself, and a real answer to dealing with common insider threat vectors plaguing companies of all kinds.
Eventually this means operating or using a service to handle PKI, but similar to the employee device exercise, the first step is to simply take an inventory of all your current credentials being used to access company resources. This is something you should do fairly regularly, as you will almost certainly uncover some less than desirable circumstances. For example, when’s the last time you peeked into your .ssh directory? Or inspected the authorized_keys file on your Linux servers? Have you ever shared database credentials with a colleague? Even the most compliant companies and security-conscious employees slip up somewhere in their day-to-day activities.
There are really two ways to run this exercise, and attacking from both sides will yield the best results. One: you can ask all your employees to document the credentials they use to access various resources. Two: you can document all active user accounts and valid credentials for every resource. This process is different from running a vulnerability scan, but similar tools can be used to perform this exercise. Of course, make sure that this process doesn’t accidentally reveal the credentials themselves!
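For the resource side of the exercise, one place to start is the authorized_keys files mentioned above. The Python sketch below is a simplified example, not a complete parser: it records each key's type, comment, and a SHA-256 fingerprint (a hash of the decoded key blob, in the style OpenSSH prints) rather than the key material itself. It assumes one key per line and does not handle quoted options that contain spaces. The function names and the sample content are made up for illustration.

```python
import base64
import hashlib

# Prefixes of common OpenSSH public key types (ssh-rsa, ssh-ed25519,
# ecdsa-sha2-*, sk-*). This list is an assumption, not exhaustive.
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def fingerprint(key_b64):
    """SHA-256 fingerprint of a base64-encoded public key blob."""
    blob = base64.b64decode(key_b64, validate=True)
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def inventory_authorized_keys(text):
    """Return one summary dict per key line, without the key material."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Options may precede the key type, so look for the first field
        # that resembles a key type. Quoted options with spaces are not handled.
        idx = next(
            (i for i, f in enumerate(fields) if f.startswith(KEY_TYPE_PREFIXES)),
            None,
        )
        if idx is None or idx + 1 >= len(fields):
            continue
        try:
            fp = fingerprint(fields[idx + 1])
        except ValueError:
            continue  # not valid base64, so skip rather than guess
        entries.append({
            "type": fields[idx],
            "fingerprint": fp,
            "comment": " ".join(fields[idx + 2:]),
            "has_options": idx > 0,
        })
    return entries


# Placeholder content standing in for a real authorized_keys file.
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
sample = "\n".join([
    "# deploy keys",
    "ssh-ed25519 " + fake_key + " alice@laptop",
    "no-pty ssh-rsa " + fake_key + " build bot",
    "not a key line",
])

if __name__ == "__main__":
    for entry in inventory_authorized_keys(sample):
        print(entry)
```

Public keys are not secrets in the way passwords are, but keeping only fingerprints in the inventory is a good habit to carry over to credentials that are.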
Diagram your current system architecture
Designing a new security architecture is no easy task, regardless of how nimble and cloud-native your company is. While the outcome of Zero Trust is a system that places no access controls in the network, ripping out the corporate VPN will likely be the last step to take. A more realistic scenario is to preserve the network topology throughout the implementation process, while gradually moving more of the access controls to the application layer over time.
In order to plan ahead properly, a key exercise is to diagram the current system, placing extra emphasis on where the access controls lie. Subnet traffic and logical groupings, such as microservices running within the same AWS VPC, are less affected by a move to a Zero Trust network. Specific protocols and ports related to ingress and egress firewall rules also have little to do with this exercise, although it is certainly still important to follow best practices in that regard.
What is most important with this exercise is understanding network segmentation and remote access with regard to your current trust model. Where do the resources themselves live, and how are they protected today? What will happen once those resources are deployed to the public Internet? We know the answer to that, which is why implementing a Zero Trust network means authenticating, authorizing, and encrypting every request. Given that we’re in planning mode, it is first imperative to understand how your current network assigns trust, because that will change.
As you go through this exercise, start to think of your network as the public Internet – where clients are accessing resources without VPNs. The trust model will come down to authenticating the user and device, and authorizing that profile against the access policies associated with the resource. The decision to grant trust will no longer live in the network, so having a snapshot of your current architecture will help you visualize what will change.
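To make the shift concrete, here is a minimal Python sketch of an access decision that depends only on the user, the device, and the policy attached to the resource. Network location is deliberately not an input. The class names, fields, and policy attributes are all hypothetical, chosen for illustration; a real system would draw them from your directory, device inventory, and policy store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    disk_encrypted: bool
    os_patched: bool


@dataclass(frozen=True)
class Request:
    user: str
    groups: frozenset
    device: Device
    resource: str


@dataclass(frozen=True)
class Policy:
    resource: str
    allowed_groups: frozenset
    require_disk_encryption: bool = True
    require_patched_os: bool = True


def authorize(request, policies, known_devices):
    """Decide access from user and device attributes only (default deny)."""
    policy = policies.get(request.resource)
    if policy is None:
        return False  # no policy for the resource means no access
    if request.device.device_id not in known_devices:
        return False  # the device must appear in the inventory
    if not (request.groups & policy.allowed_groups):
        return False  # the user must belong to an allowed group
    if policy.require_disk_encryption and not request.device.disk_encrypted:
        return False
    if policy.require_patched_os and not request.device.os_patched:
        return False
    return True


if __name__ == "__main__":
    policies = {"wiki": Policy("wiki", frozenset({"engineering"}))}
    laptop = Device("laptop-1", disk_encrypted=True, os_patched=True)
    request = Request("alice", frozenset({"engineering"}), laptop, "wiki")
    print(authorize(request, policies, known_devices={"laptop-1"}))
```

Notice that nothing in the function asks which subnet or VPN the request came from. Mapping each check back to where that decision is made in your current architecture diagram is a useful way to see what will move.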
Inspect your traffic logs
A natural follow-up exercise to creating a system architecture diagram is to then inspect the traffic to get a better understanding of every possible network flow to your protected resources.
Any large-scale system with decent traffic will generate a significant number of log files to trawl through, so it’s best to focus on what’s actually useful. Traffic volume isn’t necessarily important to our design, except when deciding which resources to start migrating over. You likely won’t pick the highest traffic resource as your first application to migrate. Things to look at could be the user-agent to understand the identity of the client, any traffic blocked by an ACL, whether the payload was encrypted, and whether the channel was encrypted.
It’s also important to see the various protocols being used across all the infrastructure and web-based resources. This data provides insight into user behavior, but more importantly for the design phase, it tells you what protocols you will need to support with the new system. During the BeyondCorp planning phase, for example, Google found a wide range of protocols in use – including numerous legacy or insecure edge cases. They made strict decisions about what they would support with the new system and built the appropriate workflows to support them, with HTTPS and SSH as the primary examples referenced in their research papers.
There is no shortage of logging and monitoring tools on the market to choose from, so pick your favorite from the bunch. Once you have the tools instrumented, start to classify the data based on the resources so you can inspect the flows to each independently. This will help during the migration process, as you can inspect traffic across the current system side-by-side with the Zero Trust system on a per-resource basis.
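Whichever tool you choose, the classification step might look something like the following Python sketch. It assumes your log entries have already been parsed into records with resource, protocol, user_agent, action, and encrypted fields. Those field names and the sample records are hypothetical; real log formats vary by tool.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed log records; field names are assumptions.
records = [
    {"resource": "wiki", "protocol": "https", "user_agent": "Mozilla/5.0",
     "action": "allow", "encrypted": True},
    {"resource": "wiki", "protocol": "http", "user_agent": "curl/8.0",
     "action": "allow", "encrypted": False},
    {"resource": "build-server", "protocol": "ssh", "user_agent": "",
     "action": "deny", "encrypted": True},
]


def new_summary():
    """Empty per-resource summary."""
    return {
        "requests": 0,
        "blocked": 0,
        "unencrypted": 0,
        "protocols": Counter(),
        "user_agents": Counter(),
    }


def classify_by_resource(records):
    """Group flow records by resource and tally the design-relevant fields."""
    summary = defaultdict(new_summary)
    for record in records:
        entry = summary[record["resource"]]
        entry["requests"] += 1
        if record["action"] == "deny":
            entry["blocked"] += 1  # traffic stopped by an ACL
        if not record["encrypted"]:
            entry["unencrypted"] += 1  # channel was not encrypted
        entry["protocols"][record["protocol"]] += 1
        if record["user_agent"]:
            entry["user_agents"][record["user_agent"]] += 1
    return dict(summary)


if __name__ == "__main__":
    for resource, entry in classify_by_resource(records).items():
        print(resource, entry)
```

The per-resource protocol counts give you a first draft of the list of protocols the new system will have to support, and the same summary can be rerun later to compare the current system against the Zero Trust system.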
Making Sense of the Data Collected
Performing these exercises leaves you with a lot of data about your environment, but what should you actually do with it? Independently, each data point may be useful to a degree, but the real value lies in the combined picture, which will help clear the path towards designing your own Zero Trust network. For example, during Google’s BeyondCorp initiative, they performed these exercises and found running services that were supposed to have been decommissioned.
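One simple way to combine the data is to compare the resources in your architecture diagram against the resources that actually appear in your traffic logs. The Python sketch below assumes you can reduce both to lists of resource names; the function name and the sample names are hypothetical.

```python
def compare_inventory(diagrammed, observed):
    """Compare resources on the diagram with resources seen in traffic logs."""
    diagrammed, observed = set(diagrammed), set(observed)
    return {
        # Receiving traffic but missing from the diagram, for example a
        # service that was supposed to have been decommissioned.
        "undocumented": sorted(observed - diagrammed),
        # On the diagram but with no traffic in the sampled logs.
        "no_traffic": sorted(diagrammed - observed),
    }


if __name__ == "__main__":
    result = compare_inventory(
        diagrammed=["wiki", "build-server", "payroll"],
        observed=["wiki", "build-server", "old-ftp"],
    )
    print(result)
```

Anything in either list deserves a conversation before the migration begins.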
Through these exercises you will know the network topology that will change and the network flows that will change. Having a grasp on both is critical. We are talking about a security architecture, so naturally, maintaining strict security is essential. Just as important is to be able to make it through the transformation without any interruption, so the more prepared you are going in, the better.
We’re going to parlay our new knowledge into creating your access policy framework. This is where we really start to apply the Zero Trust principles in practice. There are a number of ways to approach access policies, which I’ll cover in the next post. Stay tuned!