OPA Validation

We’ve recently been trying to use OPA validations for our CI/CD checks. Honestly, I hate it. Regardless, I put together a short article on how it works and why it’s appealing.

OPA (Open Policy Agent) is a general-purpose policy engine. It lets you offload policy decisions from your code and is meant to unify policy enforcement across any stack.

This technology is meant to give us more control over our systems. You can integrate it as a service or run it as a daemon. OPA is meant to be a vendor-neutral way to unify policy enforcement and management across systems. Netflix uses OPA to enforce API authorization, and Chef uses it for IAM capabilities. On Kubernetes clusters, OPA policies work as guardrails and restraints.

Imagine you have an app where customers connect to an online portal backed by microservices. Alice, as the developer, can have access to the portal’s payments, accounts, promotions, and other services. This is the problem: every one of those services has to decide what Alice is allowed to do.

To keep track of her access and everyone else’s, we need to put API authorization on each of our services. In some cases we may also need to deal with IAM. The problem with doing this service by service is that it takes time and money, and raises additional questions.

This problem doesn’t stop at the application layer; it also exists in the platform. If your application runs on Kubernetes, we can assume Alice has access to pods, access controls, and so on. The problem is that Alice can make a mistake that brings down the entire application. There’s nothing stopping Alice from pulling container images with vulnerabilities, accidentally redirecting network traffic, or allocating extra CPU to one pod and causing runaway resource usage.

Managing access control separately in each system therefore becomes difficult. What ends up happening is that each service or authorization system carries its own set of access control rules. The solution with OPA is to centralize these rules into a unified policy across the stack.

II.

To see what a general-purpose policy engine does in practice, say you are responsible for implementing the “salary service”, which looks up and provides salary data for your company. The service exposes an API for looking up an employee’s data. Obviously you don’t want just anyone to use this service to access that data. Each authorization attempt becomes a policy decision: the service sends OPA a query describing the request, with things such as the HTTP method (GET), the path, and the user ID. OPA evaluates the query against the policy and returns a decision to allow or deny, which the service then enforces. This decouples decision-making from enforcement, so administrators can manage policy without changing the service, and because OPA is domain-agnostic, it can plug into many different services and platforms.
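
To make the split between decision and enforcement concrete, here is a minimal Python sketch of the salary service’s side of that exchange. It is not OPA itself: `decide` is a hypothetical stand-in for the call to OPA, the field names `method`, `path`, and `user` are assumptions about how the request gets described, and the "employees may read only their own salary" rule is just an example policy.

```python
from typing import Any, Callable, Dict, Tuple

PolicyInput = Dict[str, Any]


def build_input(method: str, url_path: str, user: str) -> PolicyInput:
    """Describe the incoming request as a JSON-style document.

    The field names are illustrative assumptions, not a fixed OPA schema.
    """
    return {
        "method": method,
        # "/salary/alice" becomes ["salary", "alice"]
        "path": [part for part in url_path.split("/") if part],
        "user": user,
    }


def own_salary_only(policy_input: PolicyInput) -> bool:
    """Stand-in for the policy decision (hypothetical example rule)."""
    return (
        policy_input["method"] == "GET"
        and policy_input["path"] == ["salary", policy_input["user"]]
    )


def handle_request(
    method: str,
    url_path: str,
    user: str,
    decide: Callable[[PolicyInput], bool],
) -> Tuple[int, str]:
    """Enforcement stays in the service; the decision comes from `decide`."""
    if decide(build_input(method, url_path, user)):
        return 200, "ok"
    return 403, "forbidden"
```

Because `handle_request` only sees a yes or no, the rule behind `decide` can change without touching the service code, which is the decoupling described above.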

OPA accepts any JSON value as the input to a policy query and can return any JSON value as the decision. Policies are written in Rego, a high-level declarative policy language whose purpose is to express constraints over that JSON data. OPA keeps the policies and data it needs in memory, and it is recommended to run it as a host daemon next to your service in order to meet SLAs and cut down on network response times. With OPA, policy is treated as code, so you can write tests for policies and use logging, tracing, profiling, and other methods for control and observability.
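
As a rough sketch of what querying a local OPA daemon can look like, the Python below builds a request for OPA’s REST Data API (`POST /v1/data/<path>` with the query wrapped in an `input` field). The address assumes OPA’s default port on the same host, and the package `salary` with a rule named `allow` is a hypothetical policy, not something defined in this article.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumptions: OPA runs next to the service on its default port, and the
# policy lives in a hypothetical package `salary` with a rule `allow`.
OPA_URL = "http://localhost:8181"
RULE_PATH = "salary/allow"


def build_query(policy_input: Dict[str, Any]) -> urllib.request.Request:
    """Build a Data API request whose JSON body is {"input": <document>}."""
    body = json.dumps({"input": policy_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{RULE_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decision_from_response(response_body: Dict[str, Any]) -> bool:
    """Read the decision; a missing "result" (undefined rule) counts as deny."""
    return response_body.get("result") is True


def is_allowed(policy_input: Dict[str, Any], timeout_s: float = 1.0) -> bool:
    """Send the query to the local OPA daemon and return allow or deny."""
    request = build_query(policy_input)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return decision_from_response(json.load(response))
```

Treating anything other than an explicit `true` as a deny is a design choice in this sketch, so that a missing policy or a typo in the rule path fails closed.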

III.

Let’s create and test a policy.

On the left we have the policy in the Rego Playground, and on the right we have the input we want to evaluate it against. OPA looks for bindings to the variable “employee ID”. You can build up additional rules to test; the ones that hold for the input appear as true in the output.
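
For anyone who wants to try this outside the Playground, here is a small Python sketch that writes out a policy and an input of the kind described. The Rego text is my own reconstruction, not the exact policy from the Playground: it assumes OPA 1.0 or later syntax (older releases also need `import rego.v1`), a hypothetical package name `salary`, and made-up input values.

```python
import json
from pathlib import Path
from typing import Tuple

# Rego policy text (assumed OPA 1.0+ syntax, hypothetical package name).
POLICY = """\
package salary

default allow := false

# employee_id is bound to the second element of the path, then
# compared with the user making the request.
allow if {
    input.method == "GET"
    input.path = ["salary", employee_id]
    input.user == employee_id
}
"""

# Example input document with made-up values.
INPUT = {"method": "GET", "path": ["salary", "alice"], "user": "alice"}


def write_example(directory: str) -> Tuple[Path, Path]:
    """Write policy.rego and input.json into `directory` and return their paths."""
    target = Path(directory)
    policy_path = target / "policy.rego"
    input_path = target / "input.json"
    policy_path.write_text(POLICY, encoding="utf-8")
    input_path.write_text(json.dumps(INPUT, indent=2), encoding="utf-8")
    return policy_path, input_path
```

After calling `write_example(".")`, the two files can be pasted into the Playground’s policy and input panes, or, with the `opa` binary installed, evaluated locally with `opa eval -d policy.rego -i input.json "data.salary.allow"`.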