The other day I came across the following post, in which Seth Vargo made some statements about the use of environment variables:

[…] While this approach is simple and straightforward, it comes with considerable security drawbacks - the secrets exist in plaintext in the environment. Any other process, library, or dependency running inside the process has access to the environment which has already been exploited multiple times. Unfortunately, it is trivial for a malicious library author to inject this type of vulnerability into an otherwise helpful utility package. To be absolutely, unequivocally clear, you should not store secret or sensitive information in environment variables in plaintext.

The author then lists some alternatives, such as using KMS to encrypt the values of the environment variables, using the Berglas library, or using HashiCorp Vault.

At the time of that post, GCP did not offer a direct rival to HashiCorp Vault, but a few months ago Google announced that Secret Manager was generally available (GA).

At my current workplace we are migrating from a private cloud to GCP, so there is no better moment for me to analyze the benefits of using Secret Manager and how easy or hard it is to make it work.

Deploying resources

Like many other companies, we deploy our workloads in a Kubernetes cluster (GKE on Google Cloud) while always trying to follow the Principle of Least Privilege, so that a workload can only access the resources it needs and cannot elevate privileges or modify other resources.

It only seems fair to do a proof of concept that follows these guidelines:

  • Create a service account with just the IAM roles (permissions on GCP) it needs.
  • Deploy a GKE cluster using that service account.
  • Deploy a pod inside that cluster and make sure it can access Secret Manager.

Let’s start.

Make sure the correct project is selected:

From now on, replace pablo-test-project with your GCP project identifier when executing the commands.

gcloud config set project pablo-test-project

To access the Secret Manager API from Google Kubernetes Engine, three things need to be done.

Enable Secret Manager API

gcloud services enable secretmanager.googleapis.com

Or from the Console UI.

Set corresponding IAM role on the service account running the instance

From the service account documentation:

When you set up an instance to run as a service account, you determine the level of access the service account has by the IAM roles that you grant to the service account. If the service account has no IAM roles, then no API methods can be run by the service account on that instance. Furthermore, an instance’s access scopes determine the default OAuth scopes for requests made through the gcloud tool and client libraries on the instance. As a result, access scopes potentially further limit access to API methods when authenticating through OAuth. […]

In order to assign the corresponding IAM roles to the service account, the account should be created first:

gcloud iam service-accounts create demo-service-account \
--description='Service account used for our demo cluster' \
--display-name=demo-service-account

The role to be assigned to this service account is roles/secretmanager.secretAccessor which allows accessing the payload of secrets:

gcloud projects add-iam-policy-binding pablo-test-project \
--member=serviceAccount:demo-service-account@pablo-test-project.iam.gserviceaccount.com \
--role=roles/secretmanager.secretAccessor

Configure OAuth scopes on the instance that consumes the API

From the access control documentation:

To access the Secret Manager API from within a Compute Engine instance or a Google Kubernetes Engine node (which is also a Compute Engine instance), the instance must have the cloud-platform OAuth scope. For more information about access scopes in Compute Engine, see Service account permissions in the Compute Engine documentation.

As mentioned before, an instance’s access scopes determine the default OAuth2 scopes for the requests it makes, so the cluster nodes that are going to be created need the cloud-platform scope. Don’t forget to also specify the service account created before.

gcloud container clusters create demo-cluster \
--zone=europe-west3-a \
--num-nodes=1 \
--machine-type=n1-standard-1 \
--service-account=demo-service-account@pablo-test-project.iam.gserviceaccount.com \
--scopes=https://www.googleapis.com/auth/monitoring,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/cloud-platform

After the cluster is deployed, get the credentials for kubectl.

gcloud container clusters get-credentials demo-cluster --zone europe-west3-a

Before proceeding further with the cluster, create a secret in Secret Manager and add a version holding its value:

gcloud secrets create MY_SECRET
printf "s3cr3t" | gcloud secrets versions add MY_SECRET --data-file=-

To test that the instance can access the Secret Manager API, deploy a dummy pod that you can exec into:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: httpbin
spec:
  containers:
  - image: docker.io/kennethreitz/httpbin
    imagePullPolicy: IfNotPresent
    name: httpbin
    ports:
    - containerPort: 80
EOF

Exec into the created container and install curl on it:

kubectl exec -it httpbin -- sh
$ apt update && apt install -y curl

Trying it out

To try the Secret Manager API, an OAuth2 token needs to be issued first, as the documentation explains:

For some applications, you might need to request an OAuth2 access token and use it directly without going through a client library or using the gcloud or gsutil tools. There are several options for obtaining and using these access tokens to authenticate your applications. […] On the instance where your application runs, query the metadata server for an access token. […]

When inside the deployed container:

curl http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token -H "Metadata-Flavor: Google"

Since the OAuth2 scopes defined for the instance include cloud-platform, the service account will be able to send requests to the Secret Manager API.
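The same token request can be sketched in Go. This is only an illustration: the helper names newTokenRequest and parseToken are made up for this example, and the sample response body is invented. The only field it relies on is access_token, which is the one used in the rest of this post.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// Same metadata endpoint as the curl call above.
const tokenURL = "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"

// newTokenRequest (hypothetical helper) builds the GET request and sets
// the "Metadata-Flavor: Google" header used in the curl call.
func newTokenRequest(url string) (*http.Request, error) {
	rq, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	rq.Header.Set("Metadata-Flavor", "Google")
	return rq, nil
}

// parseToken (hypothetical helper) extracts access_token from the response body.
func parseToken(body []byte) (string, error) {
	var t struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.Unmarshal(body, &t); err != nil {
		return "", err
	}
	if t.AccessToken == "" {
		return "", errors.New("response has no access_token")
	}
	return t.AccessToken, nil
}

func main() {
	rq, err := newTokenRequest(tokenURL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rq.Method, rq.URL, rq.Header.Get("Metadata-Flavor"))

	// Invented sample body, so the sketch also runs outside GCP.
	sample := []byte(`{"access_token":"example-token","expires_in":3599,"token_type":"Bearer"}`)
	token, err := parseToken(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(token)
}
```

Inside the pod, the request built by newTokenRequest would be sent with http.DefaultClient.Do and the response body passed to parseToken.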

Copy the access_token from the response and execute the following call:

Remember to replace pablo-test-project with your project identifier.

curl https://content-secretmanager.googleapis.com/v1beta1/projects/pablo-test-project/secrets/MY_SECRET/versions/latest:access -H "Authorization: Bearer <access_token>"

The response should look like this:

{
  "name": "projects/127553260172/secrets/MY_SECRET/versions/1",
  "payload": {
    "data": "czNjcjN0"
  }
}
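The data field comes back base64-encoded. As a minimal sketch, assuming only the response shape shown above, this is how the payload could be parsed and decoded in Go (decodeSecret is a name made up for this example):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// accessResponse mirrors only the fields of the response that are needed here.
type accessResponse struct {
	Name    string `json:"name"`
	Payload struct {
		Data string `json:"data"`
	} `json:"payload"`
}

// decodeSecret (hypothetical helper) parses the response body and
// returns the base64-decoded secret value.
func decodeSecret(body []byte) (string, error) {
	var r accessResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("parsing response: %w", err)
	}
	raw, err := base64.StdEncoding.DecodeString(r.Payload.Data)
	if err != nil {
		return "", fmt.Errorf("decoding payload: %w", err)
	}
	return string(raw), nil
}

func main() {
	body := []byte(`{"name":"projects/127553260172/secrets/MY_SECRET/versions/1","payload":{"data":"czNjcjN0"}}`)
	value, err := decodeSecret(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(value) // prints: s3cr3t
}
```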

Decoding the base64 value czNjcjN0 gives s3cr3t. This worked fine, but do we really need the IAM permission, or is the OAuth2 scope enough?

Remove the IAM permission from the service account:

gcloud projects remove-iam-policy-binding pablo-test-project \
--member=serviceAccount:demo-service-account@pablo-test-project.iam.gserviceaccount.com \
--role=roles/secretmanager.secretAccessor

Wait a minute for the change to propagate and issue the request again. The response should look like this:

{
  "error": {
    "code": 403,
    "message": "Permission 'secretmanager.versions.access' denied for resource 'projects/pablo-test-project/secrets/MY_SECRET/versions/latest' (or it may not exist).",
    "status": "PERMISSION_DENIED"
  }
}
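Note that the error details are nested under an error object. A small sketch of how a Go client could detect this kind of response, assuming only the fields visible above (apiError is a hypothetical helper and the message in the sample is shortened):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorEnvelope mirrors the error response shown above.
type errorEnvelope struct {
	Error struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
		Status  string `json:"status"`
	} `json:"error"`
}

// apiError (hypothetical helper) returns a non-nil error when the body
// carries an "error" object with a non-zero code.
func apiError(body []byte) error {
	var e errorEnvelope
	if err := json.Unmarshal(body, &e); err != nil {
		return fmt.Errorf("parsing response: %w", err)
	}
	if e.Error.Code != 0 {
		return fmt.Errorf("%d %s", e.Error.Code, e.Error.Status)
	}
	return nil
}

func main() {
	denied := []byte(`{"error":{"code":403,"message":"Permission denied","status":"PERMISSION_DENIED"}}`)
	fmt.Println(apiError(denied)) // prints: 403 PERMISSION_DENIED

	ok := []byte(`{"payload":{"data":"czNjcjN0"}}`)
	fmt.Println(apiError(ok)) // prints: <nil>
}
```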

So the IAM role is indeed required for this to work. Add the IAM role again before moving on to the next part.

Moving this to a more realistic scenario

It is very important not to break the current workflow, as the Secret Manager API is probably overkill for local and dev environments. Being able to keep that workflow untouched is one of the selling points.

For this, I created a demo application which is doing the following:

  • It expects an environment variable named GCP_PROJECT. If it is not set, the application reads secrets from environment variables and uses a default value when no variable with the expected name is found.
  • If GCP_PROJECT is set, it looks the secret up in Secret Manager. If the secret is not found there, it uses a default value as well.
  • The application listens on port 8080 and exposes the route GET /get-secret, which expects the header secret: <secret-name>.
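The lookup order from the list above can be sketched as follows. This is not the demo application's code, just an illustration of the decision logic: resolve, fetchFunc and errNotFound are names invented for this example, and the Secret Manager call is replaced by a fake.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNotFound is returned by the fake fetcher below (hypothetical name).
var errNotFound = errors.New("secret not found")

// fetchFunc stands in for whatever reads a secret from Secret Manager.
type fetchFunc func(project, name string) (string, error)

// resolve applies the lookup order: environment when no project is set,
// Secret Manager otherwise, and the fallback when neither has a value.
func resolve(project, name, fallback string, fetch fetchFunc) string {
	if project == "" {
		// No GCP project configured: read from the environment.
		if v, ok := os.LookupEnv(name); ok {
			return v
		}
		return fallback
	}
	v, err := fetch(project, name)
	if err != nil {
		return fallback
	}
	return v
}

func main() {
	// Fake fetcher that knows a single secret, for illustration only.
	fake := func(project, name string) (string, error) {
		if name == "MY_SECRET" {
			return "s3cr3t", nil
		}
		return "", errNotFound
	}
	fmt.Println(resolve("pablo-test-project", "MY_SECRET", "default-for-MY_SECRET", fake))
	fmt.Println(resolve("pablo-test-project", "OTHER_SECRET", "default-for-OTHER_SECRET", fake))
	fmt.Println(resolve("", "MY_SECRET", "default-for-MY_SECRET", fake))
}
```

Passing the fetcher in as a function keeps the decision logic testable without a cluster.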

This application is explained in the next section, but for now we can just add it to the cluster:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: secret-manager
spec:
  containers:
  - image: docker.io/pmorelli92/secret-manager:latest
    imagePullPolicy: Always
    name: secret-manager
    ports:
    - containerPort: 8080
    env:
    - name: GCP_PROJECT
      value: pablo-test-project
EOF

For simplicity, one can just port forward it:

kubectl port-forward secret-manager 8080:8080

For an existing secret:

curl localhost:8080/get-secret -H "secret: MY_SECRET"
> {"name":"MY_SECRET","value":"s3cr3t"}

For a non-existing secret:

curl localhost:8080/get-secret -H "secret: OTHER_SECRET"
> {"name":"OTHER_SECRET","value":"default-for-OTHER_SECRET"}

Now, let’s do another test and deploy the pod without the GCP_PROJECT environment variable but with a MY_SECRET one:

kubectl delete pod secret-manager;

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: secret-manager
spec:
  containers:
  - image: docker.io/pmorelli92/secret-manager:latest
    imagePullPolicy: Always
    name: secret-manager
    ports:
    - containerPort: 8080
    env:
    - name: MY_SECRET
      value: hello-from-env
EOF

Stop the port forwarding and start it again:

kubectl port-forward secret-manager 8080:8080

For an existing secret:

curl localhost:8080/get-secret -H "secret: MY_SECRET"
> {"name":"MY_SECRET","value":"hello-from-env"}

For a non-existing secret:

curl localhost:8080/get-secret -H "secret: OTHER_SECRET"
> {"name":"OTHER_SECRET","value":"default-for-OTHER_SECRET"}

Easy, right?

Breaking down the demo application

The application is fairly simple. On startup it reads the value of the GCP_PROJECT environment variable and passes it to the SecretGetter struct, which defines the GetSecret method.

type SecretGetter struct {
	GoogleCloudProject string
}

// GetSecret gets a secret either from environment variable or from GCP Secret Manager
func (sg SecretGetter) GetSecret(name string, fallback string) string {
	// If GCP project is not present, get value from environment variables
	if sg.GoogleCloudProject == "" {
		return getEnv(name, fallback)
	}
	...
}

// getEnv returns the value for an environment value, or a fallback if not found
func getEnv(name string, fallback string) string {
	value, ok := os.LookupEnv(name)
	if !ok {
		return fallback
	}
	return value
}

If GoogleCloudProject is empty, the value is taken from environment variables. Otherwise it does the following (shortened for brevity):

func (sg SecretGetter) GetSecret(name string, fallback string) string {
	// If GCP project is not present, get value from environment variables
	if sg.GoogleCloudProject == "" {
		return getEnv(name, fallback)
	}

	// Get the token for the service account that runs the node pool
	tokenUrl := "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
	rq, err := http.NewRequest(http.MethodGet, tokenUrl, nil)
	if err != nil {
		fmt.Println(err)
		return fallback
	}

	...

	// Get the secret value using the access_token that we fetched above
	secretUrl := fmt.Sprintf(
		"https://content-secretmanager.googleapis.com/v1beta1/projects/%s/secrets/%s/versions/latest:access",
		sg.GoogleCloudProject, name)
	rq, err = http.NewRequest(http.MethodGet, secretUrl, nil)
	if err != nil {
		fmt.Println(err)
		return fallback
	}

	rq.Header.Add("Authorization", fmt.Sprintf("Bearer %s", tokenResponse.AccessToken))
	rs, err = http.DefaultClient.Do(rq)
	if err != nil {
		fmt.Println(err)
		return fallback
	}

	secretResponse := struct {
		// Errors come back nested under "error", as in the 403 response shown earlier
		Error struct {
			Code   int    `json:"code"`
			Status string `json:"status"`
		} `json:"error"`
		Payload struct {
			Data string `json:"data"`
		} `json:"payload"`
	}{}

	...

	// In case there is an error because of privileges or oauth scopes, return the fallback
	if secretResponse.Error.Code != 0 {
		fmt.Printf("error %d - status %s\n", secretResponse.Error.Code, secretResponse.Error.Status)
		return fallback
	}

	// Secret Manager returns the secret on base64
	data, err := base64.StdEncoding.DecodeString(secretResponse.Payload.Data)
	if err != nil {
		fmt.Println(err)
		return fallback
	}

	return string(data)
}

And then for the API handler:

// getSecretHandler gets the secret value according to the name sent on the header
func getSecretHandler(secretGetter SecretGetter) http.HandlerFunc {
	return func(w http.ResponseWriter, rq *http.Request) {
		// Only work with GET requests
		if rq.Method != http.MethodGet {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}

		// Fetch the secret name on the header
		secretName := rq.Header.Get("secret")
		if secretName == "" {
			w.WriteHeader(http.StatusBadRequest)
			return
		}

		// Create the struct definition for the response
		bytes, err := json.Marshal(struct {
			Name  string `json:"name"`
			Value string `json:"value"`
		}{
			Name: secretName,
			// Use the secret getter to get the secret or the fallback
			Value: secretGetter.GetSecret(secretName, fmt.Sprintf("default-for-%s", secretName)),
		})
		if err != nil {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}

		// Return the secret value
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write(bytes)
	}
}

Of course, there is a catch. When working in a local environment with GCP_PROJECT set, only the default value will be returned, because the call that fetches the instance token only works from inside GCP:

Get "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token": dial tcp: lookup metadata.google.internal: no such host
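One possible way around this, sketched here as an idea rather than something the demo application does, is to ignore GCP_PROJECT when the metadata host name cannot be resolved, so the application falls back to environment variables. The names metadataAvailable, effectiveProject and lookupFunc are invented for this example.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc matches the signature of net.LookupHost so the resolver
// can be replaced by a fake.
type lookupFunc func(host string) ([]string, error)

// metadataAvailable reports whether the metadata host name resolves.
func metadataAvailable(lookup lookupFunc) bool {
	addrs, err := lookup("metadata.google.internal")
	return err == nil && len(addrs) > 0
}

// effectiveProject returns an empty project when the metadata server
// cannot be resolved, which makes the caller use environment variables.
func effectiveProject(project string, lookup lookupFunc) string {
	if project != "" && !metadataAvailable(lookup) {
		return ""
	}
	return project
}

func main() {
	// Prints the project when run inside GCP and "" when the host does not resolve.
	fmt.Printf("%q\n", effectiveProject("pablo-test-project", net.LookupHost))
}
```

The trade-off is that a misconfigured cluster would silently fall back to environment variables, so logging the decision would be a good idea.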

Summary

  • (+) Secret Manager has a very intuitive API and a nice dedicated UI where developers (if they have the required IAM roles) can access secrets, see their versions and configure expiration policies.
  • (+) Moving away from storing secrets in plain text in environment variables is a step in the right direction for securing our applications.
  • (+) Being able to implement this without interfering with the local environment is one of the selling points.
  • (-) Vendor lock-in. It is understandable that this product is not directly interchangeable. That said, there are alternatives like HashiCorp Vault that offer the same (and maybe more) functionality. However, if you are on GCP you have probably already invested time in cloud-specific built-in solutions like Cloud SQL and Cloud Storage, which are not directly interchangeable either.

Demo application repository

👉 Repository 👈