
Things I wish I knew about Cloud (Security) before I became a CISO | Part 3: I’m in the cloud — now what do I do?

  • Anton Horn
  • Dec 19, 2025
  • 10 min read

This article is the final part of a 3-part series in which I break down the issues I encountered in understanding public cloud environments and how to secure them. I went through this learning journey when I started my CISO job at Allianz Direct, a German insurance company that was doing direct-to-consumer business worth over €1bn annually and running exclusively on public cloud infrastructure.

In this final part, I will break down what I would do now if I were made responsible for securing the cloud assets of a larger organization: both what I would do immediately and which processes I would set up to secure it long-term.

I try to stick to concepts and abstractions of the relevant technology portions. Obviously, all of this revolves around tech, but I’m trying not to go too deep into it.

I thought primarily of the following groups of people when I wrote this:

  • Security GRC Professionals

  • Auditors

  • “Non-technical” Security Executives

  • Other tech professionals who want to understand how GRC people think about the Cloud

Disclaimer: I’m assuming you already know about the shared responsibility model, and I’m not focusing much on SaaS-specific concepts or privacy aspects of Cloud usage. Also, when I talk about Cloud I generally mean the three hyperscalers Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP).


Securing an inherited cloud estate needs urgent action and strategic planning

In the following sections I will talk about how I would set up a cloud security program, both short-term and long-term.

There are a couple of common scenarios where you need to overhaul your cloud security setup, for example:

  1. Following an incident

  2. During a company acquisition or divestiture

  3. When security catches up with a part of the organization that is already working in the cloud

I would treat all three scenarios quite similarly: first urgently fix the glaring holes, then build a sustainable security program that continuously improves and manages risks.

Run the first month like an incident

Let’s say your company acquired another company and you’re suddenly responsible for securing their cloud estate. What do you do?

First of all, I would treat this much like an incident: handle it with high urgency and focus only on the most critical issues.

I would primarily start with these three tasks:

  1. Start by identifying critical vulnerabilities

  2. Create a basic ability to detect active threats

  3. Prioritize and close urgent compliance tasks

As part of any security initiative, one of the most important things is to understand the organizational structure, as well as roles and responsibilities of the organization, but you can’t avoid that anyway when working on the other topics. So I’m not addressing it as a separate issue.

You don’t really have the time to first complete a vulnerability scan, then do some log analysis etc. — so all of these things are happening largely in parallel.

Start by identifying critical vulnerabilities

I would always start by Red Teaming the cloud estate and all public-facing assets on it. This should happen both from an external and internal (assumed breach) perspective.

While we’re at it, just go ahead and launch a spear-phishing campaign as well — this is not cloud-specific but if I were in charge of securing a newly acquired company, it would be high on my list.

At the end of the Red Team engagement, I would also recommend conducting a high-level code review. Throughout all of these tests, I would make clear to the testing team that the focus should be on emulating a vaguely interested adversary. It’s nice hygiene information to find out that some Content Security Policies were never fixed, but if the testing team can’t build a real exploit out of it, it’s not that important for now.

While performing this Red Team engagement, start turning on the native security tooling of the cloud accounts in use, such as AWS Security Hub, GCP Security Command Center and Microsoft Defender for Cloud (formerly Azure Defender), and specifically the features that identify vulnerabilities. Focus on Cloud Security Posture Management (CSPM) first and then turn on other features like Software Composition Analysis (SCA) or Dynamic Application Security Testing (DAST).

You will have to wade through a sea of findings but I would start by looking at the critical ones and put those in a backlog for quick validation.
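
As a minimal sketch of that triage step, assume the findings have been exported from the native tooling into a list of dictionaries. The field names (`id`, `severity`) and the severity labels are hypothetical and do not reflect any provider’s actual export schema.

```python
# Hypothetical severity labels; real tools use their own scales.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def _rank(finding: dict) -> int:
    """Map a finding to a sortable rank; unknown severities sort last."""
    label = str(finding.get("severity") or "").upper()
    return SEVERITY_ORDER.get(label, len(SEVERITY_ORDER))


def build_validation_backlog(findings, max_severity: str = "CRITICAL") -> list:
    """Return findings at or above max_severity, most severe first."""
    cutoff = SEVERITY_ORDER[max_severity]
    selected = [f for f in findings if _rank(f) <= cutoff]
    return sorted(selected, key=_rank)


if __name__ == "__main__":
    sample = [
        {"id": "F-1", "severity": "low"},
        {"id": "F-2", "severity": "CRITICAL"},
        {"id": "F-3", "severity": "high"},
    ]
    for finding in build_validation_backlog(sample):
        print(finding["id"], finding["severity"])
```

Starting with only the critical findings keeps the backlog small enough to validate by hand; widening `max_severity` to `"HIGH"` is the obvious next step once those are cleared.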

Lastly, and just because it’s such a prevalent issue, specifically have a look at storage buckets exposed to the internet and get confirmation that all of them actually need to be public.
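
To illustrate what “exposed to the internet” can look like in practice, here is a rough sketch that inspects an AWS-style bucket policy document (the JSON structure with `Statement`, `Effect` and `Principal`). It is an assumption-laden simplification: it only looks at the policy, ignores ACLs and account-level public access settings, and skips statements that carry a `Condition`, which would need a manual look.

```python
def _as_list(value):
    """Policy fields may hold a single item or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allows_public_access(policy: dict) -> bool:
    """True if an unconditional Allow statement grants access to everyone ("*")."""
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        if statement.get("Condition"):
            continue  # conditional grants are skipped here and need manual review
        principal = statement.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict) and "*" in _as_list(principal.get("AWS")):
            return True
    return False


if __name__ == "__main__":
    example_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::example-bucket/*",  # hypothetical bucket
            }
        ],
    }
    print(allows_public_access(example_policy))
```

Every bucket this flags should end up with a named person confirming that it really needs to be public.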

Create a basic ability to detect active threats

While conducting this once-over to find any vulnerabilities that could get you into trouble, work under the assumption that you’ve already been breached.

This means you need to perform a compromise assessment to look for any evidence of attackers having successfully breached you. This could include:

  • Suspicious incoming or outgoing traffic

  • Compute instances that no one is responsible for

  • Suspicious API calls in the orchestration layer

  • Suspicious identities

You should focus both on logs from the orchestration layer, as well as those of applications and infrastructure/network security tools (Web Application Firewalls for example).
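
A first pass over orchestration-layer logs can be very simple. The sketch below assumes the audit events have already been parsed into dictionaries with hypothetical `identity` and `action` fields, and that you have a list of identities you know about. The action names are illustrative examples of calls worth a second look, not a complete or authoritative list.

```python
# Illustrative examples of API calls that deserve a second look.
SENSITIVE_ACTIONS = frozenset(
    {"CreateAccessKey", "DeleteTrail", "StopLogging", "PutBucketPolicy"}
)


def flag_suspicious_events(events, known_identities, sensitive_actions=SENSITIVE_ACTIONS):
    """Return (event, reasons) pairs for events that warrant manual review."""
    flagged = []
    for event in events:
        reasons = []
        if event.get("identity") not in known_identities:
            reasons.append("unknown identity")
        if event.get("action") in sensitive_actions:
            reasons.append("sensitive action")
        if reasons:
            flagged.append((event, reasons))
    return flagged


if __name__ == "__main__":
    sample_events = [
        {"identity": "alice", "action": "ListBuckets"},
        {"identity": "mallory", "action": "ListBuckets"},
        {"identity": "alice", "action": "StopLogging"},
    ]
    for event, reasons in flag_suspicious_events(sample_events, {"alice"}):
        print(event["identity"], event["action"], reasons)
```

This is deliberately crude. The point in the first month is to surface anything obviously wrong, not to build a detection pipeline.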

Additionally, let’s turn on the native security tooling for threat detection, such as Amazon GuardDuty, GCP Security Command Center and Microsoft Defender for Cloud (formerly Azure Defender). AWS seems to provide different services for different security capabilities, while Azure and GCP bundle a lot of basic security features into one tool.

Prioritize and close urgent compliance tasks

While the topics of vulnerabilities and threat detection are critical to ensure you’re not exposed to any active threats, you shouldn’t ignore your governance and compliance obligations.

Starting with potentially the most urgent problem: If there are any findings from regulators or external auditors, make sure you have a clear timeline for their mitigation which can be met despite the chaos of the post-merger integration.

Secondly, perform a full review of the active identities. Start by looking at accounts that haven’t been used in the last 90 days. If no one knows whether they should remain active, at least disable them. Reinstating access is easier than dealing with a data breach.
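
The 90-day check can be sketched as follows, assuming you can export a mapping from identity name to a timezone-aware last-used timestamp (or `None` if the identity was never used). The function name and input shape are hypothetical; each provider exposes this data differently.

```python
from datetime import datetime, timedelta, timezone


def find_stale_identities(last_used: dict, now=None, max_idle_days: int = 90) -> list:
    """Return names of identities unused for more than max_idle_days.

    last_used maps identity name -> timezone-aware datetime of last use,
    or None if the identity has never been used.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        name for name, timestamp in last_used.items()
        if timestamp is None or timestamp < cutoff
    )


if __name__ == "__main__":
    reference = datetime(2025, 12, 19, tzinfo=timezone.utc)
    identities = {
        "alice": datetime(2025, 12, 1, tzinfo=timezone.utc),
        "old-ci-bot": datetime(2025, 6, 1, tzinfo=timezone.utc),
        "never-used": None,
    }
    print(find_stale_identities(identities, now=reference))
```

Identities that were never used are treated as stale on purpose, since they are the ones nobody is likely to miss.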

Lastly, have a look at the structure of the accounts / projects / organizations of the cloud environment. Are all of them centrally managed? Did you overlook any of them during any of the previous stages?

With that, we’re already getting into the long-term view.

Make your long-term security program fit the company strategy

While we’re not quite as much under pressure anymore as we were in the first phase, I would still prioritize any activities in the long-term view by their urgency and importance.

I would estimate importance based on how severe my problems would be, if that measure were not in place. I would estimate urgency based on how many other measures I have in place that act as compensating controls.
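
One way to make this ranking explicit is a small sort. The 1-to-5 impact scale and the rule of ordering by urgency first and importance second are my own simplifying assumptions for the sketch. The measure names and numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    impact_if_missing: int      # hypothetical scale: 1 (minor) to 5 (severe)
    compensating_controls: int  # other measures already covering the same risk


def prioritize(measures) -> list:
    """Fewest compensating controls first (urgency), then highest impact (importance)."""
    return sorted(
        measures,
        key=lambda m: (m.compensating_controls, -m.impact_if_missing),
    )


if __name__ == "__main__":
    candidates = [
        Measure("asset tagging", impact_if_missing=2, compensating_controls=3),
        Measure("incident response", impact_if_missing=5, compensating_controls=0),
        Measure("threat detection", impact_if_missing=4, compensating_controls=0),
    ]
    for measure in prioritize(candidates):
        print(measure.name)
```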

Accordingly, I would look at the following long-term improvements, in this order:

  1. Developing an Incident Response capability

  2. Reviewing and improving threat detection capabilities

  3. Integrate additional tooling beyond the CSP-provided tools

  4. Automate vulnerability identification and prioritization

  5. Tackle privileged identities and machine identities

  6. Build a unified approach for Compliance as Code

  7. Use tagging to assign asset ownership

Trade-offs

Any security program needs to be aligned with the business and technology strategies of the organization you’re trying to protect. While there are many cloud security capabilities that offer real economic leverage, a startup still needs to prioritize differently than a multinational enterprise with over 100,000 employees (on average, their threat landscape will also look quite different).

For example, you can always optimize for automation, which makes a lot of sense when you have to cover thousands or millions of assets. However, you can also run regular point-in-time assessments to make sure your most critical risks are under control, rather than investing in infinite scalability when you have 6 months of runway before your venture funding runs out.

Developing an incident response capability

It’s always easiest to put stuff down on a piece of paper — so let’s start there: Write an incident response plan!

You need to understand who does what during a cloud incident. I would generally recommend aligning yourself with something like the NIST Incident Response Framework and at least documenting what you would do in the 4 phases. Make sure you do some kind of walkthrough with the necessary stakeholders. This can be done in just an hour, and its primary benefit is that you clarify all assumptions you put into the plan, which hopefully avoids the “oh, I’m not responsible for that and I don’t know who is” reaction during an incident.

As part of that incident response process, don’t forget to look into backup arrangements. Many managed data storage services (databases, file storage) of your cloud service providers will have some kind of built-in DR capability. Turn those features on if you don’t have a more sophisticated capability at hand.

Reviewing and improving threat detection capabilities

Here I would build on the rush-job we did in the short-term phase and simply verify that the tools provided by the Cloud Service Provider(s) actually fit your needs to detect all necessary threats.

For that it helps to build a threat model of your environment and start identifying the threat scenarios that are special to your environment. Here I would focus on application-specific detections that are not standardized in any security products.

Obviously, building a full threat detection capability, including its architecture, is quite complicated. It involves a lot of tooling that affects more than just your cloud estate, but as we’re focusing on cloud security specifically, I will limit myself to threat detection in that part of your attack surface.

There are certain cloud-specific attack scenarios that are not perfectly covered by the native threat detection tooling of the CSPs. For example, you should look at integrating spend monitoring alerts into your detection alerting pipeline, because a sudden cost spike can be the first visible sign of someone running workloads on your account. Additionally, check how well infrastructure drift gets detected or alerted on. Infrastructure drift is the delta between the infrastructure as defined via code and the actual infrastructure in your cloud environment; drift could also indicate an attacker deploying resources outside of your controlled deployment process.
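
At its core, drift detection is a comparison of two inventories. The sketch below assumes both the declared state (from your infrastructure code) and the actual state (from the cloud provider’s inventory) have been flattened into dictionaries keyed by resource ID. The resource names and attributes are made up for illustration.

```python
def detect_drift(declared: dict, actual: dict) -> dict:
    """Compare resources declared in code with resources found in the environment.

    Both arguments map resource id -> configuration dict.
    """
    declared_ids, actual_ids = set(declared), set(actual)
    return {
        # Running in the cloud but not defined in code: the interesting case
        # from a security perspective.
        "unmanaged": sorted(actual_ids - declared_ids),
        # Defined in code but not found in the environment.
        "missing": sorted(declared_ids - actual_ids),
        # Present in both, but the configuration differs.
        "modified": sorted(
            rid for rid in declared_ids & actual_ids if declared[rid] != actual[rid]
        ),
    }


if __name__ == "__main__":
    declared_state = {
        "db-main": {"encrypted": True},
        "web": {"size": "small"},
    }
    actual_state = {
        "db-main": {"encrypted": False},
        "web": {"size": "small"},
        "miner-01": {"size": "xlarge"},
    }
    print(detect_drift(declared_state, actual_state))
```

The `unmanaged` bucket is the one to route into your alerting pipeline first, since it covers resources created outside the controlled deployment process.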

Integrate additional tooling beyond the CSP-provided tools

Threat Detection is just one part of your cloud security architecture where the tools provided by the Cloud Service Providers (CSPs) are not always sufficient.

Most importantly, if you’re a customer of multiple CSPs, it becomes more and more useful to integrate vendor-agnostic tooling. This will include unification of

  • Cloud Security Posture Management

  • Cloud Detection & Response

  • Cloud Workload Protection

  • Or CNAPP as a combination of all three

Additionally, the cloud security space seems to be moving more and more towards an overarching software security offering. For example, Wiz started out clearly focused on cloud security and is now expanding into SAST, while Snyk started out scanning application code and its dependencies and now also covers container and Infrastructure as Code security.

This largely stems from the fact that the boundaries between code and infrastructure are just vanishing.

There is an awesome episode from the cloud security podcast that was recently released with James Berthoty, breaking down the current space of cloud security tooling, including a few rants about the mess of it all.

Automate vulnerability identification and prioritization

Once you have improved your capabilities for responding to and detecting threats, it’s time to beef up your capacity to prevent them.

There are two processes we need to integrate security into:

  1. the software release process

  2. the work planning process

Software is now usually released through a “Continuous Integration / Continuous Deployment” (CI/CD) pipeline, which can run tests and add governance, validation and traceability to software releases.

Where it becomes interesting for security is that we can add security-specific tests here. Those can either be informative for us and act as lagging indicators of identified security issues (e.g. vulnerabilities) or act as quality gates that prevent software being pushed into production below a certain security threshold (e.g. no high/critical vulnerabilities).

When we implement fixed security gates, we don’t need to consider how this work gets integrated into the planning of teams working on the software, because a blocked release forces the fix. However, if we’re not ready for that yet, we simply use the CI/CD pipeline to identify security issues and then make sure those are integrated into the normal planning process of each team.

Unless you’re >95% sure about the validity of your security findings, I would not create quality gates yet — you will just make everyone annoyed that you’re blocking the release process with nonsense.
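
The difference between the informative mode and a hard quality gate can be sketched as a single switch. This is a simplified illustration, assuming scanner findings arrive as dictionaries with a hypothetical `severity` field. How you call it from your pipeline depends entirely on your CI/CD tooling.

```python
def gate_decision(findings, blocking_severities=frozenset({"CRITICAL", "HIGH"}), enforce=False):
    """Decide what the pipeline should do with a set of scanner findings.

    Returns ("pass" | "report" | "block", list of findings above the threshold).
    With enforce=False the pipeline only reports; with enforce=True it blocks.
    """
    above_threshold = [
        f for f in findings if str(f.get("severity", "")).upper() in blocking_severities
    ]
    if not above_threshold:
        return "pass", []
    return ("block" if enforce else "report"), above_threshold


if __name__ == "__main__":
    scan_results = [
        {"id": "VULN-1", "severity": "critical"},
        {"id": "VULN-2", "severity": "low"},
    ]
    decision, relevant = gate_decision(scan_results, enforce=False)
    print(decision, [f["id"] for f in relevant])
```

Running with `enforce=False` for a while and tracking how many of the reported findings turn out to be valid is one way to find out whether you are past that confidence threshold before flipping the switch.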

This also requires proper reporting on your identified security issues. This means you need to make sure the owners of the security problems are aware of said problems and have the necessary information on how to fix them.

Tackle privileged identities and machine identities

Some identity management topics don’t materially change in the cloud, but some really do.

Just to mention this at least once: Make sure your “super admin” or root account is not used for any operational activities. This account should only be used to create another privileged identity or if any services don’t work without it. Also don’t forget to enable multi-factor authentication for it.

When building a comprehensive cloud security program, tackling privileged identities is going to be critical.

You need to understand which personas/teams in your organization have high-level access to your cloud environment and build a privileged access management process around it. Make sure all logging is enabled, so you can monitor and detect any suspicious activity.

Machine identities are much more prevalent in cloud environments, because anything a service does in the cloud requires you to first define the necessary privileges for its role.

For both privileged and machine identities, use the IAM tools the cloud providers offer to validate that there are no overprovisioned authorizations.
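
Conceptually, finding overprovisioned authorizations means comparing what an identity is allowed to do with what it actually does. The sketch below assumes you can export both as sets of permission strings per identity. The permission names are invented, and wildcard grants are not expanded here, which the providers’ own tools handle far better.

```python
def unused_permissions(granted: dict, used: dict) -> dict:
    """Return, per identity, the permissions that were granted but never used.

    granted and used both map identity name -> set of permission strings.
    Identities with no unused permissions are left out of the result.
    """
    result = {}
    for identity, permissions in granted.items():
        unused = set(permissions) - set(used.get(identity, set()))
        if unused:
            result[identity] = unused
    return result


if __name__ == "__main__":
    granted_permissions = {
        "ci-deployer": {"storage.read", "storage.write", "iam.admin"},
        "reporting-job": {"storage.read"},
    }
    observed_usage = {
        "ci-deployer": {"storage.read", "storage.write"},
        "reporting-job": {"storage.read"},
    }
    print(unused_permissions(granted_permissions, observed_usage))
```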

Build a unified approach for Compliance as Code

I mentioned in one of my previous articles that Compliance as Code is incredibly cool, if you use it right.

So for a proper cloud security program, you should not ignore it. Each cloud service provider offers different services to apply security policies across all your cloud resources. Investigate how they work and look into applying the no-brainer policies, like ensuring that all databases are always encrypted.

Afterwards look into further building guardrails around your cloud environment to ensure security is built in and your engineering teams don’t have to worry about enabling certain features.

The next evolution of this is a tool like Open Policy Agent, which lets you write cloud-agnostic policies that also cover things like only deploying containers that run without root privileges.
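
To show the idea behind Compliance as Code without tying it to one provider, here is a small Python sketch of two such policies evaluated against a resource inventory. This is not how the CSP policy services or Open Policy Agent are actually configured (OPA policies are written in Rego). The resource fields `type`, `encrypted` and `run_as_root` are hypothetical.

```python
def database_encrypted(resource: dict) -> bool:
    """Databases pass only if encryption is explicitly enabled (fail closed)."""
    return resource.get("type") != "database" or resource.get("encrypted") is True


def container_not_root(resource: dict) -> bool:
    """Containers pass only if they are explicitly configured not to run as root."""
    return resource.get("type") != "container" or resource.get("run_as_root") is False


POLICIES = {
    "databases must be encrypted": database_encrypted,
    "containers must not run as root": container_not_root,
}


def evaluate(resources, policies=POLICIES) -> list:
    """Return (resource id, policy name) for every violated policy."""
    violations = []
    for resource in resources:
        for name, check in policies.items():
            if not check(resource):
                violations.append((resource.get("id"), name))
    return violations


if __name__ == "__main__":
    inventory = [
        {"id": "db-1", "type": "database", "encrypted": True},
        {"id": "db-2", "type": "database"},
        {"id": "c-1", "type": "container", "run_as_root": True},
    ]
    for resource_id, policy_name in evaluate(inventory):
        print(resource_id, "violates:", policy_name)
```

Note that a database with no `encrypted` attribute counts as a violation. Failing closed like this is a design choice that suits no-brainer policies.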

Use tagging to assign asset ownership

Lastly, make sure you know, at all times, the owner of every single asset in your cloud estate.

One of the best features for this is tagging, which allows you to add metadata to all of your resources. Here I would focus on the static resources and not the ephemeral ones. So make sure you tag your databases and container images but maybe not the individually running containers.

Also perform some validation that there is only one owner per resource. When it comes to vulnerability remediation, “shared” items suddenly tend to have no owners anymore.
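
That validation can be sketched as a simple check over a resource inventory. The `owner` tag key and the convention that multiple owners would show up as a comma- or semicolon-separated value are assumptions for illustration. Use whatever tagging convention your organization has agreed on.

```python
def ownership_problems(resources, owner_tag: str = "owner") -> dict:
    """Return resource id -> problem for resources without exactly one owner."""
    problems = {}
    for resource in resources:
        tags = resource.get("tags") or {}
        raw_value = str(tags.get(owner_tag, ""))
        # Treat comma- or semicolon-separated values as multiple owners.
        owners = [o.strip() for o in raw_value.replace(";", ",").split(",") if o.strip()]
        if not owners:
            problems[resource["id"]] = "no owner"
        elif len(owners) > 1:
            problems[resource["id"]] = "multiple owners"
    return problems


if __name__ == "__main__":
    inventory = [
        {"id": "db-1", "tags": {"owner": "team-payments"}},
        {"id": "img-1", "tags": {"owner": "team-a, team-b"}},
        {"id": "bucket-1", "tags": {}},
    ]
    print(ownership_problems(inventory))
```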

This was the last part of this series. Let me know what you thought.

Beyond the books and podcasts I recommended in the previous articles (one, two), for a strategic view on cloud security, security in general and enterprise risk make sure you follow Phil Venables on LinkedIn and his blog.

Also, Google Cloud just wrote an article that is somewhat similar to this one about how to secure an inherited GCP environment. It’s more of a specific how-to guide that primarily makes sense in the GCP universe (and I don’t necessarily agree with the order of recommendations but it’s still helpful).


 
 
 
