Things I wish I knew about Cloud (Security) before I became a CISO | Part 2: Security Concepts I needed to relearn in the cloud

Anton Horn
Dec 19, 2025
11 min read

This article is part 2 in a 3-part series where I break down the issues I encountered in understanding public cloud environments and how to secure them. I went through this learning journey when I started my CISO job at Allianz Direct, a German insurance company that was doing direct-to-consumer business worth over €1bn annually and running exclusively on public cloud infrastructure.

In part 1 I wrote about how the cloud is fundamentally different from traditional on-premise environments and which traditional IT concepts I had to re-learn.

In this part, I will explain the differences in security concepts and security technology in the cloud.

In part 3, I will explain what I would do now if I were made responsible for securing the cloud assets of a larger organization: both the immediate steps and the processes I would set up to secure it long-term.

I try to stick to concepts and abstractions of the relevant technology portions. Obviously, all of this circles around tech but I’m trying to stay away from going too deep into it.

I thought primarily of the following groups of people when I wrote this:

Security GRC Professionals
Auditors
“Non-technical” Security Executives
Other tech professionals who want to understand how GRC people think about the Cloud

Disclaimer: I’m assuming you already know about the shared responsibility model in the cloud and I’m not focusing much on SaaS-specific concepts or privacy aspects of Cloud usage. Also, when I talk about Cloud in this article I generally mean the three hyperscalers Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP).

Security works differently in the Cloud

As I explained in my previous article, the cloud just works differently compared to traditional IT. This is primarily due to its elastic properties, as well as the distributed, ephemeral and immutable nature of resources.

This has several effects on how security is and should be done in the cloud, affecting for example Application Security, Threat Detection and Governance. The following were the most eye-opening to me:

There are completely new parts of the attack surface
Vulnerability Scanners can light up like Christmas Trees
There are security tools that only exist in the cloud
Threat Detection is even weirder in the cloud
Security as Code sounds awesome, and is awesome (to a point)
Asset management gets easier but you need to focus on the non-ephemeral resources
Auditors are not always familiar with cloud-native environments

There are completely new parts of the attack surface

During the hiring process for my CISO job I was asked if I knew about Cloud and what security issues it entails. I wanted the job and was trying to sound confident, so I said something like “Oh sure, it’s mainly about misconfigurations” — probably because I read something about the NSA being hacked through a public S3 bucket (which did happen, and exposed storage buckets still seem to be a common security issue in the cloud).

While misconfigurations are a problem, understanding that they exist doesn’t help you understand what the hell these things are that are being configured insecurely. While there are a few things like networking misconfigs and exposed storage buckets that I grasped fairly quickly, three areas felt especially foreign to me in the beginning:

The Cloud Orchestration layer
Whatever the hell this Kubernetes stuff is
The Identities of your users and things.

Orchestrating the Cloud

The Cloud Orchestration Layer is the interface through which you manage your Public Cloud resources. For example, if you want to create a new database or virtual machine in AWS you can do this through the AWS console, which is just a SaaS application that lets you control your AWS environment. Every action that happens on this orchestration layer is also represented through API calls, which means you can automate activity and interact with this orchestration layer through other systems.

On-premise environments have no exact equivalent of this orchestration layer, which means it’s a new kind of attack surface that needs to be protected differently. For example, an attacker who gains access to it could make configuration changes such as exposing internal databases to the internet, or steal compute power for cryptomining (e.g. by creating a high-powered and very expensive virtual machine that mines bitcoins).
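Because every action on the orchestration layer is an API call, those calls can also be reviewed automatically. The following is a minimal sketch of that idea, not a real detection rule: the event structure, the action names (`ModifyDatabase`, `CreateVirtualMachine`) and the machine sizes are hypothetical and do not match any specific provider's audit log schema.

```python
# Hypothetical control-plane audit events. The action names, field names and
# machine sizes are illustrative assumptions, not a real provider's schema.
EXPENSIVE_SIZES = {"gpu-xlarge", "gpu-8xlarge"}


def risky_reasons(event: dict) -> list[str]:
    """Return the reasons (if any) why an orchestration-layer event needs review."""
    reasons = []
    params = event.get("request", {})
    # A database that is switched to "publicly accessible" is the classic misconfiguration.
    if event.get("action") == "ModifyDatabase" and params.get("publicly_accessible"):
        reasons.append("database exposed to the internet")
    # Unusually large machines are a common sign of cryptomining.
    if event.get("action") == "CreateVirtualMachine" and params.get("size") in EXPENSIVE_SIZES:
        reasons.append("expensive virtual machine created")
    return reasons


events = [
    {"actor": "ci-pipeline", "action": "CreateVirtualMachine", "request": {"size": "small"}},
    {"actor": "leaked-key", "action": "CreateVirtualMachine", "request": {"size": "gpu-8xlarge"}},
    {"actor": "leaked-key", "action": "ModifyDatabase", "request": {"publicly_accessible": True}},
]

# Collect (actor, reason) pairs for every event that triggered at least one rule.
flagged = [(e["actor"], reason) for e in events for reason in risky_reasons(e)]
for actor, reason in flagged:
    print(f"{actor}: {reason}")
```

In practice you would feed this kind of logic from your provider's audit log and let a CSPM or CDR tool do the heavy lifting. The point is only that the orchestration layer produces a reviewable trail of who changed what.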

Kubernetes sounds strange and it really is…

Kubernetes is a very complicated system that does the orchestration of containers “for you” (even if you buy it as a managed service from one of the major cloud service providers, it still seems to require a ton of operations expertise and effort). Kubernetes can also be run on-premise but it’s most widely used in cloud contexts.

As Kubernetes is taking care of managing the resources that are necessary for your running applications (incl. memory, networking, some logging and a bunch of other stuff) it’s an incredibly powerful and sensitive asset. Using one of the managed services provided by AWS, GCP or Azure secures a large part of the Kubernetes attack surface but there are still enough things that can go wrong.

Identity is the new perimeter

Identity & Access Management plays a huge role in understanding Cloud Security and it works quite differently than you would traditionally think.

For example, even activities that happen between different cloud services (e.g. a microservice changing an entry in a database) rely on an assigned identity and appropriate permissions to do these things.

While you have a lot of fine-grained control over the identities and their authorizations in the cloud, IAM is also quite complex, which can lead to over-provisioned access rights.

Misuse of identities is still one of the most common attack vectors in cloud environments, and it’s often said that “identity is the new perimeter” in the cloud context because in many cases just one leaked access key is enough to compromise an entire cloud environment (which could include every single IT asset some companies own).
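To make “over-provisioned access rights” a bit more tangible, here is a small sketch that looks for statements granting wildcard actions on all resources. It assumes a policy document shaped like AWS IAM JSON policies (`Statement`, `Effect`, `Action`, `Resource`). The policy itself, the statement IDs and the function names are made up for illustration, and a real review would also need to consider conditions, deny statements and inherited permissions.

```python
def as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements that grant wildcard actions on all resources."""
    findings = []
    for stmt in as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        # "*" means every action; "service:*" means every action of one service.
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt)
    return findings


# Illustrative policy: one narrowly scoped statement and one "admin on everything".
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReports",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-reports/*",
        },
        {"Sid": "AdminEverything", "Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

for stmt in overly_broad_statements(policy):
    print(f"Overly broad statement: {stmt['Sid']}")
```

An identity carrying a statement like the second one is exactly the kind of “one leaked key” scenario described above.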

For all three of these resource types (Cloud Orchestration Layer, Kubernetes and Cloud IAM) you need to implement security measures that are completely unique to Public Cloud environments.

Vulnerability Scanners can light up like Christmas Trees

Vulnerability management in the cloud extends beyond just your running applications and also needs to include the configuration of the three asset types I mentioned above (Orchestration Layer, Kubernetes and IAM) — I will break down the tooling for these things in the following section.

Beyond these additional resources that need to be scanned for vulnerabilities, there is an interesting and somewhat annoying consideration for Vulnerability Management: Containers can multiply certain parts of your attack surface. Each container is built on a base image that includes an operating system and potentially some pre-installed dependencies like Python or Node.js.

If you just turn on your vulnerability scanner or conduct Software Composition Analysis, 10 vulnerabilities in that base image can result in a few thousand vulnerabilities in your production environment getting picked up by your vulnerability scanning tool.

Normalizing and deduplicating this, as well as assigning the task of fixing the issues in the base image to the correct team are processes you may need to implement outside of your vulnerability scanning tool.

This also gives you the ability to fix an issue that seems incredibly wide-spread in one central place. You would just need to redeploy all the services built on that patched base image.
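The deduplication step can be sketched roughly as follows. This is a simplified illustration rather than the output format of any real scanner: the finding fields, image names and vulnerability IDs are all placeholders.

```python
from collections import defaultdict

# Placeholder scanner output: three services built on the same base image,
# each reporting the same two vulnerabilities, plus one unrelated finding.
raw_findings = [
    {"service": svc, "base_image": "example-base:1.0", "vuln_id": vuln}
    for svc in ("checkout", "quotes", "claims")
    for vuln in ("VULN-0001", "VULN-0002")
]
raw_findings.append(
    {"service": "claims", "base_image": "example-other:2.3", "vuln_id": "VULN-0003"}
)


def deduplicate(findings):
    """Collapse findings to one entry per (base image, vulnerability) pair."""
    affected = defaultdict(set)
    for finding in findings:
        key = (finding["base_image"], finding["vuln_id"])
        affected[key].add(finding["service"])
    return dict(affected)


unique_issues = deduplicate(raw_findings)
print(f"{len(raw_findings)} raw findings -> {len(unique_issues)} issues to fix")
for (image, vuln_id), services in sorted(unique_issues.items()):
    print(f"{image} {vuln_id}: affects {sorted(services)}")
```

Grouping by base image also tells you who should own the fix: the team maintaining the image, not each of the service teams that inherited the problem.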

Security as Code sounds awesome, and is awesome (to a point)

Since a lot of vulnerabilities in the cloud stem from misconfigurations, you have the very cool ability to centrally define boundaries for certain sensitive security configurations.

This is also called “Compliance as Code” and all of the Hyperscalers offer some capability like this. Generally this is achieved by centralizing your Cloud projects into a management group, where you can then apply security policies across the entire enterprise.

The important distinction I had to find out is that software engineers define compliance very differently than I would — I was thinking of it from an audit and GRC standpoint. Cloud engineers think of it rather from the perspective of CIS Benchmarks and other configuration standards.

You can implement certain configurations like a standard set of IAM roles or enabling encryption by default for all databases but you won’t be able to fix your gaps in your risk management process with these tools.

What Compliance as Code does allow you to do, though, is get almost perfect assurance that certain security requirements are met 100% in your environment. It’s a very powerful capability that made me sleep better at night.
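To illustrate the idea (not any hyperscaler's actual policy engine, each of which has its own policy language and can often block non-compliant deployments outright), here is a minimal sketch of rules expressed as code and evaluated against resource configurations. The rule IDs, resource fields and names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    applies_to: str                # resource type the rule covers
    check: Callable[[dict], bool]  # returns True when the resource is compliant


# Hypothetical rules, in the spirit of configuration benchmarks.
RULES = [
    Rule("DB-ENC-01", "Databases must be encrypted at rest", "database",
         lambda r: r.get("encrypted") is True),
    Rule("BUCKET-PUB-01", "Storage buckets must not be public", "bucket",
         lambda r: r.get("public") is False),
]


def evaluate(resources, rules=RULES):
    """Return (resource name, rule id) for every violated rule."""
    violations = []
    for resource in resources:
        for rule in rules:
            if resource["type"] == rule.applies_to and not rule.check(resource):
                violations.append((resource["name"], rule.rule_id))
    return violations


resources = [
    {"name": "customers-db", "type": "database", "encrypted": True},
    {"name": "legacy-db", "type": "database", "encrypted": False},
    {"name": "marketing-assets", "type": "bucket", "public": True},
]

for name, rule_id in evaluate(resources):
    print(f"{name} violates {rule_id}")
```

Note what the rules can and cannot express: they check technical configuration, which is why this kind of tooling won't close gaps in a risk management process.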

There are security tools that only exist for the cloud

Some of the major types of security tools include:

Cloud Security Posture Management (CSPM) to identify risky configurations in your cloud environment. A classic example would be a file storage bucket that is exposed to the internet that maybe shouldn’t be.
Cloud Workload Protection Platform (CWPP) to protect running workloads such as VMs, containers or serverless functions. For example detecting
Cloud Detection & Response (CDR) to identify and manage active threats against your cloud environment and cloud-native resources.
Cloud-Native Application Protection Platforms (CNAPP) are an aggregation of some or all of the above; they cover protection needs across all Native

Most CSPs offer at least some of these capabilities with their own tooling. At the very least they offer their own CSPM to find misconfigurations, e.g. Security Hub on AWS, Microsoft Defender for Cloud (formerly Azure Defender) on Microsoft Azure or Security Command Center on GCP.

Threat Detection can be weird in the Cloud

In threat detection in the cloud, distributed, ephemeral and immutable properties play an important role — if e.g. systems are constantly shut down and spun up again, a few things can happen:

“device” logs can already be gone when you get around to investigating an alert because the container was restarted
tying IPs and ports to a specific resource can be difficult if you haven’t set up your logging accordingly, making investigations difficult
machines applying changes to machines makes it sometimes difficult to understand who triggered a suspicious activity
while admins should not directly ssh into containers and apply changes, it still happens (maybe just to read logs) — you need to understand what is suspicious and what is not
lastly, you just can’t go into a data center and pull out the hard drive or memory chip of an affected server, so you need to do forensics quite differently

An anecdote on useless threat hunting reports

One of the first consulting projects I commissioned as a newly appointed CISO was to hire a very large firm to conduct a threat hunting engagement. I wanted to double-check that there were no attackers who had gained persistence in our environment.

Sadly, what I got back in the report were somewhat useless findings about “suspicious looking” network connections that turned out to be entirely okay.

For example, some connections used the port “43389”, which looked a bit like someone using the “Remote Desktop Protocol” (a Windows remote administration service that normally uses port 3389).

The cause of these “suspicious” port ranges was simply that Kubernetes can dynamically assign somewhat random ports as part of its tasks to manage the networking of the different containers that are running in a Kubernetes cluster.

Hence, this report was quite useless for me, and the consultant would have delivered a more helpful result if he had understood the dynamics of Kubernetes networking on some level.

Attacker’s advantage in the cloud

Attackers have a very interesting advantage in the cloud, which really needs to change the operating model for how you run threat detection and incident response: They already know what your system looks like.

All the APIs for every Public Cloud Service are already known to your attackers. Once they gain access to an overprivileged identity, they can run certain attacks completely automated, without first having to figure out how to build an exploit that exactly fits your environment. This means detecting an attack needs to happen within minutes, and building automated response actions is going to be a necessity.

Asset management gets easier but you need to focus on non-ephemeral resources

I had a couple of interesting learnings when it came to asset management in the cloud:

Shared responsibility means you don’t have to track many things you normally would
Everything as Code means there are no shadow assets (in theory)
Keeping track of ephemeral resources is a waste of time

Under the shared responsibility model, your Cloud Service Provider is responsible for managing (and securing) a significant portion of your technology stack. This also means you get no visibility into that part of the stack, making it not just pointless but also impossible to track. At least you don’t need to patch any network devices anymore.

When everything is defined as code, nothing gets hidden from you. You theoretically have visibility into every single resource that has been deployed in your environment. There are two exceptions to that:

People creating cloud accounts that you’re not aware of and that are not integrated into your security tool chain
People using the cloud like a rented data center: if someone creates a VM and then starts manually installing things and changing configurations, it’s much more difficult to centrally keep track of what’s running on that machine

There are things that make sense to keep track of in your cloud estate, but ephemeral resources are not among them. In general I would say you need to focus on things that are static, for example databases, container images or identities. It doesn’t make all that much sense to keep track of individually running containers. So if you have a security policy that states that every Virtualized Server needs to be tracked in your CMDB, you should probably rewrite that statement.
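As a rough sketch of what “focus on the static things” could look like in an inventory process, the snippet below keeps only the resource kinds named above and drops individually running containers. The resource records and the kind labels are invented for the example. In a real setup this list would come from your cloud provider's APIs or your infrastructure-as-code state.

```python
# Resource kinds worth tracking long-term, following the examples in the text.
PERSISTENT_KINDS = {"database", "container_image", "identity"}


def inventory_worthy(resources):
    """Keep only resources that are static enough to be worth tracking."""
    return [r for r in resources if r["kind"] in PERSISTENT_KINDS]


# Invented snapshot of what is currently deployed.
deployed = [
    {"name": "customers-db", "kind": "database"},
    {"name": "quotes-api:4.2", "kind": "container_image"},
    {"name": "quotes-api-7f9c", "kind": "container"},  # ephemeral, replaced on every deploy
    {"name": "quotes-api-b21d", "kind": "container"},  # ephemeral, replaced on every deploy
    {"name": "svc-quotes-api", "kind": "identity"},
]

tracked = inventory_worthy(deployed)
for resource in tracked:
    print(f"track: {resource['kind']} {resource['name']}")
```

The two running containers disappear from the inventory, but the image they were started from stays, and that image is the thing you would actually patch.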

That last part was also something that I had a few (friendly) discussions with different audit teams on.

Auditors are not always familiar with cloud-native environments

Audits of IT systems happen all the time. This includes financial reporting audits for publicly listed companies, audits by regulators, internal audits and many others (like PCI DSS compliance audits).

The problems auditors look for haven’t changed significantly over the past 2 or 3 decades because the risks haven’t really changed all that much. It’s still about ensuring Confidentiality, Integrity and Availability.

Accordingly, a lot of the controls to mitigate those risks haven’t significantly changed over a long time period. While there may have been no AI-powered pentesting 10 years ago, there definitely was pentesting and vulnerability scanning available as controls to identify and fix security issues.

Back then, deployments of new software versions happened every few weeks or maybe even just twice a year. It made sense to pentest each new version because it was economically feasible and you had just one big piece of software that needed to be tested.

Now, with containerization and CI/CD pipelines, you can suddenly deploy a couple thousand times a year and it’s no longer feasible to pentest every change. Instead you have to figure out how to identify vulnerabilities sooner and with much more automation. You may also want to look into continuous pentesting services.

When I had to explain to various audit teams how we implemented security processes differently at Allianz Direct, because the traditional approach made no practical or economic sense, I always tried to focus on the risk they were looking to understand. If they wanted to see how we prevented our systems from getting hacked through known vulnerabilities, I took them through our continuous pentesting program and how we defined significant changes that needed to be pentested, or how we used other security tools as part of the CI/CD process to increase our security assurance. Ultimately we were much better positioned than if we had done it the “old way”, but if an audit team can’t understand it, it’s going to cause other problems.

That’s it for the second part of this series. In the next article, I will explain how I would implement cloud security in a company now.

If you want to dive deeper into Cloud Security, I would recommend two podcasts: Google Cloud Security Podcast and Cloud Security Podcast, as well as two books: Alice and Bob Learn Application Security by Tanya Janca and Agile Application Security by Laura Bell. Both books focus more on building an overall well crafted DevSecOps program but they also cover Cloud Security. I also read a few general cloud security books but none of them are no-brainer recommendations like these two.