Companies are developing software in the cloud in a big way. The cloud has opened up a world of possibilities for application makers, enabling flexible architectures and ever more efficient ways of working. However, it also presents a number of new risks. There’s no shortage of tools for scanning and detecting flaws in infrastructure, code, processes, and more. If you’re still trying to use a tried-and-true software development lifecycle (SDLC) from your non-cloud days, things can get hairy fast.
To make sense of how the cloud has changed development, and to understand best practices for a secure SDLC, I caught up with senior solutions engineer Matt Brown, who just published <this handy guide> to help practitioners navigate and secure application creation in the cloud and figure out which tools can help along the way.
Q: First things first, Matt, how does the cloud (and, especially, multi-cloud) change a company’s SDLC?
A: The cloud has offered companies an opportunity to develop software much more efficiently and flexibly. They’re able to deliver value faster and in more targeted ways to internal and external clients. I’m blown away by what many of our customers have achieved in just a few short years of their cloud transformation journeys. Several did a “lift and shift” of an existing handful of applications to take advantage of some of the modern features and integrations available in the cloud, but once they saw what was possible, and how meaningful the differences were, that handful turned into hundreds, even thousands, of applications. That’s awesome, but of course it also brings a raft of security issues.
Q: Okay, you’ve piqued my interest. What are some of those security issues?
A: By running their pipelines soup-to-nuts in the cloud, with multiple, disparate services doing everything from hosting code to defining infrastructure to running quality assurance (QA) tests, and then scaling capacity on demand in their cloud platforms, companies introduce a ton of complexity. And anytime you have complexity, there’s opportunity for exploitation. Now add in factors like a distributed architecture, fragmented teams, disparate services delivered on their own timeframes (as opposed to one monolithic application delivered on an orchestrated timeframe), and the sheer speed of delivery. Software that took you half a year to deliver (a quarter if you were really efficient) now gets pushed multiple times a day and scaled up or down to meet demand in containerized environments. So, not only are there more opportunities for exploitation, but any problems you do introduce are going to be magnified across your cloud. That means your security detections are going to be really noisy, it’ll be hard to identify the root cause of any one problem, and it’ll be nearly impossible to be certain that your remediation effort will solve it.
Q: So, how does that look in your SDLC? And what do you need to do differently in the cloud?
A: A more traditional SDLC has six defined stages: plan, code, build, test, deploy, and production/monitor.
There are many ways people draw the stages of a fast-moving cloud SDLC, including as an “infinity” diagram to emphasize its continuous nature. But for simplicity, I’ll lay the steps out the same way and map the sub-steps and tools within each.
Q: Wouldn’t design always be in your plan phase? What’s different in the cloud?
A: I think the big difference with planning your secure SDLC in the cloud is the level of coordination that’s required. Because the architecture allows for distributed teams to work independently, and because people can work so much faster than before, they need to be more intentional and explicit about how they’re going to address problems when they do arise. What will our pipelines look like? Will we adhere to a “gold standard” set of images? How will we roll them out? How will we govern access? What will our testing methodology be? How will we handle exceptions? Which tools will we use to monitor security? How quickly must we remediate detected issues? All of these questions need to be addressed in the planning phase and understood and agreed to by all involved, including not just developers but also security.
One trend is for high-performing teams to “shift left” as many security detections as possible, which means doing things like static code analysis in the code phase rather than later in the lifecycle. But to do that, everybody needs to be in on the plan, especially since it’s typically someone on the security team administering the analysis but someone in development who’s on the hook to fix any detected code problems. This kind of coordination isn’t easy, but it’s essential in fast-moving, continuous development.
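To make that concrete, here’s a minimal sketch of a shift-left gate: a pre-commit hook that runs Semgrep (one of the code-phase tools mentioned below) and blocks the commit when it reports findings. The ruleset and fail-on-anything policy are illustrative assumptions, not a one-size-fits-all recommendation.

```python
#!/usr/bin/env python3
"""Minimal pre-commit SAST gate: block the commit on Semgrep findings.

Assumes Semgrep is installed (pip install semgrep); the "auto" ruleset
and zero-tolerance policy are illustrative, not a recommendation.
"""
import json
import subprocess
import sys

def main() -> int:
    # Scan the working tree with a community ruleset, JSON output.
    result = subprocess.run(
        ["semgrep", "--config", "auto", "--json", "--quiet"],
        capture_output=True, text=True,
    )
    findings = json.loads(result.stdout).get("results", [])
    for f in findings:
        print(f"{f['path']}:{f['start']['line']}  {f['check_id']}")
    if findings:
        print(f"Blocking commit: {len(findings)} finding(s) to fix or triage.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```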
Some of the tools to consider in the plan phase include <IriusRisk>, <Secure Code Warrior>, <Security Compass>, and <Snyk Learn>.
Q: I see you recommend both SAST and SCA in the code phase. Why?
A: If you’re writing code yourself, static application security testing (SAST), along with its offshoots, dynamic and interactive application security testing, has historically been a great tool. But the reality is that the majority of software these days comes from third parties, notably open source. And that’s where software composition analysis (SCA) comes in. It’s critical to understand not just the vulnerabilities you’re introducing in your own code, but also those that open source developers have included in the libraries you’re using in your software. If you make it easy to identify that software bill of materials (SBOM) and its origins ahead of time, traceability becomes much easier when the inevitable flaw does get introduced. You’re trading a little bit of pain now to avoid a lot of pain down the road, for both security and development.
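As a toy illustration of what SCA does under the hood, here’s a sketch that asks the public OSV.dev vulnerability database about a single pinned open source dependency. Real SCA tools do this across your entire dependency graph and SBOM; the package and version below are arbitrary examples.

```python
"""Query the OSV.dev database for known vulnerabilities in one dependency.

A toy version of what SCA tools do across an entire SBOM; the package
and version below are arbitrary examples.
"""
import json
import urllib.request

def query_osv(name: str, version: str, ecosystem: str = "PyPI") -> list:
    payload = json.dumps({
        "version": version,
        "package": {"name": name, "ecosystem": ecosystem},
    }).encode()
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

if __name__ == "__main__":
    for vuln in query_osv("jinja2", "2.4.1"):
        print(vuln["id"], "-", vuln.get("summary", "(no summary)"))
```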
Some of the tools to consider in the code phase include <Checkmarx SAST>, <Checkmarx SCA>, <Fortify>, <FOSSA>, <GitHub Dependabot>, <GitHub Advanced Security with CodeQL>, <GitLab>, <GitLab Dependency Scanning (formerly Gemnasium)>, <JFrog Xray>, <Mend>, the <OWASP Dependency Check>, <Semgrep>, <ShiftLeft>, <Snyk Code>, <Snyk Open Source>, <SonarQube>, <Sonatype Nexus Lifecycle>, <Synopsys (Black Duck)>, <Synopsys (Coverity)>, and <Veracode>.
Q: It looks like you’ve segmented the build phase into components like infrastructure, containers, and artifacts. What’s the magic there?
A: One of the beautiful things about modern cloud architectures is that they allow us to get really modular, which helps with component validation and re-use. We can test something out, mark it as “good,” and then use it again and again. That lets us go fast, with fewer surprises. That’s why I recommend segmenting out your artifacts, infrastructure as code (IaC), and container images and scanning them separately. By testing them before deployment, you run a much lower risk that they’ll cause issues in production and add to your alert noise. You can also rule them out more quickly, which cuts your troubleshooting time in production, when the stakes are very high. And, of course, maintaining them in one place ensures that any remediations you make will count, because you’ve fixed them at the source. That’s why it’s a best practice to keep your code in your repository, your container images in a separate registry, and your infrastructure definitions in yet another place.
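Here’s a bare-bones sketch of what enforcing a “gold standard” image set can look like in the build pipeline: fail the build if any Dockerfile pulls a base image that isn’t on an approved list. The registry paths are hypothetical, and a real pipeline would pin image digests rather than tags.

```python
"""Fail the build if a Dockerfile uses a base image that isn't on the
approved "gold standard" list. The registry entries are hypothetical;
a real pipeline would pin digests and fetch the list from a registry.
"""
import pathlib
import re
import sys

APPROVED_BASES = {  # hypothetical gold-standard images
    "registry.example.com/base/python:3.11-slim",
    "registry.example.com/base/nginx:1.25",
}

# Matches the image reference in each FROM line (multi-stage aliases
# introduced with "FROM ... AS name" would need extra handling).
FROM_RE = re.compile(r"^\s*FROM\s+(\S+)", re.IGNORECASE | re.MULTILINE)

def main() -> int:
    bad = []
    for dockerfile in pathlib.Path(".").rglob("Dockerfile*"):
        for image in FROM_RE.findall(dockerfile.read_text()):
            if image not in APPROVED_BASES and image.lower() != "scratch":
                bad.append((dockerfile, image))
    for path, image in bad:
        print(f"{path}: unapproved base image {image}")
    return 1 if bad else 0

if __name__ == "__main__":
    sys.exit(main())
```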
Some of the tools to consider in the build phase include <Anchore>, <Aqua>, <AWS Elastic Container Registry>, <Azure Container Registry>, <Checkov (by Bridgecrew, Prisma Cloud, Palo Alto)>, <Checkmarx KICS>, <Clair (by Quay)>, <GitHub Container Registry>, <Google Container Registry>, <JFrog Artifactory>, <Lacework>, <Microsoft Defender for Containers>, <Prisma Cloud (by Palo Alto Networks)>, <Qualys>, <Snyk Container>, <Snyk IaC>, <Sonatype Nexus Repository>, <Sysdig>, <Terrascan (by Tenable)>, and <Wiz>.
Q: Tell me why you recommend fuzzing in the testing phase.
A: The testing phase is where the rubber hits the road. That’s where you really start to see the application in action and put it through its paces. It’s a great place to suss out code flaws that show up when you give the application invalid or malformed input. In a cloud environment, where these vulnerabilities can be exploited in a variety of ways (code injection, buffer overflows, cross-site scripting), this kind of testing is especially important.
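To give a flavor of what that looks like, here’s a minimal coverage-guided fuzz harness using Google’s open source Atheris; the standard-library JSON parser stands in for your own input-handling code, and the dedicated fuzzing tools listed below do the same job at much greater scale.

```python
"""Minimal coverage-guided fuzz harness using Atheris (pip install atheris).
The standard-library JSON parser stands in for your own input handling;
crashes or uncaught exceptions indicate flaws in handling malformed input.
"""
import sys

import atheris

with atheris.instrument_imports():
    import json  # stand-in for the code under test

def test_one_input(data: bytes) -> None:
    try:
        json.loads(data.decode("utf-8", errors="replace"))
    except json.JSONDecodeError:
        pass  # cleanly rejecting malformed input is the desired behavior

if __name__ == "__main__":
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()
```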
Q: And how about secrets in code?
A: Another important thing to test your application for is the presence of secrets: sensitive information in your codebase such as passwords, API keys, encryption keys, authorization tokens, and more. Look for those as early as possible in your SDLC, but definitely check again in the test phase, because that’s the best place to tell whether those secrets are exposed and hackable. With our secrets detection here at Dazz, we can find which secrets are live in your production environment and then trace them back to their source with precise pipeline mapping so you can eradicate them for good.
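To show the basic idea (which dedicated secrets tools do far more thoroughly, with richer pattern sets, entropy analysis, and live validation), here’s a toy scanner that walks a repo looking for a couple of well-known credential formats.

```python
"""Toy secrets scanner: flag a few well-known credential patterns in a repo.
Illustrative only; dedicated tools use far larger pattern sets, entropy
checks, and verification against live services.
"""
import pathlib
import re
import sys

PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan(root: str = ".") -> int:
    hits = 0
    for path in pathlib.Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                line = text.count("\n", 0, match.start()) + 1
                print(f"{path}:{line}: possible {label}")
                hits += 1
    return hits

if __name__ == "__main__":
    sys.exit(1 if scan() else 0)
```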
Some of the tools to consider in the test phase include <BluBracket>, <Burp Suite (by PortSwigger)>, <ForAllSecure>, <GitGuardian>, <GitHub>, <GitLab>, <Google OSS-Fuzz>, <Nightfall>, and <Synopsys Defensics>.
Q: So what’s with all of that scanning during the deployment phase? Aren’t you supposed to be “shifting left”?
A: There’s a lot you can see at runtime that you can’t see earlier, even in testing. So, deploy dynamic application security testing (DAST), runtime application self-protection (RASP), interactive application security testing (IAST), and API security testing to observe your running application from both internal and external perspectives, add protection capabilities, and watch for unauthorized API access or abuse. API security is especially relevant in cloud environments because so much of the environment hangs together with APIs, and malicious API traffic is a rapidly growing share of total API traffic.
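As a tiny taste of API security testing, here’s a sketch that probes a few endpoints with no credentials at all and flags any that answer with data; the base URL and paths are hypothetical stand-ins for your own API.

```python
"""Probe API endpoints without credentials and flag any that answer 2xx.
The base URL and paths are hypothetical; real API security tools also
test authorization gaps, injection, and abuse patterns.
"""
import urllib.error
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical target
PATHS = ["/v1/users", "/v1/orders", "/v1/admin/config"]  # hypothetical

def probe(path: str) -> None:
    req = urllib.request.Request(BASE_URL + path)  # deliberately no auth header
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            # A 2xx without credentials suggests broken authentication.
            print(f"ALERT {path}: HTTP {resp.status} without credentials")
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            print(f"ok    {path}: rejected as expected ({err.code})")
        else:
            print(f"check {path}: unexpected HTTP {err.code}")

if __name__ == "__main__":
    for p in PATHS:
        probe(p)
```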
Some of the tools to consider in the deploy phase include <42Crunch>, <Astra Pentest>, <Contrast Assess>, <Contrast Protect>, <CyberRes Fortify WebInspect>, <Detectify>, <ForAllSecure — Mayhem for API>, <Imperva>, <Invicti (formerly NetSparker)>, <K2>, <Noname Security>, <Rapid7 Insight AppSec>, <Salt Security>, <StackHawk>, <Synopsys Seeker>, <ThreatX>, and <Veracode (formerly Crashtest Security)>.
Q: With all that scanning in the pipeline, what’s left to do in production?
A: Well, just because an application moves into production doesn’t mean it’s frozen there. New vulnerabilities get discovered, new releases introduce new flaws, and the cloud infrastructure changes. So, in production, you have to keep monitoring: not just the application with application performance monitoring (APM), but also your cloud infrastructure with cloud security posture management (CSPM) and your workloads with a cloud workload protection platform (CWPP). And because things are ever-changing in the cloud, you also have to watch for those changes with post-deployment IaC drift tools.
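On the drift front, here’s a minimal sketch that leans on Terraform’s documented plan exit codes to detect when live infrastructure has diverged from what’s declared; run on a schedule, it’s a bare-bones version of what dedicated drift tools automate.

```python
"""Bare-bones IaC drift check using Terraform's -detailed-exitcode flag:
exit 0 = no changes, 2 = drift (live state differs from code), 1 = error.
Assumes terraform is installed and the working directory is initialized.
"""
import subprocess
import sys

def check_drift(workdir: str) -> int:
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-no-color", "-input=false"],
        cwd=workdir, capture_output=True, text=True,
    )
    if result.returncode == 0:
        print(f"{workdir}: no drift detected")
    elif result.returncode == 2:
        print(f"{workdir}: DRIFT detected; live state differs from IaC")
        print(result.stdout[-2000:])  # tail of the plan for context
    else:
        print(f"{workdir}: terraform error\n{result.stderr}")
    return result.returncode

if __name__ == "__main__":
    sys.exit(check_drift(sys.argv[1] if len(sys.argv) > 1 else "."))
```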
Some of the tools to consider in the production phase include <AppDynamics (by Cisco)>, <Bridgecrew (by Prisma Cloud, Palo Alto)>, <CrowdStrike Falcon Horizon>, <Datadog APM>, <Dynatrace>, <Env0>, <Lacework>, <Microsoft Defender for Containers>, <Orca>, <Prisma Cloud (by Palo Alto Networks)>, <Prisma Cloud Compute (by Palo Alto Networks)>, <Qualys Container Runtime Security>, <Snyk Cloud>, <Snyk IaC>, <Sonrai Security>, <Sysdig>, <Tenable.cs>, and <Wiz>.
Q: So, what’s next?
A: I recommend downloading <A Guide to Building a Secure SDLC>. And if you want to dig further into the security of your SDLC, we can provide a <complimentary assessment> covering the overall security of your environment, pipeline and access governance, alert noise and overlap, the presence of secrets in your code, root cause and source analysis, fix recommendations, and a roadmap for ongoing, developer-driven remediation.