Matthew Sharp, CISO, Logicworks
Max Sobell, COO, Carve Systems
As we start 2020, many IT leaders are confronted with increased regulatory pressures, new public cloud projects, and an overwhelming number of new tools and processes to assimilate. It is no surprise that 88% of IT leaders believe compliance inhibits further cloud adoption. Sometimes it can feel like walking a tightrope with no safety net.
With the stakes higher than ever, IT leaders still don’t fully understand how public cloud affects an enterprise risk profile. Some risks that were once critical in your datacenter are now less important in public cloud. More importantly, new risks arise in public cloud that you never had to think about in your datacenter. As we move beyond the simplistic understanding of “public cloud is secure / not secure”, we need a more nuanced understanding of how to best identify the real risks that deserve your attention in 2020.
In this post, we present a common scenario: a company has already migrated a few workloads to the AWS cloud, but has designed a simplistic architecture that approximates their on-premises architecture (colloquially known as Lift and Shift). Via Threat Modeling, we identify the risks of this simplistic architecture, examine mitigations to those threats, and finally present a more secure architecture that enhances your ability to subsequently optimize cost, performance, availability, and scalability.
Understanding the Threats of Public Cloud
Moving to a public cloud doesn’t make you inherently more or less secure. However, if you don’t reconsider the new attack surface and adequately cover key areas of risk (or worse, assume that the same security tools and practices you used on-premises will simply “work” on public cloud), you’re in for some pain.
Below is an example of a cloud architecture, perhaps built by an inexperienced cloud architect, modeled after their on-premises architecture:
The basic features of this diagram are that:
- All resources are in a single Virtual Private Cloud in a single Availability Zone
- There are three Amazon EC2 instances and an RDS database behind a load balancer
- Static web assets are delivered by Amazon CloudFront (a CDN)
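To make the shape of this architecture concrete before modeling threats against it, the bullets above can be written down as data. This is only a sketch: the component names, the VPC identifier `vpc-main`, and the Availability Zone `us-east-1a` are hypothetical placeholders, not values from the diagram.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    name: str
    kind: str                   # e.g. "ec2", "rds", "load-balancer", "cdn"
    vpc: Optional[str] = None   # None for services that sit outside the VPC
    az: Optional[str] = None    # Availability Zone, when one applies


# Hypothetical inventory mirroring the bullets above.
ARCHITECTURE = [
    Component("load-balancer", "load-balancer", vpc="vpc-main", az="us-east-1a"),
    Component("web-1", "ec2", vpc="vpc-main", az="us-east-1a"),
    Component("web-2", "ec2", vpc="vpc-main", az="us-east-1a"),
    Component("web-3", "ec2", vpc="vpc-main", az="us-east-1a"),
    Component("database", "rds", vpc="vpc-main", az="us-east-1a"),
    Component("static-assets", "cdn"),
]


def concentration_notes(components: List[Component]) -> List[str]:
    """Describe how concentrated the in-VPC resources are."""
    in_vpc = [c for c in components if c.vpc is not None]
    vpcs = sorted({c.vpc for c in in_vpc})
    azs = sorted({c.az for c in in_vpc})
    notes = []
    if len(vpcs) == 1:
        notes.append(f"{len(in_vpc)} resources share one VPC ({vpcs[0]})")
    if len(azs) == 1:
        notes.append(f"{len(in_vpc)} resources share one Availability Zone ({azs[0]})")
    return notes


if __name__ == "__main__":
    for note in concentration_notes(ARCHITECTURE):
        print(note)
```

An inventory like this doubles as the list of components to walk through during threat enumeration.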
Now let’s try to understand the threats inherent in this architecture. To do this, we’ll work through a threat modeling exercise and try to think like an attacker.
Introduction to Threat Modeling
Depending upon the architecture and cloud technologies in use, organizations need to consider the attack surface and design a strategy that adequately covers key areas of risk. One approach to accomplish this is the use of Threat Modeling.
The following process flow diagram outlines the foundational steps to perform a Threat Model exercise:
Every threat modeling effort includes a structured analysis of potential security failures known as “threat enumeration”. Not everyone instinctively thinks about what could go wrong with a system from the attacker’s viewpoint. Luckily, you don’t need the knowledge of a would-be attacker to use threat modeling effectively. There are tons of brainstorming strategies to help you activate your attacker mindset.
One of the most accessible threat enumeration techniques we’ve implemented at Logicworks is STRIDE. We had help from Carve Systems, a boutique security consulting company focused on helping engineering organizations shift security left and ship secure products on time. STRIDE is a mnemonic device for common software security threats:
|Threat||Property Violated||Threat Definition||Example|
|Spoofing||Authentication||Impersonating something or someone else.||Pretending to be Bill Gates or microsoft.com, possibly including use of an SSL/TLS private key|
|Tampering||Integrity||Modifying data or code||Modifying a packet as it traverses the Internet, or modifying data in a database or on disk|
|Repudiation||Non-repudiation||Claiming to have not performed an action||“I did not send that email,” “I did not modify that file,” “I did not visit that website,” “It wasn’t me who bought those things!”|
|Information Disclosure||Confidentiality||Exposing information to someone not authorized to see it||Putting private keys in a source code repository, poor permissions allowing private server configuration files to be read, user email address enumeration|
|Denial of Service||Availability||Deny or degrade service to users||Executing an expensive operation within a web application to absorb resources or increase server scaling + compute cost, injecting code to break site rendering or backend functionality|
|Elevation of Privilege||Authorization||Gain capabilities without proper authorization||Horizontal: gaining privileges, or access to data, of another user. Vertical: gaining privileges, or access to data, of an administrative user or system functionality|
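The table above maps naturally onto a small data structure. The sketch below pairs each STRIDE category with the property it violates and generates a brainstorming worksheet, one question per component and category. The question wording and the function name `enumeration_worksheet` are our own illustrations, not part of STRIDE itself.

```python
from enum import Enum


class Stride(Enum):
    # Each value is the security property that the threat category violates.
    SPOOFING = "Authentication"
    TAMPERING = "Integrity"
    REPUDIATION = "Non-repudiation"
    INFORMATION_DISCLOSURE = "Confidentiality"
    DENIAL_OF_SERVICE = "Availability"
    ELEVATION_OF_PRIVILEGE = "Authorization"


# Hypothetical prompts, paraphrased from the definitions in the table.
PROMPTS = {
    Stride.SPOOFING: "How could someone impersonate {c} or one of its users?",
    Stride.TAMPERING: "How could data or code in {c} be modified?",
    Stride.REPUDIATION: "Could someone deny an action they took through {c}?",
    Stride.INFORMATION_DISCLOSURE: "What could {c} expose to someone not authorized to see it?",
    Stride.DENIAL_OF_SERVICE: "How could service from {c} be denied or degraded?",
    Stride.ELEVATION_OF_PRIVILEGE: "How could someone gain capabilities in {c} without authorization?",
}


def enumeration_worksheet(components):
    """Yield one (component, category, question) row per component and category."""
    for component in components:
        for category in Stride:
            yield component, category, PROMPTS[category].format(c=component)


if __name__ == "__main__":
    for component, category, question in enumeration_worksheet(["the load balancer"]):
        print(f"[{category.name} / {category.value}] {question}")
```

The worksheet only prompts discussion; the answers still have to come from the people who know the system.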
STRIDE originates from Microsoft and is still used as a part of their SDL (Secure Development Lifecycle). The challenge for developers deploying to the cloud, as this post explains, is evaluating the risk of potential threats.
Once a threat has been identified, a team must evaluate the risk the threat poses. Requirements, threats, and mitigations all interplay to determine what risk a threat poses to the business. Sometimes a threat doesn’t align with business requirements and may be safe to ignore. Remember that threats can take on many forms. They don’t always come from attackers. Abnormal application usage leading to instability can also be a threat to the business.
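One way to make that interplay tangible is a simple scoring sketch. Everything here is an assumption for illustration: the 1-to-5 scales, the likelihood-times-impact formula (a common convention, not something this post prescribes), and the rule that each mitigation lowers likelihood by one step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Threat:
    description: str
    likelihood: int        # 1 (rare) to 5 (expected); the scale is an assumption
    impact: int            # 1 (minor) to 5 (severe); the scale is an assumption
    in_scope: bool = True  # False when the threat touches no business requirement
    mitigations: List[str] = field(default_factory=list)


def residual_risk(threat: Threat) -> int:
    """Score a threat after mitigations; out-of-scope threats score zero."""
    if not threat.in_scope:
        return 0
    # Assumed rule: each mitigation lowers likelihood by one step, never below 1.
    likelihood = max(1, threat.likelihood - len(threat.mitigations))
    return likelihood * threat.impact


if __name__ == "__main__":
    threats = [
        Threat("Console credentials are guessed", likelihood=4, impact=5),
        Threat("Console credentials are guessed", 4, 5, mitigations=["MFA"]),
        Threat("Abnormal usage causes instability", likelihood=3, impact=3),
    ]
    for t in sorted(threats, key=residual_risk, reverse=True):
        print(residual_risk(t), t.description, t.mitigations)
```

The numbers matter less than the ordering they produce: the goal is a ranked list the team can argue about.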
Threat Modeling the Simple Cloud Architecture
We particularly like the STRIDE framework because there is lots of overlap between categories, which gives us extra opportunities to identify threats. In addition, it’s very fast to pick up and allows engineering teams to get right into the important activity of threat modeling. It’s important to note, though, that STRIDE only goes one direction: threat enumeration, not threat categorization. Trying to categorize threats into one of the STRIDE categories is an exercise in futility. Most threats will fit into multiple buckets, and the framework isn’t useful in that direction.
Also note that threats to a system will always be present. We use the analogy of a house: the threat of a break-in will always exist. That threat can be mitigated by securing the house: bars on the windows, a security bar on the door. Exploitation of vulnerabilities is what allows an attacker to realize the threat (e.g. an open garage door or window is a vulnerability).
In a brief exercise of roughly two hours, a Threat Model on our sample architecture might produce a list of findings and recommended mitigation steps as follows:
|#||Threat||Vulnerability||Mitigation|
|1||An attacker can access our SSL/TLS private key and impersonate our web server.||We don’t store the private key securely.||Logicworks Advanced Security Services: Threat Manager/IDS|
|2||One user can pretend to be another user and buy widgets without authorization.||We don’t use 2FA and don’t have password complexity requirements.||Application layer issue|
|3||An attacker can spoof an administrator in order to access the EC2 jump server and gain access to the other EC2 hosts and data.||We have access open to the world and the admin uses a simple password. We don’t have any sort of rate limiting or brute-force prevention.||Logicworks Advanced Security Services: Threat Manager/IDS|
|4||An attacker can access our S3 bucket, authenticating as the payment processor.||We use IP-based authentication and set the range too wide.||Logicworks Cloud Build Service|
|5||One user can tamper with another user’s widget store, lowering the prices of their widgets.||We don’t enforce authorization at the application layer.||Application layer issue|
|7||An attacker can tamper with the payment records stored in S3 to credit money to their account, or appear to have paid for something that they did not.||Bad S3 bucket permissions: universal list/read/write access.||Logicworks Data Loss Prevention|
|8||An attacker can attack the application layer via injection attacks and modify our database.||Gaps in secure development practices. No logging/monitoring and alerting at the application layer in production deployment.||Logicworks Advanced Security Services: WAF|
|9||An administrator changes security settings on our S3 bucket, CloudFront, or AWS configuration.||No alerting on AWS configuration changes / configuration drift.||Logicworks Compliance Assessment|
|10||A user can claim that they didn’t buy a particular widget, or never received payment for a widget, even though they did. We can’t prove otherwise.||No central logging, or logging disabled.||Application layer issue|
|11||An admin can claim they never accessed the web application through the jump server, even though they did.||Logging is not read-only, and the admin account is shared.||Logicworks Cloud Build Service|
|12||An attacker can view private information (account info, emails, passwords) for other users.||Application layer does not enforce authorization.||Application layer issue|
|13||An attacker can view private AWS account information via debugging or unexpected interfaces.||The instance metadata service (169.254.169.254) is exposed via an application proxy.||Application layer issue|
|14||We have our API keys for Swipe (our payment processor) in our GitHub repository. An attacker with access to the repository can execute transactions on our behalf.||We don’t rotate the keys when an employee leaves. Anyone can grant access to our repositories, and sometimes people make mistakes.||Logicworks Data Loss Prevention|
|Denial of Service|
|15||An attacker can absorb resources needed to serve other users by calling expensive application calculations at a high rate (e.g. draw a chart).||We don’t have a way to do rate limiting. We don’t have good visibility into requests coming into our server, and we don’t have a way to block abusive IPs.||AWS Shield Advanced|
|Elevation of Privilege|
|16||An attacker can exploit known flaws in the EC2 web server to gain elevated privileges on the server.||We don’t patch our servers regularly.||Logicworks Advanced Security Services: Patching|
|17||An attacker can guess the credentials for the AWS console and gain full control over our environment.||We don’t use multi-factor authentication.||Logicworks Pulse Scanner|
|18||An attacker who gains control of an EC2 instance can modify it without our knowledge.||We don’t have good monitoring or alerting on changes from a known good state.||Logicworks Pulse Scanner, Threat Manager/IDS, and FIM|
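Several of these findings (3, 4, and 7 in particular) are configuration problems that can be checked mechanically. The sketch below shows three such checks over plain dictionaries. The security group dictionary follows the shape of an entry in the EC2 DescribeSecurityGroups response and the policy dictionary follows the S3 bucket policy JSON format; the function names and the `max_addresses` threshold are assumptions of ours, and the policy check deliberately ignores `Condition` blocks and list-valued principals, so treat it as a first pass rather than a complete audit.

```python
import ipaddress


def world_open_ports(security_group: dict) -> list:
    """Return the starting ports of ingress rules open to every IPv4 address."""
    ports = []
    for rule in security_group.get("IpPermissions", []):
        for ip_range in rule.get("IpRanges", []):
            if ip_range.get("CidrIp") == "0.0.0.0/0":
                ports.append(rule.get("FromPort"))
    return ports


def cidr_too_wide(cidr: str, max_addresses: int = 16) -> bool:
    """True when an allow-listed range covers more addresses than expected."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses > max_addresses


def public_statements(bucket_policy: dict) -> list:
    """Return Allow statements that apply to any principal (Conditions ignored)."""
    statements = bucket_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    found = []
    for stmt in statements:
        principal = stmt.get("Principal")
        anyone = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and anyone:
            found.append(stmt)
    return found


if __name__ == "__main__":
    # Hypothetical inputs resembling findings 3, 4, and 7.
    jump_server_sg = {
        "IpPermissions": [
            {"FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
        ]
    }
    policy = {"Statement": [{"Effect": "Allow", "Principal": "*", "Action": "s3:*"}]}
    print(world_open_ports(jump_server_sg))
    print(cidr_too_wide("203.0.113.0/24"))
    print(len(public_statements(policy)))
```

In a real environment the inputs would come from the AWS APIs rather than hand-written dictionaries.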
Addressing the Threats Identified in Threat Modeling with Logicworks
In order to address the threats uncovered in the threat modeling exercise, we need to go back to our cloud architect and come up with a new cloud architecture. As a general comment, we can prevent a lot of re-work by doing light threat modeling early in the SDLC, as soon as the initial design is completed but before implementation begins.
To build a secure cloud architecture, your company’s internal IT team must knit together a customized combination of cloud-native tools (AWS Trusted Advisor, Security Hub), Independent Software Vendor (ISV) offerings such as cloud access security broker (CASB) solutions (Netskope, SkyHigh), cloud workload protection platforms, or CWPPs (Cavirin, Alert Logic), and container security tools (Aqua, Twistlock, Aporeto, Trend Micro, WhiteSource), along with custom-developed intellectual property such as Logicworks Data Loss Prevention.
The architecture diagram below was built by Logicworks, and shows a mature approach to building a standard 3-tier web application. Note that in this diagram, we are not just mitigating threats from the simple cloud architecture above, we are also ensuring that the environment is prepared to scale for additional business use cases.
This diagram represents the networking and cloud native service mitigations required to meet a high security standard. Note that each mitigation identified during the threat modeling exercise (see chart above) is numbered here.
A couple of important points to highlight:
- We’ve separated the management plane from the data plane; note the management VPC and the production VPC. Typically there would also be an additional Staging and Test VPC. Of course this model is infinitely extensible, and often an organization will have more SDLC tiers or can even separate applications into different VPCs.
- The business value of this setup is that the company can add additional applications in separate VPCs or in separate instances in the same VPC, without having to re-architect the whole solution. This saves time, money, and effort.
- The management VPC includes core security features, including Intrusion Detection, a Bastion Host, and Active Directory servers.
- Again, the business value of this is that the environment can grow without having to do the hard work of reproducing core security tools and functionality.
- Rather than hitting the load balancer directly, users are directed through a Web Application Firewall.
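Finding 15 in the threat model named a missing control that is often placed in front of an application: per-client rate limiting. The sketch below is a generic token bucket, included only to illustrate the idea. It is not how AWS WAF or AWS Shield Advanced work internally, and the class name, parameters, and injectable `clock` are our own choices.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock   # injectable so the limiter can be tested without sleeping
        self.buckets = {}    # client id -> (tokens remaining, time of last request)

    def allow(self, client: str) -> bool:
        now = self.clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(client, (float(self.capacity), now))
        # Refill in proportion to the time elapsed, capped at capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client] = (tokens - 1, now)
            return True
        self.buckets[client] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = TokenBucket(rate=1.0, capacity=2)
    # Hypothetical client address from a documentation range.
    print([limiter.allow("198.51.100.7") for _ in range(3)])
```

Keying the buckets by client address is the simplest choice; keying by session or API key is equally possible and depends on what the application can identify reliably.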
In addition to these technical features, we also need to develop a strong security practice that helps us implement and maintain these configurations over time. The following diagram represents more of the strategic or high-level service mitigations:
Whether you engage Logicworks or develop these services in-house, each is critical for maintaining ongoing security and governance. All these services are part of our suite of AWS Managed Services, which provides 24×7 support for your infrastructure, plus:
- Logicworks Build Service: Our engineers design and build a secure cloud infrastructure with network and access control best practices built-in.
- Logicworks Compliance Assessment: Provides a compliance report and full findings for your environment, including a security/compliance baseline, performance by OS/cloud service, and top remediation suggestions. Limited to: HIPAA, PCI, CIS, ISO 27002, GDPR, SOC2, NIST.
- Logicworks Data Loss Prevention: Logicworks DLP automatically configures Amazon Macie, a machine learning tool that can detect sensitive data in your Amazon S3 storage buckets. Then our serverless technology alerts and takes action when sensitive data is at risk of being leaked. It can remove public access from new or changed S3 objects (objects within designated public buckets are ignored), or quarantine new or changed S3 objects within those public buckets when the objects have been assigned a risk level.
- Logicworks Pulse Scanners: We designed custom scripts to scan through customer environments and alert them of potential security and compliance misconfigurations. (In real terms: Scanners are scripts that run on a schedule across all managed AWS client accounts. They perform security/compliance-type tasks and have a framework for notifying interested parties.)
- Logicworks Advanced Security Services: Patching: We’ll help you detect, assess, test, and execute OS-level patches for your instances.
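The scanner description above (scripts that run across accounts, perform checks, and notify interested parties) suggests a simple structure: a registry of checks, a loop over accounts, and a pluggable notifier. The sketch below is our own minimal illustration of that pattern and not the Logicworks implementation. The account snapshot keys `console_mfa_enabled` and `public_buckets` are hypothetical, and scheduling is left to whatever runs the script.

```python
from typing import Callable, Dict, List

Check = Callable[[dict], List[str]]  # account snapshot -> list of findings
CHECKS: Dict[str, Check] = {}


def check(name: str):
    """Decorator that registers a function as a named check."""
    def register(fn: Check) -> Check:
        CHECKS[name] = fn
        return fn
    return register


@check("console-mfa")
def console_mfa(account: dict) -> List[str]:
    # Hypothetical snapshot key; a real scanner would query the AWS APIs.
    return [] if account.get("console_mfa_enabled") else ["console login has no MFA"]


@check("public-buckets")
def public_buckets(account: dict) -> List[str]:
    return [f"bucket {b} is public" for b in account.get("public_buckets", [])]


def run_scan(accounts: Dict[str, dict], notify: Callable[[str, str, str], None]) -> int:
    """Run every registered check on every account; return the finding count."""
    count = 0
    for account_id, snapshot in accounts.items():
        for name, fn in CHECKS.items():
            for finding in fn(snapshot):
                notify(account_id, name, finding)
                count += 1
    return count


if __name__ == "__main__":
    accounts = {
        "111111111111": {"console_mfa_enabled": False, "public_buckets": ["payments"]},
        "222222222222": {"console_mfa_enabled": True},
    }
    run_scan(accounts, lambda acct, name, msg: print(f"{acct} [{name}] {msg}"))
```

Because the notifier is just a callable, the same scan can print to a console, post to a chat channel, or open a ticket without changing any check.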
It’s up to your IT team to determine how to identify and then mitigate the new risks that are inherent in public cloud and unique to your chosen architectures. Understanding how these pieces fit together, and where they satisfy or do not satisfy specific regulations, can be a significant challenge, and one that companies can’t afford to get wrong.
Given how complicated the world of cybersecurity can be, it’s important to trust the right providers and platforms to help your business stay secure. Following repeatable processes, leveraging proven development patterns, implementing appropriate developer guardrails, and addressing threats before they can affect your business all help you stay ahead of security issues.
Additionally, investing in the right skills will save you money and time in the long run by reducing the likelihood of expensive and time-consuming data breaches.
To learn more about how Logicworks can help you build and maintain secure AWS and Azure infrastructure, contact us. For more information on how Carve Systems can help you develop your own threat model and help you avoid re-work by shifting security left, visit their website or email info [at] carvesystems [dot com].