
Public Cloud Storage Buckets: Why They Keep Getting Exposed and How to Stop It

Learn why cloud storage buckets get exposed and how to secure AWS, Azure, and GCP configurations to prevent data breaches and maintain regulatory compliance.

April 24, 2026 · 12 min read

Introduction

In the landscape of modern cloud computing, few security failures are as persistent—or as preventable—as the exposed storage bucket. As we move through 2025, security researchers and automated scanners continue to uncover thousands of publicly accessible buckets containing sensitive customer data, intellectual property, and regulated financial records. The narrative is almost always the same: a major organization suffers a "data breach," only for the post-mortem to reveal that there was no sophisticated intrusion or zero-day exploit. Instead, a storage container was simply left open to the internet, its contents indexed by search engines and accessible to anyone with a browser.

The irony of this situation is that cloud providers—Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)—have spent the last several years implementing aggressive "secure by default" settings. New buckets are generally private, and multiple layers of warnings now precede any action that would make data public. Yet, the leaks persist. For senior engineers and tech leads, understanding the root causes of these exposures is no longer just about knowing which buttons to click; it is about understanding the complex interplay between legacy configurations, developer velocity, and the inherent friction between accessibility and security.

This post examines why public bucket exposure remains a critical threat, breaks down the platform-specific technicalities that lead to misconfigurations, and provides a blueprint for auditing and securing your storage posture at scale.

The Persistence of the Problem: Why It Still Happens

If the major cloud providers have made buckets private by default, why are we still seeing headlines about exposed S3 buckets or Azure blobs? The answer lies in the reality of enterprise scale and the "move fast" culture of modern software development.

1. Developer Friction and Testing Shortcuts

In a fast-paced CI/CD environment, developers often encounter "403 Forbidden" errors when trying to integrate a frontend application with a backend storage bucket. To quickly verify if the issue is code-related or permission-related, a common (and dangerous) shortcut is to set the bucket to public "just for a minute" to see if the error disappears. These "temporary" changes frequently find their way into production or are forgotten entirely, leaving a permanent hole in the perimeter.

2. Legacy Debt and Migrations

Many organizations have cloud footprints that date back a decade. In the early days of AWS S3, the default settings were much more permissive. While AWS has since changed these defaults for new buckets, older buckets—and the Infrastructure as Code (IaC) templates used to create them—often retain legacy configurations. Migrating these resources without a rigorous security review often carries those vulnerabilities forward into new environments.

3. The "All Authenticated Users" Misconception

A recurring technical misunderstanding in GCP and AWS involves the definition of an "authenticated user." In GCP, the permission allAuthenticatedUsers does not mean "all users in my organization." It means anyone with a valid Google account. Similarly, in AWS, granting access to the "Authenticated Users" group means anyone with an AWS account—any AWS account in the world. Engineers often select these options under the mistaken belief that they are restricting access to their internal staff.

4. Shadow IT and Unmanaged Resources

As organizations grow, central IT and security teams often lose visibility into every cloud account being used. Marketing teams, data scientists, or regional offices may spin up their own accounts to bypass central procurement. These "shadow" environments rarely benefit from the standardized security controls and organizational policies (like AWS Service Control Policies) that prevent public buckets in the main corporate accounts.

For a deeper look at how these organizational challenges impact the broader security landscape, refer to these principles on cloud infrastructure security.

Platform-Specific Breakdown: AWS S3

AWS S3 is the most common site of bucket exposures, largely due to its market share and the complexity of its permissioning model. S3 permissions are governed by a combination of IAM policies, Bucket Policies, and Access Control Lists (ACLs).

The ACL vs. Bucket Policy Conflict

Historically, S3 used ACLs to manage access. ACLs are difficult to audit at scale because they can be attached to individual objects as well as to the bucket itself. An engineer might secure the bucket but upload an object with a public-read ACL, making that specific file accessible to the world.

Today, the recommended practice is to disable ACLs entirely and rely on policies instead. AWS provides the "S3 Object Ownership" setting for this: its "Bucket owner enforced" option turns ACLs off for the bucket.

Block Public Access (BPA)

AWS introduced the "Block Public Access" feature as a master kill-switch. It can be applied at the bucket level or, more importantly, at the account level. When enabled, it overrides any permissive bucket policy or ACL.

A common mistake is failing to enable BPA at the account level. Even if 99% of your buckets are secure, one new bucket created by an automated script without BPA enabled becomes a liability.
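Block Public Access is made up of four separate flags, and a configuration with only some of them set still leaves gaps. The sketch below reports which flags are missing. It assumes the input is a dict shaped like the PublicAccessBlockConfiguration that the S3 API returns; fetching that dict is left out, and the function name is illustrative.

```python
from typing import List, Optional

# The four flags that make up an S3 Block Public Access configuration.
BPA_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_bpa_flags(config: Optional[dict]) -> List[str]:
    """Return the BPA flags that are absent or set to False.

    `config` is a dict shaped like PublicAccessBlockConfiguration, or None
    when no configuration exists at all.
    """
    config = config or {}
    return [flag for flag in BPA_FLAGS if not config.get(flag, False)]
```

An empty result means all four flags are on. Running the same check against the account-level configuration covers buckets that do not exist yet.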

Example: Dangerous S3 Bucket Policy

The following JSON represents a common misconfiguration where a bucket is made public to allow an external integration, but the Principal is over-scoped.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-sensitive-data-bucket/*"
    }
  ]
}

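In the example above, "Principal": "*" allows anyone in the world to call GetObject on every object in the bucket. A safer approach names the specific principal that needs access, or adds a Condition block that restricts requests to known VPCs (aws:SourceVpc), VPC endpoints (aws:SourceVpce), or IP ranges (aws:SourceIp).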
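A rough way to spot this pattern in bulk is to scan each policy for Allow statements that combine a wildcard principal with no Condition. The following is a minimal sketch with illustrative function names. It is a heuristic only: a Condition does not guarantee a statement is private, and AWS's own analysis (for example IAM Access Analyzer) is more precise.

```python
def _as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def _is_wildcard_principal(principal) -> bool:
    """True for "Principal": "*" and for {"AWS": "*"} style principals."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False


def unconditioned_public_statements(policy: dict) -> list:
    """Return the Sid (or position) of each Allow statement that is open
    to any principal and carries no Condition block."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if (
            stmt.get("Effect") == "Allow"
            and _is_wildcard_principal(stmt.get("Principal"))
            and not stmt.get("Condition")
        ):
            findings.append(stmt.get("Sid", f"statement[{index}]"))
    return findings
```

Applied to the policy shown earlier in this section, the function flags the PublicReadGetObject statement.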

Platform-Specific Breakdown: Azure Blob Storage

Azure handles storage security differently, primarily through "Public Access Levels" on the Storage Account and the Container.

The Hierarchy of Access

In Azure, the "Allow Blob public access" setting at the Storage Account level acts as a gatekeeper. If this is disabled, no container within that account can be public. However, if it is enabled, individual containers can be set to:

Private: No anonymous access.
Blob: Anonymous read access for blobs only.
Container: Anonymous read and list access for the entire container.

The "Container" level is particularly dangerous because it allows attackers to enumerate (list) every blob in the container, making it trivial to scrape the entire dataset.
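The interaction between the account-level gatekeeper and the container-level setting can be expressed as a small function. This is a sketch of the rule described above, not an Azure SDK call: the level names follow the labels used in this article, and the function name and return shape are illustrative.

```python
VALID_LEVELS = {"private", "blob", "container"}


def effective_anonymous_access(account_allows_public: bool, container_level: str) -> dict:
    """Return what an anonymous caller can do, given both settings.

    The account-level setting wins: when it is disabled, every container
    behaves as private regardless of its own access level.
    """
    level = container_level.lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown access level: {container_level!r}")
    if not account_allows_public:
        level = "private"
    return {
        "read_blobs": level in {"blob", "container"},
        "list_blobs": level == "container",
    }
```

Only the combination of an enabled account setting and the "Container" level yields list access, which is why disabling the account-level setting is the more reliable control.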

The SAS Token Risk

Shared Access Signatures (SAS) are often used to provide temporary access to Azure Blobs. While not technically "public access," a leaked SAS token with a long expiration date is functionally equivalent to a public bucket. Tech leads must ensure that SAS tokens are generated with the shortest possible TTL (Time to Live) and are restricted to specific IP addresses.
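Because a SAS URL carries its constraints in the query string, a basic review can be automated. The sketch below inspects the standard se (expiry), sip (allowed IP range), and spr (allowed protocol) parameters. The one-hour limit, the function name, and the finding messages are assumptions made for illustration; a SAS that takes its expiry from a stored access policy will have no se parameter in the URL.

```python
from datetime import datetime, timedelta, timezone
from typing import List
from urllib.parse import parse_qs, urlsplit


def sas_findings(url: str, now: datetime, max_ttl: timedelta = timedelta(hours=1)) -> List[str]:
    """Return a list of weaknesses found in a SAS URL's query parameters.

    `now` must be a timezone-aware datetime.
    """
    params = parse_qs(urlsplit(url).query)
    findings = []

    expiry_raw = params.get("se", [None])[0]
    if expiry_raw is None:
        findings.append("no expiry (se) in the URL")
    else:
        # SAS expiry times are ISO 8601 UTC, usually with a trailing "Z".
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_ttl:
            findings.append("expiry exceeds the allowed TTL")

    if "sip" not in params:
        findings.append("no IP restriction (sip)")
    if params.get("spr", [""])[0] != "https":
        findings.append("plain HTTP is not excluded (spr)")
    return findings
```

A check like this fits naturally in code review or in a scanner for logs and repositories, where leaked SAS URLs tend to surface.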

Platform-Specific Breakdown: GCP Cloud Storage

Google Cloud Storage (GCS) uses IAM as its primary mechanism for access control, but it also supports legacy ACLs.

Uniform Bucket-Level Access

GCP’s "Uniform bucket-level access" is the equivalent of disabling S3 ACLs. It ensures that only IAM policies govern access. If this is not enabled, an object-level ACL could still expose a file even if the bucket's IAM policy looks secure.

The "allUsers" Identity

In GCP, the identity allUsers represents anyone on the internet. A common mistake in GCS is adding this member to the Storage Object Viewer role.

{
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "allUsers"
      ]
    }
  ]
}

The presence of allUsers in any IAM binding for a storage resource should trigger an immediate high-priority security alert.
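An alert of this kind can start from a simple scan of the IAM policy document. The sketch below walks the bindings of a policy shaped like the JSON above and returns every role granted to allUsers or allAuthenticatedUsers. The function name is illustrative, and retrieving the policy from GCP is out of scope here.

```python
from typing import List, Tuple

# Members that reach beyond any single organization.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def public_bindings(policy: dict) -> List[Tuple[str, str]]:
    """Return (role, member) pairs where the member is a public identity."""
    return [
        (binding["role"], member)
        for binding in policy.get("bindings", [])
        for member in binding.get("members", [])
        if member in PUBLIC_MEMBERS
    ]
```

Any non-empty result is worth an alert; the role in each pair tells you whether the exposure is read-only or worse.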

How to Audit Your Storage Posture

To stop the cycle of exposure, organizations must move from manual checks to automated, continuous auditing.

Key Audit Steps:

Inventory All Buckets: You cannot secure what you do not know exists. Use tools like AWS Config, Azure Resource Graph, or GCP Asset Inventory to list every storage resource across all regions and accounts.
Check for "Block Public Access" Equivalents: Verify that account-level or organization-level blocks are active.
Analyze Bucket Policies and IAM: Look for Principal: "*" or allUsers. Use automated policy analyzers (like AWS IAM Access Analyzer) to identify paths to public access.
Review Logging Configuration: Ensure that Server Access Logging (S3), Diagnostic Settings (Azure), or Data Access audit logs (GCP) are enabled. If a bucket is exposed, you need logs to perform forensics and determine if data was actually exfiltrated.
Scan for Sensitive Data: Use services like Amazon Macie, Microsoft Purview (formerly Azure Purview), or Google Cloud Sensitive Data Protection (formerly Cloud DLP) to identify buckets that contain PII or secrets. A public bucket with marketing assets is a minor issue; a public bucket with credit card numbers is a catastrophe.
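Once the audit steps above produce results, they need to be combined into a priority order. The sketch below is one possible triage rule, assuming each bucket has already been reduced to three booleans by your scanners. The BucketFinding type, the severity labels, and the ranking are hypothetical choices, not a standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BucketFinding:
    name: str
    is_public: bool
    has_sensitive_data: bool
    access_logging: bool


def triage(finding: BucketFinding) -> str:
    """Combine exposure, data sensitivity, and logging into one label."""
    if finding.is_public and finding.has_sensitive_data:
        return "critical"
    if finding.is_public:
        # Without logs there is no way to tell what was downloaded.
        return "low" if finding.access_logging else "medium"
    return "ok"


def prioritize(findings: List[BucketFinding]) -> List[BucketFinding]:
    """Sort findings so the most severe buckets come first."""
    rank = {"critical": 0, "medium": 1, "low": 2, "ok": 3}
    return sorted(findings, key=lambda f: rank[triage(f)])
```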

Tooling Recommendations

While native tools are powerful, senior engineers often leverage open-source or third-party tools for a multi-cloud view:

Prowler: An open-source security tool that performs best-practice assessments. It started as an AWS tool and now also covers Azure and GCP.
Cloud Custodian: A policy-as-code engine that can automatically remediate (delete or restrict) buckets that violate security policies.
Steampipe: Uses SQL to query cloud infrastructure, making it easy to find public buckets across hundreds of accounts with a single query.

Implementing Preventative Controls

Detection is necessary, but prevention is the goal. Implementing guardrails at the infrastructure level ensures that even if a developer makes a mistake, the cloud provider will block the action.

1. Organization-Wide Policies

The most effective way to prevent public buckets is to enforce it at the highest level of the cloud hierarchy.

AWS: Use Service Control Policies (SCPs) to deny the s3:PutBucketPublicAccessBlock and s3:PutAccountPublicAccessBlock actions to anyone except a master security role. This prevents anyone from turning off the "Block Public Access" setting.
Azure: Use Azure Policy to "Deny" the creation of storage accounts where allowBlobPublicAccess is set to true.
GCP: Use Organization Policy constraints, specifically constraints/storage.publicAccessPrevention, to enforce public access prevention across a project, a folder, or the entire organization.
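For the AWS case, the SCP can be generated rather than hand-written so that the exempt role is the only thing that varies between organizations. The sketch below builds such a policy as a Python dict. It uses the common pattern of a Deny statement with an ArnNotLike condition on aws:PrincipalArn; the role name security-admin and the function name are hypothetical.

```python
import json


def build_bpa_guard_scp(exempt_role_arn_pattern: str) -> dict:
    """Build an SCP that stops everyone except the exempt role from
    changing Block Public Access settings."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ProtectBlockPublicAccess",
                "Effect": "Deny",
                "Action": [
                    "s3:PutBucketPublicAccessBlock",
                    "s3:PutAccountPublicAccessBlock",
                ],
                "Resource": "*",
                # The Deny applies unless the caller matches this ARN pattern.
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn_pattern}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical role name; substitute your own security role.
    scp = build_bpa_guard_scp("arn:aws:iam::*:role/security-admin")
    print(json.dumps(scp, indent=2))
```

Because an SCP only restricts and never grants, the exempt role still needs its own IAM permissions to manage these settings.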

2. Infrastructure as Code (IaC) Linting

Security should start in the IDE. By using tools like tfsec, checkov, or terrascan, you can scan Terraform or CloudFormation templates for misconfigurations before they are ever deployed.

Example: Checkov Policy for S3

A simple pre-commit hook can catch a missing BPA configuration:

# Example checkov output (abridged; check IDs and wording can vary by version)
Check: CKV2_AWS_6: "Ensure that S3 bucket has a Public Access block"
	PASSED for resource: aws_s3_bucket.secure_bucket
	FAILED for resource: aws_s3_bucket.exposed_bucket

3. Automated Remediation

For large-scale environments, "Mean Time to Remediate" (MTTR) is a critical metric. When a bucket is detected as public, an automated workflow (e.g., an AWS Lambda triggered by an EventBridge rule) should immediately revert the bucket to private and notify the owner. This "self-healing" infrastructure significantly reduces the window of opportunity for attackers.
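A remediation function along these lines might look like the sketch below. It assumes the EventBridge rule forwards CloudTrail S3 management events, which carry the bucket name under detail.requestParameters.bucketName, and it takes the S3 client and a notification callback as arguments so the logic can be tested without AWS access. In a real Lambda you would pass boto3.client("s3"), whose put_public_access_block method is the call used here; make_handler and notify are illustrative names.

```python
# All four Block Public Access flags turned on.
FULL_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def make_handler(s3_client, notify):
    """Return a Lambda-style handler bound to the given client and notifier."""

    def handler(event, context=None):
        detail = event.get("detail") or {}
        params = detail.get("requestParameters") or {}
        bucket = params.get("bucketName")
        if not bucket:
            return {"remediated": False, "reason": "no bucket name in event"}

        # Re-apply the full block; this overrides public policies and ACLs.
        s3_client.put_public_access_block(
            Bucket=bucket,
            PublicAccessBlockConfiguration=FULL_BLOCK,
        )
        notify(f"Block Public Access re-applied to bucket {bucket}")
        return {"remediated": True, "bucket": bucket}

    return handler
```

Re-applying the block rather than deleting the offending policy keeps the remediation reversible, which matters when the owner turns out to have a legitimate need that should go through an exception process.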

Advanced Security Measures

Beyond simply making a bucket "not public," senior engineers should implement defense-in-depth to protect data even if a credential is compromised.

Encryption at Rest and in Transit

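Enforce server-side encryption with either SSE-S3 (AES-256) or SSE-KMS. S3 now encrypts new objects with SSE-S3 by default, so an explicit policy matters most when you require specific KMS keys or want uploads to declare their encryption. To do that, use bucket policies to deny any PutObject request that does not include the x-amz-server-side-encryption header. Similarly, enforce TLS by denying any request where aws:SecureTransport is false.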
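Both rules translate into Deny statements in the bucket policy. The sketch below generates them for a given bucket name, using the aws:SecureTransport and s3:x-amz-server-side-encryption condition keys. The function name and Sids are illustrative, and the statements are meant to be merged into an existing policy rather than used alone.

```python
def build_encryption_guard_statements(bucket_name: str) -> list:
    """Return Deny statements that require TLS and an encryption header."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return [
        {
            # Reject any request that arrives over plain HTTP.
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
        {
            # Reject uploads that omit the x-amz-server-side-encryption header.
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"{bucket_arn}/*",
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        },
    ]
```

Note that the second statement also rejects clients that rely on the bucket's default encryption without sending the header, so roll it out only after confirming that every uploader sets it.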

VPC Endpoints and Service Controls

To truly isolate storage, keep the traffic off the public internet entirely. Use VPC Endpoints (AWS), Private Link with private endpoints (Azure), or Private Service Connect (GCP) to allow your application servers to communicate with storage over the provider's private backbone. You can then update your bucket policies to only allow traffic originating from those specific endpoints.
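On AWS, the endpoint restriction is usually written as a Deny statement keyed on aws:SourceVpce. The sketch below builds one for a list of endpoint IDs; the function name, Sid, and the example endpoint ID in the usage are hypothetical.

```python
from typing import List


def build_vpce_only_statement(bucket_name: str, vpce_ids: List[str]) -> dict:
    """Return a Deny statement that rejects any request not arriving
    through one of the listed VPC endpoints."""
    if not vpce_ids:
        raise ValueError("at least one VPC endpoint ID is required")
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Sid": "DenyOutsideApprovedEndpoints",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
        "Condition": {"StringNotEquals": {"aws:SourceVpce": list(vpce_ids)}},
    }
```

A statement like this also blocks console access and any administrator working from outside the VPC, so it is worth testing on a non-critical bucket first and keeping a break-glass path that does not depend on the bucket policy.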

The Principle of Least Privilege

Avoid using a single IAM role for all storage operations. Create granular roles:

App-Reader: s3:GetObject only.
App-Writer: s3:PutObject only.
Admin: Metadata management only, no data access.

By decoupling data access from resource management, you limit the "blast radius" of a compromised credential.

Conclusion

The persistence of public cloud storage exposure in 2025 is a sobering reminder that technical features alone cannot solve security problems; they must be paired with rigorous processes and cultural shifts. While AWS, Azure, and GCP have provided the tools—Block Public Access, Azure Policy, and GCP Organization Constraints—it remains the responsibility of the engineering leadership to ensure these tools are deployed universally and enforced through automation.

Securing cloud storage is not a one-time task but a continuous state of operation. It requires moving away from manual configuration toward Infrastructure as Code, implementing robust auditing cycles, and fostering a culture where security is integrated into the developer workflow rather than acting as a hurdle to it. By treating storage security as a core component of cloud infrastructure security, tech leads can ensure that their organization’s data remains an asset rather than a headline-grabbing liability.

The "open door" of a public bucket is rarely the result of a single failure. It is usually a chain of events: a legacy template, a missing organizational policy, a developer in a rush, and a lack of automated monitoring. Breaking that chain requires a proactive, multi-layered approach that prioritizes prevention and rapid remediation. In the cloud, visibility is the first step toward security—if you can’t see your buckets, you can’t secure them. Start by auditing your environment today, and then build the guardrails that make "public by mistake" a technical impossibility.

This content was generated by AI.