RectifyCloud

SOC 2 Monitoring and Alerting Requirements: What Auditors Expect You to Detect and Log

Learn what SOC 2 CC7.2 requires for monitoring and alerting. Discover which events to log, retention periods auditors expect, and how to reduce alert noise.

May 12, 2026 · 11 min read

Introduction

For many engineering teams, achieving SOC 2 compliance feels like a bureaucratic exercise in "security theater"—a flurry of screenshots and policy updates that takes place once a year. However, as senior engineers and tech leads, we know that the spirit of the SOC 2 Trust Services Criteria (TSC) is rooted in actual operational resilience. Specifically, Common Criteria 7.2 (CC7.2) focuses on the entity’s ability to monitor system components for anomalies that could indicate a security incident.

In a modern, cloud-native environment, CC7.2 is often where the rubber meets the road. It is no longer sufficient to point at a dashboard and tell an auditor, "We use Datadog." Auditors are increasingly sophisticated; they want to see the specific logic behind your alerts, the breadth of your telemetry, and the historical evidence that your monitoring system functioned as intended throughout the entire observation period.

According to research from IBM X-Force, the primary obstacles to effective incident response are incomplete logging, insufficient retention, and limited telemetry, particularly in containerized and hybrid-identity environments. When an auditor samples your logs, they aren't just looking for a "clean" record; they are looking for proof that your systems are configured to detect unauthorized access, privilege escalation, and data exfiltration in real time. This post explores the technical requirements of CC7.2, the specific events you must monitor, and how to build an alerting pipeline that satisfies auditors without burying your on-call engineers in noise.

Understanding the CC7.2 Mandate

The AICPA defines CC7.2 as the requirement that "the entity monitors system components and the operation of those components for anomalies that are indicative of malicious acts, natural disasters, and errors affecting the entity's ability to meet its objectives." To an auditor, this translates into a demand for a comprehensive visibility stack that covers the entire lifecycle of a potential breach.

The "monitoring" required here is not just uptime or performance monitoring (though those are relevant for the Availability category). For the Security category, the focus is on security telemetry. This includes:

  1. Unauthorized Access: Detecting when someone who shouldn't be in the system tries to get in, or when a legitimate user accesses resources they aren't authorized to see.
  2. Privilege Escalation: Identifying when a user or service account moves from a low-privilege state to a high-privilege state (e.g., an EC2 instance role suddenly gaining AdministratorAccess).
  3. Configuration Changes: Monitoring for "drift" or manual changes to security groups, IAM policies, and encryption settings.
  4. Data Exfiltration: Detecting unusual patterns of data movement, such as bulk S3 bucket downloads or large outbound network transfers to unknown IPs.

The challenge is that most organizations fail not because they don't have logs, but because their logs are siloed, truncated, or lack the context necessary to reconstruct an event. To move beyond screenshots and manual evidence gathering, we must treat monitoring as code and compliance as a continuous data engineering problem.

What Must Be Logged: The Auditor’s Checklist

When an auditor arrives for a Type II examination, they will request a "population" of logs for the observation period. If you cannot produce logs for a specific three-day window six months ago, you are looking at an exception in the report and, if the gap is serious enough, a qualified opinion (the closest thing SOC 2 has to a "fail"). To satisfy CC7.2, your logging strategy must encompass several distinct layers of the stack.

Identity and Access Management (IAM) Logs

Identity is the new perimeter. Auditors expect to see logs for every authentication and authorization event.

  • Success/Failure Events: Every login attempt to your SSO (Okta, Azure AD, Google Workspace) and your cloud console (AWS, GCP, Azure).
  • MFA Events: Challenges, successes, and—crucially—MFA bypasses or resets.
  • Token Issuance: Creation of long-lived API keys or session tokens.
  • Cross-Account Access: Logs showing when an external entity or a different internal account assumes a role in your production environment.

Cloud Control Plane Logs

These are the logs generated by the cloud provider itself (e.g., AWS CloudTrail, Azure Activity Log). Auditors look for:

  • Resource Creation/Deletion: Who spun up that unencrypted RDS instance?
  • Policy Modifications: Changes to S3 bucket policies, IAM roles, or Security Group rules.
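  • Data Events: High-risk actions like s3:GetObject on sensitive data stores. CloudTrail does not record these object-level data events by default, so they must be enabled explicitly. Also watch kms:Decrypt calls against the keys that protect those stores.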

Infrastructure and Network Logs

Even in serverless or containerized environments, network telemetry is vital.

  • VPC Flow Logs: To detect lateral movement or communication with known malicious IP addresses.
  • Load Balancer Logs: To identify SQL injection attempts or DDoS patterns.
  • Egress Logs: To monitor for data exfiltration attempts to non-company-owned endpoints.
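
As a minimal sketch of the egress check described above, the following parses records in the default (version 2) VPC Flow Log format and flags accepted flows that leave the private address space. The `INTERNAL_NETS` ranges and the `min_bytes` threshold are assumptions for illustration, not recommended values.

```python
import ipaddress
from typing import NamedTuple, Optional

# Assumed internal ranges; replace with your own VPC and corporate CIDRs.
INTERNAL_NETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


class Flow(NamedTuple):
    srcaddr: str
    dstaddr: str
    dstport: int
    num_bytes: int
    action: str


def parse_flow(line: str) -> Optional[Flow]:
    """Parse one default-format (version 2) flow log record."""
    f = line.split()
    # The default format has 14 space-separated fields.
    # NODATA/SKIPDATA records carry "-" in the address fields.
    if len(f) != 14 or f[3] == "-":
        return None
    return Flow(srcaddr=f[3], dstaddr=f[4], dstport=int(f[6]),
                num_bytes=int(f[9]), action=f[12])


def is_internal(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def is_suspicious_egress(flow: Flow, min_bytes: int = 100_000_000) -> bool:
    """Accepted flow from an internal source to an external destination."""
    return (flow.action == "ACCEPT"
            and is_internal(flow.srcaddr)
            and not is_internal(flow.dstaddr)
            and flow.num_bytes >= min_bytes)
```

A single flow record only covers one aggregation interval, so a production check would likely sum bytes per source over a longer window before comparing against a threshold.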

Application and Container Logs

For containerized environments, auditors are increasingly focused on the ephemeral nature of the infrastructure.

  • Kubernetes Audit Logs: Who executed a command inside a pod (kubectl exec)? Who changed a ConfigMap?
  • Runtime Security: Events from tools like Falco or Sysdig that detect unexpected binary execution or file system changes within a container.
  • Application-Level Auth: Logs from your internal middleware showing which user ID accessed which record ID.
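
The two Kubernetes questions above (who ran an exec, who changed a ConfigMap) can be answered from audit events. This is a rough sketch that assumes one JSON audit event per line with the standard `verb`, `user`, and `objectRef` fields; the finding strings are made up for illustration.

```python
import json
from typing import Optional


def classify_audit_event(raw: str) -> Optional[str]:
    """Return a short finding for one Kubernetes audit event, or None."""
    event = json.loads(raw)
    ref = event.get("objectRef") or {}
    user = (event.get("user") or {}).get("username", "unknown")
    target = f'{ref.get("namespace", "")}/{ref.get("name", "")}'

    # Interactive access shows up as the "exec" subresource on a pod.
    # The verb can differ by client transport, so match the subresource.
    if ref.get("resource") == "pods" and ref.get("subresource") == "exec":
        return f"exec into pod {target} by {user}"

    # Mutations of ConfigMaps.
    if (ref.get("resource") == "configmaps"
            and event.get("verb") in {"update", "patch", "delete"}):
        return f'configmap {target} {event["verb"]} by {user}'

    return None
```

Whether these events are recorded at all depends on the cluster's audit policy, so the policy itself is worth keeping as audit evidence.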

The Problem of Retention and Telemetry Gaps

One of the most common findings in SOC 2 audits is insufficient log retention. Defaults rarely match what an audit needs: CloudWatch log groups default to "Never Expire," which becomes expensive at scale, so teams often overcorrect to a very short window like 14 days to save costs.

SOC 2 itself does not prescribe a retention period, but auditors generally expect roughly a 90-day "hot" retention period where logs are easily searchable, and a one-year "cold" retention period for long-term forensics. If your SOC 2 observation period is 12 months, you must be able to produce evidence from day one of that period.
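
The arithmetic behind "evidence from day one" is simple enough to check automatically. A small sketch, assuming a rolling retention window measured in days; the function names are hypothetical.

```python
from datetime import date, timedelta


def earliest_available(today: date, retention_days: int) -> date:
    """Oldest day still retained under a rolling retention window."""
    return today - timedelta(days=retention_days)


def uncovered_days(period_start: date, today: date, retention_days: int) -> int:
    """Days at the start of the observation period that have aged out."""
    gap = (earliest_available(today, retention_days) - period_start).days
    return max(gap, 0)
```

For example, `uncovered_days(date(2025, 1, 1), date(2025, 12, 31), 14)` reports how much of a calendar-year observation period is already unrecoverable with 14-day retention.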

The IBM X-Force findings highlight a critical gap: incomplete logging in hybrid-identity environments. If a user authenticates via On-Prem Active Directory, syncs to Azure AD, and then assumes a role in AWS, the "telemetry gap" often occurs at the hand-off points. An auditor will test this by following a single user's journey across these systems. If the trail goes cold in the middle, your monitoring program is considered insufficient.
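
One way to rehearse that auditor test is to check, per user, whether every system in the hand-off chain produced at least one event. The sketch below assumes events have already been normalized into a common shape; `IdentityEvent`, `EXPECTED_CHAIN`, and the system labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class IdentityEvent:
    system: str       # e.g. "on_prem_ad", "azure_ad", "aws"
    principal: str    # normalized identity, e.g. lower-cased email
    timestamp: float  # epoch seconds


# Assumed hand-off order for this illustration.
EXPECTED_CHAIN = ["on_prem_ad", "azure_ad", "aws"]


def missing_hops(events: Iterable[IdentityEvent], principal: str) -> List[str]:
    """Systems in EXPECTED_CHAIN with no event recorded for the principal."""
    wanted = principal.lower()
    seen = {e.system for e in events if e.principal == wanted}
    return [system for system in EXPECTED_CHAIN if system not in seen]
```

An empty result means the trail is at least present at every hop; it does not prove the events are correlated in time, which a fuller check would also verify.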

Configuring Alerts That Satisfy Auditors

Logging is the "record," but CC7.2 specifically requires "monitoring... for anomalies." This means you must have active alerting. An auditor will ask: "How would you know if an engineer's credentials were stolen and used to dump your database?"

To answer this, you need to map your alerts directly to the risks identified in your SOC 2 Risk Assessment. Below is an example of how you might configure an alert for a high-risk event, unauthorized API calls, expressed as a CloudWatch metric filter on CloudTrail logs plus an alarm on the resulting metric. The JSON is an illustrative summary of both resources rather than a literal API payload.

{
    "FilterPattern": "{ ($.errorCode = \"*UnauthorizedOperation\") || ($.errorCode = \"AccessDenied*\") }",
    "MetricTransformations": [
        {
            "MetricName": "UnauthorizedAPICalls",
            "MetricNamespace": "CloudTrailMetrics",
            "MetricValue": "1"
        }
    ],
    "AlarmConfiguration": {
        "AlarmName": "SOC2-CC7.2-Unauthorized-Access-Detected",
        "AlarmDescription": "Alert when an IAM user or role attempts an unauthorized action, potentially indicating privilege escalation or compromised credentials.",
        "Statistic": "Sum",
        "Period": 300,
        "Threshold": 5,
        "EvaluationPeriods": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:Security-Alerts-Topic"]
    }
}
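
To sanity-check a threshold like this before deploying it, the same logic can be replayed offline against exported events. This is a sketch under stated assumptions: each event is a dict with an `errorCode` and a hypothetical `epoch` field (seconds), the period is taken to be 300 seconds, and `AccessDenied` is matched with a trailing wildcard so that variant error codes are counted too.

```python
from collections import Counter
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Wildcard patterns mirroring the intent of the filter pattern.
PATTERNS = ("*UnauthorizedOperation", "AccessDenied*")


def is_unauthorized(event: Dict) -> bool:
    """True if the event's errorCode matches either pattern."""
    code = event.get("errorCode", "")
    return any(fnmatchcase(code, pattern) for pattern in PATTERNS)


def breaching_windows(events: Iterable[Dict], period: int = 300,
                      threshold: int = 5) -> List[int]:
    """Window start times (epoch seconds) whose match count meets the threshold."""
    counts = Counter(
        (e["epoch"] // period) * period for e in events if is_unauthorized(e)
    )
    return sorted(start for start, n in counts.items() if n >= threshold)
```

Replaying a month of history through `breaching_windows` gives a rough count of how often the alarm would have fired, which helps when choosing a threshold that is neither silent nor noisy.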

Key Alerting Categories for SOC 2

To ensure your alerting is robust enough for an audit, prioritize these categories:

  • Root Account Usage: Any login to the "Root" account of a cloud organization should trigger an immediate, high-priority alert.
  • IAM Escalation: Alerts for CreatePolicyVersion, AttachUserPolicy, or UpdateAssumeRolePolicy when performed by non-admin identities.
  • Security Group "Anywhere" Rules: Alerts when a Security Group is modified to allow 0.0.0.0/0 on sensitive ports (22, 3389, 5432).
  • Large Data Transfers: Threshold-based alerts on egress traffic from database subnets.
  • MFA Deactivation: Immediate notification if MFA is disabled for any privileged user.
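
Several of these categories can be expressed as simple rules over CloudTrail records. The following is a minimal sketch, not a complete detection set: the category labels are hypothetical, the `requestParameters` field paths reflect the shape these records are assumed to have and should be verified against your own logs, and the "non-admin identity" condition for IAM escalation is omitted for brevity.

```python
from typing import Dict, Optional

SENSITIVE_PORTS = {22, 3389, 5432}
IAM_ESCALATION_EVENTS = {
    "CreatePolicyVersion", "AttachUserPolicy", "UpdateAssumeRolePolicy",
}


def opens_sensitive_port(event: Dict) -> bool:
    """True if the ingress rule allows 0.0.0.0/0 on a sensitive port."""
    params = event.get("requestParameters") or {}
    perms = (params.get("ipPermissions") or {}).get("items") or []
    for perm in perms:
        ranges = (perm.get("ipRanges") or {}).get("items") or []
        cidrs = [r.get("cidrIp") for r in ranges]
        lo, hi = perm.get("fromPort"), perm.get("toPort")
        if "0.0.0.0/0" in cidrs and lo is not None and hi is not None:
            if any(lo <= port <= hi for port in SENSITIVE_PORTS):
                return True
    return False


def categorize(event: Dict) -> Optional[str]:
    """Map one CloudTrail record to an alert category, or None."""
    name = event.get("eventName")
    identity = event.get("userIdentity") or {}
    if identity.get("type") == "Root" and name == "ConsoleLogin":
        return "root_account_usage"
    if name in IAM_ESCALATION_EVENTS:
        return "iam_escalation"
    if name == "AuthorizeSecurityGroupIngress" and opens_sensitive_port(event):
        return "security_group_open_to_world"
    if name == "DeactivateMFADevice":
        return "mfa_deactivation"
    return None
```

Rules that allow all ports or all protocols are not caught by the port-range check here and would need their own handling.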

Solving the "Signal vs. Noise" Dilemma

The quickest way to make a monitoring program fail is to overwhelm engineers with "compliance noise." If your team receives 50 alerts a day for "Access Denied" errors caused by misconfigured CI/CD scripts, they will eventually create a Gmail filter to ignore them. When the actual breach happens, the alert will be buried.

Auditors do not want to see that you alert on everything; they want to see that your alerting is effective. Here is how to tune your system for both compliance and sanity:

1. Severity Tiering

Map your alerts to a severity matrix.

  • P1 (Critical): Immediate page. Root login, MFA deletion, unauthorized production database access.
  • P2 (High): Ticket created, notification in Slack. New IAM user created, Security Group change.
  • P3 (Medium): Weekly report/Log only. Failed login attempts below a certain threshold.
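
Keeping the severity matrix in code makes it reviewable and easy to hand to an auditor. A small sketch of the tiers above; the alert type keys and the choice to default unknown alerts to P2 are assumptions.

```python
from enum import Enum


class Severity(Enum):
    P1 = "page"
    P2 = "ticket_and_slack"
    P3 = "weekly_report"


SEVERITY_BY_ALERT = {
    "root_login": Severity.P1,
    "mfa_deletion": Severity.P1,
    "unauthorized_prod_db_access": Severity.P1,
    "iam_user_created": Severity.P2,
    "security_group_change": Severity.P2,
    "failed_login_below_threshold": Severity.P3,
}


def route(alert_type: str) -> Severity:
    """Unknown alert types default to P2 so they get a ticket rather than vanish."""
    return SEVERITY_BY_ALERT.get(alert_type, Severity.P2)
```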

2. Context Enrichment

An alert that says "Access Denied for user 'service-account-82'" is useless. An alert that says "Access Denied for 'service-account-82' attempting to 's3:DeleteBucket' from an unknown IP in Russia" is actionable. Use automated tools to enrich logs with Geo-IP data and threat intelligence before they hit your alerting engine.
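
A rough sketch of that enrichment step follows. The two lookup tables are hypothetical stand-ins for a real Geo-IP database and threat intelligence feed, populated here with documentation-range addresses and a placeholder country code.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical lookup tables standing in for a Geo-IP database and a threat feed.
GEO_BY_NETWORK = {ipaddress.ip_network("203.0.113.0/24"): "ZZ"}
THREAT_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def enrich(alert: Dict) -> Dict:
    """Return a copy of the alert with country and threat-intel fields added."""
    ip = ipaddress.ip_address(alert["source_ip"])
    country: Optional[str] = next(
        (code for net, code in GEO_BY_NETWORK.items() if ip in net), None
    )
    return {
        **alert,
        "geo_country": country,
        "known_malicious": any(ip in net for net in THREAT_NETWORKS),
    }
```

Returning a copy rather than mutating the input keeps the raw alert intact, which matters if the original record is also stored as audit evidence.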

3. Suppression of Known Good

If your vulnerability scanner (e.g., Wiz, Prisma Cloud) regularly triggers "Access Denied" errors as part of its discovery process, allowlist those specific service accounts in your compliance alerts. Document this allowlisting in your internal procedures so you can explain it to an auditor.
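
A suppression rule is safest when it is narrow: a specific principal, a specific error, and a documented reason. A minimal sketch, where the role ARN, the field names, and the reason text are all hypothetical:

```python
from typing import Dict, Iterable, List

# Hypothetical suppression list: principal -> documented reason for the auditor.
SUPPRESSED_PRINCIPALS = {
    "arn:aws:iam::123456789012:role/scanner-readonly":
        "Vulnerability scanner discovery, see internal procedure",
}


def filter_alerts(alerts: Iterable[Dict]) -> List[Dict]:
    """Drop AccessDenied alerts from listed principals; keep everything else."""
    kept = []
    for alert in alerts:
        suppressed = (
            alert.get("error_code") == "AccessDenied"
            and alert.get("principal") in SUPPRESSED_PRINCIPALS
        )
        if not suppressed:
            kept.append(alert)
    return kept
```

Because the rule only covers "Access Denied" errors, any other alert from the scanner's role still reaches the on-call engineer, which matters if that role is ever compromised.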

The Auditor’s Test: The Incident Response Walkthrough

During the audit, the auditor will perform a "walkthrough" of your monitoring and incident response process. They will often select a random alert from your history and ask to see the "end-to-end" lifecycle of that event.

The expectations for this walkthrough include:

  • Timestamp of Detection: When did the system first log the anomaly?
  • Timestamp of Alert: How long did it take for the monitoring system to trigger the notification?
  • Evidence of Triage: Who looked at the alert? Is there a Jira ticket or a Slack thread showing an engineer investigated it?
  • Resolution: Was the alert a false positive? If so, why? If it was a real incident, was it evaluated and handled under the incident response plan (CC7.3 and CC7.4)?
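
The four walkthrough questions map naturally onto a record that can be checked for completeness before the auditor asks. This sketch assumes alert history is available in some structured form; `AlertRecord` and its fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class AlertRecord:
    alert_id: str
    detected_at: datetime             # when the anomaly was first logged
    alerted_at: datetime              # when the notification fired
    triage_ref: Optional[str] = None  # e.g. ticket key or thread link
    resolution: Optional[str] = None  # e.g. "false_positive: CI misconfig"


def alert_latency_seconds(record: AlertRecord) -> float:
    """Time between first log entry and notification."""
    return (record.alerted_at - record.detected_at).total_seconds()


def walkthrough_gaps(record: AlertRecord) -> List[str]:
    """List the walkthrough questions this record cannot answer."""
    gaps = []
    if not record.triage_ref:
        gaps.append("no evidence of triage")
    if not record.resolution:
        gaps.append("no documented resolution")
    return gaps
```

Running `walkthrough_gaps` over the full alert history on a schedule surfaces untriaged alerts while there is still time to follow up on them.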

If you cannot connect the alert to a specific human action or resolution, the auditor may conclude that your monitoring is "ineffective," regardless of how many logs you have stored in S3. This is why the "screenshot" method of compliance is so dangerous—it captures a moment in time but fails to prove the process is working.

Modernizing Evidence Collection

The traditional way to prove CC7.2 compliance is to manually export logs and take screenshots of alert configurations. This is labor-intensive and error-prone. Senior engineers are increasingly moving toward "Continuous Compliance" or "Compliance as Code."

By using APIs to pull evidence directly from your SIEM or cloud provider, you can maintain a "live" dashboard of your compliance status. This approach aligns with the philosophy of moving beyond screenshots. When your monitoring system is integrated with your compliance platform, the evidence of "Anomaly Detection" is generated automatically every time an alert is fired and resolved.

The Impact of Containerization and Hybrid Identity

As mentioned in the IBM X-Force report, containerized and hybrid environments introduce specific telemetry challenges that auditors are now trained to look for.

Container Telemetry

In a Kubernetes environment, IP addresses are ephemeral. A log entry showing an attack from 10.2.14.5 is useless if that pod disappeared three minutes later. Your logging system must enrich each entry with metadata at collection time, associating the log with the Pod Name, Namespace, and Image Hash. Auditors will check if your monitoring can distinguish between a standard container process and an interactive shell session.
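
To illustrate why this metadata has to be captured while the pod exists, here is a sketch that resolves an IP and timestamp back to a pod from a recorded history of IP assignments. `PodLease` and its fields are hypothetical; in practice this history would come from whatever records pod lifecycle events in your cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PodLease:
    ip: str
    pod: str
    namespace: str
    image_digest: str
    start: float  # epoch seconds when the pod received the IP
    end: float    # epoch seconds when the pod released the IP


def resolve_pod(leases: List[PodLease], ip: str, ts: float) -> Optional[PodLease]:
    """Find which pod held `ip` at time `ts`, even if the pod no longer exists."""
    for lease in leases:
        if lease.ip == ip and lease.start <= ts < lease.end:
            return lease
    return None
```

The same IP can map to different pods at different times, so the timestamp is part of the lookup key rather than an afterthought.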

Hybrid Identity

If your organization uses a mix of on-premises AD and Cloud IAM, the "Source of Truth" for identity is often fractured. Auditors expect a unified view. If an employee is terminated in HRIS, they want to see that the monitoring system would flag any subsequent login attempts across all linked identity providers. A failure to log the "Identity Hand-off" is a common CC7.2 deficiency.
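
The terminated-employee check can be sketched as a join between HRIS termination times and login events from every identity provider. The field names (`user`, `epoch`, `provider`) are assumptions about a normalized event shape, not any provider's actual schema.

```python
from typing import Dict, Iterable, List


def post_termination_logins(terminated_at: Dict[str, float],
                            logins: Iterable[Dict]) -> List[Dict]:
    """Return login attempts made after the user's HRIS termination time.

    `terminated_at` maps a lower-cased user identifier to epoch seconds.
    `logins` may mix events from any number of identity providers.
    """
    flagged = []
    for login in logins:
        cutoff = terminated_at.get(login["user"].lower())
        if cutoff is not None and login["epoch"] > cutoff:
            flagged.append(login)
    return flagged
```

The check only works if usernames are normalized to one identifier across providers, which is exactly the "Identity Hand-off" mapping the auditor will probe.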

Conclusion

SOC 2 CC7.2 is not a "set it and forget it" requirement. It is a mandate for continuous operational vigilance. Auditors expect a monitoring program that is:

  • Comprehensive: Covering Identity, Control Plane, Infrastructure, and Applications.
  • Persistent: Maintaining logs for the duration of the observation period (often up to one year).
  • Actionable: Filtering out noise to ensure that real anomalies trigger timely human intervention.
  • Verifiable: Capable of being reconstructed during a walkthrough to prove that the system actually works.

As tech leads, our goal should be to build monitoring systems that serve the business first and the auditor second. When we focus on high-fidelity telemetry, intelligent alerting, and automated evidence collection, we don't just pass the SOC 2 audit—we actually secure the environment. By moving away from manual screenshots and toward integrated, code-driven compliance, we reduce the burden on our engineering teams while providing the transparency and rigor that modern security standards demand.

Building a robust monitoring stack is an investment in your company’s reputation. When the IBM X-Force research identifies incomplete logging, insufficient retention, and limited telemetry as the primary obstacles to incident response, it serves as a reminder that the logs we collect today determine whether a minor anomaly is caught before it becomes a catastrophic breach. Ensure your monitoring is configured not just to satisfy an auditor’s checklist, but to provide the visibility your team needs to defend the stack.

This content was generated by AI.