Enterprise Security for Dialogue APIs: What SOC 2 Does and Does Not Cover

Enterprise procurement teams ask for a SOC 2 Type II report as a baseline condition for onboarding any SaaS vendor that handles sensitive data. The report is meaningful: it demonstrates that a vendor has controls in place and that an independent auditor tested those controls over a defined observation period. (Strictly speaking, SOC 2 is an attestation report rather than a certification, although "SOC 2 certified" is the common shorthand and is used below.) It is not a comprehensive security guarantee for conversational AI use cases. Dialogue session data management carries specific risks that SOC 2 does not address, and enterprise teams that rely solely on SOC 2 for vendor security evaluation leave important questions unasked.

What SOC 2 Type II Actually Measures

SOC 2 Type II audits evaluate a vendor against the AICPA Trust Services Criteria across five categories: security (always required) plus availability, processing integrity, confidentiality, and privacy (each optional and elected by the vendor). The audit typically covers a 6-12 month observation period and tests whether the stated controls operated effectively throughout that period, not just at audit time. This is the key difference from Type I, which is a point-in-time snapshot.

The security trust service criteria cover logical and physical access controls, change management, and incident response. These are infrastructure-level controls: who can access production systems, how software changes are managed, and what happens when a security incident is detected. For a dialogue API vendor, the relevant questions are: are API keys properly scoped and rotated, is production infrastructure access restricted, and are security incidents tracked and disclosed?

SOC 2 does not mandate specific technical controls - it mandates that a vendor documents controls and operates them consistently. Two vendors can both be SOC 2 certified with completely different implementations: one uses AES-256 encryption at rest, another uses AES-128. Both can pass the audit if they both consistently implement their documented controls. The certification tells you about consistency; you still need to evaluate the substance of the controls.

The Training Data Risk SOC 2 Does Not Address

The most important question to ask a conversational AI vendor is absent from the SOC 2 framework: does the vendor use customer conversation data to train or improve its models? SOC 2 addresses data security but not data use. A vendor could be fully SOC 2 certified while using your customers' conversation transcripts to train its NLU models for other customers' use cases.

This is not a hypothetical concern. Several major NLP vendors have included data-use provisions in their terms of service that permit using customer data for model improvement on an opt-out rather than opt-in basis. For enterprise customers in regulated industries, having their users' dialogue sessions included in a shared training corpus is a material compliance risk regardless of whether the data is encrypted in transit and at rest.

The correct question to ask: "Does your system use any data from my organization's dialogue sessions to train, fine-tune, or improve models that serve any other customer?" The answer should be an unambiguous no, with a contractual commitment and audit rights. Equmenopolis's answer to this question is no: customer context data is never used for training. The Data Processing Agreement explicitly prohibits it, and the prohibition is auditable through our SOC 2 privacy trust service criteria.

Context Object Encryption and Key Management

Dialogue context objects accumulate sensitive user information: names, locations, account numbers, health information, payment details - whatever the conversation domain involves. Vendors are expected to encrypt these context objects at rest, and a SOC 2 report will normally list that as a control, but SOC 2 does not prescribe the algorithm or the key management model. Those implementation details matter enormously and are often not reviewed during procurement.

Customer-managed encryption keys (CMEK) allow the enterprise customer to hold the encryption keys for their data, meaning the vendor cannot decrypt the data without the customer's key. This is a materially stronger guarantee than vendor-managed encryption, where the vendor holds both the data and the keys. CMEK is not required by SOC 2 and most vendors do not offer it at lower pricing tiers. For organizations in regulated industries, CMEK provides an additional control layer that addresses the risk of vendor-side unauthorized access.

Equmenopolis uses AES-256 encryption at rest for all context objects. Encryption keys are managed per-tenant using our key management service. CMEK is available on Enterprise plans where customers bring their own AWS KMS or Azure Key Vault keys. All key access events are logged for audit purposes.
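
The difference CMEK makes is easiest to see in the access path rather than in the cipher. The following Python sketch models only that path: a vendor-side read must first obtain a key grant from a customer-controlled key store, every request is logged, and disabling the key blocks the read. It performs no real cryptography, and all names (CustomerKeyStore, request_data_key, read_context) are hypothetical illustrations, not the Equmenopolis, AWS KMS, or Azure Key Vault API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class KeyAccessDenied(Exception):
    """Raised when the customer-held key is unavailable to the vendor."""


@dataclass
class CustomerKeyStore:
    """Stand-in for a customer-controlled key service (hypothetical interface)."""
    tenant_id: str
    enabled: bool = True
    audit_log: list = field(default_factory=list)

    def request_data_key(self, requester: str, purpose: str) -> str:
        # Every request is logged, whether or not it is granted.
        granted = self.enabled
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "tenant": self.tenant_id,
            "requester": requester,
            "purpose": purpose,
            "granted": granted,
        })
        if not granted:
            raise KeyAccessDenied(f"key for tenant {self.tenant_id} is disabled")
        # Placeholder handle; a real key service would apply actual key material.
        return f"data-key-handle:{self.tenant_id}"


def read_context(store: CustomerKeyStore, requester: str) -> str:
    """Vendor-side read path: without a key grant there is no plaintext."""
    handle = store.request_data_key(requester, purpose="read_context")
    return f"<context decrypted with {handle}>"
```

If the customer sets enabled to False, read_context raises KeyAccessDenied and the denied attempt still appears in audit_log. With vendor-managed keys there is no equivalent switch on the customer side.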

Data Residency and GDPR Article 25

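GDPR Article 25 (Data Protection by Design and by Default) requires that personal data processing implement technical and organizational measures that protect privacy from the point of design, not as an afterthought, and that only the data necessary for each purpose is processed by default. For dialogue APIs processing EU user data, this means retaining the minimum necessary data for the minimum necessary period. Data residency is a related but separate question: GDPR does not strictly require that data collected in the EU stay in the EU, but Chapter V (Articles 44-50) prohibits transmitting context objects outside the EU without appropriate safeguards, and keeping storage and processing in-region is the most direct way to avoid relying on those safeguards.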

SOC 2 does not address GDPR compliance - it is a US framework and its privacy trust service criteria are modeled after AICPA's own privacy principles, not GDPR. Vendors that market EU data residency options need to address GDPR separately, typically through a Data Processing Agreement (DPA) with Standard Contractual Clauses (SCCs) covering any transfers outside the EU and a Sub-Processor List documenting third-party services used within the EU data processing chain.

Equmenopolis operates EU infrastructure in Frankfurt (AWS eu-central-1). EU tenants have their context data processed and stored exclusively in the EU region. The DPA includes SCCs and the Sub-Processor List covers the infrastructure and monitoring tools used in the EU region. Context data for EU tenants never leaves the EU region without explicit tenant consent, which is not granted by default.
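
A residency rule of this kind can be enforced as a guard that runs before any context write or processing job is routed to a region. The Python sketch below illustrates the idea under stated assumptions: the names Tenant, EU_REGIONS, and assert_region_allowed are hypothetical, only eu-central-1 comes from the text above, and this is not a description of the actual Equmenopolis implementation.

```python
from dataclasses import dataclass

# Assumption: the regions treated as "inside the EU" for residency purposes.
# Only eu-central-1 (Frankfurt) is named in the article.
EU_REGIONS = frozenset({"eu-central-1"})


class ResidencyViolation(Exception):
    """Raised when a write would move EU tenant data outside the EU."""


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    eu_resident: bool
    # Consent to transfer outside the EU is off unless the tenant grants it.
    cross_region_consent: bool = False


def assert_region_allowed(tenant: Tenant, target_region: str) -> None:
    """Reject any storage or processing target that breaks the residency rule."""
    if not tenant.eu_resident:
        return  # no residency restriction applies
    if target_region in EU_REGIONS:
        return  # data stays in the EU
    if tenant.cross_region_consent:
        return  # tenant explicitly allowed the transfer
    raise ResidencyViolation(
        f"tenant {tenant.tenant_id}: context data may not leave the EU "
        f"(requested region: {target_region})"
    )
```

Making consent a field that defaults to False mirrors the "not granted by default" wording: a transfer outside the EU has to be switched on deliberately per tenant.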

Incident Response for Dialogue Data Breaches

GDPR Article 33 requires the controller to notify the supervisory authority within 72 hours of becoming aware of a personal data breach, and requires a processor to notify the controller without undue delay. For a dialogue API vendor, a breach of context storage is a breach of potentially extensive personal information about your users. The vendor's incident response SLA therefore matters. If you plan conservatively and count the 72 hours from the moment the vendor detects the breach, a vendor that takes 48 hours to notify you leaves you 24 hours to notify the authority, and a vendor that takes 72 hours leaves you none: your window has closed by the time you learn there was a breach.

In vendor security questionnaires, ask: "What is your incident notification SLA for personal data breaches?" and "Does the notification SLA start from detection or from confirmation?" A vendor who starts the clock at confirmation (after investigating whether a breach actually occurred) can stretch the notification window indefinitely. The clock should start when a potential breach is detected, not when the investigation confirms it.
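
The deadline arithmetic can be sketched in a few lines of Python. This assumes the conservative planning model in which the 72-hour window is counted from the vendor's detection time. The function name, parameter names, and timestamp are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# 72-hour window from GDPR Article 33, counted here (as a conservative planning
# assumption) from the moment the vendor detects a potential breach.
AUTHORITY_WINDOW = timedelta(hours=72)


def hours_left_to_notify(detected_at: datetime, customer_notified_at: datetime) -> float:
    """Hours the customer has left to notify the authority; negative means overdue."""
    if customer_notified_at < detected_at:
        raise ValueError("customer cannot be notified before detection")
    deadline = detected_at + AUTHORITY_WINDOW
    return (deadline - customer_notified_at).total_seconds() / 3600


detected = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)  # illustrative timestamp
for vendor_delay_hours in (24, 48, 72):
    notified = detected + timedelta(hours=vendor_delay_hours)
    print(vendor_delay_hours, hours_left_to_notify(detected, notified))
```

A 24-hour vendor SLA leaves 48 hours, a 48-hour SLA leaves 24, and a 72-hour SLA leaves 0. If the vendor's clock starts at confirmation rather than detection, the investigation time comes out of the same budget.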

Practical Security Checklist for Dialogue API Procurement

Beyond SOC 2, ask the vendor the following:

- Is customer conversation data used for model training? The answer should be no, and the commitment should be contractual.
- What encryption standard and key management options are offered (AES-256, CMEK availability)?
- Which data residency options and geographic restrictions are available?
- Does the DPA cover GDPR, and is a Sub-Processor List provided?
- What is the incident notification SLA (target: 24 hours from detection)?
- How are API keys scoped, and what is the rotation policy?
- Can context deletion requests be fulfilled within 30 days? GDPR Article 17 requires erasure without undue delay, and Article 12(3) sets a one-month window for responding to the request.
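
For teams that track vendor responses in a tool rather than a spreadsheet, the checklist can be captured as a small data structure with pass/fail checks. The Python sketch below is one possible encoding. The VendorAnswers fields and the open_issues function are hypothetical, the thresholds (AES-256, 24 hours, 30 days) come from the checklist, and treating every item as a hard requirement is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorAnswers:
    """Hypothetical record of one vendor's questionnaire answers."""
    trains_on_customer_data: bool
    training_ban_in_contract: bool
    encryption_at_rest: str              # e.g. "AES-256"
    cmek_available: bool
    residency_regions: frozenset
    dpa_covers_gdpr: bool
    subprocessor_list_provided: bool
    notification_sla_hours: float
    sla_clock_starts_at_detection: bool
    api_keys_scoped_and_rotated: bool
    deletion_days: int


def open_issues(a: VendorAnswers, required_region: str) -> list:
    """Return the checklist items this vendor does not satisfy."""
    checks = [
        (not a.trains_on_customer_data and a.training_ban_in_contract,
         "training on customer data is not contractually excluded"),
        (a.encryption_at_rest == "AES-256", "encryption at rest is not AES-256"),
        (a.cmek_available, "CMEK is not available"),
        (required_region in a.residency_regions,
         f"no residency option for {required_region}"),
        (a.dpa_covers_gdpr and a.subprocessor_list_provided,
         "DPA or Sub-Processor List is missing"),
        (a.notification_sla_hours <= 24 and a.sla_clock_starts_at_detection,
         "notification SLA exceeds 24 hours from detection"),
        (a.api_keys_scoped_and_rotated, "API keys are not scoped and rotated"),
        (a.deletion_days <= 30, "context deletion takes longer than 30 days"),
    ]
    return [message for passed, message in checks if not passed]
```

An empty list means the vendor clears every item. Anything returned is a question to raise before signing, even if the vendor holds a SOC 2 Type II report.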

Conclusion

SOC 2 Type II is necessary but not sufficient for enterprise dialogue API security evaluation. The framework covers infrastructure controls but not the questions most specific to conversational AI: training data use, context encryption and key management, data residency, and regulatory compliance obligations. Enterprise procurement teams that add these questions to their standard SOC 2 review will identify material security differentiators between vendors that a SOC 2 certificate alone does not surface.