Data Retention & Privacy
How Hay automatically manages conversation data retention, anonymization, and GDPR compliance.
Overview
Hay includes a built-in data retention system that automatically removes personal information from old conversations. This helps you comply with GDPR and other data protection regulations while preserving non-personal fields (timestamps, status, channel) for analytics.
Retention is opt-in — by default, conversations are kept indefinitely. Once you set a retention period, Hay handles cleanup automatically.
How It Works
When a conversation passes the retention window, Hay anonymizes it rather than deleting it outright. This means:
Removed (personal data):
- All messages in the conversation
- Customer link (who the conversation was with)
- Conversation title
- Context and metadata
- Linked document references
- Associated vector embeddings
Preserved (for analytics):
- Timestamps (created, closed, updated)
- Conversation status and channel
- Agent assignment
- Organization ID
After anonymization, the conversation record remains with the title [Anonymized] and a deleted_at timestamp. This lets you keep aggregate metrics (resolution times, volume trends, channel distribution) without retaining any personal data.
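As a rough illustration of the record-level change described above, the sketch below clears the personal fields and keeps the analytics fields. The Conversation class and every field name in it are hypothetical stand-ins, not Hay's actual schema; messages and embeddings live elsewhere and are not modeled here.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Conversation:
    # Hypothetical record shape for illustration only.
    id: str
    organization_id: str
    status: str
    channel: str
    agent_id: Optional[str]
    created_at: datetime
    closed_at: Optional[datetime]
    updated_at: datetime
    title: Optional[str] = None
    customer_id: Optional[str] = None
    context: Optional[dict] = None
    metadata: Optional[dict] = None
    document_ids: list = field(default_factory=list)
    deleted_at: Optional[datetime] = None


def anonymize(conv: Conversation, now: datetime) -> Conversation:
    """Strip personal fields in place and mark the record as anonymized."""
    conv.title = "[Anonymized]"
    conv.customer_id = None      # customer link
    conv.context = None
    conv.metadata = None
    conv.document_ids = []       # linked document references
    conv.deleted_at = now
    # Timestamps, status, channel, agent_id and organization_id stay as they were.
    return conv
```

Because status, channel, and the timestamps survive, aggregate queries over the anonymized records keep working.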
Configuring Retention
- Go to Settings → Customer Privacy
- Under Conversation Retention Period, choose a timeframe: 30, 60, or 90 days, or indefinite
- Save your changes
The retention period counts from when a conversation was closed. Only conversations with status closed or resolved are eligible for anonymization. Open or in-progress conversations are never touched.
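The eligibility rule can be sketched as a small predicate. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, None stands for the indefinite setting, and the strict "older than the cutoff" comparison is a guess at the boundary behavior, which the documentation does not spell out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ELIGIBLE_STATUSES = {"closed", "resolved"}


def retention_cutoff(now: datetime, retention_days: int) -> datetime:
    """Conversations closed before this instant are past the retention window."""
    return now - timedelta(days=retention_days)


def is_past_retention(
    status: str,
    closed_at: Optional[datetime],
    now: datetime,
    retention_days: Optional[int],
) -> bool:
    if retention_days is None:
        return False  # indefinite retention: nothing is ever eligible
    if status not in ELIGIBLE_STATUSES or closed_at is None:
        return False  # open or in-progress conversations are never touched
    return closed_at < retention_cutoff(now, retention_days)


# Usage: a conversation closed on 15 April, checked on 1 June with a 30-day period.
now = datetime(2025, 6, 1, 3, 0, tzinfo=timezone.utc)
closed = datetime(2025, 4, 15, tzinfo=timezone.utc)
print(is_past_retention("closed", closed, now, 30))  # True
```

The window is measured from the close time, not the creation time, so a long-running conversation only starts its countdown once it is closed.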
Legal Hold
Sometimes you need to preserve a specific conversation regardless of your retention policy — for example, during legal proceedings or an active dispute.
To place a conversation on legal hold:
- Open the conversation
- Use the legal hold toggle to enable it
Conversations on legal hold are exempt from automatic anonymization, even if they exceed the retention period. They remain fully intact until the hold is removed.
Cleanup Schedule
The retention job runs automatically once per day (at 3:00 AM UTC). Each run:
- Finds all organizations with a retention policy configured
- Identifies closed/resolved conversations past the retention window
- Skips conversations on legal hold
- Deletes associated embeddings from the vector store
- Deletes all messages
- Anonymizes the conversation record
- Writes an entry to the audit log
If an error occurs while processing one organization, the job continues with the remaining organizations.
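The run described above can be sketched with in-memory data. Everything here is an illustrative assumption rather than Hay's implementation: organizations and conversations are plain dicts, the key names are invented, and whether an audit entry is written when nothing was due is not specified (this sketch writes one regardless). Scheduling the job for 3:00 AM UTC is left out.

```python
from datetime import datetime, timedelta
from typing import List


def process_organization(org: dict, now: datetime, audit_log: List[dict]) -> None:
    cutoff = now - timedelta(days=org["retention_days"])
    # Closed/resolved, past the window, and not on legal hold.
    due = [
        c for c in org["conversations"]
        if c["status"] in ("closed", "resolved")
        and c["closed_at"] < cutoff
        and not c["legal_hold"]
    ]
    embeddings_deleted = 0
    messages_deleted = 0
    for conv in due:
        embeddings_deleted += len(conv["embeddings"])
        conv["embeddings"] = []           # delete embeddings first
        messages_deleted += len(conv["messages"])
        conv["messages"] = []             # then delete messages
        conv["title"] = "[Anonymized]"    # then anonymize the record
        conv["deleted_at"] = now
    audit_log.append({
        "organization_id": org["id"],
        "conversations_anonymized": len(due),
        "messages_deleted": messages_deleted,
        "embeddings_deleted": embeddings_deleted,
        "retention_days": org["retention_days"],
        "cutoff": cutoff,
        "conversation_ids": [c["id"] for c in due],
    })


def run_retention_job(orgs: List[dict], now: datetime, audit_log: List[dict]) -> List[str]:
    """Process every organization with a policy; return the IDs that failed."""
    failed = []
    for org in orgs:
        if org.get("retention_days") is None:
            continue  # no retention policy configured
        try:
            process_organization(org, now, audit_log)
        except Exception:
            failed.append(org["id"])  # isolate the failure and keep going
    return failed
```

Wrapping each organization in its own try block is what keeps one bad record from blocking cleanup for everyone else.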
Audit Trail
Every retention cleanup is logged in the audit trail with:
- Number of conversations anonymized
- Number of messages deleted
- Number of embeddings deleted
- The retention period that was applied
- The cutoff date used
- The list of affected conversation IDs
You can review these logs to verify that retention is operating as expected.
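If you pull these entries into your own tooling, a basic consistency check might look like the sketch below. The RetentionAuditEntry class, its field names, and the JSON layout are assumptions made for illustration; the actual log format may differ.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class RetentionAuditEntry:
    # Hypothetical mirror of the six logged values listed above.
    conversations_anonymized: int
    messages_deleted: int
    embeddings_deleted: int
    retention_days: int
    cutoff: datetime
    conversation_ids: List[str]


def check_entry(entry: RetentionAuditEntry) -> List[str]:
    """Return a list of inconsistencies found in one audit entry."""
    problems = []
    if entry.conversations_anonymized != len(entry.conversation_ids):
        problems.append("count does not match number of conversation IDs")
    if len(set(entry.conversation_ids)) != len(entry.conversation_ids):
        problems.append("duplicate conversation IDs")
    return problems


def to_json(entry: RetentionAuditEntry) -> str:
    """Serialize an entry, writing the cutoff as an ISO 8601 string."""
    data = asdict(entry)
    data["cutoff"] = entry.cutoff.isoformat()
    return json.dumps(data)
```

An empty list from check_entry means the counts and the ID list agree; anything else is worth a closer look.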
FAQ
What happens if I change the retention period?
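The new period takes effect on the next daily run. Closed or resolved conversations that already exceed the new window are anonymized on that run, unless they are on legal hold. For example, shortening the period from 90 to 30 days makes a conversation closed 45 days ago eligible immediately.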
Can I undo anonymization?
No. Anonymization permanently removes personal data and cannot be reversed. This is by design: if the data could be restored, it would still count as personal data under GDPR.
Does retention affect analytics?
Aggregate analytics (conversation volume, resolution times, channel breakdown) are preserved. Per-conversation details and message content are not.
What about customer data exports?
Data exports (via Settings → Privacy) include all current data. If a conversation has already been anonymized, it will not appear in the export.
Are embeddings cleaned up too?
Yes. All vector embeddings linked to anonymized conversations are deleted, ensuring no semantic traces of the conversation remain in the vector store.