Data

Data protection assurance

Useful for

Data protection, Customer trust, Data governance, Governance

Introduction

Data protection should be tested in the same way the product is tested through penetration testing. Before wider customer onboarding, the company should get an external review of its data protection position.

The output should be a review document that can be shared with prospective clients. It should confirm that the company has considered how data is protected, where it is stored, what roles it plays under GDPR and how requests or incidents are handled.

Controller or Processor

The company must decide whether it is acting as a Data Controller, a Data Processor, or both in different contexts.

This affects contracts, privacy notices, customer expectations, breach reporting, data subject request handling and support timings.

Record:

Where the company is a Controller.
Where the company is a Processor.
Whether customers are Controllers for product data.
What instructions the company accepts from customers.
Who responds to data subject requests.
What reporting times apply after an incident.
What needs to be reflected in contracts and DPAs.

DPIA detail

A DPIA does not need to become a huge document before Pilot, but it should force the product team to think about the data properly.

The DPIA should not be written as a disconnected compliance document. It should be fed by a living data knowledge base that is built from evidence. An agent should be able to inspect Infrastructure as Code, application code, configuration, vendor records, logs and architecture notes, then update the data inventory and DPIA working record from what the product actually does.

Before Pilot, the product architecture and core data flows should be defined well enough to support the DPIA. The format is less important than the content: it may be a diagram, markdown record, generated summary or agent-created model, but it must identify the major components, core data stores, entry points, processing, access paths, exits and trust boundaries.

Useful DPIA prompts include:

What the product does.
What personal data may be collected.
Who the data subjects are.
Why the data is needed.
The lawful basis for processing.
Where the data is stored.
Which architecture components and data flows support the processing.
Which third parties can access it.
How the data is protected.
How long it is retained.
What happens if the Pilot database is recreated.

The DPIA should live with the other source-controlled decision records so people and agents can use it as product memory.

Agentic discovery loop

Run the data protection review loop when reviewing or updating the DPIA position. The loop should look at the system from several angles:

Architecture and data-flow records show the product shape, core stores, access paths and trust boundaries.
IaC shows stores, regions, backups, private endpoints, public exposure, networks, encryption settings and managed identities.
Code shows data models, validation, API contracts, logging calls, exports, background processing, support/admin actions and integrations.
Configuration shows services, secrets references, connection strings, third-party systems and feature flags.
Runtime and operational evidence shows logs, telemetry, alerts, support processes, backups, restore paths and vendor access.

The output should update the knowledge base rather than only produce a one-off finding. A useful knowledge base lets the company answer: what data exists, where it flows, who can access it, how it is protected, how long it is retained, how it is deleted and which risks remain.
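The idea of updating a knowledge base, rather than producing a one-off finding, can be sketched as below. This is a minimal illustration only: `Finding`, `KnowledgeBase`, `run_review_loop` and the source labels are hypothetical names, and each "inspector" stands in for whatever an agent actually does when it reads IaC, code, configuration or runtime evidence.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One observation about a dataset, tied to the evidence it came from."""
    dataset: str
    source: str  # hypothetical labels such as "iac", "code", "config", "runtime"
    detail: str


@dataclass
class KnowledgeBase:
    """Accumulates findings per dataset across review runs."""
    datasets: dict[str, list[Finding]] = field(default_factory=dict)

    def update(self, findings: list[Finding]) -> int:
        """Merge findings, skipping exact repeats, and return how many were new."""
        added = 0
        for finding in findings:
            known = self.datasets.setdefault(finding.dataset, [])
            if finding not in known:
                known.append(finding)
                added += 1
        return added


def run_review_loop(kb: KnowledgeBase, inspectors) -> int:
    """Run every inspector and fold its findings into the knowledge base."""
    return sum(kb.update(inspect()) for inspect in inspectors)
```

Running the loop a second time against an unchanged system adds nothing new, which gives a simple signal for whether the product has drifted since the last review.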

Data inventory detail

The data inventory is the start of the Record of Processing Activities, even if it is very small at first.

Useful fields include:

Dataset name.
Business owner.
Technical owner.
System or storage location.
Data collected.
Whether it includes personal data.
Purpose.
Lawful basis.
Retention period.
Backup location.
Third-party processors.
Deletion approach.

The inventory should record how each dataset was discovered. A dataset inferred from code, IaC or a vendor configuration is often stronger evidence than a dataset remembered in a workshop. Both are useful, but the source matters.
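The fields above, plus the discovery source, could be held in a small structured record such as the following. This is a sketch under assumptions: the class name, field names and the set of discovery sources are illustrative and not a prescribed schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class DiscoverySource(Enum):
    CODE = "code"
    IAC = "iac"
    VENDOR_CONFIG = "vendor_config"
    WORKSHOP = "workshop"


# Sources read from what the product actually does, rather than from memory.
EVIDENCE_BASED = {
    DiscoverySource.CODE,
    DiscoverySource.IAC,
    DiscoverySource.VENDOR_CONFIG,
}


@dataclass
class DatasetRecord:
    name: str
    business_owner: str
    technical_owner: str
    location: str
    data_collected: list[str]
    has_personal_data: bool
    purpose: str
    lawful_basis: str
    retention_period: str
    backup_location: str
    deletion_approach: str
    discovered_from: DiscoverySource
    third_party_processors: list[str] = field(default_factory=list)

    def is_evidence_based(self) -> bool:
        return self.discovered_from in EVIDENCE_BASED

    def missing_fields(self) -> list[str]:
        """Names of text fields left blank, so gaps stay visible."""
        return [
            key for key, value in vars(self).items()
            if isinstance(value, str) and not value.strip()
        ]
```

Keeping blank fields visible instead of rejecting incomplete records suits an inventory that starts very small and is filled in over time.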

Data flow and protection checks

Use the architecture and data flow record as the minimum structure where no better architecture or data-flow record exists. It should link core data stores to data inventory records and material flows to DPIA sections.

For each material flow, record:

where the data enters the system
where it is processed
where it is stored
whether it is encrypted at rest
whether it is encrypted in transit
who or what can access it
whether it appears in logs, prompts, analytics, exports or backups
whether it crosses region, tenant, supplier or trust boundaries
how it can be deleted or exported for data subject requests
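The checklist above could be captured per flow in a record like the one below, with a helper that lists what is still unanswered. This is an illustrative sketch: `FlowRecord`, `open_questions` and the field names are assumptions, and the four checks shown are examples, not a complete control set.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRecord:
    name: str
    enters_at: str
    processed_in: str
    stored_in: str
    encrypted_at_rest: bool
    encrypted_in_transit: bool
    accessed_by: list[str] = field(default_factory=list)
    appears_in: list[str] = field(default_factory=list)          # e.g. "logs", "backups"
    boundaries_crossed: list[str] = field(default_factory=list)  # e.g. "region", "supplier"
    request_route: str = ""  # how data is deleted or exported for data subject requests


def open_questions(flow: FlowRecord) -> list[str]:
    """Return the gaps a reviewer would need to raise for this flow."""
    questions = []
    if not flow.encrypted_at_rest:
        questions.append("not encrypted at rest")
    if not flow.encrypted_in_transit:
        questions.append("not encrypted in transit")
    if not flow.accessed_by:
        questions.append("access not recorded")
    if not flow.request_route.strip():
        questions.append("no deletion or export route recorded")
    return questions
```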

This is where technical governance and data governance meet. Private endpoints, TLS, managed identity, encryption, access control, backup handling, logging design and support access all affect the DPIA.

The flow record is also where the company records whether Pilot data will be deleted, migrated or retained.

Backup data governance

Backups are part of the data governance boundary. They may contain personal data, customer data, support records, audit data, logs or exports, even when the primary system has moved on.

The company should record:

which systems create backups
whether backups can be modified or deleted, or can only be restored and left to expire through retention
what personal or sensitive data may be present
how long backups are retained
who can restore backups
who can decrypt backups
whether backup access and decryption attempts are logged and alerted
how backup limitations affect deletion, erasure, retention and incident response

Some platforms allow backups to be retained and restored but not selectively modified. That does not make the backup position wrong, but it must be visible. If a data subject request, deletion request or customer contract promises deletion, the backup limitation needs to be explained and reflected in the procedure.

Where backup data cannot be updated directly, the company should record the live-system deletion and put the backup copy beyond normal use until it expires or is overwritten under the retention schedule. The backup should not be used for analytics, reporting, support, product behaviour or any other normal processing purpose.

The key operational requirement is restore handling. If a backup has to be restored, previous deletion, erasure, correction or restriction requests must be re-applied to the restored environment. The GDPR request register should therefore record any restore re-application action required by a request.
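One way to work out which requests need re-applying after a restore is to compare each request's completion time with the time the backup was taken. The sketch below assumes a simple in-memory register. `CompletedRequest`, `requests_to_reapply` and the action labels are hypothetical, and the strict time comparison is a simplification that ignores requests in progress while the backup ran.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CompletedRequest:
    request_id: str
    subject_id: str
    action: str  # hypothetical labels: "erasure", "correction", "restriction"
    completed_at: datetime


def requests_to_reapply(register, backup_taken_at: datetime):
    """Requests completed after the backup was taken, oldest first.

    The backup predates these requests, so restored data will not reflect them.
    """
    later = [r for r in register if r.completed_at > backup_taken_at]
    return sorted(later, key=lambda r: r.completed_at)
```

Returning the requests in completion order means a correction followed by an erasure for the same person is re-applied in the same sequence as the original actions.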

Where possible, long-term backups that may contain personal, sensitive or customer-confidential data should be encrypted so they are difficult to decrypt without a deliberate, reviewable action. A useful pattern is to encrypt backup material with a public key while keeping the private key in a controlled Key Vault or equivalent secret-management service. Access to the private key should be restricted, audited and monitored. Attempted or successful decryption should raise an alert.

The aim is not to make restore impossible. The aim is to make long-term backup access deliberate, rare, evidenced and visible.

People making erasure requests should be told what happens to their data in backups. Use the GDPR erasure backup response as a starting point where backup data is retained in a restore-only, immutable or technically difficult-to-modify form.

GDPR request tracking

GDPR requests should be treated as operational work, not only policy intent. Even during Pilot, the company needs a way to record and track requests so they are not lost in email, chat or support tickets.

The process can be manual at first, but it should be explicit. A request register should record:

when the request was received
what type of request it is
who owns the response
whether identity verification is needed
whether the company is acting as Controller or Processor
which customer or controller instruction applies, where relevant
which systems were checked
the due date
the decision and response
where evidence is stored
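Even a manual register benefits from a fixed shape. The sketch below mirrors the fields above. `RequestEntry` and its field names are assumptions, and `response_days` is a placeholder value only: the deadline that actually applies should be confirmed for each request type and role.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RequestEntry:
    received_on: date
    request_type: str  # e.g. "access", "erasure", "consent withdrawal"
    owner: str
    acting_as: str  # "controller" or "processor"
    identity_check_needed: bool = False
    controller_instruction: str = ""
    systems_checked: list[str] = field(default_factory=list)
    decision: str = ""
    evidence_location: str = ""
    response_days: int = 30  # assumption: placeholder, confirm the real deadline

    @property
    def due_date(self) -> date:
        return self.received_on + timedelta(days=self.response_days)

    def is_overdue(self, today: date) -> bool:
        """An entry is overdue when no decision is recorded after the due date."""
        return not self.decision and today > self.due_date
```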

Use the GDPR request register to track requests. The register should cover data subject requests, privacy complaints, consent withdrawals, erasure requests, access requests and customer/controller instructions where the company acts as Processor.

This is an explicit readiness test for Pilot. If external users can enter personal data, the company must be able to receive, track and evidence GDPR requests even if fulfilment is manual.

Product and company data locations

Data storage documentation should include the obvious product database and the places where data can quietly appear:

Application logs.
Audit logs.
Error tracking tools.
Analytics tools.
Emails.
Support tools.
Backups.
Developer exports.
Test datasets.
Third-party SaaS products.

The company's own GDPR position also matters. A startup may hold personal data about staff, contractors, suppliers, prospects and investors before the product has many users. Microsoft 365, SharePoint, OneDrive, spreadsheets, accounting tools, payroll tools and support systems should be included in the data view.

External review

The review should cover:

DPIA quality.
Data inventory completeness.
Controller and Processor position.
Privacy policy and cookie position.
Vendor and processor list.
Data subject request procedure.
Breach reporting procedure.
Retention and deletion approach.
Logging position, especially no PII in logs.
Evidence that procedures exist and can be followed.

The review should also check whether the data knowledge base is being maintained as the product changes. If the DPIA does not reflect the current code, infrastructure and vendors, it should be treated as stale evidence.
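For the "no PII in logs" item in the review list, one possible backstop is a logging filter that redacts email-like strings before a record is emitted. The sketch below uses Python's standard `logging` module. The `RedactEmails` class and the `EMAIL` pattern are illustrative assumptions, the pattern catches only one kind of personal data, and the filter does not replace a logging design that avoids writing PII in the first place.

```python
import logging
import re

# Deliberately simple pattern for email-like strings. It is not exhaustive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmails(logging.Filter):
    """Replace email-like strings in the formatted message before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as arguments are covered too.
        record.msg = EMAIL.sub("[redacted-email]", record.getMessage())
        record.args = ()  # the message is already formatted
        return True
```

Attaching the filter to each handler, for example with `handler.addFilter(RedactEmails())`, applies it to every record that handler processes, including records from child loggers.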

How this evolves as the company grows

Before Pilot, architecture, core data flows, DPIA, data inventory and Controller/Processor decisions need to be started from evidence.

At Pilot, assume external users may enter PII, and make GDPR request handling, the no-PII-in-logs position and backup limitations visible.

Before Production, data protection needs review evidence that supports contracts, privacy notices, support access and customer assurance.

As the company scales, data protection should be reviewed like a product capability, including external review and customer-facing evidence.

What an agent should look for

Does the DPIA come from architecture, code, IaC and data flows?

Are GDPR request routes and records in place?

Are Controller/Processor decisions reflected in contracts?

What good looks like

The company can explain the decision, show the evidence behind it and identify the next point where the control needs to mature.

