
Stage gates

Useful for: Governance, Production readiness, Company readiness, Agentic delivery

Introduction

Stage gates are decision points. They help a company decide whether it is ready to move to the next level of customer exposure, operational risk and commercial promise. They are not ceremonies and they are not a replacement for judgement.

The useful question is always the same: what evidence exists, what risk remains, who owns that risk and what changes when the company moves forward?

How the model works

A gate should produce one of three outcomes. Pass means the required evidence exists and the remaining risk is acceptable. Conditional pass means the company can move forward, but the carried actions are explicit and must be reviewed at the next gate. Block means the missing evidence or unresolved risk is too material for the next stage.

Conditional passes should become stricter over time. A weak control might be acceptable at company setup, but the same gap can become a blocker before POC or Pilot because the blast radius has changed.
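
As a rough illustration, the three outcomes and the tightening of conditional passes could be modelled as below. This is a minimal Python sketch under stated assumptions, not a prescribed tool: the names Control, GateResult and evaluate_gate, and the blocking_from_stage field, are introduced here for illustration, and gates are simply numbered 1 to 6 in the order they appear in this guidance.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    CONDITIONAL_PASS = "conditional pass"
    BLOCK = "block"


@dataclass
class Control:
    name: str
    has_evidence: bool        # evidence exists, not just an assertion
    blocking_from_stage: int  # hypothetical: first gate at which a gap blocks


@dataclass
class GateResult:
    outcome: Outcome
    # Gaps behind a conditional pass or a block; empty on a clean pass.
    open_items: list[str] = field(default_factory=list)


def evaluate_gate(stage: int, controls: list[Control]) -> GateResult:
    """Return pass, conditional pass or block for the given gate number."""
    gaps = [c for c in controls if not c.has_evidence]
    if not gaps:
        return GateResult(Outcome.PASS)
    names = [c.name for c in gaps]
    # The same gap becomes a blocker once the stage reaches its threshold.
    if any(stage >= c.blocking_from_stage for c in gaps):
        return GateResult(Outcome.BLOCK, names)
    return GateResult(Outcome.CONDITIONAL_PASS, names)
```

In this sketch a missing decision record trail with blocking_from_stage set to 2 yields a conditional pass at gate 1 and a block at gate 2, because the same gap is judged against a larger blast radius. The decision itself still rests on judgement; the code only makes the carried items explicit.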

Is the company ready?

Before a startup begins building software, it needs enough company infrastructure to operate deliberately. This does not mean building a mature enterprise control environment on day one. It means putting the company in a position where identity, domain ownership, email, documentation, devices and basic responsibilities are understood.

This gate exists because early shortcuts become difficult to unwind. The first tenant, the first domain, the first admin account and the first documentation area can quietly become the foundation for everything that follows.

The company is ready when it can safely create accounts, communicate externally, hold governance documentation, control administrative access and explain who owns the public presence of the business.

Passing this gate also means the company has started a reviewable record of the important facts. It is acceptable for the controls to be lightweight, but it is not acceptable for the company to be unable to say where something is managed, who owns it, when it expires or what risk has been accepted.

Key blocking risks include:

Shared admin accounts.
Unknown domain or DNS ownership.
No control over email authentication.
No place to store governance documentation.
No clear ownership of the corporate website.
No process for handling staff personal information.
Important controlled assets are not recorded, including where they are managed, who owns them and whether they expire.
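
The reviewable record described above can be very lightweight. The following Python sketch shows one possible shape for it, assuming a simple in-memory list; the ControlledAsset fields and the helper names missing_facts and expiring_within are hypothetical and chosen only to mirror the questions in this section (where it is managed, who owns it, when it expires).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ControlledAsset:
    name: str
    managed_in: Optional[str]    # where the asset is managed
    owner: Optional[str]         # who owns it
    expires_on: Optional[date]   # None when there is no expiry date on record
    expiry_known: bool = True    # False when nobody has checked yet


def missing_facts(asset: ControlledAsset) -> list[str]:
    """List the facts the company cannot currently state for this asset."""
    missing = []
    if not asset.managed_in:
        missing.append("where it is managed")
    if not asset.owner:
        missing.append("owner")
    if not asset.expiry_known:
        missing.append("expiry")
    return missing


def expiring_within(assets: list[ControlledAsset], today: date, days: int) -> list[str]:
    """Names of assets whose recorded expiry falls within the next `days` days."""
    return [
        a.name
        for a in assets
        if a.expires_on is not None and 0 <= (a.expires_on - today).days <= days
    ]
```

An asset with any entry in missing_facts is a candidate blocking risk at this gate. A spreadsheet with the same columns would serve equally well; the point is that the record exists and can be reviewed.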

Can we start the POC?

A POC should be cheap, fast and disposable, but it should not be chaotic. The aim is to let the team learn quickly without creating avoidable governance debt that later blocks a Pilot or Production launch.

This gate exists because the habits created in the POC often become the default operating model. Naming, repositories, secrets, decision records, basic platform governance and logging standards should be in place before the first useful demo.

The POC can start when the team can build, deploy and learn without storing secrets in source, capturing PII in logs, losing important decisions or creating infrastructure that is mistaken for Production.

Key blocking risks include:

Conditional pass actions from gate 01 (Is the company ready?) are still open and have not been explicitly re-accepted.
Secrets in source control.
PII written to logs.
No POC data governance stance.
No decision record trail.
New POC assets or temporary security compromises are not recorded.
No repository ownership.
POC environment presented as reusable Production platform infrastructure.
No ability to redeploy the demo.
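
Each gate from this point on treats unresolved conditional pass actions from the previous gate as a blocking risk unless they have been explicitly re-accepted. A minimal Python sketch of that check follows. The CarriedAction fields and the blocking_actions function are assumptions made for illustration, including the choice that a re-acceptance only counts when it names an accountable person and refers to the gate being assessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CarriedAction:
    description: str
    raised_at_gate: int
    closed: bool = False
    reaccepted_at_gate: Optional[int] = None  # gate at which the risk was re-accepted
    accepted_by: Optional[str] = None         # who owns the re-accepted risk


def blocking_actions(actions: list[CarriedAction], current_gate: int) -> list[CarriedAction]:
    """Carried actions that should block the current gate."""
    blocking = []
    for action in actions:
        if action.raised_at_gate >= current_gate or action.closed:
            continue
        reaccepted = (
            action.reaccepted_at_gate == current_gate
            and action.accepted_by is not None
        )
        if not reaccepted:
            blocking.append(action)
    return blocking
```

An action re-accepted at an earlier gate still blocks here, which reflects the idea that acceptance has to be renewed as the blast radius grows.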

Are we ready for a Pilot?

A Pilot is different from a POC because external users may enter real data. Even if the product is still changing quickly, the governance stance has to assume that personal data, commercial data or customer-sensitive information may appear.

This gate exists because Pilot data creates real obligations. The company may need to support data subject requests, explain where data is stored, understand its Controller or Processor position and make honest promises to Pilot users.

The Pilot can start when the product can accept external use without the data protection, platform infrastructure and support position being fictional.

Key blocking risks include:

Conditional pass actions from gate 02 (Can we start the POC?) are still open and have not been explicitly re-accepted.
External users can enter data before the company understands what data is collected.
No DPIA or data inventory has been started.
No DSAR process exists.
Pilot data status is unclear.
PII appears in logs.
Product AI sends personal data to an uncontrolled location.
No separation between test and Pilot workloads.

Are we ready for Pre-Production?

Pre-Production is where the organisation stops relying on memory and starts proving that the platform can be recreated, secured, changed and recovered deliberately. The system may still be small, but the operating model needs to become more disciplined.

This gate exists because a successful Pilot creates pressure to move quickly. Without a deliberate Pre-Production stance, teams often take Pilot platform infrastructure, add customers and call it Production.

The team is ready for Pre-Production when the platform, access model, release process, incident process and infrastructure approach can support real customer promises.

Key blocking risks include:

Conditional pass actions from gate 03 (Are we ready for a Pilot?) are still open and have not been explicitly re-accepted.
Production access is informal or always-on.
Platform infrastructure cannot be recreated.
Restore or failover depends on memory.
No incident coordinator role exists.
No release traceability exists.
No clear public/private access stance.
No supply chain control over dependencies or containers.

Are we ready for Production?

Production is the point where the business starts making real promises. The product does not need to be perfect, but the company must understand what it is promising, what it can recover from, what risks it has accepted and how it will communicate when something goes wrong.

This gate exists because Production is not just a hosting environment. It is a business commitment covering security, data protection, contracts, support, monitoring, operations and cost.

The product is ready for Production when the company can make customer promises and operate the system with evidence, tested recovery paths and accepted risks.

Key blocking risks include:

Conditional pass actions from gate 04 (Are we ready for Pre-Production?) are still open and have not been explicitly re-accepted.
Security findings are unknown or ignored.
Backup restore has not been tested.
Contracts conflict with the data protection position.
No monitoring exists.
No incident communication process exists.
Customer promises exceed the platform's resilience.
Product AI processes personal data without validation or location control.

Are we ready to scale Production?

Scaling Production is not only about adding compute. As customer count, usage and dependency on the product increase, the tolerance for downtime shrinks and the cost of weak operations rises.

This gate exists because resilience should follow actual risk. Early Production may tolerate planned downtime or cold standby. As usage grows, the business may need warm standby, hot failover, active-active architecture, stronger support and more formal operating evidence.

Production is ready to scale when the business understands usage patterns, customer expectations, resilience triggers, support load and unit economics well enough to invest deliberately.

Key blocking risks include:

Conditional pass actions from gate 05 (Are we ready for Production?) are still open and have not been explicitly re-accepted.
Customer usage has grown but resilience expectations have not been revisited.
No one knows the cost per customer or tenant.
Support model is already overloaded.
Device governance still relies on BYOD where customer assurance or certification now requires stronger endpoint control.
Failover is promised but not tested.
There is no trigger for moving from cold to warm or hot recovery.
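
A resilience trigger can be as simple as a written threshold that is checked against current usage. The Python sketch below illustrates the idea. Every number in it is a hypothetical placeholder, not a recommendation, and the Trigger, required_tier and needs_review names are assumptions introduced here; a real trigger would be set from the company's own customer promises and unit economics.

```python
from dataclasses import dataclass

# Recovery tiers in increasing order of resilience and cost.
TIERS = ["cold", "warm", "hot"]


@dataclass(frozen=True)
class Trigger:
    tier: str
    min_customers: int                   # at or above this count, the tier applies
    max_tolerated_downtime_hours: float  # at or below this tolerance, the tier applies


# Hypothetical thresholds, listed from the weakest tier to the strongest.
TRIGGERS = [
    Trigger("warm", min_customers=20, max_tolerated_downtime_hours=8.0),
    Trigger("hot", min_customers=100, max_tolerated_downtime_hours=1.0),
]


def required_tier(customers: int, tolerated_downtime_hours: float) -> str:
    """Strongest tier whose trigger is met; 'cold' when none is met."""
    tier = "cold"
    for trigger in TRIGGERS:
        if (
            customers >= trigger.min_customers
            or tolerated_downtime_hours <= trigger.max_tolerated_downtime_hours
        ):
            tier = trigger.tier
    return tier


def needs_review(current_tier: str, customers: int, tolerated_downtime_hours: float) -> bool:
    """True when usage now calls for a stronger tier than the one in place."""
    required = required_tier(customers, tolerated_downtime_hours)
    return TIERS.index(required) > TIERS.index(current_tier)
```

The value of a trigger like this is less in the arithmetic than in the fact that it is written down before growth arrives, so that revisiting resilience becomes a scheduled decision, not a reaction to an outage.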

What an agent should look for

Which controls are satisfied by evidence rather than assertion.

Which conditional pass actions are still open from an earlier gate.

Which risks belong in the technical risk register and which should be escalated to the company risk register.

Whether the evidence standard is increasing as external users, customer data and production promises increase.

Whether the company is using the gate to make a decision or simply documenting drift.

What good looks like

The company knows why it is moving forward, what evidence supports the decision, which risks are accepted and what must mature before the next commitment.

