This guide is for anyone running the go-live of a custom software project — internal or external-facing — and trying to avoid the launch-day disasters that turn a successful build into a damaged reputation. By the end you will know how to prepare in the days leading up to launch, how to choose between cutover styles, what to monitor in the first 48 hours, and how to plan the rollback you hope you never use.
Who This Guide Is For
Product owners, project managers, and operations leads responsible for the moment a piece of software starts taking real traffic — whether that is a new internal tool replacing a manual process, a client portal going live to 200 customers, or a public-facing app reaching its first users. The risk profile differs, but the discipline is the same: launch should be uneventful, and the only way to make it uneventful is to plan it.
Before You Start
You should have completed UAT and have sign-off from the business that the software is ready. If you have not, How to Handle User Acceptance Testing covers that step — launching software that has not been formally accepted is how launch-day surprises happen.
You should also have a production environment that differs from staging in the ways that matter: the same code, but with real production configuration, real third-party API keys, the real domain, real SSL, and the real database. “It worked on staging” is not the same as “it works in production”.
Step 1: Freeze the Code, Not the Plan
Three to five working days before launch, code on the release branch should freeze. No new features, no scope additions, no last-minute “while we are at it” changes. Bug fixes only, and only for issues already identified in UAT.
The reason this matters is that every change carries a risk of breaking something else. Changes made in the final days before launch have not been through the same testing cycle as the rest of the work, and they are disproportionately responsible for launch-day issues. A team that resists scope freeze on the theory that the change is “small” is the team that finds out the small change broke the entire onboarding flow.
The code freeze is technical; the plan is not frozen. The plan continues to evolve: communication, training, contingency. But the software itself stops changing.
Step 2: Build the Pre-Launch Checklist
The week of launch, run a structured pre-launch checklist. The categories that always matter:
- Infrastructure: production servers provisioned and tested under load, database backups configured, SSL certificate in place and verified, DNS records prepared, CDN configured
- Code: the production branch deployed to production, all migrations run, environment variables verified, third-party API keys active and tested
- Data: any required initial data loaded, user accounts seeded for the first day’s users, test transactions completed end-to-end
- Integrations: every integration verified against production endpoints, not staging — Stripe in production mode, not test mode, Xero pointing at the real tenant
- Monitoring: error tracking active, uptime monitoring configured, alerting routed to a real human in a channel they actually watch
- Documentation: user guides published if relevant, internal runbooks current, the team that will support the launch knows what to do
- Communication: launch announcement drafted and scheduled, internal team briefed, support team trained on the new system
This list is generic; the actual one for your project will be twice as long and project-specific. Build it explicitly. Tick every item. Anything not ticked is a launch-blocker.
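The rule that anything not ticked is a launch-blocker can be made mechanical. The following is a minimal sketch, assuming the checklist is kept as structured data; the `ChecklistItem` type, the `launch_blockers` function, and the three sample items are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    category: str      # e.g. "Infrastructure", "Integrations"
    description: str
    done: bool = False


def launch_blockers(items: list[ChecklistItem]) -> list[ChecklistItem]:
    """Return every item that has not been ticked; any result blocks the launch."""
    return [item for item in items if not item.done]


# Hypothetical items for illustration; a real list is longer and project-specific.
checklist = [
    ChecklistItem("Infrastructure", "SSL certificate in place and verified", done=True),
    ChecklistItem("Integrations", "Stripe in production mode, not test mode", done=True),
    ChecklistItem("Monitoring", "Alerting routed to a watched channel", done=False),
]

blockers = launch_blockers(checklist)
for item in blockers:
    print(f"BLOCKER [{item.category}] {item.description}")
print("GO" if not blockers else "NO-GO")
```

Keeping the list as data rather than in a document makes the go/no-go decision a query instead of a discussion.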
Step 3: Choose the Cutover Style
The cutover decides how risk is managed at the moment of go-live. The three patterns that fit most projects:
- Big bang: at a defined moment, the new system replaces the old completely. Simple, decisive, high-risk. Fits internal tools where the user population is small and rollback is reasonable.
- Phased: the new system goes live for one team, one user group, or one geography at a time. Each phase confirms the system before the next phase begins. Fits broader rollouts where the user base is large or heterogeneous.
- Parallel running: the new and old systems run alongside each other for a period. Users access both; the team verifies the new system’s outputs match the old. Highest assurance, highest overhead. Fits high-stakes systems where errors are expensive.
For most internal tools, phased is the right answer — go live for one team first, fix what surfaces, then expand. For client-facing systems, parallel running is often the safest path but adds operational overhead. Big bang is appropriate only when the system is well-understood, the user population is small, and a quick rollback is genuinely possible.
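For parallel running, the check that the new system's outputs match the old can be partly automated. The sketch below assumes both systems can export a per-record value keyed by a shared ID; the invoice IDs, the totals, and the `compare_outputs` function are hypothetical.

```python
from decimal import Decimal


def compare_outputs(old: dict[str, Decimal], new: dict[str, Decimal]) -> dict[str, list[str]]:
    """Compare per-record outputs from the old and new systems, keyed by record ID."""
    return {
        "missing_in_new": sorted(old.keys() - new.keys()),
        "extra_in_new": sorted(new.keys() - old.keys()),
        "mismatched": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical exports from each system for the same period.
old_totals = {"INV-001": Decimal("120.00"), "INV-002": Decimal("75.50"), "INV-003": Decimal("10.00")}
new_totals = {"INV-001": Decimal("120.00"), "INV-002": Decimal("75.05"), "INV-004": Decimal("5.00")}

report = compare_outputs(old_totals, new_totals)
for kind, record_ids in report.items():
    print(kind, record_ids)
```

`Decimal` is used instead of floats so that monetary values compare exactly. Any non-empty list in the report is something to explain before the old system is switched off.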
A concrete example. A client switching from a legacy job tracking system to a new custom one had 40 internal users. We chose phased — the operations team (eight users) went live first for a week, fixed the small issues that surfaced, then the rest of the business followed. The first week absorbed all the genuine issues; the full rollout was a non-event.
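A phased cutover like this one needs some way to decide who is on the new system at each stage. One simple option is a per-team gate, sketched below. Only the operations team comes from the example; the other team names and the `PHASES` structure are assumptions for illustration.

```python
# Rollout phases in order; each phase adds teams to the ones already live.
PHASES: list[set[str]] = [
    {"operations"},                     # phase 1: the first team to go live
    {"sales", "finance", "warehouse"},  # phase 2: hypothetical names for the rest of the business
]


def teams_live(current_phase: int) -> set[str]:
    """All teams on the new system once phase `current_phase` (1-based) has started."""
    live: set[str] = set()
    for teams in PHASES[:current_phase]:
        live |= teams
    return live


def uses_new_system(team: str, current_phase: int) -> bool:
    """True if this team should be routed to the new system in the given phase."""
    return team in teams_live(current_phase)


print(uses_new_system("operations", 1))  # True
print(uses_new_system("sales", 1))       # False
print(uses_new_system("sales", 2))       # True
```

Moving to the next phase is then a one-line change to the current phase number, made only after the issues from the previous phase are fixed.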
Step 4: Pick the Right Launch Day
The mechanics of launch day are partly about timing. Launch Tuesday morning, not Friday afternoon. The reason is simple: if something goes wrong, you want the team available all week to fix it, not racing against the weekend.
Avoid launching the day before a holiday, the day before someone critical to the project is on leave, or in the middle of a known busy period for the business. The cost of pushing the launch by a week is small. The cost of launching into a week where you have no time or people to fix issues is high.
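These timing rules are simple enough to check in code when a date is proposed. The sketch below is one possible encoding, not a definitive rule set: it treats any day other than Tuesday as a warning, and the holiday, leave, and busy-period inputs are hypothetical sets of dates the team would supply.

```python
from datetime import date, timedelta


def launch_day_problems(
    day: date,
    holidays: set[date],
    leave_starts: set[date],
    busy_days: set[date],
) -> list[str]:
    """List the reasons a proposed launch date conflicts with the timing rules."""
    problems = []
    if day.weekday() != 1:  # Monday is 0, so Tuesday is 1
        problems.append("not a Tuesday")
    next_day = day + timedelta(days=1)
    if next_day in holidays:
        problems.append("day before a holiday")
    if next_day in leave_starts:
        problems.append("day before a key person goes on leave")
    if day in busy_days:
        problems.append("inside a known busy period")
    return problems


# Hypothetical dates: 15 April 2025 is a Tuesday, 18 April 2025 is a Friday.
print(launch_day_problems(date(2025, 4, 15), set(), set(), set()))
print(launch_day_problems(date(2025, 4, 18), set(), set(), set()))
```

An empty list means the date passes these checks; it does not replace judgement about the wider business calendar.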
If the launch involves a marketing push or external communication, coordinate the timing with the comms team, but the technical go-live should still happen on a quiet morning. A marketing announcement on Wednesday can follow a Tuesday morning launch that the engineering team has confirmed is stable.
Step 5: Watch the First 48 Hours
The first 48 hours after launch are when the real testing begins. Production load, real users, real edge cases. The team should be actively watching, not on standby.
What to watch: error rates in the application (a spike in errors is the first sign of trouble), response times (a sudden slowdown often precedes a wider outage), database load (queries that worked fine in staging can fall over at production scale), third-party API responses (rate limits and timeouts surface in production), and user feedback (the support channel often surfaces issues before monitoring does).
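A sudden slowdown is easier to spot against a baseline than by eye. The sketch below compares the 95th-percentile response time of a recent window with a pre-launch baseline. The factor of 2 is an assumed threshold, not a figure from this guide, and the sample timings are invented.

```python
from statistics import quantiles


def p95(samples_ms: list[float]) -> float:
    """95th percentile of response times (needs at least two samples)."""
    # quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(samples_ms, n=20)[-1]


def slowdown_detected(baseline_ms: list[float], current_ms: list[float], factor: float = 2.0) -> bool:
    """True if the current p95 is more than `factor` times the baseline p95."""
    return p95(current_ms) > factor * p95(baseline_ms)


# Invented timings in milliseconds, for illustration only.
baseline = [100.0] * 50
after_launch = [300.0] * 50

if slowdown_detected(baseline, after_launch):
    print("Response times have degraded; investigate before it becomes an outage")
```

A percentile is used rather than the mean because a handful of very slow requests can hide behind an average that still looks healthy.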
Responsibility for monitoring should be explicit. “We will check in on the system” is not a plan. “Sam is on point until 5pm today, Alex from 5pm to 9am tomorrow, Sam back on at 9am” is a plan. Cover the hours users are active. If the user base is global, the rota is twenty-four hours; if it is a single office, cover that office's working hours plus a couple of hours of buffer either side.
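A rota written as data can be checked for gaps before launch day. The sketch below assumes round-the-clock cover for a 48-hour window. The dates are hypothetical, the names follow the Sam and Alex example, and the second evening is deliberately left uncovered to show what a gap looks like.

```python
from datetime import datetime

Shift = tuple[str, datetime, datetime]  # (person, start, end)


def coverage_gaps(
    shifts: list[Shift], window_start: datetime, window_end: datetime
) -> list[tuple[datetime, datetime]]:
    """Return the periods inside the watch window where nobody is on point."""
    gaps = []
    covered_until = window_start
    for _, start, end in sorted(shifts, key=lambda shift: shift[1]):
        if start > covered_until:
            gaps.append((covered_until, min(start, window_end)))
        covered_until = max(covered_until, end)
        if covered_until >= window_end:
            break
    if covered_until < window_end:
        gaps.append((covered_until, window_end))
    return gaps


# Hypothetical 48-hour window starting at a 9am launch.
launch = datetime(2025, 4, 15, 9, 0)
window_end = datetime(2025, 4, 17, 9, 0)
rota = [
    ("Sam", datetime(2025, 4, 15, 9, 0), datetime(2025, 4, 15, 17, 0)),
    ("Alex", datetime(2025, 4, 15, 17, 0), datetime(2025, 4, 16, 9, 0)),
    ("Sam", datetime(2025, 4, 16, 9, 0), datetime(2025, 4, 16, 17, 0)),
]

gaps = coverage_gaps(rota, launch, window_end)
for gap_start, gap_end in gaps:
    print(f"Nobody on point from {gap_start} to {gap_end}")
```

For a single-office user base the window would instead be the working hours plus buffer, checked one day at a time.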
After 48 hours, monitoring relaxes to standard operational levels — but not before.
Step 6: Have a Rollback Plan You Have Actually Tested
A rollback plan is not a rollback plan unless it has been tested. Worryingly often, a launch goes ahead on the strength of “we have a rollback plan” that turns out to mean “we have a vague idea what we’d do”.
A real rollback plan covers: the trigger criteria (under what conditions do we roll back?), the procedure (what exactly do we do?), who has authority to decide (a named person), and how long it takes (typically 15 minutes to 2 hours depending on architecture). It has been rehearsed on a non-production environment, so the team knows exactly which steps run and in what order.
A concrete example. A SaaS launch had a rollback plan that involved switching DNS back to the old version of the application. The team had never tested it. On launch day, the DNS change took 90 minutes to propagate — much longer than expected — and during that window users were hitting a half-working system. The fix was correct in principle; the timing was wrong because nobody had measured it. A 30-minute test of the rollback path the week before would have caught this.
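The timing question can be put into rough numbers before the rehearsal. The sketch below adds up the timed steps and treats the DNS record's TTL as an allowance for cached answers expiring. That is a simplifying assumption: some resolvers hold records for longer than the TTL, which is one reason the path has to be measured rather than calculated. Every duration here is hypothetical.

```python
from datetime import timedelta


def worst_case_rollback(step_durations: dict[str, timedelta], dns_ttl: timedelta) -> timedelta:
    """Sum the rehearsed step timings and add the DNS TTL as a cache-expiry allowance."""
    return sum(step_durations.values(), timedelta()) + dns_ttl


# Hypothetical figures; replace with timings measured in a rehearsal.
steps = {
    "decide to roll back": timedelta(minutes=10),
    "switch the DNS record": timedelta(minutes=5),
}
dns_ttl = timedelta(seconds=3600)     # assumed TTL on the record being switched
acceptable = timedelta(minutes=30)    # assumed tolerance for a half-working system

total = worst_case_rollback(steps, dns_ttl)
print(f"Estimated worst case: {total}")
if total > acceptable:
    print("Too slow: lower the TTL well before launch or choose a different rollback path")
```

With these figures the TTL dominates the estimate, which matches the shape of the problem in the example: the procedure was quick, the propagation was not.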
Trigger criteria matter because they avoid the launch-day debate about whether to roll back. “If the error rate exceeds 2% for 15 minutes, we roll back” is much easier to act on than “we’ll see how it goes”. Decide the criteria in advance; act on them when they trigger.
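A criterion written this precisely can be expressed directly in code. The sketch below assumes the monitoring system provides one error-rate reading per minute, oldest first; the function name and that input format are hypothetical, while the 2% and 15-minute values are the ones from the example.

```python
def should_roll_back(
    error_rates: list[float],
    threshold: float = 0.02,
    sustained_minutes: int = 15,
) -> bool:
    """True if every one of the most recent readings exceeds the threshold.

    `error_rates` holds one reading per minute, oldest first, as fractions (0.02 is 2%).
    """
    if len(error_rates) < sustained_minutes:
        return False  # not enough data yet to say the problem is sustained
    recent = error_rates[-sustained_minutes:]
    return all(rate > threshold for rate in recent)


print(should_roll_back([0.03] * 15))            # True: above 2% for the full 15 minutes
print(should_roll_back([0.03] * 14 + [0.01]))   # False: the latest reading has recovered
```

Requiring every reading in the window to exceed the threshold avoids rolling back on a single transient spike. Whether a brief dip should reset the clock is a decision to make in advance along with the threshold itself.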
Step 7: Run a Post-Launch Review
A week after launch, hold a structured review. What went well, what went badly, what would we do differently. Document it. The review is not blame allocation; it is institutional learning. The next launch is easier when the lessons from this one are written down.
The review covers technical issues (what went wrong with the deployment, monitoring, infrastructure), process issues (was the checklist complete, was the cutover style right, was the timing right), communication (did the team and users know what was happening), and outcomes (did the launch achieve what it was supposed to). Honest answers to those questions, written down, make every future launch faster and safer.
Common Mistakes
- Friday afternoon launches. Optimistic and irresponsible. Launch Tuesday morning.
- No code freeze. The “small change” added on launch day breaks something else. Freeze the code, fix only what UAT already surfaced.
- Pre-launch checklist that lives in someone’s head. Write it down. Tick it explicitly. Anything not ticked is a launch-blocker.
- No rollback plan, or an untested one. A rollback plan that has never been rehearsed is not really a plan. Test the rollback path before launch.
- Going live with staging keys, test mode, or sandbox configurations. Stripe test mode, Xero sandbox tenant, monitoring pointed at the staging server — these survive into production launches more often than they should. The pre-launch checklist must verify production configuration explicitly.
- Walking away in the first 48 hours. The launch is not over when the deployment ships. The team has to be watching actively until the system has proven itself under real load.
- No post-launch review. The lessons get lost and the same mistakes repeat on the next launch.
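Of the mistakes above, leftover staging configuration is one that an automated check can partly catch. The sketch below scans configuration values for Stripe test-mode key prefixes (`sk_test_` and `pk_test_`) and for a few marker words. The variable names, sample values, and marker list are hypothetical, and a clean result does not prove the configuration is right.

```python
def staging_leftovers(config: dict[str, str]) -> list[str]:
    """Flag configuration values that look like test-mode or staging settings."""
    problems = []
    for name, value in config.items():
        if value.startswith(("sk_test_", "pk_test_")):
            problems.append(f"{name}: Stripe test-mode key")
        elif any(marker in value.lower() for marker in ("staging", "sandbox", "localhost")):
            problems.append(f"{name}: looks like a non-production setting")
    return problems


# Hypothetical configuration; in practice this might be dict(os.environ).
production_config = {
    "STRIPE_SECRET_KEY": "sk_test_EXAMPLE",
    "ERROR_TRACKING_URL": "https://errors.staging.example.com",
    "DATABASE_HOST": "db.example.com",
}

for problem in staging_leftovers(production_config):
    print("CHECK", problem)
```

A check like this belongs in the pre-launch checklist as one item among many; it narrows the search but still needs a human to confirm each integration against its production endpoint.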
What Good Looks Like
A safe software launch has a frozen codebase three to five days before go-live, a completed pre-launch checklist of fifty-plus specific items, a cutover style chosen deliberately (phased, parallel, or big bang) and matched to the risk profile, a launch on a Tuesday morning, a 48-hour watch period with named owners by shift, a rollback plan that has been rehearsed, and a post-launch review a week later that captures the lessons. The launch itself is uneventful — which is the goal — and the small issues that surface in the first 48 hours are caught and fixed before they escalate. The business gains a new capability without absorbing a launch-week crisis.
Next Steps
If the launch is the start of an ongoing operational relationship with the system, How to Keep Your Software Secure After Launch and How to Handle Software Incidents cover the post-launch operating model. If the launch involves users adopting a new tool, How to Roll Out a New Internal System covers the people side. For ongoing technical support after launch, see Software Support Retainers.