Document 6 of 7 - For Agencies

Implementing CodeBlu: A 90-Day Rollout Guide

A pilot-to-production playbook for a training coordinator deploying CodeBlu, covering setup, change management, stakeholder communication, and success metrics.

On this page

1. Purpose and the shape of this plan
2. Before day 1: prerequisites
3. Phase 1, days 1 to 30: pilot setup and launch
4. Change management considerations
5. Stakeholder communication
6. Phase 2, days 31 to 60: run, evaluate, and decide
7. Phase 3, days 61 to 90: production rollout
8. Success metrics
9. Risks and contingencies
10. Closing note
Sources

A pilot-to-production playbook for a training coordinator deploying CodeBlu, covering setup, change management, stakeholder communication, and success metrics.

1. Purpose and the shape of this plan

This guide is for the person inside the agency who will actually run a CodeBlu deployment: usually a training coordinator or a training sergeant. It assumes the agency has decided to evaluate CodeBlu seriously and has worked through the companion documents on evaluation, security, RFP questions, return on investment, and the comparison to traditional training.

The plan runs 90 days and is built around a single principle: prove it before you scale it. CodeBlu is a new product, and no agency should move from contract to full deployment without a structured pilot that tests the product against the agency's own success metrics. The 90 days divide into three phases. Days 1 to 30 set up and launch a pilot with a small cohort. Days 31 to 60 run and evaluate the pilot and produce a documented go or no-go decision. Days 61 to 90 expand to production if the decision is go. A no-go decision at day 60 is a legitimate and planned outcome, not a failure of the process.

The dates below are relative day numbers, not calendar dates. An agency should map them onto its own calendar, allowing for shift schedules and for the lead time its contracting process requires.

2. Before day 1: prerequisites

Five things should be complete or in progress before the 90-day clock starts. Beginning the rollout without them creates avoidable problems later.

A signed contract appropriate to a government buyer. The agency's counsel should have reviewed and finalized the contract. As noted in the companion materials, CodeBlu's standard terms of service are still being finalized, and several high-stakes provisions need to be drafted for a government context. Do not start a rollout against draft terms.

Security sign-off. The agency's IT or information-security function should have completed its review using the companion security document and obtained, in writing, the items that document lists: the subprocessor list, the data processing agreements with AI subprocessors, the data-residency confirmation, the breach-notification commitment, and the data-export and deletion commitments.

A written officer-data policy. Before any officer logs in, the agency should issue a short written instruction: officers must not enter real case data, real subject information, or other Criminal Justice Information into CodeBlu, and should treat all scenarios as fictional practice. This protects the agency on the CJIS question and on privacy.

The data-retention setting decided. The agency should have decided what retention period to configure in CodeBlu, reconciled against the agency's own records-retention obligations, which may be longer than CodeBlu's 90-day default.

Named roles. The agency should name a project owner, who is accountable for the rollout; an agency administrator, who manages the platform day to day; and an executive sponsor at the command level. For most agencies the project owner and administrator are the same person.

3. Phase 1, days 1 to 30: pilot setup and launch

3.1 Week 1: governance and metrics

The first week is planning, not software. Three deliverables.

First, confirm the pilot cohort. A workable pilot cohort is 10 to 20 officers, large enough to produce meaningful feedback and small enough to support closely. Choose a mix: include experienced officers and newer ones, include at least one or two skeptics rather than only volunteers, and include the first-line supervisors of the cohort so supervision is part of the test. Avoid a cohort made up only of enthusiasts, because an enthusiast-only pilot does not predict how the wider agency will respond.

Second, define success metrics before the pilot starts. This is the most important week-one task, and Section 6 of this guide is devoted to it. Metrics defined after the fact are not credible. Write them down, get the executive sponsor to agree to them, and set the thresholds that will mean go or no-go at day 60.

Third, set up governance. Decide who meets, how often, and who decides. A simple structure is a weekly 30-minute check-in among the project owner, the administrator, and the executive sponsor, plus a mid-pilot review around day 45 and the decision review at day 60.

3.2 Week 2: technical setup and verification

Week two stands up the platform and verifies it on the agency's own environment.

Provision the agency account and the administrator account. Import the pilot cohort roster. Confirm that role assignment is correct: officers as officers, supervisors as trainers or administrators as appropriate, and verify that a trainer can see the cohort roster and that an officer can see only their own records.

Run the technical verification that the companion RFP document recommends: a live test on the agency's actual equipment and network, in the rooms officers will use. Confirm microphone and audio quality, confirm a voice scenario runs cleanly end to end, and confirm an after-action review is produced. Test the realistic concurrent load the pilot will create. If the agency's bandwidth or its available quiet space is a constraint, week two is when to find out, not week four.

Confirm the records path: run one scenario, confirm the training hours log, generate a sample certificate, and run the per-year hour-summary export. Confirm the export format works for the agency's recordkeeping.

Decide and document where officers train. Voice practice needs a reasonably quiet, private space. Identify the rooms or stations, and decide whether sessions happen on-duty, on a training day, or on another basis. This decision has labor implications; see Section 4.

3.3 Weeks 3 and 4: pilot training begins

By week three the cohort starts training. Hold a short kickoff with the cohort, in person if possible. Cover what CodeBlu is, why the agency is piloting it, what the agency is and is not claiming for it, how to run a session, the officer-data rule from Section 2, and how feedback will be collected. Be candid with the cohort that this is a pilot and that their honest assessment, including criticism, is the point.

Set a modest, concrete practice target for the pilot period, for example a set number of scenarios per officer over the remaining pilot weeks, chosen so it is achievable within the cohort's schedule. An unrealistic target produces resentment and bad data.

Have supervisors in the cohort review after-action results with their officers at least once in this period, so the agency tests the human-in-the-loop model rather than letting the AI score stand alone. CodeBlu's own position is that its feedback is a training aid and not a substitute for instructor evaluation, and the pilot should operate that way from the start.

Begin collecting feedback immediately and continuously: a simple shared log where the administrator records what works, what breaks, officer reactions, and any technical issues. Do not wait for a survey at the end.

By day 30 the agency should have: a cohort actively training, a feedback log filling up, the technical environment proven, and the records path confirmed.

4. Change management considerations

Change management is not a phase; it runs through all 90 days. For a tool like CodeBlu it is as important as the technical rollout, because the deployment can fail on officer and instructor reception even if the software works perfectly.

Officer reception. Officers are reasonably skeptical of new training technology and especially of being scored by software. Three things reduce friction. Be honest about what the tool is: practice and reinforcement, not a verdict on the officer. Be clear about how the data is used: who sees an officer's scores, and explicitly whether CodeBlu results feed into formal performance evaluation or discipline. Officers will assume the worst unless told otherwise, so the agency should decide that question deliberately and communicate the answer. And give officers a private space to practice, because practicing a hard conversation badly in front of peers is a deterrent, while private repetition is where skill is built.

The relationship with instructors and trainers. Existing de-escalation instructors can perceive an AI training tool as a threat to their role. It is not, and the rollout should make that explicit. CodeBlu does not replace instruction; it adds practice volume to what instructors teach, and instructors gain a tool that keeps skills warm between their courses and gives them after-action data to inform their teaching. Bring instructors in early, ideally into the pilot cohort's supervision, so they help shape the deployment rather than resist it.

Labor and union considerations. If the agency's officers are represented, the deployment touches matters a collective bargaining agreement may cover: whether training time is on-duty or compensated, and whether a software-generated score can be used in evaluation or discipline. Engage the bargaining unit early, before the pilot, not after. Getting this wrong is one of the few things that can stop a rollout outright.

Framing the tool honestly. The fastest way to lose credibility with officers and command is to oversell. The agency should present CodeBlu as what it is: a new, promising practice tool being piloted carefully, not a proven solution. Honest framing survives contact with reality; a sales pitch does not.

5. Stakeholder communication

Different stakeholders need different messages. The project owner should plan communication for each before the pilot, not improvise it.

Chief executive and command staff. They need the decision frame: what the pilot will test, what the success thresholds are, what the cost is, and when the go or no-go decision comes. Give them the success metrics in writing at the start and a one-page status at the mid-pilot and decision points. Command should never be surprised by the rollout's status.

The pilot cohort and their supervisors. Covered in Section 3.3: what the tool is, why the agency is testing it, the officer-data rule, how feedback is collected, and an explicit statement of whether results affect evaluation. Honesty and a request for candid criticism.

The wider agency. Officers outside the pilot will hear about it. A brief, factual note that the agency is piloting a de-escalation practice tool with a small group and will decide later whether to expand prevents rumor from filling the gap.

The bargaining unit. Covered in Section 4. Early, formal, and before the pilot.

The governing body and the public. A city council, county board, or police commission may need to approve the expenditure and will want to know what was bought and why. The agency should be ready to describe CodeBlu plainly: a tool that adds verbal de-escalation practice and produces training records, adopted through a careful pilot. The agency should not claim CodeBlu reduces use of force or improves outcomes, because that is not established for this product; it should say the agency is investing in more and better-documented de-escalation practice, which is a defensible and honest statement. If asked publicly, the agency should be equally plain that real-world judgment and live training remain central and that the tool supplements them.

6. Phase 2, days 31 to 60: run, evaluate, and decide

6.1 Continuing the pilot

Through days 31 to 50 the cohort keeps training toward the practice target, supervisors keep reviewing after-action results with officers, and the feedback log keeps filling. The administrator should watch for officers falling behind the practice target and find out why: a scheduling problem, a technical problem, or a motivation problem each calls for a different response, and each is useful pilot data.

6.2 Mid-pilot review, around day 45

Hold the mid-pilot review with the project owner, administrator, and executive sponsor. The agenda: progress against the practice target, the technical-issue log, early officer and supervisor feedback, and any emerging problems. The purpose is to catch a fixable problem with two weeks of pilot left, and to give an early signal of where the day-60 decision is heading.

6.3 Structured feedback collection, days 50 to 57

In the final stretch, collect feedback systematically rather than only from the running log. Survey the cohort officers: was the practice realistic, was the feedback useful, was the tool easy to use, would they want to keep using it, and what would they change. Interview the supervisors separately: did the after-action data help them coach, did scoring seem sound, did the human-review model work. Pull the platform data: scenarios completed, hours logged, completion rates against the target, and any technical-failure rate.

6.4 The go or no-go decision, day 60

The decision review compares the pilot results against the success metrics set in week one. The output is a written recommendation to the chief executive: go, no-go, or extend the pilot. Each is legitimate.

A go recommendation means the metrics were met or substantially met and the agency should proceed to Phase 3. A no-go means the metrics were not met, or a problem surfaced, security, labor, technical, or reception, that the agency is not prepared to accept; the agency stops, and the contract's exit and data-deletion terms are exercised. An extend means the pilot was inconclusive, often because practice volume was too low to judge, and a defined extension with a specific question to answer is warranted. The decision should be documented with the metric results attached, because the chief executive may need to defend it to command or a governing body.

7. Phase 3, days 61 to 90: production rollout

Phase 3 applies only on a go decision. The principle is the same as the overall plan: expand in steps, not all at once.

7.1 Days 61 to 70: prepare to scale

Apply the lessons of the pilot before adding officers. Adjust the practice targets, the scheduling model, and the supervision approach based on what the pilot showed. Update the agency's written guidance and the cohort kickoff materials to reflect what was learned. Finalize the rollout sequence: which units or shifts onboard in which order. Confirm with CodeBlu that the larger licensed user count and any billing change are handled, and confirm the agency has a named support contact for the larger deployment.

7.2 Days 71 to 85: phased expansion

Onboard the agency in waves, by unit or shift, not in a single agency-wide switch. A wave structure means a problem affects one group and is caught before it reaches everyone, and it lets the administrator give each wave real attention. Each wave gets the same kickoff the pilot cohort received. Pilot-cohort officers can act as informal guides for their units, which is one more reason the pilot cohort should not have been enthusiasts only.

Through this period, watch the same indicators the pilot tracked: practice volume, technical issues, and feedback. A scaled deployment can surface problems a 15-officer pilot did not, particularly around bandwidth, scheduling, and supervisor workload.

7.3 Days 86 to 90: stabilize and establish steady state

By day 86 the agency is fully onboarded and the work shifts from rollout to operation. Establish the steady state: the recurring practice expectation for officers, the administrator's ongoing routine, the supervision rhythm for reviewing after-action results, and the schedule for exporting and retaining training records on the agency's own records system, so the agency's compliance evidence does not depend solely on continued CodeBlu access.

Hold a rollout retrospective with the project team and the executive sponsor. Rerun the return-on-investment worksheet from the companion document with real numbers from the deployment rather than estimates. Set the first formal review point, a reasonable choice is at the contract's first renewal, at which the agency will decide whether to continue, using the same kind of metrics-based judgment the pilot used.

The 90 days end with CodeBlu in routine operation, a documented decision trail, records being retained on the agency's schedule, and a defined review date. That is a complete, defensible deployment.

8. Success metrics

Metrics must be defined in week one and agreed by the executive sponsor. They fall into four groups. For each, the agency sets its own threshold; suggested shapes are given, not prescribed numbers, because the right threshold depends on the agency.

Adoption and usage. Did officers actually use the tool? Measures: percentage of the cohort meeting the practice target, average scenarios per officer, and completion rate. Low usage in a pilot predicts low usage at scale, and the agency should diagnose the cause before scaling.

User reception. Did officers and supervisors find it valuable? Measures: cohort survey results on realism, usefulness of feedback, and ease of use; supervisor assessment of whether the after-action data helped coaching; and the proportion of officers who would choose to keep using it.

Operational fit. Did it work within the agency's reality? Measures: technical-failure rate, whether scheduling worked, whether the records and export path functioned, and the administrator's actual time burden against the estimate.

Training-quality signal. This is the hardest group, and the agency must be honest about its limits. A 30-day pilot cannot prove an effect on real-world outcomes, and the agency should not set a metric that pretends it can. What a pilot can reasonably observe: whether officers' after-action scores improved with repetition, whether supervisors judged the practice professionally sound, and whether instructors found the tool consistent with what they teach. These are leading indicators of training quality, not proof of outcome. The agency should label them as such.

One discipline applies to all four groups: do not let the avoided-incident or use-of-force question become a pilot success metric. It cannot be measured in 90 days, and building the decision on it would be dishonest. The pilot decides whether CodeBlu is a usable, well-received, operationally sound practice tool. Whether it contributes to outcomes over time is a question for the longer-term review, answered with the agency's own data, and even then with appropriate caution about causation.

9. Risks and contingencies

A short register of the risks most likely to affect the rollout, and the planned response to each.

Low pilot usage. The most common failure mode. Response: diagnose early at the mid-pilot review; if the cause is scheduling, fix the scheduling; if it is motivation, address framing and supervisor engagement. Do not scale a pilot that did not generate enough practice to judge; extend it instead.

Technical or audio problems on agency equipment. Response: this is exactly what week-two verification exists to catch. If it surfaces later, treat bandwidth and quiet-space constraints as solvable logistics, and escalate genuine product defects to CodeBlu's support contact.

Labor or bargaining objection. Response: prevented by early engagement per Section 4. If it surfaces despite that, pause and resolve it with labor counsel before proceeding; do not push through.

Officer or instructor resistance. Response: honest framing, a private practice space, an explicit statement on how data is used, and instructor involvement in the pilot. Persistent resistance is itself a no-go signal worth respecting.

Vendor or service disruption. Response: CodeBlu is an early-stage vendor dependent on third-party services. The agency's protection is the contract's exit and data-export terms and the practice of retaining records on the agency's own system. Keep both current.

Overstatement creep. Response: the risk that, as the rollout gains momentum, the agency's internal or public messaging drifts toward claims CodeBlu's evidence does not support. The public-information officer and the project owner should hold the messaging to what is established.

10. Closing note

A CodeBlu deployment done well is unglamorous: a small pilot, honest metrics, a documented decision, a phased expansion, and records retained on the agency's own schedule. The 90-day structure exists so that the agency can adopt a new and promising tool without taking on the risk of a new and unproven one. If the pilot succeeds against metrics the agency set in advance, the agency has a defensible addition to its de-escalation training. If it does not, the agency has spent a contained amount of money and 60 days to learn that, which is also a good outcome. Either way, the decision belongs to the agency and rests on the agency's own evidence, which is exactly where a decision about training officers should rest.

Sources

Police Executive Research Forum, ICAT program and de-escalation training resources: https://www.policeforum.org
CIT International: https://www.citinternational.org
U.S. Department of Justice, Office of Community Oriented Policing Services, training resources: https://cops.usdoj.gov
International Association of Chiefs of Police, training and policy resources: https://www.theiacp.org
FBI CJIS Security Policy Resource Center: https://le.fbi.gov/informational-tools/cjis/cjis-security-policy-resource-center

This document is an implementation aid and is not legal or labor-relations advice. Route contract, records-retention, CJIS, and collective-bargaining questions to the agency's counsel, human resources, and CJIS Systems Officer.

Talk to us

For pricing, a structured pilot, or any question this document does not answer, email sales@codeblu.co. For security, privacy, or named-subprocessor questions, email privacy@codeblu.co.

Start a pilot conversation