Adding clients adds overhead unless you automate the repetitive parts. The MSPs that scale past 100 clients without doubling headcount tend to have done the same handful of things, which is what this post is about.
Playbook move 1: productize onboarding
Client onboarding should be a pipeline, not a checklist. Template the stages:
- Intake. Standardized form captures scope, SLA, patch windows, alert destinations.
- Provisioning. Script creates the tenant, enrolls a gold-image template, sets policy baseline.
- Agent rollout. Automated deploy to client endpoints with rollback on failure.
- Validation. Automated health check runs for 48 hours; human sign-off only after green.
- Handoff. Client portal access delivered, monthly report scheduled, SLA clock starts.
Hands-on time for a typical SMB: about 2 hours of work instead of 2 weeks, not counting the 48-hour validation window.
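To make the "pipeline, not a checklist" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Stage` type, the stage functions, and the `simulate_rollout_failure` flag are stand-ins for whatever your tooling exposes, and the validation and handoff stages are left out for brevity. What it shows is the shape: ordered stages, each with an optional undo step, and rollback in reverse order when one fails.

```python
from dataclasses import dataclass
from typing import Callable, List


def noop(ctx: dict) -> None:
    """Default undo step for stages that have nothing to roll back."""
    return None


@dataclass
class Stage:
    name: str
    run: Callable[[dict], None]          # does the work, records results in ctx
    undo: Callable[[dict], None] = noop  # best-effort rollback


def run_pipeline(stages: List[Stage], ctx: dict) -> bool:
    """Run stages in order; on failure, undo completed stages in reverse."""
    log = ctx.setdefault("log", [])
    done: List[Stage] = []
    for stage in stages:
        try:
            stage.run(ctx)
        except Exception as exc:
            log.append(f"failed:{stage.name}:{exc}")
            for prev in reversed(done):
                prev.undo(ctx)
                log.append(f"undone:{prev.name}")
            return False
        log.append(f"ok:{stage.name}")
        done.append(stage)
    return True


# Hypothetical stage implementations; real ones would call your RMM's API.
def intake(ctx: dict) -> None:
    ctx["scope"] = {"sla": "standard"}


def provision(ctx: dict) -> None:
    ctx["tenant_id"] = f"tenant-{ctx['client']}"


def deprovision(ctx: dict) -> None:
    ctx.pop("tenant_id", None)


def rollout(ctx: dict) -> None:
    if ctx.get("simulate_rollout_failure"):
        raise RuntimeError("agent install failed")
    ctx["agents"] = "deployed"


STAGES = [
    Stage("intake", intake),
    Stage("provisioning", provision, deprovision),
    Stage("agent_rollout", rollout),
]
```

Calling `run_pipeline(STAGES, {"client": "acme"})` returns `True` and leaves a `tenant_id` in the context. With the failure flag set, it returns `False` and the tenant is deprovisioned again, so a half-onboarded client never lingers.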
Playbook move 2: standardize monitoring
No client’s monitoring policy should be hand-crafted. Define baseline packs:
- SMB baseline: 10 metrics, 5 alerts, basic patching
- Regulated: adds audit retention, compliance scans
- High-availability: adds latency-sensitive alerts, SLO tracking
Deviations from baseline should be rare, documented, and approved. This keeps the signal-to-noise ratio of your alerts stable across your book.
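One way to enforce this is to keep the packs as data and refuse any override that lacks a reason and an approver. The sketch below is illustrative only. The pack keys, the `Deviation` type, and `effective_policy` are made-up names, and it assumes the regulated and high-availability packs each extend the SMB baseline, which is one reading of "adds".

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed layering: every pack starts from the SMB baseline.
_SMB = {"metrics": 10, "alerts": 5, "patching": "basic"}

BASELINE_PACKS = {
    "smb": dict(_SMB),
    "regulated": {**_SMB, "audit_retention": True, "compliance_scans": True},
    "high_availability": {**_SMB, "latency_alerts": True, "slo_tracking": True},
}


@dataclass(frozen=True)
class Deviation:
    key: str
    value: object
    reason: str                        # the documentation
    approved_by: Optional[str] = None  # the approval


def effective_policy(pack: str, deviations: List[Deviation]) -> dict:
    """Return the pack with approved deviations applied on top of a copy."""
    policy = dict(BASELINE_PACKS[pack])  # never mutate the shared baseline
    for d in deviations:
        if not d.reason or not d.approved_by:
            raise ValueError(
                f"deviation {d.key!r} needs a documented reason and an approver"
            )
        policy[d.key] = d.value
    return policy
```

Because every deviation is a record rather than an edit, you can also count them per client, which tells you when a baseline pack no longer fits.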
Playbook move 3: route alerts by business value
Not every alert is equal. A disk warning on a $500/month client’s dev box is not the same as one on a $50,000/month client’s prod database.
Routing:
- Tier 1 client prod: PagerDuty to on-call, 5-minute SLA
- Tier 2 client prod: ticket to team queue, 30-minute SLA
- All dev: ticket, business-hours response
Client tier is a tag on the tenant; routing is a rule that reads the tag. No hand-maintained routing tables.
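As a rough sketch, the rule can be a single function of the tenant's tags and the alert's environment. The `Route` type, the `tier` tag name, and the target names are assumptions for illustration. The sketch also chooses to raise on anything the routing list above does not cover (an untagged tenant, a staging environment) instead of guessing a default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    channel: str   # "page" or "ticket"
    target: str    # who receives it
    response: str  # response-time commitment


def route_alert(tenant_tags: dict, environment: str) -> Route:
    """Pick a route from the tenant's tier tag and the alert's environment."""
    if environment == "dev":
        # All dev alerts are tickets, regardless of client tier.
        return Route("ticket", "team-queue", "business hours")
    if environment == "prod":
        tier = tenant_tags.get("tier")
        if tier == 1:
            return Route("page", "on-call", "5 minutes")
        if tier == 2:
            return Route("ticket", "team-queue", "30 minutes")
        raise ValueError(f"tenant has no usable tier tag: {tenant_tags!r}")
    raise ValueError(f"no routing rule for environment {environment!r}")
```

Moving a client between tiers then means changing one tag, and the routing follows without anyone touching a table.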
Playbook move 4: automate the long tail
The long tail of operational work (disk cleanup, service restarts, cert renewal) isn’t strategic and isn’t fun. Automate as much of it as you can, and aim for 70%+ of alerts to be auto-resolved without a human touch.
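A simple way to structure this is a registry that maps alert types to remediation handlers, with anything unhandled or unsuccessful escalated to a human. The sketch below uses made-up alert types and placeholder handlers (a real handler would run a job on the host). It also shows one way to compute the auto-resolve rate you are tracking against the 70% target.

```python
from typing import Callable, Dict, List

# Registry of alert type -> handler. A handler returns True if it fixed the issue.
REMEDIATIONS: Dict[str, Callable[[dict], bool]] = {}


def remediation(alert_type: str):
    """Decorator that registers a handler for one alert type."""
    def register(fn: Callable[[dict], bool]) -> Callable[[dict], bool]:
        REMEDIATIONS[alert_type] = fn
        return fn
    return register


@remediation("disk_full")
def clean_temp_files(alert: dict) -> bool:
    # Placeholder: pretend the cleanup job always frees enough space.
    return True


@remediation("service_down")
def restart_service(alert: dict) -> bool:
    # Placeholder: give up after three restart attempts and let a human look.
    return alert.get("restart_attempts", 0) < 3


def handle(alert: dict) -> str:
    """Try the registered handler; escalate if there is none or it fails."""
    handler = REMEDIATIONS.get(alert["type"])
    if handler is not None and handler(alert):
        return "auto_resolved"
    return "escalated"


def auto_resolve_rate(outcomes: List[str]) -> float:
    """Share of alerts closed without a human touch."""
    if not outcomes:
        return 0.0
    return outcomes.count("auto_resolved") / len(outcomes)
```

The escalated alerts are the useful by-product: group them by type and the largest group is the next handler to write.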
Playbook move 5: measure unit economics
Track per-client:
- Revenue
- Incident volume
- Automation coverage
- Tech-hours per month
When a client slips from 10 hours/month to 40 hours/month, something broke, usually a change in their environment that your automation didn’t adapt to. Catching it early prevents margin erosion.
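Here is a minimal sketch of that check, assuming you can export one record per client per month. The `ClientMonth` fields mirror the four metrics above. The definition of automation coverage (auto-resolved incidents over all incidents), the 2x threshold, and the three-month window are assumptions to tune, not recommendations from any particular tool.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class ClientMonth:
    revenue: float
    incidents: int
    auto_resolved: int
    tech_hours: float


def revenue_per_tech_hour(m: ClientMonth) -> float:
    return m.revenue / m.tech_hours if m.tech_hours else float("inf")


def automation_coverage(m: ClientMonth) -> float:
    # Assumed definition: share of incidents closed by automation.
    return m.auto_resolved / m.incidents if m.incidents else 0.0


def hours_drifted(history: List[ClientMonth], factor: float = 2.0,
                  window: int = 3) -> bool:
    """True if the latest month's tech-hours exceed `factor` times the
    mean of the preceding `window` months."""
    if len(history) < window + 1:
        return False  # not enough history to form a baseline
    baseline = mean(m.tech_hours for m in history[-(window + 1):-1])
    return history[-1].tech_hours > factor * baseline
```

Run against a client whose hours went 10, 11, 9, then 40, `hours_drifted` flags the fourth month, and revenue per tech-hour shows how much margin the jump cost.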
The multiplier
Teams running this playbook typically go from 30 clients/tech to 80-120 clients/tech over 18 months. That’s not from working harder; it’s from paying down operational debt systematically.
Two servers, free forever. Sign up at app.lynxtrac.com if any of this resonates.