Engineering High-LTV SaaS: Implementing Predictive Pacing and Model-Agnostic Architectures via Claude Code
Achieving $1M in Annual Recurring Revenue (ARR) for a specialized SaaS product requires more than just a functional codebase; it requires a rigorous approach to problem identification, algorithmic optimization, and architectural resilience. Our product, Clairvaux—an AI-enabled power dialer—was built using Claude Code, leveraging advanced prompting strategies and model-agnostic engineering to solve a high-value problem in call-heavy industries like HVAC, plumbing, and roofing.
The Optimization Problem: Maximizing Call Pickup Rates
The core value proposition of Clairvaux lies in optimizing a specific, measurable metric: the call pickup rate. In traditional outbound or inbound sales environments, the efficiency of a salesperson is bottlenecked by "dead time"—the period spent waiting for a ring to resolve into a live human connection.
We defined our primary optimization target as the percentage of dialed numbers that result in a live human answering within a 10-second window. By increasing the volume of calls per unit time and improving the probability of a connection, we have demonstrated increases of up to 66% in monthly revenue for enterprise clients.
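As a minimal sketch of how this metric could be computed, the snippet below counts calls answered within the window. The `CallRecord` type and its field name are hypothetical, and treating the 10-second boundary as inclusive is an assumption, not something the definition above specifies.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical record: seconds from dial to a live human answering,
    # or None if nobody answered.
    answer_latency_s: Optional[float]


def pickup_rate(calls: Sequence[CallRecord], window_s: float = 10.0) -> float:
    """Fraction of dialed calls answered by a live human within window_s."""
    if not calls:
        return 0.0
    answered = sum(
        1
        for c in calls
        if c.answer_latency_s is not None and c.answer_latency_s <= window_s
    )
    return answered / len(calls)


sample = [CallRecord(4.2), CallRecord(None), CallRecord(12.5), CallRecord(9.9)]
print(f"{pickup_rate(sample):.2f}")  # two of four calls fall inside the window
```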
The Ideation Protocol: Multi-Agent Divergent Prompting
The development of Clairvaux’s core features did not begin with human intuition, but with a structured, large-scale ideation process powered by Claude Code. To avoid the pitfalls of narrow thinking, we implemented a divergent prompting strategy designed to explore the entire solution space.
The protocol involved the following instruction set:
- Subagent Spawning: Instructing Claude Code to spawn 10 parallel subagents.
- Mechanism Generation: Each subagent was tasked with proposing 10 distinct mechanisms to increase the pickup rate.
- Divergent Constraints: We explicitly commanded the agents to diverge wildly across several dimensions: algorithmic, behavioral, infrastructural, regulatory, psychological, and time-based mechanisms.
- Filtering: We then manually audited the resulting ideas (100 per run of the protocol) to filter out "hallucinated" or non-feasible concepts (e.g., statistical regressions based on hyper-local weather data, which lacked sufficient signal).
This process led us to the discovery and refinement of predictive pacing.
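The instruction set above could be assembled into a single prompt along these lines. This is only an illustrative sketch: the wording is ours, the function name is hypothetical, and the prompt actually used is not reproduced here.

```python
# The six dimensions named in the protocol above.
DIMENSIONS = (
    "algorithmic",
    "behavioral",
    "infrastructural",
    "regulatory",
    "psychological",
    "time-based",
)


def build_ideation_prompt(n_subagents: int = 10, ideas_per_agent: int = 10) -> str:
    """Assemble a divergent-ideation prompt (illustrative wording only)."""
    dims = ", ".join(DIMENSIONS)
    return (
        f"Spawn {n_subagents} parallel subagents. "
        f"Each subagent must propose {ideas_per_agent} distinct mechanisms "
        "to increase the call pickup rate. "
        f"Diverge wildly across these dimensions: {dims}. "
        "Return one numbered list per subagent so the ideas can be audited by hand."
    )


print(build_ideation_prompt())
```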
Algorithmic Implementation: Predictive Pacing and Bayesian Optimization
The most impactful feature derived from this process was predictive pacing. The fundamental inefficiency in traditional dialing is that calls are placed serially: Dial -> Ring -> No Answer -> Next Call. Predictive pacing instead places multiple simultaneous dials with calculated offsets to maximize the probability of a connection within the same time window.
To implement this, we developed a simulation harness. The harness replays historical call data (large datasets of call times and connection latencies) in a controlled environment. Within this simulation, we used Bayesian optimization to fine-tune the parameters of our predictive pacing algorithm.
The goal was to determine the optimal offset for a batch of calls (e.g., 50,000 calls) to ensure that the "collision" of multiple answered calls does not overwhelm the available human agents, while simultaneously maximizing the number of live connections per unit of time. By using a queuing system and built-in call routing, we can manage the "awkward" situation of multiple simultaneous pickups, routing the overflow to available agents.
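To make the trade-off concrete, here is a toy simulation of staggered parallel dialing for a single agent. Everything in it is an assumption made for illustration: the answer probability, the uniform ring-latency distribution, the fixed talk time, and the parameter grid are synthetic stand-ins for historical data, and a plain grid search is used in place of Bayesian optimization to keep the sketch dependency-free.

```python
import random
from itertools import product


def simulate(lines: int, offset_s: float, *, n_rounds: int = 2000,
             p_answer: float = 0.25, ring_timeout_s: float = 10.0,
             talk_s: float = 60.0, seed: int = 0) -> tuple[float, float]:
    """Return (connections per agent-hour, overflow fraction of placed pickups)."""
    rng = random.Random(seed)
    elapsed = 0.0
    connected = 0
    overflow = 0
    for _ in range(n_rounds):
        # Line i starts dialing i * offset_s seconds into the round.
        pickups = []  # (pickup_time, dial_start) for calls that would be answered
        for i in range(lines):
            start = i * offset_s
            if rng.random() < p_answer:
                pickups.append((start + rng.uniform(2.0, ring_timeout_s), start))
        if pickups:
            first = min(t for t, _ in pickups)
            connected += 1
            # Calls already dialed before the first pickup that also answer
            # are "collisions" and would need routing to another agent.
            # Calls scheduled to start after the first pickup are never placed.
            overflow += sum(1 for t, s in pickups if s < first and t > first)
            elapsed += first + talk_s
        else:
            elapsed += (lines - 1) * offset_s + ring_timeout_s
    answered = connected + overflow
    rate = connected / (elapsed / 3600.0)
    return rate, (overflow / answered if answered else 0.0)


def grid_search(max_overflow: float = 0.05):
    """Pick the (lines, offset) pair with the best rate under an overflow cap."""
    best = None
    for lines, offset_s in product([1, 2, 3, 4], [0.0, 1.0, 2.0, 4.0]):
        rate, frac = simulate(lines, offset_s)
        if frac <= max_overflow and (best is None or rate > best[0]):
            best = (rate, lines, offset_s, frac)
    return best


rate, lines, offset_s, frac = grid_search()
print(f"lines={lines} offset={offset_s}s rate={rate:.1f}/h overflow={frac:.3f}")
```

With a single line there is never any overflow, so the search always has a feasible baseline. A Bayesian optimizer would replace `grid_search` by proposing `(lines, offset_s)` candidates adaptively instead of enumerating a fixed grid.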
The Anti-Framework Manifesto: Prioritizing "Vanilla Intellect"
A critical lesson learned during the scaling of Clairvaux is the danger of "framework fatigue." In the current AI ecosystem, there is a massive influx of agentic frameworks (such as Hermes or OpenClaw) and context libraries designed to provide "memory" via vector databases.
Our empirical findings suggest that, for revenue-generating software, the number of frameworks in the stack is inversely correlated with profitability.
The reasons are twofold:
- Complexity and Regression: Introducing a new framework often introduces regressions in the codebase. A framework's middleware sits between the prompt and the model, so adopting it can change how prompts are mediated and break previously working logic.
- The Value of the Base Model: The true intelligence resides in the base model's "vanilla intellect," not the wrapper. Engineering effort is better spent on craftsmanship (the chassis, engine, and wheels) than on cosmetic add-ons, the equivalent of a fancy steering wheel cover. As seen in the work of developers like Boris Cherny, the most effective implementations often rely on minimal system prompts and clean, direct instructions.
Strategic Moats: Regulatory Hurdles and High-Touch SaaS
In an era where Claude Code and similar agents can rapidly convert tokens into functional software, "low-touch" SaaS (self-serve, $10/month) is increasingly vulnerable to being commoditized. If a problem is easy enough to solve with a simple script, a user can simply use an agent to rebuild your tool.
To build a sustainable moat, we focused on two areas:
- Regulatory and Infrastructural Complexity: We targeted industries that require heavy compliance, such as A2P (Application-to-Person) registration for telephony. You cannot simply prompt an AI to bypass the bureaucratic, legal, and regulatory requirements of telecommunications. Similarly, industries like healthcare (HIPAA) provide a natural moat through the necessity of strict, human-verified data handling.
- High-Touch Implementation: We moved toward a "high-touch" model, targeting mid-market companies with significant budgets. By focusing on enterprise-level deployment and human-led onboarding, we create a layer of service and relationship management that an autonomous agent cannot easily replicate.
Engineering for Model Agnosticism
Finally, to ensure long-term resilience against fluctuations in model availability and token economics, we engineered the Clairvaux codebase to be model-agnostic.
The architecture is designed to allow for "hot-swapping" between models:
- Claude Code: Used for primary development and feature engineering.
- DeepSeek: Utilized for cost arbitrage during intensive, long-running tasks like massive code refactoring or bug-fixing, where token costs are a primary constraint.
- Codex/Gemini: Integrated via model-specific specification files (e.g., agents.md, gemini.md, skills.md).
By duplicating specifications and preparing the workspace for different model-specific requirements (such as how different platforms handle YAML front matter for skill descriptions), we ensure that our development velocity remains high, regardless of which model currently holds the frontier.
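One way to keep duplicated specifications in step is to treat a single file as canonical and copy it to each tool-specific filename. The sketch below assumes exactly that: the canonical name `spec.md` and the `sync_specs` helper are hypothetical, and the content is copied verbatim, so any per-platform front matter differences would still need handling on top of it.

```python
import tempfile
from pathlib import Path

# Target filenames taken from the list above; which tool reads which file
# depends on the tool and is not encoded here.
SPEC_FILENAMES = ("agents.md", "gemini.md")


def sync_specs(canonical: Path, workspace: Path,
               filenames=SPEC_FILENAMES) -> list[Path]:
    """Copy one canonical spec verbatim to each tool-specific filename."""
    body = canonical.read_text(encoding="utf-8")
    written = []
    for name in filenames:
        target = workspace / name
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written


# Demo in a throwaway directory so nothing in a real workspace is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "spec.md"  # hypothetical canonical file
    source.write_text("# Project conventions\n", encoding="utf-8")
    for path in sync_specs(source, root):
        print(path.name)
```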