
Alexandre Rivest

7 min read 03 October, 2025

Predictability: 10 Years of Evolution in Agile Estimation

Why do we estimate?

For predictability, of course, but what else? In my view, part of the answer stems from the inevitable friction in how a company operates.

Suppose I tell you about a development team that iterates regularly on its product and frequently validates with users whether it has built the right thing. Would you consider that team Agile? Pretty much, yes!

If I tell you about a company that sets a development budget for the entire year, takes the time to estimate everything to determine the delivery date 10 months in advance, and divides the work into 15 sprints, would you consider the company Agile? Not really. What I just described is a waterfall in sprints!

Although agility brings many benefits to software development, one challenge persists. Agile teams often work within companies that are not Agile. Their approach—based on adaptation and small increments—clashes with rigid practices like annual budgets, fixed delivery dates, or hierarchical decisions focused on long-term planning. This mismatch complicates alignment between the teams’ Agile philosophy and the organization’s broader culture.

And that need for predictability is just as present in a startup. Except here, the constraint isn’t a months-long plan but a limited budget. Estimation then answers a very concrete question: “Does the work we’re planning fit the budget we have?”

To address this issue, teams generally turn to Agile estimation tools. These aim to offer greater visibility into project progress while avoiding binding figures.

These are the techniques Nexapp used when we began building products for our clients. Over time, however, our approach evolved to provide more clarity and predictability. Let’s examine the various estimation techniques we’ve employed over the years and what led us to our current solution.

Story Points

Like many Agile teams, we started with Story Points as our default approach to estimating sprint tasks. They don’t represent a fixed duration, but rather a relative assessment based on complexity, uncertainty, and workload. The team assigns points by comparing tasks to one another, often during a planning session (like planning poker). Story points typically follow a Fibonacci-like sequence (1, 2, 3, 5, 8, 13, etc.) because the widening gaps between values reflect the increasing uncertainty that accompanies growing complexity.

Advantages

The primary advantage is that values are based on complexity, rather than time. Time-based estimates are fixed—if a task takes longer than estimated, all forecasts shift. By estimating complexity, the team has more flexibility in how and by whom the task is executed.

When a team uses story points, members need to sit together and evaluate each task to determine its complexity and assign a corresponding value. These estimation activities (planning poker, technical refinement, etc.) bring great value because they align the team’s understanding of the work to be done. This is when unknowns are typically identified.

Tip: Unknowns in tasks are the primary source of delays in product development. The earlier these unknowns are discovered, the more effectively the team can manage them and the more valuable the estimates become.

Disadvantages

A recurring challenge we encountered with story points was clients’ understanding of the approach. Because story points are abstract compared to time, they require a solid grasp of how they work. In one meeting, we had to explain to a client that a 5-point task is not equivalent to five 1-point tasks. They saw a velocity of 25 points and treated it as linear, but the scale is not linear. You must consider a sprint’s composition to interpret a team’s velocity accurately. A velocity of X points is relatively reliable only when each sprint contains roughly the same mix of 1-, 2-, and 3-point tasks, and so on. If sprint composition is never uniform, is the team’s velocity in story points actually reliable?
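To make the composition problem concrete, here is a minimal Python sketch. The two sprints and the `summarize` helper are hypothetical, invented for illustration; they are not taken from a real Nexapp project. Both sprints add up to the same 25 points, yet they describe very different work.

```python
from collections import Counter

# Hypothetical sprints: each list holds the story-point value of one completed task.
sprint_a = [1, 1, 1, 2, 2, 2, 3, 3, 5, 5]  # ten small-to-medium tasks
sprint_b = [13, 8, 3, 1]                    # four tasks, dominated by two large ones


def summarize(points):
    """Return (total velocity, number of tasks, count of tasks per point value)."""
    composition = dict(sorted(Counter(points).items()))
    return sum(points), len(points), composition


for name, sprint in (("A", sprint_a), ("B", sprint_b)):
    velocity, task_count, composition = summarize(sprint)
    print(f"Sprint {name}: velocity={velocity}, tasks={task_count}, composition={composition}")
```

Both lines report a velocity of 25, but one sprint closed ten tasks and the other four. The single number hides exactly the information a client would need to interpret it.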

Because story points represent complexity rather than duration, this estimation approach can be unreliable. A complex task might take a few days or several weeks. This variation can depend on who does the work: someone new to the project? Someone experienced with the tech? A recent graduate? Many factors influence the time it takes to complete a task. If time is unrelated to this unit of measure, why use it to determine velocity?

To address this, some teams associate hours with story points, asking whether they can complete a task within a certain number of hours. However, this opens the door to micromanagement. I’ve seen a manager track the hours worked by each team member and map them to tasks to “fill their time.” In that context, velocity and story points were no longer used for the team’s estimation needs, but rather for the manager to decide who would work on what. The result: with everyone having their own assigned task list, effective teamwork became impossible.

Although this method sparked valuable conversations during technical refinement, its high level of abstraction limited its effectiveness. It also required mental gymnastics to understand.

That’s why some Nexapp teams moved on to the next approach to resolve some of these issues.

T-Shirt Sizing

T-shirt sizing is very similar to Story Points. They’re almost interchangeable, as you can map a T-shirt size to a point value. For example, a 1-point task might be “small,” a 3-point task “medium,” and so on:

Small (S)

Low complexity, effort, and risk
Can be completed quickly, often in a few hours or a day
Requires little collaboration or few dependencies

Medium (M)

Moderate complexity, effort, or risk
May take a day or two
May require some collaboration or minor dependencies

Large (L)

High complexity, effort, or risk
May take several days or span multiple sprints (if not split)
Involves significant collaboration, dependencies, or areas of uncertainty

Advantages

Despite the similarity, T-shirt sizing solves some Story Point issues. It removes the non-linear numeric values used to estimate complexity. A velocity of 15 points can describe very different sprints depending on their composition. With T-shirt sizing, velocity is more explicit: “In a 2-week sprint, the team can generally complete 5 small tasks, 3 medium tasks, and one large task.”
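A velocity expressed as counts per size can be checked directly against a planned sprint. The sketch below is one possible way to do that in Python. The `over_capacity` helper and the sample plan are assumptions made for illustration. The sketch also treats each size independently, which is a simplification: a real team might trade unused small-task capacity for an extra medium task.

```python
from collections import Counter

# Capacity quoted in the text for a 2-week sprint: 5 small, 3 medium, 1 large.
CAPACITY = Counter({"S": 5, "M": 3, "L": 1})


def over_capacity(planned_sizes, capacity=CAPACITY):
    """Return, per size, how many planned tasks exceed the usual capacity."""
    planned = Counter(planned_sizes)
    # Counter subtraction keeps only positive counts, i.e. the overflow per size.
    return dict(planned - capacity)


# Hypothetical sprint plan: 2 small, 4 medium, 1 large.
plan = ["S", "S", "M", "M", "M", "M", "L"]
print(over_capacity(plan))  # {'M': 1}
```

An empty result means the plan fits the usual composition. Here the plan contains one medium task more than the team generally completes.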

By changing our estimation approach, we resolved numerous issues with clients. People less familiar with software development misinterpreted our way of working far less often.

Disadvantages

However, it retains some of Story Points’ drawbacks. The definition of sizes remains abstract and vague. Based on the criteria above, each unit’s range is quite broad.

Some teams add more units, such as Extra Small (XS) and Extra Large (XL). The issue with more units is that it complicates defining team velocity. Take an example of a team that runs sprints of about 15 tasks:

[Figure: comparison of two T-shirt sizing methods, one using three sizes (small, medium, large) and the other using two additional sizes (extra small and extra large).]

Using 5 complexity units complicates defining team velocity because comparing units becomes difficult when preparing a sprint. The more units there are, the more frequent and laborious the comparisons become.

There’s also the variable composition problem: what happens when team velocity includes XL tasks but there are none of that size in the near-term backlog? What replaces that value in our velocity calculation?

These possible variations make the team’s overall estimation much more complicated, as the numeric value becomes secondary to constantly analyzing sprint composition in detail.

This complexity led us to a critical reflection: if an estimation approach is hard to explain and apply consistently, it may be too complicated for a simple initial need—getting an estimate to improve predictability. That insight prompted us to adopt our next approach.

Throughput

The unreliability of Story Points and the high level of abstraction in T-shirt sizing led us to seek an alternative. We needed an explicit approach that left no room for interpretation. In parallel, we became increasingly interested in team efficiency, a core Lean concept. Originating in manufacturing, Lean spread into software development notably through Kanban, which emphasizes visualizing workflow to improve efficiency.

To measure and optimize this flow, key Lean metrics are used, such as Work in Progress (WIP), throughput, and cycle time. Borrowed from manufacturing, these indicators help identify bottlenecks, reduce wait times, and smooth value delivery. Lean has profoundly influenced software practices by promoting a culture of continuous improvement.

Lean rests on a fundamental principle: optimize to reduce waste. In this view, any activity that doesn’t deliver direct value to the customer is considered superfluous. Estimating tasks, by its very nature, is therefore questioned since it doesn’t directly contribute to the final product the customer pays for.

Within this philosophy, we adopted a throughput-based approach. Instead of estimating the complexity of each task, we focus on the number of tasks the team can deliver within a given unit of time. This method relies on uniformly splitting tasks: each work item should represent approximately the same amount of effort.

Advantages

The main advantage is its simplicity and transparency—no more debates about the difference between a 3-point and a 5-point task. One task equals one unit of work. Velocity becomes immediately understandable: “This team delivers an average of 12 tasks per 2-week sprint.”

This approach also eliminates estimation bias. Rather than predicting effort beforehand, we measure what is actually delivered. Historical data becomes our most reliable prediction tool.

Uniform task splitting, while demanding at first, quickly becomes natural with the right techniques. At Nexapp, for example, we use the SPIDR method to split user stories into similarly sized tasks. This structured method (Spike, Paths, Interfaces, Data, Rules) provides a concrete framework for identifying a feature’s facets and splitting them consistently.

Predictability improves greatly. With 50 tasks in the backlog and a throughput of 12 tasks per sprint, we can estimate a little over 4 sprints to finish (50 / 12 ≈ 4.2), or roughly 8 to 10 weeks with 2-week sprints. This simplicity greatly eases stakeholder discussions.
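The arithmetic behind this forecast fits in a few lines. The Python sketch below is illustrative only. The `forecast` function and its parameter names are assumptions, and it takes a single average throughput as given, whereas a real forecast would also account for sprint-to-sprint variation.

```python
import math


def forecast(backlog_tasks, throughput_per_sprint, sprint_weeks=2):
    """Return (fractional sprints of work, whole sprints needed, weeks for those sprints)."""
    if throughput_per_sprint <= 0:
        raise ValueError("throughput must be positive")
    exact = backlog_tasks / throughput_per_sprint
    whole = math.ceil(exact)  # a partly filled sprint still has to be run
    return exact, whole, whole * sprint_weeks


# Figures from the text: 50 tasks in the backlog, 12 tasks delivered per sprint.
exact, whole, weeks = forecast(50, 12)
print(f"{exact:.2f} sprints of work, done within {whole} sprints ({weeks} weeks)")
```

With the figures from the text, this gives about 4.17 sprints of work. Rounded up to whole sprints, the last task lands in the fifth sprint, which is why the answer is best read as a range and not as a single date.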

Disadvantages

The main challenge is uniformly splitting tasks. You need discipline to ensure each task represents a similar amount of work. This takes experience and sometimes rethinking how we structure work.

Some tasks, by nature, can’t be split as finely as others. A database migration or integration with an external system can hardly be reduced to the same granularity as a bug fix. In these cases, we identify such tasks as “epics” that count for multiple units in our throughput.
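One way to picture this weighting is to give each work item a number of units, defaulting to one. The following Python sketch is a hypothetical illustration. The `WorkItem` class, the task titles, and the weight of 4 are invented for the example and do not describe Nexapp’s actual tooling.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    title: str
    units: int = 1  # a regular task counts as one unit; an epic carries an agreed weight


def backlog_units(items):
    """Total number of throughput units represented by a list of work items."""
    return sum(item.units for item in items)


backlog = [
    WorkItem("Fix login bug"),
    WorkItem("Add export button"),
    WorkItem("Database migration", units=4),  # hypothetical weight for an epic
]
print(backlog_units(backlog))  # 6
```

Counting the backlog in units keeps forecasts comparable even when a few items cannot be split to the usual granularity.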

The approach also requires a team that is mature in technical breakdown. Developers must quickly identify when a task is too large and how to subdivide it effectively.

Comparison of the approaches

Conclusion

The evolution of our estimation methods at Nexapp reflects a broader industry trend: moving from complexity to simplicity. Story Points, while helpful in nurturing an estimation culture, often create more confusion than they resolve. T-shirt sizing improves communication but remains too abstract.

The throughput approach, inspired by Lean, directly addresses the fundamental need to provide predictability without sacrificing agility. By focusing on what truly matters—regular delivery of value—we’ve found a balance between organizational expectations and Agile principles.

Throughput is now our standard at Nexapp. Our internal teams use it consistently, enabling us to offer clients the transparency and predictability they seek while preserving our agility in execution.

When our teams join clients’ development efforts, they sometimes need to adapt to existing estimation practices. However, we continually strive to refine those practices in line with our throughput approach by demonstrating its tangible advantages in predictability and simplicity.

We continually watch for emerging practices and methodologies. As our journey from Story Points to throughput shows, we don’t hesitate to question and adapt our methods when a more effective approach presents itself.

Choose an estimation method that serves your context—not the other way around. The key is to stay true to the goal: create transparency, facilitate planning, and ultimately deliver value to your customers predictably.
