Why people don’t use Monte Carlo

A few weeks ago there was a question on Reddit asking people who don’t use Monte Carlo forecasts why they don’t.

I expected to see a lot of answers along the lines of:

I don’t know what that is or how it would help me.
I know how to use the Monte Carlo but don’t understand the math underneath it and therefore can’t trust the answers. [How it works]

While I did see some of those, most of the answers came from people who very confidently said they understood Monte Carlo and then gave reasons for not using it that clearly demonstrated they don’t, in fact, know how it works.

ProKanban decided to debunk some of these misconceptions, so earlier this week Colleen Johnson, Benjamin Huser-Berta, and I recorded a session together. I’ll link to that when it goes live.

First of all, let’s define the problem that we would use a Monte Carlo simulation for. Monte Carlo itself is a general statistical tool that can be used for a variety of purposes. In this case, we’re using it to determine one of two things:

Given a number of work items, how long will it take to complete?
Given a defined end date, how many work items can we complete between now and then?

The Monte Carlo will use historical data to make a probabilistic forecast about the future. The fundamental assumption here is that the future is likely to look like the past. If we’re starting something now that is completely unlike what we’ve done in the past then that assumption will not be valid.

On the other hand, if we’ve been doing software development over the last three months and we expect to continue doing software development for the next three then the assumption holds and this can give us a highly accurate forecast.

Now what’s this probabilistic part? The answer that a Monte Carlo will give us is made up of a percentage and a range. “There is an 85% probability that we’ll be done on or before October 1”. The opposite would be a deterministic answer like “We’ll be done on October 1 at 3:05pm”. While our brains prefer a deterministic answer, the more accurate one is probabilistic.
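To make the "how long will it take?" question concrete, here is a minimal sketch of one common way to run such a simulation: resample past weekly throughput at random until the backlog is empty, repeat that thousands of times, and read off the number of weeks that 85% of the trials finished within. The function names, the sample history, and the backlog size are all made up for illustration, and real tools may work differently in the details.

```python
import math
import random


def simulate_weeks_to_finish(throughput_history, backlog_size, rng):
    """One trial: replay randomly chosen past weeks until the backlog is done."""
    remaining = backlog_size
    weeks = 0
    while remaining > 0:
        remaining -= rng.choice(throughput_history)
        weeks += 1
    return weeks


def forecast_weeks(throughput_history, backlog_size,
                   confidence=0.85, trials=10_000, seed=1):
    """Weeks needed so that `confidence` of the simulated futures are finished."""
    if not any(t > 0 for t in throughput_history):
        raise ValueError("history needs at least one week with completed items")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = sorted(
        simulate_weeks_to_finish(throughput_history, backlog_size, rng)
        for _ in range(trials)
    )
    # Pick the trial that sits at the requested confidence level.
    index = min(math.ceil(confidence * trials) - 1, trials - 1)
    return results[index]


# Hypothetical data: items completed in each of the last eleven weeks.
history = [3, 5, 2, 6, 4, 4, 7, 3, 5, 4, 6]
weeks = forecast_weeks(history, backlog_size=40)
print(f"85% probability of finishing 40 items within {weeks} weeks")
```

Adding that number of weeks to today’s date turns the result into the "on or before" statement above. Asking for a higher confidence can only push the date out, never pull it in.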

Back to the Reddit posts

Too much work

A couple of posts stated that doing a Monte Carlo was too much work compared to estimation, and that you wouldn’t learn anything more than the estimation would give you.

First of all, nobody is ever going to calculate a Monte Carlo by hand. You’re going to use a tool for it. If you’re using a built-in tool like ActionableAgile then it’s a small number of minutes to get the first results.

Even if you’re using a spreadsheet, like the one that Troy Magennis gives away for free, it might be half an hour to get the first forecast complete, as you’re having to copy data into the spreadsheet, and then a small number of minutes for each subsequent one.

Unless your idea of estimating is sticking a finger in the air to see which way the wind is moving, the Monte Carlo will always be faster.

Too much data required

There were a number of posts that incorrectly claimed that lots of different data points were required as input. Things like “average time”, “standard deviation”, and “correlation between each task”.

In fact, we need none of those. All we need is historical throughput. How many items did we complete last week? How about the week before that, and the one before that? We do want enough data points, though. With eleven data points we have a fairly accurate forecast, although I usually aim for 15-20 data points.
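As a rough illustration of how little input that is, the sketch below answers the "how many items by a given date?" question from nothing but a list of weekly throughput counts. It assumes a simple resampling approach and uses invented numbers; the function name and parameters are hypothetical rather than taken from any particular tool.

```python
import random


def forecast_items(throughput_history, weeks_available,
                   confidence=0.85, trials=10_000, seed=1):
    """Items we can expect to complete, at the given confidence, in the time left."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = sorted(
        sum(rng.choice(throughput_history) for _ in range(weeks_available))
        for _ in range(trials)
    )
    # "85% chance of completing at least X items" means X sits near the
    # bottom of the sorted results, so that 85% of trials did at least as well.
    index = min(int((1 - confidence) * trials), trials - 1)
    return totals[index]


# Hypothetical data: the only input is how many items finished each week.
history = [3, 5, 2, 6, 4, 4, 7, 3, 5, 4, 6]
items = forecast_items(history, weeks_available=6)
print(f"85% probability of completing at least {items} items in 6 weeks")
```

Note that there is no average, standard deviation, or correlation anywhere in the input. The spread in the forecast comes from the week-to-week variation already present in the history.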

Shifting priorities

One person said “priorities shift way too fast for the math to stay relevant”, which sort of misses the point of what any kind of forecast or estimate gives you. The forecast tells you how many items you can complete, not which specific ones you will do. Once you know how many, you still need to prioritize effectively.

Not culturally accepted

A couple of people brought up the perception that a probabilistic forecast may not be accepted by management as they want that deterministic answer instead, and that’s a valid point.

The probabilistic answer will be more accurate than the estimate, and less work to prepare, but if the culture in your company isn’t willing to accept it then you’ll need to fall back to the less accurate answer instead.

Conclusion

What I took away from this exercise is that there is an incredible amount of misinformation around Monte Carlo and probabilistic forecasting in general. It’s a lot easier than most people seem to think, requires less data than people assume, and is a lot more accurate than traditional estimates.

Is it always the right tool? Of course not. There is no tool that works in 100% of the cases. Monte Carlo does assume that we have a relatively stable system and that what’s happened in the past is a reasonable predictor of the future. If either of those isn’t true then it’s the wrong tool.

If we don’t have any of that historical data, there are still other probabilistic approaches that might work, like reference class forecasting.