9 Pitfalls to Avoid When Predicting Sales Opportunities

Businesses invest significant resources in forecasting tools, AI models, and CRM systems to gain clarity on which deals will close, when they’ll close, and their potential value. Better predictions don’t just improve forecast accuracy - they help sales teams focus their efforts on winnable deals and manage the pipeline more effectively. Yet even the most advanced prediction systems can fall short if certain pitfalls aren’t carefully addressed.

In this post, we’ll explore some of the common challenges that can derail your efforts to predict sales opportunities accurately. Drawing from experience, I’ll highlight potential solutions to avoid these pitfalls and build a more reliable system.

1. Over-engineering your CRM system

What fields do I need to enter to save this deal?

A complex CRM reduces adoption, leading to bad data and, in turn, inaccurate predictions. I’ve seen it many times: a customer has configured 500 custom fields on their Opportunities. Sales teams do not like having their noses buried in a CRM for hours on end trying to figure out why they can’t save an opportunity. It wastes their time, reduces productivity and ultimately hurts revenue.

If your CRM is bloated with 500+ fields, ask yourself: how much time are your reps spending entering data instead of selling? The harder it is for them to update deals, the worse your data quality becomes, leading to inaccurate insights and poor decision-making.

Why it’s a problem: Models need accurate data to train on. Without it, your model will never perform well.

How to avoid it:

  • Use tools that automate the entry of activity data such as calls and emails.

  • Make the CRM simple and easy to use. Keep fields and data-entry validation rules to a minimum.

2. When your CRM data doesn’t reflect reality

Funnily enough, I was looking at a client’s opportunity history data today, trying to understand how long deals should sit in the various sales stages. The very first deal I picked was created one day and, on that same day, transitioned through six stage changes before being closed as won. Knowing what the client sells, I know the record does not reflect what actually happened. I suspect the rep won the deal and, in order to get their commission, the deal had to exist in the CRM. They created it and moved it through the stages so as not to trigger the alert that would fire if a deal were created as closed.

A casual observer might look at the sales velocity metrics and think everything is great, but the data in the CRM does not reflect reality. This is misleading, and can result in trying to fix problems that don’t exist while failing to address those that do.

Why it’s a problem: This distorts sales velocity metrics, making it impossible to spot stalled deals or pipeline bottlenecks. If your reported time-in-stage looks too good to be true, it probably is.

How to avoid it:

  • Use incentives to encourage reps to enter opportunities at the correct point in the sales process.

  • Integrate your CRM data into your management processes, so that pipeline and deal reviews rely on what’s in the CRM.
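A simple audit script can surface records like the one described above. The sketch below is purely illustrative (the field layout, opportunity IDs and stage labels are assumptions, not tied to any particular CRM): it flags deals whose entire stage history happened on a single day.

```python
from datetime import date

# Hypothetical stage-history records: (opportunity_id, stage, change_date).
stage_history = [
    ("opp-1", "Qualify", date(2024, 3, 1)),
    ("opp-1", "Propose", date(2024, 3, 1)),
    ("opp-1", "Negotiate", date(2024, 3, 1)),
    ("opp-1", "Closed Won", date(2024, 3, 1)),
    ("opp-2", "Qualify", date(2024, 1, 10)),
    ("opp-2", "Propose", date(2024, 2, 2)),
    ("opp-2", "Closed Won", date(2024, 3, 15)),
]

def suspicious_opportunities(history, min_stages=3):
    """Flag deals that moved through several stages on a single day."""
    by_opp = {}
    for opp_id, _stage, changed in history:
        by_opp.setdefault(opp_id, []).append(changed)
    return [
        opp_id
        for opp_id, dates in by_opp.items()
        if len(dates) >= min_stages and min(dates) == max(dates)
    ]

print(suspicious_opportunities(stage_history))  # opp-1 raced through every stage in one day
```

Running a check like this periodically makes it easy to spot reps who backfill deals at the last minute.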

3. Ignoring the passage of time

Sales organisations are dynamic. Opportunities come and go; we talk of feeding them through a pipeline. Accounts may start off as prospects and then become clients. Contacts change roles, or sometimes move to different accounts. The whole system has many moving parts, each specific to a single point in time.

Many models fail because they don’t track opportunities as they actually existed at the time. If your data team isn’t reconstructing the live state of your pipeline, your model could be learning from a false version of reality.

We can’t just provide historical information about opportunities as they appear after they have closed, because we lose all the context that tells us how they got there.

The painful truth is that the historical data we provide for the model to learn from should reconstruct all the moving parts back to how they were when the opportunities were open.

Why it’s a problem: Worst case, a model that appears to learn very well will perform extremely poorly as soon as it is deployed in a live setting, unable to provide any useful information.

How to avoid it:

  • Talk to your data team about how they can capture and retain the live state of your sales system, through things like snapshots.

  • Use a third-party product like cloudapps.com, which will take care of this for you.
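For illustration, here is a minimal sketch of point-in-time reconstruction from a field-change log. The log format and field names are assumptions for the example; in practice this data comes from CRM field-history tables or scheduled snapshots.

```python
from datetime import date

# Hypothetical field-change log for one opportunity:
# (change_date, field, new_value).
changes = [
    (date(2024, 1, 5), "stage", "Qualify"),
    (date(2024, 1, 5), "amount", 0),
    (date(2024, 2, 1), "amount", 40000),
    (date(2024, 3, 10), "stage", "Negotiate"),
    (date(2024, 4, 2), "stage", "Closed Won"),
]

def state_as_of(change_log, as_of):
    """Replay changes up to `as_of` to rebuild the live state at that date."""
    state = {}
    for changed, field, value in sorted(change_log, key=lambda c: c[0]):
        if changed <= as_of:
            state[field] = value
    return state

# How the deal looked mid-pipeline, not how it looks after closing.
print(state_as_of(changes, date(2024, 2, 15)))
```

Training data built this way shows the model each opportunity as it actually was while still open.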

4. Not Keeping Models Current

It’s not like it was in the good ole days

Your sales model is like a GPS using last year’s maps: if your business has changed, it may send you down the wrong road. A model that has been deployed for a year is making predictions based on data that is at least a year old. For prediction models to provide the best insights, they need to be retrained regularly so they reflect the current state of the business.

Why it’s a problem: Models become less accurate over time, degrading their usefulness for deciding what action to take.

How to avoid it:

  • Plan to have your models retrained on a regular basis.
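One way to decide *when* a retrain is due is to compare live accuracy against the accuracy measured at deployment. The sketch below is a simplified illustration; the tolerance value is an arbitrary choice for the example, not a recommendation.

```python
def needs_retraining(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag a model for retraining when live accuracy drops more than
    `tolerance` below the accuracy measured at deployment time."""
    return (baseline_accuracy - recent_accuracy) > tolerance

print(needs_retraining(0.82, 0.80))  # small dip: keep the current model
print(needs_retraining(0.82, 0.70))  # large drop: schedule a retrain
```

A check like this can run monthly against recently closed deals, turning "retrain regularly" into a measurable trigger.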

5. Not questioning predictions

Someone asked me, when the pandemic started, whether our models would be aware of it and able to adapt to the change in circumstances. The answer is no. The models work in a “situation normal” scenario. There will always be factors the model has no access to that should be taken into consideration, rather than blindly relying on predictions. These may be activities on the deal itself, like the detail of a phone call with the client, or macroeconomic events, and just about anything in between.

Why it’s a problem: Relying on predictions without question can lead to poor decisions and missed opportunities.

How to avoid it:

  • Use explainable AI, a set of techniques that shows how predictions are made by placing a value on what the model considered important.

  • Hold regular review sessions with the sales team to discuss model predictions versus real outcomes.

6. Poor handling of missing data

Missing or incomplete data has to be filled in before it can be used to train a machine learning model. The strategy chosen to fill in this missing data can have a dramatic effect on the performance of the model and, consequently, its ability to provide useful insights.

Let’s say an opportunity never reached the point of being properly qualified, and the amount was unknown before it was closed lost. You may be tempted to fill the amount with zero, but that will likely skew the model into believing that opportunities with lower amounts are less likely to be won. A more effective strategy might be to use the median amount across your opportunities as the replacement value.

Why it’s a problem: The model will learn a false reality which will result in misleading predictions.

How to avoid it:

  • Use statistical approaches, such as median imputation, to approximate missing values.

  • Explore model-based imputation techniques that predict missing values from the rest of the data.
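As a concrete example of the median strategy described above, here is a minimal sketch using only the Python standard library (the deal amounts are invented for illustration):

```python
from statistics import median

# Hypothetical deal amounts; None marks deals closed before being qualified.
amounts = [12000, None, 30000, 8000, None, 50000]

def impute_median(values):
    """Replace missing amounts with the median of the known ones, rather
    than zero, which would bias the model towards believing that small
    deals lose more often."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]

print(impute_median(amounts))
```

In a real pipeline you would typically compute the median per segment (e.g. per product line), since a single global median can still mislead the model.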

7. Lack of Explainability in Predictions

Even when models are accurate, sales reps and managers often distrust predictions they can’t understand. A black-box AI model that spits out probabilities without explaining why a deal is likely to close won’t gain buy-in.

Why it’s a problem: Lack of explainability undermines trust and adoption.

How to avoid it:

  • Use explainable AI, a technique that reveals how the model arrives at its predictions.

  • Train your teams to understand and leverage these insights effectively.
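To make the idea concrete, here is a toy sketch of an explainable prediction: a linear scoring model that returns each feature’s contribution alongside the probability, so a rep can see *why* a deal scored as it did. The weights and feature names are entirely illustrative; real explainability tooling (e.g. SHAP-style methods) works on far richer models.

```python
import math

# Toy linear win-probability model; weights are purely illustrative.
WEIGHTS = {"days_in_stage": -0.03, "meetings_held": 0.40, "amount_vs_median": -0.10}
BIAS = 0.2

def predict_with_explanation(features):
    """Return a win probability plus each feature's contribution
    to the score, instead of an opaque number."""
    contributions = {f: WEIGHTS[f] * v for f, v in features.items()}
    score = BIAS + sum(contributions.values())
    probability = 1 / (1 + math.exp(-score))  # logistic link
    return probability, contributions

prob, why = predict_with_explanation(
    {"days_in_stage": 45, "meetings_held": 3, "amount_vs_median": 1.5}
)
print(round(prob, 2), why)
```

Here the breakdown would show that 45 days stuck in stage is dragging the probability down while the three meetings are pushing it up - exactly the kind of narrative that earns a rep’s trust.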

8. Over-Engineering the Model

It’s tempting to think you can just use all the information you have available to build your model. Let’s throw in emails, phone calls, all the line items on a deal, the decision maker, key influencer… you get the picture. The problem is that the more inputs we include, the more powerful the model needs to be, and that’s not always a good thing.

Very powerful models have a tendency to memorise the specifics of the training data instead of identifying trends and patterns. This means they can be very poor at generalising when let loose in the wild. We need to constrain them to help them perform better in a live setting, and one way of doing that is to give them fewer types of information, called features, to learn from.

Why it’s a problem: You may find that the predictions don’t work as well as you would like in a live setting.

How to avoid it:

  • Start by thinking about the purpose of the model (e.g. price optimisation, forecasting deal outcomes), then use your domain knowledge to identify the key pieces of information that could influence it. Use that as the initial material for your model to learn from, and iterate from there.

  • Talk to your data team about performing exploratory data analysis (EDA): a manual investigation of the characteristics of your sales data, aimed at identifying information that is likely to be useful for your use case.

9. Not testing your model on live data

There is nothing worse than investing huge amounts of time and resource building a prediction model you believe will transform the fortunes of your sales function, only to discover three months later that it doesn’t work nearly as well in a live setting. Models should be tested on live data before they are released, and you should plan for this at the outset of your project. If extracts of your live data are taken as early as possible, then by the time you come to test, time will have passed and you will likely already know the outcomes in some cases.

Why it’s a problem: Testing on live data is the best way to give you confidence in the actual performance of the model. Without this there is a significant risk you may not get the results you expect.

How to avoid it:

  • Test your model on live data and measure the performance.
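The key discipline is to split the data chronologically rather than randomly, so the test set mimics deals the model has genuinely never seen. A minimal sketch (dates and outcomes are invented for illustration):

```python
from datetime import date

# Hypothetical closed deals: (close_date, outcome). Train on older deals,
# then evaluate on deals that closed after a cut-off, mimicking live use.
deals = [
    (date(2024, 1, 10), "won"),
    (date(2024, 2, 5), "lost"),
    (date(2024, 3, 20), "won"),
    (date(2024, 5, 1), "lost"),
    (date(2024, 6, 12), "won"),
]

def temporal_split(records, cutoff):
    """Split chronologically: everything before `cutoff` is training data,
    everything on or after it is a live-style test set."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test

train, test = temporal_split(deals, date(2024, 4, 1))
print(len(train), len(test))  # 3 2
```

A random split would leak future information into training and overstate performance; the chronological split is what an honest live test looks like.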

Conclusion: The Path to Better Sales Predictions

Machine learning starts and ends with the data. Most of a model’s performance, and therefore the benefit you get from it, comes from the data rather than from the model’s architecture. In the real world data is imperfect - it always is - but that doesn’t mean you can’t obtain useful insights from it.

Review these pitfalls with your team: Are we capturing live data accurately? Is our CRM setup simple enough to encourage consistent data entry?

With a combination of carefully nudging user behaviour to improve data and a few tricks to clean and enhance it, we can make tools that will provide valuable insights, which will improve the decisions we make and ultimately translate into better sales performance.

Contact Us

If you would like advice or help implementing prediction models into your sales organisation we’d love to hear from you.



Gareth Davies

Gareth is an AI researcher and technology consultant specialising in time series analysis, forecasting and deep learning for commercial applications.

https://www.neuralaspect.com