Speed as a competitive advantage in AI
The increasing use of AI in retail is a game changer for the industry, helping retailers and CPG brands transform their businesses: improved customer analytics leads to better-personalised experiences, more sophisticated assortment decisions improve on-shelf availability, and pricing and promotions can be optimised. And these are just a few of the many benefits AI can bring.
Yet as AI becomes “the norm” in retail operations, the quality of the science powering your AI-driven initiatives becomes a key differentiator. Our data scientists work closely with business users to iterate on and improve our science, and to ensure it provides as much value as possible. But when a data scientist is building machine learning models, it is easy to forget the importance of speed to the business, and the value that responsive, actionable insight can bring.
Time to value
Any data scientist will focus primarily on the accuracy of their predictive models. And whilst accuracy is important, the true value to the business will only be realised when the model is implemented. It can be tempting to keep iterating to create the ‘perfect’ model. But if the model is already achieving at least 90% of the potential gain, then it’s important to fight that temptation, and proceed straight to implementation.
Feedback on how the initial model performs in the real world will be invaluable for experimenting and deciding what works best, and there will always be a chance to refine the approach later.
Speed of calculation
As the size of data increases and as the variety of techniques grows, the time needed to build models can also be a constraint. If it is taking hours to crunch the numbers then it’s worth investing in science engineering — to increase the raw speed of the calculations, to make better use of cloud scaling by parallelising as much as possible, and to automate the parameter search so that the results are consistent and repeatable.
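As a rough illustration (the model, parameter ranges and synthetic data below are stand-ins, not a real retail pipeline), a parallelised, repeatable parameter search in scikit-learn might look something like this:

```python
# Sketch of a parallelised, seeded parameter search with scikit-learn.
# The estimator, parameter ranges and data are illustrative assumptions.
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Stand-in data; in practice this would be the retailer's prepared features.
X, y = make_regression(n_samples=2_000, n_features=20, random_state=42)

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_distributions={
        "n_estimators": randint(50, 200),
        "max_depth": randint(3, 12),
    },
    n_iter=10,
    cv=3,
    n_jobs=-1,        # parallelise across all available cores
    random_state=42,  # fixed seed so the parameter search is repeatable
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The two choices doing the work here are `n_jobs=-1`, which spreads the search across every available core (or, on a cloud cluster, across workers), and the fixed `random_state`, which keeps the automated search consistent and repeatable from one run to the next.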
Reducing the turnaround time of science generation opens up larger and more complex datasets, and makes it possible to try and compare a wider range of techniques. It also gives the practitioner more time to focus on understanding the business context (what retail issue is this algorithm aiming to solve?) and on ensuring that the science effectively fulfils the business need.
Timeliness of data
The timeliness of data is also important. If your model only works on data that is at least a month old, then you have a conundrum:
- Do you provide a recommendation now, based on out-of-date information?
- Or do you wait a month to deliver the insight, by which point the business has already made their decision and moved on?
By working with the IT and data teams, you can fix the data flows so that they are timely, and think about which thresholds of improvement will make a real difference. Realistically, there are a number of distinct levels, and the value to the business only increases significantly when the time lag drops to the next level:
- Under one month
- Under one week
- Under one day
- Under one hour
- Under one minute
- Under one second
Reducing the data lag from one month to two weeks may help a little, but is unlikely to make a huge difference. Getting the data lag below a week will have a significant impact, though, and if it’s possible to get it below a day, even better. As for under one second, that opens up something completely different…
On-demand scoring
Reducing calculation time to a few seconds opens up the possibility of calculating values on demand rather than ahead of time. Whilst pre-calculated values are inherently limited (there are only so many combinations that can be calculated and scored in advance), on-the-fly scoring can include contextual features that reflect what is happening right now:
- What has the customer just done?
- What is happening in the market right now?
- What are the current conditions, such as the weather or news events?
The ability to calculate customer features in the moment also means that they will always be fresh and accurate, based on the latest information, which matters both for customer satisfaction and for compliance with privacy legislation.
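As an illustration only (the feature names, model and data below are hypothetical, not a real retail system), an on-demand scoring function might combine stored customer features with context captured at the moment of the request:

```python
# Sketch of on-demand scoring: stored customer features are combined with
# contextual features gathered at request time, and scored on the fly rather
# than relying on values pre-calculated overnight. All names are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in model; in practice this would be loaded from a model registry.
X, y = make_classification(n_samples=2_000, n_features=4, random_state=42)
model = LogisticRegression().fit(X, y)

def score_on_demand(customer: dict, context: dict) -> float:
    """Score a single request using the freshest available information."""
    features = np.array([[
        customer["visits_last_7_days"],       # stored, regularly refreshed feature
        customer["basket_value_last_visit"],
        context["minutes_since_last_click"],  # what the customer just did
        context["local_temperature_c"],       # current conditions, e.g. weather
    ]])
    return float(model.predict_proba(features)[0, 1])

# Example call at request time, with context gathered in the moment.
print(score_on_demand(
    customer={"visits_last_7_days": 3, "basket_value_last_visit": 42.5},
    context={"minutes_since_last_click": 2, "local_temperature_c": 18.0},
))
```

The point of the sketch is the shape of the call, not the model: because the contextual inputs are read at request time, every score reflects what the customer and the market are doing right now.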
Breadth not depth
And last but not least, with any given modelling problem, there will come a time when the solution is working just fine and needs relatively little support or maintenance. A data scientist’s instinct may be to keep refining the model further, perhaps to eke out a few more fractions of a percent of predictive accuracy. But this is often a bad decision.
Chasing marginal improvements in model performance can lead to very complex models, which are likely to be fragile and may perform badly when they encounter situations and data that differ from what they’ve been trained on. By chasing model performance, you can end up with a business outcome that is quite the opposite of what was intended.
Retail has an endless array of issues to solve. Data scientists should think in terms of breadth of business problems, not depth: the faster those other topics are tackled, the greater the overall value that can be delivered to the retailer or CPG brand and, ultimately, to their end users, the customers and shoppers.