8 January 2019
This article originally appeared in CIO Applications Europe
As an organisation that pioneered the practice of Customer Data Science, we’ve spent 30 years demonstrating that data analysis is the key tool for forecasting consumer behaviour. Partnering with retailers around the world has helped us drive innovation in the discipline. In truth, our clients were quick to acknowledge that the best predictor of customer behaviour wasn’t market research but customer behaviour itself.
But it has taken the best part of those 30 years for the technology behind data analysis to provide almost real-time answers to the questions retailers ask us. And in the complex and competitive world of grocery retail, having information fast is becoming essential.
The growing demand for more sophisticated data analysis across retail is helping to energise the discipline. I suspect we’re still heading steadily upwards to a plateau in the data science industry, but as we make that journey there are challenges along the way.
We need more people…now
Scarcity of talent is a real issue facing the industry now and it’s not likely to go away in the short term. Data science roles are attracting more graduates and learning institutions are responding with more relevant courses, but there remains a significant skills gap. IBM predicts that demand for data scientists will soar by 28 percent by 2020. We know that client companies running their own data science operations face high attrition rates and escalating wage bills. Anyone starting a data analysis business today will need a healthy cheque book, and a lot of patience, to secure the best talent.
At dunnhumby we recognised the threat posed by the lack of data scientists over a decade ago. This was one of the key factors in our decision to establish a data analysis centre of excellence in India. Today we have a significant number of data specialists working there. Our experience has helped us build a strong foothold in the country and a reputation as a good employer.
Automation will prove part of the solution
Data scientists everywhere are often doing work that they’re over-qualified for. In an environment where good people are at a premium, it’s perhaps ironic that so much time can be spent on mundane tasks such as cleaning and managing data. Using talent more effectively is what gives rise to the major breakthroughs that push our industry forward. Seen in this way, automation will be a great solution to the labour challenge. Ideally, data scientists should be tasked with developing great ideas, and should be able to lean on automation to make executing those ideas virtually effortless.
We need to grasp the full potential of machine learning
Faster and smarter technology has introduced a fundamentally different way to approach data problems. Previously, we would develop a hypothesis then go about testing it. Testing might take 40 hours and do nothing more than prove our various hypotheses wrong. It was like searching through a haystack without any certainty that there was actually a needle in it.
Computing power today means we can analyse data far faster, which is essential when you consider we work with nine terabytes of data that includes information about 44 billion buying transactions covering 320 billion items. This gives us the luxury of being able to analyse data without a particular hypothesis in mind. Doing so can throw up multiple sets of results, which can then be developed into a more rigorous hypothesis. Instead of exploring ideas, we explore data to generate ideas.
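To make the idea of exploring data to generate ideas concrete, here is a minimal sketch in Python. It is an illustration only, not a description of dunnhumby's actual methods: the toy baskets, the function name `rank_pairs_by_lift` and the `min_support` threshold are all assumptions made for the example. It counts every pair of items bought together and ranks the pairs by lift, so that candidate hypotheses emerge from the data rather than being chosen up front.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets; real transaction data would be vastly larger.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"beer", "crisps"},
    {"bread", "jam"},
    {"beer", "crisps", "bread"},
]


def rank_pairs_by_lift(baskets, min_support=2):
    """Rank item pairs by lift without naming any pair in advance.

    Lift compares how often two items appear together with how often
    they would appear together if they were bought independently.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(basket), 2)
    )

    results = []
    for (a, b), count in pair_counts.items():
        if count < min_support:
            continue  # ignore pairs seen too rarely to be meaningful
        lift = (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        results.append(((a, b), count, lift))

    # Highest lift first: these pairs are candidate hypotheses to test.
    return sorted(results, key=lambda r: r[2], reverse=True)


candidates = rank_pairs_by_lift(baskets)
for pair, count, lift in candidates:
    print(pair, count, round(lift, 2))
```

The pairs at the top of the list are not conclusions. They are starting points that would still need to be developed into a rigorous hypothesis and tested properly.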
GDPR and privacy remain key issues for data scientists
Loss of reputation and customer confidence is the biggest threat facing companies that fail to implement strict data security policies, and GDPR has introduced a new rigour in the ways we must think about data. So, while our analysts are interested in data, not personal information, developing a ‘secure data culture’ must be paramount.
At dunnhumby we’ve been moving towards open-source cloud computing over the past few years. This has helped us produce results faster but brings with it a whole range of other issues around access and security. One example is the need for strict policies that ensure temporary data sets are deleted.
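One way to back a deletion policy with code, rather than relying on people remembering to clean up, is to tie the lifetime of temporary data to the task that creates it. The Python sketch below is an assumption-laden illustration, not our production approach: the function name `summarise_with_temp_data` and the file name `intermediate.csv` are hypothetical, and it covers only local disk, not cloud storage, where the equivalent would depend on the provider's own tooling.

```python
import csv
import tempfile
from pathlib import Path


def summarise_with_temp_data(rows):
    """Write an intermediate file, use it, and guarantee it is deleted.

    `rows` is an iterable of (label, amount) pairs. The temporary
    directory is removed when the `with` block exits, even if an
    exception is raised part-way through.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = Path(tmp_dir) / "intermediate.csv"  # hypothetical name

        with tmp_path.open("w", newline="") as f:
            csv.writer(f).writerows(rows)

        with tmp_path.open(newline="") as f:
            total = sum(float(amount) for _, amount in csv.reader(f))

    # By this point the directory and everything in it have been deleted.
    return total, tmp_path


total, leftover = summarise_with_temp_data([("a", 1.5), ("b", 2.5)])
print(total, leftover.exists())
```

The design choice here is that deletion is not a separate step anyone can forget: it happens automatically when the work finishes or fails.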
But the future remains bright
I’m convinced that the biggest data analysis breakthroughs in retail are still ahead of us. Technology has given us the tools; it’s up to us to develop the great ideas. To help us tap into tomorrow’s talent, and evolve our techniques, we’ve developed relationships with several universities and offer sponsorship for data science students. Other industries have shown how effective strong links with academia can be in accelerating breakthrough developments, and life sciences is a classic example. It’s my hope that ours can do the same.
  https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020