21 June 2017
Staying ahead of the competition in the world of data science is no mean feat, with new techniques and new technology constantly moving the goalposts. Part of dunnhumby’s approach to ensuring we can offer our retail and manufacturing clients the benefits of the latest scientific thinking is to build strong links with world-leading academics and institutions. This keeps us at the forefront of the latest developments and helps us advance the types of data science that power our products.
A good example of how we are investing in the latest scientific thinking and talent is our sponsorship of PhD student Roxana Pamfil from Oxford University’s Industrially Focused Mathematical Modelling (InFoMM) Centre for Doctoral Training (CDT). Oxford’s InFoMM CDT is a partnership between the EPSRC (Engineering and Physical Sciences Research Council), the University of Oxford, and a large number of industry partners.
Our collaboration with Roxana involves analysing anonymised data from loyalty card customers to determine the best product recommendations for them. The purchases are represented as a bipartite network, in which a connection (or ‘edge’) between a customer node and a product node indicates that the customer purchased the product. The aim of the project is to cluster customers and products together in order to provide more relevant product recommendations. An example of this is demonstrated in the graphic below:
To identify the best clusters or communities, we are applying two techniques: stochastic block models and modularity maximisation. In the stochastic block model approach, the data is represented as a statistical model from which we can determine how likely a product or customer is to belong to each community. Modularity maximisation searches for the communities in which the number of connections within each community most exceeds the number expected by pure chance.
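To make the two ideas more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used in the project: the purchase data and community labels are invented, the function names `bipartite_modularity` and `block_densities` are hypothetical, and the communities are fixed by hand rather than found by an optimiser or fitted by inference. The modularity score follows the standard bipartite (Barber) form, and the block densities are the connection probabilities a simple stochastic block model would estimate for a given assignment.

```python
from collections import Counter

# Hypothetical toy data: each pair means "this customer bought this product".
# Each (customer, product) pair is assumed to appear at most once.
purchases = [
    ("c1", "milk"), ("c1", "bread"),
    ("c2", "milk"), ("c2", "bread"),
    ("c3", "nappies"), ("c3", "wipes"),
    ("c4", "nappies"), ("c4", "wipes"), ("c4", "bread"),
]

# Hypothetical hand-picked assignment of customers and products to communities.
community = {
    "c1": 0, "c2": 0, "milk": 0, "bread": 0,
    "c3": 1, "c4": 1, "nappies": 1, "wipes": 1,
}


def bipartite_modularity(edges, community):
    """Score an assignment: within-community edges minus those expected by chance."""
    m = len(edges)
    customer_degree = Counter(c for c, _ in edges)
    product_degree = Counter(p for _, p in edges)

    # Observed fraction of edges joining a customer and product in the same community.
    observed = sum(community[c] == community[p] for c, p in edges) / m

    # Expected fraction if edges were placed at random while keeping each
    # node's number of connections (its degree) unchanged.
    expected = sum(
        customer_degree[c] * product_degree[p]
        for c in customer_degree
        for p in product_degree
        if community[c] == community[p]
    ) / (m * m)

    return observed - expected


def block_densities(edges, community):
    """Estimate, for each pair of communities, the share of possible
    customer-product pairs that are actually connected."""
    customers = {c for c, _ in edges}
    products = {p for _, p in edges}
    n_customers = Counter(community[c] for c in customers)
    n_products = Counter(community[p] for p in products)
    edge_count = Counter((community[c], community[p]) for c, p in edges)
    return {
        (r, s): edge_count[(r, s)] / (n_customers[r] * n_products[s])
        for r in n_customers
        for s in n_products
    }


if __name__ == "__main__":
    print("modularity:", bipartite_modularity(purchases, community))
    print("block densities:", block_densities(purchases, community))
```

A higher modularity score means the chosen communities capture more purchases than chance alone would explain, while placing every customer and product in a single community scores zero. Modularity maximisation would search over many candidate assignments for the highest score, a step this sketch leaves out.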
We continually work to improve these models by adding more detailed purchase information to the networks, for example weights on the connections between customers and products, and product information in the form of category labels to improve the accuracy of product communities. One area of specific interest is how the network changes over time and whether the ‘customer-product’ community structure persists, in particular how important seasonality is for each community. As always, we will be led by the data to find the best solution for customers going forward.
The results of this science collaboration will feed into several of dunnhumby’s capabilities, improving the efficiency of current techniques and helping to develop new solutions to key issues faced by retailers and manufacturers trying to improve the shopping experience for their customers.
One of our key academic advisers, Professor Mason Porter from UCLA, is presenting on ‘Mathematics for the modern economy’ at The Royal Society (London) on Wednesday 28 June. Professor Porter will be drawing on his experience of working with industrial partners like dunnhumby to discuss the role of industrial mathematics in business. Chris Brooks, Head of Science at dunnhumby, will also be sharing examples of where leading science, developed through our collaboration with academic partners, has helped to keep dunnhumby customer science at the cutting edge. For more information about the event, click here.
Roxana Pamfil will present the findings of her work on ‘Analysing Consumer Preference in Grocery Stores using Annotated Networks’ at the Society for Industrial and Applied Mathematics (SIAM) Network Science conference (Pittsburgh, USA) on 13-14 July. For more information about this event, click here.