26 November 2019
Just as the phrase ‘Big Data’ had everyone captivated about 10 years ago, the buzzword of today that has businesses aflutter is undoubtedly Artificial Intelligence (AI). Open any journal or industry publication and it seems that everyone is talking about AI. Whether through voice-activated assistants like Alexa, image recognition or the promise of self-driving cars, AI seems to be affecting many areas of our lives and aspects of business.
AI is broadly defined as computers exhibiting human-like intelligence. In theory, it could include everything from self-aware robots taking over the world at one extreme, to statistical methods at the other. But what exactly does that mean in practice?
From a data scientist’s perspective, AI is an umbrella term for a whole host of techniques where machines are fed data in a way that enables them to learn for themselves (so-called “machine learning” techniques), including:
With so many techniques available, which is the right one to use to solve a particular task? This will vary from problem to problem, as well as over time, and it is not always possible to know which will be the best in advance.
So, it’s important to experiment with as many relevant techniques as possible in order to choose the one that performs the best for the specific task at hand. The biggest mistake a data scientist can make is to stick to their favored technique and assume it will always produce the best results.
For example, deep learning has made great progress with image and speech recognition, but for other tasks, such as forecasting monthly sales, it can perform worse than other approaches. It is worth considering alongside an array of other techniques, but every candidate should be compared using objective performance criteria that measure how well each approach actually performs.
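To make the idea of objective comparison concrete, here is a minimal sketch that scores three simple forecasting approaches on the same held-out months using mean absolute error. The sales figures are invented for illustration, and the function names (`last_value`, `moving_average`, `seasonal_naive`) are hypothetical helpers rather than part of any particular library; a more complex model would simply be added to the same dictionary and judged by the same criterion.

```python
from statistics import mean

# Synthetic monthly sales (illustrative numbers only): three years with a yearly pattern.
sales = [
    100, 95, 110, 120, 130, 150, 160, 155, 140, 125, 115, 170,
    108, 101, 118, 131, 139, 162, 171, 168, 150, 133, 124, 182,
    115, 109, 127, 140, 151, 173, 185, 179, 161, 144, 132, 196,
]
train, test = sales[:24], sales[24:]  # hold out the final 12 months


def last_value(history, horizon):
    """Repeat the most recent observation."""
    return [history[-1]] * horizon


def moving_average(history, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon


def seasonal_naive(history, horizon, season=12):
    """Repeat the value observed one season earlier."""
    return [history[-season + (i % season)] for i in range(horizon)]


def mae(actual, predicted):
    """Mean absolute error: lower is better."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Every candidate is scored on the same held-out data with the same metric.
candidates = {
    "last_value": last_value,
    "moving_average": moving_average,
    "seasonal_naive": seasonal_naive,
}
scores = {name: mae(test, fn(train, len(test))) for name, fn in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:15s} MAE = {score:.2f}")
print("best approach on the held-out months:", best)
```

On this made-up series the seasonal approach comes out ahead, but the point is the procedure rather than the winner: on a different dataset the ranking could easily change.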
Given the sheer number of options available to data scientists, trying out every variation by hand quickly becomes overwhelming and untenable. Automated solution search is not only faster, but can also reliably test thousands of combinations. However, not everything can be automated, and this is where the role of the data scientist really comes into play.
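As a rough illustration of automated search, the sketch below enumerates every combination in a small, hypothetical search space and scores each one in exactly the same way. The series is synthetic, and `blended_forecast` and its two parameters are assumptions made up for this example; real tools search far larger spaces, but the loop has the same shape.

```python
import math
from itertools import product
from statistics import mean

# Synthetic monthly series (illustrative only): a trend plus a 12-month cycle.
series = [100 + 1.5 * t + 20 * math.sin(2 * math.pi * t / 12) for t in range(48)]
train, valid = series[:36], series[36:]


def blended_forecast(history, horizon, window, seasonal_weight):
    """Blend a seasonal-naive forecast with a moving-average forecast."""
    level = mean(history[-window:])
    return [
        seasonal_weight * history[-12 + (i % 12)] + (1 - seasonal_weight) * level
        for i in range(horizon)
    ]


def mae(actual, predicted):
    """Mean absolute error: lower is better."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical search space: 4 windows x 5 weights = 20 combinations.
grid = {
    "window": [1, 3, 6, 12],
    "seasonal_weight": [0.0, 0.25, 0.5, 0.75, 1.0],
}

results = []
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    score = mae(valid, blended_forecast(train, len(valid), **params))
    results.append((score, params))

# Pick the lowest-error combination (the key avoids comparing dicts on ties).
best_score, best_params = min(results, key=lambda result: result[0])
print(f"tried {len(results)} combinations; best {best_params} with MAE {best_score:.2f}")
```

Because the winning combination is chosen on the validation months, its score is likely to be optimistic; a further untouched set of data would be needed for an honest final estimate.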
Human intervention is essential to set up technical solutions in the right way when solving problems with AI. What exactly is being predicted? What business knowledge is needed to shape the solution? How is success going to be measured? And, most importantly of all, how can we ensure that the predictor generalizes to data it hasn’t seen yet, rather than just fitting the data it was trained on?
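The generalization question can be illustrated with a small sketch that fits two models on one half of a dataset and then scores them on the other, unseen half. The data is synthetic, and both models (`fit_memorizer`, which copies the nearest training example, and `fit_line`, a least-squares straight line) are hypothetical stand-ins chosen to make the contrast obvious.

```python
from statistics import mean

# Synthetic data (illustrative only): a linear signal plus a repeating disturbance.
noise = [4, -4, -4, 4]
points = [(x, 3 * x + 10 + noise[x % 4]) for x in range(20)]
train = points[0::2]  # data the models are fitted on
test = points[1::2]   # data the models have never seen


def fit_memorizer(data):
    """Return a predictor that copies the target of the nearest training point."""
    def predict(x):
        nearest = min(data, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict


def fit_line(data):
    """Return an ordinary least-squares straight-line predictor."""
    xs, ys = zip(*data)
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in data)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mae(predict, data):
    """Mean absolute error of a predictor over (x, y) pairs."""
    return mean(abs(y - predict(x)) for x, y in data)


report = {}
for name, fit in {"memorizer": fit_memorizer, "line": fit_line}.items():
    model = fit(train)
    report[name] = {"train": mae(model, train), "test": mae(model, test)}

for name, errors in report.items():
    print(f"{name:10s} train MAE = {errors['train']:.2f}  test MAE = {errors['test']:.2f}")
```

The memorizing model looks perfect on the data it was trained on, yet does noticeably worse than the simple line on the unseen half, which is why performance should always be judged on held-out data.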
While the lure of complex AI might be attractive to businesses wanting to modernize their problem solving, sometimes the simpler approaches are actually better. There is a constant trade-off between effort and return. Complex models may take a lot of time, resources and effort to build, and this isn’t always offset by incremental business benefit.
It may also be possible to save significant time by building on what others have already done. For example, many researchers have already spent thousands of hours laying the foundations for image recognition. So, rather than starting from scratch with just the pixels of an image, which could take a long time to produce anything of value, a pre-trained image network (such as ResNet) can be used to extract the distinguishing features of an image, which then provide a stronger basis for building an image prediction model.
So while many businesses are eager to jump on the AI bandwagon, it’s important to understand how and where each technique can truly make a difference, and when it is worth investing in complex approaches. Just as you don’t need a sledgehammer to crack a nut, complex AI techniques may not be the best way to answer a particular business issue. Employing experienced data scientists will help determine the best techniques that deliver optimum results, balancing effort against likely return.