Agile: does it work for data science teams?

What is Agile?

 

Agile is a well-established approach to project management and software engineering. The core principles include having small iterations with continuous feedback and improvements. This allows teams to deliver value to clients faster while ensuring that what they are delivering meets the client’s needs.

There are many different Agile methods that can be used, each with their own set of tools and techniques. Two of the most popular methods are Scrum and Kanban. With Scrum, teams are organised with specific roles and responsibilities. The team commits to delivering a set of tasks in a fixed time interval (sprint). This is particularly useful when there is a specific deadline to be met. With Kanban, a visual approach to managing the team’s workload is adopted, and teams focus on reducing Work in Progress (WIP) by finishing current tasks before starting new ones. This is achieved using a Kanban board with a continuous workflow structure.
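As a rough illustration of the WIP idea, the sketch below models a board where a task can only be started if the "in progress" column has room. The `KanbanBoard` class, its `wip_limit` parameter and the task names are invented for this example; they are not part of any particular Kanban tool.

```python
from dataclasses import dataclass, field


@dataclass
class KanbanBoard:
    """Toy Kanban board with a WIP limit on the 'in progress' column (hypothetical)."""
    wip_limit: int
    todo: list = field(default_factory=list)
    in_progress: list = field(default_factory=list)
    done: list = field(default_factory=list)

    def start(self, task: str) -> bool:
        """Pull a task into 'in progress' only if the WIP limit allows it."""
        if task not in self.todo or len(self.in_progress) >= self.wip_limit:
            return False
        self.todo.remove(task)
        self.in_progress.append(task)
        return True

    def finish(self, task: str) -> None:
        """Move a task from 'in progress' to 'done', freeing a WIP slot."""
        self.in_progress.remove(task)
        self.done.append(task)


# Example task names are assumptions chosen for illustration.
board = KanbanBoard(wip_limit=2, todo=["explore data", "build features", "train model"])
board.start("explore data")
board.start("build features")
blocked = not board.start("train model")  # refused: WIP limit of 2 reached
board.finish("explore data")              # finishing a task frees a slot
unblocked = board.start("train model")    # now the new task can be started
```

The point of the limit is visible in the last three lines: the third task cannot be started until one of the current tasks has been finished.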

At dunnhumby, we have adopted Agile ways of working in several of our Science teams. These teams are cross-functional, bringing together data scientists, data science engineers, data engineers, product owners and internal stakeholders. The Agile methodology that is followed depends on the project, but the aim is always to find an approach that works well for the team. If certain aspects are not working, retrospective sessions, where the team discuss what has worked well and poorly, are the perfect opportunities to make improvements to the process.

 

How does data science work differ from software engineering?

 

At its heart, data science is all about using data to create actionable insight. It is part software engineering, part research and part innovation. In a traditional Agile framework for software engineering, the expected deliverable after each iteration is working software. However, in data science this is not necessarily the case. Instead, the deliverable could be the results of an exploratory data analysis, a collection of metrics or even the evidence that the proposed approach is infeasible. The important point for Science teams to consider when adopting Agile methodologies is that these ‘measurable deliverables’ can be reviewed and feedback can be given on them.

 

Agile for Science teams

 

As a data science manager at dunnhumby, I have had the opportunity to try out a hybrid Agile method called Scrumban with my science team and have learned what works well and what doesn’t work as well.

What works well?

Clearly defined tasks

A common pitfall of data science projects is a lack of clarity about tasks and their expected deliverables, which can lead to scope creep or to tasks overrunning because time is spent pursuing avenues that are ultimately fruitless. With an Agile approach, each task is defined with a clear deliverable and timeline. This gives clarity both to the data scientist performing the work and to the wider team, who know what will be accomplished during the task. Additionally, regularly completing tasks can be very motivating compared to making incremental progress within a very large project.

Regular demos

Demo sessions are opportunities to share significant results and milestones with the wider team. Regular demos keep the whole team engaged in the work being delivered and are great opportunities for cross-functional learning and ideation. Unlike traditional data science projects, where internal stakeholders may only be involved at a monthly or quarterly cadence, demos give the opportunity for regular feedback and course correction (when necessary). Not everyone needs to demo at every session; a demo is only needed once a significant piece of work or milestone has been achieved.

Retrospectives

Retrospective sessions are times for the team to regularly reflect on what has worked well, what hasn’t worked well and why. They offer a chance to make changes to the current process in order to improve the team’s ways of working. This is a big step forward from traditional data science projects where there is no formalised feedback system for data scientists to openly discuss ideas and approaches to better the team.

What doesn’t work as well?

Scoping can be very hard

Due to the uncertain nature of many data science projects, it can be very hard to accurately estimate the effort required. Before starting a project, it may not be clear where the pain points will lie. Some seemingly straightforward tasks may expand greatly in scope after initial investigations whereas other complicated-looking problems may be solved with a simple solution. Therefore, care must be taken when scoping tasks and it is common to have to rescope and/or create more sub-tasks once the work is underway.

Allowing space for innovation

With each task having a clear deliverable and timeline, data scientists could easily be overly focussed on finishing their tasks quickly rather than spending time exploring alternative approaches. While this may lead to rapid delivery, it may miss out on innovative solutions that could have large future impacts. Therefore, depending on the project, care should be taken to allow space for innovative thinking, allowing the data to dictate the approach.

Summary

Applying Agile approaches to Science teams can have great benefits. For example, clearly defined data science tasks, regular demos and retrospectives drive clarity, engagement and ownership within teams. However, accurately scoping tasks can be difficult, and the innovative, uncertain nature of data science means certain projects may not fit as easily within Agile frameworks as those which are closer to software engineering in nature.
