
Disaggregating Forecasts

A look at different components

 

Act 1, Flexibility

Interior. Office. Meeting room with flashy graph on screen.

FORECASTER

“Implementing the new pricing plan will produce 30,000 extra unit sales.”

 

DECISION MAKER

“Interesting. How will the plan impact the sales from loyal customers?”

 

FORECASTER

“I’ll get back to you.”

Six months later

FORECASTER

“Sales from loyal customers will increase by 20,000 units.”

 

DECISION MAKER

“Interesting. Will the extra sales be linked more to families with children or single professionals?”

 

FORECASTER

“I’ll get back to you.”

Six months later

<repeat pattern until DECISION MAKER is SATISFIED or TIME RUNS OUT>

Disaggregating forecasts through a variety of lenses helps the decision maker evaluate a plan’s likely outcome before rolling it out. For a lens to be useful, it must be delivered in time to inform the decision. Unless the lens is known in advance, implementing it could take more time than the decision maker has.

Because the lenses of interest may vary by decision maker, circumstance, available information and many other factors, the forecaster needs a modelling framework that is flexible enough to accommodate new lenses with minimal change to the fitting and forecasting pipelines.1

Act 2, Coherence

FORECASTER

“The total units next week will be 100.”

 

DECISION MAKER

“How many from loyal customers?”

 

FORECASTER

“70 units from loyal customers out of 150 total units for the week.”

 

DECISION MAKER

“How many from young professionals?”

 

FORECASTER

“20 units from young professionals out of 130 total units for the week.”

 

DECISION MAKER

“How much will we be selling in total then?”

 

FORECASTER

“Either 100, 150 or 130 units.”

 

DECISION MAKER

“Should I take the average of those figures?”

 

FORECASTER

“If you want to.”

The decision maker is rightfully confused by forecasts from different lenses that do not add up when aggregated to a common level, such as the overall total. Technically, this is referred to as a lack of coherence among the forecasts.

It is well-known that models trained independently on different aggregations are unlikely to produce coherent forecasts, even if the underlying low-level training data is the same.2

Solutions to both the flexibility and coherence challenges are available.

Here, we will examine some options to consider when designing a mathematical framework for disaggregating forecasts along generic dimensions.

Reconciliation of Forecasts

In circumstances where all lenses of interest are known in advance, the forecaster could take a twofold approach:

  1. produce independent low-level models for each lens in advance
  2. funnel their forecasts through a reconciliation module.

A forecast reconciliation process takes a collection of forecasts and adjusts them, as little as possible,3 to enforce coherence among them.4 By design, the adjustment to each forecast depends on which other forecasts are considered. Therefore, the reconciled forecast for a specific lens will change when new lenses are introduced. Reconciliation approaches guarantee coherence only within a fixed set of lenses.
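To make the mechanics concrete, here is a minimal numpy sketch of the least-squares (OLS) flavour of reconciliation described in Hyndman et al. (2011);4 the hierarchy and the figures are purely illustrative.

```python
import numpy as np

# Summing matrix S maps the bottom-level series (three illustrative
# shopper segments) to the full hierarchy [total, seg1, seg2, seg3].
S = np.array([
    [1, 1, 1],   # total = sum of the segments
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# Incoherent base forecasts for [total, seg1, seg2, seg3]:
# the segments sum to 115, not to the forecast total of 100.
y_hat = np.array([100.0, 70.0, 20.0, 25.0])

# OLS reconciliation: project the base forecasts onto the coherent
# subspace via S (S'S)^{-1} S'.
P = np.linalg.solve(S.T @ S, S.T)   # (S'S)^{-1} S'
y_tilde = S @ P @ y_hat             # reconciled forecasts

print(y_tilde)                        # all four adjusted forecasts
print(y_tilde[0], y_tilde[1:].sum())  # equal by construction
```

Note that if a new lens were added to the collection, the projection would change and every reconciled figure with it.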

In cases where lenses are not necessarily known in advance, we need a solution that guarantees coherence across different sets of lenses too – otherwise, we slip back into the comedy scenario of Act 2.

 

Fully Bottom-Up Approach

It is always tempting to suggest modelling at a very low level so that every lens of interest can be implemented simply as a sum of the low-level models, as the sketch below illustrates. Despite guaranteeing coherence and flexibility, this approach needs to be evaluated in the context of its statistical and computational implications.
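A minimal pandas sketch of the idea, with hypothetical bottom-level forecasts and made-up column names: because every lens is a sum over the same bottom level, coherence is automatic.

```python
import pandas as pd

# Hypothetical bottom-level forecasts, one row per shopper
# (illustrative identifiers and attributes).
bottom = pd.DataFrame({
    "shopper_id": [1, 2, 3, 4],
    "loyalty":    ["loyal", "loyal", "occasional", "other"],
    "household":  ["family", "single", "family", "single"],
    "units":      [3.2, 1.5, 2.1, 0.7],
})

# Any lens is just a sum over the bottom level:
print(bottom.groupby("loyalty")["units"].sum())
print(bottom.groupby("household")["units"].sum())
print(bottom["units"].sum())   # every lens aggregates to the same total
```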

If the modelling level is too low:

  • each model may not have enough data to support reliable forecasts
  • the computational requirements associated with estimating all the low-level models may be prohibitive.

If we want a system that is flexible enough to support coherent forecasts along arbitrary classifications of shoppers, not knowing the set of classifications in advance forces us to set the individual shopper-transaction level as our modelling level.5 This approach is untenable on both fronts:

  • Statistical feasibility
    In each modelling unit, there’s not enough data variation to support learning.
  • Computational feasibility
    Even if statistical feasibility were not an issue, when we think of all the transactions taking place in a supermarket chain over two years,6 we soon realise the enormous computational challenge of feeding data into the model-fitting process at such a disaggregated level.

 

Apportionment Techniques

A feasible approach to producing flexible and coherent disaggregation of forecasts along arbitrary lenses can rely on the interplay between two groups of models:

  • Models of Units
    Used to forecast the units sold at a reference level of aggregation.
  • Models of Proportions
    Used to forecast the shares of units along further dimensions of interest.

Let’s imagine a scenario where, for a specific product, the model of units has forecast a total of 2,000 units for next week, and we are interested in knowing how many units we expect to sell to shoppers classified as loyal, occasional or other (we will refer to this classification as the loyalty segmentation).

If we can also produce next week’s forecast loyalty shares (for example, loyal: 30%, occasional: 50%, other: 20%) according to an independently trained model of proportions, then we can produce forecast units for each of the loyalty segments too, simply by multiplying the total of 2,000 units by each segment’s share in turn (loyal: 600, occasional: 1,000, other: 400). Coherence is guaranteed by construction.
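As a minimal sketch, the apportionment step amounts to one multiplication per segment (the figures are the illustrative ones above):

```python
# Forecast from the model of units at the reference level of aggregation.
total_units = 2000

# Forecast from the independently trained model of proportions
# (the loyalty shares must sum to 1).
shares = {"loyal": 0.30, "occasional": 0.50, "other": 0.20}
assert abs(sum(shares.values()) - 1.0) < 1e-9

# Apportion: each segment forecast is its share of the total, so the
# segment forecasts add back to the total by construction.
segment_units = {seg: total_units * p for seg, p in shares.items()}
print(segment_units)                # {'loyal': 600.0, 'occasional': 1000.0, 'other': 400.0}
print(sum(segment_units.values()))  # 2000.0, coherent with the total
```

What about flexibility?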

Choosing the appropriate models of proportions

If shares are:

  1. stable over time
    (no trend, no seasonality, no visible dynamic)
  2. not affected by decision variables
    (for example, for a pricing decision, we would check whether shares are affected by price and promotions)

then historical averages of proportions can be a quick and inexpensive way of forecasting shares.
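Under those two conditions, a historical-average share forecast can be as simple as the following pandas sketch (the weekly figures are made up):

```python
import pandas as pd

# Hypothetical weekly history of units sold by loyalty segment.
history = pd.DataFrame({
    "week":    [1, 1, 1, 2, 2, 2],
    "segment": ["loyal", "occasional", "other"] * 2,
    "units":   [55, 95, 40, 65, 105, 40],
})

# Convert each week's units into shares, then average over history.
weekly = history.pivot(index="week", columns="segment", values="units")
weekly_shares = weekly.div(weekly.sum(axis=1), axis=0)
share_forecast = weekly_shares.mean()   # historical-average shares
print(share_forecast)                   # sums to 1 by construction
```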

On the other hand, if the above conditions do not hold, averaging approaches can be misleading, and regression models of proportions with explicit dependencies on explanatory variables need to be employed.

Regression models of proportions are general-purpose models and come in many varieties: regression of empirical log-odds, Dirichlet regression, multinomial regression, and so on.7
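As one illustration, here is a minimal sketch of the empirical log-odds variant: regress the log-ratio of each share to a reference segment on an explanatory variable (price here), then invert back to shares. The data, segment names and price values are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical weekly training data: price (the decision variable)
# and the observed shares of three segments (each row sums to 1).
price = np.array([[1.9], [2.0], [2.1], [2.2], [2.3], [2.4]])
shares = np.array([
    [0.25, 0.55, 0.20],
    [0.27, 0.53, 0.20],
    [0.30, 0.50, 0.20],
    [0.33, 0.47, 0.20],
    [0.35, 0.45, 0.20],
    [0.38, 0.42, 0.20],
])

# Empirical log-odds of the first two segments against the last
# (reference) segment.
log_odds = np.log(shares[:, :2] / shares[:, 2:3])

# One multi-output linear regression on the explanatory variable.
model = LinearRegression().fit(price, log_odds)

# Forecast shares at a new price by inverting the log-odds (softmax,
# with the reference segment pinned at log-odds zero).
new_price = np.array([[2.5]])
eta = np.concatenate([model.predict(new_price), [[0.0]]], axis=1)
forecast = np.exp(eta) / np.exp(eta).sum(axis=1, keepdims=True)
print(forecast)   # shares sum to 1 and respond to the decision variable
```

Unlike a historical average, a model of this kind lets the forecast shares shift with the decision variables, which is exactly what a pricing decision requires.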

Once the appropriate variant is identified,8 it provides a flexible building block for a pipeline that disaggregates forecasts coherently.

Epilogue

DECISION MAKER and FORECASTER bond over a cup of tea.


1 Mathematical models that are appropriate at a certain level of aggregation might not be suitable for other levels. Though necessary at times, using different model families for different lenses can generate a significant resource overhead, adding to the time and cost of delivering value to decision makers. In constructing a flexible analytical pipeline, model families that can cover a wide range of circumstances are desirable.
2 Any aggregation has a specific information loss associated with it, which in turn affects the noise in the estimates of the model parameters – this is in the best-case scenario where there is no model misspecification (the generating and fitting models have the same model form).
3 The reconciliation is said to be optimal if it minimises the discrepancy between the initial forecasts and their reconciled version.
4 An example of an approach to forecasts reconciliation can be found in Hyndman, R. J., Ahmed, R. A., Athanasopoulos, G., & Shang, H. L. (2011). Optimal combination forecasts for hierarchical time series. Computational Statistics and Data Analysis, 55(9), 2579–2589.
5 A single shopper is not necessarily classified in the same way throughout history – for example, at some point, they might have been a new customer, then became a regular one, and then an occasional one. Therefore, even modelling at the shopper level (pooling across their transactions) would limit the spectrum of lenses that are expressible as simple sums of the bottom-level models.
6 Two years is the bare minimum needed to estimate seasonal patterns.
7 A survey of many of these models can be found in Morais J., Thomas‐Agnan C., & Simioni M. (2016). A tour of regression models for explaining shares. Working Papers, Toulouse School of Economics (No. 16‐742).
8 Examples of elements to consider in the identification process are: is there going to be a non-negligible proportion of zeros in the observed shares? How do these zeros arise? Would a continuous model be appropriate at all possible levels of interest?
