Building a Multi-Business Unit Rolling Forecast Model with Static and Future Inputs

In the first forecasting lab, we trained an LSTM on a single revenue series.

It worked, but it was also a simplified problem.

Real planning models rarely forecast one business unit in isolation.

Different parts of the business grow at different rates, have different seasonality patterns, and respond differently to market conditions.

For this lab, we wanted to move one step closer to that reality.

Instead of forecasting a single revenue stream, we created five business units:

Enterprise
SMB
APAC
EMEA
Public Sector

Each business unit was given its own growth profile, seasonality pattern, and revenue scale.

The goal was to see whether a single model could learn from all five business units at the same time while still understanding that each one behaves differently.
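As a rough illustration of what "its own growth profile, seasonality pattern, and revenue scale" can mean for synthetic data, here is a minimal sketch. The profile values, the cosine seasonality, and the names PROFILES and make_series are assumptions made for this example, not the numbers or code used in the lab.

```python
import math
import random

# Hypothetical profiles: base scale, monthly growth, seasonal peak, and
# seasonal amplitude are illustrative assumptions, not the lab's values.
PROFILES = {
    "Enterprise":    {"base": 1000.0, "growth": 0.010, "peak_month": 12, "amplitude": 0.15},
    "SMB":           {"base": 300.0,  "growth": 0.015, "peak_month": 6,  "amplitude": 0.10},
    "APAC":          {"base": 200.0,  "growth": 0.025, "peak_month": 3,  "amplitude": 0.08},
    "EMEA":          {"base": 400.0,  "growth": 0.008, "peak_month": 9,  "amplitude": 0.12},
    "Public Sector": {"base": 150.0,  "growth": 0.005, "peak_month": 9,  "amplitude": 0.20},
}

def make_series(profile, n_months=48, noise=0.02, seed=0):
    """Return n_months of synthetic revenue: compound growth x seasonality x noise."""
    rng = random.Random(seed)
    series = []
    for t in range(n_months):
        month = t % 12 + 1  # calendar month, 1..12
        trend = profile["base"] * (1 + profile["growth"]) ** t
        # Cosine seasonality that is highest in the profile's peak month.
        phase = 2 * math.pi * (month - profile["peak_month"]) / 12
        season = 1 + profile["amplitude"] * math.cos(phase)
        series.append(trend * season * (1 + rng.gauss(0, noise)))
    return series

# One series per business unit, each with its own seed.
revenue = {name: make_series(p, seed=i) for i, (name, p) in enumerate(PROFILES.items())}
```

Keeping the profile in a dictionary makes it easy to change one unit's behavior without touching the others.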

The First Problem

When multiple business units are combined into one model, an immediate problem appears.

Enterprise revenue is much larger than Public Sector revenue.

APAC grows faster than EMEA.

SMB has a different seasonal pattern than Enterprise.

If all the data is simply pooled, the model may struggle to tell which patterns belong to which business unit.

To address that, we introduced a business unit identifier.

Each business unit received a numeric ID that was passed into the model as a static feature.

Instead of treating every revenue series as identical, the model learns that different business units have different characteristics.

That was one of the main reasons for moving away from the simple LSTM used in the previous lab.
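A minimal sketch of the identifier idea, under some assumptions: the ID order below is arbitrary, and the small lookup table stands in for the learned embedding that TFT-style models commonly use for categorical static inputs. Whether the lab uses an embedding or the raw ID is not stated, and the names BU_TO_ID, init_embeddings, and static_vector are hypothetical.

```python
import random

BUSINESS_UNITS = ["Enterprise", "SMB", "APAC", "EMEA", "Public Sector"]

# Integer ID per business unit (the ordering is an arbitrary assumption).
BU_TO_ID = {name: idx for idx, name in enumerate(BUSINESS_UNITS)}

def init_embeddings(n_units, dim, seed=0):
    """Toy stand-in for a trainable embedding table: one small vector per unit."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_units)]

EMBEDDINGS = init_embeddings(len(BUSINESS_UNITS), dim=4)

def static_vector(name):
    """Look up the static feature vector for a business unit by name."""
    return EMBEDDINGS[BU_TO_ID[name]]

def tag_with_unit(name, history):
    """Attach the business unit ID to a window of history so samples can be pooled."""
    return {"bu_id": BU_TO_ID[name], "history": list(history)}
```

In a real model the table would be updated during training, so units with similar behavior could end up with similar vectors.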

Looking Beyond Historical Revenue

Many forecasting examples focus entirely on historical values.

Revenue goes in.

Forecast comes out.

For this experiment, we wanted the model to see information that would already be known before the forecast period begins.

Examples include:

Month
Quarter
Whether the month falls in Q4

These values are available in advance and do not require prediction.

The model receives those future calendar features separately from the historical revenue data.

That makes the forecasting process a little more realistic because planners usually know future calendar periods even if they do not know future revenue.
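The calendar features above can be generated without any forecasting at all. A small sketch, assuming calendar quarters (a fiscal year that starts in January) and a hypothetical helper name, future_calendar_features:

```python
def future_calendar_features(last_year, last_month, horizon=3):
    """Return month, quarter, and a Q4 flag for the `horizon` months after the last actual."""
    features = []
    year, month = last_year, last_month
    for _ in range(horizon):
        month += 1
        if month > 12:  # roll over into the next year
            month, year = 1, year + 1
        quarter = (month - 1) // 3 + 1  # assumes calendar quarters
        features.append({
            "year": year,
            "month": month,
            "quarter": quarter,
            "is_q4": int(quarter == 4),
        })
    return features

# Last actual month is November 2024, so the forecast covers Dec, Jan, Feb.
upcoming = future_calendar_features(2024, 11)
```

A company with a different fiscal calendar would need a different quarter mapping, but the features would still be known in advance.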

Building the Forecast Windows

The model uses:

12 months of history
3 months of forecast horizon

For every forecast, the model receives one year of historical information and predicts the next quarter.

This felt like a reasonable balance.

There is enough history to learn seasonal patterns and enough forecast horizon to support a rolling planning process.
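The windowing described above can be sketched as a sliding window over one business unit's series. The function name make_windows is an assumption for this example; only the 12 and 3 come from the lab.

```python
HISTORY = 12  # months of history fed to the model
HORIZON = 3   # months to predict

def make_windows(series, history=HISTORY, horizon=HORIZON):
    """Slide over a series and return (past, target) pairs of lengths history and horizon."""
    windows = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        target = series[start + history:start + history + horizon]
        windows.append((past, target))
    return windows
```

A series of 48 months yields 48 - 12 - 3 + 1 = 34 training windows, and a series shorter than 15 months yields none.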

Why We Tried a TFT-Style Architecture

The main experiment in this lab was replacing the simple LSTM structure from Lab 01 with a simplified Temporal Fusion Transformer design.

The model still contains an LSTM encoder.

But additional components were added.

Static business unit information is injected into the encoder.

Future calendar features are processed separately.

An attention layer allows future forecast periods to reference information from the historical sequence.

That was the part I was most interested in testing.

Rather than forcing all information through the final LSTM state, the model can revisit historical patterns through attention when generating forecasts.
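To make the attention step concrete, here is a dependency-free sketch of scaled dot-product attention, where each future month forms a query and looks back over the encoded history. This is the generic mechanism, not the lab's actual layer: the toy vectors stand in for LSTM encoder outputs, and the function names are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys, values):
    """Scaled dot-product attention: each query gets a weighted mix of the values."""
    scale = math.sqrt(len(keys[0]))
    contexts, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        context = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights

# Toy shapes matching the lab's windows: 12 history steps, 3 future steps.
history_states = [[math.sin(t), math.cos(t)] for t in range(12)]
future_queries = [[math.sin(t), math.cos(t)] for t in range(12, 15)]
contexts, weights = attend(future_queries, history_states, history_states)
```

Each row of weights sums to 1 and shows how much a given forecast month draws on each of the 12 historical months, which is also what makes attention useful for inspection.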

A Small Change That Made Sense

Another change from Lab 01 was the loss function.

The first model used Mean Squared Error.

This lab uses Huber Loss.

The reason was practical.

Forecast datasets often contain occasional spikes, unusual periods, or reporting noise.

Huber Loss is a middle ground between MAE and MSE: it is quadratic for small errors and linear for large ones, so it is less influenced by large individual errors than MSE.

For forecasting work, that seemed worth testing.
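For reference, the standard Huber loss can be written in a few lines. The threshold delta=1.0 is an assumed value for illustration; the setting used in the lab is not stated.

```python
def huber(error, delta=1.0):
    """Huber loss for one error: quadratic within delta, linear beyond it."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error ** 2
    return delta * (abs_err - 0.5 * delta)

def mean_huber(y_true, y_pred, delta=1.0):
    """Average Huber loss over paired actuals and predictions."""
    return sum(huber(t - p, delta) for t, p in zip(y_true, y_pred)) / len(y_true)

# An error of 3 costs 2.5 under Huber (delta=1) versus 9 under squared error.
small, large = huber(0.5), huber(3.0)
```

Because the penalty grows linearly past delta, a single spike pulls the fit far less than it would under MSE. The value of delta only makes sense relative to the scale of the targets, so it interacts with however revenue is normalized.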

Evaluating the Forecast

After training, the model was evaluated using standard forecasting metrics:

MAE
RMSE
MAPE
R²

The code also calculates MAPE separately for:

Month +1
Month +2
Month +3

This was useful because forecast quality often changes as the horizon increases.

A model that performs well one month ahead may behave very differently three months ahead.

Looking at the forecast horizon separately gives a better picture of model behavior.
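The metrics above follow their textbook definitions. The sketch below is not the lab's evaluation code: the function names are assumptions, and targets and predictions are assumed to be lists of three-month rows.

```python
import math

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if any actual is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mape_by_horizon(targets, preds):
    """MAPE per forecast step; targets and preds are lists of [m+1, m+2, m+3] rows."""
    horizon = len(targets[0])
    return [
        mape([row[h] for row in targets], [row[h] for row in preds])
        for h in range(horizon)
    ]

# Made-up numbers, only to show the shape of the inputs.
targets = [[100.0, 200.0, 400.0], [100.0, 200.0, 400.0]]
preds = [[110.0, 180.0, 300.0], [90.0, 220.0, 500.0]]
per_step = mape_by_horizon(targets, preds)  # one MAPE value per forecast month
```

Since MAPE is relative, it lets a large unit like Enterprise and a small one like Public Sector be compared on the same footing, which MAE and RMSE on raw revenue do not.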

Forecasting the Next Quarter

The final step was generating forecasts for each business unit.

The model takes the latest 12 months of history for each business unit, combines it with future calendar information, and predicts the next three months of revenue.

The output includes:

Monthly forecast values
Quarterly totals
Forecast comparison by business unit

That starts to look more like a planning deliverable than a machine learning experiment.
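Turning monthly forecasts into the outputs listed above is simple arithmetic. A minimal sketch with a hypothetical summarize helper; the forecast values are placeholders, not model output.

```python
def summarize(forecasts):
    """Build one row per business unit with monthly values and the quarterly total.

    forecasts maps unit name -> list of three monthly forecast values.
    Rows are sorted by quarterly total, largest first, for easy comparison.
    """
    rows = [
        {"unit": unit, "monthly": list(months), "quarter_total": sum(months)}
        for unit, months in forecasts.items()
    ]
    return sorted(rows, key=lambda row: row["quarter_total"], reverse=True)

# Placeholder numbers for illustration only.
example = summarize({
    "Public Sector": [150.0, 155.0, 160.0],
    "Enterprise": [1000.0, 1020.0, 1100.0],
})
```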

What I Found Interesting

The most interesting part of this lab was not the transformer architecture.

It was the combination of different information types.

The model was learning from:

Historical revenue
Business unit identity
Future calendar information

Those are all things planners already use when building forecasts.

The difference is that the model learns the relationships directly from the data.

Where This Could Go Next

This lab still uses synthetic revenue data.

The next logical step would be introducing business drivers.

Examples could include:

Headcount
Pipeline
Bookings
Marketing spend
Customer counts

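Where planned or budgeted values exist for the forecast period, those drivers could be passed into the future feature set alongside the calendar information; drivers that are only known after the fact would sit with the historical inputs instead.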

That would move the model closer to how rolling forecasts are built in real planning environments.

For now, the goal was simpler.

Move beyond a single time series.

Introduce business context.

Add future information.

Then see how the forecast changes.

That made this lab much more interesting than simply replacing one neural network with another.
