Cleaning up demand histories is a classic discipline for any forecaster in the supply chain. When generating a statistical forecast, you must first ensure that the historical data has been cleaned of outliers. Otherwise, beware of GIGO (Garbage In, Garbage Out)!
This historical clean-up is generally carried out on monthly buckets, sometimes on weekly buckets. It is often done only for the purpose of establishing forecasts, overlooking the other side effects of a raw history.
For example, the history is cleaned up so that the statistical forecasting module doesn’t go astray, but in a dark corner of the ERP system, a statistical safety stock formula still runs on all historical consumption, including outliers.
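To make this side effect concrete, here is a minimal sketch of how a single spike inflates a textbook safety stock calculation. It assumes the common formula z × σ × √(lead time); the formula actually used in any given ERP may differ, and the demand figures, lead time, and service factor below are made up purely for illustration.

```python
import math
import statistics


def safety_stock(daily_demand, lead_time_days, z=1.65):
    """Textbook safety stock: z * std dev of daily demand * sqrt(lead time).

    Assumption: a generic formula, not the one used by any specific ERP.
    """
    sigma = statistics.pstdev(daily_demand)
    return z * sigma * math.sqrt(lead_time_days)


# Illustrative daily demand (made-up numbers) with one spike on the 7th day.
raw_history = [10, 12, 9, 11, 10, 13, 95, 10, 11, 12]
# The same history with the spike adjusted down to a typical level.
cleaned_history = [10, 12, 9, 11, 10, 13, 13, 10, 11, 12]

ss_raw = safety_stock(raw_history, lead_time_days=5)
ss_cleaned = safety_stock(cleaned_history, lead_time_days=5)
print(f"Safety stock on raw history:     {ss_raw:.1f}")
print(f"Safety stock on cleaned history: {ss_cleaned:.1f}")
```

Because the standard deviation is very sensitive to extreme values, one unadjusted spike is enough to raise the result noticeably, even if the forecast itself was built on cleaned data.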
Clean data is also a prerequisite for running artificial intelligence algorithms. This is known as “data cleansing”.
Let’s look at the impact of an uncleaned history through a simple example.
We select a part on which we have detected three outliers.
This item is used very frequently – we had 267 days of consumption in the past year.
Let’s take a closer look at this demand history: do you see these three outliers?
Have you also noticed that instead of exploring an aggregated monthly or weekly view, we explore daily consumption? If you want to correctly size a stock, for example, it’s these daily demands that count – rather than averages that mask the reality of real demand signals.
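For readers who want to flag such points programmatically on daily data, here is a minimal sketch. It uses a median and MAD (median absolute deviation) rule, which is one common robust approach and not necessarily the detection method behind the screens described here. The function name `find_outliers`, the multiplier `k`, and the demand figures are assumptions made for illustration.

```python
import statistics


def find_outliers(daily_demand, k=5.0):
    """Return the indexes of days whose demand exceeds median + k * MAD."""
    median = statistics.median(daily_demand)
    deviations = [abs(qty - median) for qty in daily_demand]
    mad = statistics.median(deviations)
    if mad == 0:
        # Simplification: with no measurable spread, flag nothing.
        return []
    threshold = median + k * mad
    return [day for day, qty in enumerate(daily_demand) if qty > threshold]


# Illustrative daily demand (made-up numbers) with two obvious spikes.
daily_demand = [10, 12, 9, 11, 10, 13, 95, 10, 11, 12, 80, 9]
print(find_outliers(daily_demand))  # -> [6, 10]
```

A rule like this only proposes candidates. Deciding whether a flagged day is a genuine exception or part of normal demand remains a planner's judgment.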
We’ll adjust each of these outliers downwards with a simple drag & drop:
That’s more reasonable:
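The same downward adjustment can be sketched in code. The helper below is hypothetical (`adjust_points` is not a function of any particular tool) and the figures are made up. It returns an adjusted copy and leaves the raw history untouched, so the real demand remains available for audit.

```python
def adjust_points(daily_demand, adjustments):
    """Return a copy of the history with selected days adjusted downwards.

    `adjustments` maps a day index to its new, lower quantity.
    The raw history is not modified.
    """
    adjusted = list(daily_demand)
    for day, new_qty in adjustments.items():
        if new_qty > daily_demand[day]:
            raise ValueError(f"day {day}: adjustment must not exceed raw demand")
        adjusted[day] = new_qty
    return adjusted


# Illustrative daily demand (made-up numbers) with spikes on days 6 and 10.
raw_history = [10, 12, 9, 11, 10, 13, 95, 10, 11, 12, 80, 9]
cleaned_history = adjust_points(raw_history, {6: 13, 10: 12})
print(cleaned_history)
```

Keeping both series side by side makes it possible to size stock on the adjusted history while still reporting on what customers actually ordered.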
I hear objections from the back of the room: this is no longer the real demand!
That’s right—it’s demand adjusted for points that appeared to be statistical outliers. The real question is whether you should size the stock of this item to cover all actual demand with immediate availability, or instead meet recurrent demand from stock while adopting a different approach for exceptional requests. For example: “Dear customer, this is an exceptional request. Please let us know in advance—there may be a slight delay.”
Let’s see the impact on this item:
Before the historical correction, the red zone was 68,150.
After correction:
A little sweep of the data history, and the stock investment needed to achieve roughly the same service level drops by around 35%. It’s worth doing a bit of housekeeping, isn’t it?
Cleaning up historical data, and detecting and interpreting outliers, is an important discipline that deserves proper tooling.
If you’d like to explore the subject further using your data, please don’t hesitate to contact us!