Seasonal adjustment in Azure Synapse using sktime
This is a simple demonstration of processing data with Python on the Microsoft Azure Synapse platform. The example data I use are the ONS UK House Price Index (UK HPI) data. UK housing data tend to have a significant seasonal component.
I used pandas (https://pandas.pydata.org/) to load and manipulate the data, and sktime to analyse the time series and calculate the seasonally adjusted series. hvPlot is used for plotting.
The data are stored, and the notebook is run, entirely on Azure Synapse.
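To make "seasonally adjusted" concrete, here is a minimal sketch of an additive adjustment written with only numpy and pandas. It is not sktime's implementation: the function name `seasonally_adjust_additive` and the synthetic `prices` series are made up for illustration, and the method (centred moving-average trend, then per-month averages of the detrended values) is the textbook classical decomposition, assumed here as a reasonable stand-in.

```python
import numpy as np
import pandas as pd


def seasonally_adjust_additive(y: pd.Series, sp: int = 12) -> pd.Series:
    """Return y minus an estimated additive seasonal component of period sp.

    Illustrative helper only (hypothetical name, not part of sktime).
    """
    values = y.to_numpy(dtype=float)
    n = len(values)
    if n < 2 * sp:
        raise ValueError("need at least two full seasonal cycles")

    # A centred moving average spanning one full cycle estimates the trend.
    # For even sp the two end points get half weight (a "2 x sp" average).
    if sp % 2 == 0:
        weights = np.r_[0.5, np.ones(sp - 1), 0.5] / sp
    else:
        weights = np.ones(sp) / sp
    half = len(weights) // 2
    trend = np.full(n, np.nan)
    trend[half:n - half] = np.convolve(values, weights, mode="valid")

    # Average the detrended values by position in the cycle (Jan, Feb, ...);
    # NaNs at the two ends of the trend are skipped by mean().
    detrended = pd.Series(values - trend)
    position = np.arange(n) % sp
    seasonal = detrended.groupby(position).mean()
    seasonal = seasonal - seasonal.mean()  # force the pattern to average zero

    adjusted = values - seasonal.to_numpy()[position]
    return pd.Series(adjusted, index=y.index, name=y.name)


# Synthetic monthly series (made-up numbers): linear trend plus a fixed
# 12-month pattern that sums to zero.
months = pd.period_range("2000-01", periods=48, freq="M")
pattern = np.array([-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2], dtype=float)
trend_true = 100.0 + 0.5 * np.arange(48)
prices = pd.Series(trend_true + np.tile(pattern, 4), index=months, name="price")

adjusted = seasonally_adjust_additive(prices, sp=12)
```

On this synthetic input the adjusted series should coincide with the underlying linear trend, because the seasonal pattern is exactly periodic. Real house price data will not separate this cleanly.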
Notebook 1
In [26]:
import panel as pn
import pandas as pd
import hvplot.pandas
from sktime.param_est.seasonality import SeasonalityACF
from sktime.transformations.series.difference import Differencer
from sktime.transformations.series.detrend import ConditionalDeseasonalizer
In [27]:
dd=pd.read_csv("abfss://sktime@sktime.dfs.core.windows.net/UK-HPI-full-file-2023-11.csv",
               parse_dates=["Date"],
               dayfirst=True)
dd=dd.set_index([pd.DatetimeIndex(dd["Date"]).to_period("M"),
                 "RegionName"])
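The indexing step above can be tried on a small in-memory frame without access to the storage account. This is only a sketch: the rows in `raw` are made-up values that borrow the column names `Date`, `RegionName` and `AveragePrice` from the notebook, and `raw`, `indexed` and `london` are illustrative variable names.

```python
import pandas as pd

# Made-up rows; dates are day-first strings, hence dayfirst=True below.
raw = pd.DataFrame({
    "Date": ["01/01/1995", "01/02/1995", "01/01/1995", "01/02/1995"],
    "RegionName": ["London", "London", "Wales", "Wales"],
    "AveragePrice": [100.0, 101.0, 50.0, 51.0],
})
raw["Date"] = pd.to_datetime(raw["Date"], dayfirst=True)

# Two-level index: monthly period first, region name second.
indexed = raw.set_index([pd.DatetimeIndex(raw["Date"]).to_period("M"),
                         "RegionName"])

# Selecting one region drops that level and leaves a monthly PeriodIndex.
london = indexed.xs("London", level=1)
print(london["AveragePrice"])
```

A monthly `PeriodIndex` gives the series a well-defined frequency, which is convenient when the seasonal period is later given as a number of months.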
In [29]:
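# Select the London rows from the (month, region) index
ll=dd.xs("London", level=1)
# Keep the months after January 1995; .copy() so that adding a column
# below does not trigger pandas' SettingWithCopyWarning
ll=ll[ll.index> "1995-01-01"].copy()
# Deseasonalize with a 12-month seasonal period
transformer = ConditionalDeseasonalizer(sp=12)
ll["AveragePriceSAPy"] = transformer.fit_transform(ll["AveragePrice"])
# Compare the raw series, the AveragePriceSA column already in the file,
# and the sktime result
ll.hvplot.line(y=["AveragePrice", "AveragePriceSA",
                    "AveragePriceSAPy"])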
Out[29]:
[Plot: line chart of AveragePrice, AveragePriceSA and AveragePriceSAPy for London]