A series can be defined as a number of events, objects, or people of a similar or related kind coming one after another; if we add the dimension of time, we get a time series. A time series can be defined as a series of data points in time order.
In this article, we will look at what a time series is and why it is essential to forecasting.
The importance of time series
What importance, if any, does time series have and how will it be relevant in the future? These are just a couple of fundamental questions that any user should find answers to before delving further into the subject. Let’s try to answer this by posing a question.
Have you heard the terms big data, artificial intelligence (AI), and machine learning (ML)?
These three terms make learning time series analysis relevant. Big data is primarily about a large amount of data that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interaction. AI is a kind of technology that is being developed by data scientists, computational experts, and others to enable processes to become more intelligent, while ML is an enabler that is helping to implement AI.
All three of these terms are interlinked through the data they use, and a lot of this data is time series in nature. This could be financial transaction data, the behavior patterns of individuals during various parts of the day, or data related to life events that we might experience. An effective mechanism that enables us to capture the data, store it, analyze it, and then build algorithms to predict transactions, behavior, and life events will depend on how big data is utilized and how AI and ML are leveraged.
A common perception in the industry is that time series data is used for forecasting only. In practice, time series data is used for:
- Pattern recognition
- Forecasting
- Benchmarking
- Evaluating the influence of a single factor on the time series
- Quality control
For example, a retailer may identify a pattern in clothing sales every time it gets a celebrity endorsement, or an analyst may decide to use car sales volume data from 2012 to 2017 to set a selling benchmark in units. An analyst might also build a model to quantify the effect of Lehman’s crash at the height of the 2008 financial crisis in pushing up the price of gold.
Variance in the success of treatments across time periods can also be used to highlight a problem, the tracking of which may enable a hospital to take remedial measures. These are just some of the examples that show how time series analysis isn’t limited to forecasting. In this article, we will review how the financial industry and others use forecasting, and look at the characteristics of time series data and its associated problems.
Forecasting across industries
Since one of the primary uses of time series data is forecasting, it is worth learning about some of its fundamental properties. To understand what the industry means by forecasting and the steps involved, let’s examine a common misconception about the financial industry: that only lending activities require forecasting.
We need forecasting in order to grant personal loans, mortgages, and overdrafts, or simply to assess someone’s eligibility for a credit card, as the industry uses forecasting to assess a borrower’s affordability and their willingness to repay the debt. Even deposit products such as savings accounts, fixed-term savings, and bonds are priced based on forecasts. How we forecast, and the rationale for the methodology, differs between borrowing and lending cases, however.
All of these areas are related to time series, as we inevitably end up using time series data as part of the overall analysis that drives financial decisions. Let’s understand the forecasts involved here a bit better. When we are assessing an individual’s lending needs and limits, we are forecasting for a single person yet comparing the individual to a pool of good and bad customers who have been offered similar products. We are also assessing the individual’s financial circumstances and behavior through industry-available scoring models or by assessing their past behavior, with the financial provider assessing the lending criteria.
In the case of deposit products, as long as the customer is eligible to transact (can open an account and has passed know your customer (KYC), anti-money laundering (AML), and other checks), financial institutions don’t perform forecasting at an individual level. However, the behavior of a particular customer is primarily driven by the interest rate offered by the financial institution. The interest rate, in turn, is driven by the forecasts the financial institution has made to assess its overall treasury position. The treasury is the department that manages the bank’s money centrally and is responsible for ensuring that all departments are funded. That funding is generated by lending and by attracting deposits at a lower rate than the rate at which the bank lends.
The treasury forecasts its requirements for lending and deposits, while various teams within the treasury adhere to those limits. Therefore, a pricing manager for a deposit product will price the product in such a way that the product will attract enough deposits to meet the forecasted targets shared by the treasury; the pricing manager also has to ensure that those targets aren’t overshot by a significant margin, as the treasury only expects to manage a forecasted target.
In both lending and deposit decisions, financial institutions tend to use forecasting. A lot of these forecasts are interlinked, as we saw in the example of the treasury’s expectations and the subsequent pricing decision for a deposit product. To decide on its future lending and borrowing positions, the treasury must have used time series data to determine the potential business appetite for lending and borrowing in the market, and would have assessed that against the current cash flow situation within the relevant teams and institutions.
Characteristics of time series data
Any time series analysis has to take into account the following factors:
- Seasonality
- Trend
- Outliers and rare events
- Disruptions and step changes
Seasonality
Seasonality is a phenomenon that occurs each calendar year. The same behavior can be observed each year. A good forecasting model will be able to incorporate the effect of seasonality in its forecasts. Christmas is a great example of seasonality, where retailers have come to expect higher sales over the festive period.
Seasonality can extend into months but is usually only observed over days or weeks. When looking at time series where the periodicity is hours, you may find a seasonality effect for certain hours of the day. Some of the reasons for seasonality include holidays, climate, and changes in social habits. For example, travel companies usually run far fewer services on Christmas Day, citing a lack of demand. During most holidays people love to travel, but this lack of demand on Christmas Day could be attributed to social habits, where people tend to stay at home or have already traveled. Social habit becomes a driving factor in the seasonality of journeys undertaken on Christmas Day.
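One simple way to put a number on a seasonal effect is a seasonal index: the average for each season divided by the overall average. The following is a minimal sketch under stated assumptions, not a full seasonal adjustment method. It ignores any trend in the data, and the function name `seasonal_indices` and the quarterly figures are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def seasonal_indices(observations):
    """Return {season: average for that season / overall average}.

    `observations` is a list of (season_label, value) pairs,
    for example ("Q4", 120). An index above 1 means the season
    runs above the overall average; below 1 means it runs under.
    """
    by_season = defaultdict(list)
    for season, value in observations:
        by_season[season].append(value)
    overall = mean(value for _, value in observations)
    return {season: mean(values) / overall
            for season, values in by_season.items()}


# Made-up quarterly sales for two years (illustrative only).
sales = [
    ("Q1", 100), ("Q2", 100), ("Q3", 100), ("Q4", 120),
    ("Q1", 110), ("Q2", 110), ("Q3", 110), ("Q4", 132),
]
indices = seasonal_indices(sales)
for season in sorted(indices):
    print(season, round(indices[season], 3))
```

The same idea works for hourly or daily data by using the hour of the day or the day of the week as the season label.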
It’s easier for the forecaster when a particular seasonal event occurs on a fixed calendar date each year; the issue comes when some popular holidays depend on lunar movements, such as Easter, Diwali, and Eid. These holidays may occur in different weeks or months over the years, which will shift the seasonality effect. Also, if some holidays fall closer to other holiday periods, it may lead to individuals taking extended holidays and travel sales may increase more than expected in such years.
The coffee shop near the office may also experience lower sales for a longer period. Changes in the weather can also impact seasonality; for example, a longer, warmer summer may be welcome in the UK, but it would hurt retail sales in the autumn, as most shoppers wouldn’t need to buy a new wardrobe. In hotter countries, sales of air conditioners would increase substantially compared to the usual seasonality of the summer months. Forecasters could offset this unpredictability in seasonality by building in a weather forecast variable.
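To see how much a holiday tied to lunar movements can shift, consider Easter. The sketch below uses the well-known anonymous Gregorian algorithm to compute the date of Western Easter Sunday for a few years; the function name `easter_sunday` is a choice made here for illustration. Diwali and Eid follow different calendars and are not covered by this calculation.

```python
from datetime import date


def easter_sunday(year):
    """Date of Western Easter Sunday (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


# The holiday moves between March and April from year to year,
# so the seasonal peak around it moves as well.
for year in (2019, 2024, 2025):
    print(year, easter_sunday(year).isoformat())
```

A forecaster could use dates like these to build a holiday flag for each period rather than assuming the effect lands in the same week every year.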
Seasonality shouldn’t be confused with a cyclic effect. A cyclic effect is observed over a longer period of generally two years or more. The property sector is often associated with having a cyclic effect, where it has long periods of growth or slowdown before the cycle continues.
Trend
A trend is the long-term direction of observed behavior, found by plotting data against a time component. A trend may indicate an increase or decrease in behavior. Trends may not even be linear, but a broad movement can be identified by analyzing plotted data.
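One common way to expose the broad movement in noisy data is to smooth it with a moving average. The following is a minimal sketch, assuming equally spaced observations and an odd window length; the function name `centered_moving_average` and the sales figures are hypothetical.

```python
def centered_moving_average(values, window):
    """Smooth `values` with a centered moving average of odd length `window`.

    The result is shorter than the input because the first and last
    `window // 2` points do not have enough neighbours on both sides.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(half, len(values) - half):
        chunk = values[i - half:i + half + 1]
        smoothed.append(sum(chunk) / window)
    return smoothed


# Made-up sales that wobble from period to period but drift upward.
sales = [10, 14, 11, 15, 13, 18, 15, 20, 17]
trend = centered_moving_average(sales, window=3)
print([round(t, 2) for t in trend])
```

A wider window gives a smoother line at the cost of losing more points at each end of the series.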
Outliers and rare events
Outliers and rare events are terms that businesses often use interchangeably. They can have a big impact on data, and some sort of outlier treatment is usually applied to data before it is used for modeling. It is almost impossible to predict an outlier or rare event, but they do affect a trend. An example of an outlier could be a customer walking into a branch to deposit an amount that is 100 times the daily average of that branch. In this case, the forecaster wouldn’t expect that trend to continue.
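As one possible form of outlier treatment, the sketch below flags values using a modified z-score based on the median and the median absolute deviation (MAD), which is less distorted by the outlier itself than a mean-based rule. This is a swapped-in technique for illustration, not a method prescribed by the text; the 3.5 cutoff is a commonly used convention, and the deposit figures are made up.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return a list of booleans, True where a value looks like an outlier.

    Uses the modified z-score 0.6745 * (x - median) / MAD.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread to scale against: flag anything that differs
        # from the median at all.
        return [v != med for v in values]
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Made-up daily deposits at a branch, with one deposit roughly
# 100 times the usual daily level.
deposits = [9800, 10200, 10050, 9900, 10100, 1_000_000, 9950]
flags = flag_outliers(deposits)
print(flags)
```

Flagged values can then be reviewed, capped, or excluded before the data is used for modeling.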
Disruptions and step changes
Disruptions and step changes are becoming more common in time series data. One reason for this is the abundance of available data and the growing ability to store and analyze it. Disruptions could include instances when a business hasn’t been able to trade as normal. Flooding at the local pub may lead to reduced sales for a few days, for example.
While analyzing daily sales across a pub chain, an analyst may have to make note of a disruptive event and its impact on the chain’s revenue. Step changes are also more common now due to technological shifts, mergers and acquisitions, and business process re-engineering. When two companies announce a merger, they often try to sync their data. They might have been selling x and y quantities individually, but after the merger will expect to sell x + y + c (where c is the positive or negative effect of the merger).
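The x + y + c relationship can be turned around to estimate c once post-merger sales are known. The sketch below does this with simple averages, assuming comparable periods before and after the merger and no other changes in between; the function name `merger_effect` and all figures are hypothetical.

```python
from statistics import mean


def merger_effect(sales_x, sales_y, sales_combined):
    """Estimate c in: combined = x + y + c, using period averages."""
    return mean(sales_combined) - (mean(sales_x) + mean(sales_y))


# Made-up quarterly unit sales before the merger...
company_x = [400, 410, 390]
company_y = [250, 240, 260]
# ...and for the merged company afterwards.
merged = [690, 700, 710]

c = merger_effect(company_x, company_y, merged)
print(c)  # positive: the merged company sells more than x + y
```

A negative result would indicate that the merged company sells less than the two companies did separately.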
Over time, when someone plots sales data in this case, they will probably spot a step change in sales that happened around the time of the merger, as shown in the following screenshot:
In the trend graph, we can see that online travel bookings are increasing. In the step change and disruptions chart, we can see that Q1 of 2012 saw a substantial increase in bookings, whereas Q1 of 2014 saw a substantial dip. The increase was due to the merger of two companies that took place in Q1 of 2012. The decrease in Q1 of 2014 was attributed to prolonged snow storms in Europe and the ash cloud disruption from volcanic activity over Iceland. While online bookings kept increasing after the step change, the disruption caused by the snow storms and ash cloud only affected sales in Q1 of 2014.
In this case, the modeler will have to treat the merger and the disruption differently when using them in the forecast, as the disruption could be regarded as an outlier and treated accordingly. Also note that the seasonality chart shows that Q4 of each year sees almost a 20% increase in travel bookings, and this pattern continues each calendar year.
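A minimal sketch of that different treatment follows, assuming the quarters of the merger and the disruption are already known. The disrupted quarter is replaced with the average of its neighbours, while the merger is kept in the data and described by a 0/1 step indicator that a model could use as an input. The helper names and booking figures are hypothetical, and the made-up data ignores seasonality to keep the example short.

```python
def patch_disruption(values, index):
    """Return a copy with values[index] replaced by its neighbours' average."""
    if not 0 < index < len(values) - 1:
        raise ValueError("index must have a neighbour on each side")
    patched = list(values)
    patched[index] = (values[index - 1] + values[index + 1]) / 2
    return patched


def step_indicator(length, start):
    """Return a 0/1 series that switches to 1 from position `start` onward."""
    return [1 if i >= start else 0 for i in range(length)]


# Made-up quarterly bookings from Q1 2011 to Q2 2014.
# Position 4 is Q1 2012 (merger); position 12 is Q1 2014 (disruption).
bookings = [100, 102, 104, 106, 150, 152, 154, 156,
            158, 160, 162, 164, 120, 168]

cleaned = patch_disruption(bookings, index=12)          # one-off dip smoothed out
merger_step = step_indicator(len(bookings), start=4)    # lasting level shift kept

print(cleaned[12])
print(merger_step)
```

The design choice here is that a temporary disruption is removed from the history, whereas a step change is permanent and therefore stays in the data with a marker for when it began.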
In this article, we defined time series and learned why it is important for forecasting. We also looked at the characteristics of time series data.
To learn more about how to leverage the analytical power of SAS to perform financial analysis efficiently, you can check out the book SAS for Finance.