The analysis of time series with non-consecutive data
Publisher
University of Pretoria
Abstract
In many fields of research and application, such as engineering, atmospheric sciences, electricity metering and load forecasting, time-based data is recorded. The data is collected for various reasons, such as determining the trend present in an atmospheric variable, modelling the electricity profiles of customers, estimating missing observations in billing data, and forecasting. In many situations the series collected has intervals of missing observations, typically as a result of disturbances such as system faults, misplaced data tapes, power failures and operator error. Due to the nature of the recorded data, an observation recorded at time $t_n$ depends to some degree on all previously recorded observations, i.e. those at $t_{n-1}, t_{n-2}, t_{n-3}, \ldots$ Time series analysis is the most appropriate method for describing such a dataset, in which the observations are correlated over time. The most important requirement of standard techniques and existing packages is that the observations are complete and equally spaced over time, i.e. consecutive. This is rarely the case, and before the series can be modelled, the missing observations have to be filled in. Several estimation methods, such as back- and forecasting, spline functions, autoregression and Kriging (Basson, 1991), can be used to estimate the missing data. Although these methods exist and have sound scientific bases, it was often found that the people involved in estimating missing data, modelling and forecasting do not know of their existence or do not make use of them. The reasons given are that the methodologies are too complicated, that the estimation packages are difficult to use, and that often more than one package has to be used to produce a simple forecast.
In many situations, these practitioners revert to "primitive" methods to estimate the missing data, such as connecting the last available point before a missing interval and the first available point after it by a straight line, and reading the corresponding values off the line. Such an approach ignores the correlation structure and periodicity of the series, so a situation where the researcher can model the time series without having to fill in the data would be more suitable. This dissertation investigates the approach of modelling a non-consecutive time series as recorded, gaps included, and using the fitted model to estimate the missing data or to determine forecasts. Due to the nature of the data collected, the model used to represent the data consists of the following components: (a) a long-term trend, (b) a time series component, and (c) a periodic component, giving the complete estimated model $Y_t = Y_{\text{trend}} + Y_{\text{ARMA}} + Y_{\text{ampl. trend}}\, Y_{\text{period}}$. The long-term trend describes the behaviour of the series over the whole monitoring period, and can be represented by a linear or a non-linear model. The time series component incorporates the correlation between observations into the model. The seasonalities and the amplitude component of the periodic trend are described by the periodic part of the model. When the resulting residuals show no visible patterns, it is assumed that the most appropriate model has been fitted. A menu-driven program, T-SERIES, was written to incorporate all these components into the modelling of a set of non-consecutive observations, and will be available on request.
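The idea of fitting a model directly to the available observations, rather than filling the gaps first, can be illustrated with a minimal Python sketch. It is not the T-SERIES program and does not reproduce the dissertation's method: it assumes a linear trend, a single harmonic with a known period and a constant amplitude, and it leaves out the ARMA component entirely. The function names, the synthetic series and the gap positions are all hypothetical, chosen only for illustration. The sketch fits the trend and periodic terms by ordinary least squares using only the observed time points, then compares the model-based estimates of the missing values with the straight-line approach described above.

```python
import numpy as np


def design_matrix(t, period):
    """Columns: intercept, linear trend, and one sine/cosine harmonic."""
    w = 2.0 * np.pi * t / period
    return np.column_stack([np.ones_like(t), t, np.sin(w), np.cos(w)])


def fit_trend_and_period(t_obs, y_obs, period):
    """Least-squares fit using only the time points that were observed."""
    coef, *_ = np.linalg.lstsq(design_matrix(t_obs, period), y_obs, rcond=None)
    return coef


def predict(coef, t, period):
    """Evaluate the fitted trend-plus-harmonic model at arbitrary times."""
    return design_matrix(t, period) @ coef


# Synthetic series (assumed for illustration): trend + one harmonic + noise.
rng = np.random.default_rng(0)
period = 24.0
t_all = np.arange(200, dtype=float)
y_true = 5.0 + 0.02 * t_all + 2.0 * np.sin(2.0 * np.pi * t_all / period)
y_all = y_true + rng.normal(scale=0.1, size=t_all.size)

# Two intervals of missing observations make the series non-consecutive.
missing = np.zeros(t_all.size, dtype=bool)
missing[60:90] = True
missing[140:150] = True
t_obs, y_obs = t_all[~missing], y_all[~missing]

# Fit on the observed points only; no gap filling is needed beforehand.
coef = fit_trend_and_period(t_obs, y_obs, period)

# Estimate the missing values from the model and from a straight line
# joining the points on either side of each gap.
y_model = predict(coef, t_all[missing], period)
y_line = np.interp(t_all[missing], t_obs, y_obs)

rmse_model = np.sqrt(np.mean((y_model - y_true[missing]) ** 2))
rmse_line = np.sqrt(np.mean((y_line - y_true[missing]) ** 2))
print(f"RMSE, fitted model:  {rmse_model:.3f}")
print(f"RMSE, straight line: {rmse_line:.3f}")
```

Because least squares places no requirement on the spacing of the time points, the gaps need no special handling in this simplified setting. Adding a correlated-error (ARMA) term to a series with gaps requires more care, for example a state-space formulation, and is beyond this sketch.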
Description
Dissertation (MSc (Mathematical Statistics))--University of Pretoria, 1994.
Keywords
UCTD, analysis of time series, non-consecutive data