Abstract:
Fault detection and diagnosis present a major challenge within the petrochemical industry. The annual economic impact of unexpected shutdowns is estimated to be $20 billion. Assistive technologies can help with the effective detection and classification of the faults causing these shutdowns. Cluster analysis is a form of unsupervised learning that groups data with similar properties. The clustering algorithms used included hard-partitioning algorithms (K-means and K-medoid) and fuzzy algorithms (Fuzzy C-means, Gustafson-Kessel and Gath-Geva). A novel approach to the clustering of time-series data is proposed. It exploits the time dependency of variables (time delays) within a process engineering environment. Before clustering, process lags are identified via signal cross-correlations. From these lags, a least-squares optimal time shift is calculated for each signal. Dimensionality reduction techniques are used to visualise the data. Various nonlinear dimensionality reduction techniques have been proposed in recent years. These techniques have been shown to outperform their linear counterparts on various artificial data sets, including the Swiss roll and helix data sets, but have not been widely implemented in a process engineering environment. The techniques used included linear PCA and the standard Sammon and fuzzy Sammon mappings. On a synthetic data set, time shifting resulted in better clustering accuracy than traditional clustering techniques, as measured by quantitative criteria (including Partition Coefficient, Classification Entropy, Partition Index, Separation Index, Dunn’s Index and Alternative Dunn Index). However, for the Tennessee Eastman process, clustering the time-shifted data gave poorer results than clustering the non-shifted data.
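As an illustration of the lag-identification step described in the abstract, the following minimal sketch estimates the delay between two signals as the lag that maximises the absolute cross-correlation, and then shifts the signals into alignment. It is not the implementation used in this work: the function names (standardise, cross_correlation, estimate_lag, align), the search window max_lag and the synthetic test signals are all assumptions made for illustration, and the least-squares combination of pairwise lags into one shift per signal is not shown.

```python
import random


def standardise(signal):
    """Scale a signal to zero mean and unit variance (assumes it is not constant)."""
    n = len(signal)
    mean = sum(signal) / n
    std = (sum((v - mean) ** 2 for v in signal) / n) ** 0.5
    return [(v - mean) / std for v in signal]


def cross_correlation(x, y, lag):
    """Mean product of x[t] and y[t + lag] over the overlapping samples."""
    n = len(x)
    if lag >= 0:
        pairs = zip(x[:n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[:n + lag])
    return sum(a * b for a, b in pairs) / (n - abs(lag))


def estimate_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] with the largest absolute cross-correlation.

    A positive result means that y lags behind x by that many samples.
    """
    xs, ys = standardise(x), standardise(y)
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: abs(cross_correlation(xs, ys, k)))


def align(x, y, lag):
    """Trim both signals so that x[t] is paired with y[t + lag]."""
    n = len(x)
    if lag >= 0:
        return x[:n - lag], y[lag:]
    return x[-lag:], y[:n + lag]


# Hypothetical example: y is a noisy copy of x delayed by 7 samples.
random.seed(0)
true_delay = 7
n_samples = 500
source = [random.gauss(0, 1) for _ in range(n_samples + true_delay)]
x = source[true_delay:]                                  # upstream signal
y = [source[t] + 0.1 * random.gauss(0, 1)                # y[t] = x[t - 7] + noise
     for t in range(n_samples)]

estimated = estimate_lag(x, y, max_lag=20)
x_aligned, y_aligned = align(x, y, estimated)
print("estimated lag:", estimated)
```

The aligned signals would then be passed to the clustering algorithm in place of the raw signals. Using the absolute value of the correlation is a design choice that allows inversely related variables to be aligned as well.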