CARA : convolutional autoencoders for the detection of radio anomalies

Brand, Kevin; Grobler, Trienko L.; Kleynhans, Waldo

dc.contributor.author	Brand, Kevin
dc.contributor.author	Grobler, Trienko L.
dc.contributor.author	Kleynhans, Waldo
dc.date.accessioned	2025-03-28T08:24:43Z
dc.date.available	2025-03-28T08:24:43Z
dc.date.issued	2025-02
dc.description	DATA AVAILABILITY : The FRGADB data set and the corresponding FIRST FITS cutouts that were used for our work are publicly available at https://doi.org/10.5281/zenodo.13773680. A GitHub repository containing the code for the experiments is also publicly available at https://github.com/KBrand26/CARA.	en_US
dc.description.abstract	With the advent of modern radio interferometers, a significant influx in data is expected. This influx will render the manual inspection of samples infeasible and thus necessitates the development of automated approaches to find radio sources with anomalous morphologies. In this paper, we investigate the use of autoencoders for anomalous source detection, based on the assumption that autoencoders will reconstruct anomalies poorly. Specifically, we compare an autoencoder architecture from the literature to two other autoencoder architectures, as well as to four conventional machine learning models. Our results showed that the reconstruction errors of these autoencoders were generally more informative with respect to identifying anomalies than machine learning models were when trained on PCA components. Furthermore, we found that the use of a memory unit in our autoencoders resulted in the best performance, as it further restricted the ability of autoencoders to generalize to anomalous sources. Whilst investigating the use of different reconstruction error metrics as anomaly scores, we determined that they were more informative when combined than they were in isolation. Thus, applying the machine learning models to the combined anomaly scores from the autoencoders resulted in the best overall performance. Particularly, random forests and XGBoost models were the most effective, with isolation forests also being competitive when using a small number of labelled anomalies to tune their hyperparameters. Such isolation forests are also more likely to generalize to unseen classes of anomalies than supervised models such as random forests and XGBoost.	en_US
dc.description.department	Electrical, Electronic and Computer Engineering	en_US
dc.description.librarian	hj2024	en_US
dc.description.sdg	SDG-09: Industry, innovation and infrastructure	en_US
dc.description.uri	https://academic.oup.com/rasti	en_US
dc.identifier.citation	Brand, K., Grobler, T.L. & Kleynhans, W. 2025, 'CARA : convolutional autoencoders for the detection of radio anomalies', RAS Techniques and Instruments, vol. 4, art. rzaf005, doi : 10.1093/rasti/rzaf005.	en_US
dc.identifier.issn	2752-8200 (online)
dc.identifier.other	10.1093/rasti/rzaf005
dc.identifier.uri	http://hdl.handle.net/2263/101783
dc.language.iso	en	en_US
dc.publisher	Oxford University Press	en_US
dc.rights	© 2025 The Author(s). Published by Oxford University Press on behalf of Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/).	en_US
dc.subject	Machine learning	en_US
dc.subject	Data methods	en_US
dc.subject	Anomaly detection	en_US
dc.subject	Radio continuum: galaxies	en_US
dc.subject	Autoencoders	en_US
dc.subject	Decision tree ensembles	en_US
dc.subject	SDG-09: Industry, innovation and infrastructure	en_US
dc.title	CARA : convolutional autoencoders for the detection of radio anomalies	en_US
dc.type	Article	en_US
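
The abstract describes using several reconstruction error metrics as anomaly scores and notes that they are more informative combined than in isolation. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' implementation: the function names, the toy 2x2 "images", the hand-written "reconstructions" and the choice of metrics (MSE, MAE, maximum absolute error) are all assumptions for illustration. The paper feeds the combined scores to machine learning models; here a plain sum of standardized scores stands in for that step.

```python
from statistics import mean, pstdev


def reconstruction_scores(image, reconstruction):
    """Return (MSE, MAE, max abs error) between two equally sized 2-D images."""
    diffs = [
        abs(p - q)
        for row_img, row_rec in zip(image, reconstruction)
        for p, q in zip(row_img, row_rec)
    ]
    return (mean(d * d for d in diffs), mean(diffs), max(diffs))


def combined_rank_score(score_rows):
    """Standardize each metric across samples and sum them per sample.

    Hypothetical stand-in for the learned models mentioned in the abstract:
    a higher value means a worse reconstruction across the metrics.
    """
    columns = list(zip(*score_rows))
    # Guard against a zero standard deviation when a metric is constant.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [
        sum((value - mu) / sigma for value, (mu, sigma) in zip(row, stats))
        for row in score_rows
    ]


# Toy data (assumed): one input pattern and three hand-written reconstructions.
original = [[0.0, 1.0], [1.0, 0.0]]
reconstructions = [
    [[0.0, 1.0], [1.0, 0.0]],  # perfect reconstruction
    [[0.1, 0.9], [1.0, 0.0]],  # close reconstruction
    [[1.0, 0.0], [0.0, 1.0]],  # poor reconstruction, treated as anomalous
]

score_rows = [reconstruction_scores(original, rec) for rec in reconstructions]
combined = combined_rank_score(score_rows)
most_anomalous = combined.index(max(combined))
print("per-sample scores:", score_rows)
print("most anomalous sample index:", most_anomalous)
```

In practice the reconstructions would come from a trained autoencoder, and the per-sample score vectors would be passed to a model such as an isolation forest or random forest rather than summed.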