Improving short text classification through global augmentation methods

dc.contributor.author Marivate, Vukosi
dc.contributor.author Sefara, Tshephisho
dc.date.accessioned 2020-10-28T05:18:12Z
dc.date.available 2020-10-28T05:18:12Z
dc.date.issued 2020-08
dc.description.abstract We study the effect of different approaches to text augmentation. To do this, we use three datasets that include social media and formal text in the form of news articles. Our goal is to provide insights for practitioners and researchers on making choices for augmentation for classification use cases. We observe that Word2Vec-based augmentation is a viable option when one does not have access to a formal synonym model (like WordNet-based augmentation). The use of mixup further improves the performance of all text-based augmentations and reduces the effects of overfitting on a tested deep learning model. Round-trip translation with a translation service proves to be harder to use due to cost and as such is less accessible for both normal and low-resource use cases. en_ZA
dc.description.department Computer Science en_ZA
dc.description.librarian hj2020 en_ZA
dc.description.uri http://link.springer.com/bookseries/558 en_ZA
dc.identifier.citation Marivate V., Sefara T. (2020) Improving Short Text Classification Through Global Augmentation Methods. In: Holzinger A., Kieseberg P., Tjoa A., Weippl E. (eds) Machine Learning and Knowledge Extraction. CD-MAKE 2020. Lecture Notes in Computer Science, vol 12279. Springer, Cham. https://doi.org/10.1007/978-3-030-57321-8_21. en_ZA
dc.identifier.issn 0302-9743 (print)
dc.identifier.issn 1611-3349 (online)
dc.identifier.other 10.1007/978-3-030-57321-8_21
dc.identifier.uri http://hdl.handle.net/2263/76628
dc.language.iso en en_ZA
dc.publisher Springer en_ZA
dc.rights © IFIP International Federation for Information Processing 2019. The original publication is available at: http://link.springer.com/bookseries/558. en_ZA
dc.subject Natural language processing (NLP) en_ZA
dc.subject Data augmentation en_ZA
dc.subject Text classification en_ZA
dc.subject Deep neural network (DNN) en_ZA
dc.title Improving short text classification through global augmentation methods en_ZA
dc.type Postprint Article en_ZA

