Post-authorship attribution using regularized deep neural network

dc.contributor.author: Modupe, Abiodun
dc.contributor.author: Celik, Turgay
dc.contributor.author: Marivate, Vukosi
dc.contributor.author: Olugbara, Oludayo O.
dc.contributor.email: vukosi.marivate@up.ac.za [en_US]
dc.date.accessioned: 2023-04-24T07:47:39Z
dc.date.available: 2023-04-24T07:47:39Z
dc.date.issued: 2022-07-26
dc.description.abstract: Post-authorship attribution is a scientific process of using stylometric features to identify the genuine writer of an online text snippet such as an email, blog, forum post, or chat log. It has useful applications in manifold domains, for instance, in a verification process to proactively detect misogynistic, misandrist, xenophobic, and abusive posts on the internet or social networks. The process assumes that texts can be characterized by sequences of words that agglutinate the functional and content lyrics of a writer. However, defining an appropriate characterization of text to capture the unique writing style of an author is a complex endeavor in the discipline of computational linguistics. Moreover, posts are typically short texts with obfuscating vocabularies that might impact the accuracy of authorship attribution. The vocabularies include idioms, onomatopoeias, homophones, phonemes, synonyms, acronyms, anaphora, and polysemy. The method of the regularized deep neural network (RDNN) is introduced in this paper to circumvent the intrinsic challenges of post-authorship attribution. It is based on a convolutional neural network, a bidirectional long short-term memory encoder, and a distributed highway network. The convolutional neural network was used to extract lexical stylometric features that are fed into the bidirectional encoder to extract a syntactic feature-vector representation. The feature vector was then supplied as input to the distributed highway network for regularization to minimize the network-generalization error. The regularized feature vector was ultimately passed to the bidirectional decoder to learn the writing style of an author. The feature-classification layer consists of a fully connected network and a softmax function to make the prediction. The RDNN method was tested against thirteen state-of-the-art methods using four benchmark experimental datasets to validate its performance. Experimental results demonstrated that the method outperformed the existing state-of-the-art methods on three datasets while producing comparable results on one dataset. [en_US]
dc.description.department: Computer Science [en_US]
dc.description.librarian: am2023 [en_US]
dc.description.sponsorship: The Department of Science and Technology (DST) and the Council for Scientific and Industrial Research (CSIR). [en_US]
dc.description.uri: https://www.mdpi.com/journal/applsci [en_US]
dc.identifier.citation: Modupe, A.; Celik, T.; Marivate, V.; Olugbara, O.O. Post-Authorship Attribution Using Regularized Deep Neural Network. Applied Sciences 2022, 12, 7518. https://DOI.org/10.3390/app12157518. [en_US]
dc.identifier.issn: 2076-3417
dc.identifier.other: 10.3390/app12157518
dc.identifier.uri: http://hdl.handle.net/2263/90426
dc.language.iso: en [en_US]
dc.publisher: MDPI [en_US]
dc.rights: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. [en_US]
dc.subject: Authorship attribution [en_US]
dc.subject: Character embedding [en_US]
dc.subject: Bidirectional decoder [en_US]
dc.subject: Bidirectional encoder [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Neural network [en_US]
dc.subject: Social media [en_US]
dc.subject: Regularized deep neural network (RDNN) [en_US]
dc.title: Post-authorship attribution using regularized deep neural network [en_US]
dc.type: Article [en_US]
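
Note on the abstract: it names a highway network as the regularizing stage and a softmax function as the final prediction step, but this record gives no equations for either. The sketch below is only an illustration of the commonly used highway-layer form, y = T(x) * H(x) + (1 - T(x)) * x, followed by a softmax. The function names, the choice of tanh for H, and the layer sizes are assumptions made for the example and are not taken from the article.

```python
import math


def sigmoid(z):
    """Logistic function used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    H is a tanh transform and T a sigmoid gate (both assumed here).
    Input and output have the same length.
    """
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    n = 3
    zeros = [[0.0] * n for _ in range(n)]
    x = [0.5, -1.0, 2.0]
    # A strongly negative gate bias closes the gate, so the input is carried through.
    carried = highway_layer(x, zeros, [0.0] * n, zeros, [-20.0] * n)
    print(carried)
    # Softmax turns the layer output into a probability vector.
    print(softmax(carried))
```

When the gate is closed the layer passes its input through almost unchanged, and when it is open the output follows the transform H. That gating behaviour is why highway layers are often described as easing the training of deep stacks.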

Files

Original bundle

Name: Modupe_PostAuthorship_2022.pdf
Size: 4.99 MB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 1.75 KB
Format: Item-specific license agreed upon to submission
Description: