Text-line extraction from handwritten document images using GAN

dc.contributor.author: Kundu, Soumyadeep
dc.contributor.author: Paul, Sayantan
dc.contributor.author: Bera, Suman Kumar
dc.contributor.author: Abraham, Ajith
dc.contributor.author: Sarkar, Ram
dc.date.accessioned: 2019-09-16T11:48:40Z
dc.date.issued: 2020-02
dc.description.abstract: Text-line extraction (TLE) from unconstrained handwritten document images is still considered an open research problem. A literature survey reveals that rule-based methods are commonplace in this regard, but these methods mostly fail when document images contain touching and/or multi-skewed text lines, overlapping words/characters, or non-uniform inter-line spacing. To counter this problem, in this paper we use a deep learning-based method. In doing so, we apply, for the first time in the literature, Generative Adversarial Networks (GANs), treating TLE as an image-to-image translation task. We use a U-Net architecture for the generator and a PatchGAN architecture for the discriminator, with different combinations of loss functions, namely GAN loss, L1 loss, and L2 loss. Evaluation is done on two datasets: the handwritten Chinese text dataset HIT-MW and the ICDAR 2013 Handwritten Segmentation Contest dataset. After exhaustive experimentation, it has been observed that the U-Net architecture with the combination of the said three losses not only produces impressive results but also outperforms some state-of-the-art methods.
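The abstract describes a combined generator objective: an adversarial (GAN) loss plus L1 and L2 reconstruction terms. The sketch below illustrates that combination in NumPy. It is an assumption-laden illustration, not the paper's implementation: the non-saturating GAN formulation and the `lam_l1`/`lam_l2` weights are illustrative choices, as the abstract does not state them.

```python
import numpy as np

def combined_generator_loss(disc_fake, target, generated,
                            lam_l1=100.0, lam_l2=100.0):
    """Sketch of a combined generator objective: adversarial (GAN) loss
    plus weighted L1 and L2 reconstruction terms.

    disc_fake : discriminator scores in (0, 1] for generated patches
    target    : ground-truth text-line segmentation map
    generated : generator output
    lam_l1, lam_l2 : illustrative weights (assumptions, not the paper's values)
    """
    eps = 1e-12
    # Non-saturating adversarial term: the generator is rewarded when the
    # discriminator scores its output close to 1 ("real").
    gan_loss = -np.mean(np.log(disc_fake + eps))
    l1_loss = np.mean(np.abs(target - generated))   # L1 reconstruction
    l2_loss = np.mean((target - generated) ** 2)    # L2 reconstruction
    return gan_loss + lam_l1 * l1_loss + lam_l2 * l2_loss
```

If the generator reproduces the target exactly and the discriminator is fully fooled, the combined loss approaches zero; any reconstruction error or drop in the discriminator score increases it.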
dc.description.department: Computer Science
dc.description.embargo: 2021-02-01
dc.description.librarian: hj2019
dc.description.sponsorship: Partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and the co-author Ram Sarkar is partially funded by DST grant (EMR/2016/007213).
dc.description.uri: http://www.elsevier.com/locate/eswa
dc.identifier.citation: Kundu, S., Paul, S., Bera, S.K. et al. 2020, 'Text-line extraction from handwritten document images using GAN', Expert Systems with Applications, vol. 140, art. 112916, pp. 1-12.
dc.identifier.issn: 0957-4174 (print)
dc.identifier.issn: 1873-6793 (online)
dc.identifier.other: 10.1016/j.eswa.2019.112916
dc.identifier.uri: http://hdl.handle.net/2263/71362
dc.language.iso: en
dc.publisher: Elsevier
dc.rights: © 2019 Elsevier Ltd. All rights reserved. Notice: this is the author's version of a work that was accepted for publication in Expert Systems with Applications. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. A definitive version was subsequently published in Expert Systems with Applications, vol. 140, art. 112916, pp. 1-12, 2020. doi: 10.1016/j.eswa.2019.112916.
dc.subject: Text-line extraction (TLE)
dc.subject: Generative adversarial network (GAN)
dc.subject: Deep learning
dc.subject: Handwritten documents
dc.subject: HIT-MW dataset
dc.subject: ICDAR dataset
dc.title: Text-line extraction from handwritten document images using GAN
dc.type: Postprint Article

Files

Original bundle

Name: Kundu_TextLine_2020.pdf
Size: 2.32 MB
Format: Adobe Portable Document Format
Description: Postprint Article

License bundle

Name: license.txt
Size: 1.75 KB
Format: Item-specific license agreed upon to submission
Description: