The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary

dc.contributor.author De Schryver, Gilles-Maurice
dc.contributor.author Wolfer, Sascha
dc.contributor.author Lew, Robert
dc.date.accessioned 2020-05-07T13:52:22Z
dc.date.available 2020-05-07T13:52:22Z
dc.date.issued 2019-11
dc.description.abstract In an earlier publication it was claimed that there is no useful relationship between Swahili-English dictionary look-up frequencies and the occurrence frequencies for the same wordforms in Swahili-English corpora, at least not beyond the top few thousand wordforms. This result was challenged using data for German by a different team of researchers using an improved methodology. In the present article the original Swahili-English data is revisited, using ten years’ worth of it rather than just two, and using the improved methodology. We conclude that there is indeed a positive relationship. In addition, we show that online dictionary look-up behaviour is remarkably similar across languages, even when, as in our case, one is dealing with languages from very dissimilar language families. Furthermore, online dictionaries turn out to have minimum look-up success rates, below which they simply cannot go. These minima are language-sensitive and vary depending on the regularity of the searched-for entries, but are otherwise constant no matter the size of randomly sampled dictionaries. Corpus-informed sampling always improves on any random method. Lastly, from the point of view of the graphical user interface, we argue that the average user of an online bilingual dictionary is better served with a single search box, rather than separate search boxes for each dictionary side. en_ZA
dc.description.department African Languages en_ZA
dc.description.librarian am2020 en_ZA
dc.description.uri http://ejournal.ukm.my/gema en_ZA
dc.identifier.citation De Schryver, G.-M., Wolfer, S. & Lew, R. 2019, 'The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary', Gema Online Journal of Language Studies, vol. 19, no. 4, pp. 1-27. en_ZA
dc.identifier.issn 1675-8021 (print)
dc.identifier.issn 2550-2131 (online)
dc.identifier.other 10.17576/gema-2019-1904-01
dc.identifier.uri http://hdl.handle.net/2263/74512
dc.language.iso en en_ZA
dc.publisher UKM Press en_ZA
dc.rights Creative Commons Attribution 4.0 International (CC BY 4.0) license. en_ZA
dc.subject Lexicography en_ZA
dc.subject Online dictionaries en_ZA
dc.subject Log files en_ZA
dc.subject Corpus frequencies en_ZA
dc.subject Swahili en_ZA
dc.subject English en_ZA
dc.subject Language universals en_ZA
dc.subject.other Humanities articles SDG-09
dc.subject.other SDG-09: Industry, innovation and infrastructure
dc.title The relationship between dictionary look-up frequency and corpus frequency revisited : a log-file analysis of a decade of user interaction with a Swahili-English dictionary en_ZA
dc.type Article en_ZA

