Kroeze, J.H. (Jan Hendrik)
2008-06-04
2008-06-04
2008-05-18
Kroeze, JH, Bothma, TJD, & Matthee, MC 2008, 'From tags to topic maps: using marked-up Hebrew text to discover linguistic patterns', Proceedings of the 2008 International Conference on Information Resources Management (Conf-IRM 2008), [http://www.sprott.carleton.ca/conf-irm/CFP2008.pdf]
978-0-473-134455-7
http://hdl.handle.net/2263/5778
The paper discusses a series of related techniques that prepare and transform raw linguistic data for advanced processing in order to unveil hidden grammatical patterns. It identifies XML as a suitable mark-up language to build an exploitable data bank of multi-dimensional data in the Hebrew text of the Old Testament. This concept is illustrated by tagging a transcription of Gen. 1:1-2:3 and manipulating this data bank. Transferring the data into a three-dimensional array allows advanced processing of the data in order to either confirm existing knowledge or to mine for new, yet undiscovered, linguistic features. Visualisation is discussed as a technique that enhances interaction between the human researcher and the computerised technologies supporting this process of knowledge creation. The empirical study is a small experiment that illustrates the viability and usefulness of the proposed expert devices as well as the benefits of applying information system techniques to linguistic databases.
351636 bytes
application/pdf
en
Proceedings of the 2008 International Conference on Information Resources Management (Conf-IRM 2008) Niagara Falls, Ontario, Canada, 18-20 May 2008
Text data mining
Data warehousing
MOLAP
XML
Genesis
Hebrew language -- Data processing
Data mining
Data warehousing
XML (Document markup language)
From tags to topic maps : using marked-up Hebrew text to discover linguistic patterns
Article