dc.contributor.author |
Kroeze, J.H. (Jan Hendrik)
|
|
dc.contributor.upauthor |
Bothma, T.J.D. (Theodorus Jan Daniel)
|
|
dc.contributor.upauthor |
Matthee, Machdel C.
|
|
dc.date.accessioned |
2008-06-04T07:45:26Z |
|
dc.date.available |
2008-06-04T07:45:26Z |
|
dc.date.issued |
2008-05-18 |
|
dc.description.abstract |
The paper discusses a series of related techniques that prepare and transform raw linguistic data
for advanced processing in order to unveil hidden grammatical patterns. It identifies XML as a
suitable mark-up language to build an exploitable data bank of multi-dimensional data in the
Hebrew text of the Old Testament. This concept is illustrated by tagging a transcription of Gen.
1:1-2:3 and manipulating this data bank. Transferring the data into a three-dimensional array
allows advanced processing of the data, either to confirm existing knowledge or to mine
for new, as yet undiscovered, linguistic features. Visualisation is discussed as a technique that
enhances interaction between the human researcher and the computerised technologies
supporting this process of knowledge creation. The empirical study is a small experiment that
illustrates the viability and usefulness of the proposed expert devices as well as the benefits of
applying information system techniques to linguistic databases. |
en |
dc.format.extent |
351636 bytes |
|
dc.format.mimetype |
application/pdf |
|
dc.identifier.citation |
Kroeze, JH, Bothma, TJD, & Matthee, MC 2008, 'From tags to topic maps: using marked-up Hebrew text to discover linguistic patterns', Proceedings of the 2008 International Conference on Information Resources Management (Conf-IRM 2008), [http://www.sprott.carleton.ca/conf-irm/CFP2008.pdf] |
en |
dc.identifier.isbn |
978-0-473-134455-7 |
|
dc.identifier.uri |
http://hdl.handle.net/2263/5778 |
|
dc.language.iso |
en |
en |
dc.publisher |
Proceedings of the 2008 International Conference on Information Resources Management |
en |
dc.rights |
Proceedings of the 2008 International Conference on Information Resources Management (Conf-IRM 2008) Niagara Falls, Ontario, Canada, 18-20 May 2008 |
en |
dc.subject |
Text data mining |
en |
dc.subject |
Data warehousing |
en |
dc.subject |
MOLAP |
en |
dc.subject |
XML |
en |
dc.subject |
Genesis |
en |
dc.subject.lcsh |
Hebrew language -- Data processing |
|
dc.subject.lcsh |
Data mining |
|
dc.subject.lcsh |
Data warehousing |
|
dc.subject.lcsh |
XML (Document markup language) |
en |
dc.title |
From tags to topic maps : using marked-up Hebrew text to discover linguistic patterns |
en |
dc.type |
Article |
en |