Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3

dc.contributor.advisor: Bothma, T.J.D. (Theodorus Jan Daniel) [en]
dc.contributor.coadvisor: Matthee, Machdel C. [en]
dc.contributor.email: jan.kroeze@gmail.com [en]
dc.contributor.postgraduate: Kroeze, J.H. (Jan Hendrik) [en]
dc.date.accessioned: 2013-09-07T07:36:38Z
dc.date.available: 2008-09-08 [en]
dc.date.available: 2013-09-07T07:36:38Z
dc.date.created: 2008-09-02 [en]
dc.date.issued: 2008-09-08 [en]
dc.date.submitted: 2008-07-28 [en]
dc.description: Thesis (PhD (Information Technology))--University of Pretoria, 2008. [en]
dc.description.abstract: The thesis discusses a series of related techniques that prepare and transform raw linguistic data for advanced processing in order to unveil hidden grammatical patterns. A three-dimensional array is identified as a suitable data structure to build a data cube to capture multidimensional linguistic data in a computer's temporary storage facility. It also enables online analytical processing, like slicing, to be executed on this data cube in order to reveal various subsets and presentations of the data. XML is investigated as a suitable mark-up language to permanently store such an exploitable databank of Biblical Hebrew linguistic data. This concept is illustrated by tagging a phonetic transcription of Genesis 1:1-2:3 on various linguistic levels and manipulating this databank. Transferring the data set between an XML file and a three-dimensional array creates a stable environment allowing editing and advanced processing of the data in order to confirm existing knowledge or to mine for new, as yet undiscovered, linguistic features. Two experiments are executed to demonstrate possible text-mining procedures. Finally, visualisation is discussed as a technique that enhances interaction between the human researcher and the computerised technologies supporting the process of knowledge creation. Although the data set is very small, there are exciting indications that the compilation and analysis of aggregate linguistic data may help linguists to perform rigorous research, for example regarding the definitions of semantic functions and the mapping of these functions onto the syntactic module. [en]
dc.description.availability: unrestricted [en]
dc.description.department: Information Science [en]
dc.identifier.citation: 2008 [en]
dc.identifier.other: B23/eo [en]
dc.identifier.upetdurl: http://upetd.up.ac.za/thesis/available/etd-07282008-121520/ [en]
dc.identifier.uri: http://hdl.handle.net/2263/26750
dc.language.iso: en
dc.publisher: University of Pretoria [en_ZA]
dc.rights: ©University of Pretoria 2008 B23/ [en]
dc.subject: Online analytical processing (olap) [en]
dc.subject: Xml [en]
dc.subject: Hebrew bible [en]
dc.subject: Threedimensional array [en]
dc.subject: Visualisation [en]
dc.subject: Computational linguistics [en]
dc.subject: Text data mining [en]
dc.subject: Data warehousing [en]
dc.subject: Database management [en]
dc.subject: Round-tripping [en]
dc.subject: UCTD [en_US]
dc.title: Developing an XML-based, exploitable linguistic database of the Hebrew text of Gen. 1:1-2:3 [en]
dc.type: Thesis [en]
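
The round-tripping idea described in the abstract (a three-dimensional array held in memory, sliced for analysis, and stored permanently as XML) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the axis layout (clause x phrase x linguistic level), the level names, the XML element names, and the placeholder cell values are all assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed names for the third axis of the cube (hypothetical).
LEVELS = ["phonetic", "syntactic", "semantic"]

# Hypothetical 2 x 2 x 3 cube: cube[clause][phrase][level].
# The strings are placeholders, not data from the thesis.
cube = [
    [["P00", "S00", "M00"], ["P01", "S01", "M01"]],
    [["P10", "S10", "M10"], ["P11", "S11", "M11"]],
]


def slice_level(cube, level):
    """Return the 2-D slice (clauses x phrases) for one linguistic level."""
    return [[phrase[level] for phrase in clause] for clause in cube]


def cube_to_xml(cube):
    """Serialise the cube to an XML string (element names are assumed)."""
    root = ET.Element("text")
    for c, clause in enumerate(cube):
        clause_el = ET.SubElement(root, "clause", id=str(c))
        for p, phrase in enumerate(clause):
            phrase_el = ET.SubElement(clause_el, "phrase", id=str(p))
            for name, value in zip(LEVELS, phrase):
                ET.SubElement(phrase_el, name).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_cube(xml_text):
    """Rebuild the 3-D list from the XML produced by cube_to_xml."""
    root = ET.fromstring(xml_text)
    return [
        [
            [phrase.findtext(name) or "" for name in LEVELS]
            for phrase in clause.findall("phrase")
        ]
        for clause in root.findall("clause")
    ]


if __name__ == "__main__":
    print(slice_level(cube, LEVELS.index("syntactic")))
    print(xml_to_cube(cube_to_xml(cube)) == cube)
```

Slicing on one level yields a clause-by-phrase table for that level alone, and converting to XML and back returns the same nested list, which is the property the abstract calls round-tripping.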

Files

Original bundle

Now showing 1 - 5 of 12
Name: 00front.pdf; Size: 71.54 KB; Format: Adobe Portable Document Format
Name: 01chapter1.pdf; Size: 200.45 KB; Format: Adobe Portable Document Format
Name: 02chapter2.pdf; Size: 733.64 KB; Format: Adobe Portable Document Format
Name: 03chapter3.pdf; Size: 889.23 KB; Format: Adobe Portable Document Format
Name: 04chapter4.pdf; Size: 260.95 KB; Format: Adobe Portable Document Format