Towards cache optimization in finite automata implementations

Towards cache optimization in finite automata implementations

Files

00front.pdf (70.13 KB)

01part1.pdf (151.14 KB)

02part2.pdf (452.72 KB)

03part3.pdf (177.21 KB)

04part4.pdf (81.22 KB)

Date

2007-08-01

Authors

Ketcha Ngassam, Ernest

Publisher

University of Pretoria

Abstract

To the best of our knowledge, the only available implementations of FA-based string recognizers are the so-called conventional table-driven algorithm and, of course, its hardcoded counterpart suggested by Thompson, Penello, and DeRemer in 1967, 1986, and 2004 respectively. However, our early experiments have shown that the performance of both implementations is hampered by the random access nature of the automaton’s transition table in the case of table-driven, and also the random access nature of the directly executable instructions that make up each hardcoded state. Moreover, the problem of memory load and instruction load are also performance bottlenecks of these algorithms, since, as the automaton size grows, more space in memory is required to hold data/instructions relevant to the states. This thesis exploits the notion of cache optimization (that requires good data or instructions organization) in investigating various enhancements of both table-driven and hardcoding. Functions have been used to formally define the denotational semantics of string recognizers. These functions rely on various so-called strategy variables that are integrated into the formal definition of each recognizer. By appropriately selecting these variables, the conventional algorithms may be described, without loss of generality. By specializing these strategy variables, the new and enhanced recognizers can be denotationally described, and resulting algorithms can then be implemented. We first introduce the so-called Dynamic State Allocation (DSA) strategy regarded as a sort of Just-In-time (JIT) implementation of FA-based string recognizers whereby a predefined portion of the memory is reserved for acceptance testing. Then follows the State pre-Ordering (SpO) strategy that assumes some prior knowledge on the order in which states would be visited. In this case, acceptance testing takes place once each state have been allocated to its new position in memory. The last strategy referred to as the Allocated Virtual Caching (AVC) strategy is based on the premise that a portion of the memory originally occupied by the automaton’s states is virtually used as a sort of cache memory in which acceptance testing takes place, enabling therefore, the exploitation of the various performance enhancement notions on which hardware cache memory relies. It is shown that the algorithms can be classified in a taxonomy tree which is further mapped into a class-diagram that represents the design of a toolkit for FA-based string recognition. Also given in the thesis are empirical results that indicate that the algorithms suggested can, in general, outperform their conventional counterparts when recognizing large and appropriately chosen input strings.

Description

Thesis (PhD (Computer Science))--University of Pretoria, 2007.

Keywords

Software construction, Toolkit, Taxonomy, Algorithmic, Finite automata, UCTD

Citation

Ketcha Ngassam, E 2007, Towards cache optimization in finite automata implementations, PhD Thesis, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/26467>

URI

http://hdl.handle.net/2263/26467

Collections

Theses and Dissertations (University of Pretoria)
Theses and Dissertations (Computer Science)

Full item page

Towards cache optimization in finite automata implementations

Files

Date

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

Sustainable Development Goals

Citation

URI

Collections