Abstract:
The advent of the fourth industrial revolution and the need for connectedness have
increased both data availability and quality. This data surge can also be seen in the
transport and mobility industry. Devices ranging from onboard global positioning system
interfaces to vehicle trackers and wearable technology for passengers and drivers provide
access to more data, an untapped source of valuable information and insights for many
stakeholders. Topic modelling is traditionally used to structure and interpret text data from
a large corpus of documents. In this paper, topic modelling is applied to bus route data
collected over several months by the onboard Global Positioning Systems (GPSs) of buses
travelling in Gauteng and the North West province, in order to identify patterns in the
routes. Because topic modelling operates on text, the bus route coordinates had to be
converted into a form readable by the algorithm. The project is ongoing, but analyses thus far show that the
most important terms per topic correspond to key nodes in city centres and points of
interest where routes overlap. This information may be used in city planning to optimise
the system of bus routes, terminals, and nodes. Organisations may also use this
information for business development and job creation.
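
The abstract does not specify how the coordinates were converted into a text-like form. As one possible illustration, the sketch below assumes that each GPS fix is mapped to a grid-cell token, so that a route reads as a "document" made of "words". The function names, the 0.01-degree cell size, and the sample coordinates are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def coord_to_token(lat, lon, cell_deg=0.01):
    """Map one GPS fix to a grid-cell 'word'.

    Assumption: a regular latitude/longitude grid with cells of
    cell_deg degrees; the study may have used a different scheme.
    """
    row = math.floor(lat / cell_deg)
    col = math.floor(lon / cell_deg)
    return f"c{row}_{col}"


def route_to_document(fixes, cell_deg=0.01):
    """Turn a sequence of (lat, lon) fixes into a list of tokens.

    Consecutive repeats are collapsed so that a bus idling in one
    cell does not inflate that cell's count (a design choice made
    for this sketch only).
    """
    tokens = []
    for lat, lon in fixes:
        token = coord_to_token(lat, lon, cell_deg)
        if not tokens or tokens[-1] != token:
            tokens.append(token)
    return tokens


def bag_of_words(document):
    """Count token occurrences, the usual input to a topic model."""
    return Counter(document)


# Hypothetical fixes: two near central Johannesburg, one near Pretoria.
sample_route = [(-26.2041, 28.0473), (-26.2045, 28.0478), (-25.7479, 28.2293)]
sample_doc = route_to_document(sample_route)
sample_counts = bag_of_words(sample_doc)
```

Under these assumptions, the per-route token counts could then be passed to a standard topic model, and the highest-weighted tokens in each topic would map back to grid cells on the ground.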