A Semi-Automatic Data-Scraping Method for the Public Transport Domain
Date
2019-07-31
Publisher
IEEE Computer Society
Abstract
The growing amount of data on the Internet has made it essential to process these data in order to generate new services aimed at improving people's daily living conditions. Transport data are of the utmost importance: every day, people have to travel to perform daily tasks such as going to work, studying, and shopping, and the number of journeys by public transport therefore grows daily. People with special needs make a large number of these trips, but they do not have sufficient information about the accessibility of the routes they want to take. Although numerous websites and applications provide information on public transport services, most do not provide detailed information on the accessibility of the routes. We are, therefore, developing a technological framework for the processing, management, and exploitation of open data to promote accessibility to urban public transport, as part of the Access@City project. This paper focuses specifically on extracting and processing the information on public transport and its accessibility that already exists on the web, in order to generate an open data repository in which to store this information. To that end, we propose a method for the semi-automatic generation of a data scraper for the public transport domain. This method allows public transport data and the existing accessibility information to be extracted from a selected website. We have additionally developed a web tool that employs this method to generate a data scraper for the public transport domain.
Citation
IEEE Access, vol. 7(1), pp. 105627-105637, December 2019
Except where otherwise noted, this item's license is described as Attribution 4.0 International