The last five years have seen a growing interest in the concept of big data. Transport is no exception, and there is a real sense of anticipation about how new data sources could help deliver more efficient and reliable systems, better informed passengers and new products and services. Whilst there has been significant progress already, are these high expectations justified? And what can transport authorities do to help realise this vision?
Figure 1. Number of Google searches for ‘big data’
Big data is commonly defined by the 3 Vs of volume, variety and velocity.
In a sense, transport planners have been doing ‘big data’ for decades, at least since the computer made it possible to develop complex models to simulate the operation of transport networks across large urban areas.
Building these models has traditionally required large-scale, expensive and time-consuming data collection exercises to describe the behaviour of individual households and the characteristics of transport networks.
So the underlying data certainly had volume and variety, if not quite velocity.
From around 2000, the development of real-time information began adding ‘velocity’ to the mix. This went hand in hand with a growing interest in journey planning information and the development of more sophisticated traffic control systems.
So what’s happened in the past five years to justify all the new excitement around data?
The key to understanding this lies in the way a combination of relatively recent technologies has changed how information is collected and the type of information that is available. Advances in mobile computing, telecoms, remote sensing and cloud computing mean that large volumes of digital information on individual movements and preferences are now being collected passively at very low cost.
Some of this information is generated by transport authorities (smart ticketing, traffic sensors, GPS tracking of public transport vehicles, journey planners, real-time information systems). But an increasing proportion relies on private devices, as well as infrastructure and software owned by third parties. Google Maps and Strava are two examples.
Figure 2. Google Maps – Birmingham
Figure 3. Strava app cycle ride density
New data sources can offer a cheaper alternative to traditional transport surveys. In some cases, they can actually give a richer picture of individual behaviour, the current performance of the transport system and even insights into how things are likely to change a short time into the future. In the right hands, it is easy to see how this can help transport users, decision makers and society at large.
But along with opportunities come some challenges.
One is that new data requires new tools and skills. Early experiences with Bluetooth data illustrate the problem well. Inferring directionality, determining sample rates, identifying unique devices and converting the number of detected devices to the number of vehicles are all new problems that require innovative solutions.
The highly analytical transport community is well placed to tackle the challenge but this will require an open mind, a degree of risk taking and some investment in staff development, at a time when transport authorities are facing severe financial constraints.
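To make the Bluetooth problems mentioned above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the detection records, the two sensor names "A" and "B", the function names and the sample rate are hypothetical, and a real deployment would need far more careful handling of noise, repeat visits and privacy.

```python
from collections import defaultdict

# Hypothetical detections: (anonymised device id, sensor name, time in seconds).
detections = [
    ("d1", "A", 100), ("d1", "A", 102), ("d1", "B", 160),
    ("d2", "B", 200), ("d2", "A", 270),
    ("d3", "A", 300),
]

def first_seen(records):
    """Collapse repeated pings into the earliest time each device was seen at each sensor."""
    seen = defaultdict(dict)
    for device, sensor, t in records:
        if sensor not in seen[device] or t < seen[device][sensor]:
            seen[device][sensor] = t
    return seen

def directional_counts(records, first="A", second="B"):
    """Infer direction of travel from the order in which a device passed two sensors."""
    forward, backward = f"{first}->{second}", f"{second}->{first}"
    counts = {forward: 0, backward: 0}
    for times in first_seen(records).values():
        if first in times and second in times:
            if times[first] < times[second]:
                counts[forward] += 1
            elif times[second] < times[first]:
                counts[backward] += 1
    return counts

def estimate_vehicles(device_count, sample_rate):
    """Scale a device count by an assumed share of vehicles carrying a detectable device."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    return device_count / sample_rate

unique_devices = len(first_seen(detections))
counts = directional_counts(detections)
# Assumed sample rate of 25%; in practice this has to be calibrated against manual counts.
vehicles_a_to_b = estimate_vehicles(counts["A->B"], sample_rate=0.25)
```

Even this toy version shows where the judgement calls sit: the sample rate is a guess until it is checked against another source, and a device seen at only one sensor (like "d3") tells us nothing about direction.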
Another challenge is that new data requires new ways of working with an expanding community of data users and providers, which includes transport authorities, telecoms companies, academia, traditional transport consultants, analytics specialists, hardware providers, transport operators and a growing ecosystem of independent app developers and tech start-ups.
This is beginning to throw up all sorts of questions around open data, data ownership, data integration, data quality, privacy, intellectual property, commercial confidentiality, profit v not-for-profit models and public v private ownership.
A fundamental question for transport authorities is what role they should seek to take: data creator, data integrator, commissioner, seed funder, entrepreneur or honest broker? Needless to say, this is an evolving debate.
Over the coming year, we will be engaging with the challenges and opportunities created by emerging data sources, starting with a workshop hosted at the Future Cities Catapult in mid-May. Exciting times ahead, so expect more posts on this topic.