Data exploration on the most accessible Singapore hotels by public transport — Part 2

Cliff Chew
5 min read · Feb 28, 2024


In this post, I share my attempt to work around the missing public transport routes from the OneMap Routing API, so that I can calculate a public transport accessibility metric for Singapore hotels. Previously, I shared how I identified the most accessible Singapore hotels by their linear and vehicle distances to 15 selected tourist attractions. I wanted to use public transport distances as an accessibility measure too, as I feel tourists are more likely to use public transport to travel around Singapore. However, missing data from the OneMap Routing API in my first attempt stopped me from completing that measure.

In that first attempt, I used the OneMap Routing API to request 6,435 public transport routes (429 hotels × 15 attractions). Of these, 581 (9%) came back with the response “404 NOT FOUND : Unable to get route”.

Fig 1 — The unexpected 404 error.
Fig 2 — I replaced the “404” missing routes with “0” to make them easier to find
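
The counts above can be sanity-checked with a few lines of Python. This is only a minimal sketch, not the script used for the analysis: the `routes` records and the `distance_m` field are hypothetical names, and the `0` sentinel simply mirrors the replacement shown in Fig 2.

```python
# Hypothetical record layout: one dict per hotel-attraction request.
# distance_m == 0 marks a route that came back as "404 NOT FOUND".

def missing_share(routes):
    """Return (missing_count, share_of_total) for routes flagged with 0."""
    missing = sum(1 for r in routes if r["distance_m"] == 0)
    return missing, missing / len(routes)

# Rebuild the headline numbers: 429 hotels x 15 attractions, 581 failures.
# The 1200 m value is a made-up placeholder for "a route was found".
total_routes = 429 * 15
routes = [{"distance_m": 0}] * 581 + [{"distance_m": 1200}] * (total_routes - 581)

missing, share = missing_share(routes)
print(total_routes, missing, f"{share:.1%}")  # 6435 581 9.0%
```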

After letting the problem sit for a while, I came up with this workaround:

  1. Get all existing train station locations using the OneMap Geocoding API.
  2. Link hotels to their nearest train station, either by public transport (priority) or foot.
  3. Link attractions to their nearest train station, either by public transport (priority) or foot.
  4. Link the stations by rail (public transport) to connect the hotels and the attractions.
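
The stitching idea in the four steps above can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the function name, the dictionary layouts and all distances are hypothetical, and the real legs would come from OneMap responses.

```python
from itertools import product

def stitched_distance(hotel_legs, attraction_legs, rail_legs):
    """Shortest hotel -> station -> station -> attraction distance in metres.

    hotel_legs / attraction_legs: {station_name: distance_m} for candidate stations.
    rail_legs: {(from_station, to_station): distance_m} for station pairs.
    Returns None when no combination of legs can be completed.
    """
    best = None
    for (s_h, d_h), (s_a, d_a) in product(hotel_legs.items(), attraction_legs.items()):
        # Same station on both sides needs no rail leg at all.
        rail = 0 if s_h == s_a else rail_legs.get((s_h, s_a))
        if rail is None:
            continue  # this station pair has no known route
        total = d_h + rail + d_a
        if best is None or total < best:
            best = total
    return best

# Made-up distances, purely for illustration.
hotel_legs = {"Bugis": 400, "Lavender": 900}
attraction_legs = {"Bayfront": 300, "Promenade": 700}
rail_legs = {("Bugis", "Promenade"): 1500, ("Lavender", "Bayfront"): 4000}

print(stitched_distance(hotel_legs, attraction_legs, rail_legs))  # 2600
print(stitched_distance(hotel_legs, attraction_legs, {}))         # None
```

Returning `None` rather than `0` for an unstitchable route keeps missing data from being mistaken for a very short distance.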

I hoped that breaking hotel-attraction routes into separate components and stitching them together would fix the issue. While this approach may not give the shortest public transport route, I felt it was a reasonable compromise: train stations are generally easier to spot and navigate than public buses, which commonly makes the train network the preferred mode of public transport for tourists.

Step 1 — Hotels to nearest train stations

I was able to easily link most Singapore hotels to their three nearest train stations. I chose three to give the route stitching more flexibility, in case certain attractions are better reached via a particular train line from a particular station. At this point, an interesting, if obvious, pattern surfaced from the data: Sentosa hotels are not very accessible via public transport!

Fig 3 — All Sentosa hotels having 0 (missing) public distance. The OneMap API couldn’t find any public transport or walking route for these hotels
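
Picking the three nearest stations for a hotel could look something like the sketch below. The function name and the sample distances are hypothetical, and I am assuming that a distance of `0` (or `None`) means no route was found, as in Fig 3.

```python
import heapq

def nearest_stations(distances, k=3):
    """Return the k stations with the smallest usable distance.

    distances: {station_name: distance_m}; 0 or None means no route was found.
    """
    usable = {s: d for s, d in distances.items() if d}
    return heapq.nsmallest(k, usable.items(), key=lambda item: item[1])

# Made-up distances for a mainland hotel and a hotel with no routes at all.
mainland = {"Bugis": 450, "Lavender": 900, "Rochor": 600,
            "Nicoll Highway": 0, "Jalan Besar": 750}
no_routes = {"HarbourFront": 0, "Telok Blangah": 0}

print(nearest_stations(mainland))   # [('Bugis', 450), ('Rochor', 600), ('Jalan Besar', 750)]
print(nearest_stations(no_routes))  # []
```

A hotel where every lookup failed ends up with an empty list, which is the situation the Sentosa hotels are in.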

The missing Sentosa routes are expected, as Sentosa is an offshore island separate from mainland Singapore. Travelling to Sentosa hotels requires changing to a monorail, cable car or shuttle bus, all of which are less convenient than commuting within mainland Singapore. I am not saying to stay away from Sentosa hotels, but if good accessibility to tourist attractions in mainland Singapore is your priority, don’t choose hotels in Sentosa!

Step 2 — Attractions to nearest train stations

I was also able to link the 15 attractions to their top three nearest train stations.

Fig 4 — Successfully matching attractions to their nearest train stations

Station to Station connections

With the hotel-MRT and attraction-MRT links done, the last part was to link up the train stations! To keep this matching efficient, I only matched the station pairs needed to fill the missing routes. Unfortunately, there were still a number of train stations that could not be linked by public transport.

Fig 5 – 93 Train station-pairs could not be linked
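
Limiting the station matching to the missing routes can be sketched like this. The function and variable names are hypothetical, as are the hotel, attraction and station assignments; the point is only that duplicate station pairs collapse into a single query.

```python
def station_pairs_to_query(missing_routes, hotel_stations, attraction_stations):
    """Collect the unique (hotel_station, attraction_station) pairs that are
    needed to stitch the missing hotel-attraction routes."""
    pairs = set()
    for hotel, attraction in missing_routes:
        for s_h in hotel_stations.get(hotel, []):
            for s_a in attraction_stations.get(attraction, []):
                if s_h != s_a:  # same station needs no rail leg
                    pairs.add((s_h, s_a))
    return pairs

# Made-up example: two missing routes that share a hotel-side station.
missing_routes = [("Hotel A", "Attraction X"), ("Hotel B", "Attraction X")]
hotel_stations = {"Hotel A": ["Bugis", "Rochor"], "Hotel B": ["Bugis"]}
attraction_stations = {"Attraction X": ["Bayfront", "Bugis"]}

pairs = station_pairs_to_query(missing_routes, hotel_stations, attraction_stations)
print(sorted(pairs))
# [('Bugis', 'Bayfront'), ('Rochor', 'Bayfront'), ('Rochor', 'Bugis')]
```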

The missing station-to-station routes feel a bit ironic, as I had assumed that connections between train stations would be easily matched through our train network. In hindsight, that was an unrealistic assumption. While I am not very familiar with how routing APIs are provisioned, there is an effectively unlimited number of point pairs on the Earth’s surface that may need to be linked by routes, and it might be too difficult for the OneMap team to specifically prioritise routing between train stations through the train network.

Unfortunately, because of these missing station-to-station routes, I am back at my initial problem with calculating the public transport accessibility measure, and I need to think about my next steps on this project.

Conclusion

My stitching attempt using the OneMap Routing API failed, as I couldn’t complete my public transport distance measure. But I don’t see it as a failure, because I learned a few interesting lessons. I wanted to share my analysis and also highlight the thought process that goes into a seemingly simple real-world analytics project, where unforeseen issues with data availability and data quality may arise and cause roadblocks. When that happens, the analyst has to figure out what additional resources they might need, what issues and workarounds are possible, and ultimately what value the analytics project can still provide, all while fitting the project’s timelines and budgets. Dealing with such problems requires experience and domain knowledge to make informed and strategic bets on the project’s next steps.

Nonetheless, I am glad to have gotten my vehicle distance measure using the OneMap Routing API. If the OneMap team is reading this, I am happy to link up to understand if I am using the OneMap Routing API wrongly. And if the “404” error is an expected response for the routing API, please add it into your official documentation.

I am thinking about how I can “fix” my public transport accessibility measure. I could consider paid routing APIs like Google Maps Routing API or GraphHopper Routing API, which should provide much better data. Both APIs also have a free tier that should be sufficient for my use case. I am also considering the OpenStreetMap (OSM) routing API, but so far, it seems like the OSM routing API does not have public transport data either. I want to see how much I can do with open-source tools and datasets before going down the paid route (pun intended). Open to suggestions if anyone has any!

Thanks to everyone who has read this post. If you are interested in analytics side projects with a social science spin, follow me on Medium or LinkedIn. Some topics I have explored include (1) Singapore housing prices and (2) Taiwan housing prices. I even (3) built a small web app for Singapore residents to track the library books that they want to borrow. I also share less technical stuff, like (4) how I learned to deal with uncertainty and (5) how I ended up being a freelance analytics consultant.

Lastly, I also have a Substack (trying to keep that going), where I share ideas on data concepts and strategies targeted at busy business people.
