Before you begin
This challenge is about processing real-time data from a webhook hosted by the Massachusetts Bay Transportation Authority (MBTA). For context, the MBTA is responsible for boat, train, bus, and commuter rail public transportation in the Greater Boston Area.
You will be improving this web application in collaboration with your interviewer(s). Feel free to ask them any questions about the requirements of the challenge, syntax, concepts, or libraries. Note that we are not going to run this code; we just want to work together and understand your thinking.
The scenario
This web application receives webhooks with status updates about all MBTA vehicles as they move through their routes. These updates are recorded in the database and will eventually be used to analyze where MBTA Commuter Rail trains are speeding; however, nothing but your code is using the data yet. This application was made quickly, without code review. No one has confirmed the quality of this code.
Expected outcomes
You are now the owner of this application. We want you to make any and all changes necessary to improve the existing code and verify that the product requirements are met!
Our main priority is to address the product requirements which will be described later.
Additionally, we want you to update the existing code so that it is readable/maintainable for future developers, and can scale without performance concerns.
For Managers Only
We want our managers to be technical. Please drive this interview as if you were the lead engineer of a team working on this challenge.
Product requirements
- Save a log of speeding incidents to the database. Only log incidents for trains that meet both of the following criteria:
  - The train must be a commuter rail train.
  - The train must be going over 10 miles per hour.
- Know how many times a specific train has been speeding and where.
  - You will not need to implement this function, but your data model must be able to satisfy this requirement.
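The two logging criteria above can be expressed as one small predicate. The following is a minimal sketch, not the existing implementation: the `VehicleUpdate` type and its field names are hypothetical, the speed is assumed to already be in miles per hour, and the route type value `2` for commuter rail is an assumption based on GTFS route types that should be checked against the sample MBTA API response.

```python
from dataclasses import dataclass

# Assumption: the MBTA API follows GTFS route types, where 2 means rail
# (commuter rail). Confirm against samples/mbta-api-route-raw-response.json.
COMMUTER_RAIL_ROUTE_TYPE = 2
SPEED_LIMIT_MPH = 10.0


@dataclass(frozen=True)
class VehicleUpdate:
    """Hypothetical, already-parsed view of one webhook status update."""
    train_id: str
    route_type: int
    speed_mph: float  # assumed to be miles per hour
    latitude: float
    longitude: float


def is_speeding_incident(update: VehicleUpdate) -> bool:
    """Return True only when both product criteria hold."""
    is_commuter_rail = update.route_type == COMMUTER_RAIL_ROUTE_TYPE
    is_over_limit = update.speed_mph > SPEED_LIMIT_MPH  # strictly over 10
    return is_commuter_rail and is_over_limit
```

Keeping the rule in a pure function like this makes the boundary case (exactly 10 MPH is not speeding) easy to discuss and unit test.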
What the current code does
- The web application has an endpoint in the `server.py` file which receives webhooks from the MBTA. An example webhook payload can be found in the `samples/webhook-payload-sample.json` file.
- That payload is then passed to a function defined in the `processor.py` file which contains the bulk of the business logic for this project.
- The code then calls the MBTA API to check if the train is a commuter rail train. An example response from the MBTA API can be found in the `samples/mbta-api-route-raw-response.json` file.
- The application uses two tables:
  - `train_lookup` to count the number of times a train has been seen speeding. The schema for this table is defined in the `samples/train-lookup-table-schema.sql` file.
  - `train_log` to log the speed and location a train has been seen speeding (above 10 MPH). The schema for this table is defined in the `samples/train-log-table-schema.sql` file.
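One possible way to restructure the flow described above is sketched below. It is an illustration under stated assumptions, not the code in `processor.py`: the payload keys `route_id` and `speed_mph`, the function names, and the route type value `2` are all hypothetical. The MBTA API call and the database write are passed in as plain callables so each piece can be tested alone, and route lookups are cached so repeated updates for the same route do not trigger repeated API calls.

```python
from functools import lru_cache
from typing import Callable

COMMUTER_RAIL_ROUTE_TYPE = 2  # assumption: GTFS-style route type for rail
SPEED_LIMIT_MPH = 10.0


def make_cached_route_lookup(
    fetch_route_type: Callable[[str], int],
) -> Callable[[str], int]:
    """Wrap a (hypothetical) MBTA API call so each route is fetched once."""

    @lru_cache(maxsize=1024)
    def lookup(route_id: str) -> int:
        return fetch_route_type(route_id)

    return lookup


def process_update(
    payload: dict,
    lookup_route_type: Callable[[str], int],
    save_incident: Callable[[dict], None],
) -> bool:
    """Save the update if it is a speeding incident; return whether it was saved."""
    route_id = payload.get("route_id")    # hypothetical payload key
    speed_mph = payload.get("speed_mph")  # hypothetical payload key
    if route_id is None or speed_mph is None:
        return False  # malformed update, nothing to log
    if speed_mph <= SPEED_LIMIT_MPH:
        return False  # cheap local check first, so no API call is needed
    if lookup_route_type(route_id) != COMMUTER_RAIL_ROUTE_TYPE:
        return False  # not a commuter rail train
    save_incident(payload)
    return True
```

Checking the speed before calling the lookup means slow vehicles never cost an API request, which matters when the webhook reports every MBTA vehicle.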
This is an MBTA webhook processing challenge focused on identifying commuter rail trains, checking whether they exceed 10 MPH, and persisting only valid speeding incidents. A strong solution should cleanly separate webhook parsing, external API validation, and database writes, while also designing the schema so future queries can report how many times each train sped and where those incidents occurred.
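As one illustration of a data model that can answer "how many times and where", the sketch below stores one row per incident and derives the count with a query, as an alternative to maintaining a separate counter table. Everything here is an assumption for discussion: the table name `speeding_incident`, its columns, and the use of SQLite are hypothetical and are not taken from the sample schema files.

```python
import sqlite3

# Hypothetical schema: one row per speeding incident, indexed by train.
SCHEMA = """
CREATE TABLE IF NOT EXISTS speeding_incident (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    train_id    TEXT NOT NULL,
    speed_mph   REAL NOT NULL,
    latitude    REAL NOT NULL,
    longitude   REAL NOT NULL,
    recorded_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_speeding_incident_train
    ON speeding_incident (train_id);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the incident table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_incident(conn, train_id, speed_mph, latitude, longitude, recorded_at):
    """Insert one incident; the `with` block commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO speeding_incident "
            "(train_id, speed_mph, latitude, longitude, recorded_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (train_id, speed_mph, latitude, longitude, recorded_at),
        )


def incident_count(conn, train_id) -> int:
    """How many times this train has been logged speeding."""
    row = conn.execute(
        "SELECT COUNT(*) FROM speeding_incident WHERE train_id = ?",
        (train_id,),
    ).fetchone()
    return row[0]


def incident_locations(conn, train_id) -> list:
    """Where this train was speeding, oldest incident first."""
    return conn.execute(
        "SELECT latitude, longitude, speed_mph, recorded_at "
        "FROM speeding_incident WHERE train_id = ? ORDER BY recorded_at",
        (train_id,),
    ).fetchall()
```

Deriving the count from the log avoids keeping two tables in sync under concurrent webhooks; if counting ever becomes a bottleneck, a counter table or materialized summary could be added later.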