Comparing reliability on Toronto's subway lines by service alerts


It is interesting to note how differences in a line’s operations, ridership, length or other factors can affect the reliability of the line. I’ve collected service alert counts for all four TTC subway lines between May 2020 and August 2021 (15 months) via very basic Twitter bots that organize service alerts from the TTC by subway line:





[You can follow these if you want. I really appreciate if I can make your commute easier!]

The purpose of these bots is to allow subway riders to receive service alerts specific to the subway lines that they use; the official TTC Service Alert account (@TTCnotices) tweets every service alert. However, I became interested in counting the tweets for each subway line and comparing them to gain some insight into the reliability of the subway lines.

Tweet Count Considerations

First, there are some things to note. @TTCnotices tweets a service alert when an incident occurs, and repeats it if the incident remains unresolved for several minutes. Thus, there may be multiple tweets per incident. Furthermore, @TTCnotices tweets once more when an incident is resolved. This means that the total tweet count does not equal the number of incidents that interrupted service on a particular subway line. However, counting these redundant alerts places a ‘weight’ on serious, extended incidents that cause significant disruption to the service, which are the types of incidents that result in multiple tweet alerts. Overall, the tweet count for an individual line matters less than how the lines compare to each other.

In addition, I removed tweet counts unrelated to service (approximately 250 COVID-safety tweets per line between May 2020 and August 2021, with the exception of Line 4).

However, a small amount of error remains: at most 5 to 10 tweets per line for special events or updates (free-fare days, accessibility updates for stations on the line, etc.), which are categorized in Table 1. At most, this would overstate the tweet count for Line 4 by 2.54%; on average, it overstates the cumulative tweet count for the subway system by 1.03%.

Table 1 – Summary of tweet count error

Distribution by Line

Graph 1 – Distribution of service related tweets by subway line

Nothing unusual in Graph 1: the longest and busiest line unsurprisingly has the greatest proportion of service-related tweets. Line 1 Yonge-University is one of the busiest subway lines in North America; it is also one of the longest, at 38 kilometres. Line 4 Sheppard, which is the shortest, newest and second least-used line on the system, has the smallest share of service alerts. Examining how service alerts relate to distance and patronage gives more insight into the relative reliability of the rapid transit lines.

Relationship with Ridership

Graph 2 – Average number of alerts per daily rider compared between the subway lines

To compare ridership to the number of service alerts per line, the simplest method is to divide the cumulative service alerts by the patronage of each line, as depicted in Graph 2. With 1.4 million trips a day (2018 ridership data) and around 8,000 service-related tweets, the system averaged 0.00575 cumulative alerts per average daily rider.

Comparing lines, Line 1 and Line 2 have very similar average alerts/rider, at 0.00557 and 0.00564 respectively. Both lines were heavily used before the pandemic and continue to be the busiest lines in the system. In 2018, daily ridership was 794,680 on Line 1 and 527,640 on Line 2. Line 1 has 50.6% more riders than Line 2, yet there is only a 1.24% difference in average alerts/daily rider between the two lines.
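The two percentages above can be reproduced in a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes the 1.24% gap is taken relative to the larger of the two rates, which is the convention that reproduces the quoted figure.

```python
# 2018 average daily ridership, as quoted in this post.
RIDERS_LINE_1 = 794_680
RIDERS_LINE_2 = 527_640

# Cumulative alerts per daily rider, as quoted in this post.
RATE_LINE_1 = 0.00557
RATE_LINE_2 = 0.00564


def percent_more(larger, smaller):
    """How much bigger `larger` is than `smaller`, as a percentage of `smaller`."""
    return (larger - smaller) / smaller * 100


def percent_gap(a, b):
    """Gap between two values as a percentage of the larger one (assumed convention)."""
    return abs(a - b) / max(a, b) * 100


ridership_gap = percent_more(RIDERS_LINE_1, RIDERS_LINE_2)
rate_gap = percent_gap(RATE_LINE_1, RATE_LINE_2)

print(f"Line 1 carries {ridership_gap:.1f}% more riders than Line 2")  # 50.6%
print(f"Alerts per daily rider differ by {rate_gap:.2f}%")  # 1.24%
```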

Furthermore, Line 1 has 3.13% fewer service alerts/daily rider than the mean, while Line 2 has 1.91% fewer. It is not unusual that the alerts/rider figures for Line 1 and 2 are close to the system mean, since the two lines together account for 94% of the 1.4 million daily trips made on the Toronto Subway. Interestingly enough, 92% of the cumulative service alerts on the system can be attributed to Line 1 and 2 as well (Graph 1).
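As a quick check on these figures, the ridership share and the distance of each line from the mean can be computed directly from the numbers quoted in this post. The function name below is hypothetical, and the rates are the rounded values given above, so the last digit may shift slightly against the unrounded data.

```python
# 2018 average daily ridership per line, as quoted in this post.
DAILY_RIDERS = {
    "Line 1": 794_680,
    "Line 2": 527_640,
    "Line 3": 35_090,
    "Line 4": 50_150,
}

SYSTEM_MEAN_RATE = 0.00575  # cumulative alerts per daily rider, system-wide


def percent_below(value, mean):
    """How far `value` sits below `mean`, as a percentage of `mean`."""
    return (mean - value) / mean * 100


total_riders = sum(DAILY_RIDERS.values())
busy_share = (DAILY_RIDERS["Line 1"] + DAILY_RIDERS["Line 2"]) / total_riders * 100

print(f"Total daily riders: {total_riders:,}")  # 1,407,560
print(f"Line 1 + 2 share of ridership: {busy_share:.0f}%")  # 94%
print(f"Line 1 below mean: {percent_below(0.00557, SYSTEM_MEAN_RATE):.2f}%")  # 3.13%
print(f"Line 2 below mean: {percent_below(0.00564, SYSTEM_MEAN_RATE):.2f}%")  # 1.91%
```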

Line 3 and Line 4 account for only 6% of the total ridership on the system (85,240 daily riders in 2018), and 8% of the total service alerts as in Graph 1. In 2018, daily ridership on Line 3 and 4 was 35,090 and 50,150 respectively (Line 4 has 42.9% higher ridership than Line 3). Despite this, Line 3 has 250% more service alerts/daily rider than Line 4. Line 4 sits well below the average alerts/daily rider and Line 3 well above it (see Table 2).

Table 2 – Average alerts/rider compared to the mean alerts/rider

What does this show us? Heavily used lines generate more service alerts in total (and in that sense are less reliable), but on a per-rider basis, lines with high ridership appear to perform similarly. Line 1 and Line 2 have similar alerts/daily rider, although their daily ridership differs by over 250,000. It is fair to assume that if Line 1 and Line 2 had the same ridership, their cumulative service alert counts would be approximately equal.

The lower-ridership lines have a more peculiar relationship with service alerts/daily rider. With fewer riders, you can expect fewer rider-related incidents overall; ultimately, riders themselves are the source of the majority of delays on the system. This assumes consistent maintenance, safety standard enforcement, security and rider behaviour across all subway lines. Hence it is unsurprising that Line 4 has fewer alerts/daily rider than Line 1 and 2. However, Line 3 does not fit the assumptions I’ve listed, for it is a line plagued with more safety issues related to maintenance (something to be discussed later!). The difference is quite evident, with service alerts/daily rider on Line 3 being 147%, 144% and 250% greater than on Line 1, Line 2 and Line 4 respectively!
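To make these percentages concrete: "147% greater" means a multiplier of 2.47, not 1.47. The sketch below converts the quoted percentages into implied per-rider rates for Line 3 and Line 4. These implied rates are derived from the rounded figures in this post rather than read from the underlying data, and the function name is made up for illustration.

```python
def apply_percent_greater(base, percent_greater):
    """Value that is `percent_greater` percent larger than `base`."""
    return base * (1 + percent_greater / 100)


# Line 3's rate, implied separately from the Line 1 and Line 2 figures.
line3_via_line1 = apply_percent_greater(0.00557, 147)
line3_via_line2 = apply_percent_greater(0.00564, 144)

# Line 3 is quoted as 250% greater than Line 4, so divide by 3.5.
line4_implied = line3_via_line1 / (1 + 250 / 100)

print(f"Line 3 (from Line 1): {line3_via_line1:.5f}")  # 0.01376
print(f"Line 3 (from Line 2): {line3_via_line2:.5f}")  # 0.01376
print(f"Line 4 (implied):     {line4_implied:.5f}")  # 0.00393
```

Both routes land on the same Line 3 rate, which suggests the three quoted percentages are consistent with each other.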

Impacts of COVID-19 on ridership

The latest line-specific ridership data available is from 2018, so unfortunately it does not reflect ridership during 2020-2021. However, it can serve as a proportional comparison based on normal-ridership periods. The subway system saw a sharp decline in use due to the COVID pandemic: ridership was at 20% of pre-pandemic levels in January 2021 and had recovered to only 33% by August 2021. We can treat this as a uniform decline across all lines, although this would not reflect variations in commuting behaviour specific to each line. Examples of such variations include the white-collar downtown-bound commuters who heavily influence Line 1 ridership, many of whom are able to work from home, and the mix of lower-income users on Line 3, including many essential workers who depend on the system. Nonetheless, assuming a uniform decline across all lines is all that this data allows, and it would only rescale every bar in the graph by the same factor, so there is no need to draw another graph and repeat a simple analysis.
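A short sketch of why a uniform decline changes nothing in the comparison: dividing every line’s ridership by the same factor multiplies every alerts/rider figure by the same amount, so the ratios between lines are untouched. The 0.33 factor is the August 2021 figure quoted above; applying it equally to every line is the simplifying assumption, and the function name is hypothetical.

```python
import math

# Cumulative alerts per daily rider based on 2018 ridership, as quoted in this post.
RATE_2018_BASIS = {"Line 1": 0.00557, "Line 2": 0.00564}


def rescale(rates, ridership_fraction):
    """Alerts per rider if every line keeps only `ridership_fraction` of its riders."""
    return {line: rate / ridership_fraction for line, rate in rates.items()}


rate_pandemic_basis = rescale(RATE_2018_BASIS, 0.33)

ratio_before = RATE_2018_BASIS["Line 2"] / RATE_2018_BASIS["Line 1"]
ratio_after = rate_pandemic_basis["Line 2"] / rate_pandemic_basis["Line 1"]

# The absolute values change, but the line-to-line ratio does not.
print(math.isclose(ratio_before, ratio_after))  # True
```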

Relation to Distance

Graph 3.1 Average number of service alerts per kilometre compared between each subway line

It is clear that the number of service incidents increases with the length of a rapid transit line. However, much like with ridership, it is not a linear relationship. Once again, Graph 3.1 shows that when measuring alerts per kilometre, longer lines have a higher rate of incidents than shorter lines.

Line 1 and Line 2 have a similar number of service alerts per kilometre, with a negligible 0.43% difference. Line 1 is 38.8 kilometres long, 48% longer than Line 2’s 26.2 kilometres. This suggests that as the length of a line increases, the increase in the average number of incidents per kilometre begins to taper off.

Compared to the system mean of 105.16 service alerts per kilometre, Line 1 and 2 are again similar (8-9% difference). As previously stated, 92% of cumulative service alerts in the past 15 months are attributable to Line 1 and Line 2, while the two lines combined make up 65 kilometres, or 85%, of the total system. Hence, this is an expected value. The data for Line 3 and Line 4 lowers the average. Differences from the average are shown in Table 3.1.

Table 3.1 – Average alerts/kilometre compared to the mean alerts/kilometre

Line 3 and 4 have vastly differing datapoints yet again, owing to the safety and maintenance issues of Line 3. The length of Line 3, at 6.4 kilometres, is only 900 metres longer (or 16.4% longer) than the 5.5-kilometre Line 4. Despite the small difference in total length, Line 3 suffers from 110% more service alerts per kilometre than Line 4.
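The length comparisons in this section can be checked with the line lengths quoted above. The last line of the sketch assumes the 105.16 mean is the system total divided by total track length (rather than an average of the four per-line figures); under that assumption it implies a total close to the "around 8,000" service-related tweets mentioned earlier.

```python
# Line lengths in kilometres, as quoted in this post.
LENGTH_KM = {"Line 1": 38.8, "Line 2": 26.2, "Line 3": 6.4, "Line 4": 5.5}

MEAN_ALERTS_PER_KM = 105.16  # system mean quoted in this post


def percent_longer(longer, shorter):
    """How much longer one line is than another, as a percentage of the shorter."""
    return (longer - shorter) / shorter * 100


total_km = sum(LENGTH_KM.values())
busy_share = (LENGTH_KM["Line 1"] + LENGTH_KM["Line 2"]) / total_km * 100

# Assumption: mean = total alerts / total kilometres.
implied_total_alerts = MEAN_ALERTS_PER_KM * total_km

print(f"System length: {total_km:.1f} km")  # 76.9 km
print(f"Line 1 + 2 share of length: {busy_share:.1f}%")  # 84.5%
print(f"Line 1 vs Line 2: {percent_longer(38.8, 26.2):.0f}% longer")  # 48%
print(f"Line 3 vs Line 4: {percent_longer(6.4, 5.5):.1f}% longer")  # 16.4%
print(f"Implied total alerts: {implied_total_alerts:.0f}")  # 8087
```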

The conclusion here is that after a certain length, the increase in the number of service alerts per unit distance begins to taper and approach a constant value (see Graph 3.2). Any line longer than this would have approximately the same number of service alerts per unit distance. At that point, the length of the transit service has a negligible effect on service alerts/unit distance. The same cannot be said for shorter lines, as Line 3 remains an anomaly; however, shorter lines still have fewer alerts per unit distance. This is a result of fewer trains that can potentially be involved in an incident, and less trackage and infrastructure prone to faults.

A delay on a shorter line can potentially have less of a residual effect on the line’s overall operations, and can be cleared quickly. However, for longer lines, any sort of incident resulting in a delay will cause a much larger ‘ripple effect’ that impacts a large number of trains. Even after resolution of the incident, delays will still persist for a greater length of time on longer transit lines. These kinds of delays can be managed by breaking up entry/exit points into a line, such as numerous storage tracks which dispatch trains to fill large gaps caused by a ripple effect of delayed trains (this is hypothetical; in most cases the existing system cannot be modified like this). Automatic Train Control (ATC) is also beneficial in improving recovery time from a delay.

Graph 3.2 – Alerts/kilometre vs total distance (in kilometres) on the Toronto Subway

Graph 3.2 depicts the relationship of alerts/kilometre with total length. It is best represented by a natural logarithmic function. This supports the idea that service incidents and general reliability per unit distance are relatively constant for services that cover a longer distance. Unfortunately, the function’s precision is impacted by the anomaly of Line 3. As such, for a better quality function Line 3 should be excluded, which I’ve done in Graph 3.3.

Graph 3.3 – Alerts/kilometre vs total distance (in kilometres) excluding Line 3

By excluding Line 3, the function no longer overstates values for shorter lines. The new function is, however, more aggressively curved, and may not properly reflect the decreasing rate of change of alerts/unit distance as subway line length increases. If we assume the function of best fit to be truly representative of this relationship, Line 2 has more alerts per kilometre than it theoretically should, and Line 1 has fewer alerts per kilometre than the theoretical amount. The deviations of the datapoints are tabulated in Table 3.2, and can be used as a measure of precision between the two functions and between line datapoints and the respective function.
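For readers who want to try a fit like this themselves, here is a minimal sketch of a least-squares fit of y = a·ln(x) + b in plain Python. It is not necessarily how the graphs in this post were produced. The line lengths are the ones quoted above, but the alerts/kilometre values are illustrative placeholders, not the measured data, so the printed coefficients should not be compared with Graph 3.2 or 3.3.

```python
import math


def fit_log(lengths_km, alerts_per_km):
    """Least-squares fit of alerts_per_km = a * ln(length_km) + b."""
    logs = [math.log(x) for x in lengths_km]
    n = len(logs)
    mean_log = sum(logs) / n
    mean_y = sum(alerts_per_km) / n
    sxx = sum((u - mean_log) ** 2 for u in logs)
    sxy = sum((u - mean_log) * (y - mean_y) for u, y in zip(logs, alerts_per_km))
    a = sxy / sxx
    b = mean_y - a * mean_log
    return a, b


def predict(a, b, length_km):
    """Evaluate the fitted curve at a given line length."""
    return a * math.log(length_km) + b


# Lengths are from this post; the alerts/km values are PLACEHOLDERS for illustration.
lengths = {"Line 1": 38.8, "Line 2": 26.2, "Line 3": 6.4, "Line 4": 5.5}
placeholder_alerts_per_km = {"Line 1": 114.0, "Line 2": 113.5, "Line 3": 75.0, "Line 4": 36.0}

all_lines = list(lengths)
without_line3 = [name for name in all_lines if name != "Line 3"]

for label, names in (("all lines", all_lines), ("excluding Line 3", without_line3)):
    a, b = fit_log([lengths[n] for n in names], [placeholder_alerts_per_km[n] for n in names])
    print(f"{label}: y = {a:.2f} * ln(x) + {b:.2f}")
```

With only three or four points, any curve of this kind is heavily influenced by each one, which is why dropping Line 3 changes the shape so much.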

Table 3.2 – Error between theoretical datapoints and actual datapoints

The function that excludes Line 3 (Graph 3.3) has an average error of 6% between datapoints and the function, while the function that includes Line 3 (Graph 3.2) presents an average error of 21% (the ‘percent difference’ is essentially the same as a percent error). Using the function from Graph 3.3, we can estimate the theoretical number of service alerts that would occur on Line 3 if it shared the same technology, maintenance regimen, safety standard and other standards as the rest of the subway system. In this theoretical case, Line 3 would see 44.142 alerts per kilometre in the 15-month timespan. In total, this would be approximately 283 cumulative service alerts in the time period. However, since the function is built from so little data, such an interpolation is of limited use.
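The Line 3 estimate is just the fitted alerts/kilometre multiplied by the line’s length. The sketch below shows that step along with one common way of computing an average percent error; the exact formula behind Table 3.2 is not stated here, so the error function is an assumption, and both function names are made up for illustration.

```python
LINE_3_LENGTH_KM = 6.4  # as quoted in this post
LINE_3_FITTED_ALERTS_PER_KM = 44.142  # value read from the Graph 3.3 function


def percent_error(actual, predicted):
    """Absolute error as a percentage of the predicted value (assumed convention)."""
    return abs(actual - predicted) / predicted * 100


def mean_percent_error(pairs):
    """Average percent error over (actual, predicted) pairs."""
    return sum(percent_error(actual, predicted) for actual, predicted in pairs) / len(pairs)


line3_theoretical_total = LINE_3_FITTED_ALERTS_PER_KM * LINE_3_LENGTH_KM
print(f"Theoretical Line 3 alerts over the period: {line3_theoretical_total:.1f}")  # 282.5
```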


Measuring the number of service alerts on the system can give some insight into the reliability of a system. Specifically, we are able to see how this compares between lines. In addition to the data in this post, the TTC releases a monthly CEO’s report that can measure reliability via on time performance for each subway line.

On Time Performance - 2020

On Time Performance - 2021

Note the definition of On Time Performance by the TTC:

"OTP measures the headway adherence of all service trains at end terminals. Data represents Monday-to-Friday service between 6 a.m. and 2 a.m. To be on time a train must be within 1.5 times of its scheduled headway." - CEO Report December 2020

The reports from 2020 and 2021, along with the data I’ve gathered, confirm the high reliability of Line 4. With few service alerts, and thus few incidents, the line performs excellently, with close to 100% OTP.
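To illustrate the TTC’s on-time rule quoted above, here is a rough sketch of how an OTP percentage could be computed from headways observed at a terminal. It assumes "within 1.5 times of its scheduled headway" means the observed gap is no more than 1.5 times the scheduled gap. The function names and the sample headways are hypothetical, and this is not the TTC’s actual calculation.

```python
def is_on_time(observed_headway_min, scheduled_headway_min, tolerance=1.5):
    """A train counts as on time if its headway is within `tolerance` x the schedule."""
    return observed_headway_min <= tolerance * scheduled_headway_min


def on_time_performance(observed_headways_min, scheduled_headway_min):
    """Percentage of observed trains that met the headway rule."""
    on_time = sum(is_on_time(h, scheduled_headway_min) for h in observed_headways_min)
    return on_time / len(observed_headways_min) * 100


# Hypothetical sample: a 3-minute scheduled headway allows gaps of up to 4.5 minutes.
sample_headways = [3.0, 4.0, 5.0, 2.5]
print(f"OTP: {on_time_performance(sample_headways, 3.0):.1f}%")  # 75.0%
```

Note that a measure like this only looks at gaps at the end terminals, so a long delay mid-line that has been smoothed out by the time trains reach the terminal would not show up, whereas it would still generate service alerts.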

Line 1 and 2 may be similar in reliability on a per-rider or per-unit distance basis; however, the cumulative service alerts present Line 1 as having a greater sum of incidents. Recall that Line 1 accounts for 55% of service alerts on the system; this is reflected in the CEO’s reports, which show 6 months between January 2020 and June 2021 where Line 1 performed with less than 90% on time performance (OTP) with respect to measurements at either terminal. In comparison, Line 2 had only 2 months where OTP measured below 90%. It should also be noted that Line 1 Automatic Train Control (ATC) installation resulted in numerous nightly or weekend closures.

Line 3 had 3 months where OTP was below 90%, actually performing worse than Line 2 in this metric. While this is surprising considering the cumulative service alert count, it is less unexpected considering the sheer number of incidents that occur on Line 3 on a per-rider or per-unit distance basis.

Looking Ahead

To conclude, it is appropriate to look to the future. ATC works on Line 1 will reach completion by next year, and ATC on Line 2 will be complete before 2030. This will improve reliability on both lines. The completion of the Eglinton Crosstown in 2022 will also reduce scheduled closures that typically occur due to sensitive works at Eglinton and Eglinton West (Cedarvale) stations on Line 1. Line 1 will see major improvements in 2022, and fewer service alerts. The closure of Line 3 in 2023 will be the end of this anomaly in the system, to be replaced by a Line 2 Scarborough Subway extension that is currently under construction. It is unfortunate that Line 3 could not be modernized; however, this is a topic for another post.

PSDs at Dangsan Station, Line 2 of the Seoul Metropolitan Subway (Source: Wikimedia)

One of my further recommendations is the installation of platform screen doors to prevent unwanted and possibly tragic interactions with the train. This is terrible for the individuals involved, and also results in more delays than necessary. In addition, PSDs can prevent unwanted debris from accumulating on the tracks, which can potentially be a fire hazard. PSDs are an effective way to improve reliability. Where can these installations be targeted for best value early on? Line 1’s downtown U is a good place to start. Currently, the TTC is not considering PSDs for any existing subway lines.

Another point is to change the role of TTC enforcement from primarily monitoring fare compliance to acting as agents of safety on the system. This means doing away with special constables and fare enforcement officers entirely, in favour of supportive staff. Security incidents can be mitigated with alert and well-trained staff instead; customer safety is more important than fare enforcement, in my opinion.

TTC Fare inspectors at Dundas Station issue a fine (Source: CTV News Toronto)

While the data in this post provides a useful insight into reliability, it should be noted that the number of service alerts is not the best measurement of reliability of any particular transit line. As well, the relationships between distance and service alerts (Graph 3.2, Graph 3.3) do not have sufficient datapoints to provide a reliable function (and neither does any potential graph that relates ridership and service alerts based on this data). The Toronto system comprises only 4 lines, and this makes it difficult to make comparisons and attempt to group similar types of lines. Hence, I intend to write a post on the same topic for the Paris Metro in the future, as RATP already provides line-specific metro alerts through multiple accounts. Another interesting case is the Montreal Metro, where STM also provides this information (perhaps the TTC’s social media team can learn something from RATP and STM!).

Also, expect a future post where I break down and summarize reliability of the Toronto subway based on OTP data from monthly reports. A lot of graphing!


All of the data used in the post can be originally sourced to the TTC itself.