CERTIUM Blog

The standard for ATC communications

Decoding the importance of MOS in Air Traffic Communications

In today’s crowded skies, where aircraft move with calculated precision, the uncelebrated heroes orchestrating this aerial symphony are air traffic controllers. Picture a bustling control room, with its array of screens and headsets, where split-second decisions can mean the difference between smooth skies and potential problems.

Air traffic control (ATC) is the backbone of aviation safety, a realm where voice transmissions are as rapid and pointed as the aircraft they guide. In this dynamic landscape, the clarity and intelligibility of communication are not just desirable but imperative for pilots and controllers alike.

As technology advances, the aviation industry is undergoing a significant transformation, particularly in the realm of communication. The migration from traditional circuit-switched systems to Voice over Internet Protocol (VoIP) has emerged as a pivotal shift, offering advantages in terms of efficiency, functionality and flexibility. Yet, with these benefits come challenges, as the aviation sector grapples with the intricacies of implementing an IP infrastructure capable of maintaining the high standards of reliability essential for safe air travel.

Within this rapidly evolving landscape, the importance of the Mean Opinion Score (MOS) becomes apparent. MOS serves as the critical metric for measuring voice quality and user experience, where a MOS of 5 indicates excellence and a MOS of 1 abysmal quality. It’s a compass guiding the industry through the challenges of VoIP migration and beyond. As voice monitoring takes center stage, the MOS not only gauges the clarity of individual communications but also indicates the overall performance of the air traffic communication system.

Join me on a journey through the skies, where every word transmitted holds the weight of responsibility, and explore how the MOS is shaping the future of voice monitoring in air traffic communications.

The mystery of MOS in Air Traffic Control

Originating from telecommunications (telco), MOS has found a unique adaptation in the neighboring world of ATC communication.

Before we go into the details of MOS in ATC, let’s have a look at the common challenges and changes when switching from circuit-switched time-division multiplexing (TDM) to IP:

  • The basic principle of packet-switching is not well suited for interactive real-time communication.
  • Different services are competing for the same infrastructure.
  • There is more intelligence in end-devices compared to circuit-switched communications.
  • Generally, the service environment is more dynamic, due to variations in network load, routing, need for software updates, etc.

While these challenges are less pronounced in ATC than in the telco domain, it is clear that IP networks make it a lot harder to ensure stable voice quality over time as this diagram illustrates:

A diagram illustrating the continuously changing quality of a VoIP connection due to different IP packet transport conditions over time, compared to the constant conditions of a switched ISDN connection.

But ATC communication also brings systemic differences and challenges. Unlike the telco domain, where audio is transmitted non-stop in both directions during a call, ATC introduces a distinctive rhythm – PTT (push-to-talk), the deliberate act of pushing a button to activate voice transmission. This action, however, comes with a twist; it effectively blocks the channel, allowing voice to flow in only one direction at a time, sculpting an exchange marked by brevity and coded precision. In the ATC world, brevity isn't just a stylistic choice; it's a necessity dictated by the nature of the communication.

While the International Telecommunication Union (ITU) standardizes MOS evaluation for voice snippets of approximately 8 seconds in telephony, ATC confronts the challenge of collecting and collating talk snippets to get to 8 seconds. On top of that, there is the question of how to assess the service performance during idle times, when no one is talking.

Regardless of such technical questions, regulations mandate a MOS better than 4.0, underscoring the critical importance of crisp and intelligible communication in the ATC landscape. However, the urgency and brevity inherent in air traffic control exchanges create a divergence from conventional MOS measurements, requiring a recalibration to accurately capture the quality of voice transmissions.

Navigating the ED-Standards

Embarking on the next leg of our journey into the intricacies of MOS in air traffic communication, we navigate the terrain of the EUROCAE Working Group 67 (WG-67) standards.

  • ED-136 VoIP ATM System Operational and Technical Requirements
  • ED-137 Interoperability Standard for VoIP ATM Components
  • ED-138 Network requirements and performances for voice over internet protocol (VoIP) air traffic management

In short, ED-136 sets a framework for IP-based ATC communication, ED-137 defines the protocols for communication between ATC components, ED-138 specifies what the underlying network should look like, and ED-139 suggests how to test that an ATC system meets the aforementioned requirements. These standards were meticulously crafted to meet the unique needs of Air Navigation Service Providers (ANSPs) for flawless air-ground and ground-ground communications in the age of IP. The ED standards are not just a set of specifications; they are a benchmark that ANSPs must meet to guarantee the reliability and clarity of voice transmissions in the critical realm of air traffic control.

When it comes to voice quality, ED-136 Chapter 6.3.1 sets a high bar with the following statement: “The system SHALL achieve a Mean Opinion Score (MOS) > 4.0 […] for all air-ground and ground-ground communications.” This is indeed a challenging requirement, as the screenshot from a monitoring system below shows:

A screenshot of the passive monitoring system AVQA showing details of a five second timeslice of an RTP stream

The data shows a five second timeslice of a VoIP stream, where effectively only two packets were lost. The resulting MOS of 3.95 already drops below the ED-136 quality requirement.
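To get a feel for why two lost packets can already matter, here is a minimal sketch based on a simplified form of the ITU-T G.107 E-model. It is not how AVQA computes its figures, which this article does not describe. The assumptions are: G.711 without packet-loss concealment (equipment impairment 0, loss robustness 4.3), random rather than bursty loss, 20-millisecond voice packets, the G.107 default rating of 93.2, and no delay or jitter impairments. All function names are made up for this illustration.

```python
"""Simplified E-model sketch: packet loss -> estimated MOS (illustration only)."""

DEFAULT_R = 93.2  # G.107 rating when all parameters are at their default values
BPL_G711 = 4.3    # assumed loss robustness of G.711 without loss concealment


def effective_impairment(loss_pct: float, ie: float = 0.0,
                         bpl: float = BPL_G711) -> float:
    """Effective equipment impairment for random loss (loss given in percent)."""
    return ie + (95.0 - ie) * loss_pct / (loss_pct + bpl)


def r_to_mos(r: float) -> float:
    """Map the E-model rating R to an estimated MOS."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def estimate_mos(lost: int, expected: int, ie: float = 0.0,
                 bpl: float = BPL_G711) -> float:
    """Estimate the MOS of a timeslice from lost and expected packet counts."""
    loss_pct = 100.0 * lost / expected
    return r_to_mos(DEFAULT_R - effective_impairment(loss_pct, ie, bpl))


if __name__ == "__main__":
    packets_per_slice = 5_000 // 20  # five seconds of 20 ms packets
    for lost in (0, 2):
        mos = estimate_mos(lost, packets_per_slice)
        print(f"{lost} lost of {packets_per_slice}: estimated MOS {mos:.2f}")
```

Under these assumptions, a loss-free slice comes out at roughly 4.4, while two lost packets out of 250 pull the estimate just below the 4.0 threshold.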

It must be noted that the ED-136 MOS requirement raises two big questions. First of all, it conflicts with speech encoding mechanisms, called codecs, that are explicitly allowed in ED-137. Two of these codecs, namely G.728 and G.729, don’t meet the “MOS > 4.0” requirement, even under the best conditions. Any ANSP using these codecs automatically violates ED-136 voice quality requirements. Secondly, ED-136 does not explicitly state whether the MOS requirement applies to the actual end-to-end, mouth-to-ear quality or only to the VoIP packet streams, i.e. disregarding the packets’ audio payload. This is of specific importance to ATC, as the speech quality of analog radio communications between controllers and pilots more often than not falls below the 4.0 mark. Any issues with the voice over IP packet transport further degrade the quality.

Despite such questions, the significance of adhering to the EUROCAE standard suite becomes evident when we consider the weight of responsibility on the shoulders of air traffic controllers. In an environment where every second is a precious resource, meeting the requirements is not merely a bureaucratic checkbox; it is a lifeline for ANSPs striving to maintain the delicate balance between safety and efficiency. The standards, with their stringent criteria for voice communication quality, become the bedrock upon which the foundations of air traffic control reliability are built.

Choosing the best way to calculate MOS in ATC

In the pursuit of deciphering the mystery of MOS in ATC, practical methods for actually determining the voice quality come into play. As the word “opinion” in its name suggests, MOS is originally a subjective, empirical metric. ITU-T P.800 defines that speech quality is rated by actual humans on a scale from 1 (bad quality) to 5 (excellent quality). The MOS value is the arithmetic mean of many individual ratings. Translated to ATC, the subjective assessment relies on the feedback and complaints of air traffic controllers, essentially placing the detection of issues in the hands of end-users. However, the question arises: is it prudent to wait for issues to surface through user complaints, or would it be wiser to proactively address potential concerns before they impact the users?

Two main methods for automating the monitoring process come into play. Drawing parallels to the telco landscape, the differentiation between active and passive monitoring methods emerges. Active methods involve the deliberate generation of test voice samples to assess quality, resembling the proactive measures often taken in the telecommunications sector. On the other hand, passive methods entail the continuous monitoring of live transmissions, akin to the real-time monitoring common in telecommunications networks.

1. Active testing: Here, known artificial speech samples are transmitted over the network and the receiver compares the known signal with the received signal. This approach is standardized most notably in ITU-T P.863 POLQA. Active testing delivers not only a MOS per test call but also other quantitative metrics and KPIs. Jitter (the variation in packet arrival time), packet loss, and the occurrence of duplicate packets all contribute to evaluating the robustness of the communication network. Active tests provide an end-to-end view of these metrics. Excellent test results imply that, at the time of measurement, the entire chain from the Controller Working Position (CWP) across the network to the remote radio sites is in good condition.

2. Passive monitoring: This approach analyzes actual live traffic in the network, to determine metrics such as jitter, packet loss and used codecs to estimate the MOS based on ITU-T G.107. Rather than inspecting predefined test calls at communication endpoints, passive objective assessment gathers data passively from all calls at multiple points in the network. By correlating these observations, it is possible to infer the perceived speech quality in near real-time and to provide valuable insights from the network to guide any troubleshooting. This proactive stance enables the identification of potential issues before they escalate to the point of impacting end-users.
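As a rough illustration of the raw metrics a passive probe derives from live traffic, the sketch below computes jitter with the interarrival jitter estimator from RFC 3550 and packet loss from gaps in the RTP sequence numbers. The RtpObservation type and both function names are hypothetical, an 8 kHz RTP clock is assumed, and sequence-number and timestamp wraparound are ignored to keep the example short.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RtpObservation:
    """One captured RTP packet (hypothetical structure for this sketch)."""
    seq: int            # RTP sequence number
    rtp_timestamp: int  # RTP timestamp in clock units
    arrival_s: float    # capture time in seconds


def interarrival_jitter_ms(packets: Sequence[RtpObservation],
                           clock_rate: int = 8000) -> float:
    """RFC 3550 running jitter estimate, returned in milliseconds."""
    jitter = 0.0
    prev = None
    for pkt in packets:
        if prev is not None:
            # Difference between arrival spacing and sender spacing, in clock units.
            arrival_delta = (pkt.arrival_s - prev.arrival_s) * clock_rate
            sender_delta = pkt.rtp_timestamp - prev.rtp_timestamp
            d = arrival_delta - sender_delta
            jitter += (abs(d) - jitter) / 16.0
        prev = pkt
    return jitter / clock_rate * 1000.0


def loss_percent(packets: Sequence[RtpObservation]) -> float:
    """Share of missing sequence numbers; duplicates are counted once."""
    if not packets:
        return 0.0
    seqs = {pkt.seq for pkt in packets}
    expected = max(seqs) - min(seqs) + 1  # simplification: no wraparound handling
    return 100.0 * (expected - len(seqs)) / expected


if __name__ == "__main__":
    # Ten 20 ms packets, with packet 3 missing and packet 6 arriving 10 ms late.
    stream = [
        RtpObservation(i, 160 * i, 0.02 * i + (0.01 if i == 6 else 0.0))
        for i in range(10) if i != 3
    ]
    print(f"jitter: {interarrival_jitter_ms(stream):.2f} ms")
    print(f"loss:   {loss_percent(stream):.1f} %")
```

Values like these, gathered per stream and per measurement point, are the kind of input a G.107-based MOS estimate builds on.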

Active testing and passive monitoring are complementary approaches, but a truly proactive approach for the sake of air safety requires real-time insights that only a passive monitoring system can deliver.

Let’s take a look at a real-life example. ISAVIA is the primary airport operator and air navigation service provider (ANSP) in Iceland, working in one of the world's most demanding air traffic control environments, the expansive North Atlantic. In the Reykjavik oceanic control area (OCA), the ANSP handles a staggering 100,000 aircraft each year. Anticipating and embracing the critical trend in ATC technology, ISAVIA took proactive steps years ago. The result is a fully implemented IP-based Voice Communication System, showcasing the organization's commitment to staying at the forefront of technological advancements in the domain of air traffic control. ISAVIA was one of the first ANSPs to implement ED-137 compliant ATC communication. The focus then quickly shifted to monitoring their VoIP service and the need for real-time information. Consequently, ISAVIA opted for a best-in-class passive ATC VoIP monitoring system that serves all reporting needs, including the ability to raise alarms in real time on low MOS. For more on this case and complying with the EUROCAE requirements you can watch this on-demand webinar.

Complementary approaches

Since there’s a big difference between these two approaches let me summarize some pros and cons for each:

The Magic of expected MOS

One final MOS mystery remains: how to assess the speech quality when there is no active transmission?

But first, let’s get back to the differences between the ATC and the telco worlds. When we talk on the phone, our sentences can go on for dozens of seconds. In its definition of the MOS, the ITU states that the score needs to be evaluated for voice snippets of around 8 seconds. Historically, this makes sense, as there needs to be some amount of speech for a human test subject to settle on a rating from 1 to 5. For passive monitoring in the telco domain, this requirement is not an issue either, because audio is typically transmitted continuously in both directions. However, in ATC we have very short, coded and unidirectional communication based on the PTT paradigm. Passive monitoring systems for ATC therefore need to collect and collate the short talk snippets until they add up to at least 8 seconds. This makes it possible to calculate a meaningful MOS estimate that meets the ITU requirements.
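The collating step can be pictured with a small sketch. It groups consecutive PTT talk spurts until each group holds at least 8 seconds of speech, so that one MOS estimate can be produced per group. The function name and the tuple representation of a talk spurt are assumptions made for this example; a real monitoring system may well window its data differently.

```python
from typing import Iterable, Iterator, List, Tuple

TalkSpurt = Tuple[float, float]  # (start_s, end_s) of one PTT transmission


def collate_talk_spurts(spurts: Iterable[TalkSpurt],
                        min_speech_s: float = 8.0) -> Iterator[List[TalkSpurt]]:
    """Yield groups of consecutive talk spurts with at least min_speech_s of speech."""
    group: List[TalkSpurt] = []
    speech_s = 0.0
    for start_s, end_s in spurts:
        group.append((start_s, end_s))
        speech_s += end_s - start_s
        if speech_s >= min_speech_s:
            yield group
            group, speech_s = [], 0.0
    # A trailing group with less than min_speech_s of speech is not yielded;
    # it would wait for further transmissions before being scored.


if __name__ == "__main__":
    # Four short transmissions spread over half a minute.
    spurts = [(0.0, 3.0), (10.0, 13.0), (20.0, 23.0), (30.0, 32.0)]
    for group in collate_talk_spurts(spurts):
        total = sum(end - start for start, end in group)
        print(f"{len(group)} talk spurts, {total:.1f} s of speech")
```

Here the first three transmissions add up to 9 seconds and form one scoring window, while the last 2-second transmission stays pending.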

Channels will typically be idle more often than active, so what MOS should be reported when there is no PTT and hence no speech communication? Since ATC audio transmission is not continuous, there are long idle times during which the endpoints (controller working positions and radios) send so-called R2S packets. These R2S packets can contain valuable information on the radio signal strength, the noise level and timestamps to calculate the one-way delay. By default these packets are sent every 200 milliseconds, as opposed to voice packets, which are typically sent every 20 milliseconds, and they also serve as a ‘keep-alive’ mechanism. It is not catastrophic if an individual R2S packet arrives late or is even lost, as this has virtually no user impact. However, the question arises: what if an audio transmission had taken place at the same time that the R2S packet was lost?

To understand how the user experience would be in those idle times, Rohde & Schwarz developed the Expected MOS (eMOS). It makes use of those ‘keep-alive’ packets and pretends these actually contained audio. If the ‘keep-alive’ packets experience jitter or loss then the eMOS will go down, just like the MOS would for real audio. This metric is highly useful as it facilitates continuous MOS analytics and monitoring even in the world of ATC communications.
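The idea can be illustrated with a deliberately simple sketch. It is not the Rohde & Schwarz eMOS algorithm, which is not described here. The sketch only counts how many keep-alive packets arrived in an idle window, assuming one packet every 200 milliseconds, and hands the resulting loss percentage to whatever loss-to-MOS model is in use. The function names and the placeholder model in the usage part are hypothetical.

```python
from typing import Callable, Sequence

KEEPALIVE_INTERVAL_S = 0.2  # default keep-alive spacing during idle times


def keepalive_loss_percent(arrivals_s: Sequence[float],
                           window_start_s: float,
                           window_end_s: float,
                           interval_s: float = KEEPALIVE_INTERVAL_S) -> float:
    """Percentage of keep-alive packets missing from an idle window."""
    expected = int(round((window_end_s - window_start_s) / interval_s))
    if expected <= 0:
        return 0.0
    received = sum(1 for t in arrivals_s if window_start_s <= t < window_end_s)
    missing = max(expected - received, 0)
    return 100.0 * missing / expected


def expected_mos(arrivals_s: Sequence[float],
                 window_start_s: float,
                 window_end_s: float,
                 loss_to_mos: Callable[[float], float]) -> float:
    """Treat keep-alive loss as if it were voice loss and map it to a MOS."""
    loss_pct = keepalive_loss_percent(arrivals_s, window_start_s, window_end_s)
    return loss_to_mos(loss_pct)


if __name__ == "__main__":
    # Five idle seconds in which two of the 25 keep-alive packets never arrived.
    arrivals = [0.2 * i for i in range(25) if i not in (3, 7)]

    def placeholder_model(loss_pct: float) -> float:
        # Hypothetical stand-in for a real loss-to-MOS model such as the E-model.
        return max(1.0, 4.4 - 0.1 * loss_pct)

    print(f"keep-alive loss: {keepalive_loss_percent(arrivals, 0.0, 5.0):.1f} %")
    print(f"expected MOS:    {expected_mos(arrivals, 0.0, 5.0, placeholder_model):.2f}")
```

A fuller version would also take late arrivals into account, since the text above notes that jitter lowers the eMOS as well as loss.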

Final Words: It’s time to take action

Voice is generally difficult to send over a packet-based architecture such as IP because it is highly sensitive to the integrity and latency of the packet stream. However, ANSPs really need to take on this challenge. The Mean Opinion Score is a quantitative, industry-standard metric that measures a call’s perceived quality and the overall voice service performance. It is essential to continuously keep track of the speech quality of every air-ground and ground-ground call and then store this information for future reference or analysis. Along with other KPIs, the MOS helps provide an unambiguous and comprehensible quantitative measurement of voice transmission capabilities.

Just like the telecom sector experienced challenges during its own migration to VoIP, the ATC communication industry is poised to encounter its share of hurdles. Anticipating scenarios where "things go bad," such as one-way audio, becomes imperative.

In contemplating the monitoring methodologies, the importance of passive monitoring takes center stage. The question arises: Do you want to be in a position where problems are only revealed through users' (air traffic controllers) complaints or sporadic artificial test calls? This highlights the proactive stance afforded by passive monitoring, ensuring a vigilant approach to address potential issues before they impact users.

In this dynamic context, one more critical question looms: What if Civil Aviation Authorities (CAAs) suddenly request measurement reports showing that ED-136 requirements are met? Are you equipped to provide that?

The unique nature of ATC communication, as described in preceding sections, suggests that the challenges faced by the ATC communication market might surpass those encountered by the telco industry. MOS requirements are just one facet of a complex landscape that warrants a comprehensive approach. In upcoming articles, we'll explore additional challenges inherent in ATC communication, unraveling the multifaceted nature of this critical domain. Stay tuned.