Mobile network testing blog

Written by Anna Llagostera | March 24, 2023

EVS AMR-WB IO speech quality improvements in telephone services

The enhanced voice services (EVS) codec was developed in 3GPP and released in 2014 to replace the AMR-WB codec for VoLTE and VoNR mobile telephony. The codec improves speech quality with an enhanced coding scheme, extended audio bandwidth (up to 20 kHz) and improved compensation for delay jitter and packet loss. In its AMR-WB interoperable (IO) mode, the EVS codec can encode and decode AMR-WB streams, which allows seamless switching between EVS and AMR-WB. This mode keeps EVS compatible with systems and devices that do not support the EVS codec, as can be the case in inter-technology call setups. This post looks at recent results from an internal study of the speech quality improvements obtained when the EVS AMR-WB IO mode is used together with the legacy AMR-WB encoder.

The EVS codec was developed in 3GPP, the creator of the 3G, 4G and 5G mobile communication standards, and released in 2014 to replace the AMR-WB codec for VoLTE and VoNR mobile telephony. The codec improves speech quality thanks to an enhanced coding scheme, extended audio bandwidth (up to 20 kHz) and improved compensation for delay jitter and packet loss.

Figure: Audio bandwidth for common codecs in mobile voice services

EVS AMR-WB interoperable (IO) mode

One of the requirements when developing the new EVS codec was to retain compatibility with AMR-WB. As a result, the new codec has an AMR-WB IO mode that is compatible with systems that do not support the EVS codec. EVS AMR-WB IO is increasingly used in current mobile networks and terminals.

When the resulting bitstream is decoded with a legacy AMR-WB decoder, the EVS AMR-WB IO encoder and the legacy AMR-WB encoder are expected to deliver similar speech quality. In contrast, the EVS AMR-WB IO decoder can reproduce better speech quality from an AMR-WB encoded bitstream than a classical AMR-WB decoder. To verify this, we included speech files decoded in the EVS AMR-WB IO mode in a standardized, formal listening test in our speech quality lab.

Listening test

We designed a listening test with speech files ranging from excellent to very poor quality and invited listeners to evaluate the perceived quality of the recordings using Absolute Category Rating (ACR). In ACR, listeners rate each audio sample on a 5-point scale, where 5 is excellent and 1 is bad, according to the speech quality they expect over hi-fi headphones.
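ACR votes for a test condition are usually condensed into a mean opinion score (MOS), often reported with a confidence interval. The following is a minimal sketch of that averaging step, not the lab's actual tooling. The function name, the normal approximation (z = 1.96) and the votes themselves are assumptions made for illustration.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for ACR ratings on the 1..5 scale."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    # Normal approximation; a t-quantile would be more exact for small panels.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up votes from a hypothetical 8-listener panel, NOT data from the study.
votes = [4, 5, 4, 3, 4, 4, 5, 3]
mos, ci = mean_opinion_score(votes)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With a larger panel the interval narrows, which is one reason formal tests fix the number of listeners in advance.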

The test covered speech signals ranging from fullband to narrowband. It contained plain audio bandwidth limitations and offline processing conditions that use common speech codecs such as EVS, Opus*, AMR-WB and AMR. The test also included live recordings from R&S real-field measurements, collected under good, average and poor network conditions, that reflect typical codec and bitrate usage in mobile communications.

We also added conditions that better reflect how EVS AMR-WB IO is currently used in mobile networks to improve perceived quality. We opted for AMR-WB encoding followed by EVS AMR-WB IO decoding at bitrates of 23.85 kbit/s and 12.65 kbit/s. The goal was to see how much listening quality improves relative to the legacy AMR-WB codec.

The listening test was conducted in the R&S SwissQual speech quality lab in Switzerland and used the four German reference speech samples standardized in ITU-T Recommendation P.501 Annex C. The speech samples were well-balanced, high-quality recordings of two sentences each, with a 24 kHz audio bandwidth. We invited 24 people to listen to the speech files, none of whom had any prior knowledge of the test background or of voice coding techniques. The listeners were normal telephone users. Both the listener panel and the listening conditions in the lab complied with ITU-T standards, which is a precondition for the ITU to consider the test a standard-conformant, formal listening test.

Detailed results from the listening test were presented to ITU-T in January 2023.

Results

The table below shows the mean opinion scores (MOS) for that particular test design and listening panel. The reference speech samples are encoded with the AMR-WB codec and decoded with (a) the legacy AMR-WB decoder and (b) the EVS AMR-WB IO decoder.

Table: MOS improvements in the ACR listening test when decoding an AMR-WB bitstream with EVS AMR-WB IO instead of AMR-WB

We found that the perceived speech quality increased substantially for both bitrates included in the listening test (23.85 and 12.65 kbit/s) when an AMR-WB bitstream was decoded with an EVS AMR-WB IO decoder instead of the legacy AMR-WB decoder. The improvement in listening quality was larger for the lower of the two bitrates (12.65 kbit/s). The listening test confirmed the better decoding capabilities of EVS AMR-WB IO and helped quantify the expected improvement in speech quality enabled by this EVS feature.

Extended audio bandwidth with EVS AMR-WB IO

The extended audio bandwidth reproduced by the EVS AMR-WB IO decoder is one reason for the clear gain in speech quality. The figure below shows the average signal spectrum for a speech signal encoded with the AMR-WB codec and decoded with (a) the standard AMR-WB decoder (in blue) and (b) the EVS AMR-WB IO decoder (in orange).

Figure: Extended audio bandwidth with EVS AMR-WB IO as decoder (orange) instead of AMR-WB (blue) to decode an AMR-WB stream with a 23.85 kbit/s bitrate

The EVS AMR-WB IO decoder can reproduce a speech signal with roughly 7.8 kHz of audio bandwidth, while the legacy AMR-WB decoder reaches ‘just’ 7 kHz.
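One way to read such a bandwidth figure off a spectrum is to look for the highest frequency whose level stays within a fixed distance of the spectral peak. The sketch below illustrates that idea on synthetic tone stacks that stand in for decoded speech. It is not the analysis used in the study: the -30 dB threshold, the 32 kHz sampling rate, the 20 ms frame length and all function names are assumptions. A real analysis would typically average windowed spectra over many frames of actual speech.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for one short frame)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def estimate_bandwidth_hz(samples, sample_rate, drop_db=30.0):
    """Highest frequency whose level is within drop_db of the spectral peak."""
    mags = magnitude_spectrum(samples)
    threshold = max(mags) * 10 ** (-drop_db / 20)
    top_bin = max(k for k, m in enumerate(mags) if m >= threshold)
    return top_bin * sample_rate / len(samples)


def tone_stack(cutoff_hz, sample_rate=32000, n=640, step_hz=200):
    """Synthetic stand-in for decoded speech: equal tones up to cutoff_hz."""
    freqs = range(step_hz, cutoff_hz + 1, step_hz)
    return [sum(math.sin(2 * math.pi * f * i / sample_rate) for f in freqs)
            for i in range(n)]


# Two artificial signals band-limited to 7 kHz and 7.8 kHz (assumed values
# taken from the text, not decoder output).
legacy_like = estimate_bandwidth_hz(tone_stack(7000), 32000)
io_like = estimate_bandwidth_hz(tone_stack(7800), 32000)
print(legacy_like, io_like)  # 7000.0 7800.0
```

The tones are placed exactly on DFT bins, so no window is needed here. With real recordings, spectral leakage and noise make the choice of threshold a judgment call.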

In addition to the extended audio bandwidth, EVS AMR-WB IO delivers better speech quality in conditions without packet loss, thanks to improved EVS post-processing modules that reduce the number of coding artifacts.

Speech quality prediction by Recommendation ITU-T P.863 ‘POLQA’

The quality of the speech samples recorded with our Rohde & Schwarz mobile network measurement tools is evaluated with Recommendation ITU-T P.863 ‘POLQA’, which Rohde & Schwarz SwissQual AG co-invented and for which it holds intellectual property rights. Since EVS AMR-WB IO decoding is increasingly used in real-field environments, validating the ITU-T P.863 prediction accuracy under such conditions is especially important.

We computed the average ITU-T P.863 ‘POLQA’ scores for the legacy AMR-WB decoder and the EVS AMR-WB IO decoder based on the reference speech samples standardized in ITU-T P.501 Annex D. Rohde & Schwarz uses the ITU-T P.501 Annex D reference speech samples in all its measurement tools, since the samples are specifically prepared for ITU-T P.863 ‘POLQA’. The samples use sentences spoken by male and female speakers and their ITU-T P.863 ‘POLQA’ scores are close to the average for many samples.

The table below shows the average ITU-T P.863 ‘POLQA’ scores in fullband (FB) mode for (a) the legacy AMR-WB decoder and (b) the EVS AMR-WB IO decoder.

Table: Average improvement when decoding an AMR-WB bitstream with EVS AMR-WB IO instead of AMR-WB for ITU-T Rec. P.863 FB based on ITU-T Rec. P.501 Annex D reference samples

The perceptible improvement in speech quality when decoding with the EVS AMR-WB IO decoder instead of the legacy AMR-WB decoder can be clearly reproduced with ITU-T P.863 ‘POLQA’ for the two most common AMR-WB bitrates of 23.85 and 12.65 kbit/s.
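An average improvement of this kind can be obtained by scoring the same reference samples once per decoder and averaging the per-sample differences for each bitrate. The sketch below shows only that bookkeeping. The scores are placeholder numbers invented for illustration, not measured ITU-T P.863 values, and the dictionary layout and function name are assumptions.

```python
from statistics import mean

# Placeholder per-sample scores (NOT measured values), keyed by bitrate in kbit/s.
scores = {
    23.85: {"amr_wb": [3.9, 4.0, 3.8, 4.1], "evs_io": [4.2, 4.3, 4.1, 4.4]},
    12.65: {"amr_wb": [3.5, 3.6, 3.4, 3.7], "evs_io": [3.9, 4.0, 3.9, 4.0]},
}


def average_improvement(legacy, improved):
    """Mean of per-sample score differences (same samples, same order)."""
    if len(legacy) != len(improved):
        raise ValueError("score lists must be paired sample by sample")
    return mean(b - a for a, b in zip(legacy, improved))


gains = {rate: average_improvement(s["amr_wb"], s["evs_io"])
         for rate, s in scores.items()}
for rate, gain in gains.items():
    print(f"{rate} kbit/s: +{gain:.2f} MOS")
```

Pairing the scores sample by sample keeps differences between speakers and sentences from masking the effect of the decoder.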

Conclusion

Using the AMR-WB IO mode of EVS together with the legacy AMR-WB codec can improve speech quality for telephone services in mobile networks. EVS AMR-WB IO decoders are significantly better than legacy AMR-WB decoders at decoding AMR-WB bitstreams. Because this solution is inexpensive to implement, we expect to see the EVS AMR-WB IO mode used more often in networks and devices.

_______________________________

* The Opus codec, released in 2012 and standardized by the Internet Engineering Task Force (IETF) as RFC 6716, is used in OTT applications, notably in WhatsApp VoIP audio calls. Like the EVS codec, the Opus codec can deliver audio bandwidths up to fullband.
