New Z-code Mixture of Experts models improve quality and efficiency in Translator and Azure AI

Microsoft is upgrading Translator and other Azure AI services with a new family of artificial intelligence models its researchers have developed, called Z-code, which deliver the kind of performance and quality other large-scale language models have but can run much more efficiently.

“Our goal is to help everyone and every organization on the planet communicate better, and to achieve that goal there are really two important dimensions: we want the quality of translations to be as good as possible and we want to support as many languages as possible,” said Xuedong Huang, Microsoft Technical Fellow and Chief Technology Officer of Azure AI.

Z-code takes advantage of shared linguistic elements across multiple languages through transfer learning – which applies knowledge from one task to another related task – to improve the quality of machine translation and other language understanding tasks. It also extends these capabilities beyond the world’s most common languages to underrepresented languages that have less training data available.

“With Z-code, we’re really making incredible progress as we leverage both transfer learning and multitask learning from monolingual and multilingual data to create a state-of-the-art language model that we believe offers the best combination of quality, performance and efficiency that we can offer our customers,” said Huang.

These models use a sparse “Mixture of Experts” approach that is more efficient to run because it only needs to engage a portion of the model to complete a task, as opposed to other architectures that must activate an entire AI model to execute every request. This architecture allows massive scaling of the number of model parameters while keeping the amount of computation constant.
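The routing idea can be sketched in a few lines. The toy layer below is a hypothetical illustration, not Z-code’s actual architecture; every name, size and weight is invented. A small gating network scores all experts for a token, but only the top-scoring two experts actually do any computation:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts in the layer
TOP_K = 2         # experts activated per token (sparse routing)
D_MODEL = 16      # hidden dimension

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.1
           for _ in range(NUM_EXPERTS)]
# The gating network produces one score per expert for a given token.
gate_w = rng.standard_normal((D_MODEL, NUM_EXPERTS)) * 0.1

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of NUM_EXPERTS experts."""
    logits = token @ gate_w                 # score every expert
    top = np.argsort(logits)[-TOP_K:]       # indices of the best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                # softmax over chosen experts only
    # Only TOP_K expert matrices are multiplied; the rest stay idle.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_layer(token)
print(out.shape)  # (16,)
```

Because only `TOP_K` expert matrices are touched per token, doubling `NUM_EXPERTS` grows the parameter count without growing the per-token computation – the property the paragraph above describes.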

To bring these models into production, Microsoft uses NVIDIA GPUs and Triton Inference Server to efficiently deploy and scale them for high-performance inference.

Microsoft recently deployed Z-code models to improve common language understanding tasks such as named entity recognition, text summarization, custom text classification and key phrase extraction in its Azure AI services. But this is the first time a company has publicly demonstrated that it can use this new class of Mixture of Experts models to power machine translation products.

The new Z-code-based translation model is now available, by invitation initially, to customers using document translation in Translator, a Microsoft Azure Cognitive Service that is part of Azure AI.

Microsoft’s Z-code models have consistently improved translation quality over current production models, as measured by common industry metrics. Unlike typical multilingual transfer learning approaches, which tend to show AI quality gains only in languages that have fewer direct translation examples available for training, Z-code Mixture of Experts models show consistent gains even in the largest languages.

New Z-code Mixture of Experts AI models help improve the quality and efficiency of Translator and other Azure AI services.

In a blind test commissioned by Microsoft, human raters found that Z-code Mixture of Experts models improved translations in all languages, with an average gain of 4%. For example, the models improved translations from English to French by 3.2%, from English to Turkish by 5.8%, from Japanese to English by 7.6%, from English to Arabic by 9.3% and English to Slovenian by 15%.

Creating more powerful and integrative AI systems

Z-code is part of Microsoft’s larger XYZ-code initiative, which seeks to combine models for text, vision, audio and multiple languages to create more powerful, integrative AI systems that can speak, hear, see and understand people better.

Over the past five years, Microsoft has developed models that have matched human performance in conversational speech recognition, machine translation, image captioning, SuperGLUE natural language understanding and commonsense question answering. These breakthroughs provide the basis for more ambitious AI systems that can achieve multisensory and multilingual learning closer to how people learn and understand, Huang said.

“These are the pieces, the building blocks that we use to build truly differentiated intelligence…and to train cost-effective production systems,” Huang said.

The Z-code models were developed as part of Microsoft’s AI at Scale and Turing initiatives, which seek to develop large models pretrained on vast amounts of textual data to understand the nuances of language – models that can be integrated into multiple Microsoft products and also made available to customers for their own use.

The same underlying model can be fine-tuned to perform different language understanding tasks such as translating between languages, summarizing speech, suggesting ways to complete a sentence or generating suggested tweets, instead of having to develop separate models for each of these narrow goals.
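As a rough illustration of that idea – a toy sketch with invented names and shapes, not the actual Z-code design – a single shared backbone can carry most of the computation while small task-specific heads are fine-tuned per task:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8  # toy feature dimension

# One shared "backbone" stands in for the large pretrained language model.
shared_backbone = rng.standard_normal((D, D)) * 0.1

# Fine-tuning attaches a small task-specific head per downstream task
# instead of training a separate full model for each one.
heads = {
    "translate": rng.standard_normal((D, D)) * 0.1,  # sequence output
    "summarize": rng.standard_normal((D, 4)) * 0.1,  # shorter output
    "classify":  rng.standard_normal((D, 2)) * 0.1,  # two classes
}

def run(task: str, features: np.ndarray) -> np.ndarray:
    hidden = np.tanh(features @ shared_backbone)  # shared computation
    return hidden @ heads[task]                   # task-specific projection

x = rng.standard_normal(D)
print(run("classify", x).shape)  # (2,)
```

Only the heads differ between tasks, so adding a new task is cheap compared with developing a separate model – the design choice the paragraph above attributes to the Z-code approach.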