Facebook Is Developing Single-Encoder Technology to Translate 93 Languages

Facebook researchers recently published a paper, building on Schwenk (2018a), that proposes an architecture for learning joint multilingual sentence representations in 93 languages using a single BiLSTM encoder and a BPE vocabulary shared by all languages. Other research has addressed this problem, but its performance has been limited, primarily because it trains a separate model for each language, so nothing is shared across languages.
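
To make the idea of a shared BPE vocabulary concrete, here is a minimal sketch of byte-pair encoding learned on text pooled from more than one language. It is a toy version of the general BPE procedure, not the researchers' code; the function names, the two-sentence corpus, and the number of merges are assumptions made for illustration.

```python
from collections import Counter

def learn_joint_bpe(corpus, num_merges):
    """Learn BPE merges from sentences pooled across all languages."""
    # Each word becomes a tuple of symbols ending in an end-of-word marker.
    words = Counter()
    for sentence in corpus:
        for word in sentence.split():
            words[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken deterministically.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(_merge(list(symbols), best))] += freq
        words = new_words
    return merges

def _merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a symbol list."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def apply_bpe(word, merges):
    """Segment a word from any language with the shared merge list."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = _merge(symbols, pair)
    return symbols

# Hypothetical toy corpus: one English and one French sentence, pooled.
corpus = ["the national nation", "la nation nationale"]
merges = learn_joint_bpe(corpus, num_merges=5)
print(merges)
print(apply_bpe("nationale", merges))
```

Because the merges are learned on the pooled text, subwords common to both languages end up as shared vocabulary entries, which is what lets one encoder read every language.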

Facebook researchers are interested in sentence vector representations that are general with respect to both the input language and the NLP task. The research aims to help languages with limited resources, to enable zero-shot transfer of NLP models from one language to another, and to handle code-switching. What sets it apart from other NLP work is that it studies joint sentence representations across 93 different languages. In contrast, most NLP work of this kind covers two languages at most.

Facebook Language Training Samples
Figure 1: 75 out of 93 languages used to train the proposed model

The 93 languages in the study span 34 language families and 28 different writing systems. The model is evaluated on zero-shot cross-lingual natural language inference (the XNLI dataset), classification (the MLDoc dataset), bitext mining (the BUCC dataset), and multilingual similarity search (the Tatoeba dataset). The researchers also built a new test set from the Tatoeba corpus, which provides baseline results for 122 languages.
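
As a rough illustration of multilingual similarity search, the sketch below takes sentence embeddings for aligned sentences in two languages, finds the nearest target sentence for each source sentence, and reports how often the match is wrong. The use of cosine similarity, the error-rate metric, and the toy three-dimensional vectors are assumptions for illustration; the article does not specify them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(source_vecs, target_vecs):
    """For each source embedding, return the index of the closest target."""
    return [
        max(range(len(target_vecs)), key=lambda j: cosine(s, target_vecs[j]))
        for s in source_vecs
    ]

def error_rate(source_vecs, target_vecs):
    """Fraction of source sentences whose nearest target is not the
    aligned one (sentence i in the source is aligned with i in the target)."""
    preds = nearest_neighbors(source_vecs, target_vecs)
    wrong = sum(1 for i, j in enumerate(preds) if i != j)
    return wrong / len(preds)

# Hypothetical embeddings of three aligned sentence pairs.
source = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
target = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.7, 0.1, 0.6]]
print(error_rate(source, target))
```

If the encoder really is language independent, translations of the same sentence land close together and the error rate stays low, regardless of which language pair is searched.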

The architecture follows an encoder-decoder design. Once a sentence has been embedded, the embedding is linearly transformed to initialize the LSTM decoder. The system has only one encoder and one decoder, shared by all languages, and the researchers use a joint byte-pair encoding vocabulary so that the encoder learns language-independent representations. The encoder has between one and five layers, each 512-dimensional. The decoder is told which language to generate through a language ID and has one 2048-dimensional layer.
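
The data flow from encoder to decoder can be sketched in a few lines. This is only a shape-level illustration under several assumptions: the LSTM recurrences are replaced by random placeholder vectors, the per-token states are reduced to one sentence vector by max-pooling (the article does not say how the sentence is embedded), and the sizes are scaled down (4 per direction standing in for 512, 6 standing in for 2048). All function names are hypothetical.

```python
import random

def max_pool(states):
    """Element-wise maximum over time steps: T vectors -> one vector."""
    return [max(column) for column in zip(*states)]

def linear(weights, bias, vec):
    """Plain affine map: weights (out x in) times vec, plus bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def embed_sentence(forward_states, backward_states):
    """Concatenate both directions per token, then pool over tokens.
    The result has a fixed size no matter how long the sentence is."""
    joined = [f + b for f, b in zip(forward_states, backward_states)]
    return max_pool(joined)

def init_decoder(sentence_embedding, weights, bias):
    """Linearly transform the sentence embedding into the decoder's
    initial state."""
    return linear(weights, bias, sentence_embedding)

random.seed(0)
HIDDEN, DECODER = 4, 6  # scaled-down stand-ins for 512 and 2048

def fake_states(num_tokens):
    # Placeholder for real LSTM outputs, one vector per token.
    return [[random.uniform(-1, 1) for _ in range(HIDDEN)]
            for _ in range(num_tokens)]

weights = [[random.uniform(-1, 1) for _ in range(2 * HIDDEN)]
           for _ in range(DECODER)]
bias = [0.0] * DECODER

embedding = embed_sentence(fake_states(7), fake_states(7))
decoder_state = init_decoder(embedding, weights, bias)
print(len(embedding), len(decoder_state))
```

The point of the sketch is that the decoder sees the input sentence only through this single fixed-size vector, which pushes the encoder to pack the meaning of the sentence into it.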

Facebook Multi Language Sentence Embedding
Figure 2: Architecture of our system to learn multilingual sentence embeddings.

The text is pre-processed with the Moses statistical machine translation system, except for Chinese and Japanese, which are segmented with Jieba and MeCab respectively.
