Multilingualism

The future of multilingualism in a post-AI civilization.

May 28, 2023

Each new language you learn opens up a new world. A world that you can explore, understand and inhabit. Learning languages is exercise for the brain: it wakes up dormant neural pathways and triggers different patterns of thought.

Language encodes knowledge. When we lose a language, we lose the knowledge it encoded as well, until someone figures out a way to understand that language again. There is still a lot of ancient knowledge locked away in lost languages. Who knows what we will learn when the Indus Valley script is finally deciphered and someone reads those symbols after a silence of nearly four thousand years?

Language is deeply intertwined with culture. Each language unlocks a wealth of information, literature, and knowledge that is unique. By learning new languages, you gain insights into different cultures, worldviews, and ways of thinking. Multilingual individuals often exhibit enhanced cognitive abilities, such as better problem-solving skills, improved multitasking, and greater mental flexibility. This is because managing multiple languages requires the brain to constantly switch between different linguistic systems, strengthening cognitive control.

Let’s pause for a moment and consider India. A country of contradictions, the land of unity in diversity. India’s linguistic tapestry is a testament to the country’s diverse cultural and historical influences. With thousands of languages and dialects spoken across the nation, multilingualism is a way of life for many Indians. At the heart of this linguistic abundance lies the idea that language is not merely a means of communication, but a reflection of the soul of a culture. This national aptitude for multilingualism is often taken for granted, but it is something that should be actively fostered. Setting aside parochial grievances and false pride, Indians should strive to capitalize on any cognitive advantage multilingualism offers.

***

Now you may question the place of multilingualism in a world where AI-powered translation is freely and instantly available to much of the population. Why not just translate everything into a language you already know? Why bother learning another language?

Well, translation is not a lossless encoding mechanism. Some of the knowledge encoded in a language is lost when you translate the material to another language. The challenge of capturing the true essence of a word or phrase from one language into another is a formidable one. While translations can convey the general meaning, some nuances often remain elusive. This phenomenon is particularly pronounced in languages with deep historical and cultural roots.

For instance, the Sanskrit term "Dharma" is often translated as "duty" or "religion." While these translations provide a general sense of the term, they fail to capture its full breadth and depth. In reality, "Dharma" encompasses an individual's cosmic duty, their moral responsibilities, and the natural order of the universe. It reflects the interconnectedness of all beings and the delicate balance that sustains existence.

In a monolithic future where universal translation wins and multilingualism loses, will all of us be the poorer for the knowledge that is lost? Will humanity, as a civilization, choose the convenience of universal translation over the satisfaction of a new language learnt and a new world explored?

***

As AI and large language models (LLMs) make their presence felt, what, I wonder, will be the future of multilingualism?

There are thousands of languages in the world, yet the majority of content available digitally and on the Internet is in English. All the LLMs in vogue today have been trained on text from most major languages, but in varying volumes. Far and away, English forms the bulk of the training corpus of all of these models.

But English is not the universal language. It is not even the language spoken by most humans in the world today. Could this disparity and bias towards English mean that LLMs are missing out on capturing the nuances of knowledge encoded in other languages?

If multilingualism causes human intelligences to develop different patterns of thought, could this be true in artificial intelligences as well?

Consider the surprising (or maybe not) finding that multimodal models, large language models trained not just on text but also on audio, video, images and more, are better at reasoning and other cognitive tasks than the ones trained on text alone. Will there be a similar cognitive leap when the training corpus contains substantial input from not just English but many of the other surviving human languages?

If the training set is more representative, what could happen?

Will AI be able to provide more accurate answers and more appropriate interpretations based on a better encoding of knowledge? Could AI models be trained to preserve languages at risk of extinction, along with knowledge that could otherwise be lost forever?

Will more multilingual AI be better at innovation and creativity? Multilingual individuals often demonstrate enhanced creativity, as they can draw upon multiple linguistic systems to express their ideas. In my experience, Anthropic’s Claude models are notably better than OpenAI’s GPT-4 when it comes to creative writing such as poetry. I have also noticed that the Anthropic models are better at understanding the nuances of certain languages such as Malayalam. Could this be a coincidence? I think not.

Will there be AI agents trained with different weightings of languages in their corpus, so that we can analyze their responses to different stimuli and even how they interact with each other? How would an English AI, a German AI, a Chinese AI, a Sanskrit AI and a Tamil AI respond to a given problem, for example the trolley problem, the prisoner’s dilemma or even more mundane choices? There may be much to learn from such an experiment. We may even understand more about ourselves in the process.

Research and experimentation with multilingual AI could help us understand the nuances of how language encodes knowledge and whether (and why) multilingualism matters. And no matter what the results of the experiments may say, it will be a step towards a more democratic and diverse AI future.

Sree Nair

May 29, 2023

Excellent, thought-provoking article.

'Multilingual individuals often exhibit enhanced cognitive abilities, such as better problem-solving skills, improved multitasking, and greater mental flexibility.' Is there any empirical evidence for this?

1 reply by Shyam Sreevalsan


The Day After Tomorrow
