Thrifty Banker

Arabic AI could help open doors for other languages

October 4, 2023
The emergence of ChatGPT and similar platforms has created a buzz around large language model AI – artificial intelligence trained on vast sets of data from the internet to respond to text commands.

Despite growing interest in AI in the Middle East, Arabic-language models have lagged behind. But a team of academics, researchers and engineers in the United Arab Emirates (UAE) recently unveiled a powerful tool tailored to the world’s Arabic speakers, which its creators say could pave the way for large language models (LLMs) in other languages that are “underrepresented in mainstream AI.”

Named after the UAE’s highest mountain, “Jais” was created in collaboration between Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Silicon Valley-based Cerebras Systems, and Inception, a subsidiary of UAE-based AI company G42.

Although ChatGPT, Meta’s LLaMA and other LLMs have Arabic-language capabilities, they were mostly trained on English data on the internet, according to Timothy Baldwin, acting provost and professor of natural language processing at MBZUAI.

Instead, Jais used English and Arabic datasets, with a focus on content from the Middle East, allowing it to go beyond “what anyone else has been able to achieve for Arabic,” Baldwin says.

Languages that use the Latin alphabet dominate the internet, with English by far the most-used. That means datasets are largest in those languages, according to Mohammed Soliman, director of strategic technologies and the cyber security program at the Middle East Institute, in Washington DC.

Typically, language models trained in English have Western-centric datasets. “[These LLMs] lack awareness of other cultures, adversely affecting the user experience for people of diverse backgrounds,” Soliman added.

As a result of its training, Jais understands cultural nuances and dialects, according to MBZUAI, enabling it to be used more widely across different industries. In future releases, the team aims to have Jais work with images, graphs or tabular data instead of just text, broadening its uses and potentially enabling it to interpret medical scans, investment data or data from satellites.

Different dialects

Arabic is the sixth most spoken language in the world and is rich with a “constellation” of different dialects, which adds to the complexity of training a language model, Baldwin said. Modern Standard Arabic is typically used for official documents and formal writing, but local dialects are often used on blogs or social media. By training on a diverse set of data, Jais can usually switch between dialects, Baldwin said.

“There’s certainly room for improvement there, but the focus has been more on the robustness in terms of being able to understand if we do have more informal inputs to the model,” Baldwin added.

A recent update allows Google’s Bard to also understand questions in over a dozen Arabic dialects, including Egyptian colloquial Arabic and Saudi colloquial Arabic; the responses are then returned in Modern Standard Arabic.

Jais has 13 billion parameters, and a 30-billion parameter update is in the works, Baldwin said. Parameters quantify the size of a language model, but not necessarily the accuracy. ChatGPT-3.5 has around 175 billion parameters, according to OpenAI.

Jais, like other generative AI models, uses instruction tuning to prevent it from creating “toxic” or “harmful” answers, Baldwin said. It won’t generate anything that could lead to self-harm or harm to others, or that is suggestive of addiction. The responses it generates adhere to local rules and customs on topics such as homosexuality and drugs.

MBZUAI had “various dialogues” with the UAE government and other institutions around responsible AI, which were referenced when developing Jais, according to Baldwin.

Regional developments

There have been growing efforts in the UAE to develop generative AI systems. It was the first country in the world to appoint a minister of AI, in 2017, and the region’s largest generative AI model, Falcon, was unveiled by Abu Dhabi’s Advanced Technology Research Council and the Technology Innovation Institute (TII) in March, with a new iteration released in September.

Although not currently available in Arabic, Falcon is more powerful than Jais in English, with 180 billion parameters, and outperforms competitors such as Meta’s LLaMA 2 based on its ability to reason, code and complete knowledge tests, according to TII. Unlike Google’s Bard and ChatGPT, Falcon and Jais are open-source, which means their code is available for anyone to use or change.

A 2018 report by consulting firm PwC estimated that the Middle East could accrue up to $320 billion in benefits from AI by 2030. The region wants to make sure it has its “own capabilities” in terms of AI, says Ali Hosseini, PwC’s Middle East chief digital officer.

“Some of the best open-source models are actually developed in our region,” Hosseini added, referencing Falcon and Jais.

Its makers hope that Jais will further the development of generative AI in the Middle East. “This is kind of step one of many future steps,” Baldwin said. “Not just for Arabic large language models, but elsewhere.”

This post appeared first on cnn.com