About LingoAI

Setting the Stage for Decentralised AI on Multilingual Data

What is LingoAI?

LingoAI is a decentralized multilingual data platform for AI.

Our network draws on global data resources. It combines blockchain-based verification with a robust data and application separation protocol to secure every data exchange, from data crowdsourcing through local inference to private deployment, safeguarding each step of the data handling process.

By harnessing a decentralised, collaborative AI infrastructure network, LingoAI aims to help build AI for the people and foster global communication.

Background

LingoAI was inspired by Mr. Sopnendu Mohanty, Chief Fintech Officer of the Monetary Authority of Singapore (MAS), who raised the need for "data privacy and security when the government applies LLM for language localisation and translation".

At the WA Web3.0 Conference held in Perth in April 2023, Una Wang, now the founder of LingoAI, attended on behalf of LanguageStory and exchanged ideas with Mr. Mohanty about localising AI models for language translation. Mr. Mohanty noted that the central bank has a very large volume of content that needs to be translated into other languages. However, because of content privacy and security concerns, the central bank will not use centralized AI services such as OpenAI to translate its documents. Moreover, the translated content cannot be used directly either.

Team

The project is built by a team of professionals with diverse backgrounds in AI, blockchain, and Web3.0 technologies. Core team members come from institutions including Stanford University, Peking University, Harvard University, Washington University in St. Louis, Nanyang Technological University, JP Morgan, and more.

Additionally, an advisory board provides strategic guidance and oversight. Its members include industry experts from Singapore's banking regulator, MAS; the National Institute of Standards and Technology (NIST) Public Working Group on Generative AI; and an AI scientist from MCC Research Lab's 5th Generation Computer Program.

Why are we building LingoAI?

Challenges

Two key challenges stand out in the current AI landscape. First, there is a significant lack of high-quality, contextually accurate translation and content creation across multiple languages and specialized technical domains. Second, existing AI models, while powerful, often fall short in capturing factual knowledge and performing common-sense reasoning.

The datasets used to train different models often come from narrowly defined domains and cover a limited number of languages, so models trained on this data may exhibit inherent biases. Collecting audio data is a significant hurdle: the largest available speech datasets currently encompass no more than 100 languages.

Today, over 7,000 languages are spoken globally, along with numerous dialects that are often underrepresented in training datasets, even for widely spoken languages like English. This underrepresentation can lead to biases in model performance. Speech data also has to support a range of tasks, including speech-to-text translation, text-to-speech conversion, keyword spotting, and intent classification.

Our Solution

LingoAI aims to break down real-time cross-lingual communication barriers for the world's 8 billion people, to preserve indigenous languages, and to rebuild the "Tower of Babel" in the AI era through AI x DePIN.

We integrate Web3.0 and the AI MetaGraph technology stack from the ground up. MetaGraph combines the large language model with a knowledge graph and multilingual retrieval-augmented generation (RAG), with the aim of keeping the model from generating hallucinations.
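To make the idea of grounding a model in a knowledge graph more concrete, here is a minimal Python sketch of multilingual retrieval feeding a prompt. It is a generic illustration only and does not reflect MetaGraph's actual design, which is not described here. The `Triple` type, the toy `KNOWLEDGE_GRAPH`, the `ALIASES` table, and all function names are hypothetical, and no language model is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph fact: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical toy graph; a real system would query a graph store.
KNOWLEDGE_GRAPH = [
    Triple("Monetary Authority of Singapore", "is_central_bank_of", "Singapore"),
    Triple("Singapore", "has_official_language", "Malay"),
    Triple("Singapore", "has_official_language", "Mandarin"),
]

# Hypothetical alias table: surface forms in several languages
# map to one canonical entity name.
ALIASES = {
    "singapore": "Singapore",   # English
    "新加坡": "Singapore",       # Chinese
    "singapura": "Singapore",   # Malay
}


def link_entities(question: str) -> set[str]:
    """Return canonical entities whose alias appears in the question."""
    text = question.lower()
    return {entity for alias, entity in ALIASES.items() if alias in text}


def retrieve_facts(entities: set[str]) -> list[Triple]:
    """Return every triple that mentions one of the linked entities."""
    return [t for t in KNOWLEDGE_GRAPH
            if t.subject in entities or t.obj in entities]


def build_grounded_prompt(question: str) -> str:
    """Assemble a prompt that restricts the model to retrieved facts."""
    facts = retrieve_facts(link_entities(question))
    fact_lines = "\n".join(
        f"- {t.subject} {t.predicate} {t.obj}" for t in facts
    )
    return (
        "Answer using only the facts below. "
        "If they are insufficient, say so.\n"
        f"Facts:\n{fact_lines or '- (none found)'}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("新加坡的官方语言是什么?"))
```

In this sketch the question may be asked in any language covered by the alias table, while the facts are stored once under a canonical entity. Instructing the model to answer only from retrieved facts is one common way to limit unsupported answers, though it does not guarantee their absence.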

LingoAI builds on Semantic Web foundations through the Solid and MetaLife.Social protocols to separate data from applications and protect personal data privacy. The goal is to form a global decentralised data exchange market that addresses the data shortage caused by Web2.0 data silos.

Our Products

LingoPod - Corpus mining device

LingoTrans - Web3.0 crowdsourcing platform

Multimodal data contribution

Multiple language file translation and proofreading
