Mistral AI: What to Know About Europe’s OpenAI Rival

This French startup is pushing generative AI to new heights with its commercial and open source LLMs.

Written by Ellen Glover
Published on Apr. 23, 2024

Mistral AI is an artificial intelligence startup that makes large language models (LLMs). Based in Paris, France, and founded by former researchers at Google DeepMind and Meta, Mistral is known for its transparent, portable, customizable and cost-effective models that require fewer computational resources than other popular LLMs.

What Is Mistral AI?

Mistral AI is a French artificial intelligence startup that launched in 2023. It builds open source and commercial AI models, some of which have achieved state-of-the-art performance on several industry benchmarks.

With substantial backing from prominent investors like Microsoft and Andreessen Horowitz — and a reported valuation of $5 billion — Mistral is positioning itself to be a formidable competitor in the increasingly crowded generative AI market. The company’s top commercial LLM outperforms those developed by incumbents like Google and Anthropic across several industry benchmarks, and even gives OpenAI’s GPT-4 — often considered the gold standard in AI model performance — a run for its money.

The company also makes a suite of open source models that are freely available for anyone to use and modify. In contrast to some of the most powerful AI companies, Mistral has made its LLMs more accessible, arguing that, “by training our own models, releasing them openly, and fostering community contributions, we can build a credible alternative to the emerging AI oligopoly.”


 

What Does Mistral AI Offer?

Mistral AI offers several LLMs, both commercial and open source. Each has its own set of strengths and abilities.
 

Commercial Models

All of Mistral’s commercial models are closed-source and only available through its API.

Mistral Large

  • The most advanced of Mistral AI’s models.
  • Ideal for complex tasks like synthetic text generation and code generation.
  • Ranks second to GPT-4 in several industry benchmarks.
  • Has a maximum context window of 32k tokens.
  • Natively fluent in English, French, Spanish, German and Italian, as well as code.

Mistral Small

  • Focused on efficient reasoning for low latency workloads.
  • Ideal for simple tasks that can be done in bulk, like text generation and text classification.
  • Has a maximum context window of 32k tokens.
  • Natively fluent in English, French, Spanish, German and Italian, as well as code.

Mistral Embed

  • Converts text into numerical representations (aka “embeddings”) so it can process and analyze words in a way that is understandable to a computer.
  • Ideal for tasks like sentiment analysis and text classification.
  • Currently available in English only.
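
To make the idea of embeddings more concrete, here is a minimal sketch of how two pieces of text can be compared once they have been turned into vectors. The three-number vectors and the helper names below are made up for illustration; a real embedding model returns much longer vectors, and nothing here calls Mistral's API.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example only.
toy_embeddings = {
    "great product": [0.9, 0.1, 0.2],
    "excellent item": [0.8, 0.2, 0.3],
    "terrible service": [-0.7, 0.6, 0.1],
}

def most_similar(query, embeddings):
    """Return the other text whose vector points in the most similar direction."""
    query_vec = embeddings[query]
    others = [text for text in embeddings if text != query]
    return max(others, key=lambda text: cosine_similarity(query_vec, embeddings[text]))

print(most_similar("great product", toy_embeddings))
```

Texts with similar meaning end up with vectors that point in similar directions, which is what lets a program group reviews by sentiment or sort documents into categories.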

 

Open Source Models

All of Mistral’s open source models are available for free under Apache 2.0, a permissive license that allows anyone to use, modify and distribute them, including for commercial purposes, as long as they retain the license and attribution notices.

Mistral 7B

  • Designed for easy customization and fast deployment.
  • Can handle high volumes of data faster and with minimal computational cost.
  • Has about 7 billion parameters, but it outperforms Llama 2 (13 billion parameters) and matches models with up to 30 billion parameters.
  • Has a maximum context window of 32k tokens.
  • Can be used in English and code.

Mixtral 8x7B

  • Designed to perform well with minimal computational effort.
  • Uses a mixture of experts architecture; only uses about 12 billion of its potential 45 billion parameters for inference.
  • Outperforms both Llama 2 (70 billion parameters) and GPT-3.5 (175 billion parameters) on most benchmarks.
  • Has a maximum context window of 32k tokens.
  • Natively fluent in English, French, Spanish, German and Italian, as well as code.

Mixtral 8x22B

  • The most advanced of Mistral AI’s open source models.
  • Ideal for tasks like summarizing large documents or generating lots of text. 
  • A bigger version of Mixtral 8x7B; only uses about 39 billion of its potential 141 billion parameters for inference.
  • Outperforms Llama 2 70B and Cohere’s Command R and R+ in cost-performance ratio.
  • Has a maximum context window of 64k tokens.
  • Natively fluent in English, French, Spanish, German and Italian, as well as code.

 

Le Chat 

In addition to its LLMs, Mistral AI offers Le Chat, an AI chatbot that can generate content and carry on conversations with users — similar to platforms like ChatGPT, Gemini and Claude. Mistral AI also allows users to choose which of its models they want operating under the hood — Mistral Large for better reasoning, Mistral Small for speed and cost-effectiveness or Mistral Next, a prototype model that is designed to give brief and concise answers.

Le Chat does not have real-time access to the internet, though, so its answers may not always be up to date. And like any generative AI tool, it can produce biased responses and get things wrong. But Mistral says it is working to make its models as “useful and as little opinionated as possible.”

Le Chat is free and can be accessed at chat.mistral.ai/chat. The company is also developing a paid version for its enterprise clients.


 

What Are Mistral AI’s Models Used For?

All of Mistral AI’s LLMs are foundation models, which means they can be fine-tuned and used in a wide range of natural language processing tasks, such as:

  • Chatbots: Enabling chatbots to understand natural language queries from users and respond in a more accurate and human-like way. 
  • Text Summarization: Extracting the essence of articles and documents, summarizing their key points in a concise overview.
  • Content Creation: Generating natural language text, including emails, social media copy, short stories, cover letters and much more.
  • Text Classification: Classifying text into different categories, such as flagging emails as spam or non-spam based on their content.
  • Code Completion: Generating code snippets, optimizing existing code and suggesting bug fixes to speed up the development process.

 

How to Use Mistral AI’s Models

All of Mistral AI’s models can be found on its website. They are also available on platforms like Amazon Bedrock, Databricks, Snowflake Cortex and Azure AI.

To use the models directly on Mistral AI’s website, go to La Plateforme, its AI development and deployment platform. There, you can set up guardrails and fine-tune the models to your specifications, then integrate them into your own applications and projects. Pricing varies depending on the model you use. For example, Mistral 7B costs $0.25 per 1 million input tokens, while Mistral Large costs $24 per 1 million output tokens.

You can also interact with Mistral’s Large and Small models via Le Chat, the company’s free AI chatbot.

 

Is Mistral AI Better Than GPT-4?

Mistral AI’s most advanced LLM, Mistral Large, is the most comparable to GPT-4. Still, GPT-4 scored higher than Mistral Large across all performance benchmarks, indicating that it is superior in a range of NLP tasks, as well as mathematics, history, computer science and general common sense.

Mistral Large is cheaper to use than GPT-4, though. GPT-4 costs $30 per 1 million input tokens and $60 per 1 million output tokens, while Large costs $8 per 1 million input tokens and $24 per 1 million output tokens. Given that Large lost to GPT-4 on those performance benchmarks by only a few percentage points, it could be a suitable choice for organizations looking for a high-performing LLM at a lower cost.
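
To see what that price gap means in practice, here is a rough sketch that estimates the bill for the same workload on each model. It uses the prices quoted in this article, all taken as dollars per 1 million tokens. The dictionary keys, the function name and the example workload are made up for illustration, and actual prices may have changed since publication.

```python
# Prices in US dollars per 1 million tokens, as quoted in this article.
# The keys are plain labels for this example, not official API model names.
PRICES = {
    "gpt-4": {"input": 30.0, "output": 60.0},
    "mistral-large": {"input": 8.0, "output": 24.0},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in dollars for one workload."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000

# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
for model in PRICES:
    print(model, estimate_cost(model, 2_000_000, 500_000))
```

For this example workload, the estimate comes to $90 on GPT-4 and $28 on Mistral Large, or roughly a third of the cost.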

Looking ahead, this cost-performance ratio will likely get even better as more models enter the market.

“Competition is always good for users. There’s a lot of innovation coming every day,” said Baris Gultekin, head of AI at Snowflake, which partners with Mistral. “That pushes down the costs for customers, as well as improves the performance. And I expect that to continue.”


 

How Do Mistral AI’s Models Work?

Like other large language models, Mistral AI’s models are trained on a massive corpus of text data scraped from the internet, which they can then use for all kinds of natural language processing (NLP) tasks. But several of Mistral’s models have key features that set them apart from the crowd.
 

Mixture of Experts Architecture

Mistral’s models are based on a transformer architecture, a type of neural network that generates text by predicting the next-most-likely word or phrase. But a couple of them (Mixtral 8x7B and 8x22B) take it a step further and use a mixture of experts (MoE) architecture, meaning they contain multiple smaller sub-networks (called “experts”), only some of which are active for any given input, thus improving performance and reducing computational costs.

While they tend to use fewer active parameters and cost less to run than comparable dense transformer models, LLMs that use MoE architectures perform equally well or even better, according to Gultekin, making them an attractive alternative. “When an LLM is faster and smaller to run, it’s also more cost effective,” he told Built In. “And that’s appealing.”
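
The routing idea behind a mixture of experts can be sketched in a few lines. This is a toy illustration, not Mistral's implementation: the eight experts are trivial functions that multiply a number, the router scores are written by hand rather than learned, and the choice of two active experts out of eight is an assumption made for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(x, experts, router_scores, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize over the chosen experts only; the others are never run.
    weights = softmax([router_scores[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen

# Eight toy "experts": expert number k multiplies its input by k + 1.
experts = [lambda x, k=k: x * (k + 1) for k in range(8)]

# Hand-written router scores; a real router computes these from the input.
example_scores = [0.0, 3.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]

output, chosen = moe_layer(2.0, experts, example_scores)
print(sorted(chosen), output)
```

Only the two selected experts do any work for this input, which is why such a model can hold many parameters in total while using a fraction of them for each prediction.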

 

Open Source

Many of Mistral AI’s models are open source, meaning their code and data — as well as their weights, or parameters learned during training — are freely available for anyone to access, use and modify. With open source models, users can see how they work and adapt them for their own purposes, said Atul Deo, the general manager of Amazon Bedrock, which also partners with Mistral. 

“You can add your own inference optimizations on top of the open model, you can do certain types of fine-tuning that you can’t do with a proprietary model, because a lot of the details are transparent,” Deo told Built In. “Open source models, in theory, give [you] a lot more flexibility to tinker with the model.”

The fact that some of Mistral AI’s models are open source is especially useful for companies in highly regulated industries like banking and healthcare, where data privacy and governance are crucial, said Erika Bahr, founder and CEO of AI company Daxe. With open source LLMs, these companies can fine-tune the models and run them locally in a secure environment without the threat of information leaking.

“To get the highest level of security standards, you have to be able to see where the data goes,” Bahr told Built In. “If you are able to see all of the code, then you can actually verify where your data is going when it goes through the model.”

 

Function Calling Capabilities

Mistral says its Large, Small and Mixtral 8x22B models have native function calling capabilities, meaning they can integrate with other platforms and perform tasks beyond their original capabilities. This helps make the models more accurate, efficient and versatile.

“It allows you to do fine-tuning underneath another system’s platform,” Bahr said. “So you can actually leverage what they’ve done and then you can fine-tune it even deeper.” For example, Bahr said she went to a hackathon event where the winner integrated a Mistral LLM into a Pac-Man game, and then fine-tuned it to do certain moves so that it won the game.

In general, function calling is also useful for tasks like retrieving data in real-time, performing calculations and accessing databases.
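
In broad terms, function calling works like this: the model replies with a structured request naming a function and its arguments, and the surrounding application runs that function and passes the result back. The sketch below simulates only the application side. The JSON shape, the tool names and the canned weather data are all invented for illustration and are not Mistral's actual request format.

```python
import json

def get_weather(city):
    """Hypothetical tool; a real one would query a live weather service."""
    canned_data = {"Paris": "18C and cloudy"}
    return canned_data.get(city, "unknown")

def add(a, b):
    """Hypothetical tool that performs a calculation for the model."""
    return a + b

# Registry of the functions the application is willing to run.
TOOLS = {"get_weather": get_weather, "add": add}

def run_tool_call(model_output):
    """Parse a JSON tool request and run the matching Python function."""
    request = json.loads(model_output)
    name = request["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**request["arguments"])

# Stand-in for what a model might return when it wants a calculation done.
simulated_model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(run_tool_call(simulated_model_output))
```

Keeping an explicit registry of allowed functions means the model can only trigger actions the developer has chosen to expose.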

 

Multilingual

While many LLMs are only proficient in a single language, most of Mistral’s models are natively fluent in English, French, Spanish, German and Italian — meaning they have a more “nuanced understanding” of both grammar and cultural context, according to the company. So they can be used for complex multilingual reasoning tasks, including text understanding and translation.

Frequently Asked Questions

Mistral AI is a French artificial intelligence startup that makes commercial and open source large language models (LLMs). The company was launched in 2023 by former researchers at Meta and DeepMind. It is known for its transparent, portable, customizable and cost-effective foundation models that require fewer computational resources than those of other AI vendors.

To use Mistral AI’s models, you can go to its website, or to La Plateforme, the company’s platform for AI development and deployment. You can set up guardrails and fine-tune the models to your specifications, then integrate them into your own applications and projects. You can also interact with Mistral’s Large and Small models via Le Chat, an AI chatbot that can generate text and carry on conversations.

According to Mistral AI, GPT-4 scored higher than Mistral Large across all performance benchmarks, indicating that it is the superior model. But Large is cheaper to run than GPT-4. Given that Large lost to GPT-4 on those performance benchmarks by only a few percentage points, it could be a suitable choice for organizations looking for a high-performing LLM at a lower cost.

Mistral AI was created by former DeepMind employee Arthur Mensch and former Meta researchers Guillaume Lample and Timothée Lacroix. The company was launched in April 2023.

Mistral AI did not respond to requests to be interviewed for this story.
