Amazon Nova is a family of AI foundation models designed by Amazon to help make generative AI development faster and more cost efficient.
Like other AI models, Amazon Nova enables users to analyze documents, diagrams and videos, create both written and visual content, and even build AI agents tailored to their specific needs. The suite features four “understanding” models that have each been optimized for different speeds, capabilities and operational costs — and most of them are multimodal, meaning they accept text, image and video inputs and can generate text outputs. It also includes two “creative content generation” models dedicated to image and video generation.
What Is Amazon Nova?
Amazon Nova is a set of foundation models developed by Amazon. The suite includes one text-only model, three multimodal models and two content generation models focused on creating new images and videos.
Amazon unveiled Amazon Nova at its AWS re:Invent conference in December of 2024, marking a significant expansion of its broader artificial intelligence strategy. Up to this point, the company’s approach to AI had primarily been centered on providing cloud computing services with its Amazon Web Services (AWS) platform, as well as access to third-party foundation models through its Amazon Bedrock library. With the release of its own models, Amazon is entering into direct competition with the likes of OpenAI, Google, Meta and Anthropic — providing users with yet another option for building their AI products.
Three of Amazon Nova’s “understanding” models are now available on Amazon Bedrock, along with the image and video generation models. The fourth “understanding” model, as well as “speech-to-speech” and “native multimodal-to-multimodal” models, will be rolled out in 2025, Amazon said in a press release.
Amazon Nova Models
Amazon Nova includes four “understanding” models, all of which can be customized and fine-tuned with additional data:
- Amazon Nova Micro can only receive and generate text, but it delivers the lowest latency of the bunch, processing inputs and generating responses the fastest. Amazon says the model is “highly performant” at language understanding, translation, reasoning, code completion, brainstorming and mathematical problem solving. And with a generation rate of more than 200 output tokens per second, it is ideal for tasks that require quick responses.
- Amazon Nova Lite is a “very low-cost” multimodal model that can take in text, images and video to generate text. It can handle inputs up to 300,000 tokens in length, as well as analyze multiple images or up to 30 minutes of video in a single request, according to Amazon, making it particularly well-suited for tasks like customer interactions, document analysis and visual question-answering.
- Amazon Nova Pro is what Amazon calls its “highly capable” multimodal model, offering the optimal balance of accuracy, speed and cost for a wide range of tasks. These include video summarization, financial document analysis, mathematical reasoning, software development and the creation of AI agents that can perform complex, multi-step workflows.
- Amazon Nova Premier is the company’s “most capable” multimodal model, designed for complex reasoning tasks and as an advanced tool for “distilling” custom models. Currently in training, the model is slated for release in early 2025, according to Amazon.
Amazon Nova also includes two “creative content generation” models, both of which include “built-in safety controls” like watermarking to “support safe and responsible” AI use.
- Amazon Nova Canvas can produce “studio-quality” images from text or image prompts, according to the company. The model also has features that make it easy to edit existing images using text inputs, including controls for adjusting color scheme and layout.
- Amazon Nova Reels can create videos up to six seconds long using text inputs and reference images. Users can adjust their video’s visual style, pacing and camera movement, including pans, 360-degree rotations and zooms — and all with natural written language. To illustrate the model’s capabilities, Amazon shared a mock advertisement for a fictional pasta brand.
In 2025, Amazon plans to release a “speech-to-speech” model that will be able to understand natural language speech — interpreting verbal and non-verbal cues (like tone and cadence) — and provide human-like, conversational interactions with low latency. The company also says it will release a model capable of taking in text, images, audio and video as input, and then generating outputs in any of those modalities — referring to its capabilities as “native multimodal-to-multimodal” or “any-to-any.”
What Is a Foundation Model?
A foundation model is a large, general-purpose AI model that serves as the base — or “foundation” — of an artificial intelligence system. Trained on large quantities of diverse datasets, foundation models can perform a wide range of tasks involving natural language processing, computer vision and generative AI, and can be fine-tuned for specific applications.
What Can Amazon Nova Do?
Amazon Nova’s suite of models can do just about everything most other advanced foundation models can. These are some of the main capabilities Amazon has highlighted:
Integration With Amazon Bedrock
All of the Amazon Nova models are seamlessly integrated with Amazon Bedrock. This means that users can choose which models they want to build their AI products with, and then customize those models using the proprietary data they have stored on the AWS platform. Every stage of AI development — model selection, customization, training, deployment and scaling — can happen in one centralized place, with tools supplied by Amazon.
Multimodal Reasoning
Amazon Nova Lite and Pro both have multimodal reasoning capabilities, meaning they can understand and draw conclusions from text, image and video inputs to produce text outputs. This allows them to generate insights from charts and graphs, summarize video content, create text-based descriptions of drawings and photos, and perform various other tasks.
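The Converse API on Amazon Bedrock lets a single user message mix image and text content. As a minimal sketch — the helper function is illustrative, and the model ID shown is an assumption that should be confirmed in the Bedrock console — a multimodal request for one of the Nova models could be assembled like this:

```python
# Nova model IDs on Bedrock generally take the form "amazon.nova-<tier>-v1:0";
# treat the exact ID as an assumption and verify it in the Bedrock console.
NOVA_LITE = "amazon.nova-lite-v1:0"

def build_image_question(question: str, image_bytes: bytes, fmt: str = "png") -> dict:
    """Build a Converse-style request pairing an image with a text question."""
    return {
        "modelId": NOVA_LITE,
        "messages": [
            {
                "role": "user",
                "content": [
                    # The image goes in first as raw bytes, then the question.
                    {"image": {"format": fmt, "source": {"bytes": image_bytes}}},
                    {"text": question},
                ],
            }
        ],
    }

request = build_image_question("What trend does this chart show?", b"\x89PNG...")
print(request["messages"][0]["content"][1]["text"])
```

The same request shape works for text-only prompts by dropping the image entry, which is why the Converse API is a convenient single surface for both unimodal and multimodal calls.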
Multilingual Comprehension
Amazon Nova Micro, Lite and Pro were trained in more than 200 languages, and are especially proficient in English, Spanish, German, French, Italian, Portuguese, Dutch, Turkish, Russian, Japanese, Korean, simplified Chinese, Hindi, Arabic and Hebrew. That means these models can process and understand written and visual content in all of these languages, enabling them to perform tasks like summarization, translation, content classification and visual question-answering.
Meanwhile, Amazon Nova Canvas and Reels were primarily trained in English, so any text prompts used to create and edit their AI-generated images and videos must be written in English.
Custom Fine-Tuning
Amazon Nova Micro, Lite and Pro all support custom fine-tuning, allowing customers to train the models using their own proprietary data, including text, images and videos. The process helps to ensure that the models are better suited for the specific needs of the user, whether that be a customer service chatbot that knows a specific company’s products or a diagnostic tool for analyzing medical images.
Amazon says users will also be able to fine-tune Amazon Nova Canvas and Reels “soon.”
Knowledge Distillation
In addition to fine-tuning, Amazon Nova Micro, Lite and Pro support “distillation,” a process where knowledge from a larger, more powerful model is transferred to a smaller, more energy-efficient one. This makes the smaller model more accurate, while also improving speed and reducing operational costs for the user.
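At its core, distillation trains the small “student” model to match the output distribution of the large “teacher” model, typically with a temperature that softens the teacher’s probabilities so more of its relative preferences carry over. A minimal sketch of that loss — not Amazon’s implementation, just the standard technique — in plain Python:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperatures spread probability mass out."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened targets; minimized when the two distributions match."""
    teacher_p = softmax(teacher_logits, temperature)
    student_p = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_p, student_p))

# A student that agrees with the teacher incurs a lower loss than one that inverts it.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [4.0, 1.0, 0.5]) < distillation_loss(teacher, [0.5, 1.0, 4.0]))  # True
```

In practice the student is trained by gradient descent to drive this loss down across a large corpus, which is how the smaller model inherits the larger one’s accuracy at a fraction of the inference cost.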
Retrieval-Augmented Generation (RAG)
Amazon says Amazon Nova Micro, Lite and Pro “excel” at retrieval-augmented generation (RAG), a technique that involves retrieving information from external sources — such as databases or websites — and incorporating it into the generated responses of AI models. This helps the models to provide more contextually relevant and up-to-date information, without having to rely solely on the data they were trained on.
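The RAG pattern itself is model-agnostic: retrieve the passages most relevant to the query, then prepend them to the prompt so the model grounds its answer in them. A toy sketch — keyword overlap stands in for the vector search a production system would use, and the documents are invented examples:

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (a stand-in
    for the embedding-based similarity search a real RAG system would use)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved passages so the model can ground its answer in them."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Use only the context below to answer.\nContext:\n{context}\nQuestion: {query}"

docs = [
    "Amazon Nova Micro is a text-only model with low latency.",
    "Amazon Nova Canvas generates studio-quality images.",
    "Returns are accepted within 30 days of purchase.",
]
print(build_rag_prompt("Which Nova model is text-only?", docs))
```

Because the grounding documents are fetched at query time, the model can answer with information that postdates its training data — the key advantage the section above describes.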
Agentic Workflows
Amazon Nova Micro, Lite and Pro can all perform “agentic workflows,” as Amazon put it. This means they can act as foundation models for AI agents, enabling them to break down complex tasks into actionable steps and execute those steps by leveraging external services like language models, APIs and databases.
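Stripped to its essentials, an agentic workflow is a loop: the model produces a plan of steps, and a runtime dispatches each step to the matching external tool. The sketch below hard-codes the plan and uses hypothetical tools; in a real system the model would generate the plan and the tools would wrap live APIs or databases:

```python
# Hypothetical tools an agent might call; real ones would wrap live services.
def search_flights(destination: str) -> str:
    return f"3 flights found to {destination}"

def book_flight(flight: str) -> str:
    return f"booked {flight}"

TOOLS = {"search_flights": search_flights, "book_flight": book_flight}

def run_agent(plan: list[tuple[str, str]]) -> list[str]:
    """Execute a plan (hard-coded here; produced by the model in practice)
    by dispatching each (tool_name, argument) step to the matching tool."""
    results = []
    for tool_name, argument in plan:
        tool = TOOLS.get(tool_name)
        if tool is None:
            results.append(f"unknown tool: {tool_name}")
            continue
        results.append(tool(argument))
    return results

print(run_agent([("search_flights", "Tokyo"), ("book_flight", "NH107")]))
```

The unknown-tool branch matters in practice: an agent runtime has to handle steps the model invents but the system cannot execute, rather than failing outright.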
AI agents are considered by many to be the next frontier in generative AI, shifting the industry from knowledge-based tools like chatbots and content generators to action-based systems capable of planning and executing tasks independently, with little to no human oversight. Although they are not fully autonomous yet, AI agents are on the verge of becoming a new kind of skilled virtual coworker, seamlessly working alongside humans in industries ranging from customer service to software engineering.
How Does Amazon Nova Compare to Other Models?
In its announcement, Amazon compared Amazon Nova Micro, Lite and Pro to other models of similar sizes and capabilities, including those in the GPT4o, Gemini, Llama 3.1 and Claude model families. The comparison evaluated factors like text and visual intelligence, agentic workflows, speed and cost.
Text Intelligence
To assess text intelligence, Amazon tested its models using established industry benchmarks for evaluating tasks like language understanding, general reasoning, mathematics, Python code generation, instruction following and language translation.
Amazon Nova Micro and Lite outperformed competitors of similar size and capabilities across the board. Amazon Nova Pro achieved the highest overall scores of the three, but scored lower than Claude 3.5 Sonnet on nearly all of the tests. It also lost to GPT-4o and Gemini 1.5 Pro in areas like common sense reasoning and deep reasoning.
Visual Intelligence
Amazon measured the visual intelligence of Amazon Nova Lite and Pro with standard benchmarks for evaluating tasks like image and document understanding, video captioning, and video question-answering.
Both models outperformed most competitors in their respective classes — except in visual reasoning, where Amazon Nova Lite scored lower than GPT-4o mini and Gemini 1.5 Flash, and Amazon Nova Pro trailed Claude 3.5 Sonnet, GPT-4o and Gemini 1.5 Pro. Amazon Nova Pro also scored lower than Claude 3.5 Sonnet in image and document understanding.
Agentic Workflows
To evaluate their agentic capabilities, Amazon tested all three models using benchmarks that measured their retrieval-augmented generation and API orchestration capabilities. The company also tested Amazon Nova Lite and Pro on their ability to interact with and extract information from web browsers, as well as their ability to process and understand multiple modalities.
Amazon Nova Micro outperformed both Gemini 1.5 Flash and Llama 3.1 8B on RAG and API orchestration. Amazon Nova Lite and Pro scored higher than most competitors in their respective classes — except in RAG. Amazon Nova Lite lost to GPT-4o mini and Amazon Nova Pro lost to both Claude 3.5 Sonnet and GPT-4o. Amazon Nova Pro also scored marginally lower than GPT-4o in API orchestration.
Speed and Cost
Amazon says all of its models are faster and “at least 75 percent less expensive” than the best performing models in their respective intelligence classes in Amazon Bedrock (which does not host Google’s Gemini or OpenAI’s GPT models, but does host models made by AI21, Anthropic, Cohere, Meta, Mistral AI and Stability AI).
How To Access Amazon Nova
Amazon Nova Micro, Lite and Pro are all available on Amazon Bedrock, along with Amazon Nova Canvas and Reels. Accessing these models is fairly easy once you’ve set up an AWS account:
- Log in to your AWS Management Console.
- Navigate to Amazon Bedrock via the search bar.
- Make sure you are in the U.S. East (N. Virginia) (us-east-1) region. To change regions, choose the Region name at the top right of the console. Then select “U.S. East (N. Virginia) (us-east-1).”
- On the Amazon Bedrock dashboard, navigate to the Model Access section and choose “modify model access.”
- To request access to Amazon Nova models specifically, choose “enable specific models” and then select all the models you want access to (Nova Micro, Nova Lite and Nova Pro). You can also select all of them at once by selecting “Amazon” as the provider.
- Make sure your account has the necessary permissions to access these models. This may involve creating or assigning IAM (identity and access management) roles.
- When you’re ready, choose “submit” to request access.
- Access requests may take several minutes to process. Once access to a model has been granted, its status will be listed as “access granted.”
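Once access is granted, the models can also be called from code through the Bedrock Runtime’s Converse API. The sketch below builds the request arguments as a plain dictionary and shows the boto3 call in comments, since running it requires live AWS credentials; the helper function and model ID are assumptions to verify against the Bedrock console:

```python
def build_nova_request(prompt: str, model_id: str = "amazon.nova-micro-v1:0") -> dict:
    """Assemble keyword arguments for bedrock-runtime's converse() call.
    The default model ID is an assumption; confirm it in the Bedrock console."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.5},
    }

kwargs = build_nova_request("Summarize the AWS shared responsibility model.")

# With AWS credentials configured, the call itself would look like:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.converse(**kwargs)
# print(response["output"]["message"]["content"][0]["text"])
print(kwargs["modelId"])
```

Note the region: as in the console steps above, the Nova models are served from us-east-1, so the runtime client should be created in that region.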
Frequently Asked Questions
Is Amazon Nova available?
Yes — Amazon Nova Micro, Lite and Pro are all available on Amazon Bedrock, along with Amazon Nova Canvas and Reels. But Amazon Nova Premier will not be available until 2025.
Is Amazon Nova free?
No — all of the Amazon Nova models are only available on Amazon Bedrock, which follows a pay-as-you-go pricing model.
What is a foundation model?
A foundation model is a large neural network architecture trained on massive amounts of diverse data to perform tasks like natural language processing, computer vision and generative AI. The size and versatility of foundation models make them effective building blocks for AI development, as they can be fine-tuned and customized for specific applications.