What Is Nvidia’s Chat With RTX?

Nvidia’s new generative AI tool can run locally on your PC — and it’s free.

Written by Ellen Glover
Published on May 02, 2024

Chat with RTX is a free demo app created by tech giant Nvidia that lets users run a personalized AI chatbot locally on their PC. Also referred to as ChatRTX, the software interacts directly with a user’s documents and files, allowing it to produce more relevant responses than other chatbots. Since it all happens locally on the user’s device, their data is more secure. 

“It’s like having your own mini version of ChatGPT on your desktop,” Greg Asher, senior VP of technology at tech services company Feuji, told Built In. “It can actually use your local hardware and local information without revealing that data to anyone else.”

Nvidia is one of the most important companies in the artificial intelligence industry. Its AI chips power the majority of the world’s AI data centers and are used in everything from self-driving cars to medical imaging equipment. Chat with RTX serves as a showcase for the company’s latest capabilities.


What Is Nvidia’s Chat With RTX?

Released by Nvidia in February 2024, Chat with RTX is a generative AI tool that lets users connect a large language model (LLM) to their own documents and files and run the resulting chatbot locally on their PC. It is free to download, but requires a Windows 11 PC equipped with an Nvidia GeForce RTX 30 or 40 series GPU and at least 16 GB of RAM.

To get started, users point Chat with RTX at specific folders on their computer that contain supported file formats (.txt, .pdf, .docx, .xml). The tool then scans those files so it can provide quick, straightforward summaries. That allows users to simply type natural language queries into Chat with RTX, rather than painstakingly searching through folders and documents the old-fashioned way. As a result, they can “interrogate and interpret” their data more efficiently, tech consultant Kieran Gilmurray told Built In. “You type questions in and, hopefully, get clever answers.”

That means you could, for example, ask Chat with RTX to show you all the restaurant recommendations your friend has emailed you in the past year, or have it summarize key points from a folder of lecture notes. All of its responses come with a source attribution, allowing for quick verification.
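The folder-scanning step described above can be pictured with a short Python sketch. This is not Nvidia’s code: the function name find_supported_files is hypothetical, and searching subfolders recursively is an assumption made for illustration. Only the list of file extensions comes from the article.

```python
from pathlib import Path

# File types the article lists as supported by Chat with RTX.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".docx", ".xml"}


def find_supported_files(folder):
    """Return a sorted list of files under `folder` with a supported extension.

    Hypothetical helper for illustration; recursing into subfolders is an
    assumption, not documented Chat with RTX behavior.
    """
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling find_supported_files on a notes folder would return only the documents a tool like this could read, skipping images, spreadsheets and other unsupported files.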

Soon, Chat with RTX will be able to support voice inputs as well, according to Nvidia, meaning it will be able to understand spoken language and provide text responses. The company says users will also be able to have Chat with RTX find photos in their library through the words and phrases in their text queries.



How Does Chat With RTX Work?

Chat with RTX is powered by the TensorRT-LLM software, an open-source library that accelerates and optimizes the performance of LLMs on the Nvidia platform. At the moment, the bot can leverage either Mistral AI’s Mistral 7B model or Meta’s Llama 2 model to search through local files and answer questions about them. Nvidia says users will soon be able to access Google’s Gemma LLM, too.

These LLMs are able to produce more reliable and personalized responses thanks to a process called retrieval-augmented generation (RAG), which connects the models to additional knowledge sources (such as enterprise data) so they can retrieve the relevant information and incorporate it into their generated responses.

“It enables an LLM to act like a librarian, searching through vast collections of information, often on the user’s device, to find the most relevant source of information to answer a user’s query,” Jesse Clayton, Nvidia’s product director for RTX AI, told Built In. “It gives the LLM a known datasource from which to draw to answer questions, which tends to avoid hallucinations for questions that can be answered from that dataset.” 
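To make the “librarian” idea concrete, here is a minimal Python sketch of the retrieval half of RAG. It is an illustration under stated assumptions, not how Chat with RTX is implemented: simple word counts stand in for the learned embeddings a real system would use, the function names (retrieve, build_prompt) and the sample documents are hypothetical, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k (source, text) pairs most similar to the query.

    `documents` maps a source name to its text. Word counts are a stand-in
    for real embeddings (an assumption made to keep the sketch small).
    """
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(text))), name, text)
        for name, text in documents.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, text) for _, name, text in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Build the text an LLM would receive: retrieved context plus the question."""
    passages = retrieve(query, documents, top_k)
    context = "\n".join(f"[{name}] {text}" for name, text in passages)
    return (
        "Answer using only the context below and cite the source in brackets.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical sample files, for illustration only.
documents = {
    "notes.txt": "The lecture covered photosynthesis and how plants convert light into energy.",
    "email.txt": "Try the new ramen restaurant downtown, the broth is excellent.",
}
prompt = build_prompt("Which restaurant did my friend recommend?", documents)
```

Because the prompt carries the name of the file each passage came from, the model can cite its source, which mirrors the source attribution the article describes.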

By combining RAG with Nvidia’s TensorRT-LLM software and RTX GPUs, Chat with RTX can run directly on users’ devices without having to transfer data to and from the cloud — setting it apart from ChatGPT, Gemini, Claude and virtually every other popular chatbot out there.

Plus, processing data on-device rather than in the cloud helps Chat with RTX generate responses faster, Asher explained. “You improve that latency by shortening the distance from when the inference is requested.”



A demo of Chat with RTX. | Video: Nvidia GeForce

Chat With RTX Is Still a Work in Progress

While Chat with RTX is impressive in many ways, it is still just a tech demo. “It’s a proof of concept, not a commercial application,” Michael O’Neill, senior director of technology at Feuji, told Built In.

Chat with RTX is only compatible with the latest Nvidia software and hardware, which may be prohibitive for some users. Its large size (35 GB) can also make installation difficult for those with limited storage space. And the installation process can be complex and brittle due to its multiple layered dependencies (Python, CUDA, TensorRT, etc.), potentially causing issues for less technically inclined users.

Even once it is installed, Chat with RTX can still produce false information, so answers to critical questions should always be verified. And, as with all generative AI tools, the quality of its outputs largely depends on the quality of the data it draws on. So if your PC is full of out-of-date, incomplete or duplicated data, the bot will likely reflect those errors, O’Neill said. “If you put garbage in, you’re going to get garbage out.”

Looking ahead, Clayton said Nvidia will continue to refine and improve Chat with RTX.

Frequently Asked Questions

What is Chat with RTX?

Chat with RTX (ChatRTX) is a tech demo developed by Nvidia that enables users to run an AI chatbot locally on their PC using their own documents and files. The app is free to download, but requires a Windows 11 operating system equipped with Nvidia’s latest software and hardware.

Is Chat with RTX free?

Yes, Chat with RTX is free to download.

How do you get Chat with RTX?

Chat with RTX is available on Nvidia’s website and GitHub. To install it, you will need a Windows 11 operating system, an Nvidia GeForce RTX 30 or 40 series GPU and at least 16 GB of RAM.

What are the limitations of Chat with RTX?

Chat with RTX is only compatible with the latest Nvidia software and hardware, which may be restrictive for some. It is also large (35 GB), which can make installation difficult for those with limited storage space.
