Nvidia Targets Generative AI for Robotics with New Frameworks

At ROSCon 2023 this week, the release of the Metropolis and Isaac ROS 2.0 frameworks makes AI-powered robotics more practical and easier to develop.

News October 19, 2023 by Aaron Carman

This week, ROSCon 2023 is running Oct 18-20 in New Orleans. This annual event focuses on developers using the popular ROS (Robot Operating System) in their system designs.

Aiming to bring new AI capabilities to edge applications, Nvidia announced major expansions to two Nvidia Jetson frameworks at ROSCon yesterday. These expansions, targeting the Isaac ROS and Metropolis frameworks, will give designers new tools for applying the latest AI developments in edge applications where doing so would normally be impractical.

As edge AI becomes more feasible, Nvidia’s latest expansions could allow designers to rapidly simulate and evaluate models for robotics and IoT applications.

As interest in artificial intelligence has spiked even outside the engineering community with the emergence of models such as ChatGPT, many designers are looking for ways of leveraging generative AI to improve existing performance and create new applications.

In addition, robotics designers have considerable interest in deploying AI-enabled systems but need a way to simulate and evaluate perception scenarios to get around data scarcity. Nvidia’s latest frameworks aim to make developing high-performance AI systems easier on both fronts.

To highlight the most important information in the Nvidia announcement, this article covers the key new features of both the Metropolis and Isaac ROS frameworks. We’ll also take a closer look at some of the targeted applications of each expansion to see how the software may make AI a much more common solution.

Metropolis: Generative AI for Robotics

As designers look to the future of automation, AI techniques are poised to play a critical role in building intuitive systems that outperform human capabilities. Traditionally, convolutional neural networks (CNNs) have formed the foundation of robotic vision systems, but their limitations leave much to be desired for next-generation systems.

Generative AI models like ChatGPT offer many advantages, such as zero-shot learning, where unseen stimuli can still be characterized without retraining the entire model. In addition, generative AI models allow considerably easier interaction through intuitive language interfaces, lowering the learning curve for final systems.

Generative AI models for robotics allow for easier interaction with the model and better performance compared to CNN counterparts.

In a press briefing on Nvidia’s announcements, Nvidia VP and General Manager Deepu Talla discussed the benefits of generative AI versus traditional counterparts. “Because the generative AI network is based on large models, it is fairly generalizable,” said Talla. “So that leads to faster development cycles, higher accuracy, and also few-shot, one-shot, or even zero-shot learning in some cases.”
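The zero-shot idea Talla mentions can be illustrated with a minimal sketch. It assumes a pretrained model that maps images and text labels into a shared embedding space; the vectors, label names, and the `zero_shot_classify` helper below are hypothetical, hand-made stand-ins for illustration and are not part of any Nvidia tool. The point is only that a new label can be added without retraining anything.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# ASSUMPTION: toy 3-D vectors standing in for the output of a pretrained
# text encoder. A real system would compute these with a large model.
LABEL_EMBEDDINGS = {
    "forklift": [0.9, 0.1, 0.0],
    "pallet": [0.1, 0.9, 0.1],
    "person": [0.0, 0.2, 0.9],
}


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose embedding is closest to the image embedding."""
    scores = {
        label: cosine(image_embedding, vector)
        for label, vector in label_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# ASSUMPTION: a toy image embedding for an object the label set has not seen.
image = [0.55, 0.65, 0.45]

# With the original labels, the closest available class wins.
best_before, _ = zero_shot_classify(image, LABEL_EMBEDDINGS)

# Adding a new class only means adding one more label embedding;
# no weights are retrained.
extended = dict(LABEL_EMBEDDINGS)
extended["traffic cone"] = [0.6, 0.6, 0.5]
best_after, _ = zero_shot_classify(image, extended)

print(best_before, "->", best_after)
```

By contrast, a fixed-class CNN classifier would typically need new labeled data and another training run before it could recognize the added category.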

Nvidia has released several tools for developers looking to use generative AI, including the Nvidia Jetson Generative AI Lab. With it, developers can use turnkey solutions as a foundation for generative AI models and then build on them with their own innovations to create new applications.

Combined with the Metropolis expansion, which provides designers with more tools, APIs, and microservices for vision AI, these resources are expected to make implementing generative AI models considerably easier.

Isaac ROS: Perception and Simulation

Another pitfall for robotics engineers looking at AI models is the relative scarcity of data. Without enough data, it is difficult to train and evaluate an effective model. To address this limitation, Nvidia is releasing an expansion to its Isaac ROS framework that adds simulation and perception tools for evaluating models without requiring experimental data.

In addition to simulation software, the Isaac ROS framework is also expanding to include support for more packages to speed development, training, and processing times. Included in Isaac ROS 2.0 is support for Native ROS 2 Humble, NITROS ROS, CUDA NITROS, and more to increase the accuracy and performance of AI models. These releases complement the Omniverse Replicator, allowing designers to synthesize datasets for model evaluation.

The Isaac ROS 2.0 and Isaac Sim frameworks allow robotics designers to synthesize data to test computer vision models, easing the problem of data scarcity and speeding development.
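The general workflow of evaluating a perception model on synthesized data can be sketched in a few lines. The example below is a toy stand-in that does not use the Omniverse Replicator, Isaac Sim, or Isaac ROS APIs: `make_scene`, `detect`, and `evaluate` are hypothetical helpers that render tiny grayscale images with a known ground-truth box, run a simple threshold detector, and score it with intersection over union (IoU).

```python
import random


def make_scene(rng, size=16, box=4, noise=0.2):
    """Render a small grayscale image containing one bright square.

    Returns the image and its ground-truth box (x0, y0, x1, y1), which is
    known exactly because the scene is synthetic.
    """
    x0 = rng.randrange(0, size - box + 1)
    y0 = rng.randrange(0, size - box + 1)
    image = [[rng.uniform(0.0, noise) for _ in range(size)] for _ in range(size)]
    for y in range(y0, y0 + box):
        for x in range(x0, x0 + box):
            image[y][x] = 1.0 - rng.uniform(0.0, noise)
    return image, (x0, y0, x0 + box, y0 + box)


def detect(image, threshold=0.5):
    """Toy detector: bounding box around every pixel above the threshold."""
    hits = [
        (x, y)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value > threshold
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    width = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def evaluate(n_scenes=100, noise=0.2, seed=0):
    """Mean IoU of the toy detector over freshly synthesized scenes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_scenes):
        image, truth = make_scene(rng, noise=noise)
        predicted = detect(image)
        total += iou(predicted, truth) if predicted else 0.0
    return total / n_scenes


print("clean scenes:", evaluate(noise=0.2))
print("noisy scenes:", evaluate(noise=0.7))
```

Because every scene is generated with its label attached, the harder condition (here, heavier noise) can be tested on demand without collecting or annotating new real-world images.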

Talla elaborated on the scope and potential impacts of both the Isaac ROS 2.0 and Metropolis platforms. He also emphasized the significance of these efforts for Nvidia.

“To summarize, we believe this is the largest ever platform expansion that we are making. I believe that this release that we’re making here is greater than the sum of everything we’ve done in the last 10 years.”

The Future of AI

With the expansions to two Nvidia frameworks, designers now have many more tools to leverage when deploying AI models in edge devices. The Metropolis framework now makes it much easier to deploy generative AI models, while the Isaac ROS expansion gives designers more simulation and evaluation tools for deploying high-performance AI-enabled robotics.

With the cost of entry for AI systems falling, Talla expects that a revolution is imminent. “We have recently just crossed over 10,000 companies that have used the Jetson product, over 1,000 partners, and over a million developers,” he said.

“This is what we have done today. But what’s exciting now, in the last 12 months, generative AI has transformed text and natural language processing. Very soon, we will be at the tipping point of that same technology coming to computer vision.”

So, while it is unknown exactly which applications will benefit the most from better AI integration, giving designers more tools to develop AI models on edge hardware is a significant step, and may bring about a new era where interaction with artificial intelligence becomes commonplace.

All images used courtesy of Nvidia