Robotic helper making mistakes? Just nudge it in the right direction

Imagine a robot helping you wash dishes. You ask it to grab a soapy bowl out of the sink, but its gripper slightly overshoots the target.

With a new framework developed by researchers at MIT and NVIDIA, you could correct that robot’s behavior with simple interactions: point at the bowl, trace a path to it on a screen, or just nudge the robot’s arm in the right direction.

Unlike other methods for correcting robot behavior, this technique does not require users to collect new data and retrain the machine-learning model that powers the robot’s brain. Instead, it lets the robot use intuitive, real-time human feedback to choose a feasible sequence of actions that gets as close as possible to satisfying the user’s intent.

In the researchers’ trials, their framework achieved a success rate 21 percent higher than that of an alternative method that did not leverage human input.

Over time, this framework could empower a user to more effectively guide a factory-trained robot to undertake a multitude of household tasks, even if the robot has never encountered their residence or the items within it.

“We can’t expect non-experts to perform data collection and fine-tune a neural network model. Consumers will expect the robot to work right out of the box, and if it doesn’t, they will want an intuitive way to customize it. That is the challenge we tackled in this work,” says Felix Yanwei Wang, a graduate student in electrical engineering and computer science (EECS) and lead author of a paper on this method.

His co-authors include Lirui Wang PhD ’24 and Yilun Du PhD ’24; senior author Julie Shah, an MIT professor of aeronautics and astronautics and the leader of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL); as well as Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Perez-D’Arpino PhD ’19, and Dieter Fox of NVIDIA. The research will be presented at the International Conference on Robotics and Automation.

Reducing discrepancies

Recently, researchers have begun using pre-trained generative AI models to learn a “policy,” or set of rules, that a robot follows to complete a task. Generative models can solve a range of complex tasks.

During training, the model sees only feasible robot motions, so it learns to generate only valid trajectories for the robot to follow.

But valid trajectories don’t always match a user’s intent in the real world. A robot might have been trained to grab boxes off a shelf without knocking them over, yet fail to reach a box on top of someone’s bookshelf if that shelf is oriented differently from the ones it saw during training.

To address these shortcomings, engineers commonly gather data demonstrating the new task and re-train the generative model, a process that is both costly and labor-intensive and requires expertise in machine learning.

Instead, the MIT researchers aimed to enable users to direct the robot’s behavior during its operation when it makes an error.

However, human feedback meant to correct the robot could inadvertently push the generative model toward an invalid action: the arm might reach the desired box but knock books off the shelf along the way.

“We aim to allow the user to engage with the robot without introducing such mistakes, ensuring we achieve behavior that is far more aligned with user intent during deployment, while also being valid and practical,” Wang states.

Their framework achieves this by providing users with three intuitive methods to adjust the robot’s behavior, each presenting distinct advantages.

First, users can point to the object they wish the robot to handle via an interface displaying its camera perspective. Second, they can sketch a trajectory in that interface, enabling them to define how they prefer the robot to approach the object. Third, they can manually guide the robot’s arm in the direction they desire it to move.
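The article doesn’t specify how the interface represents these corrections internally; as a rough sketch, the three modes could be modeled as a small type hierarchy, where only the physical nudge carries full 3D information (all class and field names here are illustrative assumptions, not from the researchers’ code):

```python
from dataclasses import dataclass

@dataclass
class PointCorrection:
    """User clicked the target object in the camera view (2D pixel coordinates)."""
    u: float
    v: float

@dataclass
class TrajectoryCorrection:
    """User sketched a rough path on the display, as a list of (u, v) pixels."""
    waypoints: list

@dataclass
class NudgeCorrection:
    """User physically pushed the arm; recorded as a 3D end-effector offset."""
    dx: float
    dy: float
    dz: float

def intent_dimensionality(correction):
    """A nudge specifies intent in full 3D; screen-based input loses depth."""
    return 3 if isinstance(correction, NudgeCorrection) else 2
```

This mirrors the trade-off Wang describes next: the two screen-based corrections are convenient but project a 3D scene onto 2D, while the nudge preserves the full spatial intent.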

“When mapping a 2D image of the environment to actions within a 3D space, some information is inevitably lost. Physically nudging the robot is the most direct approach to specify user intent without losing any information,” remarks Wang.

Sampling for effectiveness

To ensure these interactions do not result in the robot selecting an inappropriate action, such as colliding with other objects, the researchers utilize a specific sampling technique. This method allows the model to choose an action from the collection of valid actions that best aligns with the user’s objectives.

“Instead of just imposing the user’s will, we give the robot an idea of what the user intends while letting the sampling process hover around its own set of learned behaviors,” Wang explains.
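The article doesn’t give the details of the sampling procedure. One simple way to realize “intent guides the choice, but every candidate comes from learned, valid behavior” is best-of-N sampling, sketched below with a toy Gaussian stand-in for the learned policy and a toy obstacle check standing in for validity (every specific here is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_policy_trajectories(n, horizon=5):
    """Stand-in for the learned generative policy: each sample is a
    short sequence of 3D end-effector positions near learned behavior."""
    return rng.normal(loc=0.0, scale=0.3, size=(n, horizon, 3))

def is_valid(traj, obstacle=np.array([0.5, 0.5, 0.5]), clearance=0.2):
    """Toy feasibility check: reject trajectories passing too close to
    a known obstacle (in the paper's setting, e.g. books on a shelf)."""
    return np.min(np.linalg.norm(traj - obstacle, axis=-1)) > clearance

def select_trajectory(user_target, n_samples=256):
    """Sample from the policy's own distribution, keep only valid
    candidates, and return the one whose endpoint lands closest to the
    user's indicated target. User intent guides the selection, but the
    robot never executes anything outside its learned, valid behavior."""
    candidates = [t for t in sample_policy_trajectories(n_samples) if is_valid(t)]
    return min(candidates, key=lambda t: np.linalg.norm(t[-1] - user_target))

best = select_trajectory(np.array([0.2, -0.1, 0.3]))
```

The key design point is that the user’s correction never overrides the policy directly; it only re-ranks samples the policy already considers feasible.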

This sampling technique empowered the researchers’ framework to surpass the other methodologies they assessed during simulations and experiments with a real robotic arm in a toy kitchen.

While their approach might not always accomplish the task on the first try, it offers users the advantage of correcting the robot immediately if they see it doing something wrong, rather than waiting for it to finish and then giving it new instructions.

Additionally, after a user nudges the robot several times until it successfully picks up the correct bowl, it could record that corrective action and integrate it into its behavior through subsequent training. Consequently, the following day, the robot could retrieve the proper bowl without needing further assistance.
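The article doesn’t describe how such corrections would be stored for later training; a minimal sketch of a correction log that accumulates (observation, action) pairs for future fine-tuning might look like this (all names and data shapes are hypothetical):

```python
import time

class CorrectionLog:
    """Records successful human-corrected actions so they can later be
    folded into a fine-tuning dataset. Illustrative sketch only."""

    def __init__(self):
        self.records = []

    def record(self, observation, corrected_action, task):
        """Store one successful correction along with its context."""
        self.records.append({
            "time": time.time(),
            "task": task,
            "observation": observation,
            "action": corrected_action,
        })

    def as_training_pairs(self, task):
        """Return (observation, action) pairs for one task, ready to be
        mixed into subsequent training."""
        return [(r["observation"], r["action"])
                for r in self.records if r["task"] == task]

log = CorrectionLog()
log.record(observation=[0.1, 0.2], corrected_action=[0.0, 0.3, 0.1],
           task="pick_bowl")
pairs = log.as_training_pairs("pick_bowl")
```

With such a log, the robot described above could replay the “pick up the correct bowl” correction the next day without the user nudging it again.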

“However, the cornerstone of that continuous enhancement is providing a means for the user to interact with the robot, which is what we have demonstrated here,” Wang asserts.

In the future, the researchers aspire to enhance the speed of the sampling procedure while preserving or improving its efficacy. They also plan to explore robot policy generation in new environments.

