
EgoMimic: Georgia Tech PhD student uses Project Aria Research Glasses to help train humanoid robots

Today, we’re highlighting new research from Georgia Tech that helps train robots to perform basic everyday tasks using egocentric recordings from wearers of Meta’s Project Aria research glasses. Check out the video below, read the full story, or apply for your own Project Aria Research Kit.

Imagine having help completing everyday tasks in your home such as doing the laundry, washing dishes, and making repairs. We already use tools to help with these tasks, like washing machines, dishwashers, and electric drills. But what if you could have an even more powerful and flexible tool in the form of a humanoid robot that could learn from you and accelerate any number of physical projects on your to-do list?

Even with the necessary hardware, teaching a robot to perform everyday tasks has traditionally required a slow and clunky data collection method called robot teleoperation. Until now. Using the Project Aria Research Kit, Professor Danfei Xu and the Robotic Learning and Reasoning Lab at Georgia Tech leverage the egocentric sensors on Aria glasses to create what they call “human data” for tasks that they want a humanoid robot to replicate. They use this human data to dramatically reduce the amount of robot teleoperation data needed to train a robot’s policy, a breakthrough that could someday make humanoid robots capable of learning any number of tasks a human can demonstrate.

Kareer teleoperates the robot to capture co-training data for EgoMimic. Teleoperation is difficult to scale and requires significant human effort.

“Traditionally, collecting data for robotics means creating demonstration data,” says Simar Kareer, a PhD student in Georgia Tech’s School of Interactive Computing. “You operate the robot’s joints with a controller to move it and achieve the task you want, and you do this hundreds of times while recording sensor data, then train your models. This is slow and difficult. The only way to break that cycle is to detach the data collection from the robot itself.”

Today, robot policy models are trained with large amounts of targeted demonstration data specific to each narrow task, collected at high cost. Kareer hypothesizes that passively collected data from many researchers, like the data captured by Aria glasses, could instead provide training data for a much broader set of tasks, leading to more generally useful robots in the future.

Inspired by Project Aria and Ego-Exo4D, which includes a massive egocentric dataset of over 3K hours of video recordings of daily-life activities, Kareer developed EgoMimic, a new algorithmic framework that utilizes human data and robot data for humanoid robot development.

“When I looked at Ego4D, I saw a dataset that’s the same as all the large robot datasets we’re trying to collect, except it’s with humans,” Kareer explains. “You just wear a pair of glasses, and you go do things. It doesn’t need to come from the robot. It should come from something more scalable and passively generated, which is us.” In Kareer’s research, Aria glasses were used to create human data for co-training the EgoMimic framework.
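To make the co-training idea concrete, here is a minimal sketch of how human data and robot data might be mixed into a single policy update. The dataset fields, loss weighting, and policy interface are illustrative assumptions for this sketch, not the actual EgoMimic implementation.

```python
def cotrain_step(policy, robot_batch, human_batch, optimizer, human_weight=0.5):
    """One illustrative co-training update (PyTorch-style) mixing robot
    teleoperation data with Aria human data. Field names and the loss
    interface are assumptions, not EgoMimic's actual code."""
    # Robot data: egocentric images paired with robot end-effector actions.
    robot_loss = policy.action_loss(robot_batch["images"], robot_batch["actions"])

    # Human data: Aria egocentric images paired with human hand trajectories,
    # used as a proxy supervision signal for the robot's end effector.
    human_loss = policy.action_loss(human_batch["images"], human_batch["hand_trajectories"])

    loss = robot_loss + human_weight * human_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the human term only needs Aria recordings plus tracked hand motion, scaling it up does not require additional robot time.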

Kareer creates co-training human data by recording with Aria glasses while folding a t-shirt.

Aria glasses aren’t just used for human data collection in Georgia Tech’s research. They’re also used as an integral component of the robot’s real-time operation setup. Aria glasses are mounted to their humanoid robot platform just like a pair of eyes and serve as an integrated sensor package that enables the robot to perceive its environment in real time. The Aria Client SDK is utilized to stream Aria’s sensor data directly into the robot’s policy, running on an attached PC, which in turn controls the robot’s actuation. Using Aria glasses for both the data collection and the real-time perception pipeline minimizes the domain gap between the human demonstrator and the robot, paving the way for scaled human data generation for future robotics task training.
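At a high level, the streaming side of that setup can be sketched with the Aria Client SDK's Python interface. The class and method names below follow the SDK's published streaming examples but should be treated as approximate, and `run_policy` is a hypothetical stand-in for the robot's policy and actuation code.

```python
import aria.sdk as aria  # Project Aria Client SDK (names approximate; see lead-in)

def run_policy(rgb_image):
    """Hypothetical stand-in: feed the latest egocentric frame to the robot's
    policy on the attached PC and send the resulting commands to the robot."""
    pass

class PolicyStreamingObserver:
    # Called by the streaming client whenever a new image arrives.
    def on_image_received(self, image, record):
        run_policy(image)

device_client = aria.DeviceClient()
device = device_client.connect()                      # connect to the glasses
streaming_manager = device.streaming_manager
streaming_client = streaming_manager.streaming_client

streaming_manager.start_streaming()                   # begin sensor streaming
streaming_client.set_streaming_client_observer(PolicyStreamingObserver())
streaming_client.subscribe()                          # receive data callbacks
```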

Aria glasses mounted to the top of the robot provide the system with sensor data that allows the robot to perceive and interact with the space.

Thanks to EgoMimic, Kareer achieved a 400% increase in his robot’s performance across various tasks, compared to previous methods, with just 90 minutes of Aria recordings. The robot was also able to successfully perform these tasks in previously unseen environments.

In the future, humanoid robots could be trained at scale using egocentric data in order to perform a variety of tasks in the same way humans do.

“We look at Aria as an investment in the research community,” says James Fort, a Reality Labs Research Product Manager at Meta. “The more that the egocentric research community standardizes, the more researchers will be able to collaborate. It’s really through scaling with the community like this that we can start to solve bigger problems around how things are going to work in the future.”

Kareer will present his paper on EgoMimic at the 2025 IEEE International Conference on Robotics and Automation (ICRA) in Atlanta.


Introducing Aria Gen 2: Unlocking New Research in Machine Perception, Contextual AI, Robotics, and More

Since its launch in 2020, Project Aria has enabled researchers across the world to advance the state of the art in machine perception and AI, through access to cutting-edge research hardware and open-source datasets, models, and tooling. Today, we’re excited to announce the next step in this journey: the introduction of Aria Gen 2 glasses. This next generation of hardware will unlock new possibilities across a wide range of research areas including machine perception, egocentric and contextual AI, and robotics.


For researchers looking to explore how AI systems can better understand the world from a human perspective, Aria Gen 2 glasses add a new set of capabilities to the Aria platform. They include a number of advances not found on any other device available today, and access to these breakthrough technologies will enable researchers to push the boundaries of what’s possible.

Compared to Aria Gen 1, Aria Gen 2’s unique value proposition includes:

  • State-of-the-art sensor suite: The upgraded sensor suite features an RGB camera, 6DOF SLAM cameras, eye tracking cameras, spatial microphones, IMUs, barometer, magnetometer, and GNSS. Compared to its predecessor, Aria Gen 1, the new generation introduces two innovative sensors embedded in the nosepad: a PPG sensor for measuring heart rate and a contact microphone to distinguish the wearer’s voice from that of bystanders.
  • Ultra low-power and on-device machine perception: SLAM, eye tracking, hand tracking, and speech recognition are all processed on-device using Meta’s custom silicon.
  • All-day usability: Aria Gen 2 glasses are capable of six to eight hours of continuous use, weigh about 75 grams, and have foldable arms for easy portability.
  • Interaction through audio: Users get audio feedback via best-in-class open-ear force-canceling speakers, enabling user-in-the-loop system prototyping.

Our decade-long journey to create the next computing platform has led to the development of these critical technologies. At Meta, teams at Reality Labs Research and the FAIR AI lab will use them to advance our long-term research vision. Making them available to academic and commercial research labs through Project Aria will further advance open research and public understanding of a key set of technologies that we believe will help shape the future of computing and AI.

The open research enabled by Project Aria since 2020 has already led to important work, including the creation of open-source tools in wide use across academia and industry. The Ego-Exo4D dataset, collected using the first generation of Aria glasses, has become a foundational tool across modern computer vision and the growing field of robotics. Researchers at Georgia Tech recently showed how the Aria Research Kit can help humanoid robots learn to assist people in the home, while teams at BMW used it to explore how to integrate augmented and virtual reality systems into smart vehicles.

Aria is also enabling the development of new technologies for accessibility. The first-generation Aria glasses were utilized by Carnegie Mellon University in their NavCog project, which aimed to build technologies to assist blind and low-vision individuals with indoor navigation. Building on this foundation, the Aria Gen 2 glasses are now being leveraged by Envision, a company dedicated to creating solutions for people who are blind or have low vision. Envision is exploring the integration of its Ally AI assistant and spatial audio using the latest Aria Gen 2 glasses to enhance indoor navigation and accessibility experiences.


Envision used the on-device SLAM capabilities of Aria Gen 2, along with spatial audio features via onboard speakers, to help blind and low-vision individuals seamlessly navigate indoor environments. This innovative use of the technologies, which is still in the exploratory and research phase, exemplifies how researchers can leverage Aria Gen 2 glasses for prototyping AI experiences based on egocentric observations. The advanced sensors and on-device machine perception capabilities, including SLAM, eye tracking, hand tracking, and audio interactions, also make the glasses ideal for data collection in research and robotics applications.

Over the coming months, we’ll share more details about the timing of device availability to partners. Researchers interested in accessing Aria Gen 2 can sign up to receive updates. We’re excited to see how researchers will leverage Aria Gen 2 to pave the way for future innovations that will shape the next computing platform.


Inside Aria Gen 2: Explore the Cutting-Edge Tech Behind the Device

Earlier this year, we announced our latest research glasses, Aria Gen 2, marking the continuation of Project Aria’s mission to enable researchers across the world to advance the state of the art in machine perception, contextual AI, and robotics through access to cutting-edge research hardware and open source datasets, models, and tooling. Today, we’re excited to share more about the technology inside Aria Gen 2. This includes an in-depth overview of the form factor, audio capabilities, battery life, upgraded cameras and sensors, on-device compute, and more.

What Is Aria Gen 2?

Aria Gen 2 is a wearable device that combines the latest advancements in computer vision, machine learning, and sensor technology. Aria Gen 2’s compact form factor and lightweight design make it an ideal choice for researchers who need to collect data or build prototypes in a variety of settings. The glasses contain a number of improvements when compared to Aria Gen 1, its research predecessor, announced back in 2020.

Aria Gen 2: Advancements and Features

The transition from Aria Gen 1 to Gen 2 marks a significant leap in wearable technology, offering enhanced features and capabilities that cater to a broader range of applications and user needs. Below, we explore the key differences and improvements introduced in Aria Gen 2.

1. Wearability

Aria Gen 2 boasts superior wearability, characterized by enhanced comfort and fit, while accommodating a wider range of face morphologies and a rich sensor suite for research. The glasses maintain a lightweight design (weighing in at 74 – 76 g, depending on size) and now include folding arms for easier storage and transport in everyday use. To ensure each wearer has an optimal physical and functional fit, we’ve introduced eight size variations of the device, accounting for a number of human factors including head breadth and nose bridge variation.

Eight size variations of our Aria Gen 2 devices.

2. Computer Vision (CV) Camera Enhancements

High Dynamic Range (HDR): Aria Gen 2’s global shutter camera sensor offers a high dynamic range of 120 dB, compared to the 70 dB range in Gen 1. This allows for superior computer vision tasks across diverse lighting conditions.
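To put those figures in perspective, sensor dynamic range in decibels maps to a linear signal ratio via the usual 20·log10 convention, so the jump from 70 dB to 120 dB corresponds roughly to a 3,000:1 versus a 1,000,000:1 ratio between the brightest and darkest resolvable signals:

```python
def db_to_ratio(db: float) -> float:
    """Convert dynamic range in dB to a linear signal ratio (20*log10 convention)."""
    return 10 ** (db / 20)

print(f"Aria Gen 1, 70 dB:  {db_to_ratio(70):,.0f} : 1")   # ~3,162 : 1
print(f"Aria Gen 2, 120 dB: {db_to_ratio(120):,.0f} : 1")  # 1,000,000 : 1
```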


The video illustrates the CV camera capturing a highly dynamic scene: an LED light’s filament is resolved along with the rest of the details in the scene.

Wide Field of View (FOV): Aria Gen 2 is equipped with four computer vision (CV) cameras, doubling the number of CV cameras in Gen 1, to provide a wider field of view and enable advanced 3D hand and object tracking.

Stereo Overlap: The stereo overlap in Gen 2 is increased to 80° from Gen 1’s 35°, facilitating stereo-based foundation models that enhance depth perception and spatial awareness.


The example here illustrates how the increased stereo overlap enables methods such as NVIDIA’s FoundationStereo to generate depth maps from rectified stereo images. The depth maps can be fused to generate geometric reconstructions of the scene using only Aria Gen 2’s stereo pair data.
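As a rough illustration of that pipeline, the sketch below computes a depth map from a rectified stereo pair using OpenCV's classical StereoSGBM matcher as a stand-in for a learned model like FoundationStereo; the focal length and baseline are placeholder values, not Aria Gen 2 calibration data.

```python
import cv2
import numpy as np

# Placeholder calibration values for illustration only; real values come from
# the device calibration, not from this sketch.
FX_PIXELS = 500.0        # focal length of the rectified cameras, in pixels
BASELINE_METERS = 0.10   # distance between the stereo cameras, in meters

def depth_from_rectified_pair(left_gray: np.ndarray, right_gray: np.ndarray) -> np.ndarray:
    """Estimate a depth map (meters) from a rectified grayscale stereo pair."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96, blockSize=7)
    # StereoSGBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disparity[disparity <= 0] = np.nan  # mark invalid matches
    return FX_PIXELS * BASELINE_METERS / disparity
```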

3. New Sensor Integrations

Ambient Light Sensor (ALS): Aria Gen 2 includes a calibrated ALS, enabling better exposure control algorithms and unlocking new capabilities at low frame rates. The ALS’s ultraviolet mode can be used to distinguish between indoor and outdoor lighting as illustrated by the video.


Contact Microphone: Aria Gen 2 includes a contact microphone embedded in the nosepad of the device, enhancing audio capture in noisy environments.


The video shows a wearer in a wind tunnel, simulating a windy scenario in which the contact microphone is able to pick up the wearer’s whisper when the acoustic microphones cannot.

Heart Rate: Aria Gen 2 includes a photoplethysmography (PPG) sensor embedded in the nosepad of the device, enabling estimation of the wearer’s heart rate.

4. Device Time Alignment

Aria Gen 2 has an onboard hardware solution that utilizes Sub-GHz radio technology to broadcast timing information, enabling precise time alignment with other Aria Gen 2 devices or compatible devices that support Sub-GHz radio. This technology achieves time alignment with sub-millisecond accuracy, marking a significant improvement over the software-based alignment of Gen 1.
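Downstream, that shared timebase makes it straightforward to line up samples from multiple devices. The sketch below applies a per-device clock offset and pairs samples that fall within 0.5 ms of each other; the offset estimation itself happens in hardware, and the values here are purely illustrative.

```python
from bisect import bisect_left

def to_shared_timeline(local_timestamps_ns, offset_ns):
    """Map one device's local timestamps onto the shared broadcast timebase."""
    return [t + offset_ns for t in local_timestamps_ns]

def match_within_tolerance(aligned_a, aligned_b, tolerance_ns=500_000):
    """Pair each sample in aligned_a with the closest sample in aligned_b,
    keeping pairs whose timestamps differ by at most 0.5 ms.
    Both inputs must be sorted in ascending order."""
    pairs = []
    for t in aligned_a:
        i = bisect_left(aligned_b, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(aligned_b)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(aligned_b[k] - t))
        if abs(aligned_b[j] - t) <= tolerance_ns:
            pairs.append((t, aligned_b[j]))
    return pairs
```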


The video shows how Aria Gen 2 uses device time alignment to synchronize distributed captures of a writing task from two Aria Gen 2 devices.

5. On-device Realtime Machine Perception (MP) Signals

Aria Gen 2 features advanced on-device machine perception algorithms that run on Meta’s energy-efficient custom coprocessor. These cutting-edge capabilities enable the device to generate precise and accurate data, tracking how we interact with our surroundings.

Visual Inertial Odometry (VIO)

One of the key features of Aria Gen 2 is its ability to track the glasses in six degrees of freedom (6DOF) within a spatial frame of reference using Visual Inertial Odometry (VIO). This allows for seamless navigation and mapping of the environment, opening up new possibilities for research in contextual AI and robotics.

Eye Tracking

Aria Gen 2 also boasts an advanced camera-based eye tracking system that tracks the wearer’s gaze with unparalleled accuracy. This system provides a wealth of information, including gaze per eye, vergence point, blink detection, pupil center estimation, pupil diameter, corneal center, and more.

These advanced signals enable a deeper understanding of the wearer’s visual attention and intentions, unlocking new possibilities for human-computer interaction.
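As one example of what these signals enable, the vergence point can be estimated from the two per-eye gaze rays as the point closest to both rays. The sketch below is a standard least-squares midpoint computation, not Aria's on-device implementation.

```python
import numpy as np

def vergence_point(origin_left, dir_left, origin_right, dir_right):
    """Estimate the 3D point closest to both gaze rays (least-squares midpoint).

    Inputs are ray origins and direction vectors, e.g. per-eye gaze in the
    device frame. Returns None for (nearly) parallel rays.
    """
    p, q = np.asarray(origin_left, float), np.asarray(origin_right, float)
    u, v = np.asarray(dir_left, float), np.asarray(dir_right, float)
    w0 = p - q
    a, b, c = u @ u, u @ v, v @ v
    d, e = u @ w0, v @ w0
    denom = a * c - b * b
    if denom < 1e-12:
        return None  # rays are (nearly) parallel; no well-defined vergence point
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((p + s * u) + (q + t * v))
```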

Hand Tracking

Aria Gen 2 also features a hand tracking solution that tracks the wearer’s hand in 3D space. This produces articulated hand-joint poses in the device frame of reference, facilitating accurate hand annotations for datasets and enabling applications such as dexterous robot hand manipulation that require high precision.
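Because the joints are reported in the device frame, combining them with the device pose from VIO places them in a world frame. Below is a minimal homogeneous-transform sketch; the pose matrix and joint array are illustrative inputs, not a specific Aria data format.

```python
import numpy as np

def hand_joints_to_world(T_world_device: np.ndarray, joints_device: np.ndarray) -> np.ndarray:
    """Transform hand-joint positions from the device frame into the world frame.

    T_world_device: 4x4 homogeneous pose of the glasses (e.g. from VIO).
    joints_device:  (N, 3) joint positions in the device frame, in meters.
    """
    ones = np.ones((joints_device.shape[0], 1))
    joints_h = np.hstack([joints_device, ones])      # (N, 4) homogeneous coordinates
    return (T_world_device @ joints_h.T).T[:, :3]    # back to (N, 3) world-frame points
```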


Demonstration of Aria Gen 2’s sensors and machine perception capabilities, as well as off-device algorithms built on them.

The Future of Aria Is Here: Stay Informed

Aria Gen 2 glasses pave the way for future innovations that will define the next computing platform. Applications to work with Aria Gen 2 will open later this year, and researchers who are interested in staying informed can join the Aria Gen 2 interest list. Meanwhile, applications for Aria Research Kit with Aria Gen 1 glasses are still being accepted on a rolling basis—apply now to get started immediately.

Join us at CVPR 2025 in Nashville, Tennessee, this June, where the team will showcase Aria Gen 2 glasses through interactive demos. Visit the Meta booth to experience the latest advancements and learn more about the innovative features of Aria Gen 2.