MIT Uses Generative AI to Build Wireless Vision Systems That See Through Walls
MIT researchers present Wave-Former and RISE, two systems that combine millimeter-wave radar with generative AI to reconstruct hidden objects and entire indoor scenes through obstructions, achieving up to twice the precision of prior methods.
Overview
Researchers at MIT’s Media Lab have unveiled two systems that use generative AI and millimeter-wave (mmWave) radar to perceive objects and environments hidden behind walls, packaging, and other common obstructions. The work, announced on March 19, 2026, will be presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) and addresses a decade-old challenge in wireless perception: the fundamental inability of conventional sensors to reconstruct surfaces that reflect radar signals away from the receiver.
The Specularity Problem
Millimeter-wave signals, part of the same family of radio waves used in Wi-Fi, can pass through drywall, plastic, and cardboard before reflecting off concealed objects. However, mmWave reflections are specular: each wave bounces off a surface in a single direction, much as light bounces off a mirror. Large portions of a hidden object’s surface therefore redirect signals away from the radar sensor, leaving those areas effectively invisible. Previous wireless vision systems produced coarse, incomplete reconstructions because they could only capture geometry from the narrow subset of surfaces that happened to reflect signals back toward the receiver.
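To make the scale of the problem concrete, a back-of-the-envelope sketch (not taken from the papers) shows how little of a simple convex surface a co-located radar can see under a purely specular model. The sphere size, radar distance, and angular tolerance below are illustrative assumptions.

```python
# Illustrative only: fraction of a hidden sphere whose specular reflection
# returns to a co-located mmWave transmitter/receiver. The radius, radar
# distance, and 5-degree tolerance are assumptions for the sketch.
import numpy as np

rng = np.random.default_rng(0)
radar = np.array([2.0, 0.0, 0.0])                  # radar 2 m away along +x

# Sample points uniformly on a 10 cm sphere; the outward normal of a sphere
# is just the unit position vector.
theta = np.arccos(rng.uniform(-1.0, 1.0, 50_000))
phi = rng.uniform(0.0, 2.0 * np.pi, 50_000)
normals = np.stack([np.sin(theta) * np.cos(phi),
                    np.sin(theta) * np.sin(phi),
                    np.cos(theta)], axis=1)
points = 0.1 * normals

# A purely specular bounce comes straight back only when the surface normal
# points (nearly) at the radar.
to_radar = radar - points
to_radar /= np.linalg.norm(to_radar, axis=1, keepdims=True)
visible = np.sum(normals * to_radar, axis=1) > np.cos(np.radians(5.0))
print(f"fraction of the surface visible to the radar: {visible.mean():.4f}")
```

Under these assumptions only a fraction of a percent of the sphere reflects energy straight back toward the sensor, which illustrates why raw mmWave captures of hidden objects are so sparse.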
Wave-Former: Reconstructing Hidden Objects
The first system, Wave-Former, targets the reconstruction of individual objects concealed behind obstructions. The MIT team, led by research assistant Laura Dodds and senior author Fadel Adib, adapted large computer vision datasets to embed mmWave reflection physics into synthetic training data. A generative AI model trained on this data learns to interpret the partial reflections captured by the radar and fill in the missing geometry.
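The article does not detail the team’s data pipeline, but the idea of embedding reflection physics into synthetic training data can be sketched roughly as follows: take complete 3D shapes from an existing vision dataset, simulate which surface points a radar could plausibly see under a specular model, and pair each sparse “radar view” with the full surface as the completion target. Everything here, including the simple visibility rule and the assumed (points, normals) shape format, is an illustrative assumption rather than the authors’ method.

```python
# Illustrative assumption, not the authors' pipeline: build (partial radar
# view, full surface) training pairs from an existing 3D-shape dataset so a
# generative completion model can learn to fill in geometry the radar never
# sees. `shapes` is assumed to yield (points, normals) float arrays per object.
import numpy as np

def radar_visible(points, normals, radar_pos, tol_deg=10.0):
    """Mask of points whose specular return reaches a co-located radar."""
    to_radar = radar_pos - points
    to_radar /= np.linalg.norm(to_radar, axis=1, keepdims=True)
    return np.sum(normals * to_radar, axis=1) > np.cos(np.radians(tol_deg))

def make_pairs(shapes, views_per_shape=8, seed=0):
    """Yield (sparse radar view, complete surface) pairs for training."""
    rng = np.random.default_rng(seed)
    for points, normals in shapes:
        for _ in range(views_per_shape):
            direction = rng.normal(size=3)
            radar_pos = 2.0 * direction / np.linalg.norm(direction)  # ~2 m away
            mask = radar_visible(points, normals, radar_pos)
            yield points[mask], points  # model input -> completion target
```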
In testing, Wave-Former generated faithful reconstructions of roughly 70 everyday objects — including cans, boxes, utensils, and fruit — boosting accuracy by nearly 20 percent over state-of-the-art baselines. The system works with objects that are fully hidden from line of sight, a scenario where camera-based approaches fail entirely.
RISE: Mapping Entire Rooms Through Walls
The second system, RISE, scales the approach from individual objects to full indoor scenes. Led by former postdoctoral researcher Kaichen Zhou, RISE uses a single stationary mmWave radar to reconstruct room layouts, including furniture, by exploiting an indirect signal source: the reflections created as people walk through the space.
As a person moves, the radar captures what the researchers call “ghost signals” — secondary multipath reflections that change with each trajectory. RISE analyzes these shifting patterns to build a complete scene reconstruction, with a generative AI model refining the coarse initial output. Tested on more than 100 human trajectories captured by a single radar, RISE generated reconstructions approximately twice as precise as existing techniques.
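A much-simplified model helps illustrate what a ghost signal can reveal. When a flat reflector such as a wall mirrors the radar’s view of a person, the person appears a second time at their mirror image across that surface, so the midpoint between the real and ghost detections lies on the reflector itself. Accumulating those midpoints as the person walks traces a coarse outline of the room. The geometry, synthetic trajectory, and grid below are illustrative assumptions; RISE’s actual signal processing and generative refinement are more sophisticated.

```python
# Simplified illustration (not RISE itself): mirror-type ghost detections put
# a walking person's reflection on the far side of a flat reflector, so the
# midpoint of (person, ghost) lies on the reflector. Binning those midpoints
# yields a coarse 2D map that a generative model could then refine.
import numpy as np

def reflector_hits(person_xy, ghost_xy):
    """Candidate reflector points: midpoints between person and ghost."""
    return 0.5 * (person_xy + ghost_xy)

def coarse_map(hits, size=5.0, res=0.1):
    """Bin candidate reflector points into a square 2D occupancy grid."""
    cells = int(size / res)
    grid = np.zeros((cells, cells))
    idx = np.clip((hits / res).astype(int), 0, cells - 1)
    grid[idx[:, 1], idx[:, 0]] += 1
    return grid

# Synthetic example: person walks across a 5 m room; a wall at y = 4 m
# mirrors them, producing ghost detections at the mirror image.
person = np.stack([np.linspace(1.0, 4.0, 200), np.full(200, 2.0)], axis=1)
ghost = person.copy()
ghost[:, 1] = 2 * 4.0 - person[:, 1]        # mirror image across the wall y = 4
hits = reflector_hits(person, ghost)        # midpoints all land on the wall
occupancy = coarse_map(hits)
print("occupied cells along the wall:", np.count_nonzero(occupancy))
```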
Practical Applications and Privacy Advantages
The MIT team identifies two near-term use cases. In warehouse logistics, robots equipped with mmWave sensors could verify the contents of sealed packages before shipping without opening them. In smart homes, robots could use wireless perception to understand human location and activity patterns for safer interaction, without relying on cameras.
The privacy dimension is notable. Unlike camera-based perception systems, mmWave radar does not capture identifiable visual information. The systems reconstruct geometry and object shapes rather than photographic detail, which could make wireless vision more acceptable in sensitive environments such as healthcare facilities and residences.
What This Means
“We are using AI to finally unlock wireless vision,” said Fadel Adib, associate professor in MIT’s Department of Electrical Engineering and Computer Science and director of the Signal Kinetics group at the MIT Media Lab. He described the work as “a qualitative leap in capabilities, from being able to fill in gaps we were not able to see before to being able to interpret reflections and reconstruct entire scenes.”
The research bridges two fields that have largely developed in parallel: wireless sensing, which has long struggled with signal sparsity, and computer vision, which has built powerful generative models but remains dependent on line-of-sight data. Whether the approach can scale to cluttered, dynamic environments beyond controlled lab settings remains an open question. Both papers will be presented at CVPR, and the research was supported by the National Science Foundation, the MIT Media Lab, and Amazon.