Pulkit Budhiraja

Automatic Surface Detection and Mapping

Duration: October 2019 - March 2021
Technologies: OpenCV, TensorFlow
Since Fall 2019, I've been exploring the idea of making a Lightform device automatically find a pre-selected object and map content to it. The detection part is somewhat trivial, but the challenge is getting pinpoint accuracy, since a misalignment of even a few pixels results in a blurry, not-so-magical experience. Another challenge is that the detection has to work 99% of the time; anything less doesn't meet the bar of a product feature and stays a prototype. The device should also be able to quickly realign content if the object or the projector moves.
We had gone down this path in the past too, but quickly abandoned it since making it work 99% of the time without any user input proved challenging. Well, second time's the charm, right?

Keeping both challenges in mind, I began with a marker-based approach: I started by adding 4 ArUco markers to a test surface, which takes care of both detection accuracy and repeatability.
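For the curious, here's a minimal sketch of what a 4-marker alignment like this can look like with OpenCV's aruco module (pre-4.7 contrib API). The marker IDs, their clockwise layout, and the function names are illustrative assumptions, not the actual Lightform implementation:

    # Sketch: detect 4 ArUco markers at the corners of a surface and compute
    # the homography that maps projector content onto it. Marker IDs/layout
    # are illustrative, not the production setup.
    import cv2
    import numpy as np

    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    params = cv2.aruco.DetectorParameters_create()

    def find_surface_homography(camera_frame, content_size):
        gray = cv2.cvtColor(camera_frame, cv2.COLOR_BGR2GRAY)
        corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary, parameters=params)
        if ids is None:
            return None
        # Center of each detected marker, keyed by its ID.
        centers = {int(i): c[0].mean(axis=0) for i, c in zip(ids.flatten(), corners)}
        if not all(i in centers for i in range(4)):
            return None  # need all four corner markers for a reliable alignment
        # IDs 0..3 assumed to be placed clockwise from the top-left of the surface.
        dst = np.float32([centers[i] for i in range(4)])

        w, h = content_size
        src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])  # corners of the content
        H, _ = cv2.findHomography(src, dst)
        return H  # warp content with cv2.warpPerspective(content, H, frame_size)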
The next iteration removed the dependency on 4 markers and brought it down to 1. I also replaced the marker with the Lightform logo. The single logo as the marker looked good and worked well enough to green-light the project from an R&D effort to a product feature.
I also added the ability to quickly realign content if it goes out of alignment. It needed some trickery with the projector-camera correspondences, but in the end it worked pretty damn well!
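I won't spell out the exact trickery here, but one way to think about a quick realign, assuming the original camera-to-projector homography is still valid for the surface plane, is to measure how the tracked corners moved in camera space and lift that motion into projector space. A rough sketch (all names and inputs are placeholders):

    # Sketch of one possible realignment step, assuming the original
    # camera->projector mapping (H_cam_to_proj) is still valid for the plane
    # and only the object/projector pose changed slightly.
    import numpy as np
    import cv2

    def realign(H_cam_to_proj, old_corners_cam, new_corners_cam, H_content_old):
        """Update the content homography when the tracked corners move in camera space.

        old_corners_cam / new_corners_cam: 4x2 arrays of the surface corners as seen
        by the camera before and after the shift.
        """
        # Motion of the surface in camera space.
        H_shift_cam, _ = cv2.findHomography(np.float32(old_corners_cam),
                                            np.float32(new_corners_cam))
        # Lift that motion into projector space and apply it to the old mapping.
        H_shift_proj = H_cam_to_proj @ H_shift_cam @ np.linalg.inv(H_cam_to_proj)
        return H_shift_proj @ H_content_old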
In the last couple of months, I've worked on removing the dependency on visual markers entirely, going fully markerless while still guaranteeing accurate alignment of the content. That was made possible by using a CNN for object detection. We trained a MobileNet-SSD network to detect a set of pre-selected artworks, and the model was converted to run with TensorFlow Lite on an Android-based embedded device. The model gives a bounding box over the artwork but not its 4 corners. For that, we do a feature-based alignment to get the content to align perfectly with the artwork in real life.
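As a rough illustration of that second stage, here's a simplified sketch of refining a detector bounding box into 4 corners, using ORB features and a RANSAC homography purely for illustration (the exact feature detector and parameters we shipped aren't the point of the sketch):

    # Sketch: refine a CNN bounding box into the 4 artwork corners using
    # feature-based alignment against a reference image of the artwork.
    import cv2
    import numpy as np

    orb = cv2.ORB_create(nfeatures=2000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def artwork_corners(reference_img, camera_frame, bbox):
        x, y, w, h = bbox                      # box from the detector (pixels)
        crop = camera_frame[y:y + h, x:x + w]

        kp_ref, des_ref = orb.detectAndCompute(reference_img, None)
        kp_cam, des_cam = orb.detectAndCompute(crop, None)
        if des_ref is None or des_cam is None:
            return None

        matches = matcher.match(des_ref, des_cam)
        if len(matches) < 10:
            return None
        src = np.float32([kp_ref[m.queryIdx].pt for m in matches])
        dst = np.float32([kp_cam[m.trainIdx].pt for m in matches]) + [x, y]

        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        if H is None:
            return None
        # Map the reference image corners into the camera frame.
        rh, rw = reference_img.shape[:2]
        ref_corners = np.float32([[[0, 0]], [[rw, 0]], [[rw, rh]], [[0, rh]]])
        return cv2.perspectiveTransform(ref_corners, H).reshape(4, 2)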

Lit-Ball 

Duration: June 2019
Technologies: OpenCV, Realsense D415, openFrameworks, toy balls

Created a quick, fun projection mapping demo that tracked balls and attracted particles in space toward them. Detection of the balls was done using the depth feed, since the projection kept screwing with the color information and the detection also had to be fast enough to run in real time. The calibration between the RGB camera and the depth camera was straightforward, since they are rigidly mounted and the calibration is provided by the RealSense SDK. The calibration between the RGB camera and the projector was done using a simple 4-point correspondence technique, since the table was planar. The fun part of the project was exploring openFrameworks, creating the graphics, and making it all look slick.
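A stripped-down sketch of those two pieces, finding balls against the planar table in the depth image and mapping camera pixels into projector pixels via a 4-point correspondence, looks something like this (the thresholds and correspondence points are placeholders, not the demo's real values):

    # Sketch: (1) find balls by thresholding the depth image against the table
    # plane, (2) map camera pixels to projector pixels with a 4-point homography.
    import cv2
    import numpy as np

    # 4 corresponding points picked once: where they appear in the camera image
    # and where the same points should land in projector space (hypothetical values).
    cam_pts  = np.float32([[102, 80], [540, 75], [552, 410], [95, 420]])
    proj_pts = np.float32([[0, 0], [1280, 0], [1280, 800], [0, 800]])
    H_cam_to_proj = cv2.getPerspectiveTransform(cam_pts, proj_pts)

    def ball_centers(depth_mm, table_depth_mm, min_height_mm=15, min_area=80):
        # Anything sufficiently closer to the camera than the table plane is a ball.
        mask = ((table_depth_mm - depth_mm.astype(np.int32)) > min_height_mm).astype(np.uint8) * 255
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        centers = []
        for c in contours:
            if cv2.contourArea(c) < min_area:
                continue
            m = cv2.moments(c)
            centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        if not centers:
            return np.empty((0, 2), np.float32)
        # Convert ball positions from camera pixels to projector pixels.
        pts = np.float32(centers).reshape(-1, 1, 2)
        return cv2.perspectiveTransform(pts, H_cam_to_proj).reshape(-1, 2)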

RGBD Quick Select

Duration: Oct - Dec 2017
Technologies: OpenCV

Implemented a selection technique in the Lightform creative software that allows users to select objects in an image by painting over them with the cursor (à la Photoshop's Quick Selection tool). The implementation was based on the Paint Selection paper by Jiangyu Liu et al. Our version of paint selection used continuity in both color and depth information to infer the boundaries of semantically different objects.
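To give a flavor of the color-plus-depth idea without the incremental graph cut used in the actual feature, here's a heavily simplified region-growing sketch: grow outward from the brushed pixels and keep neighbors whose color and depth are both close enough (the tolerances are placeholders):

    # Greatly simplified stand-in for combining color and depth continuity.
    # The shipped feature followed Liu et al.'s Paint Selection (incremental
    # graph cut), not this flood fill.
    from collections import deque
    import numpy as np

    def grow_selection(rgb, depth, seeds, color_tol=20.0, depth_tol=30.0):
        """rgb: HxWx3 float array, depth: HxW array (e.g. millimeters),
        seeds: (row, col) pixels covered by the brush stroke."""
        h, w = depth.shape
        seeds = list(seeds)
        selected = np.zeros((h, w), dtype=bool)
        queue = deque(seeds)
        for r, c in seeds:
            selected[r, c] = True
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and not selected[nr, nc]:
                    color_ok = np.linalg.norm(rgb[nr, nc] - rgb[r, c]) < color_tol
                    depth_ok = abs(float(depth[nr, nc]) - float(depth[r, c])) < depth_tol
                    if color_ok and depth_ok:
                        selected[nr, nc] = True
                        queue.append((nr, nc))
        return selected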

Image based shader effects

Duration: June 2018
Technologies: OpenCV, GLSL shaders, ShaderToy

This was one of my favorite crunch-time, "I don't know how to do this but it needs to get done" projects. Three weeks before the launch of our first hardware device and its software, we needed more effects to ship with the software. Our target was 20, we had 6, and I was put in charge of adding more. So I donned the hat of a creative coder and, without any background in graphics or effects, started playing around on ShaderToy to understand how shaders work. I knew I could import the scan taken from the device as a texture and use it in some creative way to create magical things. Our CEO, Kevin, had already done a version of this to create a Tron-like effect. So I started playing around with it; some attempts came out nice, some not so nice. But the effect included below, oh boy did it work. I fell in love with it, and it quickly became the go-to effect for everyone at the company. We named it Digital Fade.
(Later we hired an actual creative coder and he showed us the potential of what you can do with shaders. But still, this one is dear to me and remains one of my favorites.)

Rotation Blurring: Use of artificial blurring to reduce cybersickness in Virtual Reality first person shooters

Mentor: Prof. David Forsyth, University of Illinois Urbana-Champaign
Duration: January - May 2015
Technologies: Oculus DK1, Unity3D

Users of Virtual Reality (VR) systems often experience vection, the perception of self-motion in the absence of any physical movement. While vection helps to improve presence in VR, it often leads to a form of motion sickness called cybersickness. Cybersickness is a major deterrent to large scale adoption of VR.
Prior work has found that changing vection (changing the perceived speed or direction of motion) causes more severe cybersickness than steady vection (moving at a constant speed or in a constant direction). Based on this idea, we try to reduce the cybersickness caused by character movements in a First Person Shooter (FPS) game in VR. We propose Rotation Blurring (RB), uniformly blurring the screen during rotational movements, to reduce cybersickness. We performed a user study to evaluate the impact of RB and found that the blurring technique led to an overall reduction in participants' sickness levels and delayed its onset. Participants who experienced acute levels of cybersickness benefited significantly from this technique.

Paper: arXiv

Where's My Drink? Enabling Peripheral Real World Interactions While Using HMDs

Mentor: Prof. David Forsyth, University of Illinois Urbana-Champaign
Duration: June - September 2014
Technologies: Oculus DK1, Unity3D, EmguCV  

Head Mounted Displays (HMDs) allow users to experience virtual reality with a great level of immersion. However, even simple physical tasks like drinking a beverage can be difficult and awkward while in a virtual reality experience. We explore mixed reality renderings that selectively incorporate the physical world into the virtual world for interactions with physical objects. We conducted a user study comparing four rendering techniques that balance immersion in a virtual world with ease of interaction with the physical world.

Paper: arXiv
Press Coverage: Discover, MIT Technology Review

Enabling Expressive User Interaction in a Multimodal Interface for Object Selection in Virtual 3D Environments

Mentor: Dr. Sriganesh Madhvanath, Senior Research Scientist, HP Labs India
Duration: July 2011 - June 2012
Technologies: StanfordNLP, Blender, Microsoft Kinect v1, Microsoft Speech API, C++, Java

We developed a system that facilitates interaction with a 3D scene using speech and visual gestures. The system lets the user apply speech as a filter to the pointing gesture, alleviating the shortcomings of a unimodal interface for 3D environments. It also supports referring to desired objects using speech alone: references can be made using distinguishable properties of an object, its spatial location in the scene, or its spatial relation to another object.

The project was selected for a demo presentation at the 14th ACM International Conference on Multimodal Interaction (ICMI '12), and the paper was published in the conference proceedings. [paper]