Projects

Deep Parametric Style Transfer

I spent the summer of 2018 with Adobe Research, working on a deep network that transfers parametrised styles between images. Our end-to-end trainable system embeds a non-differentiable style renderer in the network, which lets us supervise with an image loss rather than the regression loss on parameters used by previous approaches. We also adapted our method to be webly supervised by exploiting the large number of movie trailers available in the public domain.
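
As a sketch of the training setup (toy networks and a differentiable stand-in renderer, not our actual system), the key point is that the loss is computed on the rendered image rather than on the predicted parameters:

```python
import torch
import torch.nn as nn

class ParamNet(nn.Module):
    """Toy network predicting a small vector of style parameters from an image."""
    def __init__(self, num_params=6):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_params)

    def forward(self, x):
        return self.head(self.backbone(x).flatten(1))

def render(image, params):
    # Toy differentiable stand-in for the style renderer: a global per-channel
    # gain/bias adjustment. The project's actual renderer was parametric,
    # more elaborate, and non-differentiable, so this is only an illustration.
    gain = params[:, :3].sigmoid().unsqueeze(-1).unsqueeze(-1) * 2.0
    bias = params[:, 3:6].tanh().unsqueeze(-1).unsqueeze(-1) * 0.5
    return (image * gain + bias).clamp(0, 1)

net = ParamNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
content = torch.rand(4, 3, 64, 64)  # input images (random stand-ins)
target = torch.rand(4, 3, 64, 64)   # stylised ground truth (random stand-ins)

params = net(content)
stylised = render(content, params)
loss = nn.functional.l1_loss(stylised, target)  # image loss, not a loss on params
opt.zero_grad(); loss.backward(); opt.step()
```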

One-Shot Learning for Video Object Segmentation Using 3D Convolutions

In this project we explored a new direction for video object segmentation: using the I3D network architecture with weights pre-trained on the Kinetics dataset. We adapted I3D into a fully convolutional network by adding upsampling layers at various stages to produce the final segmentation map. This let the network look at all frames of a video at once, giving a stronger segmentation method, but it also introduced some unique problems that we address in this work.
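
A toy sketch of the architecture idea, with a small 3D-conv encoder standing in for the I3D backbone (the real backbone and its Kinetics weights are not shown): upsampling layers bring the features back to full resolution, so the whole clip is segmented in one pass.

```python
import torch
import torch.nn as nn

class Video3DSegNet(nn.Module):
    """Toy 3D-conv encoder-decoder (stand-in for an I3D backbone turned FCN)."""
    def __init__(self, num_classes=2):
        super().__init__()
        # Encoder: 3D convolutions downsample space while preserving time.
        self.enc = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=(1, 2, 2), padding=1), nn.ReLU(),
        )
        # Decoder: transposed 3D convs upsample back to input resolution,
        # mirroring the upsampling layers added at various stages of I3D.
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(32, 16, kernel_size=(1, 2, 2), stride=(1, 2, 2)), nn.ReLU(),
            nn.ConvTranspose3d(16, num_classes, kernel_size=(1, 2, 2), stride=(1, 2, 2)),
        )

    def forward(self, clip):             # clip: (batch, 3, frames, H, W)
        return self.dec(self.enc(clip))  # logits: (batch, classes, frames, H, W)

net = Video3DSegNet()
clip = torch.rand(1, 3, 8, 64, 64)  # the network sees all 8 frames at once
print(net(clip).shape)              # torch.Size([1, 2, 8, 64, 64])
```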

Segmentation Based Data Augmentation

This project was done as part of the coursework for 16-720B (Introduction to Computer Vision) at Carnegie Mellon University.

Live Ads

This project was built during a hackathon organised by Myntra, a fashion retailing website. Live Ads spruces up print advertisements by augmenting them with interactive 3D models that add value to the advertisement's content. Our demo targeted fashion retailers: we augmented existing Myntra advertisements so that the advertised apparel appeared tried on by computer-generated 3D models. The app identified advertisements registered with it and dynamically added custom 3D models, along with call-to-action buttons that let the consumer buy the product directly.
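
A rough sketch of the "identify a registered advertisement" step using ORB features and a homography in OpenCV (illustrative only; the file names are hypothetical, this is not necessarily the pipeline the app used, and the 3D rendering itself is not shown):

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

ad = cv2.imread("registered_ad.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical file
frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file

kp_ad, des_ad = orb.detectAndCompute(ad, None)
kp_fr, des_fr = orb.detectAndCompute(frame, None)
matches = sorted(matcher.match(des_ad, des_fr), key=lambda m: m.distance)[:50]

src = np.float32([kp_ad[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_fr[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

if H is not None and mask.sum() > 15:
    # The ad was found; H maps ad coordinates into the camera frame, giving
    # the anchor at which a 3D model and buy buttons would be overlaid.
    h, w = ad.shape
    corners = cv2.perspectiveTransform(
        np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2), H)
    print("Ad detected at corners:", corners.reshape(-1, 2))
```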

VR Works

VR Works is an ongoing side project I have taken up at work, with the goal of exploring the emerging field of virtual reality by prototyping a set of ideas and refining them for eventual productisation. My focus is on building authoring systems that can help democratise content creation in virtual reality. Current prototypes are geared towards education and training scenarios. Specifically, I prototyped the following three ideas:

Magic Green Screen

An extension of work done during my summer internship, Magic Green Screen was the marquee feature of Presenter Video eXpress (PVX) 11. Over the course of almost a year it has received rave reviews from customers for its ease of use. As part of a two-member team, I focused on the matting and despilling algorithms used in the feature. I also developed a visualisation system that gives the user feedback on how suitable the recording environment is for background separation. The work on matting and despilling has been filed as a patent with the USPTO, titled: METHOD AND APPARATUS FOR REAL-TIME MATTING USING LOCAL COLOR ESTIMATION AND PROPAGATION.
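
The patented local colour estimation and propagation method is not reproduced here, but a crude chroma-key matte and despill conveys the flavour of the problem (an illustrative numpy sketch with hypothetical thresholds):

```python
import numpy as np

def chroma_key(rgb):
    """Crude green-screen matte and despill. Illustrative only; the shipped
    feature uses local colour estimation and propagation, not shown here."""
    rgb = rgb.astype(np.float32) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Alpha: how much the pixel's green exceeds its other channels.
    spill = g - np.maximum(r, b)
    alpha = np.clip(1.0 - spill * 4.0, 0.0, 1.0)
    # Despill: clamp green down towards the max of red/blue.
    despilled = rgb.copy()
    despilled[..., 1] = np.minimum(g, np.maximum(r, b) + 0.02)
    return despilled, alpha

frame = (np.random.rand(4, 4, 3) * 255).astype(np.uint8)  # stand-in frame
fg, alpha = chroma_key(frame)
composite = fg * alpha[..., None]  # composite over black
```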

Marvin - Martian Tour Guide

The Curiosity rover has sent back terabytes of data so far, and NASA has released stunning images received directly from Mars for public viewing. This project turned those images into a VR experience for the general public: an app that, used with the openly available Google Cardboard, presents 360-degree views of Mars in stereoscopic vision, as if you were riding the rover itself.

Keyframe Cut

Keyframe Cut is a background removal algorithm for videos, developed as part of a summer internship project with the eLearning team at Adobe Systems, Bangalore. We started with a comprehensive literature survey, populating a knowledge base on computer vision for internal use. We then focused on devising novel algorithms for our constrained setting of talking-head videos. Keyframe Cut fuses multiple cues to segment each frame of the video: starting from a fully segmented frame supplied by the user (the keyframe), we employed GMMs (colour cues), face/body detectors (shape cues), and frame differences (motion cues) to segment every subsequent frame. Our final algorithm had an average error rate of under 6% on our test dataset.
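
A minimal sketch of the cue-fusion idea (the weights and fusion rule here are illustrative placeholders, and the GMM, detector, and motion models are stubbed out with random probability maps):

```python
import numpy as np

def fuse_cues(colour_p, shape_p, motion_p, weights=(0.5, 0.3, 0.2)):
    """Fuse per-pixel foreground probabilities from the three cues.
    Illustrative weighted combination; the actual fusion rule in
    Keyframe Cut differed and the cue models are not shown."""
    w_c, w_s, w_m = weights
    fused = w_c * colour_p + w_s * shape_p + w_m * motion_p
    return fused > 0.5  # binary foreground mask for this frame

h, w = 120, 160
colour_p = np.random.rand(h, w)  # e.g. GMM foreground likelihood
shape_p = np.random.rand(h, w)   # e.g. face/body detector prior
motion_p = np.random.rand(h, w)  # e.g. frame-difference magnitude
mask = fuse_cues(colour_p, shape_p, motion_p)
```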

Qwitter

Qwitter is a Windows Phone 8 application conceptualised and first created for the Microsoft Code.Fun.Do hackathon. The app was selected as one of the nationwide finalists, where it was refined under mentoring from Microsoft engineers. Qwitter aims to help people quit smoking through a community of other anonymous quitters. Users log their experiences via 140-character messages called qweets, which are shared anonymously with other users, and milestones gamify the quitting experience to keep users engaged. I developed the app's server-side infrastructure and built a classifier to detect spam messages and weed them out of the system: a specialised SVM trained for short-text classification, such as tweets and SMSes.
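
A minimal sketch of such a short-text spam classifier using scikit-learn (toy data and hypothetical messages; the production model and its training set were different):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Character n-grams cope better with the abbreviations and misspellings
# common in 140-character messages than word tokens alone.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(),
)

qweets = ["day 3 without a cigarette, feeling good",
          "WIN FREE CIGS CLICK www.spam.example NOW!!!",
          "craving hit hard after lunch but held on",
          "cheap smokes discount offer click here"]
labels = [0, 1, 0, 1]  # 1 = spam (toy data, not the real training set)

clf.fit(qweets, labels)
print(clf.predict(["limited offer free discount click now"]))
```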