Sitemap

A list of all the posts and pages found on the site. For you robots out there is an XML version available for digesting as well.

Page Not Found

Page not found. Your pixels are in another canvas.

Siddhant Jain / Computer Vision Engineer

About me

Jupyter notebook markdown generator

Posts

Notes - Importance of data in vision

3 minute read

Published: July 17, 2018

Some notes after going through a few papers and 16-824 lecture slides on the different aspects of data in computer vision.

TIL - Deconvolution

1 minute read

Published: April 23, 2018

While, I had a hand-wavy idea about deconvolution earlier, for a project I had to properly understand the concept. Hence, taking a few minutes off to write a quick note about deconvolution layers

Parallel tracking and mappping for small AR workspaces

5 minute read

Published: March 19, 2018

This is a summary of a paper i recently read. The paper can be found here: http://www.robots.ox.ac.uk/~gk/publications/KleinMurray2007ISMAR.pdf

TIL - ICP Matching

1 minute read

Published: February 24, 2018

For some class reading and a project I am pursuing, I decided to understand the iterative closed point matching algorithm. Here is a gist of my understanding.

TIL - Bayes Filters

3 minute read

Published: February 24, 2018

I am doing a course in Robot Localisation and Mapping this semester which shall hopefully trigger a bunch of posts where I try to explain concepts in the course to myself. Here is the first in that series, Bayes Filter: the foundational algorithm for state estimation.

TIL - Principal Component Analysis

3 minute read

Published: February 20, 2018

I was revising my understanding of Principal Components Analysis and I thought it might be a good idea to articulate what I understood in the form of this blog post. I intend this to serve as quick a tutorial for any future reader who has some idea about PCA and wants to brush up the concepts.

TIL - R-CNN/Fast RCNN/Faster RCNN

9 minute read

Published: February 18, 2018

Writing this post as a way to archive my notes while reading the series of papers: R-CNN, Fast R-CNN and Faster R-CNN. The last in the series is our first reading for the course 16-824: Visual Learning And Recognition at CMU. My hope is that the semester allows me enough time to write these notes for each of the readings. So far, so good, I guess.

TIL - Scanning Images in openCV

3 minute read

Published: February 11, 2016

Often, we would want to scan through each and every pixel of an image in openCV. Now, if you are new to openCV (or lazy about writing code), in all probability, you will end up using the the inbuilt at function to access the pixels. This choice is understandable because all you need to do to specify the co-oridnates of the pixel and you get access to the pixel. However, this method is painfully slow and if you need to optimise your OpenCV code, this should probably the first place you should look at. If you are going to access each and every pixel in the image, in sequence, instead of relying on the at function (which was written to provide ease in random access of pixels in an image), you should go for the pointer based access. That means, get the pointer to the starting location of the image data and increment the pointer by an amount equivalent to the size of one pixel to access the next pixel. Let me explain with some code. Assume we have an image stored in Mat called imageMat. This image is an RGB image with each channel (R,G or B) of size 8 bits (1 Byte). This mean, each pixel is of size 3 bytes. In the following code, we will declare a uchar pointer (pointer pointing to a memory of 1 Byte) and increment it by 3, everytime we want to access the next pixel (simple pointer arithmetic at play here).

TIL - Include guards

1 minute read

Published: January 20, 2016

Small post to note down a small thing about programming that I learnt today because of some error I was getting.

Battling with the computer - 3 of Many

2 minute read

Published: January 13, 2016

TL;DR : My computer won’t go beyond the welcome screen in Windows 7, but it will boot properly. i removed a memory card from the memory slot to finally solve the issue

Parsing command line arguments in Java

1 minute read

Published: November 27, 2015

When you have command line arguments, the beginners instinct is to resort to the basics and “parse” them by using the “args” parameter and indexing into it (like: args[0] , args[1] etc.) . Having succumbed to this instinct recently, i learnt the hard way to write better code always. Thought I must archive.

Automatically Generating Subtitles for an Audio using CMU Sphinx

less than 1 minute read

Published: November 18, 2015

A small tutorial to generate subtitles for an audio automatically using CMU Sphinx.

Experimenting with the Goodreads API

1 minute read

Published: July 02, 2015

Goodreads.com is a website (now owned by Amazon.com) that aims to create a social network sorts for bookreaders. I recently started using the website heavily, primarily to archive the books I have been reading and also to get some good recommendations for lining up the next few books I can read. Now that I home and reading a lot more than usual I ended up spending more than I would have liked on buying the books I want to read. Since a good number of my friends have their own tiny book collections, I was wondering if some of them already own these books that I just spent so much money on. Fortunately, goodreads has an option to mark a particular book as a book you own. However, if I have to see if any of my friends own a particular book, I will have to go to each of their profile and find it out manually. Since I was a little bored anyway, I thought of writing a small script that can go through all my friends on goodreads and compile a list of books they own. Later, if I give a book id, I should simply get names of my goodreads friends who own the book.

Hiccups, Poet-Unknown

1 minute read

Published: June 12, 2015

There is this poem on Hiccups that we used to recite as kids. This poem has been an all-time favourite for the pure entertainment value it provides when recited. I have recited it so many times that I think I still remember parts of it. I thought of googling the parts I remember to find out the entire poem, alas, i couldn’t find anything. So before i forget these parts as well, I am archiving it here. Hopefully, some other wandering soul on the internet might have the other parts and will help me piece it together. So here goes the poem as it is in my head:

Cycling to Work

3 minute read

Published: May 14, 2015

There had always been this itch to get back to cycling and after coming to Bangalore that yearning just became stronger when I saw so many cyclists on the roads. I remember having a similar craving for a cycle in first year of college and after a few weeks it just fizzled out. So, before actually going ahead an buying a cycle, I decided to rent one out for a couple of weeks and yesterday was Day 1 of renting that cycle. This post is just to archive experiences for these two weeks, for I believe this fancy may die too and then later when I want to get back to saddle again, the post will be a reference point for pros and cons.

Monitoring an AWS cloud using the AWS SDK

4 minute read

Published: March 30, 2015

I have now started working on a project that requires me to monitor some metrics from AWS. At first i attempted to use use the Query API provided by AWS and get the desired information by just making some REST calls, however I haven’t been able to make it run correctly yet. So now I am resorting to using the SDK which is supposed to make the process easier. (I will get back to the REST API in a while).

Finding System Statistics in Java

3 minute read

Published: January 20, 2015

Continuing on my quest to get more familiar with Java (and kill some time during this workless phase of the internship) I tried another problem to print some system statistics. The problem statement was to print the CPU utilisation, Disk Utilisation and Memory utilisation after every 2 seconds and print the average of these metrics affter ever minute. While this problem is quite straightforward in C/C++, it becomes a little interesting in Java because of JVM. The existing functions in Java would have all given the statistics for the virtual Machine and not for the actual system on which the program is running. A straight-forward google search suggested the use of SIGAR which basically made the task as simple as doing it C/C++. However, of course, it took away the flexibility of running the program on any machine as the binaries for the API were platform dependent.

Socket client-server program in Java

8 minute read

Published: January 19, 2015

A week ago I started my semester long internship in the Advanced Software Division at EMC, Bangalore. As a warm-up exercise (read: work to do till the projects are figured out), I was asked to write a basic socket client-server program in Java that allows multiple clients to connect to a server and execute the following commands:

chdir [dir] //change directory to the given directory name rmdir [dir] //remove directory mkdir [dir] //make a new directory list [ ] //list current working directory exit //close the connection

I had written a simple one server-one client socket program in C for a lab exercise in a computer network courses at the university and almost began writing a similar program in Java, when I realised the slight complexity that connection of multiple clients at the same time brings to the problem statement.

Trading in the Indian Stock markets

4 minute read

Published: November 17, 2014

Now that I am about to finish my majors, one of which happens to be in economics, I have decided to finally learn some tricks of the trade - literally. ( Fun fact: OED has appended a rather controversial meaning to the word. Check here ) The idea is to start trading virtually in the Indian stock markets. I will be using icicidirect’s platform for virtual stock trading, although the UI as of now seems very discouraging. I might change to something else, with time. The big question is of course how am I goint to invest. Well, I initially intended to invest like a sensex linked fund. That means, I wanted to divide the total money I have among the same stocks and the same weights as is used to calculate the sensex. This was out of a simple observation that as of today the sensex more than doubled in the last five years and this happened while the sensex was clouded with all the negative sentiments and other things like that which financial newspapers report. These days it looks like things are hunky dory in the market and hence it’s best to invest in the market. That said, I also realised that such indices tend to follow a cycle and it looks like we are on the peaks these days and it may not be the best starting point to buy in a sensex linked manner, hence I am choosing to select my stocks in a different manner. It’s 2 in the afternoon and I had a heavy lunch, so I do not feel like doing a lot of work. I am taking the path of least resistance. I have basically selected a few stocks from the 30 in the sensex because I like the companies and more than 66% of the users of moneycontrol.com think that I should buy the stock. Simple. No brains. It’s almost like picking the stocks at random. Since this is testing the waters, I am buying one share of each company and will judge my portfolio’s performance in percentage terms only. Here are the stocks that I am buying: 1) Infosys 2) HDFC Bank 3) Tata Steel 4) ONGC 5) Unilever Now if I look at the graphs, all these stocks are doing really well right now (and hence they should fall in time to come), but I am still sticking to the (very stupid) strategy. Let the users tell me what should I do. It is almost like an experiment. I check moneycontrol for these five stocks everyday and sell only if more than 50% of the users think I should sell a stock. If they do, I sell the stock and shift to another stock that they recommend I should buy. I will sell all stocks and end the experiment, if my portfolio value increases/reduces by 20 percent of what I started with. I plan to continue on this experiment for the coming month. Setting christmas day as my deadline to evaluate the strategy.

Downing a day in Delhi

7 minute read

Published: September 22, 2014

So, yesterday I went to Delhi because I have a bunch of friends living there and a lot of free time. One of them (manickam), has suddenly become a meticulous planner and we ended up going on a food trip. Since I liked most of the places we went to, I am writing this down so that next time I have a handy reference when someone asks what should they do in Delhi. I wish I had bought one of those cameras I have been obsessing about for the last few days (more on that in some other blogpost) by now. Owing to the lack of discounts on cameras in India, please bear with lots of text and no pictures (modern day blog readers’ nightmare?)

Battling with the Computer - 2 of many

1 minute read

Published: April 02, 2014

Another extremely trivial battle I used to have with my computer everyday until I decided to do something about it, finally.

Battling with the Computer - 1 of many

1 minute read

Published: March 27, 2014

My epic plans of self-training myself into a Data Scientist, failed as miserably as the first few models I made for the problem sets on Kaggle. Meanwhile, I was being humbled by more simpler problems in the world of Computers, namely ‘battery plugged in and not charging’. The problem has been solved for now, without a penny being spent; taking it down for future reference.

Kaggle – Data Science London + Scikit-learn (Using k-nn in R)

1 minute read

Published: December 31, 2013

This post won't make much sense unless you have seen this post as well: http://siddhantj.wordpress.com/2013/12/31/kaggle-data-science-london-scikit-learn-using-svm-in-r/

Alright, so moving on.

I implemented the k-nearest neighbour model next and there was a significant improvement in my leader-board ranking (16 positions!) and a 2.5% increment in my accuracy. I tried two different values for k: 5 and 15. For k=15, my accuracy dropped slightly(0.3%) and hence k=5 seems to be a giving a decent value. I am left with just one more allowed submission in the next 6 hours and I am tempted to try another value of k (12, no sound logical reasoning behind the number. Just a hunch.)

I ended up spending a lot of time in order to figure out how to use KNN in R. Have little or no idea about the data types in R which is where I got stuck. So far, I have been floating by looking at examples and learning. Hoping, that I will eventually learn about R as well. Most of my time went in solving a trivial error that was there because my variable that stored the labels of the training set was not of the type - Factor. The problem and the solution that I ended up using are almost exactly explained here

Again, appending the commands that worked:

#assuming train,test and trainLabels are as defined in the previous post
cl=trainLabels[,1]
answer<- knn(train,test,cl, k=5)
write.csv <-(answer,"answer2.csv")

PS: I did try that k=12 hunch and failed. Must learn how to do cross-validation next.

Kaggle - Data Science London + Scikit-learn (Using SVM in R)

1 minute read

Published: December 31, 2013

I finished a course on machine learning last semester and came across kaggle some time around then. I immediately pushed it in my winter-break-to-do list and lucky kaggle got selected as the only thing that actually happened from that list.

Eidetic Memory and the lack of it

less than 1 minute read

Published: November 10, 2013

Eidetic Memory, more commonly known as photographic memory is basically the ability to remember everything you perceive through most of your senses. I just took a test here, and well the predictable happened - I failed. Obviously, I do not have a photographic memory, hence this blog to archive things I want to recollect.

patents

Creating Personalised Catalogoues With Recommendations Embedded In Augmented Viewpoint To Retarget Consumers

Published: October 31, 2016

Using Augmented Reality to make better recommendations for products

Removing Overlays From A Screen To Separately Record Screens And Overlays In A Digital Medium Environment

Published: November 14, 2016

Technique to improve screen recording experiences by showing visual cues which are removed from the recording using image processing

Method And Apparatus For Real Time Matting and De-Spilling Using Local Color Estimation and Propagation

Published: October 10, 2017

This patent is on a matting technique for webcam videos

Download here

portfolio

Portfolio item number 1

Short description of portfolio item number 1

Portfolio item number 2

Short description of portfolio item number 2

projects

Qwitter

Published: July 31, 2014

Qwitter is a windows phone 8 application conceptualized and created first for the Microsoft Code.Fun.Do hackathon. The app was selected as one of the nationwide finalists, where it was refined under the mentoring provided by Microsoft Engineers. The app aims to help people looking to quit smoking through a community of other anonymous quitters. Users log their experiences via 140 character messages - qweets which are shared anonymously with other users. Along with this, different milestones are defined to gamify the quitting experience and keep the users engaged. I developed the server side infrastructure of the App along with developing a classifier to detect Spam messages and weed them out of the system. This detector was a specialied SVM trained for short-text classification, such as tweets and SMSes.

Keyframe Cut

Published: August 01, 2014

Keyframe cut is a background removal algorithm for videos that was developed as a part of a summer internship project with the eLearning team at Adobe Systems, Bangalore. We started our project with a comprehensive iterature survey - populating a knowledge base on computer vision for internal use. Thereafter we focused on devising novel algorithms to solve the problem, given our constraints with talking head videos. Keyframe cut fuses multiple cues to segment each frame of the video. Starting from a completely segmented frame given by the user - the keyframe, we employed GMMs (colour cues), face/body detectors (shape cues) and frame differences (motion cues) to segment each frame. Our final algorithm had an average error rate of under 6% for a test dataset.

Marvin - Martian Tour Guide

Published: April 19, 2015

The Curiosity Rover has sent back terabytes of data so far. NASA has released stunning images received directly from Mars for public viewing. The project used these images to provide for a VR experience to the general public. We developed an app that can be used in conjunction with the openly available Google Cardboard to see 360 degrees views of Mars along with viewing Mars with a stereoscopic vision as if riding the rover itself.

Magic Green Screen

Published: September 01, 2015

An extension of the work done during summer internship, Magic Green Screen was the marquee feature for Presenter Video eXpress (PVX) 11. Over the course of almost a year it has received some rave reviews from Customers for its ease of use. As a part of a two membered team, my work was focused on the algorithms used for matting and despilling in the feature. Additionally, I also developed a visualisation sytem used in the feature to provide a feedback on the suitability of the recording environment for background separation to the user. The work done on matting and despilling has been filed in the USPTO as a patent, titled: METHOD AND APPARATUS FOR REAL-TIME MATTING USING LOCAL COLOR ESTIMATION AND PROPAGATION.

VR Works

Published: March 07, 2016

VR Works is an on-going side project I have taken up at work with the goal of exploring the emerging field of virtual reality by prototyping a bunch of ideas and perfecting them for eventual productisation. My focus is on building authoring systems which can help democratise content creation in virtual reality. Current prototypes are geared towards education and training scenarios in virtual reality. Specifically, I prototyped the following three ideas:

Live Ads

Published: April 16, 2016

This project was made during a hackathon organised by Myntra - a fashion retailing website. Live-Ads spruces up print adveritisement by augmenting them with interactive 3D models that further add value to the content in the advertisement. Our demo was focused towards fashion retailers where we Augmented existing advertisements of Myntra with apparels advertised tried on computer generated 3D models. Our app, identified advertisements registers with it and added custom 3D models dynamically to the advertisement, along with call-to-action buttons to enable the consumer to buy the product directly.

Segmentation Based Data Augmentation

Published: December 06, 2017

This project was done as a part of coursework for 16-720B (Introduction to Computer Vision) at Carnegie Mellon University

GradCoin - A poor-to-poor electronic cash transfer system

Published: March 13, 2018

First, check out what is SIGBOVIK here: [http://sigbovik.org/2018/]

Simultaneous Localisation and Mapping for Landing Site evaluation for drones

Published: May 18, 2018

This project was done as a part of coursework for 16-833 (Robot Localisation and Mapping) at Carnegie Mellon University

One shot learning for video object segmentation using 3D convolutions

Published: May 19, 2018

In this project we explored a new direction for performing video segmentation, i.e. using the I3D network architecture with the pre-trained weights from the training of the kinetic dataset. We adapted the I3D network into a Fully Convolutional Network adding upsampling layers at various stages to give us the final segmentation map. This allowed us to look at all frames of the videos at once giving a superior method to segment the video but at the same time it also introduced some unique problems that we address in this work.

Deep Parametric Style Transfer

Published: September 20, 2018

I spent the summers of 2018 with Adobe Research working on a deep network that transfers parametrised styles between images. Our end-to-end trainable system embeds a non-differentiable style renderer in the network that allows us to provide a better supervision in terms of an image loss instead of a regression loss on parameters as was the case with previous approaches. We also adapted our method to be webly supervised by exploiting the large number of movie-trailers available in the public domain.

publications

Paper Title Number 1

Published in Journal 1, 2009

This paper is about the number 1. The number 2 is left for future work.

Recommended citation: Your Name, You. (2009). "Paper Title Number 1." Journal 1. 1(1). http://academicpages.github.io/files/paper1.pdf

Paper Title Number 2

Published in Journal 1, 2010

This paper is about the number 2. The number 3 is left for future work.

Recommended citation: Your Name, You. (2010). "Paper Title Number 2." Journal 1. 1(2). http://academicpages.github.io/files/paper2.pdf

Paper Title Number 3

Published in Journal 1, 2015

This paper is about the number 3. The number 4 is left for future work.

Recommended citation: Your Name, You. (2015). "Paper Title Number 3." Journal 1. 1(3). http://academicpages.github.io/files/paper3.pdf

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.

Siddhant Jain

Sitemap

Pages

Posts

patents

portfolio

projects

publications

talks

teaching