Kumar Ashutosh Graduate Student i code. i debug. i code. The flame of revolution must keep burning!

hi.

I am a fourth-year CS Ph.D. student at UT Austin working with Prof. Kristen Grauman. I am also a visiting researcher at Meta AI. My research interests lie broadly in Computer Vision and Machine Learning. I am currently working on video understanding and video-language models.

Prior to this, I spent five wonderful years at IIT Bombay where I completed my Dual Degree (B.Tech and M.Tech) in Electrical Engineering. For my master's thesis, I was supervised by Prof. Subhasis Chaudhuri and worked on 3D reconstruction from multi-view images.

I occasionally write blogs about my projects, experiences, travels and thoughts. You can find the blogs below and also in the archive. I appreciate questions or feedback regarding any of my projects, blogs or, more broadly, my experience. Please email me.

Updates

May 2024: Recognized as an outstanding reviewer (top 2%) for CVPR 2024.
Apr 2024: Ego-Exo4D (paper) is selected as an oral presentation (selection rate: 0.8%) and VidDetours (paper) is selected as a highlight presentation (selection rate: 2.8%) at CVPR 2024.
Feb 2024: Four papers accepted at CVPR 2024 - detours for navigating instructional video (paper), the Ego-Exo4D dataset (paper), identifying sounding actions in videos (paper) and learning object state changes in an open world (paper).
Sep 2023: Our work on video-mined task graphs for keystep recognition in instructional videos (paper) is accepted at NeurIPS 2023.
Feb 2023: Our work on hierarchical video-language embeddings (HierVL) is accepted at CVPR 2023.
Jan 2023: Started as a visiting researcher at FAIR, Meta AI.
Sep 2022: Our paper on robust stochastic knowledge distillation is accepted at ICDM 2022.
May 2022: Started as a Research Intern (AI) at Meta AI for Summer 2022.
August 2021: Joined UT Austin to pursue a Ph.D. in CS!
June 2021: I completed my Dual Degree (B.Tech with major in Electrical Engineering and minor in Computer Science, M.Tech in Electrical Engineering with specialization in Communication and Signal Processing) from IIT Bombay.
April 2021: I will be joining UT Austin starting Fall'21 to pursue a Ph.D. in Computer Science.
January 2021: Our work on statistically robust bandit algorithms is accepted for oral presentation at AISTATS (selection rate 3%).
August 2020: I will be a TA for the course EE 635 - Applied Linear Algebra.
July 2020: Our work on Lower Bounds of Policy Iterations is accepted at IEEE CDC 2020. (preprint)
May 2020: Participated in pan-India AR hackathon on COVID-19. (presentation)
Apr 2020: Paper accepted at IEEE ComPE-20 to be held from 2-4 July. Conference to take place in virtual mode.
Mar 2020: IIT Bombay suspends classes for all students due to COVID-19.
Nov 2019: Started 40-day internship at 360World at their Budapest office.
 




ExpertAF: Expert Actionable Feedback from Video

ArXiv 2024
Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos, Kris Kitani, Kristen Grauman
Paper  

Detours for Navigating Instructional Videos

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024
Highlight Presentation (Selection rate: 2.8%)
Kumar Ashutosh, Zihui Xue, Tushar Nagarajan, Kristen Grauman
Paper  

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024
Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman
Paper   Website  

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024
Oral Presentation (Selection rate: 0.8%)
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, ... , Michael Wray
Paper   Website  

Learning Object State Changes in Videos: An Open-World Perspective

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024
Zihui Xue, Kumar Ashutosh, Kristen Grauman
Paper   Website  

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Neural Information Processing Systems (NeurIPS), December 2023
Kumar Ashutosh, Santhosh Kumar Ramakrishnan, Triantafyllos Afouras, Kristen Grauman
Paper   Website  

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

ArXiv 2023
Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman
Paper  

HierVL: Learning Hierarchical Video-Language Embeddings

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2023
Highlight Presentation (Selection rate: 2.5%)
Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman
Paper   Website  

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

IEEE International Conference on Data Mining (ICDM), 2022
Ajay Jaiswal, Kumar Ashutosh, Justin F Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding
Paper  




Bandit algorithms: Letting go of logarithmic regret for statistical robustness

Artificial Intelligence and Statistics (AISTATS), 13th-15th April 2021, Online Conference
Oral Presentation (Selection rate: 3%)
Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan
Paper  

3D-NVS: A 3D Supervision Approach for Next View Selection

International Conference on Pattern Recognition (ICPR) 2022
Kumar Ashutosh, Saurabh Kumar, Subhasis Chaudhuri
Paper  

Lower Bounds for Policy Iteration on Multi-action MDPs

IEEE Conference on Decision and Control (CDC), 8th - 11th December 2020, Online Conference
Kumar Ashutosh*, Sarthak Consul*, Bhishma Dedhia*, Parthasarathi Khirwadkar*, Sahil Shah*, and Shivaram Kalyanakrishnan
Paper   Code  

Hardware Performance Analysis of Mobile-Based Augmented Reality Systems

IEEE Conference on Computational Performance Evaluation (ComPE), 2nd - 4th July 2020, Online Conference
Kumar Ashutosh
Paper   Presentation  

A Multilayer Network Approach for Studying Creative Ideation from EEG

Brain Informatics Conference (BI), 10th - 12th December 2018, Arlington, Texas, USA
Rohit Bose*, Kumar Ashutosh*, Junhua Li, Andrei Dragomir, Nitish Thakor, and Anastasios Bezerianos
Paper  




RLConnect - Reinforcement Learning agent to play Dots and Boxes

This project is being done in the Autumn 2020 semester, with completion expected in December 2020. We experiment with RL algorithms to automate Dots and Boxes, a popular high-school pen-and-paper game.

Capture the Flag - An out of box AR Game

This blog contains the report of a Research and Development project done under the supervision of Prof. Parag Chaudhuri, IIT Bombay. In this project, I developed a novel AR-based tabletop game which fits into the terrain of the scene and adjusts the game objects based on the markers. The characters can move in the 3D world with correct spatial understanding.

Virtual Keyboard using Leap Motion Controller

This is one of my many hobby projects done this semester. It is aimed at learning how to use hardware for AR/VR. I wanted to experiment with the Leap Motion Controller as soon as I learned about it.

Discrete Analog to Digital Converter

In this blog, I include the final report of the Electronic Design Lab (EDL) project done under the guidance of Prof. Pramod Murali. The report contains all the relevant details, including the circuit diagram, results, and flowcharts.

RISC Processor Design

This project was done as a part of the Microprocessors course at IIT Bombay. The microprocessor design was done for both multi-cycle and pipelined architectures. The theory required for the project was taught alongside, during the 5th semester of my undergraduate course, by Prof. Virendra Singh, IIT Bombay.

Extended Visualization: Focus in GLSL

Google Summer of Code is an annual open-source program which selects open-source enthusiasts to work on interesting projects under many different open-source organizations. The aim of the program is to promote open source and to teach good coding practices. The program is highly competitive and the acceptance rate is around 20-25%. I was selected by dipy, an open-source organization in medical imaging. My mentor is an Assistant Professor at Indiana University, Bloomington. The blog contains my overall experience as a GSoCer.

Multi-layered analysis of Brain Networks

I was selected as a research intern under Prof. Anastasios Bezerianos and Prof. Nitish Thakor at SINAPSE (Singapore Institute for Neurotechnology), NUS (National University of Singapore). I worked on establishing brain connectivity patterns during creative thinking. I explored how the EEG (electroencephalogram) brain pattern changes during creative thinking and established statistical differences between the two states of mind.

Discriminative Localization in Medical Images

Blog on Research and Development Project done under Prof Amit Sethi, IIT Bombay.




Other Projects

Lists some other projects done as part of my curriculum at IIT Bombay. The topics covered by the projects listed here include signal processing, online learning, electrical circuits, microcontroller programming, etc.

Augmented Reality Chess Game - ARChess

The aim of the project is to develop a fully functional Augmented Reality based Chess game application.
STALLED. Done independently by Rishi Vanukuru.

Geolocation-AR: AR Based Navigation System

The aim of this project is to develop a fully functional Augmented Reality based Location Assistant.
STALLED. If you are a junior/sophomore at IITB and this project intrigues you, I can mentor you. Email me.



Do we already travel across time?

My thoughts on how time travel can actually be a reality, and how we might indeed be doing it unknowingly in many forms. Read this small piece to know what I think.

Interesting World of Shaders and VTK

The last two weeks were full of fun. Sitting near my screen exploring various shaders was fun. With the help of Elef and Ranveer, I could try a lot of different examples, and all this helped me build some really cool visualization examples (At least I was amazed :P ). In this blog, I walk the reader through a few things I achieved over the past two weeks.

End of Community Bonding Period

Hey folks!! We have come to the end of the first stage of GSoC project, the Community Bonding Period. Official coding begins NOW!!

Selected for GSoC - 2018

Phew!!! Selected for GSoC-2018 :)

My Contributions to Scikit-Learn

I am an active contributor to scikit-learn, a machine learning package in Python. I list my contributions to scikit-learn in this blog.

First Quantum of Contribution

I am back to my very own space. I had been hoping to write on this page for a very long time. But yeah, if only I could have contributed earlier. But better late than never.

My First Blog

This is my first blog. My curiosity to explore Jekyll pages instigated me to make a webpage of my own and host it on my Github profile. I would recommend all the readers to give it a try and make your own personal page. Please let me know if you find this blog attractive.

Interactive Map

Weekends in Japan - The End

Part 3 of my weekend trips in Japan. Details my experience in Tokyo.

Weekends in Japan - II

Part 2 of my weekend trips in Japan. Details my experience at FujiQ Highlands and Odaiba.

Location: Yokohama Chinatown, Japan

Weekends in Japan - I

Part 1 of my weekend trips in Japan. Details my experience at Tokyo Skytree, Ueno Park, Sumida Aquarium, Tokyo Tower, Rainbow Bridge, and Kamakura.

Location: Kamakura, Japan

The Untouched Paradise - Trip to Andaman Islands

In this blog post, I pen down my family trip experience to the untouched paradise - the Andaman and Nicobar Islands. In this week-long journey, we went to many beautiful beaches and historical sites. Scuba diving, I would say, was the pinnacle of my excitement.

Location: Radhanagar Beach, Andaman