I am a second-year Ph.D. student in the Machine Learning Group at Princeton's Computer Science department, where I work with Dr. Adam Finkelstein.
I am very interested in multimodal machine learning, specifically combining textual and visual cues to solve problems in Question Answering and Audio Generation. I am also deeply interested in audio and speech tasks related to cognition, and in how humans perceive and quantify differences in audio.
My interests lie in speech and audio processing and in natural language processing. In one of my current projects, I am trying to combine these two modalities.
I completed my B.Tech at the Indian Institute of Technology, Guwahati in 2018, in Electronics and Electrical Engineering and Computer Science.
In the summer of 2017, I interned at Carnegie Mellon University, PA, advised by Dr. Bhiksha Raj, who leads the Machine Learning and Speech Processing Group at LTI-CMU. We worked on the problem of retrieving all semantically similar audio given a query clip, and devised a novel Siamese neural network based approach that encodes audio into a vector representation useful for retrieving similar recordings.
Previously, in the summer of 2016, I interned at the Indian Institute of Technology, Delhi (IIT Delhi), advised by Dr. Tapan K. Gandhi, where I worked on Machine Learning algorithms for analyzing and detecting tumors in the human brain using MR images. I designed a comprehensive tool that segments the tumor in a brain MR image and shows which 'Brodmann Area' it corresponds to.
- DPLM: A Deep Perceptual Spatial-Audio Localization Metric
Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia
- CDPAM: Contrastive learning for perceptual audio similarity
Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein
To appear in ICASSP 2021 - Canada
- A Differentiable perceptual audio metric learned from just noticeable differences (best paper finalist!)
Pranay Manocha, Adam Finkelstein, Richard Zhang, Nicholas J. Bryan, Gautham J. Mysore, Zeyu Jin
Interspeech 2020, Shanghai - China
- Content-based Representations of audio using Siamese neural networks
Pranay Manocha, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, Bhiksha Raj
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018 Calgary-Canada
- Brain Classification and Segmentation of MR Brain Images
Tanvi Gupta, Pranay Manocha, Tapan Kumar Gandhi, R.K Gupta, B.K Panigrahi
- Tumor Segmentation and Gradation for MR Brain Images
Tanvi Gupta, Pranay Manocha, R.K Gupta, Tapan Kumar Gandhi
Computational Intelligence and Communication Technology (CICT-2018)
- Automated tumor segmentation and brain mapping for the tumor area
Pranay Manocha, Snehal Bhasme, Tanvi Gupta, Tapan Kumar Gandhi
International Conference on Human-Computer Interaction - IHCI 2017