Posts by Collection

projects

Estimating 3D trajectory using feature-based sparse SLAM

A class project for CS 763 (Computer Vision), Spring '19, IIT Bombay

The algorithm is a feature-based sparse SLAM method, called MonoSLAM, that recovers a camera’s 3D trajectory. Key steps include feature extraction, distance-based feature matching, essential matrix estimation, pose estimation, world coordinate computation, and visualization. Code link here
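The essential matrix estimation step can be illustrated with the linear eight-point algorithm. This is a minimal sketch on synthetic, noise-free correspondences and not the project's actual code: the function names, the synthetic pose, and the choice of the eight-point estimator (rather than, say, a RANSAC-based one) are all assumptions made for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def estimate_essential(x1, x2):
    """Linear eight-point estimate of E from normalized homogeneous points (rows).

    Each match gives one linear equation x2^T E x1 = 0 in the nine entries of E.
    """
    A = np.stack([np.kron(p2, p1) for p1, p2 in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)            # null vector of A, reshaped row-major
    U, _, Vt = np.linalg.svd(E)
    # Project onto the essential manifold: two equal singular values, one zero.
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def epipolar_residuals(E, x1, x2):
    """Return x2_i^T E x1_i for every correspondence i."""
    return np.einsum("ij,jk,ik->i", x2, E, x1)

# Synthetic scene (hypothetical values): camera 2 is related to camera 1
# by X2 = R @ X1 + t.
rng = np.random.default_rng(0)
angle = 0.1
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.5, 0.0, 0.1])

X1 = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))  # 3D points, camera-1 frame
X2 = X1 @ R.T + t                                        # same points, camera-2 frame
x1 = X1 / X1[:, 2:3]                                     # normalized image coordinates
x2 = X2 / X2[:, 2:3]

E_true = skew(t) @ R
E_est = estimate_essential(x1, x2)
print("max |x2^T E x1| =", np.abs(epipolar_residuals(E_est, x1, x2)).max())
```

The estimate is only defined up to scale and sign, so it should be compared with `E_true` after normalizing both. With real, noisy matches a robust estimator would normally wrap this step.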

Extraction of Facial Features from Speech

A class project for CS 753 (Automatic Speech Recognition), Fall '19, IIT Bombay

The project aimed to infer a person’s appearance from their voice using a deep neural network trained on YouTube videos. The network encodes speech into a face feature and then decodes it into a canonical face image.

Download/View here

Frequency-based Adversarial Patch Localization

Work done as an Intern at SRI International, Summer '23

This study introduces a frequency-based adversarial patch detection method using SAM segmentation and SVM classification on image segments. Through independent analyses of DFT, FFT, and entropy, our approach proves effective in reliably identifying adversarial patches. View the code here
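As a rough illustration of frequency-domain features for an image segment, the sketch below computes a high-frequency energy ratio and a spectral entropy from the 2D FFT of a grayscale patch. The feature definitions, the radius threshold, and the function name are assumptions for illustration, not the features used in the study. The SAM segmentation and SVM classification stages are omitted.

```python
import numpy as np

def spectral_features(patch):
    """Return (high-frequency energy ratio, spectral entropy in bits) for a 2D patch."""
    patch = np.asarray(patch, dtype=float)
    # Remove the mean so the DC term does not dominate, then center the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(patch - patch.mean()))
    power = np.abs(spectrum) ** 2
    total = power.sum()
    if total == 0.0:                      # constant patch: no spectral content
        return 0.0, 0.0
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h // 2, xx - w // 2)   # distance from the zero frequency
    # Hypothetical cutoff: everything beyond a quarter of the patch size is "high".
    high_ratio = power[radius > min(h, w) / 4].sum() / total
    p = power.ravel() / total
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    return float(high_ratio), float(entropy)

# A smooth ramp versus a noisy patch (both synthetic).
rng = np.random.default_rng(0)
smooth = np.outer(np.linspace(0.0, 1.0, 32), np.ones(32))
noisy = rng.standard_normal((32, 32))

print("smooth:", spectral_features(smooth))
print("noisy: ", spectral_features(noisy))
```

In a pipeline like the one described, such a feature vector would be computed per segment and passed to a classifier. The intuition is that segments containing a high-frequency patch separate from natural, smoother regions.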

Active Contour Without Edges using Chan-Vese

A class project for CSE 577 (Medical Imaging), Spring '23, Stony Brook University

This project explores the Chan-Vese model, introduced around 2001 for two-phase segmentation of grayscale images. It highlights the model’s adaptability to 3D images and its extension to multi-phase segmentation, illustrates the forces that drive boundary movement, and demonstrates its efficacy through practical examples.
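The core of the two-phase model is a region-fitting term: each pixel is assigned to whichever region has the closer mean intensity, and the two means are then recomputed. The sketch below keeps only that data term. It is a deliberate simplification: the contour-length regularization and the level-set evolution of the full Chan-Vese model are left out, and the function name and synthetic image are assumptions for illustration.

```python
import numpy as np

def two_phase_region_fit(image, n_iter=20):
    """Alternate between region means (c1, c2) and pixel reassignment.

    Simplified Chan-Vese data term only: no length penalty, no level-set function.
    Returns a boolean mask that is True for the "inside" region.
    """
    image = np.asarray(image, dtype=float)
    inside = image > image.mean()                 # initial partition
    for _ in range(n_iter):
        if inside.all() or not inside.any():      # one region is empty: stop
            break
        c1 = image[inside].mean()                 # mean intensity inside
        c2 = image[~inside].mean()                # mean intensity outside
        new_inside = (image - c1) ** 2 < (image - c2) ** 2
        if np.array_equal(new_inside, inside):    # converged
            break
        inside = new_inside
    return inside

# Synthetic test image: a bright square on a dark background with mild noise.
rng = np.random.default_rng(0)
truth = np.zeros((32, 32), dtype=bool)
truth[8:24, 8:24] = True
image = truth.astype(float) + 0.05 * rng.standard_normal((32, 32))

mask = two_phase_region_fit(image)
print("pixels inside:", int(mask.sum()))
```

Without the length term this behaves like a two-cluster intensity split. The regularized model additionally favors short, smooth boundaries, which matters on noisier images.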

Download/View here

Bypassing NSFW Gatekeepers

A class project for CSE 509 (Network Security), Fall '23, Stony Brook University

This project delves into the vulnerabilities of NSFW detectors on social media platforms. Using a systematic black-box attack methodology that leverages Grad-CAM-generated heatmaps, the study exposes weaknesses in existing detectors and offers insights into the robustness of content moderation systems.

Download/View here

research

Perceptual Feature-Guided Denoising for Speech Enhancement

Bachelor Thesis Project (BTP) under Prof. P Balamurugan (IIT Bombay)

This work proposes an end-to-end deep learning methodology for speech enhancement, employing a fully convolutional neural network (FCN) guided by perceptual feature losses to generate clean audio from noisy inputs. The denoising network is trained to preserve intricate details at multiple layers of a second network, FeatureLoss Net.
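The idea of a multi-layer feature loss can be sketched as follows: pass both the denoised and the clean signal through a fixed feature network, then sum the per-layer distances between their activations. Everything below is a toy stand-in and an assumption for illustration. The three hand-picked 1D filters replace the learned FeatureLoss Net, and the L1 distance and equal layer weights are not taken from the thesis.

```python
import numpy as np

def feature_maps(signal, kernels):
    """Run a toy stack of 1D convolution + ReLU layers; return every layer's output."""
    activations = []
    x = np.asarray(signal, dtype=float)
    for kernel in kernels:
        x = np.maximum(np.convolve(x, kernel, mode="same"), 0.0)
        activations.append(x)
    return activations

def perceptual_feature_loss(denoised, clean, kernels, layer_weights=None):
    """Weighted sum over layers of the mean absolute activation difference."""
    if layer_weights is None:
        layer_weights = [1.0] * len(kernels)
    feats_d = feature_maps(denoised, kernels)
    feats_c = feature_maps(clean, kernels)
    return float(sum(w * np.mean(np.abs(fd - fc))
                     for w, fd, fc in zip(layer_weights, feats_d, feats_c)))

# Hypothetical fixed filters standing in for a trained feature network.
kernels = [np.array([0.25, 0.5, 0.25]),
           np.array([-1.0, 0.0, 1.0]),
           np.array([1.0, -2.0, 1.0])]

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 8.0 * np.pi, 256))
noisy = clean + 0.3 * rng.standard_normal(256)

loss_same = perceptual_feature_loss(clean, clean, kernels)
loss_noisy = perceptual_feature_loss(noisy, clean, kernels)
print(loss_same, loss_noisy)
```

In training, a loss of this form would be minimized with respect to the denoising network's parameters while the feature network stays fixed.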

Download/View here

Shadow Removal using Diffusion Models

Done under Prof. Dimitris Samaras

This work explores the potential of DDPM for shadow removal tasks, where preserving hidden features is crucial. We built on an existing RePaint architecture by gradually passing shadow information into the reverse diffusion process.

Download/View here

Imitating Expert Behaviour with Optimal Transport Distances

Done as a research project under Prof. Michael Ryoo

In this report, we introduce an algorithm for reward annotation in offline Reinforcement Learning (RL) employing the Optimal Transport (OT) strategy based on Wasserstein distance. Leveraging OT, the algorithm calculates optimal alignments between unlabeled trajectories and expert demonstrations, treating the similarity measure as a reward label.
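A minimal sketch of OT-based reward labeling is shown below: build a pairwise cost matrix between the states of an unlabeled trajectory and those of an expert demonstration, compute a transport plan, and use each step's negative transported cost as its reward. The entropic (Sinkhorn) solver, the Euclidean cost, the uniform step weights, and all names are assumptions for illustration. The report's exact cost function and solver are not stated here.

```python
import numpy as np

def sinkhorn_plan(cost, epsilon=0.05, n_iter=500):
    """Entropy-regularized transport plan between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)               # uniform mass on trajectory steps
    b = np.full(m, 1.0 / m)               # uniform mass on expert steps
    K = np.exp(-cost / epsilon)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_rewards(trajectory, expert, epsilon=0.05):
    """Per-step reward: minus the cost transported from that step to the expert."""
    diff = trajectory[:, None, :] - expert[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)
    # Rescale costs to [0, 1] for numerical stability of exp(-cost / epsilon).
    plan = sinkhorn_plan(cost / (cost.max() + 1e-12), epsilon)
    return -(plan * cost).sum(axis=1)

# Hypothetical 2D states: one trajectory matches the expert, one is far away.
expert = np.stack([np.linspace(0.0, 1.0, 10), np.zeros(10)], axis=1)
close = expert.copy()
far = expert + np.array([0.0, 5.0])

r_close = ot_rewards(close, expert)
r_far = ot_rewards(far, expert)
print("return (close):", r_close.sum())
print("return (far):  ", r_far.sum())
```

Trajectories that align well with the expert receive rewards near zero, while poorly aligned ones receive strongly negative rewards. The labeled data can then be passed to any offline RL algorithm.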

Download/View here

Swin Transformer-Based Crack Detection

Submitted to IEEE TITS 2024 (In Review)

In this paper, we propose CrackSwinT, a novel crack detection approach, which employs the Shifted window Transformer (Swin-T) architecture, integrating Swin attention blocks and skip connections within encoders and decoders to capture crack details at multiple levels. Additionally, we present an enhanced Crack500 dataset with refined cracks.

Download/View here

CrackFusion: Crack Segmentation with Diffusion Models

Ongoing (To be submitted to ECCV 2024)

The paper introduces a novel approach using diffusion models to create accurate crack segmentation maps by leveraging original image data during reverse diffusion. A “RefineNet” model then ensures the generated maps at each timestep align topologically with actual crack structures.

teaching

Teaching Assistant for CSE 353 (Machine Learning)

Undergraduate course, Department of Computer Science, Stony Brook University, 2022

Teaching Assistant for CSE 377 (Introduction to Medical Imaging)

Undergraduate course, Department of Computer Science, Stony Brook University, 2023

Neelesh Verma
