shubham_sarkar — bash
whoami

Shubham Sarkar

Machine Learning Engineer
cat about.txt

Driving innovation in AI with advanced algorithms and scalable solutions.

about

I'm Shubham Sarkar, a passionate Machine Learning Engineer focused on Deep Learning, NLP, and Computer Vision. I thrive on solving complex problems and enhancing model performance to deliver impactful AI applications. Let's transform ideas into reality through cutting-edge technology.

stack --list
Languages: SQL, Python, C, JavaScript, HTML, CSS
AI/ML: Machine Learning, Deep Learning, Computer Vision, LLM Fine Tuning, RAG
Frameworks/Libraries: Natural Language Processing (NLP), TensorFlow, PyTorch, scikit-learn, Hugging Face, Pandas, NumPy, Langchain, Flask, FastAPI, Amazon Web Services (AWS), Streamlit, ETL Pipelines, Data Structures and Algorithms
Databases: PostgreSQL, Vector Databases (Chroma)
Tools & Platforms: Docker, MLflow, DVC, CI/CD, Linux
projects --all

TailorCV.ai

Python, FastAPI, LLM, AI agents, Amazon Web Services

Developed an AI web application that tailors resumes to job descriptions using LLMs and NLP pipelines, improving resume relevance by up to 80%. Designed a Python and FastAPI backend with an HTML, CSS, and JavaScript frontend, dockerized the application, and deployed it on AWS ECS. Reached over 50 users within the first week of launch, demonstrating strong early adoption and real-world impact.

YouTube Sentiment Analysis

TensorFlow, NLP, AWS EC2, scikit-learn

Created an end-to-end YouTube sentiment analysis pipeline processing over 10,000 user comments, enhancing sentiment classification performance through NLP preprocessing techniques such as tokenization, lemmatization, and stopword removal. Tracked multiple model experiments using MLflow and DVC, enabling reproducible training and systematic comparison of models built with scikit-learn and NLP libraries. Deployed the pipeline on AWS using Docker and exposed predictions via Flask REST APIs, facilitating scalable and reproducible inference.
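The preprocessing steps named above (tokenization, stopword removal, lemmatization) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stopword list is a tiny hypothetical one, and a naive suffix-stripping rule stands in for a real lemmatizer, so the outputs are crude stems rather than dictionary lemmas.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "this", "that",
             "and", "or", "to", "of", "i", "it"}

# Naive suffix rules standing in for a real lemmatizer (an assumption for illustration).
SUFFIX_RULES = (("ies", "y"), ("ing", ""), ("ed", ""), ("s", ""))


def tokenize(comment: str) -> list[str]:
    """Lowercase the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", comment.lower())


def naive_lemma(token: str) -> str:
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix, replacement in SUFFIX_RULES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: len(token) - len(suffix)] + replacement
    return token


def preprocess(comment: str) -> list[str]:
    """Tokenize, drop stopwords, then normalize each remaining token."""
    return [naive_lemma(t) for t in tokenize(comment) if t not in STOPWORDS]
```

Calling `preprocess` on each raw comment yields a list of normalized tokens that a vectorizer could consume; in practice a library lemmatizer would likely replace `naive_lemma`.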

Smart Product Pricing

Keras, Hugging Face Transformers, ResNet50, OpenCV

Developed an NLP and CV pipeline to analyze 150,000 image and text samples using transformer-based text encoders and CNN-based image embeddings, integrating them through a fusion neural network for price prediction. Implemented data preprocessing techniques, including text cleaning, tokenization, and streaming image feature extraction with ResNet and CLIP representations to manage large datasets. Built and fine-tuned models using TensorFlow and scikit-learn, achieving a rank of 142 out of 50,000 participants.
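One way to picture the streaming feature extraction mentioned above is a generator that holds only one batch in memory at a time. The sketch below is an assumption about the general pattern, not the project's code: `batched`, `stream_features`, and the `extract` callable (which would wrap an image encoder such as ResNet or CLIP) are hypothetical names.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size items, consuming the input lazily."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def stream_features(
    paths: Iterable[str],
    extract: Callable[[List[str]], Sequence[Sequence[float]]],
    batch_size: int = 256,
) -> Iterator[Tuple[str, Sequence[float]]]:
    """Lazily yield (path, feature_vector) pairs, one batch in memory at a time."""
    for batch in batched(paths, batch_size):
        features = extract(batch)  # hypothetical encoder call, e.g. a CNN forward pass
        yield from zip(batch, features)
```

Because both functions are generators, the caller can write each feature vector to disk as it arrives instead of collecting all of them in a single list.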

history --log
Jan 2026 – Present

Machine Learning Engineer @ Alignerr

  • Enhanced LLM evaluation precision by 15% through a comprehensive review of a rubric-based scoring framework across six reasoning categories.
  • Analyzed over 50 audio files for integration into ASR pipelines.
  • Evaluated AI agent responses, identifying failure points in inference memory, self-coherence, and rubric evaluation, resulting in a 20% improvement in model accuracy.
May 2025 – September 2025

Deep Learning Research Assistant @ Jadavpur University CMATER Lab

  • Designed and implemented a self-attention mechanism (scaled dot-product) within a pre-trained VGG16, significantly enhancing feature extraction for lung cancer detection from CT scans.
  • Developed a hybrid deep learning architecture achieving 99.54% peak accuracy with only 76k trainable parameters and 0.0256 GFLOPs, facilitating edge-device deployment.
  • Engineered feature fusion through concatenation and element-wise multiplication of original and attention-modulated maps for refined, context-aware representations.
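
The attention and fusion steps described in these bullets can be sketched in NumPy as below. This is one plausible reading under stated assumptions, not the lab's implementation: the projection matrices `w_q`, `w_k`, `w_v`, the flattened `(N, C)` feature layout, and the exact way the two maps are combined in `fuse` are all illustrative choices.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(features: np.ndarray, w_q: np.ndarray,
                   w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention over N spatial positions.

    features: (N, C) flattened feature map; w_q, w_k: (C, D); w_v: (C, C).
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (N, N) similarity between positions
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v                       # (N, C) attention-modulated features


def fuse(original: np.ndarray, attended: np.ndarray) -> np.ndarray:
    """Concatenate the original map with its element-wise product with the attended map."""
    return np.concatenate([original, original * attended], axis=-1)  # (N, 2C)


# Illustrative shapes only: 6 spatial positions, 8 channels, 4-dimensional queries and keys.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(6, 8))
attended = self_attention(
    feature_map,
    rng.normal(size=(8, 4)),
    rng.normal(size=(8, 4)),
    rng.normal(size=(8, 8)),
)
fused = fuse(feature_map, attended)  # shape (6, 16)
```

Dividing the scores by the square root of the key dimension keeps the softmax from saturating as that dimension grows, which is the "scaled" part of scaled dot-product attention.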
credentials
activity

Core Member · Entrepreneurship Cell, Jadavpur University

Organized national-level events such as E-Summit 2025 and Hult Prize 2025, attracting over 5,000 registrations and 1,000+ attendees. Contributed to the establishment of an Incubation Center at Jadavpur University under the Institution’s Innovation Council (IIC).