A Full Stack Developer based at 36/F Sitalatala Lane, Kolkata 700011
Hey
I'm Shubham Sarkar, a detail-oriented Full Stack Developer with expertise in machine learning and deep learning. I thrive on collaborating with teams to create scalable AI solutions that drive innovation and meet project goals.
Machine Learning, Deep Learning, Natural Language Processing (NLP), Computer Vision
Experience
Machine Learning Engineer
Alignerr • Remote
Enhanced LLM evaluation precision by 15% through a comprehensive review of a rubric-based scoring framework across six reasoning categories.
Analyzed over 50 audio files for integration into automatic speech recognition (ASR) pipelines.
Evaluated AI agent responses and identified failure points in areas such as inference memory and self-coherence, resulting in a 20% improvement in model accuracy.
Deep Learning Research Assistant
Jadavpur University CMATER Lab • Kolkata, India
Designed and implemented a scaled dot-product self-attention mechanism within a pre-trained VGG16 network, enhancing feature extraction for lung cancer detection from CT scans.
Developed a hybrid deep learning architecture achieving 99.54% peak accuracy with only 76k trainable parameters and 0.0256 GFLOPs, facilitating edge-device deployment.
Engineered feature fusion through concatenation and element-wise multiplication of original and attention-modulated maps for refined, context-aware representations.
Skills
Machine Learning, Deep Learning, Natural Language Processing (NLP), Computer Vision, LLM Fine-Tuning, RAG, TensorFlow, PyTorch, scikit-learn, Hugging Face, Pandas, NumPy, LangChain, Flask, FastAPI, Amazon Web Services (AWS), Docker, MLflow, DVC, Streamlit, CI/CD, PostgreSQL, SQL, ETL Pipelines, Linux, Vector Databases (Chroma), Python, C, JavaScript, HTML, CSS, Data Structures and Algorithms
Projects
Created an end-to-end YouTube sentiment analysis pipeline processing over 10,000 user comments, enhancing sentiment classification performance through NLP preprocessing techniques.
Tracked multiple model experiments using MLflow and DVC, enabling reproducible training and systematic comparison of models built with scikit-learn and NLP libraries.
Deployed the pipeline on AWS using Docker and exposed predictions via Flask REST APIs, facilitating scalable and reproducible inference.
Keras, Hugging Face Transformers, ResNet50, OpenCV
Developed an NLP and CV pipeline to analyze 150,000 image and text samples, combining transformer-based text encoders with CNN-based image embeddings through a fusion neural network for price prediction.
Implemented data preprocessing techniques, including text cleaning, tokenization, and streaming image feature extraction with ResNet and CLIP representations to manage large datasets.
Built and fine-tuned models using TensorFlow and scikit-learn, achieving a rank of 142 out of 50,000 participants.