Building intelligent systems with computer vision and multimodal AI
Core areas where I build and deploy AI solutions
RAG pipelines, vision-language models, cross-modal retrieval, and semantic search
Object detection, video analysis, 3D reconstruction, and image processing
FastAPI microservices, Docker deployment, cloud infrastructure, and scalable architectures
CUDA programming, performance optimization, and parallel processing
Technologies and tools I work with regularly
Selected work showcasing AI development and deployment
Automated deployment system for multimodal RAG with intelligent diagram generation and vision-language model integration. Features a one-command setup that cuts deployment time from hours to 15-30 minutes.
YOLOv8-based crack detection model achieving 92% accuracy on building inspection images. Engineered a robust preprocessing pipeline that handles variable lighting conditions.
3D Convolutional Neural Network for emotion classification from video sequences using temporal analysis across multiple academic datasets.
Created a Google Colab demo and resolved GPU memory constraints, enabling 1,000+ users to experiment with face-swapping technology.