VLM-Based Robotic Manipulation
Built an end-to-end pipeline that translates natural-language instructions into autonomous robotic manipulation. The system integrates SAM 2 segmentation, Google Gemini vision-language models, and inverse kinematics control in NVIDIA Isaac Sim. Through systematic evaluation of three perception approaches and multiple VLMs, improved detection accuracy from 30% to 94% while cutting inference time threefold. Implemented a self-verification mechanism for automatic error recovery, achieving 100% success on trained tasks and a 60% success rate on novel instructions across 45 evaluation episodes.
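As a sketch of the instruction-grounding step, the snippet below asks Gemini to name the object to manipulate and where to place it, returned as JSON. It assumes the `google-generativeai` Python SDK; the prompt wording, JSON schema, and model choice are illustrative assumptions, not the project's actual prompt.

```python
# Hypothetical grounding call: prompt and schema are illustrative only.
import json

import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")


def ground_instruction(instruction: str, image_path: str) -> dict:
    """Ask the VLM which object to move and to which region."""
    prompt = (
        "You control a robot arm. For the instruction below, reply with "
        'JSON only: {"target_object": ..., "place_region": ...}.\n'
        f"Instruction: {instruction}"
    )
    response = model.generate_content(
        [prompt, PIL.Image.open(image_path)],
        # Constrain the reply to raw JSON so it parses without fence-stripping.
        generation_config={"response_mime_type": "application/json"},
    )
    return json.loads(response.text)
```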

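One concrete piece of glue such a pipeline needs is converting a 2-D segmentation mask into a 3-D grasp target for the IK solver. The sketch below deprojects the mask centroid through a pinhole camera model; the function name and frame conventions are assumptions for illustration, not the project's actual code.

```python
import numpy as np


def mask_to_grasp_point(
    mask: np.ndarray,      # (H, W) bool mask, e.g. from SAM 2
    depth: np.ndarray,     # (H, W) depth in meters, aligned with the mask
    fx: float, fy: float,  # camera focal lengths in pixels
    cx: float, cy: float,  # principal point in pixels
) -> np.ndarray:
    """Deproject the mask centroid to a 3-D point in the camera frame.

    Sketch only: a real pipeline would also transform the point into the
    robot base frame and pick a grasp orientation.
    """
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("empty mask: nothing to grasp")
    u, v = xs.mean(), ys.mean()           # pixel centroid of the mask
    z = float(np.median(depth[ys, xs]))   # median depth resists outliers
    # Standard pinhole deprojection: pixel coordinates -> camera-frame meters.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```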

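The self-verification mechanism can be read as a verify-and-retry loop wrapped around each action. The sketch below shows that control flow with hypothetical `execute` and `verify` callables standing in for the Isaac Sim rollout and the VLM success check.

```python
from typing import Callable


def run_with_verification(
    execute: Callable[[], None],   # e.g. an IK-driven pick-and-place rollout
    verify: Callable[[], bool],    # e.g. a VLM "did this succeed?" check
    max_attempts: int = 3,
) -> bool:
    """Execute an action, ask the verifier, and retry on failure.

    Sketch of the recovery loop only; execute/verify are placeholders.
    """
    for attempt in range(1, max_attempts + 1):
        execute()
        if verify():
            return True
        print(f"verification failed on attempt {attempt}; retrying")
    return False
```

Bounding `max_attempts` keeps a mis-judging verifier from retrying forever, trading some recoverable failures for a guaranteed-terminating episode.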



