[WIP] Robotic Hand & Computer Vision: Towards Machine Learning Control

Work In Progress

This personal project aims to design and control an anthropomorphic robotic hand prototype. The goal is twofold: to tackle the challenge of compact mechatronic integration (6 Degrees of Freedom) and to explore the potential of computer vision for intuitive control of complex systems.

The prototype’s dimensions are based on the biometrics of my own hand, imposing strict constraints on space and actuator integration.

Mechanical Design (Bio-inspiration & Kinematics)

While drawing inspiration from open-source projects like InMoov, I designed the entire mechanical structure from scratch using PTC Creo, integrating custom solutions to meet my specific constraints.

Architecture: Each finger consists of 3 phalanges. Actuation is remotely located in the forearm using a cable-driven system (tendons) pulled by 5 MG90S servomotors. Particular attention was paid to cable management to ensure a clean layout and minimize parasitic friction.
The 6th Axis: A dedicated degree of freedom for forearm rotation (pronation/supination) is powered by a high-torque MG996R servomotor.
Tribology & Materials: All parts are 3D printed in PLA. For the wrist rotation system, I designed a custom guide that removes the need for ball bearings, reducing cost and complexity. The PLA-on-PLA interface receives a surface smoothing treatment to prevent seizing and reduce noise, and the guide mechanically decouples the rotational drive from structural loads.
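To give a sense of the sizing check behind a tendon drive, here is a minimal sketch. It assumes an idealized model that is not taken from the actual design: the servo winds the cable on a circular spool, and the cable wraps a circular pulley at each joint. All radii and angles are hypothetical placeholders, not measurements from the prototype.

```python
import math


def spool_travel_mm(spool_radius_mm: float, servo_angle_deg: float) -> float:
    """Cable length wound by an ideal spool: arc length = radius * angle (rad)."""
    return spool_radius_mm * math.radians(servo_angle_deg)


def required_travel_mm(joint_radii_mm, joint_angles_deg) -> float:
    """Cable length needed to flex every joint: sum of r_i * theta_i."""
    return sum(r * math.radians(a) for r, a in zip(joint_radii_mm, joint_angles_deg))


# Hypothetical numbers, for illustration only.
available = spool_travel_mm(spool_radius_mm=10.0, servo_angle_deg=180.0)
needed = required_travel_mm([5.0, 4.0, 3.0], [90.0, 90.0, 90.0])
print(f"available {available:.1f} mm, needed {needed:.1f} mm")
```

Under this model, a finger can close fully only if the available travel exceeds the required travel, which is one reason the spool radius and the joint geometry have to be chosen together.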

CAD illustrations: phalanges (hand view) and the forearm's curvature gradient (forearm view).

Electronic Architecture

The system’s “brain” is an ESP32 rather than a standard Arduino. The ESP32 offers more processing power than a classic 8-bit Arduino board and, crucially, native Bluetooth/Wi-Fi connectivity, which is needed for wireless communication with the PC that runs the vision processing.

Actuator Control: The ESP32 drives the 6 servomotors via a 16-channel PWM driver (PCA9685).
Protocol: Communication between the microcontroller and the driver is handled via the I2C bus, optimizing pin usage.
Power Supply: A dedicated external battery ensures power autonomy, isolating the control circuit from the power circuit.
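The arithmetic behind the servo commands can be sketched in a few lines. This is not the firmware running on the ESP32; it only shows how an angle could be turned into the values the PCA9685 expects. The 25 MHz oscillator, the 12-bit counter, the prescale formula and the register layout come from the PCA9685 datasheet. The 50 Hz refresh rate and the 500 to 2500 µs pulse range are assumptions that would need calibrating for each servo.

```python
PCA9685_OSC_HZ = 25_000_000  # internal oscillator frequency
PCA9685_STEPS = 4096         # 12-bit counter per PWM period
LED0_ON_L = 0x06             # first register of channel 0 (4 registers per channel)


def prescale_for(freq_hz: float) -> int:
    """Value for the PRE_SCALE register that yields the requested PWM frequency."""
    return round(PCA9685_OSC_HZ / (PCA9685_STEPS * freq_hz)) - 1


def angle_to_ticks(angle_deg, freq_hz=50.0, min_us=500.0, max_us=2500.0, max_angle=180.0):
    """Map a servo angle to the counter value at which the pulse ends."""
    angle = max(0.0, min(max_angle, angle_deg))  # clamp to the mechanical range
    pulse_us = min_us + (max_us - min_us) * angle / max_angle
    period_us = 1_000_000 / freq_hz
    return round(pulse_us / period_us * PCA9685_STEPS)


def channel_write(channel: int, off_ticks: int):
    """Register address and 4-byte payload (ON_L, ON_H, OFF_L, OFF_H) for one channel.

    The pulse starts at tick 0 and ends at off_ticks.
    """
    address = LED0_ON_L + 4 * channel
    payload = bytes([0x00, 0x00, off_ticks & 0xFF, (off_ticks >> 8) & 0x0F])
    return address, payload
```

With these assumed values, `prescale_for(50)` gives 121 and a 90° command gives 307 ticks, that is, a 1.5 ms pulse in a 20 ms period. On the microcontroller, the address and payload would then be sent over I2C to the driver.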

Software Pipeline: From Vision to Control

The core of the project lies in the vision-based control algorithm. The Python script running on the host computer follows this logic:

Acquisition & Tracking: OpenCV manages the video stream, and Google’s MediaPipe extracts 21 hand landmarks in real time.
Filtering & Normalization: This is the critical step for robustness. Raw coordinates are filtered to smooth movements (jitter reduction) and then normalized. This normalization makes the system invariant to the hand’s distance from the camera and to its orientation relative to the camera.
Communication: Angular commands are sent to the ESP32 via serial link (currently migrating to Bluetooth). The system currently operates in open-loop mode.
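The filtering, normalization and communication steps above can be sketched as follows. This is a simplified illustration rather than the project's actual script. The landmark indices follow MediaPipe's hand model (0 is the wrist, 9 is the middle-finger MCP joint). The exponential filter, the choice of the wrist-to-middle-MCP distance as scale, the single bend angle per finger and the comma-separated serial frame are all assumptions made for this example.

```python
import math

WRIST, MIDDLE_MCP = 0, 9  # MediaPipe Hands landmark indices

# Three consecutive landmarks per finger; the bend is measured at the middle one.
FINGER_JOINTS = {
    "thumb": (2, 3, 4),
    "index": (5, 6, 7),
    "middle": (9, 10, 11),
    "ring": (13, 14, 15),
    "pinky": (17, 18, 19),
}


class ExponentialSmoother:
    """Exponential moving average applied to every landmark coordinate."""

    def __init__(self, alpha=0.4):
        self.alpha = alpha  # assumed value: closer to 1 means less smoothing
        self._state = None

    def update(self, points):
        if self._state is None:
            self._state = [tuple(p) for p in points]
        else:
            a = self.alpha
            self._state = [
                tuple(a * new + (1 - a) * old for new, old in zip(p, s))
                for p, s in zip(points, self._state)
            ]
        return self._state


def normalize(points):
    """Move the wrist to the origin and scale the wrist-to-middle-MCP length to 1."""
    scale = math.dist(points[WRIST], points[MIDDLE_MCP])
    if scale == 0:
        raise ValueError("degenerate hand: wrist and middle MCP coincide")
    wrist = points[WRIST]
    return [tuple((c - w) / scale for c, w in zip(p, wrist)) for p in points]


def bend_angle_deg(a, b, c):
    """Angle between segments a->b and b->c: 0 for a straight finger."""
    u = [bi - ai for ai, bi in zip(a, b)]
    v = [ci - bi for bi, ci in zip(b, c)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        return 0.0
    cos_t = sum(ui * vi for ui, vi in zip(u, v)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def finger_angles(points):
    """One bend angle per finger, computed from normalized landmarks."""
    return {
        name: bend_angle_deg(points[i], points[j], points[k])
        for name, (i, j, k) in FINGER_JOINTS.items()
    }


def encode_command(angles):
    """Hypothetical serial frame: rounded angles, comma-separated, newline-terminated."""
    return (",".join(str(round(a)) for a in angles) + "\n").encode("ascii")
```

Translation and scaling remove the dependence on where the hand is and how far it is from the camera, while angles between finger segments do not change when the whole hand rotates. In a live loop, each frame's 21 landmarks would pass through `ExponentialSmoother.update`, then `normalize`, then `finger_angles`, and the resulting frame would be written to the serial port.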

Next Steps: Machine Learning Integration

The CAD phase is complete, and the initial prints have been validated. The “Beta” code already allows for functional positional tracking. The next major milestone is software-driven: moving beyond simple “mimicry” toward AI-based gesture recognition.

I am currently developing a “Guessing” mode based on a classic ML pipeline:

Data Collection: Building a custom dataset by recording gesture sequences with intentionally introduced noise.
Training: Benchmarking supervised classification algorithms: K-Nearest Neighbors (KNN) vs. Random Forest. The goal is to conduct a comparative study on latency and accuracy.
Inference: The robot will no longer just copy motion but will “understand” the intent (the gesture) to execute a pre-recorded command associated with the predicted pose.
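The three steps above can be prototyped without any dependency, as in the sketch below. It is only an illustration under stated assumptions. The features are taken to be five finger bend angles. The gesture templates, the noise level and the command table are hypothetical. Only a small hand-written KNN is shown; for the actual benchmark, a library such as scikit-learn provides both `KNeighborsClassifier` and `RandomForestClassifier`.

```python
import math
import random
import time
from collections import Counter

# Hypothetical gesture templates: one bend angle (degrees) per finger.
TEMPLATES = {
    "open": [0, 0, 0, 0, 0],
    "fist": [90, 90, 90, 90, 90],
    "point": [90, 0, 90, 90, 90],
}

# Hypothetical pre-recorded commands: servo angles sent when a gesture is recognized.
GESTURE_COMMANDS = {
    "open": [0, 0, 0, 0, 0],
    "fist": [180, 180, 180, 180, 180],
    "point": [180, 0, 180, 180, 180],
}


def make_dataset(samples_per_gesture, sigma_deg, rng):
    """Data collection stand-in: templates plus Gaussian noise on every angle."""
    return [
        ([v + rng.gauss(0, sigma_deg) for v in template], label)
        for label, template in TEMPLATES.items()
        for _ in range(samples_per_gesture)
    ]


def knn_predict(train, sample, k=5):
    """Majority vote among the k training samples closest to `sample`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], sample))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(train, test, k=5):
    """Return (accuracy, mean prediction latency in seconds)."""
    start = time.perf_counter()
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    elapsed = time.perf_counter() - start
    return correct / len(test), elapsed / len(test)


def command_for(train, sample, k=5):
    """Inference step: predicted gesture -> pre-recorded servo command."""
    return GESTURE_COMMANDS[knn_predict(train, sample, k)]


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_dataset(50, sigma_deg=5.0, rng=rng)
    test = make_dataset(20, sigma_deg=5.0, rng=rng)
    accuracy, latency = evaluate(train, test)
    print(f"accuracy={accuracy:.2f}, latency={latency * 1e6:.0f} us per sample")
```

Measuring accuracy and per-sample latency with the same `evaluate` function for every candidate model keeps the comparison fair, whichever classifiers end up in the study.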

This project synthesizes the full skill set of a mechatronics engineer: from constrained CAD design to the implementation of Artificial Intelligence algorithms.


Hack GAZALIOU
