A web app to detect the body pose of a human occupant of a vehicle and warn about unsafe positions in real time. The app runs on a phone mounted in the interior of the car.
Try it now!: https://jjbel.github.io/ml5-bodypose-example/
We test the accuracy of head-turn detection by comparing it against OptiTrack, a marker-based 3D tracking system.
The data is collected and analyzed in MATLAB:
| OptiTrack Angle | Model Angle |
| --- | --- |
| 0° | 5.955° |
| 30° | 28.22° |
| 60° | 61.80° |
| 90° | 87.98° |
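The model angle in the table is derived from the detected keypoints; the exact formula isn't reproduced here. As a minimal, illustrative sketch (not the app's actual code), the yaw could be estimated in JavaScript from the nose's horizontal offset relative to the midpoint of the ears. The `estimateHeadYaw` helper and the assumed keypoint format are hypothetical:

```js
// Illustrative only: estimate head yaw (in degrees) from BlazePose keypoints.
// Assumes `keypoints` is one person's keypoint array as returned by ml5.js
// bodyPose, where each entry looks like { name, x, y, confidence }.
function estimateHeadYaw(keypoints) {
  const byName = Object.fromEntries(keypoints.map((k) => [k.name, k]));
  const nose = byName["nose"];
  const leftEar = byName["left_ear"];
  const rightEar = byName["right_ear"];
  if (!nose || !leftEar || !rightEar) return null;

  // Midpoint of the ears approximates the head centre in the image plane.
  const earMidX = (leftEar.x + rightEar.x) / 2;
  const earHalfSpan = Math.abs(leftEar.x - rightEar.x) / 2;
  if (earHalfSpan === 0) return null;

  // Normalised horizontal offset of the nose from the ear midpoint:
  // roughly 0 when facing the camera, growing as the head turns.
  const offset = (nose.x - earMidX) / earHalfSpan;

  // Map the clamped offset to an angle (rough geometric model, not calibrated).
  const clamped = Math.max(-1, Math.min(1, offset));
  return (Math.asin(clamped) * 180) / Math.PI;
}

// Hypothetical usage: console.log(estimateHeadYaw(poses[0].keypoints));
```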
The app uses the following JavaScript libraries:
The app is hosted on the GitHub Pages of this repo: https://jjbel.github.io/ml5-bodypose-example/
BlazePose detects the following 33 keypoints (a minimal usage sketch follows the list):
0 nose
1 left_eye_inner
2 left_eye
3 left_eye_outer
4 right_eye_inner
5 right_eye
6 right_eye_outer
7 left_ear
8 right_ear
9 mouth_left
10 mouth_right
11 left_shoulder
12 right_shoulder
13 left_elbow
14 right_elbow
15 left_wrist
16 right_wrist
17 left_pinky
18 right_pinky
19 left_index
20 right_index
21 left_thumb
22 right_thumb
23 left_hip
24 right_hip
25 left_knee
26 right_knee
27 left_ankle
28 right_ankle
29 left_heel
30 right_heel
31 left_foot_index
32 right_foot_index
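For reference, a minimal browser sketch using the ml5.js bodyPose API with the BlazePose model might look like the following. This is an illustrative p5.js-style example, not the app's actual source; see the hosted app and the ml5.js documentation for the real implementation.

```js
// Minimal p5.js + ml5.js sketch (illustrative). Assumes the ml5.js v1 API:
// ml5.bodyPose("BlazePose") and bodyPose.detectStart() on a webcam capture.
let bodyPose;
let video;
let poses = [];

function preload() {
  // Load the BlazePose variant of the bodyPose model.
  bodyPose = ml5.bodyPose("BlazePose");
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(640, 480);
  video.hide();
  // Start continuous detection; gotPoses is called with each new result.
  bodyPose.detectStart(video, gotPoses);
}

function gotPoses(results) {
  poses = results;
}

function draw() {
  image(video, 0, 0, width, height);
  if (poses.length > 0) {
    // Each keypoint carries x, y, confidence and a name matching the list above.
    for (const keypoint of poses[0].keypoints) {
      if (keypoint.confidence > 0.3) {
        circle(keypoint.x, keypoint.y, 8);
      }
    }
  }
}
```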
We initially tried iPhone Pro models in the hope of more accurate pose detection, since they have a time-of-flight LiDAR scanner that provides depth data alongside the RGB data from the camera. We used ARKit's body tracking via Unity's ARFoundation. However, ARKit requires the full body to be visible to initialize tracking: if a standing person sat down in the driving simulator, ARKit would lose tracking.
The limited field of view is a central issue. ARKit uses the iPhone 14 Pro's default 24mm lens for the RGB data that accompanies the depth data from the LiDAR sensor. With the iPhone mounted on the dashboard of the car, too little of the body is visible for detection. The iPhone also has a wider 13mm lens, but the ARKit API does not allow choosing the lens.
We tested ARFoundation by building the arfoundation-samples demo app for the iPhone. With the iPhone mounted on the dashboard, 2D tracking seemed to detect better than 3D tracking, which failed outright.
https://github.com/CMU-Perceptual-Computing-Lab/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
OpenPose uses just RGB data for detection. It can detect poses from an image, video or live camera feed. It supports detecting multiple humans.
However, OpenPose is unsuitable for our use case because it does not appear to be built for mobile devices.
Although OpenPose claims real-time detection, testing it on both a laptop and a powerful desktop failed to give real-time results.
We tested 3 videos:

- `video.avi`: which comes as a sample with OpenPose
- `driving-sim 480p`: a 30Hz 1920x1080 video of a person in the driving simulator, downscaled to 480p for better performance. The video was taken using the ultrawide 13mm lens of the iPhone 14 Pro.
- `driving-sim 240p`: the same video downscaled to 240p

Testing was conducted on:
ml5.js bodypose is attractive because:
TensorFlow Blog Post: https://blog.tensorflow.org/2021/05/next-generation-pose-detection-with-movenet-and-tensorflowjs.html
Daniel Shiffman (ml5.js contributor): 2D Pose Estimation with ml5.js, 3D Pose Estimation with ml5.js