Human face embeddings are worse than face-API. What am I doing wrong? #322
Replies: 3 comments 2 replies
-
can you provide a bit more information? what exactly is 'much worse'? what configuration settings are you using? and what are
-
simple truth is that i've added all of the implementation of extracting face vector embeddings,

first, one of the biggest factors is crop factor: how tight should the box be around the face? it needs to closely match the data that the model was trained on, but no such info is provided for the models. you can play with different crop values by using config.face.detector.cropFactor = 1.6, but changing cropFactor also has a direct impact on facemesh/iris/emotion detection, so i needed to strike a balance by default, even if that means lower precision for face vector extraction.

second, what is the range of expected values? ideally, it should be 0..1 with 0.5 being a match, but that needs to be fine-tuned. that's why there are all these options: MatchOptions = { order: 2, multiplier: 25, threshold: 0, min: 0.2, max: 0.8 }. so if currently all matches are grouped together, perhaps using a higher multiplier and then correcting with the min/max thresholds would expand the differences to a reasonable range? or perhaps use a higher order when calculating distance (e.g. Minkowski with order = 3 instead of Euclidean with order = 2)?

third,
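To make the second point concrete, here is a minimal sketch of how an order, a multiplier and a min/max window could interact. This is not the library's actual matching code: the names `MatchOptionsSketch`, `minkowskiDistance` and `similaritySketch` are hypothetical, and the formula (1 minus the scaled distance, then stretched from the min..max window onto 0..1) is an assumption chosen only to illustrate the idea.

```typescript
// Hypothetical option shape, loosely mirroring the fields quoted above.
interface MatchOptionsSketch {
  order: number;      // 2 = Euclidean, 3 = Minkowski of order 3, ...
  multiplier: number; // scales the raw distance before normalization
  min: number;        // raw similarity at or below this maps to 0
  max: number;        // raw similarity at or above this maps to 1
}

// Minkowski distance of the given order between two equal-length vectors.
function minkowskiDistance(a: number[], b: number[], order: number): number {
  if (a.length !== b.length) throw new Error('embedding length mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]) ** order;
  return sum ** (1 / order);
}

// Assumed normalization: turn the distance into a similarity,
// then stretch the [min, max] window onto [0, 1] and clamp.
function similaritySketch(a: number[], b: number[], opts: MatchOptionsSketch): number {
  const raw = 1 - opts.multiplier * minkowskiDistance(a, b, opts.order);
  const stretched = (raw - opts.min) / (opts.max - opts.min);
  return Math.max(0, Math.min(1, stretched));
}

// Illustrative values only: the same pair of vectors scored with two multipliers.
const base = { order: 2, min: 0.2, max: 0.8 };
console.log(similaritySketch([0, 0], [0.3, 0.4], { ...base, multiplier: 1 }));
console.log(similaritySketch([0, 0], [0.3, 0.4], { ...base, multiplier: 2 }));
```

In this sketch a larger multiplier pushes the same pair further apart before the min/max window is applied, which is the "expand the differences" effect described above. The right values would still have to be found by experiment.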
-
face crop factor is a hidden config parameter. how it works is that it finds a couple of keypoints, calculates a box around them and then enlarges the box so the face is inside it: 1.4 means 140% of the calculated box. For example, 1.0 would mean an extremely tight fit, up to the point where the top of the head would get cropped, and 2.0 would mean there is too much space around the head.
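As a rough illustration of the enlargement step, the sketch below scales a box by a crop factor. The `Box` type and `enlargeBox` function are hypothetical, and scaling around the box center is an assumption; the library may anchor or clamp the box differently.

```typescript
// Hypothetical box type: top-left corner plus size, in pixels.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Enlarge a keypoint-derived box by cropFactor, keeping its center fixed (assumed).
function enlargeBox(box: Box, cropFactor: number): Box {
  const width = box.width * cropFactor;
  const height = box.height * cropFactor;
  return {
    x: box.x - (width - box.width) / 2,
    y: box.y - (height - box.height) / 2,
    width,
    height,
  };
}

// A 100x200 box enlarged to 140% of its calculated size.
console.log(enlargeBox({ x: 100, y: 100, width: 100, height: 200 }, 1.4));
```

With a factor of 1.0 the box is returned unchanged, which corresponds to the extremely tight fit mentioned above.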
-
Hi,
I've been using face-API within nextcloud/recognize and have been using DBSCAN to cluster faces, which worked pretty well on the 128-dim embeddings returned by face-API. One drawback of face-API is that it produces mega clusters of barely identifiable faces. I've just tried switching to human for producing the embeddings, thinking that the 1024-dim embeddings would be better, but they seem much worse for clustering by identity.
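For readers who have not used DBSCAN on embeddings, here is a minimal, self-contained sketch of the approach. It is not the code used in nextcloud/recognize: the function names, the Euclidean metric, and the `eps` and `minPts` values are assumptions, and the toy 2-dim points stand in for real 128-dim or 1024-dim embeddings.

```typescript
type Vec = number[];

function euclidean(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Returns one label per point: -1 for noise, otherwise a cluster id starting at 0.
function dbscan(points: Vec[], eps: number, minPts: number): number[] {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels: number[] = new Array(points.length).fill(UNVISITED);

  // Brute-force neighbor search; the point itself counts as a neighbor.
  const neighbors = (i: number): number[] => {
    const out: number[] = [];
    for (let j = 0; j < points.length; j++) {
      if (euclidean(points[i], points[j]) <= eps) out.push(j);
    }
    return out;
  };

  let cluster = 0;
  for (let i = 0; i < points.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighbors(i);
    if (seeds.length < minPts) {
      labels[i] = NOISE; // may later be claimed as a border point
      continue;
    }
    labels[i] = cluster;
    const queue = seeds.filter((j) => j !== i);
    while (queue.length > 0) {
      const j = queue.shift() as number;
      if (labels[j] === NOISE) labels[j] = cluster; // border point joins the cluster
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const reachable = neighbors(j);
      if (reachable.length >= minPts) queue.push(...reachable); // core point: keep expanding
    }
    cluster++;
  }
  return labels;
}

// Toy data: two tight groups and one outlier.
const toyPoints: Vec[] = [[0, 0], [0, 0.1], [0.1, 0], [5, 5], [5, 5.1], [5.1, 5], [10, 10]];
console.log(dbscan(toyPoints, 0.5, 2));
```

Because `eps` is an absolute distance threshold, a value tuned for one embedding model does not carry over to another with a different dimensionality or value range, so it has to be re-tuned when switching models.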