Human face embeddings are worse than face-API. What am I doing wrong? #322

marcelklehr · 2022-12-23T16:39:35Z


Hi,

I've been using face-API within nextcloud/recognize and clustering faces with DBSCAN, which worked pretty well on the 128-dim embeddings returned by face-API. One drawback of face-API is that it produces mega clusters of barely identifiable faces. I've just tried switching to human for producing the embeddings, thinking that its 1024-dim embeddings would be better, but they seem much worse for clustering by identity.

vladmandic · 2022-12-23T17:42:54Z

Maintainer

can you provide a bit more information? what is exactly 'much worse'? what are configuration settings you're using? and what are nextcloud/recognize and dbscan?

1 reply

marcelklehr Dec 25, 2022
Author

> what is exactly 'much worse'?

It seems that I can't find a good epsilon value to cluster by. Either all faces are lumped together into one mega cluster, or I get multiple smaller clusters for the same person. For values that should be in the sweet spot, both happen at once: one mega cluster containing multiple identities, plus several one- or two-image clusters per person.

https://github.com/nextcloud/recognize is the repo of the project I'm working on. DBSCAN is the clustering algorithm I'm using. For Face-API embeddings I'm using epsilon = 0.44, for human embeddings I've been playing with values of epsilon = 7.7 (using a Euclidean metric).

This is the human config I'm using:

const human = new Human({
  backend: 'tensorflow',
  modelBasePath: 'file://' + __dirname + '/../node_modules/@vladmandic/human/models/',
  face: {
    enabled: true,
    detector: { rotation: true, minConfidence: 0.8, enabled: true },
    mesh: { enabled: true },
    description: { enabled: true },
  },
})
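
To make the epsilon problem above concrete, here is a minimal DBSCAN sketch over Euclidean distance. It is not the clustering code used in nextcloud/recognize; `euclidean`, `dbscan`, `toyPoints`, and the `minPoints` value are illustrative assumptions. It shows that epsilon is expressed in the units of the distance metric, so a value tuned for one embedding space (0.44) need not transfer to another (7.7).

```javascript
// Minimal DBSCAN sketch (illustrative only; not the nextcloud/recognize implementation).

// Euclidean distance between two equal-length vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Returns one label per point: 0..k-1 for clusters, -1 for noise.
function dbscan(points, epsilon, minPoints) {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels = new Array(points.length).fill(UNVISITED);

  // Indices of all points within epsilon of point i (including i itself).
  const neighbors = (i) => {
    const result = [];
    for (let j = 0; j < points.length; j++) {
      if (euclidean(points[i], points[j]) <= epsilon) result.push(j);
    }
    return result;
  };

  let cluster = 0;
  for (let i = 0; i < points.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighbors(i);
    if (seeds.length < minPoints) {
      labels[i] = NOISE; // may later be claimed as a border point
      continue;
    }
    labels[i] = cluster;
    const queue = seeds.filter((j) => j !== i);
    while (queue.length > 0) {
      const j = queue.shift();
      if (labels[j] === NOISE) labels[j] = cluster; // border point joins the cluster
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const next = neighbors(j);
      if (next.length >= minPoints) queue.push(...next); // core point: keep expanding
    }
    cluster++;
  }
  return labels;
}

// Two well-separated toy groups in 2-D (real embeddings have 128 or 1024 dims).
const toyPoints = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]];
console.log(dbscan(toyPoints, 1.5, 2)); // two clusters
console.log(dbscan(toyPoints, 20, 2));  // epsilon too large: one mega cluster
console.log(dbscan(toyPoints, 0.5, 2)); // epsilon too small: everything is noise
```

On real embeddings the gap between "too small" and "too large" can be narrow or absent, which would match the behaviour described above.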

vladmandic · 2022-12-26T13:56:54Z

Maintainer

simple truth is that i've added all of the implementation of extracting face vector embeddings,
but i had no time to fine-tune default parameters - as a result, it can be extremely precise but quite finicky

first, one of biggest factors is crop factor

how tight should the box around the face be? it needs to closely match the data the model was trained on, but no such info is provided for the models

you can play with different crop values by using:

config.face.detector.cropFactor = 1.6

but...changing cropFactor also has a direct impact on facemesh/iris/emotion detection, so i needed to strike a balance by default even if it means lower precision for facial vector extraction

second, what is the range of expected values?

ideally, it should be 0..1 with 0.5 being a match, but that needs to be fine-tuned

that's why all match methods have an optional options param:

options: MatchOptions = { order: 2, multiplier: 25, threshold: 0, min: 0.2, max: 0.8 }

so if currently all matches are grouped together, perhaps using a higher multiplier and then correcting with the min/max thresholds would expand the differences to a reasonable range?

or perhaps use a higher order when calculating distance (e.g. Minkowski with order = 3 instead of Euclidean with order = 2)?
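
As a rough illustration of how these options could interact, the sketch below computes a Minkowski-style distance of a given order, scales it by a multiplier, and maps the result into 0..1 using min/max. This is an assumption about the general shape of such a calculation, not a copy of the library's code; `minkowskiDistance` and `toSimilarity` are hypothetical names, and the exact normalization in `human` may differ.

```javascript
// Hypothetical helpers: a sketch of order/multiplier/min/max, not Human's actual code.

// Sum of |a_i - b_i|^order, scaled by multiplier (the root is taken later).
function minkowskiDistance(a, b, { order = 2, multiplier = 25 } = {}) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]) ** order;
  return multiplier * sum;
}

// Map a scaled distance to a similarity in 0..1: take the order-th root,
// invert it, stretch the [min, max] band to the full range, then clamp.
function toSimilarity(distance, { order = 2, min = 0.2, max = 0.8 } = {}) {
  const root = distance ** (1 / order);
  const raw = 1 - root;
  const stretched = (raw - min) / (max - min);
  return Math.max(0, Math.min(1, stretched));
}

const a = [0, 0];
const b = [0.1, 0];

// Same pair of vectors, different multipliers: a higher multiplier pushes
// the similarity down, which spreads out values that were all near 1.
const low = toSimilarity(minkowskiDistance(a, b, { multiplier: 25 }));
const high = toSimilarity(minkowskiDistance(a, b, { multiplier: 100 }));
console.log(low, high);
```

With a mapping like this, raising the multiplier or the order changes how quickly similarity falls off with distance, while min/max decide which part of that curve is stretched across 0..1.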

third, `human` includes 3 different implementations for calculating face vector embeddings:

- HSE-FaceRes (default)
- BecauseofAI MobileFace (included as it was the original one, but i don't see benefits over faceres)
- DeepInsight InsightFace (this is a state-of-the-art model, but it's much bigger so not used by default)

and there are 4 variations of insightface: two mobilenet-based ones, one ghostnet-based and one efficientnet-based

that's 7 models to test and each has different preferences towards how face should be pre-processed and results interpreted

you can use any of them by switching config:

// HSE-FaceRes (default)
config.face.description = { enabled: true }

// BecauseofAI MobileFace
config.face.description = { enabled: false }
config.face.mobilefacenet = { enabled: true, modelPath: 'https://vladmandic.github.io/human-models/models/mobilefacenet.json' }

// DeepInsight InsightFace
config.face.description = { enabled: false }
config.face.insightface = { enabled: true, modelPath: 'https://vladmandic.github.io/insightface/models/insightface-mobilenet-swish.json' }

all-in-all, a lot of variables that need to be fine-tuned and i need help finding optimal parameters
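
One way to compare the models listed above on the same set of faces is to generate one config per model from a shared base. The sketch below only rearranges the config fragments from this post; `embeddingModels` and `withEmbeddingModel` are hypothetical helper names, not part of the library.

```javascript
// Hypothetical helper: one config override per embedding model mentioned above.
const embeddingModels = {
  faceres: {
    description: { enabled: true },
  },
  mobilefacenet: {
    description: { enabled: false },
    mobilefacenet: { enabled: true, modelPath: 'https://vladmandic.github.io/human-models/models/mobilefacenet.json' },
  },
  insightface: {
    description: { enabled: false },
    insightface: { enabled: true, modelPath: 'https://vladmandic.github.io/insightface/models/insightface-mobilenet-swish.json' },
  },
};

// Returns a new config with exactly one embedding model enabled.
// The base config is not mutated, so it can be reused for every variant.
function withEmbeddingModel(baseConfig, name) {
  const override = embeddingModels[name];
  if (!override) throw new Error(`unknown embedding model: ${name}`);
  return { ...baseConfig, face: { ...baseConfig.face, ...override } };
}

const base = {
  backend: 'tensorflow',
  face: { enabled: true, detector: { rotation: true, minConfidence: 0.8, enabled: true }, mesh: { enabled: true } },
};

for (const name of Object.keys(embeddingModels)) {
  const config = withEmbeddingModel(base, name);
  console.log(name, Object.keys(config.face));
}
```

Each generated config could then be passed to a separate `Human` instance and scored against faces with known identities.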

1 reply

marcelklehr Dec 30, 2022
Author

Thank you for your in-depth reply. It seems that tuning cropFactor would be the lowest-hanging fruit. What are the semantics of this value?

vladmandic · 2022-12-30T15:06:19Z

Maintainer

face crop factor is a hidden config parameter config.face['scale'], default value is 1.4.
(i said it's config.face.detector.cropFactor, but that's wrong as it was moved a couple of versions ago)

how it works is that it finds a couple of keypoints, calculates a box around them and then enlarges the box so the face is inside it - 1.4 means 140% of the calculated box.

For example, 1.0 would mean extremely tight fit up to the point where top of the head would get cropped and 2.0 would mean there is too much space around the head.
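
The scale semantics described above can be sketched as simple box arithmetic. This assumes the box is enlarged symmetrically around its center, which the thread does not confirm; `enlargeBox` is a hypothetical helper, not a library function. In the library itself the value would be set via `config.face['scale']`, as noted above.

```javascript
// Hypothetical helper illustrating the scale semantics:
// enlarge a box around its center by `scale` (1.4 means 140% of the calculated size).
function enlargeBox({ x, y, width, height }, scale = 1.4) {
  const newWidth = width * scale;
  const newHeight = height * scale;
  return {
    // Shift the top-left corner by half of the added size so the center stays fixed.
    x: x - (newWidth - width) / 2,
    y: y - (newHeight - height) / 2,
    width: newWidth,
    height: newHeight,
  };
}

const keypointBox = { x: 100, y: 100, width: 100, height: 200 };
console.log(enlargeBox(keypointBox, 1.0)); // tight fit: box unchanged
console.log(enlargeBox(keypointBox, 2.0)); // twice the size, same center
```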
