Perhaps consider an alternative proposal? #1
Thanks for the feedback. I'm not (with this proposal) trying to enable a full FaceAR system; however, I think exposing a FaceMesh to the page unlocks a lot of really powerful opportunities, and could potentially allow for polyfilling some of the other aspects you call out (though that's not a main goal). For example, a well-defined mesh (and since the goal here is cross-browser compatibility, the mesh would need to be well defined) would make it easy to select the vertices that correspond to specific parts of the face (e.g. the eyebrows/lips/nose). The current WebRTC way of doing this and getting the camera feed (to my knowledge) requires the page to output a

The goal for all of this would be for the browser to do local, on-device processing to keep user data secure, and, as the data is only exposed via a WebRTC stream, it would follow the same well-established permission mechanisms for camera access. By allowing the browser to pre-compute metadata (such as the FaceMesh), pages wouldn't have to load fairly large models to compute it (saving both their and their users' bandwidth), could potentially eliminate/minimize some texture copies (improving performance), and could potentially leverage multithreading, hardware, or other acceleration (further improving performance). At its simplest mobile implementation, you could imagine a

I want to finish by stating that my goal is for this to become a W3C spec and not simply one browser-specific implementation.
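To make the vertex-selection point concrete, here is a minimal sketch of what a page could do if the spec defined a canonical mesh topology. The region names and vertex indices below are purely illustrative assumptions (loosely inspired by MediaPipe-style landmark numbering), not part of any actual proposal.

```javascript
// Hypothetical: with a spec-defined canonical mesh, region names could map
// to fixed vertex indices. These indices are illustrative assumptions only.
const FACE_REGIONS = {
  leftEyebrow: [70, 63, 105, 66, 107],
  lips: [61, 146, 91, 181, 84],
};

// Extract the 3D points for a named region from a flat [x, y, z, ...] array.
function regionPoints(vertices, region) {
  const indices = FACE_REGIONS[region];
  if (!indices) throw new Error(`Unknown region: ${region}`);
  return indices.map((i) => ({
    x: vertices[i * 3],
    y: vertices[i * 3 + 1],
    z: vertices[i * 3 + 2],
  }));
}
```

Because every conforming browser would emit the same topology, a helper like this would work identically across engines, which is the interoperability argument being made above.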
The strength of the Insertable Streams framework is that it provides a foundation for developers to write their own processors. A stated key use case of Insertable Streams is Funny Hats. What is missing from the current proposal to fulfill its first stated goal? What would the Insertable Streams group need to add, in terms of functionality, to support FaceMesh or other developer JavaScript that runs more efficiently than today?
From my research, in practice (as currently implemented) Insertable Streams are primarily being used to provide encryption. The biggest issue is that Insertable Streams exposes the encoded data, which (to my knowledge) is not data that you could run any of the above libraries over, nor do compositing on; though I will admit I have not tried it myself. You can think of the

Edit: This portion of the Insertable Streams explainer clarifies this even more: https://github.com/w3c/webrtc-insertable-streams/blob/master/explainer.md#use-cases
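The encryption point can be illustrated with a sketch in the shape of Chrome's `createEncodedStreams()` API: the transform callback receives *encoded* frames (a compressed bitstream), not decoded pixels, which is why a FaceMesh model cannot run at this hook. The XOR "cipher" below is a toy stand-in for a real payload transform, not a recommendation.

```javascript
// XOR each byte with a single-byte key; applying it twice restores the input.
// This is a toy illustration only, not real encryption.
function xorBytes(bytes, key) {
  const out = new Uint8Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) out[i] = bytes[i] ^ key;
  return out;
}

// Attach a payload transform to an RTCRtpSender using the legacy
// Insertable Streams shape (sender.createEncodedStreams()).
function attachToyEncryption(sender, key) {
  const { readable, writable } = sender.createEncodedStreams();
  const transform = new TransformStream({
    transform(encodedFrame, controller) {
      // encodedFrame.data is the compressed bitstream, not raw pixels,
      // so pixel-level processing (e.g. a face model) cannot run here.
      encodedFrame.data = xorBytes(new Uint8Array(encodedFrame.data), key).buffer;
      controller.enqueue(encodedFrame);
    },
  });
  readable.pipeThrough(transform).pipeTo(writable);
}
```

A symmetric `createEncodedStreams()` transform on the receiver would XOR the bytes again to undo the change before decoding.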
So then, in your new explainer, FaceMesh JavaScript could run much more efficiently than it does today by leveraging the ability to run inside a newly proposed insertable stream type?
It could; however, we believe that by allowing the browser to pre-process and provide some of this metadata, it would be more performant than running the code in JavaScript, and it would reduce the amount of data that the page/user has to load.
In general, a 'face-mesh' is only one component of a Face AR system. In addition to a mesh and uv coordinates, you need an occluder (face shape), pixel segmenter (Hair/eyebrow/lips, etc.), eye gaze detection, iris detection, emotion detection, face gesture detection (smile/frown/wink, etc.), blend shapes (animoji, etc.), background detection, shoulder/neck segmentation. In fact, what you need changes every six months as the technology and fidelity improves.
User-space access to the underlying camera feed on the GPU is the only real requirement for generating popular effects or creating new ones, and this is already provided by WebRTC as it currently exists. Consider each of the following software libraries that already power FaceAR on the web without proprietary APIs added to a specific browser. Each of them does only on-device processing, keeping user data secure, and uses the existing, well-established permission mechanisms for camera access.
FaceMesh by MediaPipe:
https://google.github.io/mediapipe/solutions/face_mesh.html
8th Wall Face Effects:
https://www.8thwall.com/8thwall/face-effects-aframe
Banuba WebAR for Chrome:
https://www.banuba.com/technology/webar
Zappar WebAR (faces):
https://zap.works/webar/
Jeeliz FaceFilter:
https://github.com/jeeliz/jeelizFaceFilter
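The status-quo pipeline these libraries use can be sketched as follows: getUserMedia supplies the camera frames, and a library such as MediaPipe FaceMesh runs the model entirely in user space. The CDN URL and option values are assumptions for illustration; `toPixels` is a small helper introduced here, not part of any library.

```javascript
// Convert a MediaPipe-style normalized landmark ([0, 1] range) to pixels.
function toPixels(landmark, width, height) {
  return { x: landmark.x * width, y: landmark.y * height };
}

// Sketch of the existing user-space approach (browser only): camera feed
// via getUserMedia, face landmarks via the MediaPipe FaceMesh library.
async function startFaceMesh(videoEl, onLandmarks) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  videoEl.srcObject = stream;
  await videoEl.play();

  // FaceMesh is the global provided by the @mediapipe/face_mesh script.
  const faceMesh = new FaceMesh({
    locateFile: (f) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${f}`,
  });
  faceMesh.setOptions({ maxNumFaces: 1 });
  faceMesh.onResults((results) => {
    const faces = results.multiFaceLandmarks || [];
    if (faces.length > 0) {
      onLandmarks(
        faces[0].map((l) => toPixels(l, videoEl.videoWidth, videoEl.videoHeight))
      );
    }
  });

  // Feed each video frame to the model.
  const loop = async () => {
    await faceMesh.send({ image: videoEl });
    requestAnimationFrame(loop);
  };
  loop();
}
```

The model download and per-frame inference here happen in page context, which is exactly the cost the original proposal argues the browser could absorb; the counter-argument above is that this cost is already acceptable and keeps the API surface browser-neutral.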
Consider implementing the face-mesh code with existing WebRTC streams, or as an open-source library built on top of the powerful and well-considered new W3C API proposal for WebRTC Insertable Streams. Ultimately, developers want to integrate and/or improve technology, and this existing proposal would limit them to one browser-specific implementation.