Review: Montage4D

The authors demonstrate steady progress towards commercially viable Holoportation: they fix most of the texturing problems.

One of the papers selected for the upcoming I3D 2018 conference (15-18 May 2018), Montage4D by Du et al., caught my attention. I loved the HoloLens and I’m a sucker for any news or credible rumors about improved augmented reality headsets.

Paper: http://montage4d.com/

Take-aways

Montage4D “extends the Holoportation system and solves the problems of fuzziness caused by normal-weighted blending, visible seams caused by misregistration and occlusion, while ensuring temporal consistency of the rendered images.” So basically, they solve the most glaring texturing problems that afflicted the original Holoportation system. I look forward to seeing their presentation at I3D.
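To unpack that quote a bit: instead of weighting each camera's texture purely by how well it faces the surface normal (which blurs the result when cameras disagree), the idea is to favor cameras aligned with the current viewing direction and to fade textures out near seams. Here is a minimal sketch of that contrast in Python; this is my own paraphrase rather than the authors' code, and the function names, the squared view-alignment term, and the sigma falloff are all assumptions:

```python
import numpy as np

# Hypothetical per-vertex inputs: `normal` is the surface normal (3,),
# `cam_dirs` holds each camera's unit direction toward the point (C, 3),
# `view_dir` is the unit direction toward the viewer, and `geo_dist` is a
# per-camera geodesic distance to the nearest texture seam (C,).

def normal_weighted(normal, cam_dirs):
    # Holoportation-style weight: how squarely each camera faces the surface.
    return np.clip(cam_dirs @ normal, 0.0, None)

def view_dependent(normal, cam_dirs, view_dir, geo_dist, sigma=0.05):
    # Sketch of the Montage4D idea: prefer cameras aligned with the *viewing*
    # direction and fade each camera's contribution to zero near its seams.
    view_term = np.clip(cam_dirs @ view_dir, 0.0, None) ** 2  # view alignment (exponent assumed)
    vis_term = np.clip(cam_dirs @ normal, 0.0, None)          # basic visibility
    seam_term = np.clip(geo_dist / sigma, 0.0, 1.0)           # 0 at a seam, 1 far from it
    weights = view_term * vis_term * seam_term
    total = weights.sum()
    return weights / total if total > 0 else weights
```

Presumably the temporal-consistency part then comes from smoothing such weights across frames instead of letting them flip abruptly as the viewer moves.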

Furthermore, it’s largely “compatible with the [PerceptiveIO Motion2fusion] pipeline, by integrating the computation of seams, geodesic fields, and view-dependent rendering modules.” Basically, one of the most critical issues left in Holoportation is that the meshes are simply not accurate enough, and Motion2fusion made great progress in that regard.
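For the geodesic fields specifically, one common way to approximate a per-vertex geodesic distance from the detected seam vertices is a Dijkstra traversal over the mesh edge graph. The sketch below is my own illustration of that building block, not the paper's implementation; `vertices`, `faces`, and `seam_vertices` are hypothetical inputs:

```python
import heapq
import numpy as np

def geodesic_field(vertices, faces, seam_vertices):
    """Approximate per-vertex geodesic distance to the nearest seam vertex
    by running multi-source Dijkstra over the mesh edge graph."""
    num_v = len(vertices)

    # Build an adjacency list with Euclidean edge lengths as weights.
    adj = [[] for _ in range(num_v)]
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            d = float(np.linalg.norm(vertices[u] - vertices[v]))
            adj[u].append((v, d))
            adj[v].append((u, d))

    # Every seam vertex starts at distance 0.
    dist = np.full(num_v, np.inf)
    heap = []
    for s in seam_vertices:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Normalizing these distances into [0, 1] gives exactly the kind of seam-fading factor used in the blending sketch above.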

So when will a next-gen Holoportation technology be commercially viable? My guess would be 3-4 years - it will take at least that long to overcome the uncanny valley and for Microsoft to release 1-2 more generations of HoloLens. IMHO the most critical technical problem left is integrating real-time face and human body detection and fitting. The face/body fitting could be done in the cloud, alongside the reconstruction servers, to refine the fit over time and reuse it across sessions. That might significantly reduce bandwidth, as the human models could be cached on the device. Until that’s done, I think we’ll continue to have face and hand deformations so glaring that no one but the earliest adopters will want to use it – regardless of the price.

Update: PerceptiveIO recently improved hand tracking as well: https://medium.com/@NextRealityNews/former-microsoft-engineers-achieve-best-hand-tracking-capabilities-weve-seen-for-ar-bddc76f2840e

Constructive advice to the Montage4D authors

The paper itself is well written, but the Montage4D intro video would be easier to understand with small improvements:

  • The music is too loud (at least with my headphones), which makes it hard to concentrate and understand the speech. I recommend you simply remove it.
  • The speech is hard to understand because of the accent and the fast pace; I had to listen to some parts 2-3 times. Most people won’t have that patience, so they’ll either not understand it completely or stop watching altogether. I have a French accent myself, so I can imagine it’s not due to a lack of practice on your part. Here’s a trick I used before for important videos, e.g. the Fortem Omnipresence 3D demo: I used Voices.com - it took only a few hours of my time (picking the actor, reviewing the result, adjusting the video timing), cost $200 USD, and was ready in 2 days. It may make a difference, e.g. in getting your next grant or getting recruited by the best employer.
  • First impressions matter. When I first watched it, I was very focused on the textured face and didn’t immediately realize it was showing another algorithm. I would recommend making that clearer, e.g. with much larger text underneath the face naming the other algorithm vs. Montage4D.