Development Experience Sharing
We achieved 30 fps real-time object detection on HoloLens 2 at a 60 fps rendering rate. The final result and performance are shown in the video below.
Since this was our first time working with AR, and we were not familiar with C# or Unity, we tried many things to reach this goal. The list below collects some tips and experiences that may be helpful.
- The official tutorial targets HoloLens 1, and we did not know how to adapt its code to HoloLens 2. We also did not know how to develop and deploy AI algorithms that run directly on HoloLens 2, since it has no NVIDIA GPU (so no CUDA) and its CPU is ARM-based. So we decided to run the AI algorithm on a powerful external device and let HoloLens handle video capture and display the object locations predicted by the AI algorithm. A robust and reliable communication link between HoloLens and the external device is therefore essential.
- Our first approach was to convert each video frame captured by the HoloLens camera to a bitmap and send the bitmap to our PC with NVIDIA GPUs. A PC program receives the bitmap, predicts the names and locations of the objects in it, and sends that information back to HoloLens. However, the overall speed of this approach was really bad. One reason, we think, is the time HoloLens spends waiting while sending and receiving. The other is that we used TCP as the transport protocol, because the correctness of each pixel in a video frame plays an important role in the AI model's prediction, and we think TCP's reliability guarantees cost extra latency.
- We use YOLO as the detection algorithm because it is fast.
- We tried to make sending and receiving asynchronous on HoloLens, but performance did not improve significantly.
- The official tutorial on the fundamentals of HoloLens 2 development is really important. It shows how to set up, configure, develop, and debug a Unity project and deploy it to HoloLens.
- Other guides, such as link, are also helpful, especially for making the spatial awareness mesh invisible.
- We tried to follow the suggestion here to improve streaming performance on HoloLens (e.g. using MixedReality-WebRTC). However, the technology was a bit too complicated for us. If you are interested in WebRTC, the build and deployment process is described here, and the signaling is described here.
- We tried multithreading to improve performance. You can refer to here.
- We tried the async producer/consumer queue design pattern to improve the performance of the streaming process. You can refer to here, here, here and the official link is here.
- We finally found out that HoloLens already provides a live video stream, so our app no longer needs to send video frames to the PC. As a result, the bitmap conversion and the corresponding TCP sending are not required either.
- Although UDP is not reliable and some packets are lost, we think it is good enough for transferring the prediction results. Because both YOLO and UDP are fast, a lost result is quickly replaced by the next one, so losing one or two consecutive prediction results does not noticeably affect rendering on HoloLens.
- It is important to know that the maximum rendering rate of HoloLens 2 is 60 fps, which means that even a simple Unity program with some basic Mixed Reality functions or a few simple holograms runs at no more than 60 fps on HoloLens 2. You may refer to the official links here and here.
- It is also important to know that the camera frame rate of HoloLens 2 is 30 fps. Some people have discussed it here. Without any optimization, the program on HoloLens 2 is capped at 30 fps, which we consider unacceptable. Since 60 fps is exactly double 30 fps, we let HoloLens receive a UDP packet only once every two consecutive frames. This directly raised the rendering rate to 50-60 fps, with no negative impact on hologram generation or real-time object detection.
- So far, we have dropped the TCP video streaming and the multithreading and asynchronous programming, because the live stream, YOLO, and the mixed reality program already run essentially asynchronously as separate processes. The only time-consuming steps left are instantiating the holograms and maintaining them.
- For instantiation, you may refer to here and here.
- We tried using the Dictionary class to manage the game objects created by Unity, aiming to reduce search and matching time, but it turned out not to be necessary.
- Hologram objects can also be stored in an array or a list. You may refer to here and here to decide which one is better.
- Since maintaining those objects requires a lot of searching, you may refer to here, here and here to optimize your for loops. We failed to use "Parallel.For", because we think creating or removing elements in an array or list while iterating over that same collection is not allowed.
- Finally, when deploying the Unity project to HoloLens, choosing Release instead of Debug mode in Visual Studio significantly improves the performance of the program on HoloLens.
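
The UDP hand-off of prediction results mentioned in the list above could look roughly like the following. This is only a minimal Python sketch of the PC side with a local stand-in receiver, not our actual code: the JSON message layout, the field names (`label`, `x`, `y`, `w`, `h`, `score`) and all function names are assumptions made for illustration, and on HoloLens the receiving side would be written in C#.

```python
import json
import socket


def encode_detections(frame_id, detections):
    """Pack one frame's detections into a single UDP payload (hypothetical JSON layout)."""
    message = {"frame": frame_id, "objects": detections}
    return json.dumps(message).encode("utf-8")


def decode_detections(payload):
    """Inverse of encode_detections; returns (frame_id, detections)."""
    message = json.loads(payload.decode("utf-8"))
    return message["frame"], message["objects"]


def send_detections(sock, address, frame_id, detections):
    """Fire-and-forget send: no ACK and no retry, so a lost datagram is simply skipped."""
    sock.sendto(encode_detections(frame_id, detections), address)


def loopback_demo():
    """Send one datagram to a receiver on the same machine to show the round trip."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        detections = [
            {"label": "cup", "x": 0.42, "y": 0.31, "w": 0.10, "h": 0.15, "score": 0.91}
        ]
        send_detections(sender, receiver.getsockname(), 7, detections)
        payload, _ = receiver.recvfrom(65535)
        return decode_detections(payload)
    finally:
        sender.close()
        receiver.close()
```

Each datagram carries the complete result for one frame, so the receiver never has to reassemble anything: if a packet is lost, the next one fully replaces it.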
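
The idea of receiving a UDP packet only once every two rendered frames can be illustrated with a small simulation. This is a hedged sketch of the logic only, written in Python rather than Unity C#; `DetectionPoller`, `on_frame`, and the `receive` callback are hypothetical names, and `receive` stands in for a non-blocking socket read that returns `None` when nothing has arrived.

```python
class DetectionPoller:
    """Polls for a new detection packet only on every `interval`-th rendered frame."""

    def __init__(self, receive, interval=2):
        self.receive = receive    # callable returning a packet, or None if nothing arrived
        self.interval = interval  # 2 = poll on every second frame (60 fps render, 30 fps camera)
        self.frame_index = 0
        self.latest = None        # most recent detections, reused on skipped frames

    def on_frame(self):
        """Call once per rendered frame; returns the detections to draw."""
        if self.frame_index % self.interval == 0:
            packet = self.receive()
            if packet is not None:  # a lost or late packet just keeps the previous result
                self.latest = packet
        self.frame_index += 1
        return self.latest


def simulate(frames, packets):
    """Run the poller for `frames` frames against a fixed list of incoming packets."""
    pending = list(packets)
    polls = []

    def receive():
        polls.append(1)
        return pending.pop(0) if pending else None

    poller = DetectionPoller(receive)
    drawn = [poller.on_frame() for _ in range(frames)]
    return drawn, len(polls)
```

For example, `simulate(6, ["a", None, "c"])` polls on frames 0, 2 and 4 only. The `None` in the middle models a lost packet, in which case the previous result stays on screen until the next one arrives.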
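
For the hologram maintenance discussed above, one common way to avoid changing a collection while iterating over it is to collect the entries to remove first and delete them in a second pass. The sketch below shows that pattern in Python under several assumptions: holograms are keyed by label (so only one object per label), the `max_age` threshold and the function name `update_holograms` are hypothetical, and a plain dict stands in for the Unity game objects.

```python
def update_holograms(holograms, detections, frame_id, max_age=30):
    """Create or refresh holograms for this frame, then drop the ones not seen recently.

    holograms: dict mapping label -> {"position": (x, y), "last_seen": frame_id}
    Returns the list of labels that were removed.
    """
    for det in detections:
        entry = holograms.get(det["label"])  # dictionary lookup instead of a linear search
        if entry is None:
            holograms[det["label"]] = {
                "position": (det["x"], det["y"]),
                "last_seen": frame_id,
            }
        else:
            entry["position"] = (det["x"], det["y"])
            entry["last_seen"] = frame_id

    # First pass: only read the dict and remember which keys are stale.
    stale = [
        label
        for label, entry in holograms.items()
        if frame_id - entry["last_seen"] > max_age
    ]
    # Second pass: delete, now that the iteration over the dict has finished.
    for label in stale:
        del holograms[label]
    return stale
```

The same two-pass idea carries over to C#, where modifying a `List<T>` inside a `foreach` over that list throws an `InvalidOperationException`.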