Author: Altmann, Michael
Supervisor: Prof. Gudrun Klinker
Advisor: Eichhorn, Christian (@ga73wuj)
Submission Date: [created]
Combining multiple machine learning models into an ensemble to exploit their individual strengths and improve the overall result has been an area of research for decades. However, since the invention and spread of very powerful deep neural networks, interest has shifted away from ensembles and towards improving these new models and their training procedures, rather than systematically evaluating how various ensemble techniques apply to them. As a result, most publications either do not use ensembles at all or do not use them systematically: they only average multiple instances of their models or train a single learning-based fusion. Contributing factors are that single neural networks are often already very computationally expensive, can already work sufficiently well on their own, and can be adapted in countless ways, such as making them deeper and thus able to learn more complex functions. To provide some insight into whether and how different ensemble techniques can further improve the predictions of single deep neural networks, we create and compare various ensembles of different models that predict the pose of AR glasses via an outside-in tracking approach in an in-car environment. We systematically compare existing and novel fusion algorithms, both simple arithmetic and learning-based, in conjunction with different combinations and numbers of models of similar and completely different types. Additionally, we adapt one of these models to deduce the AR glasses pose from a head pose and improve one of the other models.
Our results show that, despite the already very good results of the single models, different fusion algorithms can further improve the accuracy of the final pose prediction, and that different types of fusion work well for different ensemble constellations. This demonstrates that how best to combine deep neural networks into ensembles is indeed a promising research direction that can lead to more accurate and potentially more robust models.