Publications

Pöllabauer, Thomas Jürgen; Rojtberg, Pavel [1st examiner]; Kuijper, Arjan [2nd examiner]

STYLE: Style Transfer for Synthetic Training of a YoLo6D Pose Estimator

2020

Darmstadt, TU, Master Thesis, 2020

Supervised training of deep neural networks requires a large amount of training data. Since labeling is time-consuming and error-prone and many applications lack data sets of adequate size, research soon became interested in generating this data synthetically, e.g. by rendering images, which makes the annotation free and allows utilizing other sources of available data, for example CAD models. However, unless much effort is invested, synthetically generated data usually does not exhibit exactly the same properties as real-world data. In the context of images, there is a difference in the distribution of image features between synthetic and real imagery: a domain gap. This domain gap reduces the transferability of synthetically trained models, hurting their real-world inference performance. Current state-of-the-art approaches trying to mitigate this problem concentrate on domain randomization: overwhelming the model's feature extractor with enough variation to force it to learn more meaningful features, effectively rendering real-world images nothing more than one additional variation. The main problem with most domain randomization approaches is that they require the practitioner to decide on the amount of randomization required, a fact research calls "blind" randomization. Domain adaptation, in contrast, directly tackles the domain gap without the assistance of the practitioner, which makes this approach seem superior. This work deals with the training of a DNN-based object pose estimator in three scenarios: first, a small amount of real-world images of the objects of interest is available; second, no images are available, but an object-specific texture is given; and third, neither images nor textures are available. Instead of copying successful randomization techniques, these three problems are tackled mainly with domain adaptation techniques.
The main proposition is the adaptation of general-purpose, widely available, pixel-level style transfer to directly tackle the differences in features found in images from different domains. To that end, several approaches are introduced and tested, corresponding to the three different scenarios. It is demonstrated that in scenarios one and two, conventional conditional GANs can drastically reduce the domain gap, thereby improving performance by a large margin compared to non-photo-realistic renderings. More importantly, ready-to-use style transfer solutions improve performance significantly compared to a model trained with the same degree of randomization, even when no real-world data of the target objects is available (scenario three), thereby reducing the reliance on domain randomization.
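As an aside, the simplest pixel-level bridge between two image domains is to align their colour statistics. The sketch below is a minimal, hypothetical illustration of that idea, not the thesis' GAN-based method: each channel of a synthetic image is shifted so its mean and standard deviation match a real reference image.

```python
import numpy as np

def match_channel_stats(synthetic, real):
    """Shift each colour channel of `synthetic` so its mean/std match `real`.

    A crude, pixel-level stand-in for learned style transfer: it aligns
    first- and second-order colour statistics between the two domains.
    Both arguments are float images of shape (H, W, 3) with values in [0, 1].
    """
    out = synthetic.astype(np.float64).copy()
    for c in range(3):
        s_mean, s_std = out[..., c].mean(), out[..., c].std()
        r_mean, r_std = real[..., c].mean(), real[..., c].std()
        out[..., c] = (out[..., c] - s_mean) / (s_std + 1e-8) * r_std + r_mean
    return np.clip(out, 0.0, 1.0)

# toy example: a flat grey rendering adapted towards a random "real" image
synthetic = np.full((4, 4, 3), 0.5)
real = np.random.default_rng(0).uniform(0.3, 0.9, (4, 4, 3))
adapted = match_channel_stats(synthetic, real)
```

This kind of statistics matching is far weaker than a learned transfer, but it makes the notion of "aligning feature distributions between domains" concrete.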


Rojtberg, Pavel; Gorschlüter, Felix

calibDB: Enabling Web Based Computer Vision Through On-the-fly Camera Calibration

2019

Proceedings Web3D 2019

International Conference on 3D Web Technology (WEB3D) <24, 2019, Los Angeles, CA, USA>

For many computer vision applications, the availability of camera calibration data is crucial, as overall quality heavily depends on it. While calibration data is available on some devices through Augmented Reality (AR) frameworks like ARCore and ARKit, for most cameras this information is not available. Therefore, we propose a web-based calibration service that not only aggregates calibration data, but also allows calibrating new cameras on-the-fly. We build upon a novel camera calibration framework that enables even novice users to perform a precise camera calibration in about 2 minutes. This allows general deployment of computer vision algorithms on the web, which was previously not possible due to the lack of calibration data.


Matthiesen, Moritz; Rojtberg, Pavel [1st reviewer]; Kuijper, Arjan [2nd reviewer]

Interpolation von Kalibrierdaten für Zoom und Autofokus Kameras

2019

Darmstadt, TU, Bachelor Thesis, 2019

This thesis addresses the problem that every new camera setting requires a new calibration. The goal is to create calibration data at selected camera settings and to use these to derive the calibration data for other settings. To this end, the calibration data are examined and relationships between the individual calibration parameters are derived. To determine these relationships, interpolation is performed between different parameters of the calibration.
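Deriving intrinsics at an uncalibrated zoom setting from a few calibrated ones can be sketched in a few lines. The sample values and the piecewise-linear scheme below are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

# Hypothetical calibration samples at a few zoom settings: focal length fx
# (pixels) and first radial distortion coefficient k1, measured offline.
zoom_steps = np.array([0.0, 0.5, 1.0])           # normalised zoom position
fx_samples = np.array([800.0, 1400.0, 2600.0])   # focal length grows with zoom
k1_samples = np.array([-0.30, -0.12, -0.02])     # distortion shrinks with zoom

def interpolate_calibration(zoom):
    """Derive intrinsics at an uncalibrated zoom setting by piecewise-linear
    interpolation between the calibrated samples."""
    fx = np.interp(zoom, zoom_steps, fx_samples)
    k1 = np.interp(zoom, zoom_steps, k1_samples)
    return fx, k1

fx, k1 = interpolate_calibration(0.25)  # halfway between the first two samples
```

In practice the relationship between zoom position and focal length is rarely linear, which is exactly why examining the parameter relationships, as the thesis does, matters.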


Rojtberg, Pavel; Kuijper, Arjan

Real-time texturing for 6D object instance detection from RGB Images

2019

Adjunct Proceedings of the 2019 IEEE International Symposium on Mixed and Augmented Reality

IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct) <18, 2019, Beijing, China>

For object detection, the availability of color cues strongly influences detection rates and is even a prerequisite for many methods. However, when training on synthetic CAD data, this information is not available. We therefore present a method for generating a texture-map from image sequences in real-time. The method relies on 6 degree-of-freedom poses and a 3D model being available. In contrast to previous works, this allows interleaving detection and texturing for upgrading the detector on-the-fly. Our evaluation shows that the acquired texture-map significantly improves detection rates using the LINEMOD [5] detector on RGB images only. Additionally, we use the texture-map to differentiate instances of the same object by surface color.
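Projecting model geometry into an image given a 6-degree-of-freedom pose is the core geometric step behind such texture-map acquisition: each surface point maps to a pixel whose colour can be written into the texture. A minimal pinhole-projection sketch, with made-up intrinsics and pose values for illustration:

```python
import numpy as np

def project(points, R, t, K):
    """Project 3D model points into the image with pose (R, t) and intrinsics K.

    points: (N, 3) array in model coordinates; returns (N, 2) pixel coordinates.
    Each resulting pixel's colour could then be written into the texture-map
    at the surface point's UV coordinate.
    """
    cam = points @ R.T + t            # model -> camera coordinates
    uvw = cam @ K.T                   # camera -> homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

K = np.array([[800.0, 0.0, 320.0],    # hypothetical intrinsics: f=800 px,
              [0.0, 800.0, 240.0],    # principal point at (320, 240)
              [0.0, 0.0, 1.0]])
R = np.eye(3)                         # identity rotation: camera looks down +Z
t = np.array([0.0, 0.0, 2.0])         # model origin 2 units in front of camera
px = project(np.array([[0.0, 0.0, 0.0]]), R, t, K)  # origin -> image centre
```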


Rojtberg, Pavel

User Guidance for Interactive Camera Calibration

2019

Virtual, Augmented and Mixed Reality: Multimodal Interaction

International Conference Virtual Augmented and Mixed Reality (VAMR) <11, 2019, Orlando, FL, USA>

Lecture Notes in Computer Science (LNCS), 11574

For building an Augmented Reality (AR) pipeline, the most crucial step is camera calibration, as overall quality heavily depends on it. In turn, camera calibration itself is influenced most by the choice of camera-to-pattern poses – yet there is currently little research on guiding the user to a specific pose. We build upon our novel camera calibration framework that is capable of generating calibration poses in real-time and present a user study evaluating different visualization methods for guiding the user to a target pose. Using the presented method, even novice users are capable of performing a precise camera calibration in about 2 minutes.

978-3-030-21606-1


Rojtberg, Pavel; Kuijper, Arjan

Efficient Pose Selection for Interactive Camera Calibration

2018

Proceedings of the 2018 IEEE International Symposium on Mixed and Augmented Reality

IEEE International Symposium on Mixed and Augmented Reality (ISMAR) <17, 2018, Munich, Germany>

The choice of poses for camera calibration with planar patterns is only rarely considered — yet the calibration precision heavily depends on it. This work presents a pose selection method that finds a compact and robust set of calibration poses and is suitable for interactive calibration. Consequently, singular poses that would lead to an unreliable solution are avoided explicitly, while poses reducing the uncertainty of the calibration are favoured. For this, we use uncertainty propagation. Our method takes advantage of a self-identifying calibration pattern to track the camera pose in real-time. This allows us to iteratively guide the user to the target poses until the desired quality level is reached. Therefore, only a sparse set of key-frames is needed for calibration. The method is evaluated on separate training and testing sets, as well as on synthetic data. Our approach performs better than comparable solutions while requiring 30% fewer calibration frames.

978-1-5386-7459-8
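The uncertainty propagation underlying such pose selection can be illustrated generically: a covariance is pushed through a Jacobian via first-order propagation, and a candidate configuration is scored by the total propagated variance. The score below is an illustrative stand-in, not the paper's exact criterion, and the example matrices are made up:

```python
import numpy as np

def propagated_uncertainty(J, cov):
    """First-order uncertainty propagation: cov_out = J @ cov @ J.T."""
    return J @ cov @ J.T

def pose_score(J, cov):
    """Illustrative scalar score: total propagated variance (trace of the
    propagated covariance). Lower is better; a near-singular configuration
    inflates the score because its rows are linearly dependent."""
    return np.trace(propagated_uncertainty(J, cov))

cov = np.diag([4.0, 1.0])                  # hypothetical parameter covariance
J_informative = np.array([[1.0, 0.0],
                          [0.0, 1.0]])     # well-conditioned configuration
J_degenerate = np.array([[1.0, 1.0],
                         [1.0, 1.0]])      # rows linearly dependent
```

Comparing `pose_score(J_degenerate, cov)` against `pose_score(J_informative, cov)` shows how degenerate configurations can be penalised with a single scalar.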


Rojtberg, Pavel; Kuijper, Arjan

Efficient Pose Selection for Interactive Camera Calibration

2017

Adjunct Proceedings of the 2017 IEEE International Symposium on Mixed and Augmented Reality

IEEE International Symposium on Mixed and Augmented Reality (ISMAR) <16, 2017, Nantes, France>

The choice of poses for camera calibration with planar patterns is only rarely considered - yet the calibration precision heavily depends on it. This work presents a pose selection method that explicitly avoids singular pose configurations which would lead to an unreliable solution. Consequently, camera poses that reduce the uncertainty of the calibration parameters most are favoured. For this purpose, the quality of the calibration parameters is continuously estimated using uncertainty propagation. Our approach performs better than comparable solutions while requiring 30% fewer calibration frames.


Rojtberg, Pavel; Audenrith, Benjamin

X3ogre: Connecting X3D to a State of the Art Rendering Engine

2017

Proceedings Web3D 2017

International Conference on 3D Web Technology (WEB3D) <22, 2017, Brisbane, Australia>

We connect X3D to the state-of-the-art OGRE renderer using our prototypical x3ogre implementation. In doing so, we compare both on a conceptual level, highlighting similarities and differences. Our implementation allows swapping X3D concepts for OGRE concepts and vice versa. We take advantage of this to analyse current shortcomings in X3D and propose X3D extensions to overcome them.


Bergmann, Tim Alexander; Kuijper, Arjan [supervisor]; Rojtberg, Pavel [supervisor]

Interaktive Echtzeit-Kalibrierung

2016

Darmstadt, TU, Bachelor Thesis, 2016

This thesis deals with the development of a heuristic that guides inexperienced users through a flexible camera calibration. For this purpose, a quality measure based on the work of Hartley and Zisserman [8, p. 138ff] is derived. This quality measure is used to simplify the calibration procedure for users by suggesting camera poses. With this assistance, inexperienced users succeed in calibrating a camera with fewer images but the same quality in terms of reprojection error.


Engelke, Timo; Keil, Jens; Rojtberg, Pavel; Wientapper, Folker; Schmitt, Michael; Bockholt, Ulrich

Content First - A Concept for Industrial Augmented Reality Maintenance Applications using Mobile Devices

2015

MMSys '15

ACM Multimedia Systems Conference (MMSys) <6, 2015, Portland, OR, USA>

Although AR has a long history in the area of maintenance and service support in industry, there is still a lack of lightweight, yet practical solutions for handheld AR systems in everyday workflows. Attempts to support complex maintenance tasks with AR still miss reliable tracking techniques, simple ways to be integrated into existing maintenance environments, and practical authoring solutions which minimize the costs of specialized content generation. We present a general, customisable application framework that allows employing AR and VR techniques to support technicians in their daily tasks. In contrast to other systems, we do not aim to replace existing support systems such as traditional manuals. Instead, we integrate well-known AR techniques and novel presentation techniques with existing instruction media. To this end, practical authoring solutions are crucial, and hence we present an application development system based on web standards such as HTML, CSS and X3D.

978-1-4503-3351-1


Olbrich, Manuel; Franke, Tobias; Rojtberg, Pavel

Remote Visual Tracking for the (Mobile) Web

2014

Proceedings Web3D 2014

International Conference on 3D Web Technology (WEB3D) <19, 2014, Vancouver, BC, Canada>

Augmented Reality is maturing, but in a world where we are used to straightforward services on the internet, Augmented Reality applications require a lot of preparation before they can be used. Our approach shows how we can bring Augmented Reality into a normal web browser, or even browsers on mobile devices. We show how we are, with recent features of HTML5, able to augment reality based on complex 3D tracking in a browser without having to install or set up any software on a client. With this solution, we are able to extend 3D Web applications with AR and reach more users with a reduced usability barrier. A key contribution of our work is a pipeline for remote tracking built on web-standards.


Wientapper, Folker; Wuest, Harald; Rojtberg, Pavel; Fellner, Dieter W.

A Camera-Based Calibration for Automotive Augmented Reality Head-Up-Displays

2013

12th IEEE International Symposium on Mixed and Augmented Reality 2013.

IEEE International Symposium on Mixed and Augmented Reality (ISMAR) <12, 2013, Adelaide, SA, Australia>

Using Head-up-Displays (HUD) for Augmented Reality requires an accurate internal model of the image generation process, so that 3D content can be visualized perspectively correctly from the viewpoint of the user. We present a generic and cost-effective camera-based calibration for an automotive HUD which uses the windshield as a combiner. Our proposed calibration model encompasses the view-independent spatial geometry, i.e. the exact location, orientation and scaling of the virtual plane, and a view-dependent image warping transformation for correcting the distortions caused by the optics and the irregularly curved windshield. View-dependency is achieved by extending the classical polynomial distortion model for cameras and projectors to a generic five-variate mapping with the head position of the viewer as additional input. The calibration involves capturing an image sequence from varying viewpoints while displaying a known target pattern on the HUD. The accurate registration of the camera path is retrieved with state-of-the-art vision-based tracking. As all necessary data is acquired directly from the images, no external tracking equipment needs to be installed. After calibration, the HUD can be used together with a head-tracker to form a head-coupled display which ensures a perspectively correct rendering of any 3D object in vehicle coordinates from a large range of possible viewpoints. We evaluate the accuracy of our model quantitatively and qualitatively.
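A toy version of such a view-dependent warp can illustrate what "extending a polynomial distortion model with the head position as additional input" means. The monomial set and coefficients below are hypothetical and far smaller than a real calibration would use:

```python
import numpy as np

def warp(u, v, head, coeffs):
    """Evaluate a toy view-dependent image warp.

    The classical 2D polynomial distortion model is extended with the 3D head
    position `head`, giving a five-variate mapping
    (u, v, hx, hy, hz) -> (u', v').  `coeffs` holds one coefficient per
    monomial and output channel; this illustrative basis mixes image
    coordinates with head-position terms.
    """
    hx, hy, hz = head
    monomials = np.array([1.0, u, v, u * v, u * u, v * v,
                          hx, hy, hz, u * hx, v * hy])
    return coeffs @ monomials  # coeffs shape (2, 11) -> warped (u', v')

# identity warp: u' = u, v' = v, independent of the head position
coeffs = np.zeros((2, 11))
coeffs[0, 1] = 1.0  # u' picks up the u monomial
coeffs[1, 2] = 1.0  # v' picks up the v monomial
res = warp(0.3, -0.2, (0.0, 0.1, 0.7), coeffs)
```

Fitting the coefficients from viewpoint-varying image sequences, as the paper does, turns this fixed mapping into a calibration.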


Engelke, Timo; Keil, Jens; Rojtberg, Pavel; Wientapper, Folker; Webel, Sabine; Bockholt, Ulrich

Content First - A concept for Industrial Augmented Reality Maintenance Applications Using Mobile Devices

2013

12th IEEE International Symposium on Mixed and Augmented Reality 2013.

IEEE International Symposium on Mixed and Augmented Reality (ISMAR) <12, 2013, Adelaide, SA, Australia>

Although AR has a long history in the area of maintenance and service support in industry, there is still a lack of lightweight, yet practical solutions for handheld AR systems in everyday workflows. Attempts to support complex maintenance tasks with AR still miss reliable tracking techniques, simple ways to be integrated into existing maintenance environments, and practical authoring solutions which minimize the costs of specialized content generation. We present a general, customisable application framework that allows employing AR and VR techniques to support technicians in their daily tasks. In contrast to other systems, we do not aim to replace existing support systems such as traditional manuals. Instead, we integrate well-known AR techniques and novel presentation techniques with existing instruction media. To this end, practical authoring solutions are crucial, and hence we present an application development system based on web standards such as HTML, CSS and X3D.


Rojtberg, Pavel; Roth, Stefan [reviewer]; Gao, Qi [supervisor]

Image Compression Using MRF Priors

2012

Darmstadt, TU, Master Thesis, 2012

Markov Random Field (MRF) models are used in many low-level computer vision problems like inpainting or denoising. In this work we evaluate the use of MRF natural-image priors in the context of image compression. To this end, we formulate compression as finding a sparse point representation of the image, while decompression is formulated as MRF-based inpainting. For finding a sparse point representation of images, we consider using the entropy of the prior probability and the variance of the probabilistic expert functions. The results here are competitive with existing methods. For decompression, we find the ability of the used high-order MRF model to generate structures to be lacking. However, our experiments with a mean-modulating model indicate that generating more structures is possible for inpainting. Furthermore, we adapt binary tree triangular coding to a variance-based point selection and use it to evaluate the importance of efficiently storing the sparse point representation of the image. Here we show that WebP lossless compression is adequate for storing the sparse point representation in practical cases. An evaluation against the state-of-the-art lossy JPEG 2000 codec, however, reveals that our MRF-prior-based method has to be improved further to be competitive in qualitative terms.
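Decompression-as-inpainting can be illustrated with a much simpler smoothness prior than the learned MRF priors evaluated in the thesis: harmonic inpainting, which repeatedly replaces each unknown pixel with the mean of its 4-neighbours until a discrete Laplace equation is (approximately) satisfied.

```python
import numpy as np

def inpaint(points, shape, iterations=2000):
    """Reconstruct an image from sparse samples by harmonic inpainting.

    `points` maps (row, col) -> intensity.  Unknown pixels are iteratively
    replaced by the mean of their 4-neighbours (with edge replication), a far
    simpler smoothness prior than a learned high-order MRF prior.
    """
    img = np.zeros(shape)
    known = np.zeros(shape, dtype=bool)
    for (r, c), val in points.items():
        img[r, c] = val
        known[r, c] = True
    for _ in range(iterations):
        up = np.vstack([img[:1], img[:-1]])        # neighbour above
        down = np.vstack([img[1:], img[-1:]])      # neighbour below
        left = np.hstack([img[:, :1], img[:, :-1]])
        right = np.hstack([img[:, 1:], img[:, -1:]])
        smoothed = (up + down + left + right) / 4.0
        img = np.where(known, img, smoothed)       # keep the sparse samples fixed
    return img

# two anchor pixels on a 1x5 strip: the reconstruction ramps between them
img = inpaint({(0, 0): 0.0, (0, 4): 1.0}, (1, 5))
```

This prior can only produce smooth gradients, which mirrors the thesis' observation that structure generation is the hard part of MRF-based decompression.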


Schwarz, Katharina; Rojtberg, Pavel; Caspar, Joachim; Gurevych, Iryna; Goesele, Michael; Lensch, Hendrik P. A.

Text-to-Video: Story Illustration from Online Photo Collections

2010

Knowledge-Based and Intelligent Information and Engineering Systems

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES) <14, 2010, Cardiff, Wales, UK>

We present a first system to semi-automatically create a visual representation for a given, short text. We first parse the input text, decompose it into suitable units, and construct meaningful search terms. Using these search terms we retrieve a set of candidate images from online photo collections. We then select the final images in a user-assisted process and automatically create a storyboard or photomatic animation. We demonstrate promising initial results on several types of texts.
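The search-term construction step can be caricatured in a few lines of standard-library Python; the stopword filter below is a naive, hypothetical stand-in for the parsing and decomposition the paper actually performs:

```python
import re

# A tiny, hypothetical stopword list; a real system would rely on parsing
# and part-of-speech information rather than this naive filter.
STOPWORDS = {"a", "an", "the", "is", "was", "in", "on", "at", "and",
             "to", "of", "he", "his", "her", "then", "with"}

def search_terms(text):
    """Split a short text into sentence units and build one naive keyword
    query per unit by dropping stopwords."""
    units = [u.strip() for u in re.split(r"[.!?]", text) if u.strip()]
    return [" ".join(w for w in re.findall(r"[a-z']+", u.lower())
                     if w not in STOPWORDS)
            for u in units]

queries = search_terms("The boy walked to the beach. Then he built a sandcastle.")
```

Each resulting query would then be sent to an online photo collection to retrieve candidate images for that story unit.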


Rojtberg, Pavel; Goesele, Michael [supervisor]; Gurevych, Iryna [supervisor]; Caspar, Joachim [supervisor]

Generating Storyboards Based on Natural Language Description

2009

Darmstadt, TU, Bachelor Thesis, 2009

This work explores to what extent it is possible today to automatically generate films from textual descriptions formulated in natural language. For this purpose, a prototype implementation is presented that explores possibilities and current limitations. The prototype generates a multimedia presentation using photographs from the web for visualisation and speech synthesis for sound. The precision of Flickr and Google Images as image providers is evaluated, and improving the coherency of the result through user input and post-processing is discussed. Finally, this work suggests directions for future improvements by discussing the current bottlenecks.