Augmented reality (AR), mixed reality (MR) and virtual reality (VR) offer great potential for a wide range of applications. Popular apps such as Pokémon Go show how widely users have come to accept such developments.
The research lab nextPlace of the Ostwestfalen-Lippe University takes up this trend to examine current research questions in the context of space and time. As a basis, we developed a smartphone app with the following premise: augmenting the real world with social data from the virtual world, based on open-source software and running on typical hardware. To this end, the application we presented in the article TwitterGeoStream is enhanced with AR functionality.
A key point of AR is the visual enrichment of the real world with additional data that is not immediately visible to the user. In a typical current scenario, a smartphone camera is used as a window to the real world: the live preview and additional information are shown simultaneously. To create an impression of immersion in an augmented reality, it is essential to align the virtual data precisely with the real objects.
To take the user's exact position and viewing direction into account, we use a three-dimensional view of the data. The perspective rendering based on it is established through a projection matrix (see [Hartley2003]). AR on smartphones can generally be realized with a variety of technologies.
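The effect of such a projection matrix can be illustrated with a minimal pinhole-camera sketch in the spirit of [Hartley2003]; the focal length and principal-point values below are illustrative assumptions, not parameters of the actual app.

```python
# Minimal pinhole-camera projection sketch (cf. [Hartley2003]).
# Focal length f and principal point (cx, cy) are illustrative assumptions.

def project(point3d, f=800.0, cx=360.0, cy=640.0):
    """Project a 3D point in camera coordinates to 2D pixel coordinates."""
    x, y, z = point3d
    if z <= 0:
        raise ValueError("point lies behind the camera")
    # Equivalent to applying the 3x4 projection matrix K [I | 0]
    # with K = [[f, 0, cx], [0, f, cy], [0, 0, 1]].
    return (f * x / z + cx, f * y / z + cy)

u, v = project((1.0, 0.5, 4.0))  # a point 4 m in front of the camera
```

Dividing by the depth z is what produces the perspective effect: the same virtual object appears smaller the farther it is from the camera.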
So-called geo-based AR uses positioning data, e.g. from GPS, to determine the user's location in the real world.
With that, it is possible to identify data within a given radius around the user. In addition, we use further smartphone sensors (e.g. orientation sensors, accelerometer, compass) to refine the user's spatial position and to precisely track the viewing direction. [Schmid2014], for example, shows that geo-based AR can generally be used effectively. The main obstacle for specific applications is the limited accuracy of positioning via GPS. According to the authors, the deviation can amount to ±10 m. Consequently, geo-based AR is especially useful for objects and distances larger than this deviation.
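The radius query around the user can be sketched with a great-circle distance check; the function names and the 500 m radius below are assumptions for illustration, not values from the project.

```python
# Sketch of the Geo-AR radius query: keep only geo-referenced points
# within an assumed radius around the user. Names and the 500 m default
# are illustrative, not taken from the actual server implementation.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def within_radius(user, points, radius_m=500.0):
    """Keep only (lat, lon) points inside the given radius around the user."""
    return [p for p in points
            if haversine_m(user[0], user[1], p[0], p[1]) <= radius_m]
```

Note that the ±10 m GPS deviation cited above is small against a radius of several hundred metres, which is why selecting data by radius works even though pixel-precise placement via GPS alone does not.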
In contrast to geo-based AR, marker-based AR relies solely on the camera image. The projection of virtual objects is anchored to markers in the real world. Dedicated markers are, e.g., 2D barcodes the size of an index card, characterized by a simple but distinctive visual structure with strong contrast between light and dark areas. These contrasts are recognized as features, which are extracted and continuously compared to the live preview image (matching). From their perspective appearance relative to the viewing direction of the user, or rather the camera, the corresponding perspective depiction of the virtual objects is derived [Kato1999]. This form of depiction is considerably more precise than that of geo-based AR. A disadvantage, however, is that the markers are artificial objects placed in the world. To address this disadvantage, key points can also be recognized and extracted from ordinary images using various image-processing algorithms [Fiala2005]. This procedure is also described as markerless tracking or natural feature tracking (NFT), because it can generally be applied to arbitrary pictures of the real world.
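The comparison of extracted features against the preview image can be sketched as nearest-neighbour matching of binary descriptors; the 8-bit values and the threshold below are toy assumptions, whereas real pipelines match hundreds of descriptors of several hundred bits each.

```python
# Toy sketch: matching binary feature descriptors (as produced by
# FREAK-like algorithms) between a marker image and the live preview.
# The 8-bit descriptors and the max_dist threshold are purely
# illustrative assumptions, not the actual ARToolKit pipeline.

def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def match(marker_desc, preview_desc, max_dist=2):
    """Greedy nearest-neighbour matching by Hamming distance."""
    matches = []
    for i, d in enumerate(marker_desc):
        j, dist = min(((j, hamming(d, p)) for j, p in enumerate(preview_desc)),
                      key=lambda t: t[1])
        if dist <= max_dist:  # accept only sufficiently similar features
            matches.append((i, j))
    return matches
```

From enough such correspondences, the relative pose of the camera with respect to the marker, and hence the perspective depiction of the virtual objects, can be estimated.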
The authors combine these two approaches (geo-based AR and image-marker AR) in a prototypical smartphone app to achieve a precise augmentation of the real world with virtual Twitter tweets.
Based on the user's current GPS-determined position, tweets from the surrounding area are visualized in real time (figure 1). As in TwitterGeoStream, the tweets are depicted as ascending bubbles. The starting position of each tweet is the coordinate provided by Twitter. Depending on the individual Twitter user's settings, this can correspond to the user's actual location. In other cases the exact position is abstracted, resulting in an approximate placement on, e.g., a public square, a district, or a city.
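Placing an ascending bubble in the 3D scene can be sketched as follows; the rise speed and the flat-earth approximation are assumptions for illustration, adequate only within the small Geo-AR radius.

```python
# Sketch: 3D scene position of an ascending tweet bubble relative to the
# user. The equirectangular approximation and the rise_speed value are
# illustrative assumptions, not taken from the actual app.
import math

def bubble_position(user_lat, user_lon, tweet_lat, tweet_lon,
                    t_since_spawn, rise_speed=0.5):
    """Return (east, up, north) offsets in metres for a tweet bubble.

    The bubble is anchored at the tweet's coordinate and rises over time.
    """
    m_per_deg = 111320.0  # metres per degree of latitude
    east = (tweet_lon - user_lon) * m_per_deg * math.cos(math.radians(user_lat))
    north = (tweet_lat - user_lat) * m_per_deg
    up = rise_speed * t_since_spawn  # ascending-bubble animation
    return east, up, north
```

The resulting scene coordinates are what the projection matrix described above maps into the live camera preview.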
Individual tweet bubbles can be caught by the app user. The tweet text can then be shown as graffiti on objects of the real world, e.g. façades. For this purpose, the user takes a picture of the real object in the app, which is transferred to the server application. In return, the app receives the key points extracted from that picture. In the app, these data serve to project the currently selected tweet text correctly (figure 2).
At the geographical position of the marker picture, a corresponding geo-marker is placed so that this projection area is easily discoverable by other users (figure 3). As soon as other users reach that position, the image marker is transferred to their app and can be used for the projection.
The software architecture developed for TwitterGeoStream is already modular and offers loose coupling and strong cohesion. As shown in figure 4, we only added new functionality to the server application: the HTTP-REST interface is extended with methods for requesting, adding, and deleting image-based markers. The tweet data are still transferred via web sockets between client and server; however, the data format is generic JSON instead of the CesiumJS-specific CZML.
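The marker operations behind the extended REST interface can be sketched with a small in-memory store; the route paths in the comments and all field names are assumptions, not the project's actual API.

```python
# Sketch of the marker operations behind the extended HTTP-REST
# interface. Route paths (in comments) and field names are assumptions,
# not the project's actual API.
import itertools

class MarkerStore:
    """In-memory stand-in for the server's data access layer."""

    def __init__(self):
        self._markers = {}
        self._ids = itertools.count(1)

    def add(self, lat, lon, keypoints):  # e.g. POST /markers
        mid = next(self._ids)
        self._markers[mid] = {"lat": lat, "lon": lon, "keypoints": keypoints}
        return mid

    def get(self, mid):  # e.g. GET /markers/<id>
        return self._markers.get(mid)

    def delete(self, mid):  # e.g. DELETE /markers/<id>
        return self._markers.pop(mid, None) is not None
```

Storing the geographic coordinates alongside the key points is what allows the geo-marker lookup described above: clients can request all image markers near their current position.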
The fast retina keypoint (FREAK) [Alahi2012]-based image recognition and marker generation is integrated into the application layer. It uses a component of ARToolKit [ARToolkit2016], which in turn builds on the established open-source image-processing library OpenCV [OpenCV2016]. The transmitted image is analyzed with the FREAK algorithm, which recognizes prominent image points and converts them into a dedicated data format for ARToolKit.
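The packaging of extracted image points into a compact record for the client can be sketched as a simple serialization round-trip; the record layout below is invented for illustration, ARToolKit's actual marker data format is different and more involved.

```python
# Illustrative sketch of the server-side marker-generation step:
# extracted image points are packed into a compact binary record for
# the client. The record layout here is an invented assumption;
# ARToolKit's actual marker data format differs.
import struct

def pack_keypoints(keypoints):
    """Serialize (x, y, descriptor) keypoints into a binary record."""
    blob = struct.pack("<I", len(keypoints))      # keypoint count header
    for x, y, desc in keypoints:
        blob += struct.pack("<ffQ", x, y, desc)   # position + 64-bit descriptor
    return blob

def unpack_keypoints(blob):
    """Inverse of pack_keypoints, as used on the client side."""
    (n,), off = struct.unpack_from("<I", blob), 4
    out = []
    for _ in range(n):
        x, y, desc = struct.unpack_from("<ffQ", blob, off)
        out.append((x, y, desc))
        off += struct.calcsize("<ffQ")
    return out
```

A compact binary record keeps the transfer between server and smartphone small, which matters because the network connection dominates the processing time reported below.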
These image points serve as the basis for the above-mentioned projection of virtual objects into real space in the client app. The markers generated by the users, i.e. the uploaded images and the extracted image points, are saved on the server via the data access layer so that they can also be retrieved by other users. The processing time from the upload of a new marker image until the projection amounts to only a few seconds and is mainly determined by the available network connection. The TwittAR client is an Android-based smartphone app. It mainly uses the two open-source components DroidAR [DroidAR2016] and ARToolKit. The three-dimensional data of the virtual world is rendered via OpenGL.
The TwittAR smartphone app is both a technical prototype and a tool, and it serves as a basis for further studies. The app offers the user added value; the user, in turn, generates and shares georeferenced data and claims certain marker images that can be used for further analyses.
Moreover, technologies for locating users in their spatial context continue to evolve. For example, the simultaneous localization and mapping (SLAM) procedure from robotics enables the above-mentioned recognition and extraction of key points in the live camera image in real time, which in turn can be used for the projection. The modular design of the prototype makes it easy to integrate and evaluate such technological innovations.
A video of the demo application can be viewed on vimeo.com: