Virtual Reality Models Based on Photogrammetric Surveys—A Case Study of the Iconostasis of the Serbian Orthodox Cathedral Church of Saint Nicholas in Sremski Karlovci (Serbia)

Featured Application: This work presents the process of creating a virtual reality application based on photogrammetric surveying of cultural heritage objects, applied to an iconostasis as an important element of a church's cultural heritage. The proposed workflow comprises photogrammetric surveying and modeling, 3D model optimization and retopology, followed by control and analysis, and creating a VR application. The final result is shown as an interactive VR tour through the church that allows a user to visualize the iconostasis in a way it cannot be perceived in real life.

Abstract: During recent years, the synergy of virtual reality (VR) and photogrammetry has become an increasingly prevalent way to visualize, represent, preserve and disseminate cultural heritage objects. Photogrammetry offers a reliable method for faithful and accurate image-based modeling of real-world objects, while VR applications provide not only visualization, but also an immersive and interactive experience of the photogrammetrically reconstructed cultural heritage. This research aims to create and apply a method for providing a VR experience in the context of cultural heritage by developing a workflow for VR applications based on photogrammetric models. The proposed workflow was applied to the iconostasis of the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci (Serbia). The presented method is based on the following main steps: generation of an accurate 3D reconstruction of the iconostasis using photogrammetry, 3D model optimization, retopology, control and analysis, and the process of creating the VR experience using a game engine. The final result is an interactive walk through the church, which provides the user with an opportunity to visualize the iconostasis and its individual icons through different perspectives and multiple levels of detail, which is not otherwise possible when observing the church interior.


Introduction
During recent years, virtual reality (VR) applications have found a wide and prominent purpose in the field of cultural heritage, and have been constantly evolving. The recent improvements in the rendering of complex geometry in real-time environments have opened the way for using photogrammetry in game engines. The aim of this paper is to create a workflow for the representation of real-world heritage and interaction with it in the virtual environment. The workflow focuses on achieving an accurate 3D reconstruction based on photographs; 3D model preparation for its further usage in VR in terms of mesh optimization and texture mapping; and, finally, the process of creating a VR experience using a game engine. The proposed workflow was applied and validated on the case study of the iconostasis of the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci. The VR application for the iconostasis is significant as it provides a user with a possibility to interact with the iconostasis and to visualize and disclose its individual icons through different perspectives and multiple levels of detail, which is otherwise not possible inside a church interior.

Photogrammetry
Data processing in photogrammetry implies the 3D reconstruction of an object's geometry from a set of images using dedicated software. Nowadays, improvements in highly automated reconstruction methods have made it possible to derive 3D model representations from a set of photographs in a shorter time compared to traditional CAD modeling techniques. This has contributed to the widespread usage of popular photogrammetry software packages which are user friendly and can be used by non-specialists. However, the results of such approaches can be quite unreliable in terms of metrics and accuracy.
To produce precise, detailed and photo-realistic 3D digital reconstructions useful for visualization, documentation, and analysis, certain criteria have to be met [4,13,29]. Structure from Motion (SfM) is a method for photogrammetric surveying in which the determination of camera positions and orientation parameters is done automatically. Nowadays, there is a variety of image-based modeling software based on SfM, such as the open-source VisualSFM, AliceVision, OpenMVS, COLMAP, and Regard3D, and the popular commercial solutions Agisoft Metashape (the new version of Agisoft PhotoScan), Reality Capture, Autodesk ReCap Photo, and ContextCapture [12,30-32].
This method can be used for creating highly detailed and precise models. Since the precision of SfM models depends on the quality and properties of the available photographs, the photographs have to be properly acquired to achieve reliable and accurate 3D point measurements. Thus, particular attention has to be given to the design of the surveying plan [4,13]. Once the photographs are taken, the photogrammetric reconstruction is done in several main steps: building a dense point cloud, generating a mesh, and finally projecting texture from the images. The final result of photogrammetric modeling is a highly detailed 3D reconstruction of an object's geometry, with a realistic texture representation.

Mesh Optimization and Texture Mapping
Highly detailed and realistic 3D models produced from photogrammetry commonly contain a large polygon count which is difficult to manage in a real-time rendering environment. The large number of polygons in such files makes them difficult to import and handle in the game engines intended for creating VR, due to the file size. Thus, the high-poly 3D model has to be reduced to a low-poly mesh. On the other hand, for a successful user experience in VR, it is important to maintain the realism of an object's 3D reconstruction, so the texture resolution should be preserved as much as possible. Therefore, mesh optimization is closely related to texture optimization. The most commonly used approach to reduce the polygon count without losing the visual quality of the 3D reconstructed model generally follows a three-step pipeline [19,21,33,34]: mesh decimation; retopology, i.e., the transformation of a mesh from a system of triangles to a system of spatial quadrilaterals [35]; and the projection of details from the high-poly model onto the optimized mesh. There are several software solutions for cleaning the model from redundant geometry, mesh optimization, retopology and repairing the texture, such as ZBrush, Blender, 3ds Max, Wrap 3D, Mudbox, MeshLab, Substance Painter, etc.
Retopology is usually performed in ZBrush [19,21,34], using the ZRemesher tool. In addition, MeshLab [19,36] can also be used for both decimation and retopology, as it is an open-source 3D mesh processing software that provides a set of tools for editing, cleaning, inspecting, rendering, and converting unstructured large meshes. In some cases, the CAD software 3ds Max [33] can be used for mesh decimation, but when dealing with high-poly 3D models such as photogrammetric SfM models, automatic decimation in 3ds Max very often crashes or causes geometric errors like self-intersections, holes, and flipped normals [33].
As already mentioned above, mesh optimization is followed by texture optimization. Texture optimization consists of two main complementary steps: mesh unwrapping (UV mapping) and map baking. Both techniques can be performed by any 3D application that supports unwrapping and texture baking.
Mesh unwrapping is a key process for applying textures appropriately on top of a 3D mesh. In a general workflow, a low-poly model is unwrapped by generating UV coordinates for each vertex in the mesh, thus flattening out the three-dimensional mesh onto a map.
Baking textures implies creating maps from the high-poly model and projecting them onto the low-poly one, in order to simulate the original high-resolution geometric details on the optimized mesh [19,21]. It is a technique that makes it possible to transfer the characteristics of 3D geometry into a 2D texture. The main characteristics that can be baked are isolated attributes, such as normals, vertex colors, and ambient occlusion [37], which add to the realism in terms of details, as well as the ambient lighting and shading of the low-poly mesh. The normal map, albedo (diffuse) map, and ambient occlusion map are most commonly used inside a real-time engine, together with the roughness map and the metallic map [19], all of which keep the highly realistic details of the high-poly object geometry on the low-poly one. In addition, it is possible to bake multiple, combined characteristics, including materials, textures, and lighting, all into a single texture [37]. This way, the low-poly 3D model and the high-resolution texture maps, separately imported into a real-time game engine, can easily be handled for creating a fully performative VR experience.
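As an illustration of the baking technique described above, the following minimal Blender Python sketch performs a "selected-to-active" normal-map bake from a high-poly onto a low-poly mesh. It is a sketch under stated assumptions, not a step of the authors' workflow (which bakes in ZBrush, as described later in this paper); the object names and the pre-created bake target are hypothetical.

import bpy

# Assumes objects named "high_poly" and "low_poly" exist, the low-poly mesh
# is UV-unwrapped, and its active material contains an Image Texture node
# selected as the bake target.
scene = bpy.context.scene
scene.render.engine = 'CYCLES'  # baking requires the Cycles engine

high = bpy.data.objects["high_poly"]
low = bpy.data.objects["low_poly"]

# Select the source (high-poly), then make the target (low-poly) active.
bpy.ops.object.select_all(action='DESELECT')
high.select_set(True)
low.select_set(True)
bpy.context.view_layer.objects.active = low

# Project details from the selected high-poly mesh onto the active low-poly one.
bpy.ops.object.bake(type='NORMAL',
                    use_selected_to_active=True,
                    cage_extrusion=0.05,  # ray offset; tune to the mesh scale
                    margin=8)             # pixel padding around UV islands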

Virtual Reality
When it comes to creating a VR experience, several different game engines can be employed. The two most popular, however, are Unity [20,22,23,28,33] and Unreal Engine [19,21,24,26,34]. Authors who selected the Unreal Engine [21,24,26,34] pointed out its intuitive use mode, based on the Blueprint Visual Scripting system, as its major characteristic. Compared to the others, this system is extremely flexible, as it relies on the concept of a node-based interface, thus enabling a user to create gameplay elements directly from the Unreal Editor [34,38]. The Blueprint Visual Scripting system provides a simple workflow within the game engine, based on connecting nodes instead of traditional programming. In this way, the Unreal Engine makes it possible to create interactive events and gameplay elements by using Blueprint concepts and tools.
In order to create an immersive experience of a virtual scene, a VR headset system should be employed. The VR experience can be achieved using either a PC with an HMD system or a cardboard viewer with a smartphone. For additional interaction and navigation through the virtual environment, controllers need to be included, too.
When developing an interactive reality-based VR application, special attention needs to be given to the particular way of engaging the user with the digital environment, in terms of enabling him/her to experience it in a way that is otherwise not possible inside the real environment. One of the common methods to achieve this is using the VR teleportation mode [21,33]. In the next section, the method created for developing a VR environment from a photogrammetrically surveyed cultural heritage object is presented.

Method for Virtual Reality Model Based on Photogrammetric Survey
In this chapter, we shortly present how the 3D model obtained by photogrammetric surveying is prepared to be used in the Unreal Engine to create a Virtual Reality presentation. The method consists of four steps:
1. Photogrammetry
2. Optimization and Retopology
3. Control and Analysis
4. Creating a VR environment

Photogrammetry was achieved using the Structure from Motion algorithm, which consists of surveying and data processing in Agisoft Metashape. The data acquired by the surveying process are photographs of an observed object. The first step is the processing of the obtained photographs using photogrammetry software, which results in creating depth maps. From the depth maps, a high-poly triangular mesh is generated.
The next step is Optimization and Retopology; these processes are done in ZBrush software. Through the optimization process, the polygon count is decreased. By employing the retopology technique, the triangular mesh is converted to a quadrilateral mesh. The resulting low-poly quadrilateral mesh is then exported to Blender software for the unwrapping process. The unwrapped mesh is exported to ZBrush, where the process of baking textures is performed. The unwrapped mesh is also exported back to Agisoft Metashape for texturing of the low-poly mesh.
The next step is Control and Analysis of the mesh and textures. We used Substance Painter for texture correction, from which the Diffuse and Specular maps were exported. 3ds Max was used as a tool for texture wrap control, where we checked whether the texture fits well on the low-poly mesh.
Finally, MeshLab was used for the comparison of the geometrical details of the high-poly and low-poly models, i.e., mesh analysis. The result of the third step is a low-poly quadrilateral mesh with a high-quality texture.
The last step in the workflow is creating a VR environment. This step is done in the Unreal Engine, using the mesh and texture from the third step as assets. In this last step, a scene was created with appropriate lights and interaction. The block diagram presenting the method is shown in Figure 1. A more detailed technical explanation is given in Section 3.2.

Iconostasis of the Serbian Orthodox Cathedral Church of Saint Nicholas in Sremski Karlovci
In the VR application for the iconostasis of the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci, the iconostasis and the particular icons are visualized in the virtual environment, through different perspectives and multiple levels of detail.
The iconostasis is a distinctive element of Christian Orthodox churches. It serves as an altar barrier in a church interior, the dimensions of which are dictated by the width and height of the church it is situated in. The iconostasis is characterized by its highly decorated structure which holds framed icons. As such, it can also be considered a screen of religious art paintings that can be seen only in situ. The icons are arranged in rows by height, making it impossible to visualize up close the icons placed at the highest point. In this regard, the VR application is significant, as it provides a user with a possibility to visualize the iconostasis in a way that is not possible inside the real environment of a church interior.
The iconostasis of the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci dates back to 1780-1781 and represents the Baroque style. It is situated between the sanctuary containing the altar area and the nave of the church. Its dimensions of 11.6 m (width) × 14.2 m (height) follow the width and height of the church. It has great cultural heritage value since it contains unique religious works, painted by Teodor Kracun and Jakov Orfelin, two prominent 18th century Serbian artists.

Photogrammetric Surveying
The 3D model of the iconostasis is created using the photogrammetry approach. In order to obtain highly accurate 3D reconstruction results in terms of the relationship between the texture resolution and the ground sample distance (GSD), it is necessary to design the on-site surveying plan. Ground sample distance is the size of the image pixel expressed at the object scale. Taking the object scale into account, a suitable level of detail of the 3D reconstruction can be defined by the GSD. The GSD has to be smaller than the smallest details the model should hold. Historic England [29] defined the maximum permissible GSD values for the different output scales of an ortho-image derived from photogrammetry.
Designing the surveying plan for the data acquisition implies reconnaissance of site conditions and constraints, in order to determine preferable camera locations. By determining an appropriate GSD value, the shooting distance (distance from the camera to the object), the baseline (distance between two camera locations), and the geometrical camera parameters (focal length) can be calculated according to the object's site conditions and the camera sensor size. We used a camera with a large sensor to avoid diffraction and signal-to-noise artefacts which decrease the image quality. On site, image exposure parameters such as aperture, shutter speed and ISO settings should be set in order to achieve the highest image quality and sharpness.
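The relation used above can be written down explicitly: under the simple pinhole model (ignoring lens distortion), GSD = pixel size × shooting distance / focal length. The following short sketch computes it; this is an illustration of the standard formula, not the authors' planning tool, and the example values are not the survey's actual parameters.

# GSD planning helpers (all lengths in metres).
def gsd(pixel_size_m: float, distance_m: float, focal_length_m: float) -> float:
    """Ground sample distance for a given camera and shooting distance."""
    return pixel_size_m * distance_m / focal_length_m

def max_distance(target_gsd_m: float, pixel_size_m: float, focal_length_m: float) -> float:
    """Largest shooting distance that still achieves the target GSD."""
    return target_gsd_m * focal_length_m / pixel_size_m

pixel = 4.78e-6  # 4.78 um pixel size
focal = 24e-3    # 24 mm lens

print(f"GSD at 10 m: {gsd(pixel, 10.0, focal) * 1000:.2f} mm")             # ~1.99 mm
print(f"Distance for 2 mm GSD: {max_distance(2e-3, pixel, focal):.1f} m")  # ~10.0 m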
The surveying plan is designed taking into account the specific iconostasis shape, distinguished by a significantly lower object depth in relation to its width, as well as the church interior conditions, characterized by very low illumination. The Aperture priority mode was employed to determine the correct exposure time. The surveying was performed using a tripod and a D7000 camera (NIKON, Tokyo, Japan; pixel number: 4928 × 3275; pixel size: 4.78 µm; sensor size: 23.6 × 15.6 mm; focal length: 18-109 mm), equipped with a wide-angle lens at 24 mm focal length. By taking advantage of the unrestricted shooting distance, the GSD value was set according to the desired level of detail of the 3D reconstruction. The GSD value was 1.5 mm at a 13 m shooting distance and a baseline of 50 cm. After the 3D model reconstruction, the overall accuracy of the model was 2.6 mm.
The data were processed using the photogrammetry software based on the depth-maps method. The mesh was generated on the basis of both the point cloud and the depth maps. For further application, the mesh generated from depth maps was used, since it provided a better detail representation, by means of a more detailed 3D model with closed holes, at the same time containing far fewer polygons. The mesh generated from the depth maps is illustrated in Figure 2.
The data acquired by the surveying process are images of an observed object, usually output in JPEG, TIFF or RAW format. Recording both JPEG and RAW files is beneficial, since the RAW format preserves the whole dynamic range offered by the camera and allows post-capture image alteration without affecting the original data [29]. When it comes to 3D model texturing, retaining the RAW format is useful as it allows us to readjust an image's exposure with no loss of quality.
In terms of the data processing and 3D reconstruction workflow, the software follows the common pipeline, which consists of the following main steps. The very first step is the multi-image alignment. It implies feature detection, matching point pairs, and triangulation. The object points in 3D space are derived by triangulation of the image rays projected from at least two, but ideally more, images. In this step, the Bundle Adjustment process is employed to refine the geometric properties of the camera and the lens distortion and to optimize the 3D reconstruction. The output of the image alignment process is a sparse point cloud of the object, together with the estimated camera positions. The difference between the 3D position of an estimated tie point and its re-projected position on an image (RMS reprojection error) can be used as one of the metrics for evaluating the SfM process [31]. The sparse point cloud can optionally be optimized by removing low-quality tie points, either manually or automatically. The second step of the processing is the calculation of the dense point cloud, which consists of the re-computation of the orientation parameters using a larger number of image pixels. It is usually followed by manual and/or automatic point filtering in order to remove noise from the dense point cloud. The next step implies building a mesh from the object's dense cloud data. The new release of the Agisoft PhotoScan successor software, Metashape (2019), is the most accurate low-cost solution [31,32], and it makes it possible to create the mesh directly from depth-map data, without the need to previously produce a dense point cloud [39]. The calculation of a depth map for each image is an intermediate step of the processing, and the generated depth maps present the source data both for creating a dense cloud and for the new meshing method [40]. Compared to the traditional method based on a dense cloud, the new depth-map-based mesh generation method [39]:
• Produces largely complete geometry with much greater detail and a lower number of polygons than PhotoScan,
• Halves the mesh generation runtime,
• Reduces memory consumption,
• Uses the GPU instead of the CPU.
The abovementioned leads to the conclusion that the new mesh generation method might be a more suitable solution for further VR visualization, compared to the mesh generated from the dense cloud, which could be used for different reconstruction and structural evaluation analyses.
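For readers who prefer to script this pipeline, the sketch below uses the Agisoft Metashape Python API. Method names follow recent (1.6+) API versions; the file paths are hypothetical and the parameter values are illustrative, not those used in this project.

import Metashape

doc = Metashape.Document()
chunk = doc.addChunk()
chunk.addPhotos(["photos/img_0001.jpg"])  # list of survey images

# 1) Multi-image alignment: feature detection, matching, bundle adjustment;
#    outputs the sparse point cloud and the estimated camera positions.
chunk.matchPhotos(downscale=1, generic_preselection=True)
chunk.alignCameras()

# 2) Depth maps as the intermediate product of the processing.
chunk.buildDepthMaps(downscale=2, filter_mode=Metashape.MildFiltering)

# 3) Mesh generated directly from the depth maps (the method preferred above),
#    skipping the dense point cloud.
chunk.buildModel(source_data=Metashape.DepthMapsData)

# 4) Texture generation from the source images.
chunk.buildUV(mapping_mode=Metashape.GenericMapping)
chunk.buildTexture(blending_mode=Metashape.MosaicBlending, texture_size=8192)

doc.save("iconostasis.psx")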
The final phase of the data processing is texture generation, used to cover the 3D model with a texture adopted from the images. In this case, the gathered RAW data might be post-processed in order to improve an overall process of object texture mapping.

Optimization and Retopology
The 3D model representing the iconostasis obtained as a product of the Agisoft Metashape software was a mesh with too many polygons. In particular, the 3D model had 1,250,000 polygons, and the polygons produced by the software were triangles (Figure 3a). A 3D model with triangle-based topology can produce sharp angles that can affect the design of a tessellated surface [41]. Such a model is not suitable for use in a game engine and VR environment. Thus, the retopology, i.e., the transformation of the mesh from a system of triangles to a system of spatial quadrilaterals, is extremely important. A surface consisting of a quadrilateral polygon system (Figure 3b) is more convenient for the algorithms used for VR rendering, performed in real time [42-44].
When dealing with photogrammetric models, it is recommended to optimize the mesh, in terms of decreasing the polygon count, before performing the retopology technique, as this improves its overall speed and performance [45]. In addition, to improve the general result, it is also possible to redo a retopology over a previous one, since this provides a better polygon flow while keeping the poly count unchanged [45]. In any case, before performing the optimization, the mesh should be duplicated, in order to keep the original model as a reference from which the detail information will be projected onto the optimized mesh [34,46].
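The decimation performed below with ZBrush's Decimation Master has a scriptable open-source counterpart: MeshLab's quadric edge-collapse simplification, exposed through pymeshlab. The following is a sketch of that alternative (filter and parameter names per recent pymeshlab releases; file names and the face budget are hypothetical), not the tool used in this paper.

import pymeshlab

ms = pymeshlab.MeshSet()
ms.load_new_mesh("iconostasis_high.obj")  # photogrammetric high-poly mesh
print("faces before:", ms.current_mesh().face_number())

# Collapse edges until the target face budget is reached, trying to keep
# normals, boundaries and topology intact so that details survive.
ms.meshing_decimation_quadric_edge_collapse(
    targetfacenum=250000,
    preservenormal=True,
    preserveboundary=True,
    preservetopology=True)

print("faces after:", ms.current_mesh().face_number())
ms.save_current_mesh("iconostasis_low.obj")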
In this paper, the high-poly mesh was created in Agisoft Metashape software, and ZBrush was used for the mesh optimization and retopology process. The Decimation Master tool in ZBrush [47] was used as a standard step in preparing models for export and use in the game engine. This tool groups adjacent polygons lying in the same plane into one larger polygon, thus reducing the number of vertices on the surface. This way, the polygon count of the 3D mesh of the iconostasis (Figure 4a) was easily reduced while keeping all of the mesh details (Figure 4b). For the retopology, the DynaMesh tool [48] in ZBrush was used to preserve details during the transformation of the triangle mesh into a quadrilateral mesh (Figure 5). In order to create an optimal geometrical representation of surfaces, in terms of reorganizing the direction in which the polygons spread, the ZRemesher tool [49] was employed.
After completing the mesh geometry editing process, a low-poly mesh unwrap needs to be done. Unwrapping is the process of developing a mesh into a plane, with the result that each mesh vertex is described by two coordinates (X and Y). Good unwrapping allows 2D textures to be correctly applied to the 3D model. The low-poly, optimized model from ZBrush was exported to Blender software, which was used for mesh unwrapping (Figure 6). The Smart UV Project tool used for this process is an automatic tool that cuts the mesh based on an angle threshold (angular changes in the mesh) [50]. It is a good method for both simple and complex geometric forms, such as mechanical devices or architectural objects. The algorithm examines the shape of the mesh, the selected faces and their relation to one another, and creates a UV map based on this information and the settings that were entered.
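The Smart UV Project step can also be run from a script. The following Blender Python sketch illustrates it (the object name is hypothetical, and the settings are examples rather than those used here; note that angle_limit is given in radians in Blender 2.91+ but in degrees in older releases).

import math
import bpy

obj = bpy.data.objects["iconostasis_low"]
bpy.context.view_layer.objects.active = obj
obj.select_set(True)

bpy.ops.object.mode_set(mode='EDIT')
bpy.ops.mesh.select_all(action='SELECT')

# Cut the mesh into UV islands wherever face angles change sharply.
bpy.ops.uv.smart_project(angle_limit=math.radians(66.0),
                         island_margin=0.02)  # spacing between islands

bpy.ops.object.mode_set(mode='OBJECT')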
The UV map, a flat representation of the surface of a 3D model, is used to easily wrap textures back onto the surface. After unwrapping in Blender, the unwrapped mesh was imported back into ZBrush for projecting details from the high-poly model onto the low-poly unwrapped mesh (Figure 7). This step is needed to transfer the level of detail from the high-poly model to the low-poly model and, as a result, the Displacement and Normal maps were created to preserve details.
The next step in the texture projection process is to replace the initial high-poly model with the newly created low-poly mesh. The low-poly model with details from the high-poly one was imported back into Agisoft Metashape, and the texture from the original photos (RAW format) was then projected onto the unwrapped low-poly 3D model (Figure 7). The option "Keep UVs" was enabled during the texture projection process because otherwise Agisoft Metashape creates its own UV map.

Control and Analysis
The texture obtained in the previous step contained gaps represented as black stain spots, which occurred on the parts of the model that were not accessible to capture during the photogrammetric surveying. Substance Painter [51] was used to correct those gaps, within which the black parts of the texture were corrected manually, by combining colors from the surrounding pixels.
From the improved texture of the iconostasis 3D model, two additional maps, the Diffuse map and the Specular map, were exported from Substance Painter. Since the iconostasis contains parts that are covered with reflective material, the Specular map was desirable to represent the surface's shininess and highlight colors. All four maps (Diffuse, Displacement, Normal and Specular) must have the same U and V texture coordinates to be well wrapped onto the mesh. 3ds Max was used for checking that the textures fit well on the low-poly model (Figure 8). All of the created maps, Diffuse, Displacement, Normal and Specular, were imported into 3ds Max for control (Figure 9).
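As a trivial scripted sanity check (our illustration, not a step the authors describe), one can at least verify that the four maps share the same pixel resolution before importing them into the engine, since they all address the same UV layout; the file names are hypothetical.

from PIL import Image

maps = ["diffuse.png", "displacement.png", "normal.png", "specular.png"]
sizes = {name: Image.open(name).size for name in maps}

# All maps must have identical dimensions to address the same UV space.
assert len(set(sizes.values())) == 1, f"Mismatched map sizes: {sizes}"
print("All maps share resolution:", next(iter(sizes.values())))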
We also analyzed Wrap 3D [52], a retopology software typically used for humanlike characters in CG animation. For the retopology process in this software, it is necessary to have a reference model that is already optimized and whose polygons are quads. The software library contains only humanlike reference models, so it is not quite suitable for the optimization and retopology of architectural objects.
MeshLab [53] was used for controlling the quality of the optimized mesh (low-poly) as well as its deviations from the original mesh (high-poly). The reference mesh was the mesh obtained in Agisoft Metashape. Both meshes (high- and low-poly) were imported and aligned in MeshLab to check the quality of the details on the low-poly model in comparison with the reference high-poly 3D model. The low-poly and high-poly 3D models were compared in MeshLab by calculating the vertex geometric distance between them, using the Distance from Reference Mesh filter. The resulting vertex quality is illustrated as a visual representation of the values (Figure 10). From the visual representation of the histogram values, shown in green in Figure 10, it can be seen that the greatest level of detail of the high-poly model was preserved on the low-poly one, while the deviation between the vertices of the models is lower than 1 mm, in particular: −0.81 mm for the concave parts of the model and +0.79 mm for the convex parts. The product of the mesh post-processing is a low-poly mesh and high-quality textures, which are the assets that are further imported into the virtual reality environment [54,55].
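The same deviation check can be scripted: pymeshlab exposes MeshLab's Hausdorff-distance filter, the scriptable counterpart of the Distance from Reference Mesh workflow used above. A minimal sketch follows, with hypothetical file names and filter names per recent pymeshlab releases.

import pymeshlab

ms = pymeshlab.MeshSet()
ms.load_new_mesh("iconostasis_high.obj")  # mesh id 0: reference (high-poly)
ms.load_new_mesh("iconostasis_low.obj")   # mesh id 1: optimized (low-poly)

# Sample vertices of the low-poly mesh and measure distances to the reference.
stats = ms.get_hausdorff_distance(sampledmesh=1, targetmesh=0, samplevert=True)

# min/max/mean/RMS deviation, reported here in millimetres (model in metres).
for key in ("min", "max", "mean", "RMS"):
    print(f"{key}: {stats[key] * 1000:.2f} mm")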

Virtual Reality
Using Virtual Reality to create an interactive walk through the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci provides the user with an opportunity to view the iconostasis in a way it cannot be seen in real life. The VR environment [56] for the iconostasis contains the 3D model [57,58], additional interactive light and heightened horizontal platforms for seeing the icons more precisely and closely, as well as image galleries and additional textual information about the main characteristics of the presented art piece. The game engine Unreal Engine 4 was used to create the overall environment, which is suitable for a head-mounted display (HMD) and motion controllers and was tested on an Oculus Rift (Facebook Technologies, LLC, Menlo Park, California, USA). Depending on the user's primary goal (teleportation, lifting objects, relocating to another level), either an existing script was used or a new one was created in the C++ programming language.

Movement Through the Church Using Controllers
In order to navigate through the virtual scene, the user has to use the HMD and controllers [56,59], where the range of motion is predefined. Within the predefined range of motion, pressing a specific controller button teleports the user to the desired location.
The inability to teleport is the main problem that most often occurs in practice [59]. Creating a volume of the desired dimensions, placed below those parts of the scene in the frame, is the way of solving the problem of not being able to move. It is a transparent, four-sided prism that is not visible during the walk, and it allows the user to navigate seamlessly through 3D space (Figure 11).
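For illustration, such a volume can also be placed through Unreal's editor Python scripting. The sketch below assumes the teleport volume is a NavMeshBoundsVolume, as in the stock UE4 VR template, which is an assumption on our part rather than a detail stated in the paper; note that brush-based volumes spawned from script may still need their extents adjusted manually in the editor.

import unreal

# Spawn a navigation volume just below the walkable floor so the teleport
# system has somewhere to land (coordinates and scale are hypothetical).
location = unreal.Vector(0.0, 0.0, -50.0)
volume = unreal.EditorLevelLibrary.spawn_actor_from_class(
    unreal.NavMeshBoundsVolume, location)

volume.set_actor_scale3d(unreal.Vector(30.0, 30.0, 2.0))  # cover the nave
volume.set_actor_label("TeleportVolume_Nave")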

By tracking the user's head and hand movement and monitoring the controllers, it is possible to know exactly where the user is positioned and where he/she is looking [59,60]. Teleportation through a scene is easy for users because they are not required to navigate via the joystick on the controllers. The user in real life is standing in one spot, and the distance to be moved can be chosen with just one click of a button, as well as the direction they want to face after the teleportation is done. Collision is of great importance in this part of the project, because of which the walls, ceilings, floors, and iconostasis must be covered to prevent teleportation through those parts of the scene. The sensors do not always detect the controllers, and it takes some time for the detection to run smoothly.
Figure 11. Navigation through the church using motion controllers.
In addition to moving through the ground level of the scene [59,61], two elevated horizontal platforms were created at the height of the icons on the iconostasis, which are invisible while using the VR equipment. Both platforms are located at the same distance from the iconostasis, and by teleporting through the space, the user can climb onto them to observe the icons from close range, which cannot be done in real life without scaffolding, ladders, and similar equipment. The movements at the elevated levels are no different from the movements at the ground level of the church, except that the teleportation signal needs to be pointed at the particular platform in order to move properly. In that way, certain details of the iconostasis can be seen more easily than in the real world.
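A hedged editor-scripting sketch of how such platforms could be placed follows; the paper does not state that this step was scripted, and the mesh asset, positions and sizes below are hypothetical.

import unreal

# A basic engine cube serves as the platform geometry.
platform_mesh = unreal.EditorAssetLibrary.load_asset("/Engine/BasicShapes/Cube.Cube")

for name, height in [("Platform_Lower", 400.0), ("Platform_Upper", 800.0)]:
    actor = unreal.EditorLevelLibrary.spawn_actor_from_class(
        unreal.StaticMeshActor, unreal.Vector(0.0, -300.0, height))
    actor.static_mesh_component.set_static_mesh(platform_mesh)
    actor.set_actor_scale3d(unreal.Vector(6.0, 2.0, 0.1))  # wide, thin slab
    actor.set_actor_label(name)
    # Invisible during the VR walk; collision stays on so the user can stand on it.
    actor.set_actor_hidden_in_game(True)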

Interactive Additional Light
Additional scene lighting is implemented using a flashlight. For an asset to be interactive, it has to be presented as a Blueprint [38]. The Blueprints system is based on the concept of using a node-based interface to create interactive events and gameplay elements from within the Unreal Engine. By using virtual hands, the flashlight can be moved, lifted and thrown, all during a virtual walk through the scene (Figure 12). These capabilities are provided by the Blueprint, which can be applied to assets as desired. The collision of an interactive object has to be detailed in order to give a more realistic effect when grabbing a given model; for a more complex, high-poly asset, a complex collision has to be used, since a simple collision on such an object will not produce the same effect. The flashlight is created as an asset to which the light is attached. The intensity and color of the light can be changed independently of the model of the flashlight. The lighting and model together make a Blueprint that is interactive in the scene and which the user can control with their hand [55] by holding a specific button on the controller. When the same button is released, the flashlight falls out of the user's hands. When teleporting through a scene, the user can carry the flashlight with them to additionally illuminate certain segments of the iconostasis that may not be sufficiently lit. As with the horizontal platforms mentioned in the previous section, the user can navigate the height levels while carrying the flashlight, thus illuminating the desired iconostasis segments much more easily than in the real world.

Iconostasis Image Galleries
Creating an icon gallery [61,62] involves creating as many additional levels as there are interactive icons in a scene (Figure 13a,c,e). While observing the iconostasis, it is not possible to receive any additional information about the icons. What activates the ability to display icons in more detail are interactive widgets and a laser, which can be activated by pressing a specific controller button. The laser comes out of the virtual fist, and as it is activated and pointed at a particular icon, a corresponding frame or widget appears in front of it (Figure 13b,d,f). The widgets need to be created as Blueprint and programmed so that by clicking on the widget in front of the corresponding icon, the user is moved to the isolated space where that icon is located. Pressing the appropriate button on the controller activates the widget and the user moves to the new space. The principle of creating an icon gallery is identical for all icons used in this project.
A separate level was created for each icon, and each of them consists of an image of the icon, a short text about the name of the icon, the author and the method of painting (Figure 13b,d,f), as well as a widget to return to the main stage, which looks the same at all levels and is displayed as a symbol of the church. The lighting and environment are identical for each level, and the user is prevented from moving within them, in order to focus on the icon and the textual information. The only options for the user in that space are to observe and to return to the main level, which is the church.
In this paper, a VR method based on the photogrammetric surveying of real-world heritage objects has been created. It has been explained how a 3D model obtained by photogrammetric imaging is prepared for use in the Unreal Engine to create the VR presentation, and how the VR scene is developed. VR is a specific way of experiencing space, and one of its benefits is the interaction it enables. Compared to other methods of observing cultural heritage, VR is different because it gives the impression of being present in a given, virtual space, while augmenting the user's perception with computer-generated digital content.

Lessons Learned
In the following, we briefly summarize the lessons learned from this work. This section aims to provide concise practical guidance for creating a Virtual Reality model based on a photogrammetric survey. The main lessons learned from the research were:

• The advantage of photogrammetric surveying and modeling is the detailed and high-quality 3D reconstruction of a cultural heritage object, with significant time savings.
• To use the photogrammetric model in a VR environment, optimization and retopology are inevitable in order to reduce the polygon count. The reduction should be done in a way that allows mesh unwrapping and texture baking to preserve all the realistic details of the reconstructed 3D model.
• The control and analysis, based on the inspection of the geometrical details of the mesh and textures, confirmed the effectiveness of the optimization and retopology. The confirmation of this conclusion was a low-poly quadrilateral mesh with a high-quality texture, suitable for real-time rendering in a VR application.
• The prepared 3D model placed in a game engine allowed generating interactive virtual content that offers users additional information about the real-world object. However, incorporating the 3D model obtained by photogrammetry into the VR scene was not an easy task, and the preparation of a scene for an interactive virtual tour took a long time. The preparation of the VR scene, in terms of creating the interaction and navigation within the virtual environment, requires additional skills in handling the scripting system inside a game engine, as well as additional equipment such as a VR headset system and motion controllers. The main tasks were to create interactive widgets that help the user teleport to different levels, depending on the choice of icon, and to create a Blueprint for an interactive element that emits light. It was also necessary to create an environment in which the user moves while focusing on the iconostasis as well as the surroundings, so that the user would not find themselves in an unfinished space.
The results of the research confirmed that the synergy of photogrammetry and VR offers a great opportunity for the interactive visualization of cultural heritage. The method described in this paper has proven to be efficient in terms of creating virtual reality content based on a photogrammetrically reconstructed real-world object. This method can be applied to a variety of similar objects.

Conclusions
The precise and highly detailed virtual representation of cultural heritage objects remains a challenge, even with the rapid development of computer graphics. The combination of photogrammetry and VR offers great possibilities for new ways of interactively visualizing cultural heritage. However, incorporating photogrammetry into a VR scene is not easy, and the preparation of a scene for an interactive virtual walk can take a long time. This is why it is important to develop a method connecting photographs as the input data with an interactive 3D VR scene as the result.
The method presented here consists of four main steps. The steps, explained through a block diagram, can be used in a similar manner for a wide range of different cultural heritage objects. A VR interactive walk through the Serbian Orthodox Cathedral church of Saint Nicholas in Sremski Karlovci has been made as an application of the created method. The photogrammetric approach is used to create a realistic representation of the iconostasis. Optimization, retopology, control and analysis are used to adjust the 3D model to be applicable in VR. The virtual content is generated by placing the 3D model in the game engine and offering additional content that complements the information about the real environment.
The VR tour provides the user with an opportunity to view the iconostasis in a way it cannot be seen in real life. The representation of the iconostasis using a virtual environment presented in this research generates virtual content that offers users additional information complementing what can be seen in the real environment. In future work, we plan to investigate the possibilities of creating a more functionally complex scene in the VR environment and to apply the method to cultural heritage objects of different sizes and shapes. We also intend to include semantics as a way of developing a strategy for model simplification and the use of interaction [14,15], related to particular users' and applications' needs.