Special Issue on Visual Sensors


Introduction
Visual sensors have characteristics that make them interesting as sources of information for any process or system. On the one hand, they capture precise, high-resolution information about the environment while remaining compact and inexpensive. On the other hand, they capture a large quantity of information from their surroundings. These properties explain why they have been employed for several decades to solve a wide range of tasks, and this versatility is making them an increasingly common source of information across diverse fields of application.
Nowadays, a wide variety of visual systems can be found, from classical monocular systems to omnidirectional, RGB-D, and more sophisticated 3D systems. Each configuration has specific characteristics that make it useful for solving different problems. The range of applications is wide and varied, and includes robotics, industry, agriculture, quality control, visual inspection, surveillance, autonomous driving, and navigation aid systems.
Visual systems can be used to obtain relevant information from the environment, which can be processed to solve a specific problem. The aim of this Special Issue is to present some of the possibilities that vision systems offer, focusing on the different configurations that can be used and novel applications in any field.
A total of 63 contributions were submitted to this Special Issue, and 36 of them were published (i.e., a 57% acceptance rate). The published articles give a representative view of how visual sensors are used in very different fields of application, from mapping for mobile robot navigation to object recognition and scene reconstruction.

Contributions to the Special Issue on Visual Sensors
Several papers of the Special Issue address the visual navigation of mobile robots, including SLAM (Simultaneous Localization and Mapping) and visual odometry. In [1], an RGB-D SLAM algorithm is presented that uses the concept of orientation relevance and takes Manhattan Frame Estimation into account. Teng et al. [2] provided a method for aircraft pose estimation that does not rely on 3D models and instead uses two widely separated cameras to acquire the pose information. In [3], a new framework for online visual object tracking is proposed. A motion-aware strategy predicts the possible region and scale of the target in the frame by utilizing the previously estimated 3D motion information. Wang et al. [4] provided an improved indoor visual SLAM method that uses point and line segment features extracted by stereo cameras, achieving robust results. In [5], an RGB-D sensor is employed to build a dense 3D semantic map of the environment by means of a Pixel-Voxel network. Aladem et al. [6] proposed a low-overhead real-time ego-motion estimation (visual odometry) system based on either a stereo or an RGB-D sensor. The proposed algorithm uses a local map, which requires significantly less memory and computational power. Nawaf et al. [7] provided the details of a visual odometry method adapted to the underwater context. They employed the captured stereo images to provide real-time navigation and a site coverage map, which is necessary to conduct a complete underwater survey. Valiente et al. [8] presented a visual information fusion approach for robust probability-oriented feature matching. This approach can be used in a more general SLAM procedure. The strategy identifies relevant areas in the image reference system in which probable matches can be detected.
Image retrieval aims at browsing, searching, and retrieving images from a large database of digital images. New descriptors that capture the characteristics of an image can be key in this regard. García-Olalla et al. [9] presented a new texture descriptor booster based on statistical information of the image. This descriptor is employed in texture-based image classification. Fareed et al. [10] proposed a framework for salient region detection that uses appearance-based and regression-based schemes to reduce the computational complexity and to focus on the salient parts of the image. Feng et al. [11] proposed a texture descriptor for image retrieval, designing a local parallel cross pattern in which the local binary pattern map is fused with the color map. In addition, Feng et al. [12] proposed a hybrid histogram descriptor for image retrieval. The proposed descriptor combines two histograms: a perceptual uniform histogram and a motif co-occurrence histogram that includes the probability of a pair of motif patterns. Finally, García-Olalla et al. [13] proposed a method for textile-based image retrieval in indoor environments that describes the images in different channels (RGB, HSV, etc.) and combines two different descriptors for each image.
Visual sensors can also be an important source of information to support other tasks. In [14], a novel global point cloud descriptor is proposed for reliable object recognition and pose estimation, which can be applied to robot grasping operations. Martínez-Martin et al. [15] provided an approach based on depth cameras to robustly evaluate manipulation success in robot object manipulation. The proposed method allows the robot to accurately detect the presence or absence of contact points between the robot manipulator and a held object. Xue et al. [16] presented a vision system capable of automatic 3D joint detection. The detection method is applied in a robotic seam tracking system for gas tungsten arc welding.
The calibration of vision systems plays a very important role in the applications where these types of sensors are used. A well-calibrated system yields more robust results in later stages. Zhang et al. [17] presented a simple calibration method for laser range finder systems that needs only a calibration board. In [18], an alternative approach is provided that uses gray-code patterns displayed on an LCD screen to determine camera parameters. The proposed approach is 1.5 times more precise than standard calibration with a checkerboard pattern. Finally, Choi et al. [19] proposed a method that automatically calibrates the four cameras of an around view monitor system in a natural driving situation.
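As background for the gray-code patterns mentioned for [18], the short Python sketch below illustrates the binary reflected Gray code, in which neighbouring values differ in exactly one bit. It is only an illustration of the encoding and does not reproduce the calibration method of [18]; the function names and the one-code-per-screen-column layout are assumptions introduced here.

```python
def gray_code(n: int) -> int:
    """Binary reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_code_patterns(width: int) -> list[list[int]]:
    """Build one binary stripe pattern per bit, most significant bit first.

    Hypothetical layout: each screen column x is labelled with gray_code(x),
    and pattern k shows bit k of that label for every column.
    """
    n_bits = max(1, (width - 1).bit_length())
    codes = [gray_code(x) for x in range(width)]
    return [[(c >> b) & 1 for c in codes] for b in range(n_bits - 1, -1, -1)]


def decode_column(bits: list[int]) -> int:
    """Recover the column index from the bits observed at one pixel."""
    gray = 0
    for bit in bits:  # most significant bit first
        gray = (gray << 1) | bit
    value = 0
    while gray:  # Gray-to-binary: XOR of all right shifts
        value ^= gray
        gray >>= 1
    return value


patterns = gray_code_patterns(8)
observed = [p[5] for p in patterns]  # bits seen at column 5
print(observed, decode_column(observed))
```

Because adjacent columns differ in a single bit, a decoding error at a stripe boundary shifts the recovered index by at most one column, which is the usual reason such patterns are preferred over plain binary stripes.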
Object recognition is a task in which a vision system is almost always involved. During the past few years, many proposals have been made in this area, including different methods for recognizing the objects present in an image. Kapuscinski et al. [20] presented a method for hand shape recognition based on skeletal data. It encodes the relative differences between vectors associated with the pointing directions of the individual fingers and the palm normal. Wang et al. [21] presented a new spatiotemporal action localization detector that consists of sequences of per-frame segmentation masks. The proposed detector can pinpoint the starting or ending frame of each action category in untrimmed videos. In [22], a system is presented for automatically designing the field of view of a camera, the illumination strength, and the parameters of a recognition algorithm. Nguyen et al. [23] proposed a new presentation attack detection method for an iris recognition system using a near-infrared light camera image. This method aims to counter the effect that presentation attack images captured from high-quality printed images can have on classic iris recognition systems. Fu et al. [24] presented an approach for pedestrian detection that combines different previously proposed methods with an efficient sliding window classification strategy. The detector achieves fast detection speed together with state-of-the-art accuracy. Wang et al. [25] proposed a model to solve the 3D reconstruction problem for dynamic non-rigid objects with a single RGB-D sensor.
Over the past few years, the field of visual systems has been shifting from classical statistical methods to deep learning methods. Video-based person detection and recognition is an important task with many problems and challenges, such as lighting variation, occlusion, and human appearance similarity. In [26], a video-based person re-identification method with hybrid deep appearance-temporal features is proposed. Another application of deep learning methods was presented by Arsalan et al. [27]. The authors proposed a densely connected fully convolutional network, which can determine the true iris boundary even with inferior-quality images by using better information gradient flow between the dense blocks. Liu et al. [28] proposed a method to improve the performance of the star sensor under dynamic conditions based on the ensemble back-propagation neural network.
Scene reconstruction is a key step in solving more complex problems such as mobile robot navigation. Xia et al. [29] presented a visual-inertial odometry method as a solution for robot navigation. Cheng et al. [30] presented a high-accuracy method for globally consistent surface reconstruction using a single fringe projection profilometry sensor. Lane marking detection and localization are crucial for autonomous driving and lane-based pavement surveys. In [31], a novel methodology is presented for automated lane marking identification and reconstruction, and a case study is given to validate it. Finally, Zhang et al. [32] proposed an improved method for UAV image seamline searching. The experimental results show that the proposed method can effectively solve the problems of ghosting and seams in panoramic UAV images.
Finally, one of the most widely discussed topics in vision systems is visual measurement. Some of the papers of the Special Issue revolve around this problem. In [33], the authors presented an improved rotation-angle measurement method based on geometric moments that is suitable for automatic sorting systems. In [34], a stereo vision system is employed for measuring the ram speed of steam hammers. The system is designed to reduce the influence of strong vibration. The accuracy and effectiveness of the method were experimentally verified. Li et al. [35] proposed a pose estimation method for sweet pepper detachment. The acquired point cloud is separated into candidate planes that are separately evaluated using a scoring strategy. Yang et al. [36] presented a comparative analysis of digital image correlation based stereo 3D shape measurements.
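To make the idea of a rotation-angle measurement based on geometric moments more concrete, the following Python sketch computes the textbook orientation of an image region from its second-order central moments, theta = 0.5 * atan2(2 * mu11, mu20 - mu02). This is the standard formulation and not the improved method of [33]; the function name and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def orientation_from_moments(image) -> float:
    """Principal-axis angle of an intensity image, in radians.

    The angle lies in [-pi/2, pi/2] and is measured from the x axis
    (columns) towards the y axis (rows, pointing down in image coordinates).
    """
    img = np.asarray(image, dtype=float)
    m00 = img.sum()
    if m00 == 0:
        raise ValueError("image has no mass; orientation is undefined")
    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    # Centroid from the first-order moments.
    cx = (xs * img).sum() / m00
    cy = (ys * img).sum() / m00
    # Second-order central moments.
    mu20 = ((xs - cx) ** 2 * img).sum()
    mu02 = ((ys - cy) ** 2 * img).sum()
    mu11 = ((xs - cx) * (ys - cy) * img).sum()
    return 0.5 * float(np.arctan2(2.0 * mu11, mu20 - mu02))


# A diagonal line running from the top-left to the bottom-right corner.
diagonal = np.eye(5)
print(np.degrees(orientation_from_moments(diagonal)))
```

The angle is ambiguous by 180 degrees, since second-order moments cannot distinguish the two ends of an elongated part; a sorting system would typically need an additional cue to resolve that ambiguity.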