Validation, Reliability, and Responsiveness Outcomes of Kinematic Assessment with an RGB-D Camera to Analyze Movement in Subacute and Chronic Low Back Pain

Background: The RGB-D camera is an alternative to assess kinematics in order to obtain objective measurements of functional limitations. The aim of this study is to analyze the validity, reliability, and responsiveness of the motion capture depth camera in sub-acute and chronic low back pain patients. Methods: Thirty subjects (18–65 years) with non-specific lumbar pain were screened 6 weeks following an episode. RGB-D camera measurements were compared with an inertial measurement unit. Functional tests included climbing stairs, bending, reaching, sock, lie-to-sit, sit-to-stand, and timed up-and-go. Subjects performed the maximum number of repetitions during 30 s. Validity was analyzed using Spearman’s correlation, reliability of repetitions was calculated by the intraclass correlation coefficient and the standard error of measurement, and receiver operating characteristic curves were calculated to assess the responsiveness. Results: The kinematic analysis obtained variable results according to the test. The time variable had good values in the validity and reliability of all tests (r = 0.93–1.00, intraclass correlation coefficient (ICC) = 0.62–0.93). Regarding kinematics, the best results were obtained in the bending test, sock test, and sit-to-stand test (r = 0.53–0.80, ICC = 0.64–0.83, area under the curve (AUC) = 0.55–0.84). Conclusion: Functional tasks, such as bending, sit-to-stand, reaching, and putting on a sock, assessed with the RGB-D camera, revealed acceptable validity, reliability, and responsiveness in the assessment of patients with low back pain (LBP). Trial registration: ClinicalTrials.gov NCT03293095 “Functional Task Kinematic in Musculoskeletal Pathology” 26 September 2017

radiating pain in the gluteal region and the upper leg, and no clear increase in activity level and reduction in participation restrictions after 3 weeks [6,23].
Participants with low back pain as a result of a specific spinal disease, infection, presence of a tumor, osteoporosis, fracture, inflammatory disorder, or cauda equina syndrome were excluded [6]. In addition, they were excluded if they had undergone a hip arthroplasty in the last 6 months, participated in a study with an experimental treatment during the recruitment or the study, had a severe cardiovascular disease (Category D) according to the Senior Europeans (SENIEUR) protocol [24], had cognitive alterations that prevented them from understanding the instructions of the research, or were pregnant.
The measurement was performed twice: an initial measurement and another 1 month later, in order to study responsiveness. For responsiveness, the changes across the time period between the two measurements were tested (4 weeks natural course). All patients followed a conservative low back pain exercise program.

Setting
The study was carried out in the UZ Brussel Hospital in Brussels, Belgium. Patients were recruited from the physical medicine department of the hospital from January 2019 to June 2019.

Ethical Considerations
The Medical Ethics Committee of the UZ Brussel Hospital approved this study (Nr. 2018/366). The guidelines for good clinical practice (GCP), the principles of the Declaration of Helsinki, and the Belgian Law of 7 May 2004 related to experiments on humans were followed. All subjects gave their informed consent for inclusion before they participated in the study. The subjects received an informed consent form with information about the study, which was then signed by the patient. The patient was free to stop and leave the study at any time.

Motion Capture RGB-D Camera System and Inertial Measurement Unit
The motion capture RGB-D camera Xtion Pro (ASUS, Taipei, Taiwan) was used in the study. The distance between the camera and the participant was set at approximately 2.5 m. The camera was placed at an angle of 40-45° with respect to the direction of movement, and at a height of 90 cm from the floor (Figure 1).
Sensors 2020, 20, 689
The IMU MP67B (InvenSense, San Jose, USA) from an iPhone6s (Apple Inc., Cupertino, CA, USA) collected the information about mobility angle and acceleration along three axes with the gyroscope, accelerometer, and magnetometer. The IMU showed high accuracy within a medically acceptable limit (±5°), and the angular velocity noise level was 0.09°/s [25]. This methodology, based on an IMU from a smartphone, was validated previously using the timed up-and-go (TUG) test [26]. The smartphone was placed at the level of the thorax over the sternum inside a specific belt around the thorax (Figure 2). The SensorLog® 2.2v app from the Apple App Store processed the sensor data using the Core Location and Core Motion frameworks from the iPhone. The recording rate was set at 100 Hz. A 3D coordinate reference system for both instruments was used (Figure 2).

Figure 2. Joint information collected by the camera and 3D reference system of the camera and the inertial measurement unit.

Functional Tests
Functional tests were based on physical assessments used in LBP patients and inspired by frequently impaired daily activities as described by the patients [27][28][29][30] (Figure 3):
(a) Modified stairs climbing test (stairs test): The subject had to climb a two-step staircase without assistance by placing one foot on each step (the height and depth of each step were 15 × 30 cm) [27].
(b) Bending test: A pen was placed on the floor in front of the subject. The subject was asked to bend forward from the hips and pick up the pen without assistance [27].
(c) Reaching test: The subject faced a shelf placed at the patient's head height +15%. The patient was instructed to place a pen on the shelf without help or assistance [27].
(d) Sock test: The subject had to put a sock on the dominant foot while sitting, without help or assistance. The chair had a 44 cm sitting height [27].
(e) Lie-to-sit test: The patient had to perform the lying-to-sit transition [28]. Starting from a supine position, the patient was asked to turn onto one side and then sit up using an arm, while the legs were lowered at the side of the table.
(f) Sit-to-stand test (STS test): A chair with a 44 cm sitting height was used. The patient was instructed to stand up and sit down from the chair without using hands or assistance [29].
(g) TUG test: The patient started seated on a chair (44 cm sitting height) and was asked to get up, walk to a cone at a 3 m distance from the chair, turn around it, return to the chair, and sit down again. Patients walked as fast as possible without running [30].
All the tests were standardized in order to improve accuracy in the analysis.

Questionnaires
Validated versions of the questionnaires in French and Dutch were used in order to describe the sample. The Roland-Morris Disability Questionnaire (RMQ) was used for low back disabilities [31,32], and the EuroQoL-5D-VAS [33,34] and SF-12 questionnaires [35] were used for quality of life and health.
A global perceived effect (GPE) scale was used to collect the overall measure of change during the month between the measurements [36]. The scale had 7 items that ranged from 1 ("very much improved") through 4 ("no change") to 7 ("very much worse") [36].

Measurement Procedure
The approximate time was 60 min per test. The measurement was divided into three parts: filling in the questionnaires, preparing the participant, and performing the functional tasks.
The smartphone was placed on the patient using a belt at the level of the thorax, and the motion capture area of the depth camera measurement was shown to the patient on the computer screen. Participants watched a video of each test before each measurement, and the rater gave them a standardized instruction. In this way, they could familiarize themselves with each test before the data collection. After adopting the starting position that allowed body recognition by the camera, participants performed the maximum number of repetitions during 30 s for each functional task, and a rest of 120 s was allowed following each test in order to prevent fatigue. Three repetitions of the TUG were recorded.
The participant was in a static position at the beginning and end of each test for 10 s in order to improve the synchronization of both data sets in the data processing.

Variables
Displacement (degrees), time (seconds), velocity (m/s), and acceleration (m/s²) were obtained from the three assessment tools. The flexion-extension trunk displacement was calculated directly from the data and represented by the pitch angle, and the anteroposterior acceleration by the acceleration in Z, as shown in Figure 2. Velocity and acceleration were calculated indirectly based on the following formulas: "velocity = displacement/time" and "acceleration = velocity/time". The outcomes were extracted from the interval of movement between control points. Functional tests were marked with two control points: the starting position (A) and the ending position (B) of each test. Therefore, the A→B interval was measured. The control points in the TUG test were: the starting point (A); the stand-up position (B); reaching the turning point (C); the point immediately before the participant starts to sit down (D); and the return to the starting point (E). Consequently, the A-B, B-C, C-D, and D-E intervals were measured. Following this procedure, the kinematic pattern of each test and the selected intervals were obtained (Figure 3).
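As an illustration of the two formulas quoted above, the following minimal Python sketch derives the interval outcomes from a list of control points. The function name `interval_kinematics`, the signed displacement convention, and the sample values are assumptions made for this example; they are not taken from the study data or software.

```python
def interval_kinematics(t_start, t_end, value_start, value_end):
    """Return (time, displacement, velocity, acceleration) for one interval.

    Applies the formulas quoted in the text:
    velocity = displacement / time and acceleration = velocity / time.
    Units follow the inputs (e.g. degrees and seconds).
    """
    duration = t_end - t_start
    if duration <= 0:
        raise ValueError("the end control point must come after the start")
    displacement = value_end - value_start  # signed change (assumption)
    velocity = displacement / duration
    acceleration = velocity / duration
    return duration, displacement, velocity, acceleration


# Hypothetical TUG control points: (label, time in s, pitch angle in degrees)
control_points = [
    ("A", 0.0, 2.0),
    ("B", 1.5, 32.0),
    ("C", 4.0, 5.0),
    ("D", 7.0, 6.0),
    ("E", 8.5, 3.0),
]

# One entry per consecutive pair: A-B, B-C, C-D, D-E
intervals = {
    f"{a[0]}-{b[0]}": interval_kinematics(a[1], b[1], a[2], b[2])
    for a, b in zip(control_points, control_points[1:])
}
```

For a test with only two control points, a single call with the A and B values gives the A→B outcomes.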

Data Recording and Processing
Anthropometric characteristics (age, weight, height, and body mass index) were recorded for each participant. Kinematic data were matched with the timestamp provided by each tool. A timestamp is a sequence of characters giving the date and time of day. A researcher synchronized both devices (camera and IMU) using the timestamp data from both devices and the 10 s static periods before and after the test.
Software libraries OpenNI2 and NiTE2 were used to extract the information from the RGB-D camera and create a virtual skeleton representation with the location of the skeletal joints (Figure 2). The representation was captured when the patient was in front of the camera and adopted the starting position with the upper limbs raised sideward. The software, MRPT, was developed for a previous study [15] and has been released publicly as part of the open-source software library [37].
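The study describes synchronization as a step performed by a researcher using the timestamps and the static periods. Purely as an illustration of timestamp-based alignment, the sketch below resamples one stream onto the timestamps of the other by linear interpolation. The function name `resample_to`, the clamping at the ends, and the sample values are assumptions for this example and do not describe the authors' procedure.

```python
from bisect import bisect_left


def resample_to(ref_times, src_times, src_values):
    """Linearly interpolate (src_times, src_values) at each reference time.

    src_times must be strictly increasing. Reference times outside the
    source range are clamped to the first or last source value.
    """
    resampled = []
    for t in ref_times:
        if t <= src_times[0]:
            resampled.append(src_values[0])
        elif t >= src_times[-1]:
            resampled.append(src_values[-1])
        else:
            i = bisect_left(src_times, t)  # src_times[i-1] < t <= src_times[i]
            t0, t1 = src_times[i - 1], src_times[i]
            v0, v1 = src_values[i - 1], src_values[i]
            resampled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return resampled


# Hypothetical data: IMU pitch sampled every 10 ms, camera frames at other times
imu_times = [0.00, 0.01, 0.02, 0.03]
imu_pitch = [0.0, 1.0, 2.0, 3.0]
camera_times = [0.005, 0.025]
imu_on_camera_clock = resample_to(camera_times, imu_times, imu_pitch)
```

Once both signals share the same time base, sample-by-sample comparisons between the devices become possible.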
The parameterization to calculate the patient's movement delivered inclination angles and angular speed between the skeletal joints. The 3D positions that corresponded to the "neck" and "torso" joint labels (Figure 2) were used to calculate the angle between them as the trunk flexion, because it coincided with the movement of the center of mass. This coincided with measuring body motion at the T7 level. The inertial measurement unit was placed over the chest at the same level. The smartphone's orientation and the dimension of space were measured as follows: flexion-extension (α, pitch angle): the rotation axis was Y, with positive data indicating flexion and negative values indicating extension [15]. Rotation (β, yaw angle): the rotation axis was X, where positive data indicated right rotation, while negative values indicated left rotation [15]. Finally, inclination (γ, roll angle): the inclination axis was Z, where positive data indicated right inclination, while negative values indicated left inclination [15].
In the case of the depth camera, let P_N = (X_N, Y_N, Z_N) and P_T = (X_T, Y_T, Z_T) be the 3D spatial coordinates of the neck and torso joints as measured by the range camera, respectively [15]. The equivalent flexion-extension angle (α) can then be computed from these coordinates as described in [15].
Mean and standard deviation (SD) were calculated for time, displacement, velocity, and acceleration.
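The exact expression for α is given in [15] and is not reproduced here. As a hedged illustration of how an inclination angle can be obtained from the neck and torso coordinates, the sketch below computes the angle between the torso-to-neck vector and an assumed vertical axis. The function name `trunk_inclination_deg`, the choice of Y as the vertical axis, and the sample coordinates are assumptions for this example, not the formula used in the study.

```python
import math


def trunk_inclination_deg(neck, torso, up=(0.0, 1.0, 0.0)):
    """Angle in degrees between the torso->neck vector and the 'up' axis.

    neck and torso are (X, Y, Z) tuples. 'up' must be a unit vector; using
    the Y axis as vertical is an assumption about the camera frame.
    """
    trunk = tuple(n - t for n, t in zip(neck, torso))
    length = math.sqrt(sum(c * c for c in trunk))
    if length == 0.0:
        raise ValueError("neck and torso positions coincide")
    cos_angle = sum(a * b for a, b in zip(trunk, up)) / length
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


# Hypothetical joint positions in metres
upright = trunk_inclination_deg(neck=(0.0, 1.4, 2.5), torso=(0.0, 1.0, 2.5))
flexed = trunk_inclination_deg(neck=(0.0, 1.3, 2.2), torso=(0.0, 1.0, 2.5))
```

With these sample positions, the upright posture gives an angle of zero and the forward-leaning posture gives 45 degrees.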

Statistical Analysis
The internal validity, reliability, and external responsiveness were analyzed using the previously outlined variables. The third repetition was chosen for the validity and responsiveness analysis, and the first three repetitions were chosen for the reliability analysis. Three repetitions were chosen in order to avoid fatigue. If the participant was not able to perform three repetitions due to the severity of the condition, the third repetition was calculated as the average of the first two. In addition, a descriptive analysis was performed on each variable from the kinematic devices and questionnaires, and the means and SDs were included.
Internal validity was calculated as the correlation between the measurements of the RGB-D camera and the IMU, using either a parametric test (Pearson correlation) or a non-parametric test (Spearman correlation, r), depending on the data distribution as assessed beforehand with the Kolmogorov-Smirnov test [38].
The correlation values were classified into three categories: poor (r ≤ 0.49), moderate (r = 0.50-0.74), and strong (r ≥ 0.75) [38]. A Bland-Altman plot was created for those tests with moderate or strong correlation in the kinematic variables (displacement, velocity, and acceleration) to show the agreement between the measurement tools. The reliability across repetitions was measured by the intraclass correlation coefficient (ICC; two-way random-effects model 2,1) with its 95% CI, and by the standard error of measurement (SEM). The reliability results were classified into these categories: poor (ICC ≤ 0.49), moderate (ICC = 0.50-0.74), good (ICC = 0.75-0.89), and excellent (ICC ≥ 0.90) [39].
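To make these reliability and agreement statistics concrete, the sketch below computes ICC(2,1) from the two-way ANOVA mean squares (the Shrout and Fleiss formulation), an SEM as SD × sqrt(1 − ICC), and Bland-Altman limits as bias ± 1.96 SD of the differences. The paper does not state which SD entered the SEM, so that choice, the function names, and the sample scores are assumptions for this example.

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores is a list of rows, one per subject, each with k repetitions.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC); which SD to use is an assumption here."""
    return sd * math.sqrt(1.0 - icc)


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical scores: 6 subjects x 4 repetitions
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(scores)
sem = standard_error_of_measurement(stdev(v for row in scores for v in row), icc)
bias, lower, upper = bland_altman_limits([10.0, 12.0, 14.0], [9.0, 12.0, 12.0])
```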
The area under the curve (AUC) of the receiver operating characteristic (ROC) curves was the chosen method to quantify the external responsiveness [40,41]. The external criterion used to classify the patients for the external responsiveness analysis was the global perceived effect scale. Two categories were created in order to obtain a dichotomous variable. The categories from 1 to 3 ("very much improved" to "a little improved") were classified as "improved". The categories from 4 to 7 ("no change" to "very much worse") were classified as "nonimproved" [36]. The levels of external responsiveness were classified according to the AUC as low (0.50-0.70), moderate-to-high precision (0.70-0.90), and high precision (>0.90) [40].
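The dichotomization and the AUC can be sketched as follows. The AUC is computed here through its rank interpretation: the probability that a randomly chosen improved patient has a larger change score than a randomly chosen nonimproved patient, with ties counted as one half. The function names, the assumption that larger change scores indicate improvement, the handling of the category boundaries, and the sample values are all choices made for this example.

```python
def is_improved(gpe_score):
    """GPE 1-3 -> improved (True); GPE 4-7 -> nonimproved (False)."""
    if not 1 <= gpe_score <= 7:
        raise ValueError("GPE score must be between 1 and 7")
    return gpe_score <= 3


def auc(change_scores, improved_flags):
    """AUC as the share of (improved, nonimproved) pairs ranked correctly."""
    positives = [c for c, flag in zip(change_scores, improved_flags) if flag]
    negatives = [c for c, flag in zip(change_scores, improved_flags) if not flag]
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one patient")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def classify_auc(value):
    """Categories from the text; boundary handling is an assumption."""
    if value > 0.90:
        return "high"
    if value >= 0.70:
        return "moderate-to-high"
    return "low"


# Hypothetical patients: GPE score and change in a kinematic variable
gpe_scores = [2, 3, 4, 5]
change_scores = [5.0, 1.0, 2.0, 0.5]
flags = [is_improved(g) for g in gpe_scores]
responsiveness = auc(change_scores, flags)
level = classify_auc(responsiveness)
```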
Data analysis was conducted by an external, blinded, and expert researcher. All analyses were done using SPSS version 22 software (SPSS Inc., Chicago, IL, USA).

Results
Thirty subjects participated in the initial measurement. A total of 23% (n = 7) of the patients measured in the first phase did not complete the study because they did not attend the second measurement 1 month later. The mean of the anthropometric characteristics and the score of the questionnaires were calculated (Table 1). The mean and standard deviations of the kinematic variables and the repetitions were determined in each test (Table 2).
Regarding the lie-to-sit (LTS) test, there were patients who did not perform more than two repetitions due to the severity level of complaints. Therefore, this test was completed only by 27 subjects in the first measurement and 19 subjects in the second.
Of the participants who completed the study, the results of the GPE scale were: 1-very much improved 0%, 2-much improved 13%, 3-a little improved 34.8%, 4-no change 39.1%, 5-a little deterioration 4.3%, 6-much worse 4.3%, and 7-very much worse 4.3%. Therefore, 47.8% of the sample was classified in the category of "improved" and 52.2% was classified in the category of "nonimproved".
The results of internal validity, reliability, SEM, and responsiveness are shown in Table 3. The time variable had excellent values in the validity of all tests (r = 0.93-1.00) and the reliability was between moderate and excellent (ICC = 0.62-0.93). Regarding displacement, velocity, and acceleration, the tests with moderate to strong internal validity and reliability results were the bending test (r = 0.53-0.99, ICC = 0.75-0.93), STS test (r = 0.59-0.92, ICC = 0.64-0.92), and sock test (r = 0.53-0.99, ICC = 0.64-0.83). The best responsiveness results were the STS test (AUC = 0.64-0.85) and the stairs test (AUC = 0.60-0.84). The Bland-Altman graphs for agreement illustrate this for each of these tests (Figures 4-6). The data used in this study are available in Supplementary Materials.
Table 3. Internal validity, reliability, and responsiveness outcomes from the variables extracted from the RGB-D camera. Inertial measurement unit (IMU); RGB-D camera (CAM); lie-to-sit (LTS); sit-to-stand (STS); timed up-and-go (TUG); area under the curve (AUC); intraclass correlation coefficient (ICC).

Discussion
The aim of this study was to analyze the internal validity, reliability, and external responsiveness of a human movement capture system using an RGB-D camera or depth camera. Following the recommendation of Clark et al. (2019), the information extracted from the chosen angle was carefully selected and validated for these functional tests and this type of patient, so it is expected that future studies will check the clinical contribution of this assessment [17]. Broadly, the time variable obtained the results closest to 1, while the other variables, such as displacement, velocity, and acceleration, showed an internal validity between poor and moderate (r = −0.12-0.80), a reliability between poor and excellent (ICC = −0.01-0.93), and an external responsiveness between low and moderate (AUC = 0.55-0.84).

Six Functional Tests
The results were different and irregular depending on the test, as they were in previous studies. [15,42]. On the other hand, another very similar study showed poor results in most of the correlations (r < 0.4) [14]. Three tests in the present study had better results, reaching minimal quality norms according to the classification given in the statistical analysis: the STS test, bending test, and sock test. These three tests have in common moderate results in terms of internal validity (r = 0.53-0.80) and moderate to excellent reliability (ICC = 0.64-0.93). In addition, they are tests where the trunk flexion was greater than 15°, except for the LTS test. The LTS test had poor results (r = 0.09-0.24, ICC = 0.16-0.48), which may be due to the complexity of the task and the overlapping joint points [17].
The STS test obtained good correlation data (r = 0.59-0.73), reliability (ICC = 0.64-0.75), and responsiveness (AUC = 0.64-0.77). The STS test is one of the most used functional tests for evaluating lower-limb strength [43], and it has already been used to study the reliability of the depth camera, but never in LBP patients [44]. Galna et al. (2014) and Matthews et al. (2019) compared an RGB-D camera with an active motion capture system with markers in patients with Parkinson's disease (r = 0.99, ICC = 0.98) [20] and healthy subjects (ICC = 0.97-0.98, mean absolute error = 1.7-2.8) [44], respectively. They obtained good results, but they measured the linear displacement of the head [20] and the center of mass [44], unlike this study, which took the angle formed by the trunk flexion. Regarding the agreement, the data were more compact around the average compared to the other tests and had lower limits (SD = 12.74, 4.21, 0.39) than the bending test. Mentiplay et al. (2018) obtained an agreement with better limits than these tests (SD = 1.96) by measuring lumbar flexion during a single leg squat [42]. The difference in the agreement may be due to the different criterion measures used in both studies. Given these results and those of previous studies, the STS test captured by a depth camera can be a valid clinical tool to analyze the functionality of the patient, in addition to being an easy test to analyze [45].
The  [15]. These results on displacement are consistent with what has been previously commented. The tests in this study that had an average flexion of less than 16° had worse reliability and validity data.
Regarding the sock test, no other studies have been found that analyze the kinematics of this test. This test obtained worse results than the previously mentioned tests, and there was a greater dispersion between points in the Bland-Altman plot. In addition, the difference between the means in this test was greater than in the other tests (IMU = 16.52°, RGB-D camera = 30.95°). Therefore, the data from the sock test cannot be accepted with the same confidence as those from the bending test or the STS test. New studies on the kinematics of this test should be more precise in order to validate it and to establish whether it can be useful for assessing and classifying patients [46].
Finally, in terms of responsiveness, the best results in velocity and acceleration were obtained in the bending test, STS test, and stairs test (AUC = 0.71-0.84). The data on displacement were poor in all the tests (AUC = 0.55-0.77) except the stairs test, but its low reliability and validity results make it difficult to recommend as a test for assessment with a depth camera. The good data in the STS and bending tests in velocity and acceleration (AUC = 0.72-0.84), together with the other reliability and validity analyses, showed the relevance of the kinematic variables for the assessment of the individual, as also shown by Galán-Mercant et al. (2016) [47]. Objective tools are needed that offer information that the human observer is not able to analyze visually.

TUG Test
The results in the TUG test were quite poor compared to the results of the other functional tests (Table 3). The acceleration variable in the first two phases showed acceptable validity (r = 0.60-0.66), but the reliability varied between low and moderate (ICC = 0.08-0.57). Other studies have used RGB-D cameras to examine the TUG test and found that the length of the first step may provide important clinical information [48]. We did not examine this outcome measure in our cohort. Moreno et al. (2017) obtained moderate but better results than this study in trunk displacement and velocity in the first and last intervals (r = 0.64-0.67, r = 0.58-0.79) and poor results in the intermediate intervals (r < 0.10, r < 0.01) [15]. One possible explanation for this difference is technical: the subject may have come too close to the camera, causing an occasional loss of recorded signal due to the working range area [49]. Another explanation is the overlapping of joint points during the turn of the TUG test. Overlapping joints are a known problem of the depth camera and must be taken into account [17].
Regarding  [50], although this study only focused on assessing the first and last phase of the TUG test, where trunk flexion is predominant in getting up and sitting in the chair. It is important to reflect on why the results obtained in this study in the TUG test were not consistent with those observed in previous studies, because the TUG test did not show good results under the present circumstances. Future studies should perhaps take into account the complexity of the test and the positioning of the cameras, because the validity of a depth camera needs to be investigated from a different perspective.

Limitations
The limitations of the depth camera in this study should not be overlooked. An important limitation was that there were patients who passed the inclusion criteria but could not perform the three repetitions that were requested in the LTS test. This occurred mainly because the level of severity of the complaints was so high that some patients could not perform several repetitions.
The poor-to-moderate correlation in most of the kinematic variables could be due to the criterion measure chosen for this study. The IMU and the RGB-D camera could both measure the trunk flexion, although they did not share the same reference system, since the IMU was attached to the trunk and the camera took a global representation of the body. Despite this, a previous study already correlated them satisfactorily [15]. The IMU is also a good tool to assess kinematics with great applicability [26,51], and the data from previous studies that used different gold standards show results similar to those discussed above.
The systematic review of Papi et al. (2018) recommended performing the kinematic analysis of the whole body in patients with LBP, not limiting the analysis to the lumbar region [10]. On the other hand, Clark et al. (2019) state that the angles obtained from trunk flexion can have high precision through precalibration [17], and an advantage of the depth camera is that it does not need prior calibration like other motion capture systems [44]. This study of the RGB-D camera and LBP collected the information from the lumbar region and trunk movement; this region, being the location of the center of mass, is a relevant motion descriptor as a kinematic point [52], and this reference has been chosen several times in the literature [26,53].

Conclusions
The RGB-D camera used to assess functional tests can be a valid tool depending on the type of test to be analyzed. Kinematics analyzed during the STS test, bending test, and sock test reached validity, reliability, and responsiveness measures from moderate to good, and this procedure could have potential in the assessment of movement or motor control in patients with LBP. Therefore, large movements are detected with acceptable reliability and validity, although smaller or more precise movements must be further analyzed in future studies to improve the registration and analysis protocol.
Supplementary Materials: The following are available online at http://www.mdpi.com/1424-8220/20/3/689/s1. Authors uploaded as "Supplementary Files" the database with all the extracted data used in this study in order to maintain the integrity, transparency, and reproducibility of research records.