Inter- and Intrarater Agreement of Spot Sign and Noncontrast CT Markers for Early Intracerebral Hemorrhage Expansion.

BACKGROUND
The aim of this study was to assess the inter- and intrarater reliability of noncontrast CT (NCCT) markers [Black Hole Sign (BH), Blend Sign (BS), Island Sign (IS), and Hypodensities (HD)] and Spot Sign (SS) on CTA in patients with spontaneous intracerebral hemorrhage (ICH).


METHODS
Patients with spontaneous ICH at three German tertiary stroke centers were retrospectively included. Each CT scan was rated for four NCCT markers and SS on CTA by two radiology residents. Raters were blind to all demographic and outcome data. Inter- and intrarater agreement was determined by Cohen's kappa (κ) coefficient and percentage of agreement.


RESULTS
Interrater agreement was excellent in 473 included patients, ranging from 96% to 99%. Interrater κ ranged from 0.85 (95% CI [0.78-0.91]) to 0.97 (95% CI [0.94-0.99]) for NCCT markers and 0.93 (95% CI [0.88-0.98]) for SS, all p-values < 0.001. Intrarrater agreement ranged from 96% to 100%, with κ ranging from 0.85 (95% CI [0.78-0.91]) to 1.00 (95% CI [0.10-0.85]) for NCCT markers and 0.96 (95% CI [0.92-1.00]) for SS, all p-values < 0.001.


CONCLUSIONS
NCCT imaging findings and SS on CTA have good-to-excellent inter- and intrarater reliabilities, with the highest agreement for BH and SS.


Introduction
Intracerebral hemorrhage (ICH) is the most severe form of stroke with a one month morbidity and mortality rates approaching 50% and death or severe disability exceeding 75% [1][2][3][4]. Factors reflecting the dynamic nature of hematoma evolution are particularly important for clinical outcome [5,6]. As so, early secondary hematoma expansion due to active extravasation and rebleeding has been reported in 38% of patients after initial imaging on computed tomography (CT) and reported as an independent prognostic factor for poor functional outcome [7,8]. It moreover forms an appealing therapeutic target, as early hematoma expansion is potentially modifiable in the acute onset time frame [9,10]. Therefore, identification of patients with ICH and a potential risk of hematoma expansion is crucial for triage of patients and prediction of the functional outcome [1]. To this end, several imaging markers on non-contrast computed tomography (NCCT) have been described to be predictive of hematoma expansion and poor functional outcome [6]. These imaging characteristics include the blend sign (BS) [6,11], the black hole sign (BHS) [6,12,13], the island sign (IS) [6,14], and the more general appearance of hypodensities (HD) [6,15]. In addition, a contrast extravasation on computed tomography angiogram (CTA) within an ICH is referred as the established spot sign [16,17]. Some of the imaging characteristics present each with overlapping definitions and criteria, such as BHS and HD [6]. The issue of error in radiology has been recognized for many years [18]. On the part of radiologists, the "lack of knowledge" and "under reading specific causes" contribute amongst others to the specific causes of error [18]. Nonetheless, there is limited data concerning the formal and independent assessment of the reliability of the above-mentioned hematoma markers [6]. According to a recently published study from Dowlatshahi et al., five NCCT markers were analyzed for inter-and intrarater reliability [19]. Interrater and intrarater reliabilities were good-to-excellent, albeit based on a relatively small patient cohort (n = 40) [19]. In addition, SS on CTA has not been analyzed by Dowlatshahi et al. [19], despite contributing individually the most for not only hematoma expansion but also outcome prediction in patients with ICH when compared to five commonly used NCCT markers [1]. Wada et al. were the first to report the SS on CTA with an excellent interrater reliability, however the number of patients was again relatively small (n = 39) and secondly intrarater reliability was not assessed [20]. We hypothesized that the commonly used NCCT markers and SS on CTA had a high level of inter-and interrater reliability. To test and evaluate this hypothesis, the interrater and intrarater reliability of four commonly reported NCCT markers and the established SS on CTA of hematoma expansion was assessed in a large multicenter cohort.

Study Population
We retrospectively studied the databases of three German tertiary stroke centers for patients with spontaneous ICH aged >18 years between January 2016 and December 2018. (University Hospital of Muenster, University Hospital Hamburg-Eppendorf and Charité University Hospital Berlin).
As inclusion criteria, we defined (1) primary spontaneous ICH confirmed on NCCT confirmed by a senior physician or fellow radiologist with extensive experience in stroke imaging and (2) NCCT and CTA performed on admission within 6 h after symptom onset. Primary spontaneous ICH were included despite severity and size, except for being accompanied by sub-or epidural and subarachnoid hemorrhage. Both databases included patients with anticoagulant treatment, but excluded patients with sub-or epidural hematoma and subarachnoid hemorrhage, and with secondary causes of ICH such as head trauma, brain tumor, vascular malformation, primary intraventricular hemorrhage, or secondary ICH from hemorrhagic transformation of ischemic infarction. Additionally, we obtained vascular risk factors (hypertension and diabetes mellitus) and surgery procedures (craniectomy) from patients' clinical records. This multicenter retrospective study was approved by the ethics committee (Ethik-Kommission der Ärztekammer Hamburg [WF-054/19], Ethik-Komission der Charité Berlin [EA4/011/20] and Ethik-Komission der Uniklinik Münster [2017-233-f-S]) and written informed consent was waived by the institutional review boards. All study protocols and procedures were conducted in accordance with the Declaration of Helsinki. Due to the retrospective nature of the study, patient consent was not needed.

Image Acquisitions
The CT scans were performed using standard clinical parameters with axial 5 mm section thickness. In detail the employed imaging protocols were the following: CT scans at the University Hospital Hamburg-Eppendorf were performed on a 256 slice scanner (Philips iCT 256. Amsterdam, Netherlands) with the following imaging parameters: NCCT with 120 kV, 280-320 mA, 5.0 mm slice reconstruction; CTA: 100-120 kV, 260-300 mA, 1.0 mm slice reconstruction, 5 mm MIP reconstruction with 1 mm increment, 0.6-mm collimation, 0.8 pitch, H20f soft kernel, 80 mL highly iodinated contrast medium, and 50 mL NaCl flush at 4 mL/second; scan started 6 s after bolus tracking at the level of the ascending aorta.
CT scans at the Muenster Universtiy Hospital were performed on a 2 × 128 slice scanner (Siemens SOMATOM Definition Flash. Erlangen, Germany) with the following imaging parameters: NCCT with 120 kV, 280 mA, 5.0 mm slice reconstruction; CTA: 100-120 kV, between 260-300 mA, 1.0 mm slice reconstruction, 5 mm MIP reconstruction with 1 mm increment, 0.5-mm collimation, 0.8 pitch, H20f soft kernel, 60 mL highly iodinated contrast medium, and 30 mL NaCl flush at 4 mL/second; scan started 6 s after bolus tracking at the level of the ascending aorta.

Image Analysis
The location of the hematoma and presence of intraventricular hemorrhage was assessed and documented. The hemorrhage locations were classified as deep (basal ganglia and thalamic), lobar, or within the brain stem and pons, or cerebellum.
Two raters experienced in stroke imaging independently reviewed images in a random order, blind to all demographic and outcome data and were not involved in the clinical care of assessment of the enrolled patients. They had corresponding experience in neuroradiology and stroke imaging with three years (J.N.) and four years (S.E.) of experience. Images were randomized and presented again to both raters one month later for a second reading, so as to minimize recall of the patients' follow-up scans.
Raters analyzed axial NCCT images and subsequently the corresponding CTA to determine the presence of the following markers: BS, BHS, IS, HD, and SS were rated in due consideration of the proposed consensus standard of Morotti et al. [6]: In brief, BS was defined as a hypoattenuating area adjacent to a hyperattenuating area of the hematoma, with a clear separation between them at a density difference of at least 18 Hounsfield Units (HU) [11,21]. The BHS consists of a relatively hypodense area which is encapsulated within a hyperdense area and which is not connected with the adjacent brain tissue [12,13]. The relatively hypodense area has an identifiable border and a difference of at least 28 HU between the two density regions [12,21]. The IS consists of at least 3 scattered small hematomas all separate from the main hematoma or at least 4 small hematomas some or all of which may connect with the main hematoma [14]. The imaging sign HD was defined as any hypodense region strictly encapsulated within the hemorrhage with any shape, size, and density which does not require a density measurement [6,15,22]. The spot sign on CTA was well-acknowledged as the foci of enhancement within the intracranial hematoma, detected on CTA source images [23,24].

Statistical Analysis
Level of agreement was calculated as the number of agreements divided by the total number of readings. Interrater and intrarater agreement was calculated and expressed as percentage of agreement and Cohen's κ statistic with stratified kappa with 95% upper and lower confidence intervals (CI) [25]. The intrarater estimates for percentage agreement were calculated as the percentage of pairs of readings (first reading or second reading) that agreed over all pairs for all raters. The interrater estimates were calculated as the percentage of pairs of readings from pairs of the two raters that were in agreement. For intrarater agreement, kappa calculations were stratified by rater, whereas for interrater agreement, Kappa calculations were stratified by each combination of reading (reading 1 or reading 2) from pairs of the two raters. Analyses were performed using the statistical software package SPSS version 25 ® (IBM Corporation, Armonk NY) and and R Statistics ® Version 3.5.1 (R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2018).

Results
The 473 patients had a median age of 69 years (IQR: 67.9-70.6) with 252 female patients (53.3%). A total of 326 (69.1%) patients suffered from arterial hypertension and 65 (13.8%) from diabetes mellitus. Bleeding locations were most frequently located in the basal ganglia and thalamus (deep) in 206 (43.6%) and lobar in 210 (44.4%) patients. Infratentorial bleeding locations consisted of 40 (8.5%) cerebellar and 17 (3.6%) brain stem and pons ICH. A total of 248 patients had intraventricular extension (52.5%). A total of 110 (23.3%) supratentorial and 5 (1.1%) infratentorial craniectomy procedures were performed. A clinical outcome above three on a modified Rankin Scale (mRS) was observed in 326 (68.9%) cases) ( Table 1). Of 473 patients with spontaneous ICH, 179 (30%) presented with HD, 125 (26%) with BH, 114 (25%) with IS, 76 (15%) with BS, and 75 (12%) with SS ( Figure 1). Distribution of poor clinical outcome (mRS > 3) [26] was higher in patients with the presence of NCCT markers and SS on CTA ( Table 2). The distribution of BS, HD, and SS increases rapidly with a coappearance of signs above three. Whereas in coappearance of two to three signs, distribution is predominated by BHS and IS (Table 3). Overall, interrater agreement was excellent for BH, BS, IS, HD and SS in the first (98.7%, 96%, 96.4%, 96.2%, and 98.5%, respectively) and also the second rating (98.1%, 97%, 97%, 97.3%, and 98.3%, respectively). Interrater stratified kappa was the best for BH and SS, and the lowest for IS in the first rating      95% confidence interval (CI). *stratified Cohen's kappa across two raters in two ratings. Legend: Intrarater agreement of four noncontrast computed tomography imaging signs and spot sign on computed tomography angiography specified with percentage agreement and stratified kappa with 95% confidence interval (CI). *stratified Cohen's kappa across one rater in two ratings.   Legend: Interrater agreement of four noncontrast computed tomography imaging signs and spot sign on computed tomography angiography specified with percentage agreement and stratified kappa with 95% confidence interval (CI). * stratified Cohen's kappa across two raters in two ratings. Legend: Intrarater agreement of four noncontrast computed tomography imaging signs and spot sign on computed tomography angiography specified with percentage agreement and stratified kappa with 95% confidence interval (CI). * stratified Cohen's kappa across one rater in two ratings.

Discussion
Hematoma expansion is a therapeutic target of clinical interventions and a potentially modifiable predictor of clinical outcome [8,27,28]. There is growing evidence that different imaging markers in NCCT and SS in CTA predict early hematoma expansion and therefore offer great additional value [1,21]. The main finding of our study is that four commonly used NCCT imaging marker and SS on CTA have excellent interrater and intrarater reliabilities. Prevalence of all signs in our study was comparable to other studies [1].
To our knowledge, this is the first study incorporating analysis of interrater and intrarater agreement of markers on NCCT and SS on CTA in a large multicenter cohort without controlling for factors such as image acquisition parameters or scanner type. A recent, large, independent patient data meta-analysis suggested that prediction of ICH expansion was improved by the addition of information on SS from CTA, alongside time from symptom onset, baseline hematoma volume, and antithrombotic medication [27]. According to the recent study of Sporns et al., SS on CTA had a higher sensitivity for clinical outcome prediction than the NCCT imaging markers (BS, BH, IS, HD) alone, and hence recommendable to be acquired collectively [1]. A recently published study by Dowlatshahi et al. analyzed the interrater and intrarater reliability of NCCT imaging markers for hematoma expansion, but SS on CTA was not being assessed [19] which this study has taken this into consideration. Implementing rater assessments of two radiology residents offers complementary yet distinctive information to the results by Dowlatshahi et al., considering that not only neurologists but synergistically also radiology residents are generally the first to analyze the CT scans.
The large patient population suggests a high reliability and importance of these imaging parameters for outcome prediction in patients with ICH. Further, there is considerable interest in hematoma expansion prediction scores that incorporate NCCT markers [15]. Taking into account that SS has a higher sensitivity for outcome prediction than BS and hypodensity alone, both NCCT and CTA should be acquired if possible [1]. In accordance with the evaluation of reliabilities in our study it would be conceivable to improve such prediction scores by adding SS on CTA. Scores with NCCT markers as the BAT Score still have their 'raison d'être in a setting where CTA is not readily available or with strong contraindications for contrast application (distinct allergy, far progressed renal dysfunction), sole acquisition of noncontrast CT and evaluation of BS and BHS is a valuable option for detecting hematoma growth associated with poor outcome-as the NCCT imaging markers of BS and BHS are strongly correlated with the established SS. In line with this, our results indicate a strong increase in the distribution of markers coappearance above 3 imaging signs of BS, SS, and HD. Whereas coappearance in one to three signs is predominated by BHS and island sign.
In conclusion, this study had several limitations. Firstly, the level of experience was not being evaluated. In this study, the reading time was not taken under consideration as image interpretation in an emergency setting is much more complicated, rushed, and confusing compared to elective scans. Further prospective studies are required to validate the use of NCCT and CTA markers in clinical practice and should take into consideration both the level of experience and reading time.
In conclusion, the high interrater and intrarater reliability suggests that all NCCT signs and SS are easy to use, thereby supporting their use in emerging scores and development and validation of machine learning tools to predict hematoma expansion, or either within randomized clinical or therapeutic trials targeting hematoma expansion [6,15,16,29].