The aim of the present study was, first, to assess the intra- and inter-observer reliability of the CSE and, if possible, to assess the correlation between the evaluation findings and the final clinical decision. This should lead to a better understanding of the added value of the CSE in preoperative counselling and to better standardisation of the procedure.

Over the last decade, male sling surgery has taken an appreciable place in the surgical treatment of male post-prostatectomy stress urinary incontinence (UI). The AdVance™ sling (Boston Scientific Corp., Marlborough, MA, USA) is, in this setting, one of the most frequently used slings worldwide, with demonstrably good functional outcomes and low complication rates [1]. The choice of male sling depends on multiple factors, including UI severity and patient characteristics, namely age, activity, radiotherapy history, and previous UI surgery or stricture treatment [2, 3]. Adequate patient selection is key to achieving optimal results after AdVance sling implantation. To date, no decision-assisting algorithm has been developed to aid clinicians in identifying the ideal sling candidate. Cystoscopic evaluation of sphincter function and the coaptive zone, by repositioning the membranous urethra during urethroscopy using the so-called ‘repositioning test’ (RT), is advocated to identify sling candidates [4]. However, the impact of observer experience on the cystoscopic sphincter evaluation (CSE) has not been investigated. Moreover, it is not clear whether performing a CSE before surgery influences the final choice of a specific surgical procedure.

The intra-observer reliability/agreement was determined for each observer separately, and the mean reliability is presented with the 95% CI, as well as interquartile range and range. The inter-observer reliability/agreement was determined for each recording (count 1/2) separately, and the mean reliability is presented with 95% CI, as well as both individual values. For intra-observer reliability or agreement, the number of observers could be as low as one, provided that the two ratings per observer were available. For inter-observer reliability or agreement, the number of observers should be at least two. Analyses were performed using Statistical Analysis System (SAS) software (version 9.4 of the SAS System for Windows; SAS Institute Inc., Cary, NC, USA). The SAS macro ‘icc9’ was used for estimation of the ICC and the coefficients of variation. The SAS macro ‘magree’ was used for estimation of the kappa coefficient. Coefficient values were interpreted as: <0, no agreement; 0–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1, almost perfect agreement [5]. Figure 3 shows the different steps of our methodology and the interpretation of the results of each step.
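
As an illustration of the interpretation scale above, the following minimal Python sketch maps a coefficient to its agreement label and computes a simple ICC. It is not the ‘icc9’ or ‘magree’ macro: the function names are hypothetical, the ICC shown is a one-way random-effects ICC(1,1) chosen only for simplicity, and values falling in the gaps between the published bands (e.g. 0.205) are assigned to the lower band by assumption.

```python
from statistics import mean


def interpret_agreement(coef: float) -> str:
    """Map an ICC or kappa value to the agreement label quoted in the text."""
    if coef < 0:
        return "no agreement"
    if coef <= 0.20:
        return "slight"
    if coef <= 0.40:
        return "fair"
    if coef <= 0.60:
        return "moderate"
    if coef <= 0.80:
        return "substantial"
    return "almost perfect"


def icc_oneway(ratings: list[list[float]]) -> float:
    """One-way random-effects ICC(1,1) (an assumption, not the icc9 method).

    Each row is one recording; each column is one repeated rating of it.
    """
    n = len(ratings)      # number of recordings
    k = len(ratings[0])   # ratings per recording
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Between-recording and within-recording mean squares
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

For example, `interpret_agreement(icc_oneway(scores))` would label the agreement for a hypothetical `scores` table holding one observer's two ratings per recording.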

An invitation to join the project, with a comprehensive explanation of our objectives, was sent via e-mail to 62 evaluators. Of these, 26 agreed to join (response rate 42%). All the anonymised recordings were shared online via a protected, personalised link with the 26 responders who completed the project. Urologists were chosen randomly but had to be members of a European/American scientific society of urology and considered experts in the field of male UI in general (regardless of experience with slings). Residents and students were randomly chosen from our department regardless of their level of experience or interest in the treatment of male UI. The link also included the necessary online instructions and a form with a few questions on personal sling experience. The recordings were offered to the same evaluators twice, in random order. The evaluators had different years of practice and different levels of experience with male sling surgery: they were medical students, urology residents, and fully qualified urologists with 0–5, 5–10, or >10 years of practice. The level of experience was defined as: ‘no experience at all’, ‘some procedures seen’, ‘some procedures performed’, and ‘surgery performed on a regular basis’. Some patients were good sling candidates, others were not. According to the literature, a ‘good’ sling candidate is a patient with mild (1–2 pads/day or pad weight <200 g/24 h) to moderate (3–4 pads/day or pad weight <400 g/24 h) UI, without previous radiotherapy or urethral stricture surgery [2]. The observers independently examined and rated each recording and were blinded to the patients’ clinical characteristics, to the subsequent choice of surgery, and to each other's findings.
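
The definition of a ‘good’ sling candidate quoted above can be sketched as a simple rule. This is only an illustrative Python sketch under stated assumptions: the `Patient` class and function names are hypothetical, only the daily pad count is used (the alternative pad-weight thresholds are omitted), and the labels ‘none’ and ‘severe’ are assumed for values outside the ranges given in the text.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    pads_per_day: int        # daily pad usage
    radiotherapy: bool       # previous radiotherapy
    stricture_surgery: bool  # previous urethral stricture surgery


def ui_severity(pads_per_day: int) -> str:
    """Grade UI by daily pad count, using the thresholds quoted in the text."""
    if pads_per_day <= 0:
        return "none"      # assumption: label not defined in the text
    if pads_per_day <= 2:
        return "mild"      # 1-2 pads/day
    if pads_per_day <= 4:
        return "moderate"  # 3-4 pads/day
    return "severe"        # assumption: label not defined in the text


def is_good_sling_candidate(p: Patient) -> bool:
    """Mild-to-moderate UI with no radiotherapy and no stricture surgery."""
    return (
        ui_severity(p.pads_per_day) in ("mild", "moderate")
        and not p.radiotherapy
        and not p.stricture_surgery
    )
```

In the study itself the observers were blinded to these characteristics, so a rule of this kind describes only the reference definition, not what the observers rated.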

A flexible videocystoscope is positioned distal to the sphincter region with a view of the whole circumference of the external urinary sphincter. First, sphincter closure is observed during voluntary contraction of the pelvic floor. Next, repositioning of the posterior urethra is performed by applying gentle mid-perineal pressure parallel to the anal canal (midway between scrotum and anus) and below the bulbar urethra. The goal is to relocate the posterior urethra 2–3 cm proximally. The test is positive if there is complete closure during active sphincter contraction and if, during repositioning of the posterior urethra, the sphincter closes completely and autonomously, in a reflex and concentric manner, with a ‘coaptive zone’ of at least 1 cm (Fig. 1). The length of the coaptive zone is estimated visually [4]. Figures 1 and 2 show a positive and a negative RT, respectively.
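
The positivity criteria described above can be summarised as a single Boolean rule. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the coaptive zone length is assumed to be supplied as the observer's visual estimate in centimetres.

```python
def repositioning_test_positive(
    closes_on_voluntary_contraction: bool,
    reflex_concentric_closure: bool,
    coaptive_zone_cm: float,
) -> bool:
    """Return True when all criteria for a positive RT described in the text hold.

    closes_on_voluntary_contraction: complete closure during active contraction
    reflex_concentric_closure: complete, autonomous, reflex and concentric
        closure during repositioning of the posterior urethra
    coaptive_zone_cm: visually estimated coaptive zone length (cm)
    """
    return (
        closes_on_voluntary_contraction
        and reflex_concentric_closure
        and coaptive_zone_cm >= 1.0
    )
```

All three conditions must hold; a recording with complete closure but a coaptive zone estimated at less than 1 cm would be rated negative under this rule.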

At our centre, a CSE is routinely performed in all UI surgery candidates during outpatient video-urethrocystoscopy in the supine position and is recorded in the electronic patient record. The CSE recordings of patients evaluated at our institution between March 2018 and February 2020 were extracted and anonymised. The recordings were randomly offered for evaluation to different observers through a secured website. The patients included both ideal sling candidates and artificial urinary sphincter (AUS) candidates. We did not require Ethics Committee approval because the data are routinely recorded during each examination and, moreover, were extracted retrospectively and anonymously. Patients sign a consent form for data processing and a privacy agreement upon entering the hospital.

The design of the study did not allow us to perform complete descriptive analyses of the patients’ characteristics. In Table 2, we give an overview of the available characteristics. The mean (median, range) age was 70 (71, 46–87) years and the majority of patients had not undergone radiotherapy (73%). Slings were implanted in 19/35 patients (54%) and the rest underwent AUS surgery. In two patients, surgery was cancelled due to COVID-19 restrictions. During follow-up, 54% and 35% of the patients were completely dry and socially continent, respectively. In Table 3, we give the numbers and percentages of overall ratings (37 patients × 26 observers × two ratings). According to these results, most patients had a normal bladder neck without strictures (58% and 67%, respectively), a long coaptation during the RT (≥1 cm in 53%), and were good sling candidates (58%), with an estimated good outcome (66%). The intra-observer reliability for the CSE is shown in Table S1. The ICC was 0.54 (moderate), 0.58 (moderate), and 0.60 (moderate) for medical students, residents, and urologists, respectively. Observers performing CSE on a regular basis had an ICC of 0.66. The inter-observer reliability for the CSE is shown in Table S1. The ICCs ranged between 0.31 and 0.53, with the lowest ICC value observed between urologists (0.31). Interestingly, after stratifying observers according to years of experience and procedure experience, the lowest agreement values were observed between urologists with >10 years of experience who performed sling surgery on a regular basis.

In total, 37 recordings were included in the study. Overall, 26 observers scored the recordings twice. Of the 26 observers, eight (31%) were medical students, 12 (46%) were urologists, and six (23%) were urology residents. The majority of urologists had >10 years of experience (seven of 12 [58%]). Eight (31%) of the observers performed sling surgery on a regular basis and four (15%) had no experience at all. Interestingly, the majority of the observers did not perform a CSE routinely (19/26 observers [73%]; Table 1).

Discussion

To the best of our knowledge, this is the first study to blindly assess the inter- and intra-observer reliability of the CSE in patients with post-prostatectomy UI. We designed a blinded study to assess the reliability and, accordingly, the reproducibility of this test in a scenario where other influencing variables were unknown. Our study failed to demonstrate any acceptable level of agreement within and between observers and, as a result, we could not assess any correlation with the subsequent clinical decision. Therefore, the added value of the CSE in this context should be critically reviewed. Interestingly, our data showed higher levels of agreement in observers with less experience who did not perform a CSE routinely: intra-observer reliability ICC between 0.54 and 0.70 and inter-observer reliability ICC between 0.40 and 0.73.

By contrast, experienced surgeons implanting slings on a regular basis had the lowest ICC (intra-observer reliability ICC between 0.55 and 0.60; inter-observer reliability ICC between 0.31 and 0.40). According to our results, the CSE alone seems to have limited value in clinical practice. The reason for our counterintuitive results could be that the test, if separated from other clinical variables, does not add any value to the clinical assessment of the patients. Therefore, it cannot be considered separately from the complex evaluation of the patient. Some might argue that the study design is artificial, as normally this is an interactive test performed with an awake patient by a clinician who already knows the patient's characteristics. However, the focus of our project was precisely to separate the test from the rest of the assessment, in order to challenge its reliability and determine whether it is mandatory or not.

Our results are in contrast with previous literature advocating an important role of the RT during patient assessment before sling surgery [4]. The RT is considered a minimally invasive, not time-consuming, easy to perform, and easy to learn test. It does not require additional diagnostic procedures, as a urethrocystoscopy is a recommended assessment before male UI surgery [4, 6, 7]. Bauer et al. [4], in a study including 65 patients treated with an AdVance sling implant, reported a higher cure rate at 6 months in patients with a positive RT compared with their counterparts, based on daily pad usage, urine loss in 1-h pad tests, and Patient Global Impression of Improvement score. They underlined the role of the RT as a useful tool for preoperative patient selection. It should be noted that in their study the RT was included in a more holistic assessment of the patients, including other, and possibly more important, characteristics and, as such, the added value of the RT could not be appreciated.

Previous studies on patient assessment, sling efficacy, and safety mainly recruited ideal candidates who were not previously exposed to other treatments (i.e., radiotherapy and/or urethral stricture surgery). By contrast, the literature evidence for non-ideal patients is poor [8]; few studies adopted a prospective approach, and most of the available studies did not use a standardised definition of the degree of UI or a standardised protocol for the assessment of the patients [9, 10]. In an ideal population, the results of this surgery are excellent; however, results can differ substantially between ideal patients and patients with comorbidities or known risk factors such as morbid obesity, previous radiotherapy to the prostatic fossa, previous urethral stricture treatment, the presence of detrusor overactivity, and the degree of UI [2, 11].

Our preliminary study has several strengths. The first is the methodological approach. It is a blinded study in which all the cystoscopies were performed by two expert surgeons using the same standardised technique. Moreover, our methodology makes it possible to isolate the CSE from the general clinical aspects of the patients. It allows the evaluation of the added value of the CSE alongside the rest of the normal outpatient evaluation. Second, the observers had different levels of experience, which allowed us to correlate expertise with CSE variability. Third, we included both ideal sling candidates with a subsequent successful outcome and non-ideal candidates with irradiated non-coaptive sphincters.

Last, the inter-observer reliability could be estimated twice, based on the first and second evaluation of the observers, respectively. The results showed very similar estimates of the inter-rater reliability in all subgroups, hence validating our findings. We should emphasise that the recordings were very similar to each other, which is inherent in the type of procedure, so that there is little chance of recall bias. Another reason for a low risk of recall bias is that the recordings were randomly assigned and randomly assessed by the observers during the same session.

Although the results are promising, this is a preliminary study with a relatively low number of cystoscopies included; however, there is no reason to believe that the point estimates would be affected by increasing the sample size. The standardisation of the technique remains the main issue. At our centre we perform the CSE as outlined, but it will always be difficult to generalise our results and compare them with those of other centres. However, we feel we should emphasise that the sample we evaluated is representative of the population to whom we applied the scoring system (Table S6). The homogeneous characteristics of this sample can explain the lower than expected ICC levels, as the ICC depends on the heterogeneity of the sample. A critical analysis of our results leads us to reconsider the role of this test, because it is very likely that its results and, consequently, our subsequent surgical choice could be influenced by the known clinical characteristics of the patients. Nevertheless, although in the light of our results the CSE is not mandatory, it can be useful in more ‘complicated’ cases (urethral stenosis, large urine leakage that worsens during the day).