Published in Vol 9, No 3 (2021): March

Preprints (earlier versions) of this paper are available.
Comparative Analysis of Paper-Based and Web-Based Versions of the National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy-Breast Cancer Symptom Index (NFBSI-16) Questionnaire in Breast Cancer Patients: Randomized Crossover Study


Original Paper

1Department of Breast Surgery, The First Affiliated Hospital of China Medical University, Shenyang, China

2Department of Ophthalmology, He Eye Hospital, Shenyang, China

3Mathematical Science Research Centre, Queen's University Belfast, Belfast, United Kingdom

*these authors contributed equally

Corresponding Author:

Feng Jin, MD, PhD

Department of Breast Surgery

The First Affiliated Hospital of China Medical University

No 155 Nanjing Road, Heping District

Shenyang, 110001
China


Phone: 86 18040031101


Background: Breast cancer remains the most common neoplasm diagnosed among women in China and globally. Health-related questionnaire assessments have gained prominence in research and clinical oncology settings. The National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index (NFBSI-16) is a rapid and powerful tool for evaluating disease- or treatment-related symptoms, both physical and emotional, in patients with breast cancer for clinical and research purposes. The widespread adoption of personal smartphones offers a potential web-based approach to administering the questionnaire; however, the reliability of the NFBSI-16 in electronic format has not been assessed.

Objective: This study aimed to assess the reliability of a web-based NFBSI-16 questionnaire in breast cancer patients undergoing systemic treatment, using a prospective open-label randomized crossover design.

Methods: We randomly recruited patients with breast cancer undergoing systemic treatment from the central hospital registry to complete both the paper- and web-based versions of the questionnaire. Both versions were self-assessed. Patients were randomly assigned to group A (paper-based first, web-based second) or group B (web-based first, paper-based second). A total of 354 patients were included in the analysis (group A: n=177; group B: n=177). Descriptive sociodemographic characteristics were summarized, and reliability and agreement rates for single items, subscales, and the total score were analyzed using the Wilcoxon test. The Lin concordance correlation coefficient (CCC) and Spearman and Kendall τ rank correlations were used to assess test-retest reliability.

Results: Test-retest reliability measured with the CCC was 0.94 for the total NFBSI-16 score. Significant correlations (Spearman ρ) were documented for all 4 subscales—Disease-Related Symptoms Subscale–Physical (ρ=0.93), Disease-Related Symptoms Subscale–Emotional (ρ=0.85), Treatment Side Effects Subscale (ρ=0.95), and Function and Well-Being Subscale (ρ=0.91)—and for the total NFBSI-16 score (ρ=0.94). Mean differences between test and retest were all close to zero (≤0.06). In the Wilcoxon test of parallel test-retest reliability of individual items, only GP3 (item 5) differed significantly between versions (P=.02). A majority of the participants in this study (255/354, 72.0%) preferred the web-based over the paper-based version.

Conclusions: The web-based version of the NFBSI-16 questionnaire is an excellent tool for monitoring individual breast cancer patients under treatment, with the majority of participants preferring it over the paper-based version.

JMIR Med Inform 2021;9(3):e18269



Breast cancer accounts for the highest proportion of malignant tumors among women (excluding skin cancers) globally. According to an International Agency for Research on Cancer report [1], the worldwide burden of breast cancer was 2.1 million cases in 2018, accounting for 1 in 4 cancer cases among women. Advancements in breast cancer screening, detection, and treatment over the last few decades have increased the chance of cure for early-stage breast cancer patients, while patients with advanced (metastatic) disease now have prolonged survival and varying degrees of symptom control [2,3]. However, comprehensive, long-term treatment can impact the quality of life of patients and survivors and therefore requires continual health management during and after recovery [4].

Breast cancer and its treatment have been documented to significantly disrupt patients’ health-related quality of life, which has been found to predict survival time, with even greater significance for noncurative patients [5-10]. To assess treatment benefits, patient-reported outcome measures (PROMs) provide unique perspectives on cancer symptoms from the patient’s experience, some of which can be missed by clinicians and laboratory tests [11-13]. The National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index (NFBSI-16) was developed on the foundation of the Functional Assessment of Chronic Illness Therapy (FACIT) measurement system to assess high-priority symptoms of breast cancer, emphasizing patient input; it can be applied to help evaluate the effectiveness of breast cancer treatments in clinical practice and research [14-16].

The migration from a paper-based to a web-based version does not guarantee preservation of the psychometric properties of a scale, since various factors can affect the performance of a questionnaire adapted for web-based administration, such as layout, instructions, or restructuring of items and responses. Researchers have investigated methods of validation, routes of administration, practical considerations, and reliability of electronic PROMs [17-27]. Gwaltney et al’s meta-analysis assessing the equivalence of computer versus paper versions of PROMs showed “a high overall level of agreement between paper and computerized measures” [28]. The review encompassed the fields of rheumatology, cardiology, psychiatry, asthma, alcoholism, pain assessment, gastrointestinal disease, diabetes, and allergies. In contrast, a study of the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Core 30 found small but statistically significant differences in scale mean scores (3 to 7 points on a 100-point scale) associated with mode of administration [29]. Various validated web-based questionnaires in oncology have been demonstrated to be reliable and effective tools for assessing PROMs in therapeutic clinical and research settings [30-33]. Currently in China, web-based clinical research questionnaires using WeChat are rapidly growing in number, and various studies have validated WeChat-based administration of health-related questionnaires [34-36]. To serve the large and growing patient base in China, we expected web-based administration of the NFBSI-16 to be a reliable methodology for assessing the impact of disease, treatment, and well-being status among patients with breast cancer. It could also be a more cost-effective and efficient method to apply in certain growing patient demographics.

The aim of this study was to analyze the reliability of a web-based NFBSI-16 questionnaire (Chinese language) for measuring disease- and treatment-related symptoms and concerns in breast cancer patients, comparing it with the validated paper-based version.

Study Design and Patient Enrolment

Patients were recruited from the Department of Breast Surgery of the First Affiliated Hospital of China Medical University, Shenyang, China, between October 2019 and January 2020. The inclusion criteria were female gender, full legal age, proven diagnosis of breast cancer, being under systemic anticancer treatment, ability to follow study instructions, sufficient literacy and fluency in Chinese to comprehend the questionnaires, ability and willingness to complete the study protocol, and a signed declaration of consent. Potential participants were excluded if they could not provide informed consent or were participating in other studies (burden of participation). Participants had an initial clinic visit at which eligibility was assessed. All eligible participants were randomly chosen from the hospital’s central registry and invited to volunteer for the study via face-to-face interview with a trained research clinician. Written informed consent was obtained. The study protocol was approved by the First Affiliated Hospital of China Medical University ethics committee.

The study was a randomized crossover design in which all participants completed both a standard paper questionnaire and a web-based version of the NFBSI-16 (Multimedia Appendix 1). Patients in group A started with the paper-based version followed by the web-based version on their smartphone in the same session; patients in group B completed the web-based version followed by the paper-based version. Participants were randomized immediately after enrolment to group A or B in a 1:1 ratio, based on the mode of administration to be completed first, using a computer-generated randomization list with a specified seed and a block size of 6. Between the two sessions, participants were given a 15-minute break during which they attended a brief patient education seminar (a routine activity in our department) as a distractor task to reduce the potential carryover effect. All participants were provided with written instructions for completing the paper- or web-based questionnaires before administration. After completing both versions of the NFBSI-16 questionnaire, participants were invited to state their preference for either the paper- or web-based version.
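The allocation scheme described above (1:1 ratio, computer-generated list, block size of 6) can be sketched as follows. The function name and seed value are illustrative, not those used in the study.

```python
import random

def block_randomize(n_participants, block_size=6, seed=42):
    """Permuted-block randomization for a two-arm crossover.

    Each block of 6 holds three 'A' (paper first) and three 'B'
    (web first) assignments in random order, so the allocation ratio
    stays 1:1 within every block. A fixed seed makes the list
    reproducible, as required for a prespecified randomization list.
    """
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]

allocation = block_randomize(354)
print(allocation.count("A"), allocation.count("B"))  # 177 177
```

Because 354 is an exact multiple of the block size, the two groups come out perfectly balanced at 177 each, matching the group sizes reported below.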


The NFBSI-16 contains items from the original FBSI and FACIT measures selected by patients and clinicians according to their priority concerns [15], making it a more direct tool for reflecting the effectiveness of treatments for advanced breast cancer. The NFBSI-16 comprises 16 items organized into 4 dimensions for ease of use and scoring: Disease-Related Symptoms Subscale–Physical (DRS-P), Disease-Related Symptoms Subscale–Emotional (DRS-E), Treatment Side Effects Subscale (TSE), and Function and Well-Being Subscale (FWB). Clinicians and researchers can therefore view and assess subscale scores individually when concerned about a particular class of symptoms. The questionnaires were self-completed, and careful attention was paid to the design and layout of the web-based version. To reduce the risk of errors in posing, interpreting, recording, and coding responses and to avoid potential interrater variability, the authors followed theory-based guidelines for self-administered questionnaire design (Multimedia Appendix 1) [37]. The web-based user interface and the paper used for the paper-based questionnaire were free from all other information such as logos, slogans, and advertisements. Instructions for completion were included at the beginning of the web-based interface and in the header of the paper version, respectively. In brief, to take the web-based assessment, patients scanned a designated Quick Response code with their smartphone, which automatically took them to the web-based test, where they selected the intensity or severity for each of the 16 items. After the 16th question, the interface turned into a blank screen indicating the test was over. The paper-based questionnaire, in contrast, was completed with pencil on white paper, with the text printed in a clear 12-point font.

Testing of the Instrument

During pretesting and pilot testing, 3 colleagues specializing in oncology and 3 nonexperts evaluated the web-based questionnaire’s usability, accessibility, and clarity of the user interface. This testing was only conducted on the functionality of the web-based questionnaire since the format, structure, and sequence of items in the web-based questionnaire were the same as in the validated paper-based questionnaire.

Computation of Subscale Scores

Data from the paper questionnaires were entered manually into an electronic patient management system by the authors, and data from the web-based questionnaires were automatically captured on completion and downloaded to the same system. All data were anonymized. We assessed the completeness of the data on a per-item and per-questionnaire basis. Total scores were obtained by taking the mean score across completed items and multiplying by 16, the number of items (following official guidelines) [15]. Individual items are scored from 0 to 4, with 0 indicating the patient agrees with the item “not at all” and 4 “very much”. Subscale scores and total scores were computed for each participant and each mode of administration separately. Comparative analyses of individual items, subscales, and total score were the primary goal of the study.
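The prorated scoring rule described above (mean over completed items, multiplied by 16) can be written as a short sketch. The function name is hypothetical, and any reverse-scoring of positively worded items is omitted for simplicity.

```python
def nfbsi16_total(item_scores):
    """Prorated NFBSI-16 total: mean over completed items, times 16.

    item_scores: 16 responses on the 0-4 item scale, with None for
    skipped items. With all 16 items answered this reduces to the
    plain item sum (range 0-64).
    """
    answered = [s for s in item_scores if s is not None]
    if not answered:
        raise ValueError("no items completed")
    return sum(answered) / len(answered) * 16

print(nfbsi16_total([2] * 16))           # 32.0 (complete form: plain sum)
print(nfbsi16_total([2] * 15 + [None]))  # 32.0 (one skipped item, prorated)
```

Proration keeps partially completed questionnaires on the same 0-64 scale as complete ones, so totals remain comparable across respondents with occasional missing items.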

Statistical Analysis

All statistical analyses were conducted using SPSS Statistics, version 25 (IBM Corp). Frequency analysis was performed to determine the descriptive sociodemographic characteristics of patients. Following the ISPOR ePRO Good Research Practices Task Force recommendations [21], we evaluated measurement equivalence. Reliability, internal consistency, disparity of responses, and the rate of consistency between paper- and web-based responses were assessed. Reliability was calculated for the 16 individual items as well as for the scores of the 4 subscales (DRS-P, DRS-E, TSE, and FWB) and the NFBSI-16 total score in accordance with the NFBSI-16 guidelines [15]. The primary study outcome was the reliability of single items and the total score of the web-based questionnaire. The Wilcoxon test was used to identify possible statistically significant differences in the test of parallel forms reliability, both between single items and between scores, given the ordinal nature of the data. The secondary outcome was the consistency and agreement of the web-based questionnaire with the paper-based questionnaire. Mean values of the paper- and web-based measures were calculated, consistency analyses were performed by calculating the Spearman rank correlation coefficient (Spearman ρ), and agreement rates for each item were assessed using rank correlation (Kendall τ) for each scale. As a second measure of test-retest reliability, we calculated the Lin concordance correlation coefficient (CCC) [38]. Finally, answers to the “preference” question were compared between the web-based and paper versions of the NFBSI-16 using χ2 tests. In all analyses, P<.05 (2-tailed) was considered indicative of statistically significant differences (α=.05). As this was an exploratory analysis, all reported P values should be taken as purely descriptive. All figures (box plot and correlation diagram) were generated in SPSS Statistics.
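Although the analyses were run in SPSS, the Lin CCC [38] can be computed directly from its definition, as this sketch shows; the paired totals below are illustrative values, not study data.

```python
import numpy as np

def lin_ccc(x, y):
    """Lin concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using 1/n (biased) moments as in Lin's original formulation."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

paper = [38, 32, 45, 29, 41, 36]  # illustrative paired NFBSI-16 totals
web = [37, 33, 44, 30, 42, 35]
print(round(lin_ccc(paper, web), 3))  # 0.981
```

Unlike a plain correlation, the CCC equals 1 only when the paired measurements fall exactly on the 45-degree identity line, which is why it is suited to test-retest agreement.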

Enrolment of Patients

The final analysis included 354 patients with breast cancer receiving systemic treatment who completed both the paper- and web-based versions of the NFBSI-16 questionnaire. Initially, 380 patients were assessed for eligibility; 26 were excluded, as shown in the study flow diagram (Figure 1). Since there was no demographic difference between groups A and B, the two groups were combined in the final analysis. The mean age was 49.5 years (SD 10.44). Other baseline characteristics of the patients are shown in Table 1.

Figure 1. Study flow diagram.
Table 1. Basic characteristics of study participants.
Patient characteristics: n (%)
Menstrual status
  Premenopause: 133 (37.6)
  Perimenopause: 107 (30.2)
  Postmenopause: 114 (32.2)
Level of education completed
  Primary: 59 (16.7)
  Secondary: 161 (45.5)
  Tertiary: 134 (37.9)
Marital status
  Single: 16 (4.5)
  Married: 338 (95.5)
Residence
  Rural: 165 (46.6)
  Urban: 189 (53.4)
Neoadjuvant therapy: 244 (68.9)
Adjuvant therapy: 110 (31.1)

Parallel Forms Reliability

Parallel forms reliability of the single NFBSI-16 items was analyzed with the Wilcoxon signed rank test, as shown in Table 2. No systematic location difference between the two versions of the questionnaire was observed for any item except item 5 (question GP3). A very large proportion of items were answered identically (ties) in both versions, suggesting high parallel reliability, as only one significant difference (out of 16 items) was found in the single-item comparison. The statistically significant difference concerned question GP3, “Because of my physical condition, I have trouble meeting the needs of my family,” which the same participants scored slightly higher on the paper-based questionnaire (mean 2.07, SD 0.98) than on the web-based version (mean 2.00, SD 0.91). The medians of item GP3 were nevertheless identical for the two versions (median 2; IQR 1-3). The paper-based and web-based total mean scores differed by less than 0.1 points, with no statistically significant difference between them. Figure 2 illustrates the distribution of the paper-based and web-based total scores in a box plot; the small difference in total scores can be attributed to a few outliers visible there, and the whiskers of the web-based box plot fell within the broader IQR of the paper-based version. In addition, differences of less than 0.50 points were found between the paper-based and web-based questionnaires when the scores of the 4 subscales (DRS-P, DRS-E, TSE, and FWB) were calculated and compared; none of these differences was statistically significant (Table 3).

Table 2. Parallel test-retest reliability of single items and total score (Wilcoxon test).
NFBSI-16a items | Paper-based: mean (SD) | median (IQR) | Web-based: mean (SD) | median (IQR) | P value | Δ |mean−mean′|
Disease-Related Symptoms Subscale–Physical (DRS-P)
GP1 (item 1) | 2.31 (0.92) | 2 (2-3) | 2.32 (0.90) | 2 (2-3) | .58 | 0.01
GP4 (item 2) | 2.19 (0.90) | 2 (2-3) | 2.17 (0.88) | 2 (2-3) | .36 | 0.02
GP6 (item 3) | 2.29 (1.13) | 2 (1-3) | 2.30 (1.15) | 2 (1-3) | .88 | 0.01
B1 (item 4) | 2.01 (0.89) | 2 (1-3) | 2.00 (0.88) | 2 (1-3) | .79 | 0.01
GP3 (item 5) | 2.07 (0.98) | 2 (1-3) | 2.00 (0.91) | 2 (1-3) | .02b | 0.07
HI7 (item 6) | 2.59 (1.02) | 2 (2-3) | 2.59 (1.06) | 2 (2-3) | .91 | 0.00
BP1 (item 7) | 1.88 (0.93) | 2 (1-2) | 1.90 (0.93) | 2 (1-2) | .27 | 0.02
GF5 (item 8) | 2.59 (1.18) | 2 (2-3) | 2.55 (1.17) | 2 (2-3) | .38 | 0.04
Disease-Related Symptoms Subscale–Emotional (DRS-E)
GE6 (item 9) | 2.00 (1.04) | 2 (1-2) | 2.01 (1.05) | 2 (1-2) | .73 | 0.01
Treatment Side Effects Subscale (TSE)
GP2 (item 10) | 2.20 (1.15) | 2 (1-3) | 2.25 (1.10) | 2 (1-3) | .16 | 0.05
N6 (item 11) | 1.87 (0.98) | 2 (1-2) | 1.85 (0.93) | 2 (1-2) | .42 | 0.02
GP5 (item 12) | 2.77 (1.01) | 3 (2-3) | 2.75 (1.00) | 3 (2-3) | .45 | 0.02
B5 (item 13) | 2.98 (1.35) | 3 (2-4) | 2.98 (1.33) | 3 (2-4) | .89 | 0.00
Function and Well-Being Subscale (FWB)
GF1 (item 14) | 2.52 (1.04) | 2 (2-3) | 2.55 (1.01) | 2.5 (2-3) | .14 | 0.03
GF3 (item 15) | 2.82 (1.12) | 3 (2-4) | 2.81 (1.08) | 3 (2-4) | .83 | 0.01
GF7 (item 16) | 2.82 (1.19) | 3 (2-4) | 2.85 (1.21) | 3 (2-4) | .67 | 0.03
Total score
NFBSI-16 score | 37.92 (7.79) | 38 (32-42.5) | 37.88 (7.71) | 38 (32.75-42) | .98 | 0.04

aNFBSI-16: National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index.

bStatistically significant difference.

Figure 2. Box plot comparison of paper-based and web-based distribution of total scores.
Table 3. Parallel test-retest reliability of subscales (Wilcoxon test).
NFBSI-16a subscale | Paper-based: mean (SD) | median (IQR) | Web-based: mean (SD) | median (IQR) | P value | Δ |mean−mean′|
Disease-Related Symptoms Subscale–Physical | 35.90 (9.59) | 36 (30-42) | 35.64 (9.58) | 34 (10-42) | .43 | 0.26
Disease-Related Symptoms Subscale–Emotional | 32.00 (16.54) | 32 (16-32) | 32.05 (16.84) | 32 (16-32) | .98 | 0.05
Treatment Side Effects Subscale | 39.20 (13.39) | 40 (28-48) | 39.20 (12.86) | 36 (32-48) | .62 | 0.00
Function and Well-Being Subscale | 43.37 (13.37) | 42.67 (32-53.33) | 43.62 (13.72) | 42.67 (32-53.33) | .32 | 0.25

aNFBSI-16: National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index.

Test of Internal Consistency

Table 4 shows the Spearman ρ correlations between the individual items of the paper- and web-based questionnaires. All 16 items demonstrated a high correlation (>0.8) between the two versions. Item-level internal consistency was also tested with Kendall τ; rank correlations were high for all items, with coefficients ranging from 0.787 to 0.877, all statistically significant. With each data point representing an individual patient’s total NFBSI-16 score, Figure 3 depicts a strong positive correlation between total paper-based and web-based scores. Agreement between the paper-based and web-based scores measured with the CCC was comparably high throughout, reaching 0.94 for the total score (fair: 0.21-0.40; moderate: 0.41-0.60; substantial: 0.61-0.80; almost perfect: 0.81-1.00), as shown in Table 5.

Table 4. Correlation between test-retest in individual items and subscale (Spearman ρ and Kendall τ analysis).
Items | Spearman ρ | P value | Kendall τ | P value
Disease-Related Symptoms Subscale–Physical (DRS-P)
GP1 (item 1) | 0.89 | <.001 | 0.877 | <.001
GP4 (item 2) | 0.84 | <.001 | 0.810 | <.001
GP6 (item 3) | 0.86 | <.001 | 0.804 | <.001
B1 (item 4) | 0.90 | <.001 | 0.87 | <.001
GP3 (item 5) | 0.85 | <.001 | 0.825 | <.001
HI7 (item 6) | 0.85 | <.001 | 0.813 | <.001
BP1 (item 7) | 0.89 | <.001 | 0.856 | <.001
GF5 (item 8) | 0.84 | <.001 | 0.796 | <.001
Subscale total | 0.93 | <.001 | 0.827 | <.001
Disease-Related Symptoms Subscale–Emotional (DRS-E)
GE6 (item 9) | 0.85 | <.001 | 0.826 | <.001
Subscale total | 0.85 | <.001 | 0.882 | <.001
Treatment Side Effects Subscale (TSE)
GP2 (item 10) | 0.88 | <.001 | 0.830 | <.001
N6 (item 11) | 0.89 | <.001 | 0.857 | <.001
GP5 (item 12) | 0.83 | <.001 | 0.795 | <.001
B5 (item 13) | 0.84 | <.001 | 0.788 | <.001
Subscale total | 0.95 | <.001 | 0.882 | <.001
Function and Well-Being Subscale (FWB)
GF1 (item 14) | 0.82 | <.001 | 0.787 | <.001
GF3 (item 15) | 0.86 | <.001 | 0.821 | <.001
GF7 (item 16) | 0.83 | <.001 | 0.79 | <.001
Subscale total | 0.91 | <.001 | 0.825 | <.001
Total score
Figure 3. Correlation between total paper-based and web-based scores.
Table 5. Agreement between paper-based and web-based questionnaires scores (Lin concordance correlation coefficient analysis).
Items | Rca | 95% CI
Disease-Related Symptoms Subscale–Physical (DRS-P)
GP1 (item 1) | 0.92 | 0.90-0.94
GP4 (item 2) | 0.85 | 0.82-0.88
GP6 (item 3) | 0.86 | 0.83-0.88
B1 (item 4) | 0.90 | 0.88-0.71
GP3 (item 5) | 0.86 | 0.83-0.89
HI7 (item 6) | 0.86 | 0.83-0.89
BP1 (item 7) | 0.88 | 0.87-0.91
GF5 (item 8) | 0.85 | 0.82-0.88
Subscale total | 0.94 | 0.93-0.95
Disease-Related Symptoms Subscale–Emotional (DRS-E)
GE6 (item 9) | 0.84 | 0.81-0.87
Subscale total | 0.84 | 0.81-0.87
Treatment Side Effects Subscale (TSE)
GP2 (item 10) | 0.87 | 0.85-0.90
N6 (item 11) | 0.88 | 0.86-0.91
GP5 (item 12) | 0.86 | 0.83-0.89
B5 (item 13) | 0.83 | 0.80-0.86
Subscale total | 0.96 | 0.95-0.97
Function and Well-Being Subscale (FWB)
GF1 (item 14) | 0.85 | 0.82-0.88
GF3 (item 15) | 0.86 | 0.83-0.89
GF7 (item 16) | 0.84 | 0.81-0.87
Subscale total | 0.91 | 0.89-0.93
Total score

aRc: concordance correlation coefficient.

Patient Preference

Table 6 shows that a majority of participants preferred answering the same questions in the web-based format rather than the paper-based format. The difference in preference was statistically significant.

Table 6. Analysis of participant preference.
Patient preference | Observed, n | Expected, n | Residual
Preferred paper-based questionnaire | 98 | 177 | −79
Preferred web-based questionnaire | 256 | 177 | 79
Total | 354
Chi-square (df): 70.5a (1); asymptotic significance: .001b

a0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 177.0.

bStatistically significant difference.

Estimation of the Carryover Effect

To assess the carryover effect, we let sA denote, for each respondent in group A, the sum of the web-based and paper-based total scores, and let sB denote the corresponding sum in group B. Estimating the carryover effect with a Wilcoxon test on the sum values sA and sB, we found no significant difference between the two sequences at the 5% level of significance (P=.84).
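The carryover check described above amounts to a two-sample Wilcoxon rank-sum test on the per-respondent period sums. A sketch with simulated (not study) data:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(2021)
# Simulated per-respondent sums (paper total + web total); the actual
# study values are not reproduced here.
s_a = rng.normal(loc=76, scale=15, size=177)  # group A: paper first
s_b = rng.normal(loc=76, scale=15, size=177)  # group B: web first

stat, p = ranksums(s_a, s_b)  # Wilcoxon rank-sum between the two sequences
print(round(p, 3))
```

If a carryover effect existed, the sequence groups would differ systematically in their period sums and the test would tend to reject; a large P value, as reported above, is consistent with no carryover.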

Principal Results

Overall, reliability of the web-based version was considered excellent as measured with the Wilcoxon signed rank test and the CCC. Additionally, Spearman ρ and Kendall τ correlations were high, and mean differences were all close to zero, supporting good reliability of the web-based version of the NFBSI-16 self-administered questionnaire. In this study, we used the Wilcoxon signed rank test and the CCC to assess test reliability; however, different methods can be used to assess test-retest reliability, and there is much discussion in the literature on the best possible methodology [39]. The intraclass correlation coefficient (ICC) was first introduced in 1954 as a modification of the Pearson correlation coefficient. The modern ICC is calculated from mean squares (ie, estimates of the population variances based on the variability among a given set of measures) obtained through analysis of variance (ANOVA). The disadvantage of the ICC in patient group analysis is that it compares variance among patients to total variance: if the group is largely homogeneous, the ICC tends to be low, and if the group is largely heterogeneous, the ICC tends to be high. Thus, an ICC generalizes only to similar populations. Additionally, the 1-way ICC does not consider the order in which observations were made [40]. The CCC is therefore a useful measure, as it not only covers mean differences between the first and second measurements, as do ICCs calculated by 1-way ANOVA, but also takes variance differences between the first (paper-based) and second (web-based) measurements into consideration by reducing the magnitude of the resulting test-retest reliability estimate. In conclusion, the CCC is the better tool because it distinguishes bias from imprecision [39,40].
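The distinction drawn above—that an agreement coefficient should penalize systematic bias that a plain correlation ignores—can be illustrated numerically; the data are invented for the demonstration:

```python
import numpy as np

def lin_ccc(x, y):
    # Lin CCC: 2*cov / (var_x + var_y + squared mean difference)
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

test = np.array([30.0, 34, 38, 42, 46])
retest = test + 5  # perfectly correlated, but shifted up by 5 points
r = np.corrcoef(test, retest)[0, 1]
print(round(r, 3), round(lin_ccc(test, retest), 3))  # 1.0 0.719
```

The Pearson correlation stays at 1.0 despite the constant 5-point bias, while the CCC drops well below 1 because the squared mean difference enters its denominator.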


This study also has some limitations. First, the significant difference in item 5 (GP3) between the paper- and web-based measurements of the NFBSI-16 (Table 2) was an unexpected finding. We think this significant difference might be due to an outlier; this assumption is supported by the fact that, even though 293 of the 354 patients gave the same answer for item 5 on both the paper- and web-based versions (a high number of ties), a significant difference in the mean was still detected. Second, given the nature of this study, it is difficult to generalize some of our findings, as they are limited by the demographic setting.

Comparison With Prior Work

The NFBSI-16 includes all 8 items from the original FBSI and 8 additional items from FACIT measures, which cover the most essential breast cancer–related symptoms and concerns endorsed by both oncology patients and clinicians [15]. Compared with the previous version (FBSI), it emphasizes patient input, following Food and Drug Administration guidance for PROMs [41], and has been validated as a comprehensive and powerful tool to evaluate the effectiveness of treatments for breast cancer in clinical practice and research. In addition, the layout of 4 clearly separated subscales benefits clinicians, patients, and researchers by allowing them to view the particular domains of concern. However, the reliability of an electronic version in the Chinese language had not been tested. This paper describes the evaluation of the test-retest reliability of the web-based version of the NFBSI-16 self-administered questionnaire. When designing a web-based version of a validated paper-based questionnaire, one must consider variables such as text size, column formatting, contrast, layout, and use of corrective lenses. We created the web-based NFBSI-16 to be as consistent with the original as possible. In addition, the technology skills required to complete a web-based questionnaire can differ from those needed for a paper-based questionnaire. Nevertheless, our study found no clinically significant differences between scores obtained from the paper- and web-based versions. Gwaltney et al’s [28] meta-analysis reported an average correlation between paper-based and electronic assessment of 0.90 (95% CI 0.87-0.92; n=32). Our findings suggest that the web-based NFBSI-16 questionnaire achieved good test-retest reliability, with a total NFBSI-16 score correlation of 0.94.


In summary, the web-based version of the NFBSI-16 clearly showed comparable reliability and is thus a promising measure for evaluative studies in patients undergoing treatment for breast cancer and for monitoring individuals. Its test-retest reliability supports the value of the web-based NFBSI-16 for clinical studies with relatively moderate sample sizes. Furthermore, since the majority of participants in our study preferred it over the paper-based version, we recommend using the web-based version of the NFBSI-16 in clinical studies. The longitudinal validity of the web-based NFBSI-16 and its validity in other demographic groups in China are currently being investigated, which will give clinicians more choice when evaluating health-related symptoms and quality of life in patients with breast cancer and other malignant tumors.


This study was sponsored by the National Natural Science Foundation of China (No. 81773163) and the Science and Technology Plan Project of Liaoning Province (No. 2013225585).

Conflicts of Interest

None declared.

Multimedia Appendix 1

Screenshot of the web-based version of the National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index (NFBSI-16) questionnaire.

PNG File , 231 KB


ANOVA: analysis of variance
CCC: concordance correlation coefficient
DRS-E: Disease-Related Symptoms Subscale–Emotional
DRS-P: Disease-Related Symptoms Subscale–Physical
FACIT: Functional Assessment of Chronic Illness Therapy
FWB: Function and Well-Being Subscale
ICC: intraclass correlation coefficient
NFBSI-16: National Comprehensive Cancer Network–Functional Assessment of Cancer Therapy–Breast Cancer Symptom Index
PROMs: patient-reported outcome measures
TSE: Treatment Side Effects Subscale

Edited by C Lovis; submitted 16.02.20; peer-reviewed by R Fox, PC Rassu, L Guo; comments to author 22.07.20; revised version received 15.09.20; accepted 31.01.21; published 02.03.21


©Jinfei Ma, Zihao Zou, Emmanuel Eric Pazo, Salissou Moutari, Ye Liu, Feng Jin. Originally published in JMIR Medical Informatics, 02.03.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.