Yanli Zhu, Yuntao Song, Guohui Xu, Zhihui Fan, Wenhao Ren
Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), 1Department of Pathology; 2Department of Head and Neck Surgery; 3Department of Ultrasound, Peking University Cancer Hospital and Institute, Beijing 100142, China
Abstract Objective: To evaluate the diagnostic utility of The Bethesda System for Reporting Thyroid Cytology (TBSRTC) at Peking University Cancer Hospital, the incidence of noninvasive follicular thyroid neoplasms with papillary-like nuclear features (NIFTP), and the impact of reclassification on cytopathological outcomes. Methods: We performed a retrospective study of thyroid fine-needle aspiration (FNA) cases between April 2014 and March 2019. The FNA results were classified according to TBSRTC. Post-surgery histological findings were followed up. Results: A total of 2,781 thyroid FNAs were performed. The incidences of the 6 diagnostic categories (DCs I–VI) were 14.8%, 17.1%, 15.8%, 2.3%, 11.6% and 38.5%, respectively. A total of 1,122 patients (40.3%) had corresponding histological results. NIFTP accounted for 0.4% of papillary thyroid carcinoma (PTC) cases, and there was no significant difference in the risk of malignancy (ROM) for each TBSRTC DC between “NIFTP=carcinoma (Ca)” and “NIFTP≠Ca”. When “NIFTP=Ca”, the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of TBSRTC were 98.0%, 84.0%, 99.4%, 58.3%, and 97.5%, respectively. When “NIFTP≠Ca”, the sensitivity, specificity, PPV, NPV and accuracy of TBSRTC were 98.1%, 81.5%, 99.3%, 61.1%, and 97.5%, respectively. Conclusions: TBSRTC is effective in the preoperative diagnosis of thyroid nodules in Peking University Cancer Hospital. The impact of the reclassification of NIFTP on cytopathological outcomes is limited because of its low incidence, and the revised ROMs are not suitable for Asian patients.
Keywords: Bethesda system; thyroid; FNA; cytopathology; NIFTP
Fine-needle aspiration (FNA) of the thyroid gland is a rapid, cost-effective, safe and widely accepted method in the preoperative evaluation of thyroid nodules. The Bethesda System for Reporting Thyroid Cytology (TBSRTC) was proposed in 2007 at the National Cancer Institute Thyroid Fine Needle Aspiration State of the Art and Science Conference held in Bethesda, Maryland, and was published in January 2010 (1). TBSRTC consists of 6 diagnostic categories (DCs), and each diagnostic category is associated with a specific risk of malignancy (ROM) and a recommendation for clinical management. Therefore, TBSRTC not only aids clinicians in treating their patients' thyroid nodules but also provides a uniform diagnostic terminology that allows pathologists to communicate more effectively with clinicians and to share data easily between laboratories.
Owing to advances in the molecular diagnosis of thyroid nodules, the introduction of the new pathologic entity “noninvasive follicular thyroid neoplasms with papillary-like nuclear features (NIFTP)” and other factors, TBSRTC II was published in the spring of 2018 (2). The foremost modification in the revised version involves the ROMs. Primarily, the ranges of the ROMs have been updated according to the most recent data in the literature. Moreover, each DC is associated with two different ROMs: one considers NIFTP as a carcinoma, and one considers NIFTP as a tumor with low malignant potential. Therefore, the reclassification of NIFTP is of great significance for the cytological interpretation of thyroid nodules because the ROMs for TBSRTC DCs would decrease, as reported in Western practice (3-6), and the revised 2015 American Thyroid Association (ATA) guidelines cite data from TBSRTC II (7). However, some Asian studies have shown high ROMs in the indeterminate FNA categories and low incidences of NIFTP in their thyroid FNA practice (8-10). Some authors have analyzed the multifactorial reasons for the discrepancies between Asian and Western studies (8,11). Thus, the incidence of NIFTP and the impact of reclassification on cytopathological outcomes remain unclear, and whether the revised ROMs are suitable for Asian patients remains to be further studied (8,11).
In addition, since its introduction, TBSRTC has become one of the most quoted pathological terminology and reporting systems in the published scientific literature (12). However, to date, the majority of published studies refer to the original TBSRTC, and various studies have investigated different ROMs and have reported discrepancies in malignancy rates in their respective institutions.
We adopted this reporting system in 2015, and no retrospective study has been performed at Peking University Cancer Hospital since its adoption. Thus, we analyzed our thyroid FNA data over nearly four years of using TBSRTC, in correlation with post-surgery histological results, to investigate the validity of this reporting system. The ROM for each DC was examined and compared with the first and second editions of TBSRTC. A further aim was to convey the ROM of each DC in Peking University Cancer Hospital to clinical colleagues to help optimize patient care.
We conducted a computer search by using the keyword “TBSRTC” for all thyroid FNAs performed at Peking University Cancer Hospital between April 2014 and March 2019, and the retrospective study was approved by the Ethics Committee of Beijing Cancer Hospital (No. 2018KT101). FNAs were performed by surgeons or sonographers either by palpation or ultrasound guidance. The aspirates were prepared as direct smears stained with hematoxylin-eosin and/or liquid-based cytology stained with Papanicolaou stain. Diagnostic terminology was used according to the recommended 6 categories of TBSRTC: Nondiagnostic or unsatisfactory (ND/UNS; I); Benign (B; II); Atypia of undetermined significance or follicular lesion of undetermined significance (AUS/FLUS; III); Suspicious for follicular neoplasm or follicular neoplasm (SFN/FN; IV); Suspicious for malignancy (SM; V); and Malignant (M; VI). We calculated the incidences of each DC of the FNAs performed by surgeons or sonographers and compared the results from the sonographer and surgeon groups.
Histologic follow-up was retrieved from the records of Peking University Cancer Hospital. The FNA results were correlated with post-surgery histological results.
True-negative cases included FNA benign cases confirmed as benign via histopathology. The true-positive cases had FNA category DC VI findings and malignant final histology. For “NIFTP=Ca” and “NIFTP≠Ca”, the ROM for each DC was evaluated, and the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy of TBSRTC were assessed individually. We compared our findings with the first and second editions of TBSRTC.
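The five measures named above follow directly from a 2×2 cytology-histology table. The Python sketch below is illustrative only: the function name `diagnostic_metrics` and the counts in the usage lines are hypothetical and are not the study's data. It assumes that DC VI cases with benign histology count as false positives and that DC II cases with malignant histology count as false negatives, which is an inference from the definitions of true positives and true negatives given above.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, NPV and accuracy as fractions.

    tp: DC VI on FNA, malignant on histology
    fp: DC VI on FNA, benign on histology   (assumed definition)
    fn: DC II on FNA, malignant on histology (assumed definition)
    tn: DC II on FNA, benign on histology
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = diagnostic_metrics(tp=90, fp=5, fn=10, tn=20)
for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

Under the “NIFTP≠Ca” scenario, a NIFTP case would simply move from the malignant to the non-malignant histology column before the counts are passed in.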
Statistical analysis was performed using IBM SPSS Statistics (Version 20.0; IBM Corp., New York, USA). The variables were mainly categorical, and the test used was the Chi-square test. A P value less than 0.05 was considered significant.
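For readers who want to reproduce this kind of comparison outside SPSS, the following Python sketch computes a Pearson Chi-square statistic for a 2×2 table of counts. It is a minimal illustration, not the procedure run in the study: the function name `chi_square_2x2` and the example counts are hypothetical, no continuity correction is applied (whether one was used in the original analysis is not stated), and all row and column totals are assumed to be nonzero.

```python
import math


def chi_square_2x2(table):
    """Pearson Chi-square test (no continuity correction) for a 2x2 table.

    `table` is ((a, b), (c, d)) of observed counts.
    Returns (chi2, p) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row_total in enumerate(row_totals):
        for j, col_total in enumerate(col_totals):
            expected = row_total * col_total / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper tail of the Chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts for illustration only (not taken from the study):
# rows = two operator groups, columns = nondiagnostic vs. diagnostic FNAs.
chi2, p = chi_square_2x2(((10, 20), (20, 10)))
print(f"chi2 = {chi2:.3f}, significant at 0.05: {p < 0.05}")
```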
A total of 2,781 thyroid FNAs were performed. The average patient age was 46.7 years (range, 15–89 years). There were 2,109 women (75.8%) and 672 men (24.2%), a female-to-male ratio of 3.1:1.0. For all 2,781 FNAs, the incidences of DCs I–VI were 14.8%, 17.1%, 15.8%, 2.3%, 11.5% and 38.5%, respectively. When the sonographer and surgeon groups were compared in terms of nondiagnostic FNA proportion, no significant difference was found (P>0.05). The overall distribution of diagnoses of thyroid FNAs performed by sonographers and surgeons is presented in Table 1.
A total of 13 repetitive thyroid FNAs were performed in Peking University Cancer Hospital during the study. All 13 cases were considered suspicious for malignancy by clinicians, whereas the first FNA results were neither malignant nor suspicious for malignancy (DCs I–IV). A repetitive FNA was performed on 2 (2/42, 4.8%) patients with DC I findings and on 9 (9/90, 10.0%) patients with DC III findings (Table 2). A total of 66.7% (6/9) of the DC III patients resolved into definitive diagnostic categories. The percentage of FNA repetition and the repetitive and final histologic diagnoses after the first FNA in the study cohort are shown in Table 2.
Of the study population, 1,122 (40.3%) patients had corresponding follow-up partial or total thyroidectomy in Peking University Cancer Hospital. A total of 132 (11.8%) patients had nonneoplastic lesions, 17 (1.5%) patients had benign neoplasms, 969 (86.4%) patients had malignant neoplasms, and 4 (0.4%) patients had NIFTP. The flowchart of case screening is presented in Figure 1. When “NIFTP=Ca”, the ROM for each DC was 59.5% for DC I, 41.7% for DC II, 50.0% for DC III, 34.5% for DC IV, 78.5% for DC V, and 99.4% for DC VI. When “NIFTP≠Ca”, the ROM for each DC was 57.1% for DC I, 38.9% for DC II, 50.0% for DC III, 31.0% for DC IV, 78.5% for DC V, and 99.3% for DC VI. The distribution of the cytologic-histologic correlation of the nodules within the excised thyroid glands and the ROM for each category are presented in Table 3.
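The two ROM scenarios differ only in whether NIFTP cases are counted in the numerator. The Python sketch below illustrates this, assuming that the ROM of a DC is the number of histologically malignant cases divided by the number of resected cases in that DC. The function name `risk_of_malignancy` and the counts are hypothetical and are not taken from Table 3.

```python
def risk_of_malignancy(malignant, niftp, resected, niftp_is_carcinoma):
    """ROM for one diagnostic category, as a fraction.

    malignant: resected cases with malignant histology, excluding NIFTP
    niftp:     resected cases diagnosed as NIFTP on histology
    resected:  all resected cases in this category
    niftp_is_carcinoma: True for "NIFTP=Ca", False for "NIFTP≠Ca"
    """
    if resected <= 0:
        raise ValueError("resected must be a positive count")
    numerator = malignant + (niftp if niftp_is_carcinoma else 0)
    return numerator / resected


# Hypothetical counts for one category (illustration only).
rom_ca = risk_of_malignancy(malignant=12, niftp=2, resected=40,
                            niftp_is_carcinoma=True)
rom_not_ca = risk_of_malignancy(malignant=12, niftp=2, resected=40,
                                niftp_is_carcinoma=False)
print(f"NIFTP=Ca: {rom_ca:.1%}; NIFTP≠Ca: {rom_not_ca:.1%}")
```

When a category contains no NIFTP cases, the two values coincide, which is why categories without NIFTP show the same ROM under both scenarios.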
Table 1 Distribution of diagnoses of thyroid FNAs
Table 2 Percentage of FNA repetition and repetitive and final histologic diagnosis after the first FNA
When “NIFTP=Ca”, the sensitivity, specificity, PPV, NPV and accuracy of TBSRTC were 98.0%, 84.0%, 99.4%, 58.3% and 97.5%, respectively. When “NIFTP≠Ca”, the sensitivity, specificity, PPV, NPV and accuracy of TBSRTC were 98.1%, 81.5%, 99.3%, 61.1% and 97.5%, respectively (Table 4).
FNA plays an important role in the preoperative assessment of thyroid nodules (13-17). TBSRTC consists of 6 DCs, each associated with a corresponding ROM that translates directly into clinical management, and this has contributed to the wide international acceptance of TBSRTC. TBSRTC has also facilitated communication between cytopathologists and clinicians and has become highly accepted in the clinical community, as shown by its endorsement by the ATA as part of the revised 2015 ATA guidelines for the management of thyroid nodules in adults (7).
We reported 15.8% of AUS/FLUS and 38.5% of malignant cases with a ratio of 2.4. This figure conformed to the range of 1.0–3.0 recommended by Krane et al. (18), indicating the reasonable use of the AUS/FLUS category in Peking University Cancer Hospital. A ratio of AUS:M>3.0 is likely because of the overdiagnosis of AUS or underdiagnosis of M. AUS:M ratios<1.0 are mostly due to low AUS rates, and the sensitivity might decrease when the ratio is <1.0 (18). Moreover, the indeterminate cases (DC III, IV and V) accounted for 29.7% of all thyroid FNAs in our study, which was close to the proportion (1/3) reported by a comprehensive review and meta-analysis (19), demonstrating that thyroid cytopathology has limitations and that we properly applied TBSRTC.
We had a higher proportion of DC I (14.8%) in this study than the upper limit of 10% of specimens proposed by TBSRTC. In most cases, an ND/UNS result depends on the FNA operator, and a high proportion of DC I reflects poor technique in sampling, slide preparation or fixation. Previous studies have underscored that the most significant diagnostic problem associated with FNA is the inadequate extraction of material during the procedure (20). Therefore, efforts should be aimed at reducing the proportion of inadequate FNAs to a minimum, or at least to within acceptable limits (10%). It is imperative that noncytopathologist thyroid FNA operators (sonographers and surgeons in our hospital) receive adequate standardized training in FNA technique and sample preparation (21).
Table 4 Diagnostic utility of TBSRTC
An adequate specimen ensures a low false-negative rate and is dependent on the technique of the aspirator. Undoubtedly, the nondiagnostic FNA proportion reflects the skill level of the FNA performer; that is, the lower the proportion of DC I, the higher the skill level of the FNA performer. We found that the nondiagnostic rates of FNAs performed by sonographers and surgeons were similar, and this result was consistent with those in the literature (22-24), demonstrating that although surgeons are not experts in the use of ultrasound guidance (USG), FNA under USG can also be performed by surgeons in their clinical practice.
Although investigations such as repetitive FNA or molecular testing for AUS/FLUS have been proposed by the latest guidelines of the ATA (7), only 13 repetitive thyroid FNAs were performed in Peking University Cancer Hospital during the study, and there were two major reasons for the low proportion of repetitive aspirations. First, our clinicians preferred to rely on clinical and ultrasound features for patients with indeterminate thyroid FNA results, and they might recommend a diagnostic lobectomy for suspicious cases. Second, Chinese patients tend to choose another hospital for a repetitive thyroid FNA, or to have an operation directly, instead of having two FNAs in one hospital. However, 66.7% of DC III patients had definitive results via repetitive aspiration, in accordance with other studies reporting that repetitive FNA resolves the dilemma in up to 50% of AUS/FLUS (III) cases (25), demonstrating that repetitive FNA contributes to optimizing patient care.
We compared the ROMs calculated in the current study with those reported for the first and second editions of TBSRTC and with two meta-analytic reviews covering a large number of cases (Table 5). For both “NIFTP=Ca” and “NIFTP≠Ca”, we found that the ROMs in DC I, II and III were much higher than those recommended in the original and revised TBSRTC guidelines. Our observed ROMs were also slightly elevated for DC V and DC VI. The malignancy rates of DC IV were 34.5% and 31.0%, which were higher than the range in the original edition of TBSRTC (15%–30%) but were in line with the revised edition (25%–40% and 10%–40%).
There are three possible reasons for these elevated ROMs in the present study. First, the resected cases were often operated on because of worrisome ultrasound features; thus, the “surgical follow-up” bias resulted from a selected group. Our increased rates of malignancy in DC I and DC II may reflect other suspicious clinical features that prompted the election of surgical excision. Therefore, our low resection rates of DC I, II and III (10.2%, 7.6% and 20.5%, respectively) and the large number of patients who did not undergo surgery may cause these results to be biased. Second, our high malignancy rate for DC III suggests that the ROM might be higher than estimated. This underscores the importance of using this terminology carefully. However, we admit that some lesions were classified under DC III because the specimens were suboptimal. A badly smeared, fixed, or stained preparation is thus classified as AUS/FLUS for technical reasons that do not depend on the nature of the lesion itself or any associated clinical conditions (26). Third, Chinese patients tend to be more concerned about false-positive results than about false-negative results, which may pressure cytopathologists to underdiagnose FNA cases to avoid making false-positive diagnoses, and a strict triage of patients with indeterminate thyroid nodules for surgery is usually applied, which ultimately leads to low resection rates and high ROMs for indeterminate nodules.
Table 5 Comparison of risk of malignancy among different studies
According to TBSRTC II, the ROM ranges need to take into account whether NIFTP is considered a carcinoma. In our study, when “NIFTP=Ca”, the ROM for each DC was 59.5% for DC I, 41.7% for DC II, 50.0% for DC III, 34.5% for DC IV, 78.5% for DC V, and 99.4% for DC VI. When “NIFTP≠Ca”, the ROM for each DC was 57.1% for DC I, 38.9% for DC II, 50.0% for DC III, 31.0% for DC IV, 78.5% for DC V, and 99.3% for DC VI. When “NIFTP≠Ca” was assumed in the statistical analysis, the ROM for each DC changed little, and there were no significant differences in ROMs between “NIFTP=Ca” and “NIFTP≠Ca”, which differed from reports in Western practice indicating that the ROMs would decrease when “NIFTP≠Ca”. Thus, the impact of reclassification on cytopathological outcomes is limited, and the revised ROMs are not suitable for Asian patients. These results are easy to understand because we found only 4 cases of NIFTP according to the histological results. The four cases were cytologically classified as DC I, DC II, DC IV and DC VI, consistent with previous reports describing that most NIFTP cases were cytologically classified in the indeterminate categories (27,28); notably, the DC VI case supported the view that NIFTP cases could be cytologically reported as malignant (9,29). The cytological characteristics of NIFTP are not specific and show slightly or moderately expressed nuclear features of papillary thyroid carcinoma (PTC) without specific morphological features. However, the introduction of NIFTP was essential to address a subset of low-risk thyroid neoplasms that were overtreated, and we await reliable criteria that might lead to a diagnosis of NIFTP on cytological specimens. In our study, cases of NIFTP accounted for 0.4% of PTC cases, which was consistent with the incidences of NIFTP in Asian countries and considerably lower than those in Western countries (8), showing that the diagnostic threshold of PTC-type nuclear features is subjective and that rigid diagnostic criteria may have been adopted by our department. Another explanation might be that our practices for patients with indeterminate thyroid nodules are relatively conservative, resulting in a reduced incidence of NIFTP in Peking University Cancer Hospital. Furthermore, the reliability and accuracy of any reporting system are built on experience, including not only cytologic interpretations but also years of follow-up regarding cytologic and histologic correlations, and the diagnosis of NIFTP is no exception. Therefore, educational programs related to the term NIFTP, such as local seminars, should be provided to ensure that most pathologists adopt the term NIFTP in their daily practice and to enhance interobserver reproducibility for this diagnosis.
The sensitivity, PPV and diagnostic accuracy all exceeded 90.0%, indicating that thyroid FNA is an important part of preoperative diagnosis. The relatively low NPV (58.3%) showed that a thyroid nodule initially diagnosed as benign by FNA may still be malignant. Therefore, to avoid missed malignancies, nodules should be carefully reevaluated by repetitive FNA, ultrasonography, molecular analysis, or multidisciplinary discussion. Notably, a DC II result does not mean “negative for malignancy”, and patients with suspicious clinical features should have a repetitive FNA, undergo close clinical follow-up or even have a diagnostic lobectomy.
There are two main limitations of our study. First, the incidence of NIFTP was calculated from the post-surgery histological reports. If we had reviewed the slides of all cases diagnosed as encapsulated follicular variant of papillary thyroid carcinoma (eFV-PTC), some noninvasive eFV-PTC cases might have been reclassified as NIFTP, and the incidence of NIFTP would have increased. Unfortunately, we could not recheck all cases of eFV-PTC because not all such cases were specified in our histological reports; some pathologists diagnosed only PTC without a subtype. Nevertheless, we believe the impact should be minimal, considering that a symposium related to NIFTP was held in the Department of Pathology. Second, the post-surgery histological results were based on the reports in Peking University Cancer Hospital, and some patients who underwent surgery in other hospitals were not included in our follow-up.
TBSRTC is effective in the preoperative diagnosis of thyroid nodules in Peking University Cancer Hospital, with high sensitivity, PPV and accuracy for cancer diagnosis and low specificity and NPV. Therefore, to avoid missed malignancies, nodules should be carefully reevaluated by repetitive FNA, ultrasonography, molecular analysis, or multidisciplinary discussion. The incidence of NIFTP was considerably low, and there was no significant difference in the ROM between “NIFTP=Ca” and “NIFTP≠Ca”, which was different from the reports in Western practice indicating that ROMs would decrease when “NIFTP≠Ca”. Thus, the impact of reclassification on cytopathological outcomes is limited, and the revised ROMs are not suitable for Asian patients.
None.
Conflicts of Interest: The authors have no conflicts of interest to declare.
Chinese Journal of Cancer Research, 2020, Issue 2