
LungCAD: A Clinically Approved, Machine Learning System for Lung Cancer Detection


R. Bharat Rao, Jinbo Bi, Glenn Fung, Marcos Salganicoff
Siemens Medical Solutions, 51 Valley Stream Parkway, Malvern, PA 19355

Nancy Obuchowski
Quantitative Health Sciences, The Cleveland Clinic Foundation, 9500 Euclid Ave., Cleveland, OH 44195

David Naidich
Department of Radiology, New York University Medical Center, 400 East 34 Street, New York, NY 10016

ABSTRACT

We present LungCAD, a computer aided diagnosis (CAD) system that employs a classification algorithm for detecting solid pulmonary nodules from CT thorax studies. We briefly describe some of the machine learning techniques developed to overcome the real world challenges in this medical domain. The most significant hurdle in transitioning from a machine learning research prototype that performs well on an in-house dataset into a clinically deployable system is the requirement that the CAD system be tested in a clinical trial. We describe the clinical trial in which LungCAD was tested: a large scale multi-reader, multi-case (MRMC) retrospective observational study to evaluate the effect of CAD in clinical practice for detecting solid pulmonary nodules from CT thorax studies. The clinical trial demonstrates that every radiologist that participated in the trial had a significantly greater accuracy with LungCAD, both for detecting nodules and identifying potentially actionable nodules; this, along with other findings from the trial, has resulted in FDA approval for LungCAD in late 2006.

Categories and Subject Descriptors

I.5.m [Pattern Recognition]: Miscellaneous

General Terms

Algorithms

Keywords

computer aided detection, lung cancer prognosis, classification, clinical trial

1. INTRODUCTION

Lung cancer is the most commonly diagnosed cancer worldwide, accounting for 1.2 million new cases annually. Lung cancer is an exceptionally deadly disease: 6 out of 10 people will die within one year of being diagnosed. The expected 5-year survival rate for all patients with a diagnosis of lung cancer is only 15%, compared to 65% for colon, 89% for breast and 99.9% for prostate cancer. In the United States, lung cancer is the leading cause of cancer death for both men and women, causing more deaths than the next three most common cancers combined, and costs $9.6 billion to treat annually. However, lung cancer prognosis varies greatly depending on how early the disease is diagnosed; as with all cancers, early detection provides the best prognosis. At one extreme are the patients diagnosed with metastatic tumors (that have spread far from the lung), for whom the 5-year survival rate is just 2%. On the other hand, when diagnosed at an early stage, when the disease is still localized within the lung, the 5-year survival rate is 49%, and many treatment options (surgery, radiotherapy, chemotherapy) are viable. Today, only 24% of lung cancer cases are diagnosed at an early stage [1, 10].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

KDD’07, August 12-15, 2007, San Jose, California, USA. Copyright 2007 ACM 978-1-59593-609-7/$5.00.

The recent development of multidetector computed tomography (MDCT) scanners has made it feasible, in principle, to detect lung cancer at very early stages. Despite the advances in technology, many potentially clinically significant lesions still remain undetected [13]. One contributing factor is the explosion of generated data: the state-of-the-art 64-slice dual-source CT acquires up to 3,687 axial images in 30 seconds for each patient (each image must then be carefully examined by a radiologist). There is a growing consensus among clinical experts that the use of computer-aided diagnosis (CAD) software (when used as a second reader, in conjunction with the radiologist) not only offers the potential to improve the detection accuracy of a radiologist, but also to reduce mistakes related to misinterpretation [2, 11].

In order for a CAD system to be used in clinical practice in the United States, it must first receive approval from the Food and Drug Administration (FDA). All CAD systems must go through a rigorous clinical trial to receive approval (in much the same way as a new drug). A handful of CAD systems have received approval for detecting breast cancer lesions in the past 8 years. To be approved, CAD systems must show satisfactory performance in two areas. First, the principal value of CAD is determined not by its stand-alone performance, but rather by carefully measuring the incremental value of using Computer-Aided Diagnosis in normal clinical practice with the radiologist in-the-loop. Secondly, CAD systems must not have a negative impact on patient management (for instance, false positives which cause the radiologist to recommend unnecessary, and potentially dangerous, follow-ups).

Additionally, designing a trial for lung cancer detection is considerably more challenging than for breast cancer. One factor is the relative difficulty in obtaining ground truth (correct labeling) for lung cancer related lesions. Whereas in breast cancer virtually all suspicious lesions are routinely biopsied (providing definitive histological ground truth), a lung biopsy is a dangerous procedure, with a 2% risk of serious complications (including death); this makes obtaining definitive ground truth infeasible, particularly for patients being evaluated for early signs of lung cancer.

Section 2 describes some of the machine learning challenges involved in learning a classifier for detecting lung cancer. We review some of our previous solutions. Section 3 describes the clinical trial design for our LungCAD system, which includes a fairly complex mechanism for determining ground truth and measuring incremental improvement. Section 4 summarizes the experimental results of the clinical trial that has resulted in granting clinical approval for LungCAD. We conclude in Section 5 with some discussion about CAD in general and future challenges.

2. MACHINE LEARNING CHALLENGES

The LungCAD system consists of 5 stages:

1. lung segmentation to identify the lung area within the chest;

2. candidate generation, which identifies suspicious unhealthy candidate regions of interest (ROI) from a medical image;

3. feature extraction, which computes descriptive features for each candidate so that each candidate is represented by a vector x of numerical values or attributes [15];

4. classification, which differentiates candidates based on candidate feature vectors;

5. visual presentation of CAD findings to the radiologist, who then accepts or rejects them.

In this section, we focus on learning the classifier in Step 4.

Automatic learning technologies greatly reduce the time required to develop algorithms that act as “second readers”, besides improving the diagnostic accuracy. Many standard algorithms (such as support vector machines (SVM), back-propagation neural nets, kernel Fisher discriminants) have been used to learn classifiers for detecting malignant structures [2, 11]. However, these general-purpose learning methods either make implicit assumptions that are commonly violated in CAD applications, or cannot effectively address the difficulties that arise when learning a CAD system.

Non-IID Data. Traditional learning methods almost universally assume that the training samples are independently drawn from an identical, albeit unobservable, underlying distribution (the IID assumption), which is often not the case in CAD systems. Due to spatial adjacency of the regions identified by a candidate generator, both the features and the class labels of several adjacent candidates are highly correlated. This is true both in the training set and in the testing data. A batch-classification algorithm in [14] derives a probabilistic classification model by specifying an a priori guess on the candidate labels with a covariance matrix Σ that encodes the spatial-proximity-based correlations within an image. Multiple-instance learning methods [9, 3] optimize the classifier design by taking into account the fact that multiple candidates can be associated with a single malignant structure. Random effects may exist in patient images from the same hospital, or in different candidates extracted from the same patient. The approach in [7] proposes to use additional mixed-effect parameters, one for each hospital or for each patient. All these algorithms improve the classification accuracy significantly.

Unbalanced Data and Speed. In the candidate identification stage, high sensitivity (ideally close to 100%) is essential, because any cancers missed at this stage can never be found by the CAD system. Such high sensitivity potentially produces many false positives (less than 1% of the candidates are positive), making the classification problem highly unbalanced. Moreover, a CAD system has to satisfy the real-time requirement that it finishes running during the radiologist’s first read. These issues were addressed by employing effective cascaded classification frameworks as shown in [4, 5]. The method in [4] investigates a cascaded classification approach that solves a sequence of linear programs, each constructing a sparse hyperplane (linear) classifier. It incorporates the computational complexity of various features into the cascade design for time efficiency. A more recent work [5] does not follow the standard cascade procedure, where individual classifiers are optimized towards one specific stage given the candidates that survived earlier stages. Instead, it uses a novel AND-OR cascade training strategy which optimizes all of the classifiers in the cascade in parallel by minimizing the regularized risk of the entire system and providing implicit mutual feedback to individual classifiers to adjust parameter design. These cascaded approaches have been compared with the well-known cascade AdaBoost, and are superior with many additional advantages.
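The inference side of a cascade can be illustrated with a minimal sketch. It is not the linear-programming training of [4] nor the AND-OR training of [5]; it only shows how a sequence of linear stages can reject most candidates early while touching only the features each stage needs. The function name, stage layout, weights, and thresholds are all hypothetical.

```python
import numpy as np

def cascade_predict(X, stages):
    """Run candidates through a cascade of linear classifiers.

    X      : (n_candidates, n_features) feature matrix.
    stages : list of (feature_idx, w, b) tuples (hypothetical values);
             a candidate survives a stage when w . x[feature_idx] + b >= 0.
    Returns a boolean mask of the candidates that survive every stage.
    """
    alive = np.ones(X.shape[0], dtype=bool)
    for feature_idx, w, b in stages:
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break
        # Score only the surviving candidates, using only this stage's features.
        scores = X[np.ix_(idx, feature_idx)] @ w + b
        alive[idx[scores < 0]] = False
    return alive

# Toy example: stage 1 uses one cheap feature, stage 2 uses two further features.
X = np.array([[0.1, 5.0, 1.0],
              [2.0, 0.5, -1.0],
              [3.0, 4.0, 2.0]])
stages = [([0], np.array([1.0]), -1.0),
          ([1, 2], np.array([1.0, 1.0]), -1.0)]
mask = cascade_predict(X, stages)
```

In this toy run the first candidate is rejected by the cheap first stage, so its remaining features are never evaluated, which is the source of the time savings described above.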

Irrelevant and Redundant Features. When searching for descriptive features, researchers often deploy a large number of experimental image features to describe the identified candidates, which consequently introduces irrelevant and redundant features. Feature selection is essential in CAD systems. A previous LungCAD system [15] utilizes a greedy forward selection approach to select one feature at a time from the feature set according to a certain discriminant score ranking. Recent research has focused more on general sparsity treatments to construct sparse estimates of classifier parameters, such as in [6, 4]. These models control the classifier complexity by sparse-favoring regularization terms, such as the 1-norm regularization $\|w\|_1 = \sum_i |w_i|$ for a linear classifier of the form $\mathrm{sign}(w^T x)$.
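To make the effect of the 1-norm term concrete, here is a minimal sketch of a 1-norm regularized linear classifier. It is an assumption-laden illustration, not the formulation of [4] or [6]: it swaps in logistic loss and a proximal-gradient (soft-thresholding) solver, and the function names, step size, and iteration count are hypothetical choices.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_l1_logistic(X, y, lam, step=0.1, n_iter=500):
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient.

    X : (n, d) candidate features; y : (n,) labels in {-1, +1}.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        margins = y * (X @ w)
        # Gradient of mean log(1 + exp(-margin)) with respect to w.
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        # Gradient step on the loss, then shrinkage from the 1-norm term.
        w = soft_threshold(w - step * grad, step * lam)
    return w

def predict(X, w):
    """Linear classifier sign(w^T x); a score of exactly zero maps to +1."""
    return np.where(X @ w >= 0, 1, -1)
```

The shrinkage step sets small weights exactly to zero, which is how the 1-norm term performs feature selection; a larger `lam` discards more features.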

3. LUNGCAD TRIAL DESIGN

The clinical trial design is illustrated in Figure 1. The principal challenges we faced in designing the clinical trial are described below:

Measure incremental improvement: The principal value of CAD is determined not by its stand-alone performance, but rather by carefully measuring the incremental value of Computer-Aided Diagnosis in normal clinical practice, as reflected in the incremental improvement in the radiologist’s accuracy under objective evaluation.

Patient management impact: It is not enough that LungCAD improves the detection of lung cancer. It must result in a net improvement in patient management, since false positive findings lead to unnecessary follow-ups.

Ground truth: As discussed earlier, due to the unavailability of lung biopsies, an alternative method had to be devised for determining ground truth.

We retrospectively collected MDCT studies from 200 consecutive patients (mean age: 61.5 y, 56% male) who had been referred for evaluation of potential pulmonary nodules from 4 clinical sites: NYU, Univ. of Pennsylvania, Univ. of Maryland and the Cleveland Clinic. The studies were processed by an independent Contract Research Organization (CRO), BioImaging, Inc, Yardley PA. 4 studies were excluded due to respiratory or cardiac motion, or image artifacts.

Figure 1: A multicenter, Multi-Reader Multi-Case (MRMC) retrospective clinical study to assess the incremental value of LungCAD in the identification of pulmonary nodules on thoracic CT examinations (CRO = contract research organization, GR = general radiologist, CR = chest radiologist).

All 196 studies were initially evaluated by 17 board-certified general radiologists (GR) in active community practice, each using a predetermined randomized order, to detect potential nodules of diameter ≥ 3 mm. The GR’s were required to score potential nodules on a “nodule” scale, from 1 (“unlikely”) to 10 (“definite”). GR’s were also required to determine if each nodule could be identified as “actionable”, again on a 10 point scale (0-2 denoting “no followup needed”, 3-6 “indeterminate”, >6 “definite need for followup”). To illustrate, a benign calcified granuloma would be represented as a true (10), non-actionable (<3) nodule.
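The actionability scale can be written down as a small lookup. This is only a sketch of the three bands quoted above; the function name and the assumption that scores are whole numbers between 0 and 10 are ours.

```python
def actionability_category(score):
    """Map an actionability score (assumed integer, 0 to 10) to its band."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    if score <= 2:
        return "no followup needed"
    if score <= 6:
        return "indeterminate"
    return "definite need for followup"
```

Under this mapping, the calcified granuloma in the example above (actionability below 3) falls in the "no followup needed" band even though its nodule score is 10.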

Then CAD-identified potential nodules were presented to the GR’s (after eliminating nodules that had already been found by the GR), and were assessed using the same two scales. The blinded, independent reviews were re-sent to the CRO, where findings were examined by an independent fellowship-trained chest radiologist to consolidate any nodules independently found by more than one GR.

The results were then reviewed separately by 5 fellowship-trained expert chest radiologists (CR) randomly chosen from a panel of 10, each interpreting 100 studies. Expert CR’s were required to evaluate each nodule separately without knowledge of whether these had been identified by radiologists or by CAD, and to assess them on both a “nodule” and “actionability” binary decision and its rating. Further, the nodule size and the lung lobe in which each nodule was seen were also recorded. For nodule candidates to be considered true nodules (ground truth), a minimum consensus of 3 out of 5 experts was necessary.
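The consensus rule is simple enough to state as code. The sketch below assumes a hypothetical data layout in which each candidate has one boolean vote per expert; only the 3-of-5 threshold comes from the text.

```python
def consensus_truth(expert_votes, min_agree=3):
    """Mark a candidate as a true nodule when at least `min_agree` experts agree.

    expert_votes : dict mapping a candidate id to a list of booleans,
                   one vote per expert chest radiologist (hypothetical layout).
    """
    return {cid: sum(votes) >= min_agree for cid, votes in expert_votes.items()}

votes = {
    "candidate_a": [True, True, True, False, False],   # 3 of 5 agree
    "candidate_b": [True, True, False, False, False],  # only 2 of 5 agree
}
truth = consensus_truth(votes)
```

Keeping the threshold as a parameter makes it straightforward to rerun an analysis under a stricter or looser definition of expert truth.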

A note on sample size: Based on pilot studies, we assumed that at least 60% of patients would have a nodule in an average of 3 lobes, that the readers would have an average ROC area without CAD of 0.80 with moderate inter-reader variability, and that CAD would improve the ROC area by 0.025. To yield 80% power in the trial, we estimated that 17 readers and 200 patients would suffice.

4. CLINICAL TRIAL RESULTS

Ground truth was defined as having at least 3 of 5 expert chest radiologists identifying at least one nodule in a lobe (affected lobe); otherwise, lobes were labeled normal. Similarly, an actionable lobe was one in which 3 or more CR’s identified one or more actionable nodules.

A total of 1320 (≥ 3 mm) nodules were identified in 196 patients, of which 863 (65.4%) were interpreted by expert CR’s as actionable. (Unless specified otherwise, from here on all nodules will be assumed to be in the clinically relevant range of ≥ 3 mm in diameter.) 181 patients had at least 1 nodule (prevalence rate of 92.3%): only 15 patients were interpreted as normal (all lobes were normal). The 1320 nodules were detected in 525 (53.6%) of 980 (= 196 × 5) potentially evaluable lobes, of which 397 (40.5%) had at least one actionable nodule.

The primary measurement for the diagnostic accuracy of the 17 general radiologists (GR), both with and without CAD, for detecting solid pulmonary nodules, is the area under the ROC curve, using lobes as the unit of analysis. A nonparametric estimator was used to adjust for the clustered data as described in [12]. Sensitivity was defined as the probability that a GR identified at least one nodule in an affected lobe; specificity was defined as the probability that a GR did not identify a nodule in a normal lobe (i.e., correctly identified it as nodule-free).
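These lobe-level definitions can be sketched as follows. The code computes plain sensitivity, specificity, and a nonparametric (Mann-Whitney) estimate of the ROC area for a single reader; it does not implement the clustered-data adjustment of [12], and the function names and toy data are hypothetical.

```python
def sensitivity_specificity(affected, called):
    """Lobe-level sensitivity and specificity for one reader.

    affected : list of booleans, True when the lobe is affected (ground truth).
    called   : list of booleans, True when the reader marked >= 1 nodule there.
    """
    tp = sum(a and c for a, c in zip(affected, called))
    tn = sum((not a) and (not c) for a, c in zip(affected, called))
    n_pos = sum(affected)
    n_neg = len(affected) - n_pos
    return tp / n_pos, tn / n_neg

def auc_mann_whitney(affected, scores):
    """Nonparametric ROC area: P(affected score > normal score) + 0.5 * P(tie)."""
    pos = [s for a, s in zip(affected, scores) if a]
    neg = [s for a, s in zip(affected, scores) if not a]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data for five lobes: highest nodule score the reader gave in each lobe.
affected = [True, True, True, False, False]
scores = [9, 6, 3, 3, 1]
called = [s >= 5 for s in scores]          # hypothetical decision threshold
sens, spec = sensitivity_specificity(affected, called)
auc = auc_mann_whitney(affected, scores)
```

Because lobes from the same patient are correlated, a variance estimate built on this point estimate would need the clustered treatment cited in the text.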

Figure 2 shows that the 17 GR’s accuracy for identifying nodules ranged from 0.704 to 0.853 without CAD and from 0.738 to 0.883 with CAD. The most important result was that every one of the 17 GR’s had statistically significantly greater accuracy with CAD for detecting lung nodules. Assessed collectively, the GR’s mean accuracies were 0.780 and 0.828, without and with CAD, respectively (p < 0.001; 95% CI of 0.036 to 0.059), as shown in Figure 3.

Figure 2: Area under receiver operating curve with and without CAD, for actionable solid nodules. [Plot: vertical axis 0.6 to 0.9; horizontal axis readers 1 to 17; series: With CAD, Without CAD.]

Figure 3: Average nonparametric ROC curve of all 17 readers for detecting nodules without and with CAD.

Similar results were achieved for the clinically significant actionable nodules: the 17 GR’s accuracy ranged from 0.699 to 0.854 without CAD and from 0.760 to 0.880 with CAD. Again, every one of the 17 GR’s had statistically significantly greater accuracy with CAD for identifying actionable lung nodules. We stress these findings because most CAD trials demonstrate a statistically significant increase for the readers considered as a group, with only some of the readers individually having statistically significantly greater accuracy. These results are particularly significant because every GR showed statistically significant improvement for both tasks: detecting nodules, and identifying actionable nodules. Figure 4 shows the ROC performance for all 17 readers without and with CAD for identifying actionable nodules.

Figure 4: Average nonparametric ROC curve of all 17 readers without and with CAD for identifying actionable nodules.

We varied the definition of expert truth by changing the number of expert confirmations required for acceptance from any 1 CR to 2, 3, 4, or 5 expert CR’s, for both nodules and actionable nodules. With one exception, every one of the 17 GR’s showed statistically significant improvement both for detection and identification of actionable nodules with CAD. (The sole exception was the case that all 5 expert CR’s must agree about an actionable nodule, which tended to happen for fewer and more obvious actionable nodules, thus making it harder to show statistically significant improvement, but the trend was towards improvement with CAD.) In another analysis, statistical improvement in GR’s accuracy was achieved for all nodules regardless of size (≥ 3 mm).

To determine the patient management impact, we estimated the number of patients where CAD led to a positive management change (i.e., a recommendation for additional imaging studies and/or biopsy in an actionable lobe which was missed without CAD), and estimated the number of patients where CAD led to a negative management change: a recommendation for additional studies and/or biopsy in a normal lobe which was correctly diagnosed without CAD. As this is a patient-level analysis, patients with both positive and negative management changes were labelled as a positive change, under the assumption that detecting a missed nodule is more beneficial to a patient than the risk of an unnecessary follow-up (typically another imaging exam). The average number of patients with a positive management change resulting from using CAD was 24.8 (averaged across the readers), meaning that 7.9 patients (= 196/24.8) must be evaluated for a positive management change, on average. On the other hand, 12 patients had negative management changes (averaged across the 17 readers), meaning that 16.3 patients must be evaluated with CAD for a negative management change to result. As the positive management changes exceeded the negative management changes on average, this was sufficient, even without considering that on average positive management changes are more beneficial than negative management changes are harmful.
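The patient-level rule and the two "patients per change" figures can be reproduced with a few lines. The function names are hypothetical; the inputs (196 patients, 24.8 positive and 12 negative changes on average) are the values reported above.

```python
def patient_level_change(has_positive, has_negative):
    """Patient-level label: a positive change takes precedence over a negative one."""
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return "none"

def patients_per_change(n_patients, avg_changes):
    """Average number of patients evaluated with CAD per one management change."""
    return n_patients / avg_changes

per_positive = patients_per_change(196, 24.8)  # about 7.9 patients
per_negative = patients_per_change(196, 12)    # about 16.3 patients
```

A smaller value means the change occurs more often, so roughly one patient in eight gained a positive management change against roughly one in sixteen with a negative one.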

Additional details on the multi-reader, multi-case (MRMC) statistical methodology used are provided in our submission and in [16]. The LungCAD clinical trial summary of safety and effectiveness [8] (which is available on the FDA’s web site) contains many more results and analyses, including: patient-level analysis of GR’s increase in accuracy with CAD, and bootstrap sampling to estimate variability of expert CR’s.

5. DISCUSSION

To summarize our clinical results, CAD is an effective second reader, both for detecting nodules and for identifying potentially actionable nodules. The false positive rate is acceptably low given the increased rate of positive management changes. These findings have resulted in LungCAD being granted clinical approval by the FDA for detecting solid pulmonary nodules from CT thorax studies. Although some debate remains about the precise value of screening (for breast cancer, and now for lung cancer), all experts agree that early detection is key for improvement of cancer cure rates. Many efforts are ongoing to pave the way for MDCT to be used for identifying lung cancer at early stages.

However, much remains to be done in this area. First, our study focused on solid pulmonary nodules; in high risk patients, part-solid and ground-glass nodules (GGN) are also seen on chest MDCTs. GGNs are defined as nodules with hazy attenuation without obscuration of underlying vascular markings, and will necessitate the development of improved machine learning and image processing methods to detect.

Our focus in this study has been to detect pulmonary nodules. However, the eventual goal is not just to detect nodules, or even to detect actionable nodules, but to detect lung cancer in early stages, and thereby intervene, treat the patient, and improve survival. Therefore, CAD needs to move, in the next few years, from detecting nodules to classifying nodules as benign or malignant. A first step could be to report the probability of malignancy, although the clinical and regulatory challenges to design a trial to prove the efficacy of such a system would be daunting (larger sample size is not the answer - our study took nearly two years to complete - and the FDA is already taking steps to reduce the regulatory burden, while ensuring the safety and efficacy of CAD). An even more intriguing notion would be to identify lesions that are currently benign, but would have a high probability of turning malignant - pre-cancerous lesions - to move from a reactive paradigm of treating cancer to a more proactive paradigm of prevention.

We have described some machine learning challenges in the lung CAD domain and reviewed some of our previous machine learning work. Our methods are not specific to lung cancer only, and have shown equivalent or superior performance on other data sets. For instance, the PECAD (Pulmonary Embolism) problem (that formed the basis of the 2006 KDD Cup) is very different in its evaluation criteria; treatment of PE is systemic (as opposed to localized in lung cancer) and the goal is to identify patients as having one or more PE’s or being PE-free. In the ColonCAD problem, the goal is to detect all pre-cancerous polyps; the cost of a false positive is not very high, and the treatment of choice is to remove all potentially suspicious lesions. Yet, despite the very different optimization criteria and the vastly different medical domain knowledge, many of the machine learning methods described here also translate to these and other CAD problems.

6. REFERENCES

[1] American Lung Association. Trends in lung cancer morbidity and mortality report. 2006.

[2] S. G. Armato III, M. L. Giger, and H. MacMahon. Automated detection of lung nodules in CT scans: preliminary results. Medical Physics, 28(8):1552–1561, 2001.

[3] J. Bi and J. Liang. Multiple instance learning of pulmonary embolism detection with geodesic distance along vascular structure. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007.

[4] J. Bi, S. Periaswamy, K. Okada, T. Kubota, G. Fung, M. Salganicoff, and R. B. Rao. Computer aided detection via asymmetric cascade of sparse hyperplane classifiers. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.

[5] M. Dundar and J. Bi. Joint optimization of cascaded classifiers for computer aided detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007.

[6] M. Dundar, G. Fung, J. Bi, S. Sandilya, and R. B. Rao. Sparse Fisher discriminant analysis for computer aided detection. In Proceedings of SIAM International Conference on Data Mining, 2005.

[7] M. Dundar, B. Krishnapuram, J. Bi, and R. B. Rao. Learning classifiers when the training data is not IID. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007.

[8] Food and Drug Administration. Siemens Syngo lung CAD summary of safety and effectiveness, PMA No. 0500022. October 2006.

[9] G. Fung, M. Dundar, B. Krishnapuram, and R. B. Rao. Multiple instance algorithms for computer aided diagnosis. In Advances in Neural Information Processing Systems, 2006.

[10] A. Jemal, R. Siegel, E. Ward, T. Murray, J. Xu, and M. J. Thun. Cancer statistics. CA Cancer J. Clin., 57:43–66, 2007.

[11] D. P. Naidich, J. P. Ko, and J. Stoechek. Computer aided diagnosis: Impact on nodule detection amongst community level radiologist. A multi-reader study. In Proceedings of CARS 2004 Computer Assisted Radiology and Surgery, pages 902–907, 2004.

[12] N. A. Obuchowski. Nonparametric analysis of clustered ROC curve data. Biometrics, 53:170–180, 1997.

[13] S. J. Swensen, J. R. Jett, T. E. Hartman, D. E. Midthun, S. J. Mandrekar, S. L. Hillman, A.-M. Sykes, G. L. Aughenbaugh, A. O. Bungum, and K. L. Allen. CT screening for lung cancer: five-year prospective experience. Radiology, 235(1):259–265, 2005.

[14] V. Vural, G. Fung, B. Krishnapuram, J. Dy, and R. B. Rao. Batch-wise classification with applications to computer aided diagnosis. In Proceedings of European Conference on Machine Learning, 2006.

[15] M. Wolf, A. Krishnan, M. Salganicoff, J. Bi, M. Dundar, G. Fung, J. Stoeckel, S. Periaswamy, H. Shen, P. Herzog, and D. P. Naidich. CAD performance analysis for pulmonary nodule detection on thin-slice MDCT scans. In H. Lemke, K. Inamura, K. Doi, M. Vannier, and A. Farman, editors, Proceedings of CARS 2005 Computer Assisted Radiology and Surgery, pages 1104–1108, 2005.

[16] X.-H. Zhou, N. A. Obuchowski, and D. K. McClish. Statistical Methods in Diagnostic Medicine. Wiley, New York, NY, 2002.
