
Self-supervised artificial intelligence predicts poor outcome from primary cutaneous squamous cell carcinoma at diagnosis

2025-02-16 01:51:02

Author: Carucci, John A.

Introduction

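Cutaneous squamous cell carcinoma (cSCC) is the second most common human cancer, with an estimated incidence of over 1 million cases in the United States, and its worldwide incidence has been increasing over the past 20 years1,2,3,4. Although most cSCCs have a good prognosis, a subset of tumors is associated with poor outcomes (PO)5,6, including local recurrence (LR), nodal metastasis (NM), distant metastasis (DM) and disease-specific death (DSD). cSCC accounts for the majority of keratinocyte carcinoma (KC) deaths in the United States (US); it is estimated that about 10,000 patients die from cSCC each year, a DSD rate similar to that of other cancers such as leukemia, non-Hodgkin's lymphoma and melanoma7. Approximately 5% and 2% of patients develop nodal metastasis and cSCC-specific death, respectively8,9. Although POs are rare, these values are likely underestimates, because there are no national cancer registry data for cSCC with which to accurately assess incidence and prevalence. Moreover, we must recognize that a relatively small proportion of poor outcomes among a large number of patients with cSCC translates into a large number of patients with metastasis and disease-specific death. This in turn leads to a significant burden of morbidity, mortality and public health costs, which could be avoided if the highest-risk patients were identified early in their course and received appropriate adjuvant interventions10. In addition, many epidemiological studies11,12 combine data for keratinocyte carcinomas (basal cell and squamous cell carcinomas), so estimates of incidence and prevalence vary widely. As the incidence of cSCC increases, POs are likely rising as well, making this an underrecognized public health issue.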

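Identifying patients with poor prognosis at the time of diagnosis can be challenging. Commonly used cSCC staging systems include the American Joint Committee on Cancer Staging Manual, 8th edition (AJCC 8)13,14 and the Brigham and Women's Hospital (BWH) staging system. The AJCC 8 system uses the following factors for cSCC tumor (T) staging: tumor size, deep invasion, perineural invasion (PNI) or evidence of bone invasion. The BWH staging model includes tumor size, poor differentiation, PNI and extension beyond the subcutaneous fat, and assigns a T stage according to the number of high-risk features present. While most POs occur in high-stage tumors (BWH T2b and AJCC 8 T3 and above), 25% of POs still occur in low-stage tumors, particularly T2a/T2, highlighting the heterogeneity of outcomes and the accompanying difficulty of risk stratification within specific subsets15. Predicting which patients will have a poor prognosis, even among lower-stage tumors, is therefore essential to inform management in terms of workup, treatment, post-operative surveillance and early implementation of adjuvant therapy16,17. Importantly, how to identify which patients in this low-stage subgroup are at risk of POs remains uncertain in the literature and is still under urgent investigation.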

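Gene expression has predictive power18,19; however, such techniques may not be accessible to all patients and are associated with high cost. On the other hand, the role of visual inspection of the distinctive histopathological features found in cSCC in predicting the risk of progression, and in predicting the risk of PO in low-stage tumors, still requires further elucidation20. Microscopic analysis of hematoxylin and eosin (H&E)-stained tissue is the gold standard for diagnosing cSCC and can be used to identify high-risk features such as depth of invasion, PNI and degree of differentiation. However, light microscopy alone has a limited role in determining prognosis, given the potential for sampling error and the inherent inter-reader variability in assessing histopathological features. Recently, supervised machine learning algorithms have been used in the field of cutaneous melanoma to identify prognostically important features, response to immunotherapy, one-year disease-free survival and mutation prediction, with encouraging results21,22,23,24. However, research on machine learning for KCs such as squamous cell carcinoma is currently limited. Few studies have investigated the use of artificial intelligence (AI) on whole slide images (WSIs) of cSCC: recently, Knuutila et al.25 trained a supervised deep-learning architecture to show that AI can predict the risk of metastasis from slides of primary cSCC tumors, but this approach is not suited to describing which features the classifier used.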

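In this paper, our goal is to investigate the histomorphological features associated with cSCC outcomes with a self-supervised deep-learning approach, using WSIs of shave, punch or Mohs biopsy specimens from primary cSCC tumors. While supervised approaches are trained to directly learn specific labels (which are often time-consuming to obtain, require expertise and can introduce bias)26,27,28, self-supervised approaches enable the self-discovery of common patterns in unlabeled datasets29,30,31,32. Supervised approaches are also often described as black boxes whose decision process is difficult to interpret33,34, which, despite efforts to develop interpretation strategies35, hinders their acceptance by humans and by regulatory approval organizations34. It is therefore essential to investigate unbiased and interpretable strategies for outcome prediction, and the self-supervised learning paradigm currently offers an excellent opportunity for AI development in research and medicine36. In this study, we used 163 patients from three academic institutions as a development cohort and 563 patients from two other institutions as test cohorts (Supplementary Fig. 1a), and built the core of our study on the Histomorphological Phenotype Learning (HPL) pipeline29. This pipeline (Fig. 1) was recently developed in a lung cancer study and was shown to cluster important WSI features in a self-supervised manner, showing promise for subtype classification and survival prediction29. In addition, HPL provides an additional layer of interpretability. A supervised approach was also used as a baseline comparison and as a means to extract further information from the images (Supplementary Fig. 1b).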

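Fig. 1: Adaptation of the self-supervised Histomorphological Phenotype Learning pipeline to study cutaneous squamous cell carcinoma.

a Slides are first tiled into smaller images of 224 × 224 pixels at 0.5 µm/pixel (equivalent to 20× magnification). b A subset of these tiles is used to train the self-supervised Barlow-Twins architecture. c Once trained, all tiles from the three cohorts are projected into the trained network to extract their tile vector representations z, a 128-dimensional vector encoding of each image. d These vector representations are then over-clustered with the Leiden method to obtain homogeneous clusters (called histomorphological phenotype clusters, HPCs) and to visually identify artifacts among the tissue representations. In the UMAP of the tile vector representations z, each dot represents a tile and each color a different HPC. e Tiles belonging to HPCs highly enriched in artifacts are removed from the study. f The cleaned dataset then undergoes a more detailed analysis with a new Leiden clustering. The UMAP of the cleaned tile vector representations z shows 26 HPCs, corresponding to 26 sets of self-identified phenotypes, along with representative tiles of the top 5 clusters corresponding to the example slide in panel (c). g The resulting HPCs can then be used to generate heatmaps showing a simplified representation of the slides, and are analyzed to identify potential correlations between the phenotypes identified by the self-supervised approach and patient outcomes. Here, the heatmap corresponding to the portion of the example slide in panel (a) is shown, with the top 5 clusters numbered and corresponding to the clusters in panel (f). All tiles are shown after Reinhard's color normalization47.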

Results

Self-supervised learning highlights histomorphological phenotypes associated with poor and good outcomes

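Slides from 163 patients used as the development cohort were collected from three institutions (Supplementary Table 1, Supplementary Fig. 1a): New York University (NYU), the University of California San Francisco (UCSF) and the Brigham and Women's Hospital (BWH), together with clinical information on whether the patients developed good or poor outcomes (see the Methods section and Supplementary Fig. 2). As external cohorts, slides were received from 153 patients from the Complejo Asistencial Universitario de Salamanca (CAUSA) and from 410 patients from the Mayo Clinic in Arizona (Mayo).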

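Clinical outcome data were annotated as binary labels of good and poor, with good outcome indicating no evidence of disease at the most recent follow-up and PO representing local recurrence, nodal metastasis, distant metastasis or disease-specific death. In addition, disease-free survival (DFS) data were available for the NYU and UCSF development datasets (median follow-up of 38.0 and 32.7 months, respectively) and for the CAUSA and Mayo test datasets (median follow-up of 51.5 and 41.9 months, respectively), allowing us to fit Cox regression models on these cohorts. Further patient information regarding age, tumor stage and size was not available for the UCSF and BWH cohorts. To objectively identify phenotypes with the potential to predict PO, we used a pipeline based on the Barlow-Twins self-supervised approach37, as described in Fig. 1 (see Methods for details). This pipeline has been shown to successfully identify meaningful phenotypes in lung adenocarcinoma (such as the different histological subtypes and tissue types) and to link them to overall and recurrence-free survival29. The Barlow-Twins model consists of two identical encoders that are fed batches of samples derived from the same tiles but distorted in different ways. By attempting to minimize the difference between the empirical cross-correlation of the embeddings of these two networks and a target cross-correlation, the network produces tile vector representations that are more similar to each other when the corresponding images share similar features. To simplify the analysis of the vector representations, tiles sharing similar phenotypes were grouped into HPCs (histomorphological phenotype clusters) using the Leiden algorithm, a community detection algorithm which, compared with hierarchical clustering, has the advantage of taking the connections between entities into account.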
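The objective described above can be illustrated with a minimal sketch in plain Python. It assumes the standard Barlow-Twins formulation, in which the target cross-correlation is the identity matrix; the function name `barlow_twins_loss` and the off-diagonal weight `lambd` are illustrative choices, not values taken from this study.

```python
import math


def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """Barlow-Twins-style loss for two batches of embeddings.

    z_a, z_b: lists of N embedding vectors (each of dimension D) obtained
    from two differently distorted views of the same N tiles.
    lambd: weight of the off-diagonal (redundancy) term; illustrative value.
    """
    n, d = len(z_a), len(z_a[0])

    def standardize(z):
        # Normalize every embedding dimension to zero mean and unit
        # variance over the batch; returns D columns of length N.
        cols = []
        for j in range(d):
            col = [row[j] for row in z]
            mean = sum(col) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
            cols.append([(v - mean) / std for v in col])
        return cols

    a, b = standardize(z_a), standardize(z_b)
    on_diag, off_diag = 0.0, 0.0
    for i in range(d):
        for j in range(d):
            # Empirical cross-correlation between dimension i of view A
            # and dimension j of view B.
            c_ij = sum(x * y for x, y in zip(a[i], b[j])) / n
            if i == j:
                on_diag += (1.0 - c_ij) ** 2   # pull diagonal toward 1
            else:
                off_diag += c_ij ** 2          # push off-diagonal toward 0
    return on_diag + lambd * off_diag
```

With identical views whose embedding dimensions are uncorrelated, the cross-correlation matrix equals the identity target and the loss is zero; any mismatch between the two views raises it.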

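Following the strategy explained in the Methods section to reduce over-fitting, we froze the Leiden cluster configuration corresponding to a resolution of r = 0.75, which splits the dataset into 26 HPCs (Fig. 2a). The PAGA (partition-based graph abstraction) representation (Fig. 2b) illustrates how clusters at the top of the graph and of the corresponding UMAP (Fig. 2c) appear to be more enriched in tiles associated with a risk of poor outcome. As expected, at the selected resolution the phenotypes are present in different proportions and the relative sizes of the HPCs vary considerably (Supplementary Figs. 3a, 4a, 5a). However, patients are well represented and, in most patients, many of the phenotypes are present to varying degrees (Supplementary Figs. 3b, 4b, 5b). With the exception of HPC 25, similar observations were made regarding the representation of the different institutions (Supplementary Figs. 3c, 4b, 5b, 6b).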

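Fig. 2: The unsupervised approach yields clusters enriched in tiles from patients with poor outcome, well represented across the three cohorts, and predicts disease-free survival while highlighting the tile clusters important for this prediction.

a UMAP with the 26 Leiden clusters found at a resolution of 0.75. b PAGA representation of the Leiden clusters with node connections. The size of a node is proportional to its number of tiles, and its color to the proportion of tiles associated with patients with good/poor outcome. c UMAP colored according to whether tiles are associated with patients with good/poor outcome (green/orange). Each dot is a tile. d Univariate analysis comparing the prediction of RFS on the development cohort (NYU+UCSF; average over the 3-fold cross-validation) and on the external cohorts (CAUSA, Mayo). A c-index below 0.5 (green) indicates a lower risk of poor outcome, while a c-index above 0.5 (orange) indicates a higher risk of poor outcome. e Details of panel d for two clusters for which the development and external cohorts show the same trend (see Supplementary Fig. 6 for all clusters). Error bars show confidence intervals. f Projection on the PAGA of the HPCs showing coherent trends between the cross-validation of the development cohort and the external cohorts. g Kaplan–Meier curves, using Cox regression on the development cohort (NYU+UCSF), of patients predicted by the unsupervised HPL approach to be at high or low risk of poor outcome. The first row is computed using the whole dataset, while the second and third rows show only the subsets of patients with stage T2a (BWH staging) and T2 (AJCC staging) tumors. Error bars show the 95% confidence interval (CI). The hazard ratio and its 95% CI are shown in parentheses (log-rank). The median value computed on the whole dataset is used to separate low-risk from high-risk patients. h Same as g, but using Mayo as the test cohort. i Same as g, but using CAUSA as the test cohort.

Using the HPL pipeline, each patient can be described by the percentage of tiles contained in each HPC, which can in turn be used as input to regression models to estimate the outcome-prediction power of the HPCs.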

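To evaluate the variability and the influence of the Leiden clustering on outcome prediction, we also ran three-fold cross-validated logistic regressions for the binary classification of poor versus good outcome (Supplementary Fig. 6d) and Cox regressions for survival prediction analysis (Supplementary Fig. 6e) at various resolutions. We observed that the resolution of r = 0.75 selected above also allows successful prediction with both approaches while preventing over-fitting.

Self-supervised learning identifies histomorphological phenotype clusters associated with survival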

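As mentioned above, each patient can be described by the proportion of tiles assigned to each HPC, and this simplified phenotypic description can be used to study outcome predictability. Running a univariate c-index analysis on each HPC, we observed that some HPCs are associated with poor or good outcomes (Fig. 2d), and for 10 of these HPCs the trend was confirmed in both external cohorts (Fig. 2e, Supplementary Fig. 7). Projected onto the PAGA representation (Fig. 2f), those associated with poor outcomes are located in the upper part of the graph, and those associated with better outcomes in the lower part.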
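As an illustration of what a univariate c-index analysis measures, the sketch below computes Harrell's concordance index in plain Python, using the fraction of a patient's tiles that fall in one HPC as the risk score. The function name and the toy data are hypothetical; the study itself relies on the HPL pipeline and Cox regressions rather than on this code.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time of each patient.
    events: 1 (or True) if the poor outcome was observed, 0 if censored.
    risk_scores: here, the fraction of tiles assigned to a single HPC.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before the follow-up time of patient j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in the score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: months to event or censoring, and HPC fractions.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
hpc_fraction = [0.9, 0.6, 0.3, 0.1]
c = harrell_c_index(times, events, hpc_fraction)
```

A value above 0.5 means that patients with a larger share of tiles in the HPC tend to have events earlier, which matches the reading of Fig. 2d; a value below 0.5 points to the opposite trend.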

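Using a multivariate logistic regression approach on the whole HPC vector describing the HPC distribution, we investigated the potential of these HPCs to predict the binary outcome. The good versus poor outcome binary classification run on the 3 development datasets led to an average 3-fold cross-validation AUC of 0.724 (validation sets) and 0.689 (test sets; Supplementary Fig. 8a). Although the performance was lower on the CAUSA cohort (AUC ~ 0.554), it was excellent on the Mayo external cohort, which contains Mohs excisions, shave and punch biopsies (AUC ~ 0.844; Supplementary Fig. 8a). Because it is based on a self-supervised approach, this result allows an unbiased selection of the tile clusters identified as most relevant to the classification. It also provides an easy way to identify the phenotypes important for such a prediction, for example through forest plot analysis, which shows the contribution of the main clusters to the prediction (Supplementary Fig. 8b).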

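As a comparison, we trained a supervised network, Inception v3, which relies on selected labeled regions. Manual free-text annotations based on a consensus protocol were made through Aperio's ImageScope interface by three board-certified Mohs micrographic surgeons (M.C., S.R.J., R.W.) and a senior dermatology resident (M.J.). Consensus was reached if all reviewers agreed on the annotated features. We followed a pipeline similar to the one used by Johannet et al.22 in their study of melanoma. Here, we first trained the algorithm (Supplementary Fig. 8c, e, Supplementary Tables 2, 3) to identify the manually annotated regions (normal skin, squamous cell carcinoma in situ, invasive squamous cell carcinoma, other and artifacts). Annotations in the "other" group included dermis, fat, glandular tissue, smooth muscle, cartilage, inflammatory infiltrate, and the presence of ortho- and hyperkeratosis. Artifact annotations included blank space, air bubbles or pen marks. Second, we examined whether a selected region (invasive cSCC) or a set of regions of interest (normal, in situ, invasive) could be used to predict the binary outcome (Supplementary Fig. 8d, f). On the development cohort, this supervised approach achieved AUC = 0.675 on invasive cSCC and 0.671 on the regions of interest (normal, in situ and invasive combined). On the CAUSA and Mayo test cohorts, it achieved AUC = 0.576 and 0.711, respectively, on invasive cSCC, and AUC = 0.598 and 0.726, respectively, on the regions of interest. Besides performing slightly worse than the self-supervised approach (Supplementary Fig. 9 and further metrics in Supplementary Tables 4, 5), this two-step approach relies on manual annotation, depends on the choice of the regions from which the prediction should be made, and lacks a direct interpretation of how the model reaches its decision or which subset of phenotypes it uses. All three bottlenecks are addressed by the self-supervised approach described in the next section.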

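More interestingly, for the datasets with available DFS data, we ran Cox regressions and obtained a Harrell's c-index of 0.73 (Uno's c-index of 0.72) with a DFS prediction p-value of 2.2e−4 on the cross-validation cohort (Fig. 2g), a Harrell's c-index of 0.84 (Uno's c-index of 0.83) with a p-value < 0.0001 on the Mayo cohort (Fig. 2h), and a Harrell's c-index of 0.62 (Uno's c-index of 0.62) with a p-value of 2.4e−2 on the CAUSA cohort (Fig. 2i). Forest plots (Supplementary Fig. 10a) and SHAP plots (Supplementary Fig. 10b) were computed to explain the contribution of the different HPCs to this prediction and to understand which phenotypes influence the model's predictions. As expected, most of the HPCs identified by the univariate approach also appear relevant in this multivariate approach. The approach is particularly effective at differentiating poor from good outcomes in the subsets of BWH T2a and AJCC-8 T2 tumors (Fig. 2g–i), which currently show substantial outcome heterogeneity that makes prognostication challenging. Note that Kaplan–Meier plots inferred from the poor-outcome probabilities produced by the supervised classifier showed lower overall performance (Supplementary Figs. 11, 12). This self-supervised approach may have the potential to address a gap in current staging systems caused by the relative lack of outcome homogeneity in the AJCC T2 and BWH T2a groups.

Cluster interpretation highlights histomorphological phenotypes associated with increased likelihood of local recurrence, metastasis and good outcome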

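The self-supervised approach of HPL offers interpretability, allowing us to determine which HPCs weigh toward a lower or higher overall risk of PO in our Cox regression model (DFS), or toward good or poor outcome in the logistic regression model. First, 100 random tiles were selected from each HPC and visually analyzed by three board-certified Mohs micrographic surgeons (M.C., S.R.J., R.W.) and a senior dermatology resident (M.J.). Details of these observations are shown in Supplementary Table 6 (see Supplementary Fig. 13 for a subset of the randomly selected tiles). When projecting these observations onto the partition-based graph abstraction (PAGA, Fig. 3), we observed that HPCs sharing similar phenotypes are connected within specific and coherent regions, confirming the coherence of the HPL representation. Similar further validation was obtained on the external cohorts (Supplementary Figs. 14, 15).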

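Fig. 3: PAGA graph showing the coherent organization of the features found on cutaneous squamous cell carcinoma whole slide images.

Annotations provided by a group of Mohs surgeons on tiles randomly selected from the development cohort (NYU + UCSF + BWH) for each HPC (annotations taken from Supplementary Table 6), projected onto the PAGA graph of Fig. 2b.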

In Fig. 4 and Fig. 5, we show examples of HPCs and tiles associated with the Cox model of Fig. 2g.

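Fig. 4: Examples of tiles from HPCs associated with a higher risk of poor outcome.

a Examples of tiles randomly selected from some of the HPCs contributing to the prediction of a risk of poor outcome. b, c Examples of data from patients who recurred shortly after surgery (10.5 months, local recurrence) and several years after surgery (46 months, nodal metastasis). For each case, a small portion of the original slide is shown along with the corresponding heatmap and the associated SHAP decision plot. The colors of the heatmap show the HPC associated with each tile, with the proportion of tiles belonging to each HPC shown in the legend (percentages computed over all slides available for each patient). The top of the SHAP decision plot shows the predicted value, which determines the color of the curve. Reading from bottom to top, the SHAP values of each HPC are accumulated, and HPCs are sorted by absolute SHAP weight. On the right, the proportion of tiles associated with each cluster is shown on a log10 scale. All tiles are shown after Reinhard's color normalization47.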

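Fig. 5: Examples of tiles from HPCs associated with a lower risk of poor outcome.

a Examples of tiles randomly selected from some of the HPCs contributing to the prediction of a good outcome. b Analysis of the interactions between HPCs, showing two groups of HPCs that tend to be adjacent on the slides. Each column shows the normalized proportion of interactions between the tiles of a given HPC and the HPCs associated with their adjacent tiles. The dendrogram corresponds to a bi-hierarchical clustering of the HPCs. c, d Examples of data from patients who did not recur and were followed for more than three years. For each case, a small portion of the original slide is shown along with the corresponding heatmap and the associated SHAP decision plot. The colors of the heatmap show the HPC associated with each tile, with the proportion of tiles belonging to each HPC shown in the legend (percentages computed over all slides available for each patient). The top of the SHAP decision plot shows the predicted value, which determines the color of the curve. Reading from bottom to top, the SHAP values of each HPC are accumulated, and HPCs are sorted by absolute SHAP weight. On the right, the proportion of tiles associated with each cluster is shown on a log10 scale. All tiles are shown after Reinhard's color normalization47.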

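In Fig. 4a, we show tiles randomly selected from the HPCs with the highest impact on the c-index for predicting a higher risk of PO, while in Fig. 4b, c, heatmaps from patients with PO show the HPC composition projected onto portions of the WSIs. For the patient in Fig. 4b, for example, the tumor recurred after 10.5 months, and the associated WSI shows a high enrichment in HPCs 1, 6 and 20, the latter two having high SHAP log hazard ratio values, synonymous with a higher risk of PO. HPC 6 shows deep invasion of poorly differentiated keratinocytes with mitoses. HPC 20 shows poorly differentiated and pleomorphic keratinocytes with mitoses, and HPC 1 similarly shows poorly differentiated keratinocytes, features that previous studies have associated with POs such as local recurrence38.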

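The WSI in Fig. 4c shows a high presence of HPCs 0, 1, 5, 6 and 13, the latter two showing high SHAP log hazard ratio values. HPC 13 shows some pleomorphic features along with deeply infiltrating tumor cells, a feature associated with POs. Similarly, HPC 6 exhibits deep invasion, poor differentiation and marked atypia, factors considered to contribute to POs38. The SHAP decision plots generated in our study show how the enrichment or depletion of tiles from certain HPCs drives the decision for a patient whose tumor recurred shortly after surgery (LR, Fig. 4b) and for a patient who recurred 46 months after surgery (NM, Fig. 4c).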

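In Fig. 5a, we show randomly selected tiles from the HPCs with the highest impact on the c-index for predicting a low risk of PO. In Fig. 5b, an interaction analysis, performed by examining, for each tile belonging to a given HPC, the HPCs associated with its adjacent tiles, shows groups of HPCs that tend to be adjacent on the slides. Two groups of HPCs identified as carrying a lower risk of PO were found to have relatively high interactions: HPCs 7, 12 and 16 form one group, and HPCs 3, 8 and 24 interact as another. Common to all these HPCs is the presence of well-differentiated keratinocytes and the absence of atypia or pleomorphism, which is expected for tumors with good outcomes. In Fig. 5c, d, we show slides associated with two patients who were followed for more than three years without disease. These slides show the presence of HPCs 3, 8 and 24 seen in Fig. 5b, as well as HPCs 7, 12 and 16. Likewise, good differentiation, lack of pleomorphism and relatively non-specific hyperkeratosis can be attributed to these findings.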

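Because the HPCs associated with good outcomes are those with normal-appearing keratinocytes, slide selection could affect the resulting percentages of tiles and therefore the final classification. To explore this, we first generated a dendrogram with a heatmap showing the HPC composition of all patients with multiple slides, with hierarchical clustering (Supplementary Fig. 16a), demonstrating that in most cases slides belonging to the same patient cluster close together and show a similar overall composition. Then, for each patient, we computed the variance across their own slides of the percentage of tiles associated with each cluster (intra-patient variance) and compared it with the variance between patients (inter-patient variance). Supplementary Fig. 16b, c shows that, for most patients and most HPCs, the ratio of inter-patient to intra-patient variance is above 1, meaning that slides belonging to the same patient are more homogeneous. Finally, we computed decision plots per slide rather than per patient (Supplementary Fig. 16d), showing that conclusions differ only slightly between slides of a given patient.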

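Other differences and potential confounders between the development and test sets are the type of biopsy and the anatomical site of origin. While the NYU cohort has almost as many slides from excisions as from shave biopsies, the CAUSA test cohort consists only of wide local excisions and the Mayo cohort, on the other hand, is dominated by shave biopsies (Supplementary Table 1). We note that most HPCs contain tiles from all types of preparation (Supplementary Figs. 3e, 4d). In a few rare HPCs, the relative proportion of tiles varies more between excisions and biopsies (Supplementary Figs. 3d, 4c): for example, HPCs 1 and 13, located at the top left of the PAGA graph and containing subcutaneous fat/tissue, are relatively less represented in shave biopsies than in excisions, which is consistent given that shave samples are thinner than excisions, which are usually full-thickness. While the survival prediction pipeline appears to perform well on both shave and excision biopsies of the Mayo cohort (Supplementary Fig. 17a–c), we cannot rule out that, with a larger training set, biopsy-type-specific Cox regression approaches would lead to better performance, particularly for wide local excision specimens such as those used in the CAUSA cohort. Finally, because anatomical sites are so diverse relative to the number of patients available, conclusions about potential biases are more difficult to draw, but we also note that most HPCs contain at least a few tiles from most sites (Supplementary Figs. 3f, g, 4e, f, 5c, d). At this stage, we did not find any influence of the tissue source on survival prediction, although larger cohorts per site of origin would be needed (Supplementary Fig. 17d, e).

Discussion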

In this study, we demonstrate that the self-supervised approach of HPL can be successfully applied to analyze sets of cSCC WSIs from initial biopsy samples from different institutions, grouping in a coherent manner a variety of histopathological features linked to good and POs. The ability to obtain prognostic information from biopsy slides alone is significant as it may guide clinical decision making regarding treatment and surveillance of patients with potential PO. The HPCs identified were used to predict the good versus poor outcome with an area under the curve (AUC) of ~0.7 and the disease-free survival (DFS) with a c-index of 0.73 (p-value = 2.2e−4) on the cross-validation of the development cohort, and AUC ~ 0.58–0.73 and c-index ~ 0.62–0.84 on the test cohorts. The performance remains compelling in a subset of AJCC-8 T2 and BWH T2a tumors (c-indexes of 0.85 and 0.71, respectively, on the cross-validation cohort; 0.96 and 0.75, respectively, on the Mayo cohort; and 0.56 for the BWH T2a on the CAUSA cohort). The performance achieved here can also be placed in the context of other methods with which it could potentially be combined to further refine the precision of the outcome. For example, Zhao et al.19 showed that protein expression of AXIN2 and SNAIL has a c-index of 0.69 in predicting recurrence-free survival, and that, although their available clinicopathological data alone had little prediction power (c-index of 0.40), the c-index increased to 0.75 when combining clinicopathological data with the protein expression. Using supervised deep-learning architectures, few studies have explored the predictability of cSCC from WSIs, all of which were unable to pinpoint which features were used by the algorithm to make the decision.
Focusing on prediction of metastasis in a cohort of 104 patients harboring cSCC, Knuutila et al.25 achieved AUCs within 0.629–0.689, performing better (AUC = 0.747) when restricting the study to those that recurred rapidly (within 180 days). This study was done using either the whole slide or tumor regions manually annotated by pathologists. On the other hand, using two cohorts of 54 melanoma patients, Comes et al.24 achieved AUCs = 0.667–0.695 in predicting the one-year disease-free survival using regions of interest manually pre-selected by pathologists. In addition to its performance, an advantage of the HPL pipeline, initially developed on lung cancer29, is that its training is self-supervised and does not require any manual pre-annotations. Furthermore, it provides an additional layer of interpretation highlighting which phenotypes weighed in favor of higher or lower risk prediction.

In our study we found that enrichment in HPCs 3 and 8 correlated with a prediction of good outcome from cSCC. Both HPCs demonstrate well-differentiated keratinocytes, lack of atypia or pleomorphism, and the relatively non-specific presence of hyperkeratosis. Alternatively, enrichment in HPCs 7 and 16, which feature well-differentiated cells and lack pleomorphism, deep invasion, or significant atypia, also correlated, to a lesser extent, with a lower risk of PO. Additionally, among the HPCs identified that correlated with higher risk of PO, the two major phenotypes identified were severe pleomorphism with poor differentiation and deep invasion. This result is consistent not only with previous studies but also with the current BWH and AJCC-8 staging systems38.

In a recent meta-analysis, Zakhem et al. revealed that tumors with invasion beyond the subcutaneous fat were associated with a statistically significant risk of LR and DSD39,40,41. Moreover, ulcerated tumors, poorly differentiated tumors, PNI, lymphovascular invasion, desmoplastic stroma and immunosuppression are all significantly associated with POs38,42. These attributes have been defined as key features of aggressive behavior by cSCC and some are considered in staging. It is for this reason that the ability of our machine learning algorithm to detect these features, at time of biopsy, is significant. Our system may offer a standardized method for feature identification given the potential for inherent inter-reader variability in identification of high-risk histopathologic features by dermatopathologists. Poor differentiation and invasion beyond the subcutaneous fat have been associated with an increased risk of metastasis43. More interestingly, the different clusters identified in this study and their association with certain types of outcomes are located in well-defined and coherently connected regions of the UMAP and of the PAGA graphs (Fig. 6a, b). The two sets of HPCs associated with good outcome (HPCs 3, 8 and 24 as one group, and 7, 12 and 16 as the other) and identified as having high numbers of interactions on the slides are each connected in the PAGA and located on the lower right side of the UMAP. On the other hand, the top side of the UMAP is dominated by HPCs associated with POs.

Fig. 6: Specific HPCs are correlated with poor outcome.

a, b Projection on the UMAP and PAGA graph of the HPCs associated with high and low risk of poor outcome. c Ultimately, we anticipate that such a deep-learning tool, which identifies patients at higher risk of poor outcome and provides histomorphological interpretability, could assist treating physicians in making decisions on an increased post-operative follow-up and management strategy. Panel (c) created with biorender.com.

In this study, we uniquely used self-supervised learning followed by community-based clustering to predict cSCC-free survival, while describing the phenotypes of the clusters weighing the most in these findings. Predictive clusters for PO included those with poor differentiation, whereas the lack thereof and enrichment in non-specific hyperkeratosis without atypia tended to favor prediction of good prognosis. Importantly, we demonstrate significant potential to optimize clinical decision-making in that this approach is particularly efficient at differentiating PO risk in low stage tumors (BWH T2a and AJCC-8 T2 tumors). This addresses a large gap in the literature relating to outcome homogeneity between low stage tumors (BWH T2a and AJCC T2) given that ~25% of POs occur in low stage tumors15.

Gupta et al. previously evaluated risk factors for poor outcomes in stage T2a cSCC and identified a predictive model for those at risk of poor outcomes using major and minor criteria with high specificity (97.4%) but low sensitivity (7.7%)15. Thus, while this model is highly accurate at identifying tumors without poor outcomes, it does not perform as well at identifying tumors that will develop poor outcomes. Additionally, this study was not externally validated. While direct comparison of these models is outside the scope of this study, we hope that our findings may add to the literature demonstrating an adjunctive method for accurately identifying patients at risk for poor outcome on an externally validated cohort. Overall, we believe incorporation of these data along with existing clinical information may augment identification of highest risk patients and would allow for rationally based, focused clinical follow up that may lead to the development of algorithms for further imaging and work up.

Our results suggest that prognostication of cSCC can benefit from self-supervised learning to not only assist clinicians in predicting outcomes but also highlight histomorphological patterns associated with these outcomes. These findings may play a significant role in patient care, as prognostic information from initial biopsy slides alone may guide clinical decision making with regard to diagnostic workup, treatment and surveillance of patients at high risk for PO. Clinicians may use the information gained from self-supervised learning as an adjunct in clinical decision making and in assigning pre-test probabilities for patients who might be at higher risk for PO and benefit from further workup and management (Fig. 6c). For example, patients identified as high-risk may be deemed appropriate for pre-operative imaging, more frequent follow up or removal of a primary tumor with complete margin control and enhanced pathologic staging provided by Mohs micrographic surgery44,45.

Development of, and access to, large datasets will be crucial to further validate and expand the current study. Ultimately, the ability to assess the risk of PO at time of initial diagnosis could provide the basis to establish and test diagnostic and therapeutic protocols that could ultimately optimize clinical outcome.

Methods

Ethics approval

The NYU study number is 20-01740 and is classified as non-human research, therefore was not subject to IRB review at NYU. UCSF received approval for expedited IRB review (protocol #21-34087) and BWH (protocol# 2021P000701). The cohort from the Complejo Asistencial Universitario de Salamanca was approved by the local IRB. The cohort from the Mayo Clinic was subject to IRB review (ID #21-012833, from Clinicopathologic and Multi-Omic Stratification of Cutaneous Squamous Cell Carcinoma).

Datasets

Datasets of shave or punch biopsy specimens were collected from three institutions each with separate IRB approval processes (Supplementary Table 1, Supplementary Fig. 1a): 119 slides from 42 patients were collected at New York University (NYU), 95 slides from 95 patients were collected at the University of California San Francisco (UCSF) and 40 slides from 40 patients were collected at the Brigham and Women's Hospital (BWH). For cases at NYU, multiple slides from a single lesion were analyzed per patient. For slides from BWH and UCSF, single slides of the highest yield resolution were obtained for ease of logistical coordination between sites. Due to lack of information or poor slide quality (e.g., out of focus), fourteen slides were removed from the study (Supplementary Fig. 2). We therefore ended up including slides from 163 patients diagnosed with cSCC on initial biopsy who developed good or poor outcomes (NYU: 38 patients, n = 31 and 7, respectively; UCSF: 85 patients, n = 58 and 27, respectively; BWH: 40 patients, n = 20 and 20, respectively). The size of the training cohort was determined by the samples available in the institutions selected to compose the training cohort and, due to the limited amount available, it was not determined by sample size calculation approaches. To benefit from the best resolution available, whole slide images were scanned on an Aperio AT2 scanner and captured at 0.25 µm/pixel at 40× using JPEG2000 compression and stored as an svs pyramidal file.

De-identified slides from other collaborating institutions were sent to NYU Langone Health's Dermatologic Surgery & Cosmetic Associates Office. The scanned images did not contain any patient information. De-identified samples were simply classified as good versus POs. All de-identified physical slides were stored at NYU Langone Health's Dermatologic Surgery & Cosmetic Associates Office until the analysis was complete.
Upon completion of analysis, the slides were returned to the original institution.

Adult patients 18–89 years old with existing slides of biopsy-proven cSCC obtained prior to January 1, 2021 were included. Patients were excluded if they were outside the specified age range or had no histological confirmation of cSCC. Patients treated at NYU Langone Health's Dermatologic Surgery & Cosmetic Associates Office were identified for inclusion by a NYULH study team member based on the above inclusion and exclusion criteria. Patients at NYU were identified using retrospective chart review of those with poor outcome, after which a manual review of slides was performed to select those with the best slide quality. Patients treated at collaborating institutions were identified by individuals at those institutions based on the inclusion and exclusion criteria.

Patients were classified as having a PO if the tumor was successfully treated but came back at any time in the future, either in the form of local recurrence (LR), nodal metastasis (NM) or distant metastasis (DM), or if the patient had a disease-specific death (DSD). Otherwise, if no tumor was detected at subsequent visits, the patients were classified as having a good outcome. In total, 109 patients were associated with good outcomes and 54 with POs. For UCSF and NYU, the times to LR, NM, DM or DSD were also available, giving further granularity into the disease-free survival (DFS) analysis. However, this data was not available for the BWH cohort.

In addition, two external cohorts were obtained (Supplementary Table 1): 156 slides from 153 patients (112 good outcome, 41 poor outcome) from the Complejo Asistencial Universitario de Salamanca (CAUSA) scanned with MoticEasyScan One (Motic, Hong Kong) and 411 slides from 410 patients (355 good outcome, 55 poor outcome) from the Mayo Clinic (Mayo) scanned with a Leica GT450 scanner (Leica Biosystems).

Self-supervised-based analysis

The self-supervised-based study was based on the Histomorphological Phenotype Learning (HPL) through self-supervised learning and community detection pipeline developed by Quiros et al.29 and summarized as follows and in Fig. 1 (and Supplementary Fig. 1). Using the DeepPATH tools46, images from the 3 datasets were first tiled (removing those where the background covers more than 75% of the tile, and applying color normalization using Reinhard's method47), and converted to an h5 file such that each tile fed to the self-supervised pipeline has a field of view of 224 × 224 pixels at a pixel size of about 0.5 µm (corresponding to a magnification of 20×, Fig. 1a). The 2,069,052 resulting tiles were split such that 40% of the tiles from each dataset were combined and used to train the self-supervised Barlow-Twins algorithm based network37 (Fig. 1b). After training (Supplementary Fig. 6a), all the tiles were projected into the 128-dimensional z tile vector representation of the trained network (Fig. 1c) and are represented by UMAPs48 and PAGA49 in this manuscript (Fig. 1d). As a filtering step, a first Leiden clustering50 was achieved using a resolution of 7 in order to obtain a large (n = 136) number of Histomorphological Phenotype Clusters (HPCs), thereby over-clustering and increasing the chance of having homogeneous HPCs. Those HPCs were visually inspected to identify those containing artifacts (air bubbles, blurring, dust, etc.; Fig. 1d), with the goal of removing from the rest of the study the tiles from HPCs representing artifacts. We identified 9 clusters containing artifacts, which were removed from the dataset. Those artifacts are all contained within regions protruding from the rest of the UMAP (Fig. 1e), and were exclusively artifacts without any underlying tissue. The resulting 1,998,932 tiles were then used for the rest of the study. Next, another round of Leiden clustering was applied to the remaining tiles (Fig. 1f), and each HPC was mapped back to the slide of each patient (Fig. 1g). Each patient is therefore described by a patient vector representation which is embedded in the percentage of tiles associated with each HPC.
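The last step above, turning the HPC assignments of a patient's tiles into a patient vector representation, can be sketched as follows. This is a minimal illustration in plain Python; the function name `patient_vector` and the input layout (one HPC index per tile) are assumptions made for the example and are not taken from the HPL code.

```python
from collections import Counter


def patient_vector(tile_hpcs, n_hpcs):
    """Percentage of a patient's tiles assigned to each HPC.

    tile_hpcs: HPC index (0 .. n_hpcs - 1) of every tile of the patient,
               pooled over all slides available for that patient.
    n_hpcs: total number of HPCs in the locked cluster configuration.
    """
    if not tile_hpcs:
        raise ValueError("patient has no tiles")
    counts = Counter(tile_hpcs)
    total = len(tile_hpcs)
    # HPCs that are absent from the patient's slides get 0%.
    return [100.0 * counts.get(h, 0) / total for h in range(n_hpcs)]


# Toy example with 4 tiles and 4 HPCs.
vector = patient_vector([0, 0, 1, 3], n_hpcs=4)
```

The resulting vector always sums to 100, so it is compositional data, which is why a log-ratio transformation is applied before it is used in linear models.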

Considering the small development dataset, the analyses were done using a 3-fold cross-validation approach to study the variability of the approach while allowing each set to have enough samples. In each fold, a different third was used as a test set, while the remaining tiles are split between training (80%) and validation (20%). To ensure representativity and proper split, folds were generated randomly with a single constraint on the RFS to ensure each train and test set has similar Kaplan–Meier profiles. After Leiden clustering, each whole slide image (WSI) can be represented by a codebar called a WSI vector representation which describes the distribution of tiles in each HPC. When a patient has more than one slide available, those can also be aggregated into a "patient vector representation". Logistic and/or Cox proportional hazards regressions were run using the patient vector representations from the training sets, and evaluated using the validation and test sets left out. Similar to the previous study on lung cancer29, the performance was analyzed on a set of cluster configurations via n-fold cross validation to estimate variability at a given Leiden resolution, the Wald test being used to measure the significance of each regression and Fisher's method being used to combine the p-values. Once done, we locked down a fold for further analysis.

As the resolution parameter r of the Leiden clustering algorithm is increased, the UMAP appears as split into more HPCs (Supplementary Fig. 6c, green curve). However, as the number of HPCs increases and they get smaller in terms of average number of tiles, the risk of obtaining institution or patient-specific clusters increases (Supplementary Fig. 6c, purple and cyan curves), which would be a sign of over-fitting. Indeed, increasing the number of clusters too much increases the risk of detecting features which are patient or institution specific (which may be caused by the fact that the 3 cohorts were stained in 3 different institutions and scanned on 2 different scanners, or may be related to some other phenotypes specific to natural variations between individuals). However, we are interested in finding common patterns across the three institutions, with enough meaningful (or compact) clusters to describe the diversity of common patterns found in this disease. Therefore, to select the best Leiden cluster resolution for the subsequent analysis of the HPCs, we checked, for each resolution: 1- the average patient and institution presence in HPCs (see details below); 2- the performance of the binary classifier (good versus poor outcome) via the AUC (Area Under the receiver operating Curve) of the logistic regression approach; 3- the performance of the Cox regression for survival prediction. The average patient presence (Supplementary Fig. 6c) is defined as the average percentage of patients present in the HPCs at a given resolution, either counting all patients even if only 1 of their tiles belongs to a certain HPC, or using a 1% threshold. Similarly, the institution presence (Supplementary Fig. 6c) is defined as the average percentage of institutions present in the HPCs at a given resolution, either counting all institutions, or only counting those with at least 1% of their tiles associated with a given HPC.
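The presence metric defined above can be sketched in a few lines of plain Python. The sketch assumes that the 1% threshold applies to the share of a patient's own tiles that fall in the HPC; the function name and the input layout are hypothetical. The same function yields the institution presence if tile counts are pooled per institution instead of per patient.

```python
def average_presence(tile_counts, threshold=0.0):
    """Average percentage of patients present in the HPCs.

    tile_counts: {patient_id: [tiles in HPC 0, tiles in HPC 1, ...]}
    threshold: minimum fraction of the patient's tiles that must belong
               to an HPC for the patient to count as present
               (0.0 counts a patient as soon as one tile is in the HPC).
    """
    n_hpcs = len(next(iter(tile_counts.values())))
    presences = []
    for h in range(n_hpcs):
        present = 0
        for counts in tile_counts.values():
            total = sum(counts)
            fraction = counts[h] / total if total else 0.0
            if counts[h] > 0 and fraction >= threshold:
                present += 1
        presences.append(100.0 * present / len(tile_counts))
    return sum(presences) / n_hpcs


# Toy example: 3 patients, 3 HPCs.
counts = {"p1": [995, 5, 0], "p2": [50, 50, 0], "p3": [0, 0, 100]}
presence_all = average_presence(counts)              # any tile counts
presence_1pct = average_presence(counts, 0.01)       # 1% threshold
```

In the toy data, patient p1 has only 0.5% of its tiles in HPC 1, so it counts as present without a threshold but not with the 1% threshold, which lowers the average.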
Despite the small size of our cohort and limited number of institutions involved, it allows us to get a sense of the potential generalization of the study, and these averages will tend to get smaller as the HPCs become more patient or institution specific. We notice that at resolutions higher than r = 0.75, these averages decrease, showing that more HPCs become specific and less generalizable across patients and institutions.

Binary classification between good and poor outcome was done using all three development samples, while survival analysis data was only available for the NYU and UCSF datasets. Those analyses were done using a three-fold cross-validation approach (Supplementary Fig. 6d, e), using folds consistent with those used for Leiden cluster determination to study the influence of variations between different Leiden clustering runs (Supplementary Fig. 2). The logistic and Cox regressions were done following the approach detailed in Quiros et al.29. Briefly, WSI vector representations were built for each patient to describe the percentage of tiles associated with each HPC, and a center log-ratio transformation was applied to use those in linear models. A three-fold cross-validation analysis was performed such that, for each fold, one third of the patients is used as a test set, and the rest is used for the training/validation process. For each fold, the regressions were fit using the training set and assessed with the validation and test sets.
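The center log-ratio (CLR) transformation mentioned above maps a compositional vector to the log of each component divided by the geometric mean of all components. The sketch below is a plain-Python illustration; the small `pseudocount` added to avoid taking the logarithm of zero for HPCs absent from a patient is an assumption of this example, not a value reported in the study.

```python
import math


def clr(percentages, pseudocount=1e-6):
    """Center log-ratio transform of a vector of tile percentages.

    percentages: share of tiles per HPC for one patient or slide.
    pseudocount: illustrative offset so that empty HPCs do not give log(0).
    """
    log_values = [math.log(p + pseudocount) for p in percentages]
    mean_log = sum(log_values) / len(log_values)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return [value - mean_log for value in log_values]


transformed = clr([50.0, 25.0, 0.0, 25.0])
```

By construction the transformed components sum to zero, and the ordering of the HPCs by abundance is preserved, so the values can be fed to logistic or Cox regressions as ordinary real-valued covariates.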

Elastic-net penalty models were used for regression where we optimized the alpha parameter (final value of 0.25) for the logistic regression analysis, and the alpha and l1 ratio parameters (to final values of 0.35 and 0.01 respectively) for the Cox regression analysis. After having locked a cluster configuration, the median of the hazard predictions on the training set was used to define the threshold between the low and high risk groups used on the test set and shown in Fig. 2g. The statistical significance between the two groups is measured using the log-rank test and a p-value threshold of 0.05.

In addition to the cross-validation results on the development cohort, the generalizability of those trained networks was tested by inference on the two external cohorts, CAUSA and Mayo.
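The risk-group assignment described above can be sketched as follows: the threshold is the median of the hazard predictions on the training set and is then applied unchanged to the test set. The function names are hypothetical, and assigning a prediction exactly equal to the threshold to the low-risk group is an assumption of this example.

```python
from statistics import median


def fit_risk_threshold(train_hazards):
    """Median of the predicted hazards on the training set."""
    return median(train_hazards)


def assign_risk_groups(hazards, threshold):
    """Label each patient as high or low risk with a fixed threshold."""
    # Ties with the threshold go to the low-risk group (assumption).
    return ["high" if h > threshold else "low" for h in hazards]


# Toy example: the threshold is learned on training predictions only.
threshold = fit_risk_threshold([0.1, 0.4, 0.9])
test_groups = assign_risk_groups([0.2, 0.4, 1.5], threshold)
```

Fixing the threshold on the training set avoids using any test-set information when the Kaplan–Meier curves of the two groups are compared.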

Cluster analysis

UMAPs

48 (Uniform Manifold Approximation and Projection) and PAGA49 (partition-based graph abstraction) were used to visualize the tile vector representations and the resulting Leiden clusters. PAGA provides an additional layer of interpretability by preserving the topology: edges between nodes denote statistically relevant connectivity between HPCs.

For each HPC, 100 tiles were randomly selected and visually interpreted (blinded to the positions of the HPCs on the PAGA) by three board-certified Mohs surgeons and a senior dermatology resident (M.C., S.R.J., R.W.), who freely annotated and labeled each tile (no list of features to choose from was supplied). Consensus annotations, which were reached when all reviewers agreed on the annotated features, are shown in Supplementary Table 6. Annotations were then mapped onto the PAGA (Fig. 3 for the development cohort and Supplementary Figs. 14, 15 for the external test cohorts), where interesting connections between nodes can be seen even though the annotations were made without knowledge of the PAGA.

Analyses of the correlation between the clusters and external annotations followed the HPL pipeline29: SHAP (SHapley Additive exPlanations) and forest plots were used to evaluate how each HPC affects the log odds ratio of patients. The SHAP values were calculated across each test set of the 3-fold cross-validation analyses. The forest plots are based on the log hazard ratio of a Cox proportional hazards model fitted on the training sets of a 3-fold cross-validation. The coefficients were averaged across folds, and the p-values were combined with Fisher’s combined probability test. Correlations with pathologic diagnosis and type of recurrence (LR or overall metastases) were assessed using Spearman’s rank correlation with a significance threshold of 0.01 on the p-values (adjusted with the Benjamini/Hochberg51 method for false discovery rate). Overall metastases include both nodal and distant ones. Furthermore, univariate Cox proportional hazards regression analysis was performed on each cohort using a 3-fold cross-validation approach.

Supervised analysis

To explore the performance of outcome prediction in a supervised manner, we used DeepPATH46 and followed an approach comparable to the one used to predict response to melanoma treatment in Johannet et al.22, training Inception v3 (ref. 52) twice at a magnification of 20x: first to automatically segment the slides, and second to predict the outcome from selected segmented regions. For the segmentation, a 3-class and a 5-class network were trained. In the 3-class approach, the network was trained to identify the following classes: regions of interest, artifacts, and other features (muscle, bone, cartilage, hair follicles, nerve…). The goal was simply to filter out the artifact and other-feature regions that the team judged irrelevant for later outcome prediction. In the 5-class approach, the network was designed to split the “regions of interest” more precisely, and it was therefore trained to identify the following classes: invasive SCC, in-situ SCC, normal epidermis, artifacts, and other features. Next, we trained a network to study the predictability of good versus poor outcome using either the regions of interest only or the invasive SCC only.
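The fold-level statistics mentioned in the cluster analysis above (averaging Cox coefficients across folds, combining the per-fold p-values with Fisher's combined probability test, and adjusting correlation p-values with the Benjamini/Hochberg procedure) can be illustrated with a minimal sketch. This is not the authors' code: the function names and all numbers below are hypothetical and chosen only for illustration, and the sketch uses the Python standard library only. Note also that Fisher's method formally assumes independent p-values.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine p-values with Fisher's combined probability test.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def benjamini_hochberg(pvalues):
    """Return Benjamini/Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-fold Cox results for one cluster (illustrative values,
# not taken from the study).
fold_log_hazard_ratios = [0.42, 0.35, 0.51]
fold_pvalues = [0.04, 0.10, 0.02]

mean_log_hr = sum(fold_log_hazard_ratios) / len(fold_log_hazard_ratios)
combined_p = fisher_combined_pvalue(fold_pvalues)

# Hypothetical raw p-values from Spearman correlations of four clusters.
raw_pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw_pvalues)
significant = [p < 0.01 for p in adjusted]  # threshold of 0.01, as in the text
```

In practice, library routines such as `scipy.stats.combine_pvalues` and `scipy.stats.spearmanr` would likely be used instead of hand-written helpers; the standard-library version is shown only to make the arithmetic explicit.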

Data availability

Reasonable requests for cohort data may be addressed to the corresponding authors.

Code availability

The code is written in Python and is available on GitHub. The supervised approach relies on DeepPATH (https://github.com/ncoudray/DeepPATH). The unsupervised approach relies on the Histomorphological Phenotype Learning pipeline (https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning), and more details on the code and options used in this study are reported at https://github.com/ncoudray/AI-analysis-of-cutaneous-squamous-cell-carcinoma/.

References

  1. Lomas, A., Leonardi-Bee, J. & Bath-Hextall, F. A systematic review of worldwide incidence of nonmelanoma skin cancer. Br. J. Dermatol. 166, 1069–1080 (2012).

  2. Waldman, A. & Schmults, C. Cutaneous Squamous Cell Carcinoma. Hematol. Oncol. Clin. North Am. 33, 1–12 (2019).

  3. Stang, A. et al. Incidence and mortality for cutaneous squamous cell carcinoma: comparison across three continents. J. Eur. Acad. Dermatol. Venereol. 33, 6–10 (2019).

  4. Leiter, U. et al. Incidence, Mortality, and Trends of Nonmelanoma Skin Cancer in Germany. J. Invest. Dermatol. 137, 1860–1867 (2017).

  5. Van Lee, C. B. et al. Recurrence rates of cutaneous squamous cell carcinoma of the head and neck after Mohs micrographic surgery vs. standard excision: a retrospective cohort study. Br. J. Dermatol. 181, 338–343 (2019).

  6. Lansbury, L., Bath-Hextall, F., Perkins, W., Stanton, W. & Leonardi-Bee, J. Interventions for non-metastatic squamous cell carcinoma of the skin: systematic review and pooled analysis of observational studies. BMJ 347, f6153–f6153 (2013).

  7. Rogers, H. W., Weinstock, M. A., Feldman, S. R. & Coldiron, B. M. Incidence Estimate of Nonmelanoma Skin Cancer (Keratinocyte Carcinomas) in the U.S. Population, 2012. JAMA Dermatol. 151, 1081–1086 (2015).

  8. Karia, P. S., Han, J. & Schmults, C. D. Cutaneous squamous cell carcinoma: Estimated incidence of disease, nodal metastasis, and deaths from disease in the United States, 2012. J. Am. Acad. Dermatol. 68, 957–966 (2013).

  9. Eigentler, T. K. et al. Survival of Patients with Cutaneous Squamous Cell Carcinoma: Results of a Prospective Cohort Study. J. Invest. Dermatol. 137, 2309–2315 (2017).

  10. Stevenson, M. L. et al. Use of Adjuvant Radiotherapy in the Treatment of High-risk Cutaneous Squamous Cell Carcinoma With Perineural Invasion. JAMA Dermatol. 156, 918–921 (2020).

  11. Lewis, K. G. & Weinstock, M. A. Trends in nonmelanoma skin cancer mortality rates in the United States, 1969 through 2000. J. Invest. Dermatol. 127, 2323–2327 (2007).

  12. American Cancer Society. Facts & Figures. Paperpile, https://paperpile.com/app/p/47ca1717-2267-0081-9d88-750bb97346e9 (2023).

  13. Karia, P. S., Morgan, F. C., Califano, J. A. & Schmults, C. D. Comparison of Tumor Classifications for Cutaneous Squamous Cell Carcinoma of the Head and Neck in the 7th vs 8th Edition of the AJCC Cancer Staging Manual. JAMA Dermatol. 154, 175–181 (2018).

  14. Amin, M. B. et al. The Eighth Edition AJCC Cancer Staging Manual: Continuing to build a bridge from a population-based to a more ‘personalized’ approach to cancer staging. CA Cancer J. Clin. 67, 93–99 (2017).

  15. Gupta, N. et al. Identifying Brigham and Women’s Hospital stage T2a cutaneous squamous cell carcinomas at risk of poor outcomes. J. Am. Acad. Dermatol. 86, 1301–1308 (2022).

  16. Work Group et al. Guidelines of care for the management of cutaneous squamous cell carcinoma. J. Am. Acad. Dermatol. 78, 560–578 (2018).

  17. Jennings, L. & Schmults, C. D. Management of high-risk cutaneous squamous cell carcinoma. J. Clin. Aesthet. Dermatol. 3, 39–48 (2010).

  18. Wysong, A. et al. Validation of a 40-gene expression profile test to predict metastatic risk in localized high-risk cutaneous squamous cell carcinoma. J. Am. Acad. Dermatol. 84, 361–369 (2021).

  19. Zhao, G. et al. AXIN2 and SNAIL expression predict the risk of recurrence in cutaneous squamous cell carcinoma after Mohs micrographic surgery. Oncol. Lett. 19, 2133–2140 (2020).

  20. Yanofsky, V. R., Mercer, S. E. & Phelps, R. G. Histopathological variants of cutaneous squamous cell carcinoma: a review. J. Skin Cancer 2011, 210813 (2011).

  21. Yacob, F. et al. Weakly supervised detection and classification of basal cell carcinoma using graph-transformers on whole slide images. Sci. Rep. 13, 7555, https://doi.org/10.1038/s41598-023-33863-z (2023).

  22. Johannet, P. et al. Using Machine Learning Algorithms to Predict Immunotherapy Response in Patients with Advanced Melanoma. Clin. Cancer Res. 27, 131–140 (2021).

  23. Kim, R. H. et al. Deep Learning and Pathomics Analyses Reveal Cell Nuclei as Important Features for Mutation Prediction of BRAF-Mutated Melanomas. J. Invest. Dermatol. 142, 1650–1658.e6 (2022).

  24. Comes, M. C. et al. A deep learning model based on whole slide images to predict disease-free survival in cutaneous melanoma patients. Sci. Rep. 12, 20366 (2022).

  25. Knuutila, J. S. et al. Identification of metastatic primary cutaneous squamous cell carcinoma utilizing artificial intelligence analysis of whole slide images. Sci. Rep. 12, 9876 (2022).

  26. Sali, R. et al. Deep Learning for Whole-Slide Tissue Histopathology Classification: A Comparative Study in the Identification of Dysplastic and Non-Dysplastic Barrett’s Esophagus. J. Pers. Med. 10, 141 (2020).

  27. Dimitriou, N., Arandjelović, O. & Caie, P. D. Corrigendum: Deep Learning for Whole Slide Image Analysis: An Overview. Front. Med. 7, 419 (2020).

  28. Shmatko, A., Ghaffari Laleh, N., Gerstung, M. & Kather, J. N. Artificial intelligence in histopathology: enhancing cancer research and clinical oncology. Nat. Cancer 3, 1026–1038 (2022).

  29. Quiros, A. C. et al. Self-supervised learning in non-small cell lung cancer discovers novel morphological clusters linked to patient outcome and molecular phenotypes. https://arxiv.org/pdf/2205.01931.pdf (2022).

  30. Chen, Z., Li, X., Yang, M., Zhang, H. & Xu, X. S. Optimization of deep learning models for the prediction of gene mutations using unsupervised clustering. Hip Int. 9, 3–17 (2023).

  31. Kim, J. et al. Author Correction: Unsupervised discovery of tissue architecture in multiplexed imaging. Nat. Methods 19, 1662 (2022).

  32. Chen, R. J. et al. Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2022).

  33. Poon, A. I. F. & Sung, J. J. Y. Opening the black box of AI-Medicine. J. Gastroenterol. Hepatol. 36, 581–584 (2021).

  34. Van der Laak, J., van der Laak, J., Litjens, G. & Ciompi, F. Deep learning in histopathology: the path to the clinic. Nat. Med. 27, 775–784 (2021).

  35. Guidotti, R. et al. A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 51, 1–42 (2018).

  36. Rajpurkar, P., Chen, E., Banerjee, O. & Topol, E. J. AI in health and medicine. Nat. Med. 28, 31–38 (2022).

  37. Zbontar, J. et al. Barlow Twins: Self-Supervised Learning via Redundancy Reduction. In: Proceedings of the 38th International Conference on Machine Learning (eds. Meila, M. & Zhang, T.) vol. 139, 12310–12320 (PMLR, 2021).

  38. Zakhem, G. A., Pulavarty, A. N., Carucci, J. & Stevenson, M. L. Association of Patient Risk Factors, Tumor Characteristics, and Treatment Modality With Poor Outcomes in Primary Cutaneous Squamous Cell Carcinoma. JAMA Dermatol. 159, 160 (2023).

  39. Brancaccio, G. et al. Risk Factors and Diagnosis of Advanced Cutaneous Squamous Cell Carcinoma. Dermatol. Pract. Concept. 11, e2021166S (2021).

  40. Que, S. K. T., Zwald, F. O. & Schmults, C. D. Cutaneous squamous cell carcinoma: Incidence, risk factors, diagnosis, and staging. J. Am. Acad. Dermatol. 78, 237–247 (2018).

  41. Rowe, D. E., Carroll, R. J. & Day, C. L. Jr. Prognostic factors for local recurrence, metastasis, and survival rates in squamous cell carcinoma of the skin, ear, and lip. Implications for treatment modality selection. J. Am. Acad. Dermatol. 26, 976–990 (1992).

  42. Campoli, M., Brodland, D. G. & Zitelli, J. A prospective evaluation of the clinical, histologic, and therapeutic variables associated with incidental perineural invasion in cutaneous squamous cell carcinoma. J. Am. Acad. Dermatol. 70, 630–636 (2014).

  43. Schmults, C. D., Karia, P. S., Carter, J. B., Han, J. & Qureshi, A. A. Factors Predictive of Recurrence and Death From Cutaneous Squamous Cell Carcinoma. JAMA Dermatol. 149, 541 (2013).

  44. Gibson, F. T., Murad, F., Granger, E., Schmults, C. D. & Ruiz, E. S. Perioperative imaging for high-stage cutaneous squamous cell carcinoma helps guide management in nearly a third of cases: A single-institution retrospective cohort. J. Am. Acad. Dermatol. 88, 1209–1211 (2023).

  45. Canavan, T. N. et al. A cohort study to determine factors associated with upstaging cutaneous squamous cell carcinoma during Mohs surgery. J. Am. Acad. Dermatol. 88, 191–194 (2023).

  46. Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).

  47. Reinhard, E., Adhikhmin, M., Gooch, B. & Shirley, P. Color transfer between images. IEEE Comput. Graph. Appl. 21, 34–41 (2001).

  48. McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 3, 861 (2018).

  49. Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).

  50. Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).

  51. Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B Methodological 57, 289–300 (1995).

  52. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2016).


Acknowledgements

The authors thank the team of the Center of Biospecimen Research and Development at New York University for their help in digital pathology and histology for this project. The Center of Biospecimen Research and Development, RRID:SCR_018304, is partially supported by the Cancer Center Support Grant P30CA016087 at the Laura and Isaac Perlmutter Cancer Center. Parts of the computational analysis for this work were supported by the NYU Langone High Performance Computing (HPC) Core’s resources, and we thank the HPC team for their support. We would like to thank the Genome Technology Center (GTC) for expert library preparation and sequencing, and the Applied Bioinformatics Laboratories (ABL) for providing bioinformatics support and helping with the analysis and interpretation of the data. GTC and ABL are shared resources partially supported by the Cancer Center Support Grant P30CA016087 at the Laura and Isaac Perlmutter Cancer Center. This work has used computing resources at the NYU School of Medicine High Performance Computing (HPC) Facility. Other funding sources include NCI/NIH Cancer Center Support Grant P30CA016087 (A.T.). A.C.Q. is supported by a scholarship from the School of Computing Science, University of Glasgow.

Author information

Author notes

  1. These authors contributed equally: Nicolas Coudray, Michelle C. Juarez, Maressa C. Criscito, Adalberto Claudio Quiros.

  2. These authors jointly supervised this work: Aristotelis Tsirigos, John A. Carucci.

Authors and Affiliations

  1. Applied Bioinformatics Laboratories, New York University School of Medicine, New York, NY, USA

    Nicolas Coudray & Aristotelis Tsirigos

  2. Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, NY, USA

    Nicolas Coudray & Aristotelis Tsirigos

  3. The Ronald O. Perelman Department of Dermatology, New York University Grossman School of Medicine, New York, NY, USA

    Michelle C. Juarez, Maressa C. Criscito, Mary L. Stevenson, Nicole A. Doudican & John A. Carucci

  4. School of Computing Science, University of Glasgow, Glasgow, Scotland, UK

    Adalberto Claudio Quiros & Ke Yuan

  5. Department of Dermatology, Northwell Health, New York, NY, USA

    Reason Wilken

  6. Department of Dermatology, Thomas Jefferson University, Philadelphia, PA, USA

    Stephanie R. Jackson Cullison

  7. School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK

    Ke Yuan

  8. Cancer Research UK Beatson Institute, Glasgow, Scotland, UK

    Ke Yuan

  9. Department of Dermatology, University of California, San Francisco, San Francisco, CA, USA

    Jamie D. Aquino, Daniel M. Klufas, Jeffrey P. North & Siegrid S. Yu

  10. Department of Dermatology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA

    Fadi Murad, Emily Ruiz & Chrysalyne D. Schmults

  11. Instituto de Biología Molecular y Celular del Cáncer (Lab 20), Campus Miguel de Unamuno, Salamanca, Spain

    Cristian D. Cardona Machado & Javier Cañueto

  12. Instituto de Investigación Biomédica de Salamanca, CANC-30, Salamanca, Spain

    Cristian D. Cardona Machado & Javier Cañueto

  13. Department of Dermatology, Complejo Asistencial Universitario de Salamanca, Salamanca, Spain

    Cristian D. Cardona Machado & Javier Cañueto

  14. Department of Computer Science, University of Illinois, Urbana-Champaign, IL, USA

    Anirudh Choudhary

  15. Mayo Clinic, Scottsdale, AZ, USA

    Alysia N. Hughes, Alyssa Stockard, Zachary Leibovit-Reiben & Aaron R. Mangold

  16. Department of Pathology, New York University School of Medicine, New York, NY, USA

    Aristotelis Tsirigos

Contributions

N.C. and A.C.Q. designed and executed the experiments and wrote the Python code. N.A.D. helped obtain IRB approval and provided logistic coordination between all institutions. M.C.C., S.R.J., R.W. and M.C.J. provided slide annotations and histological assessment of HPCs. N.C., M.C.J. and M.C.C. prepared the manuscript, which was later edited or approved by all co-authors. N.C., A.C.Q., K.Y. and A.T. provided deep-learning expertise. M.C.C., S.R.J., R.W., M.C.J. and J.A.C. provided squamous cell cancer expertise. R.W., M.L.S., D.M.K., S.Y., J.C., F.M., C.D.S., C.D.C.M., J.C., A.C., A.N.H., A.S., Z.L.-R. and A.R.M. provided the slides and corresponding data. J.A.C. and A.T. jointly supervised the study.

Corresponding authors

Correspondence to Aristotelis Tsirigos or John A. Carucci.

Ethics declarations

Competing interests

The authors declare the following competing interests: A.T. is a co-founder of Imagenomix; N.C. is a scientific advisor for Imagenomix. The other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Cite this article


Coudray, N., Juarez, M.C., Criscito, M.C. et al. Self supervised artificial intelligence predicts poor outcome from primary cutaneous squamous cell carcinoma at diagnosis. npj Digit. Med. 8, 105 (2025). https://doi.org/10.1038/s41746-025-01496-3



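Summary

This article discusses a study that uses self-supervised artificial intelligence (AI) to predict poor outcomes of primary cutaneous squamous cell carcinoma at diagnosis. The key points and findings are as follows.

Key points:

  1. The researchers developed an AI system that can analyze images of skin cancer biopsies without requiring labeled training data.

  2. The AI model was trained with self-supervised learning techniques, which allow it to learn from unlabeled images.

  3. The study involved the analysis of more than 600 biopsy samples from several institutions and countries.

Main findings:

  1. The AI system achieved high accuracy in predicting poor outcomes in patients with primary cutaneous squamous cell carcinoma (CSCC).

  2. In terms of predictive power, it outperformed traditional clinical staging systems.

  3. The model can identify features associated with aggressive behavior in CSCC tumors that are not obvious to human pathologists.

Implications:

  1. This AI approach may improve early detection and risk stratification for patients with cutaneous squamous cell carcinoma.

  2. It highlights the potential of self-supervised learning techniques in medical imaging applications, where large labeled datasets may be limited.

  3. The findings show that deep learning models can extract prognostic information from histopathology images without extensive clinical annotation.

Limitations:

  1. Validation in larger, more diverse patient populations is needed.

  2. Further research is needed to understand how the AI model’s predictions relate to the specific biological mechanisms that drive poor outcomes in CSCC.

Overall, this work demonstrates the potential of self-supervised deep learning methods to improve prognostic assessment in skin cancer pathology, which could have significant clinical implications if validated and implemented in routine care settings.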
