Author: Horie, Shigeo
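A variety of methods have been used to study emotion. Research based mainly on data from surveys and controlled experimental settings is common, typically using techniques that assess responses to stimuli, such as photographs of facial expressions or music, to explore the dimensions and experience of emotion [1,2]. However, these approaches are limited in ecological validity, that is, the extent to which findings reflect natural language use and emotional expression in real life. Studies based on naturally occurring textual expressions found on social media or online forums, which form part of everyday communication, remain limited [3,4].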
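Unlike other emotions, pain is not only a biological or psychological phenomenon but also a communicative act deeply embedded in social and intersubjective contexts. The social communication model of pain posits that verbal expression plays a crucial role in shaping how pain is perceived, evaluated, and responded to by others [5]. Phenomenological approaches further emphasize that pain is lived and expressed within a lifeworld in which metaphor and imagination contribute to its meaning beyond numerical intensity [6,7]. These perspectives suggest that pain-related language can illuminate the social construction of suffering and the dynamics of its recognition.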
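This study analyzes pain-related expressions on online platforms using natural language processing (NLP) techniques. Unlike other emotions, pain is a complex experience encompassing both psychological and physiological effects, which positions it uniquely within emotion research [8,9]. The GoEmotions dataset enabled the construction of a network of pain-related terms, and topological analysis of that network revealed how discussions of pain unfold [10]. This approach offers a novel perspective for understanding the experience of pain through natural language expressions, independent of self-report or survey-based methods [11].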
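This section provides a global overview of the pain-related lexical network, focusing on its structural scale, density, and community structure, derived from 123,840 word co-occurrence relationships. The final network contained 5,630 nodes and 86,972 edges, with each node representing a word and each edge representing a co-occurrence [4,8,9]. Terms such as burning, headache, discomfort, and pain showed prominent connectivity, indicating frequent semantic association. Figure 1 illustrates the core-periphery structure: larger central nodes reflect higher centrality and discursive influence, whereas peripheral nodes reveal the layered nature of pain-related language [11,12].
Figure 1. Network visualization of pain-related terms. The co-occurrence network comprises 5,630 unique terms and 86,972 edges. Nodes represent individual terms; edges represent co-occurrence within a five-word sliding window. Node size is scaled by centrality value. All nodes are displayed in a uniform color.
The network diameter was 5, indicating that even distant terms are linked by short paths, supporting efficient semantic flow [14]. The clustering coefficient was 0.7700, confirming that words form tightly connected local subgroups [15].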
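Louvain community detection identified 12 distinct communities [16], the largest comprising 1,021 nodes and others containing 911, 842, 520, and 495 nodes. These findings indicate a globally cohesive structure with semantically distinct subgroups that reflect thematic and contextual variation.
Structural roles of pain-related terms
This section focuses on identifying the lexical roles and relative centrality of individual symptom-related terms within the network. Figure 2 shows the centrality metrics of key pain terms and their co-occurrence patterns. The analysis revealed a structured subgraph of 309 nodes and 363 edges [11,12], capturing the complexity of symptom language on social media, including both frequent and distinctive word pairings.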
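The term pain consistently scored highest on all three centrality measures: degree (0.821429), betweenness (0.930134), and eigenvector centrality (0.695893) [13,15,17]. In contrast, terms such as headache (0.107, 0.109, 0.055), burning (0.182, 0.166, 0.110), and discomfort (0.049, 0.052, 0.024) scored considerably lower, pointing to a pronounced structural hierarchy. Notably, contextual analysis using the GoEmotions corpus showed that burning co-occurred almost exclusively with metaphorical or aesthetic descriptors (e.g., glass, carved) rather than with emotion- or symptom-oriented terms, suggesting a more figurative usage pattern.
Beyond individual terms, we also examined centrality patterns across broader categories of pain using a lexicon-based classification: neuropathic, somatic, visceral, and psychosomatic. Of 21 predefined terms, only 6 (28.6%) were found in the co-occurrence network. Among these, psychosomatic terms such as depression and anxiety showed the highest eigenvector centrality (0.71), indicating a central position in semantically influential regions. In contrast, the clinically salient neuropathic terms burning and shooting had minimal connectivity and network influence (e.g., burning: 0.000000000000013). Somatic and visceral descriptors were largely absent, with only pressure marginally represented.
These findings reveal a distinct asymmetry in how pain is expressed linguistically: whereas affective and cognitive terms dominate the discourse structure, physiologically grounded vocabulary remains underrepresented or omitted. Metrics by category are provided in Supplementary Table S1.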
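This section presents a statistical characterization of centrality patterns among symptom-related terms, with the aim of assessing structural hierarchy and connectivity within the lexical network. The pain-focused network exhibited a sparse but semantically ordered structure, with a density of 0.0055. The average degree was 30.9, indicating that each node was connected to about 31 other terms on average. The network diameter was 5, meaning that even the most distant nodes are linked by relatively short paths, enabling efficient semantic propagation [13,14].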
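As a quick consistency check, the reported density and average degree can be recovered from the node and edge counts given earlier, assuming the network is an undirected simple graph with N = 5,630 nodes and E = 86,972 edges (the symbols D and \bar{k} are introduced here for illustration only):

```latex
\[
D = \frac{2E}{N(N-1)} = \frac{2 \times 86{,}972}{5{,}630 \times 5{,}629} \approx 0.0055,
\qquad
\bar{k} = \frac{2E}{N} = \frac{173{,}944}{5{,}630} \approx 30.9 .
\]
```

Both values agree with the figures reported in the text to the stated precision.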
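Centrality analysis identified pain as a clear hub node, with values markedly higher than those of all other terms for degree centrality (0.821429), betweenness centrality (0.930134), and eigenvector centrality (0.695893) [13,15,17]. These values indicate an integrative function across the entire network. Figure 3 presents log-scale histograms of these three centrality measures, highlighting the contrast between pain and terms such as headache (0.107, 0.109, 0.055), burning (0.182, 0.166, 0.110), and discomfort (0.049, 0.052, 0.024); ache also showed similarly low scores (not shown in the figure).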
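Figure 3A shows degree centrality, which reflects the number of direct lexical connections a node maintains. Pain exhibited the highest value, indicating its role in anchoring a broad range of semantic associations within the network. By comparison, headache and burning maintained fewer but still moderate links, whereas discomfort and ache were sparsely connected, reflecting their marginal connectivity.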
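Figure 3B highlights betweenness centrality, which measures how often a node lies on the shortest paths between other nodes and thus indicates its function as a bridge across semantic subgroups. Here again pain dominated, indicating that it facilitates discursive integration across clusters. Burning and headache showed lower but non-trivial betweenness values, implying partial bridging roles. Discomfort and ache, meanwhile, lay on almost none of these connecting paths, underscoring their peripheral status.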
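Figure 3C quantifies eigenvector centrality, which evaluates a node's influence according to its adjacency to other highly connected terms. The figure shows that pain holds a privileged position within the semantic core, exerting influence through its links with other central terms. Burning ranked second on this metric, although it remained below pain. In contrast, headache, discomfort, and ache showed lower and more stable eigenvector centrality scores, indicating that they occupy more specific and structurally marginal positions.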
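Together, these visualizations reinforce a clear structural hierarchy: whereas pain anchors the core of the network, other symptom terms form secondary or peripheral nodes that play more specialized roles in constrained lexical contexts.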
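Each centrality measure provides a distinct interpretive lens: degree centrality reflects the number of direct connections, betweenness centrality captures a term's bridging role between semantic clusters, and eigenvector centrality quantifies influence within densely connected regions. These distinctions clarify the functional heterogeneity of the symptom vocabulary. Whereas pain anchors the structure by linking diverse terms and domains, other expressions such as headache and burning occupy more context-dependent, local roles.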
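The three measures described above can be illustrated on a small graph. The following is a minimal sketch using NetworkX; the edge list is a hypothetical toy example, not the study's data, and is arranged only so that one term acts as a hub.

```python
import networkx as nx

# Hypothetical toy co-occurrence edges (NOT the study's data).
edges = [
    ("pain", "headache"), ("pain", "burning"), ("pain", "discomfort"),
    ("pain", "ache"), ("pain", "chronic"), ("headache", "discomfort"),
    ("headache", "sinus"), ("burning", "glass"),
]
G = nx.Graph(edges)

# Degree: share of other nodes a term is directly linked to.
degree = nx.degree_centrality(G)
# Betweenness: share of shortest paths that pass through a term.
betweenness = nx.betweenness_centrality(G, normalized=True)
# Eigenvector: influence derived from being linked to well-connected terms.
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

for term in sorted(G, key=degree.get, reverse=True):
    print(f"{term:<11} {degree[term]:.3f} "
          f"{betweenness[term]:.3f} {eigenvector[term]:.3f}")
```

In this toy graph the hub term ranks first on all three measures, while terms attached by a single edge have zero betweenness, which mirrors the core-periphery contrast described in the text.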
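This section analyzes how symptom-related terms cluster into semantically coherent communities, building on the centrality results above. Using Louvain modularity analysis, we examined whether structurally prominent terms such as pain anchor broader thematic clusters, and how other expressions are organized into distinct experiential or symbolic subdomains.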
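As shown in Figure 4, the network was partitioned into several color-coded communities. The largest, labeled Community 2, contained 225 nodes and functioned as the central hub of general pain discourse. Its average degree centrality was 0.0071, its betweenness centrality 0.0045, and its eigenvector centrality 0.0464. These values suggest that this community not only exhibits dense internal connectivity but also serves as a semantic bridge to other subgroups. With an average edge weight of 5.9062, terms in this cluster tended to co-occur frequently, strengthening their mutual contextual association.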
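The presence of other, smaller communities reflects the thematic diversification of pain-related expressions. For example, Community 0, with 44 nodes and an average degree centrality of 0.0089, may represent a moderately connected but distinct conceptual subgroup. Community 3, consisting of 14 nodes, stood out for its relatively high centrality of 0.0090, implying a mediating role despite its small size.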
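Closer examination of the semantic profiles of specific communities reveals how pain-related terms are structured across experiential, emotional, and symbolic dimensions. One community, colored red and centered on the term discomfort, consisted mainly of relational and evaluative expressions such as who, would, and deserve. This linguistic configuration suggests that discomfort is typically expressed in interpersonal or normative contexts rather than through direct sensory description. Another cluster, shown in light blue and structured around headache, contained both symptom terms such as sinus and minor and systemic or metaphorical terms such as border, legacy, and outsourcing. This distribution highlights that headache functions both as a literal symptom and as a metaphor for social or cognitive burden.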
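A third notable cluster, shown in purple and organized around burning, combined somatic references such as swollen, legs, and bony with emotional, social, or spiritual words such as regret, church, and world. This mixture illustrates that burning serves both as a physiological descriptor and as a symbolic or emotional expression in discourse.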
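Taken together, these findings suggest that social media users construct pain not merely as a bodily sensation but as an emotionally and morally charged experience. The network structure indicates that linguistic representations of pain diverge across bodily, emotional, and social registers, yet are frequently reconnected through central terms that bridge these domains. This pattern of differentiation and reintegration underscores the complexity of symptom discourse in online settings.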
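This section compares pain with emotion-related terms, building on the preceding findings that established the structural dominance and thematic centrality of pain within the symptom discourse network. To determine whether this prominence reflects general affective salience or a distinctive structural role, we examined the centrality metrics of pain relative to two core emotion terms: fear and nervousness.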
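Table 1 summarizes the results of this comparison. Across all three centrality metrics (degree, betweenness, and eigenvector centrality), pain exhibited markedly higher values. For example, its degree centrality (0.821429) was more than six times the highest value in the fear group (0.0937) and substantially higher than that in the nervousness group (0.1297). A similar pattern was observed for betweenness centrality (0.930134 for pain versus 0.2025) and eigenvector centrality (0.695893 for pain versus 0.3426).
Permutation tests (n = 10,000) confirmed that the centrality scores of pain were significantly higher than those observed in the emotion-related groups (p < 0.0001 for all metrics). These results indicate that pain is not merely a frequent or affectively salient term but also acts as a structurally dominant hub within the symptom discourse network. Its high connectivity and bridging role distinguish it from typical emotion terms, which tend to cluster within narrow affective contexts. This supports the interpretation of pain as a central organizing term that integrates diverse semantic domains.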
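One way to read the permutation comparison above is as a Monte Carlo p-value: values are drawn repeatedly from the pooled emotion-term centralities, and the p-value is the share of draws at least as large as the observed score. The sketch below follows that reading. It is an assumption about the procedure rather than the authors' code, and the null values and the function name `empirical_p_value` are hypothetical.

```python
import random

def empirical_p_value(observed, null_values, n_draws=10_000, seed=0):
    """One-sided Monte Carlo p-value of `observed` against a pooled null.

    Uses the (k + 1) / (n + 1) correction so the estimate is never zero.
    """
    rng = random.Random(seed)
    draws = (rng.choice(null_values) for _ in range(n_draws))
    exceed = sum(1 for value in draws if value >= observed)
    return (exceed + 1) / (n_draws + 1)

# Hypothetical pooled degree centralities of emotion-related terms.
null_degree = [0.02, 0.05, 0.09, 0.13]
p = empirical_p_value(0.821429, null_degree)
print(f"p = {p:.6f}")
```

With 10,000 draws the smallest attainable value is 1/10,001, which is why a result of this kind is reported as p < 0.0001 rather than as zero.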
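Building on the preceding structural analyses, this section evaluates the stability of key symptom-related terms using centrality metrics and their standard deviations. Figure 5 provides a comparative visualization of centrality values and their variability for five key terms (pain, headache, burning, discomfort, and ache) across the three centrality measures. Each panel corresponds to one measure: Fig. 5A shows degree centrality, Fig. 5B betweenness centrality, and Fig. 5C eigenvector centrality. Points indicate mean centrality scores, and vertical error bars indicate standard deviations computed from bootstrap resampling.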
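In Fig. 5A, pain showed the highest centrality with a moderate standard deviation, indicating that it consistently forms a set of direct lexical connections across samples. Burning and headache followed, with lower scores and slightly higher variability. Discomfort and ache showed both low centrality and limited variation, indicating marginal connectivity and functional specificity within the network.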
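Figure 5B highlights the role of pain as a semantic bridge, with a betweenness centrality of 0.930134 and substantial variability, reflecting its dynamic positioning between contextual subgroups. Headache had a standard deviation of 0.161610, larger than that of other non-central nodes, indicating a context-dependent bridging function. In contrast, discomfort and ache rarely appeared on the shortest paths between clusters, confirming their peripheral discursive roles.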
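Figure 5C quantifies influence within densely connected regions. Pain again dominated; burning was second in influence but showed greater variance. Headache, discomfort, and ache all showed lower, more stable eigenvector centrality, reinforcing their limited integration into the semantic core of the network.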
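To further assess the robustness of these findings, 95% confidence intervals (CIs) were computed using the bootstrap standard deviations. Pain had the widest CI (0.193–1.450), underscoring its dominant and context-sensitive role. The CI for headache was 0.020–0.195, and that for burning was 0.034–0.329. Discomfort and ache retained lower and narrower ranges, each below 0.090, indicating stable but structurally marginal positions.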
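To determine whether centrality variability differs by semantic category, we compared the standard deviations of degree centrality across three lexical groups: pain-related, metaphorical, and emotional terms. Statistical testing showed that the variability of pain-related terms was significantly higher than that of metaphorical terms (Welch's t = 4.11, p = 0.0052; permutation p = 0.0034), whereas there was no significant difference between pain-related and emotional terms (p > 0.77). These results indicate that variability is not randomly distributed but reflects functional distinctions within the discourse structure.
Taken together, the centrality stability profiles clarify the structural resilience and contextual adaptability of symptom-related terms. Pain consistently occupies a central yet flexible role, whereas headache and burning function in narrower, more variable contexts. By contrast, discomfort and ache remain stable but peripheral, contributing minimally to the broader organization of symptom-related language. These findings highlight the importance of incorporating stability metrics into network-based discourse analysis.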
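The Welch comparison of group variabilities reported above can be sketched with the standard library alone. The function below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom; the two input lists are hypothetical standard deviations, not the study's values, and obtaining a p-value would additionally require a t-distribution (for example from SciPy), which is omitted here.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's unequal-variance t statistic and its degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical bootstrap standard deviations of degree centrality.
pain_related_sd = [0.20, 0.16, 0.09, 0.12, 0.18]
metaphorical_sd = [0.03, 0.05, 0.02, 0.04]

t, df = welch_t(pain_related_sd, metaphorical_sd)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Welch's form is a reasonable choice when the groups differ in size and spread, since it does not assume equal variances.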
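This study revealed a clear structural hierarchy in pain-related discourse, with pain serving as a semantic anchor that organizes symptom expression across diverse communicative contexts [3,5,18]. This centrality goes beyond frequency and reflects the role of pain as a linguistic attractor that binds disparate sensory, emotional, and cognitive descriptors into coherent symptom narratives [3,10]. The low overall density of the network (0.0055) was offset by a high clustering coefficient (0.7700), indicating tightly knit local groupings. In addition, the network had a diameter of 5, meaning that even the most distant terms are connected by five or fewer co-occurrence steps, which demonstrates efficient semantic linkage across the lexical structure [19,20].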
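Within this framework, pain consistently dominated all centrality measures, whereas the terms headache, burning, and discomfort occupied more peripheral or context-specific positions [21,22]. This asymmetric configuration suggests that symptom discourse is not evenly distributed but is shaped around key organizing principles [20]. The observed hierarchy also reflects the phenomenological reality that pain is both a descriptive label and an interpretive frame through which experience is conceptualized [5]. Unlike other sensory or affective terms, which cluster within narrow emotional subdomains, pain displays remarkable semantic breadth, linking physiological states with psychological and social meanings [3,5,10].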
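This structural organization has important implications for how individuals conceptualize and communicate suffering, positioning pain not only as a symptom descriptor but also as a foundational organizing concept in health-related language [3,5,10].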
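Although centrality metrics offer valuable insight into lexical prominence, they alone cannot capture the full functional complexity of symptom-related vocabulary [21,22]. The term burning, for example, attained a measurable level of centrality in the network, but closer inspection revealed a disconnect between its structural presence and its clinical function [23,24]. Co-occurrence analysis in the GoEmotions corpus showed that burning frequently appeared in metaphorical or decorative contexts, such as burning glass or burning wood, rather than in pain-related or emotional narratives [23,24,25].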
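Contrary to its status as a hallmark descriptor of neuropathic pain, burning did not remain semantically close to other pain terms in digital discourse, underscoring a fundamental divergence between medical language and everyday usage [26,27]. Similarly, headache showed high variability across centrality metrics, indicating that its discursive role shifts with context [27,28]. This functional plasticity implies that such terms operate as context-sensitive bridges in discourse, adjusting their meaning according to the narrative frame [29,30].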
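These observations underscore the need for contextual validation in network-based health communication research [24,29]. Metrics alone cannot account for the pragmatic dimension of lexical meaning. Instead, terms must be evaluated within the discursive ecosystems they inhabit, where semantic roles are dynamically negotiated rather than statically defined [30,31].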
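The pragmatic divergence observed above, particularly for burning, exposes a deeper structural problem in the clinical use of diagnostic vocabulary [31,32]. In interstitial cystitis/bladder pain syndrome, a burning sensation is a central diagnostic criterion embedded in standardized questionnaires [33,34,35]. Yet our analysis shows that burning is rarely used in pain-related contexts on social media platforms. This suggests that younger, digitally native populations may not associate their lived sensory experiences with the terms used in formal diagnostic instruments, increasing the risk of under-reporting, under-recognition, or delayed diagnosis [31,36].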
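This gap is not isolated. In cardiovascular medicine, chest discomfort is widely used to characterize myocardial infarction, yet in online discourse the term discomfort frequently appears in psychological or situational frames rather than in somatic symptom contexts [37,38]. Similarly, tingling, a key term for describing neuropathic symptoms such as those of diabetic neuropathy or retrosternal neuralgia, is often used metaphorically to express emotional states [39,40].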
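These patterns reflect broader generational and linguistic shifts that challenge the assumptions embedded in patient-reported outcome measures and symptom checklists [41,42]. When clinical vocabulary fails to align with the language patients use organically, the risk of miscommunication increases [43,44]. Bridging this semantic gap, guided by empirical discourse analysis, will require systematic updates to diagnostic instruments to ensure that clinical tools remain comprehensible, resonant, and valid [45,46].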
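The variability observed in the centrality measures should not be regarded as methodological noise but interpreted as a structural indicator of functional diversity in symptom discourse [47]. This variability reflects the adaptive capacity of lexical items to operate across multiple narrative contexts, revealing fundamental differences in how terms function within the semantic ecosystem [48,49]. Terms that combine high centrality with moderate instability, such as pain, exhibit considerable functional plasticity, maintaining structural dominance while adapting to diverse communicative situations [3]. This flexibility enables pain to act as a versatile organizing principle across physiological, emotional, and social registers [5,6].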
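In contrast, terms such as discomfort and ache showed minimal variability and low centrality, indicating constrained semantic function and a stable but peripheral role. Their consistent positioning reflects specialized usage patterns that resist contextual adaptation, indicating functional narrowness within pain discourse [13,21]. Statistical comparison across semantic categories further clarified this pattern: pain-related terms showed significantly higher variability than metaphorical expressions (Welch's t = 4.11, p = 0.0052) while showing no significant difference from emotional terms, suggesting that core symptom descriptors share the contextual flexibility characteristic of emotional language [50]. These findings reveal a basic principle of lexical organization in health discourse: the coexistence of a "fluid center" and a "fixed periphery". Whereas central terms such as pain maintain their structural importance through adaptive flexibility, peripheral terms retain their positions through functional specificity [17,20]. This hierarchical arrangement reflects how the symptom vocabulary balances semantic stability against communicative versatility, allowing both precise description and flexible negotiation of meaning [51].
Understanding the structural hierarchy and functional variability of symptom vocabulary has important implications for clinical communication and practice. Building on the structural insights outlined above, pain can be used strategically as a central lexical anchor in diagnostic settings, giving clinicians a valuable entry point for eliciting more comprehensive symptom narratives [5,7]. Distinguishing stable peripheral terms from variable bridging concepts provides a framework for interpreting patient language patterns and, by focusing on the lexical nodes around which patients naturally construct their experience, may improve triage efficiency and educational interventions [29,46]. The divergence between clinical terminology and everyday language use, particularly for terms such as burning, points to the need for bridging models that translate between specialized medical discourse and patient-generated descriptions. By recognizing that symptom descriptors function differently across contexts, such translation frameworks could enhance patient communication through more accurate interpretation of patient reports and more effective health education strategies [31,36].
However, several limitations constrain the immediate clinical applicability of these findings. In particular, social media users represent a self-selected and demographically skewed population, typically younger, digitally fluent, and culturally specific, which limits the generalizability of these findings to more diverse clinical populations.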
The reliance on social media data may not fully capture the linguistic patterns present in direct clinical encounters, and the temporal stability of these network structures requires validation across different time periods and demographic populations [9,24]. Moreover, while our quantitative network analysis provides structural insights into pain-related discourse, it should be interpreted alongside qualitative and ethnographic approaches that capture the lived experiences, cultural narratives, and interpersonal dynamics that shape how pain is communicated and understood [52,53]. Future research should integrate electronic medical records with patient-reported outcome measures to validate the clinical relevance of social media-derived language patterns [11,12]. Although our analyses confirmed the robustness of key network properties, we acknowledge that formal comparisons with graph-randomized or degree-preserving null models were not performed. Incorporating such null model frameworks would allow more rigorous inference of structural significance for metrics like clustering and modularity, and represents an important direction for future methodological work.
From a methodological perspective, this discrete mathematical approach to symptom language analysis provides a quantitative foundation for advancing personalized patient communication strategies in clinical practice [54,55]. The development of hybrid analytical frameworks that bridge social media linguistics with clinical communication represents a promising direction [56]. Such approaches could potentially transform network-derived centrality metrics into predictive indicators for patient communication preferences, symptom progression patterns, and treatment adherence, ultimately advancing personalized approaches to clinical dialogue and care delivery.
This study analyzed publicly available data from the GoEmotions dataset, which contained anonymized Reddit comments. According to the platform's terms of service, Reddit users consent to their public posts being viewed and analyzed. No additional ethical approval was required as this study used only publicly available, anonymized data and did not involve any direct human participant interaction. All data handling complied with Reddit's terms of service and data-usage policies.
An analysis was conducted on 57,000 Reddit comments from the GoEmotions dataset (2005–2019), which provides emotion-labeled social media texts [4,8,9]. Given the growing body of evidence suggesting that emotionally annotated language corpora can validly reflect underlying psychological constructs, including affective states, somatic perception, and interoceptive awareness [3,5,18,37], this dataset offers a suitable foundation for analyzing spontaneously expressed pain-related discourse. Recent studies further support the use of word embeddings and emotion-tagged corpora to infer nuanced emotional and bodily experiences from naturalistic text [3,57]. The dataset was pre-processed using a multistage approach. Initial cleaning involved the removal of toxic or offensive content through a combination of automated filtering (using predefined word lists) and manual annotation. To ensure the reliability of toxicity identification, a random subset of comments was independently reviewed by two annotators, yielding substantial inter-rater agreement (Cohen's κ = 0.85). In addition, only comments originating from subreddits with more than 10,000 posts were included to ensure data quality and adequate contextual richness.
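For readers unfamiliar with the agreement statistic cited above, Cohen's κ compares the observed agreement between two annotators with the agreement expected by chance from their label frequencies. The following is a minimal sketch; the label lists are hypothetical and the function name `cohen_kappa` is introduced only for illustration.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both annotators use a single label.
    return (observed - expected) / (1 - expected)

# Hypothetical toxicity labels (1 = toxic, 0 = not toxic) for ten comments.
annotator_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Here the annotators agree on nine of ten items (0.9 observed) against 0.5 expected by chance, giving κ = 0.8.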
Text preprocessing was performed using a custom-built NLP pipeline in Python 3.8. This included tokenization via the Penn Treebank tokenizer from the NLTK toolkit, followed by stop word removal and the elimination of special characters [25,27]. To improve terminological consistency, medical terms were standardized using a modified version of the Unified Medical Language System (UMLS) metathesaurus [31,36]. The accuracy of this standardization process was verified through manual inspection of a randomly sampled subset of 1,000 comments, achieving approximately 95% concordance between the output and clinical reference forms.
To isolate pain-related discourse from the GoEmotions dataset, we applied a keyword-based filtering strategy. The pain-related keyword list was developed iteratively by two practicing physicians through interactive sessions with a large language model, ChatGPT (OpenAI, San Francisco, CA, USA), with the goal of maximizing clinical relevance and semantic coverage. The final list was reviewed and refined under the supervision of an English language specialist (K.O.) to ensure terminological precision and consistency with biomedical discourse norms.
The resulting lexicon included over 90 pain-related terms encompassing urogenital, musculoskeletal, neuropathic, inflammatory, and psychosomatic categories. These terms were matched using case-insensitive substring search following tokenization. The full list used for filtering comprises:
genital pain, urinary pain, sexual dysfunction, dyspareunia, pain during sexual intercourse, pelvic pain, vaginal pain, erectile pain, testicular pain, bladder pain, reproductive organ pain, prostate pain, menstrual pain, intercourse pain, painful urination, genital discomfort, sexual pain, abdominal pain, headache, migraine, back pain, neck pain, shoulder pain, joint pain, muscle pain, chronic pain, acute pain, leg pain, foot pain, hand pain, stomach ache, toothache, sinus pain, chest pain, rib pain, arthritis pain, cramping, burning, throbbing, ache, soreness, discomfort, numbness, stiffness, tenderness, inflammation, spasm, nerve pain, perineal pain, bowel symptoms, urinary symptoms, bladder distension, dysuria, suprapubic pain, urethral burning, hesitancy, incontinence, frequency, urgency, sleep disturbance, fatigue, insomnia, depression, anxiety, hopelessness, sadness, loneliness, guilt, stress, apathy, worthlessness, isolation, lethargy, mood swings, irritability, withdrawal, appetite loss, emptiness, despair, suicidal thoughts, restlessness, helplessness, disability, posture issues, mobility limitations, lifestyle impact, backache, sciatica, bleeding, bloating, malnutrition, dehydration, anemia, weight loss, fever, bloody stools, diarrhea, mucus, complications, steroid therapy, immunosuppression, strictures.
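The filtering step can be sketched as a case-insensitive substring match. The example below is a simplified illustration: it uses only a small subset of the keyword list, matches against the lowercased raw comment rather than a tokenized form, and the example comments and the function name `matched_keywords` are hypothetical.

```python
# Small subset of the keyword list above, for illustration only.
PAIN_KEYWORDS = ["pelvic pain", "headache", "burning", "ache", "discomfort"]

def matched_keywords(comment, keywords=PAIN_KEYWORDS):
    """Return the keywords found in `comment` by case-insensitive substring search."""
    text = comment.lower()
    return [kw for kw in keywords if kw in text]

# Hypothetical comments (not taken from the corpus).
comments = [
    "My HEADACHE is back again.",
    "Great game last night!",
    "A burning smell came from the kitchen.",
]
filtered = [c for c in comments if matched_keywords(c)]
for comment in filtered:
    print(comment, "->", matched_keywords(comment))
```

Substring matching is deliberately permissive: "headache" also triggers the shorter keyword "ache", and the third comment is retained although "burning" is not used in a pain sense there, which is the kind of non-clinical usage the manual review of extracted comments is meant to catch.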
Although the GoEmotions dataset was originally designed for emotion classification, its inclusion of naturalistic, emotionally annotated text makes it well-suited for identifying spontaneously expressed pain-related content in everyday language.
To assess the reliability of the filtering process, a random sample of 1,000 extracted comments was independently reviewed by N.O. and M.O. The two annotators demonstrated a high level of agreement in identifying pain-related content, confirming the consistency and thematic relevance of the filtered corpus.
Co-occurrence was defined using a five-word sliding window, in which any two terms appearing within the same five-word span were considered co-occurring. This method captures short-range contextual associations while maintaining semantic proximity and has been widely used in lexical network studies. This allowed lexical associations and patterns to emerge naturally from within the filtered subset. By analyzing co-occurrences among 123,840 potential word relationships, central terms such as pain, headache, discomfort, and burning surfaced as prominent nodes within the network.
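The sliding-window definition above can be made concrete with a short sketch. It counts each unordered pair of distinct tokens once for every pair of positions that fall within the same five-word span; the token list and the function name `cooccurrence_counts` are hypothetical, and the frequency normalization mentioned elsewhere in the Methods is not applied here.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=5):
    """Count unordered token pairs that fall within the same `window`-word span."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Tokens at positions i+1 .. i+window-1 share a span with position i.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts

# Hypothetical tokenized comment after stop word removal.
tokens = ["sharp", "pain", "lower", "back", "every", "morning"]
counts = cooccurrence_counts(tokens)
print(counts[("pain", "sharp")])      # adjacent words: counted
print(("morning", "sharp") in counts)  # five positions apart: not counted
```

The resulting pair counts can serve directly as weighted edges of a co-occurrence graph, with each distinct token as a node.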
This data-driven approach enabled the identification of emergent linguistic structures and thematic clusters without relying on externally imposed categories, offering an unbiased view of how pain is framed and communicated in social media.
Network analysis was implemented using NetworkX 2.5, employing a sliding window approach of size 5 (optimized through a sensitivity analysis of sizes 3–7). Co-occurrence weights were calculated using frequency-adjusted normalization to account for baseline term prevalence, with statistical significance assessed using Bonferroni-corrected Spearman's correlation coefficients (p < 0.05).
Advanced statistical analysis
The community structure was detected using the Louvain method with a resolution parameter of 1.0, optimized through modularity maximization (Q > 0.3) [30,31]. These metrics provide complementary perspectives on the network structure of pain-related studies.
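Louvain detection with a resolution of 1.0 can be sketched as follows. This example assumes NetworkX 2.8 or later, where `louvain_communities` is built in; the NetworkX 2.5 release cited above does not include it, so the study's own implementation presumably relied on a different package. The graph is a hypothetical toy example of two dense word groups joined by a single bridge.

```python
import itertools

import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Hypothetical toy graph: two fully connected word groups plus one bridge edge.
group_a = ["pain", "ache", "sore", "hurt"]
group_b = ["fear", "panic", "worry", "dread"]
G = nx.Graph()
G.add_edges_from(itertools.combinations(group_a, 2))
G.add_edges_from(itertools.combinations(group_b, 2))
G.add_edge("hurt", "fear")

# resolution=1.0 matches the setting reported in the text; the seed is arbitrary.
communities = louvain_communities(G, resolution=1.0, seed=42)
Q = modularity(G, communities)

for members in communities:
    print(sorted(members))
print(f"modularity Q = {Q:.3f}")
```

For this graph the two word groups are recovered as separate communities with Q of roughly 0.42, above the Q > 0.3 threshold mentioned in the text.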
Network stability was validated using both internal and external approaches. Internal validation employed bootstrap analysis (1000 iterations) with 80/20 data splits, maintaining edge weight distribution stability (coefficient of variation < 15%) and cross-validation with fivefold partitioning stratified by year [32]. For external validation, network findings were cross-referenced with established clinical literature through a systematic review of pain comorbidity studies (2000–2023) [33].
Network visualization was implemented using a modified force-directed layout algorithm in Python, with node sizes scaled logarithmically by term frequency. Edge weights were represented by a continuous color gradient from light grey (weak correlation) to black (strong correlation, r ≥ 0.7) [34]. Term categories were distinguished by color: pain terms in red, associated symptoms in blue, and psychological terms in green, with node opacity reflecting term specificity scores. All visualizations were optimized for colorblind accessibility according to established guidelines. No comparisons with graph-randomized or null networks were performed in this study; however, network stability was assessed through bootstrap, perturbation, and permutation-based validation strategies.
Text processing quality was verified through a manual review of a stratified random sample (10% of the corpus) by two independent reviewers, achieving high inter-rater reliability (Cohen's κ = 0.88 for term classification) [35]. Network stability was further evaluated through perturbation analysis, where up to 20% of the edges were randomly removed to assess changes in the community structure [31,36].
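A perturbation check of this kind can be sketched by deleting a random 20% of edges and re-running community detection. The example below is an illustration under stated assumptions: it uses a synthetic connected caveman graph rather than the study's network, relies on `louvain_communities` from NetworkX 2.8 or later, and the helper name `remove_random_edges` is hypothetical.

```python
import random

import networkx as nx
from networkx.algorithms.community import louvain_communities

def remove_random_edges(G, fraction=0.2, seed=0):
    """Return a copy of G with a random `fraction` of its edges removed."""
    rng = random.Random(seed)
    edges = list(G.edges())
    H = G.copy()
    H.remove_edges_from(rng.sample(edges, int(fraction * len(edges))))
    return H

# Synthetic stand-in for a co-occurrence network: 4 cliques of 6 nodes in a ring.
G = nx.connected_caveman_graph(4, 6)
H = remove_random_edges(G, fraction=0.2, seed=1)

before = louvain_communities(G, seed=1)
after = louvain_communities(H, seed=1)
print(f"edges: {G.number_of_edges()} -> {H.number_of_edges()}")
print(f"communities: {len(before)} -> {len(after)}")
```

Repeating the removal over many seeds and comparing the resulting partitions would give a distribution of community changes rather than a single comparison.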
To evaluate the statistical significance and variability of key node centrality, we focused on the term pain and its comparison with emotion-related terms such as fear and nervousness. Separate co-occurrence networks were constructed for each emotional category using identical preprocessing and windowing parameters. Centrality measures (degree, betweenness, and eigenvector) were extracted from each network.
For each metric, we computed the maximum and mean values for the emotion-related networks and compared them with those of pain. To assess statistical significance, permutation tests (n = 10,000) were conducted using the combined centrality distribution from emotion-related terms as the null distribution. Pain exhibited significantly higher centrality in all metrics (p < 0.0001). In addition, to estimate the variability of pain and other key nodes (e.g., headache, burning), we performed a bootstrap analysis (1,000 iterations). In each iteration, 80% of the corpus was resampled with replacement, and degree centrality was recalculated using the same network parameters. The resulting 95% confidence intervals (mean ± 1.96 × SD) provided estimates of centrality stability across corpus subsets. This analysis complements the network-wide validation and clarifies how robustly specific terms maintain their structural prominence.
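The bootstrap procedure described above can be sketched in pure Python. This is a simplified illustration, not the authors' pipeline: degree centrality is computed directly from five-word windows as the number of distinct neighbours divided by the vocabulary size minus one, the corpus is a hypothetical handful of tokenized comments, and the function names are introduced here for illustration.

```python
import random
from statistics import mean, stdev

def degree_centrality(comments, term, window=5):
    """Degree centrality of `term` in a sliding-window co-occurrence graph."""
    vocabulary, neighbours = set(), set()
    for tokens in comments:
        vocabulary.update(tokens)
        for i, token in enumerate(tokens):
            if token == term:
                # Words within window-1 positions on either side share a span.
                lo, hi = max(0, i - window + 1), i + window
                neighbours.update(t for t in tokens[lo:hi] if t != term)
    if term not in vocabulary or len(vocabulary) < 2:
        return 0.0
    return len(neighbours) / (len(vocabulary) - 1)

def bootstrap_ci(comments, term, iterations=1000, fraction=0.8, seed=0):
    """95% interval (mean +/- 1.96 * SD) of degree centrality over resamples."""
    rng = random.Random(seed)
    k = max(1, int(fraction * len(comments)))
    scores = [
        degree_centrality(rng.choices(comments, k=k), term)  # with replacement
        for _ in range(iterations)
    ]
    m, sd = mean(scores), stdev(scores)
    return m - 1.96 * sd, m + 1.96 * sd

# Hypothetical tokenized comments (not taken from the corpus).
corpus = [
    ["chronic", "pain", "in", "my", "back"],
    ["pain", "and", "fear"],
    ["headache", "again", "today"],
    ["burning", "pain", "in", "legs"],
    ["no", "pain", "today"],
]
low, high = bootstrap_ci(corpus, "pain", iterations=200, seed=1)
print(f"95% CI for 'pain': {low:.3f} to {high:.3f}")
```

Because the interval is formed as mean ± 1.96 × SD rather than from percentiles, its upper bound can exceed 1 for a highly central and variable term, even though degree centrality itself cannot.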
Network characteristics were analyzed using the centrality analysis framework implemented in NetworkX 2.5. This included the extraction and normalization of 11 key pain descriptors, which were analyzed for co-occurrence patterns using a five-word window. The analysis revealed 363 significant relationships among 309 unique terms, with co-occurrence strength normalized by the total term frequencies to prevent bias in high-frequency terms.
Network analysis employed three centrality measures (degree, betweenness, and eigenvector centrality) to identify the key hub terms and bridge concepts. Community detection was performed using the Louvain method (resolution parameter: 1.0), with significance established through bootstrap analysis (1000 iterations) and Bonferroni correction. The final integration stage included the cross-validation of centrality measures and sensitivity analyses for parameter stability. The results were visualized using matplotlib and seaborn libraries, with node sizes reflecting centrality values, and edge weights representing co-occurrence strength.
Publicly available datasets were analyzed in this study. This data can be found at: https://github.com/hplisiecki/emotion_topology.
The code used in this study is available at: https://github.com/Okuinobuo/PainAnalysisNLP/
1. Bradley, M. M. & Lang, P. J. Affective norms for English words (ANEW): instruction manual and affective ratings. Technical report C-2. Univ. Florida, Gainesville (2010).
2. Ekman, P., Davidson, R. J. & Friesen, W. V. The Duchenne smile: Emotional expression and brain physiology. II. J. Pers. Soc. Psychol. 58, 342–353 (1990).
3. Jackson, J. C. et al. From text to thought: How analyzing language can advance psychological science. Perspect. Psychol. Sci. 17, 805–826 (2022).
4. Saffar, A. H., Mann, T. K. & Ofoghi, B. Textual emotion detection in health: Advances and applications. J. Biomed. Inform. 137, 104258 (2023).
5. Craig, K. D. Toward the social communication model of pain. In Social and Interpersonal Dynamics in Pain (eds Vervoort, T. et al.) 23–41 (Springer, 2018).
6. Geniusas, S. The Phenomenology of Pain (Ohio University Press, 2020).
7. Miglio, N. & Stanier, J. Beyond pain scales: A critical phenomenology of the expression of pain. Front. Pain Res. 3, 895443 (2022).
8. Barrett, L. F., Quigley, K. S. & Hamilton, P. An active inference theory of allostasis and interoception in depression. Philos. Trans. R. Soc. B Biol. Sci. 371, 20160011 (2016).
9. Kabir, M. K., Islam, M., Kabir, A. N. B., Haque, A. & Rhaman, M. K. Detection of depression severity using Bengali social media posts on mental health: Study using natural language processing techniques. JMIR Form. Res. 6, e36118 (2022).
10. Demszky, D., Movshovitz-Attias, D., Ko, J., Cowen, A., Nemade, G. & Ravi, S. GoEmotions: a dataset of fine-grained emotions. Preprint at https://arxiv.org/abs/2005.00547 (2020).
11. Heintzelman, N. H. et al. Longitudinal analysis of pain in patients with metastatic prostate cancer using natural language processing of medical record text. J. Am. Med. Inform. Assoc. 20, 898–905 (2013).
12. Bacco, L. et al. Natural language processing in low back pain and spine diseases: A systematic review. Front. Surg. 9, 957085 (2022).
13. Cardoso, F. M. et al. Effect of network topology and node centrality on trading. Sci. Rep. 10, 11113 (2020).
14. Liu, M., Zou, X., Chen, J. & Ma, S. Comparative analysis of social support in online health communities using a word co-occurrence network analysis approach. Entropy 24, 174 (2022).
15. Newman, M. E. J. Networks: An Introduction (Oxford University Press, 2010). https://doi.org/10.1093/acprof:oso/9780199206650.001.0001.
16. Blondel, V. D., Guillaume, J. L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. 2008, P10008 (2008).
17. Cavallaro, L., De Meo, P., Fiumara, G. & Liotta, A. On the sensitivity of centrality metrics. PLoS ONE 19, e0299255 (2024).
18. Bunzli, S. et al. How do people communicate about knee osteoarthritis? A discourse analysis. Pain Med. 22, 1127–1148 (2021).
19. Fernández-de-las-Peñas, C. et al. Understanding sensitization, cognitive and neuropathic associated mechanisms behind post-COVID pain: A network analysis. Diagnostics 12, 1538 (2022).
20. Barabási, A.-L. Network Science (Cambridge University Press, 2016).
21. Oldham, S. et al. Consistency and differences between centrality measures across distinct classes of networks. PLoS ONE 14, e0220061 (2019).
22. Yadav, A. A comparative analysis of centrality measures in complex networks. Autom. Remote Control 85, 685–695 (2024).
23. Li, Z. et al. Temporal grading index of functional network topology predicts pain perception of patients with chronic back pain. Front. Neurol. 13, 899254 (2022).
24. Wu, S. et al. Deep learning in clinical natural language processing: A methodical review. J. Am. Med. Inform. Assoc. 27, 457–470 (2020).
25. Denecke, K. & Reichenpfader, D. Sentiment analysis of clinical narratives: A scoping review. J. Biomed. Inform. 140, 104336 (2023).
26. Bouhassira, D. et al. Comparison of pain syndromes associated with nervous or somatic lesions and development of a new neuropathic pain diagnostic questionnaire (DN4). Pain 114, 29–36 (2005).
27. Zhang, Y., Zhang, Y., Qi, P., Manning, C. D. & Langlotz, C. P. Biomedical and clinical English model packages for the Stanza Python NLP library. J. Am. Med. Inform. Assoc. 28, 1892–1899 (2021).
28. Cheung, T. et al. Network analysis of depressive symptoms in Hong Kong residents during the COVID-19 pandemic. Transl. Psychiatry 11, 460 (2021).
29. Jones, P. J., Ma, R. & McNally, R. Bridge centrality: A network approach to understanding comorbidity. Multivar. Behav. Res. 56, 353–367 (2019).
30. Zamani Esfahlani, F. et al. Modularity maximization as a flexible and generic framework for brain network exploratory analysis. Neuroimage 244, 118607 (2021).
31. Xu, D. et al. Unified Medical Language System resources improve sieve-based generation and Bidirectional Encoder Representations from Transformers (BERT)-based ranking for concept normalization. J. Am. Med. Inform. Assoc. 27, 1510–1519 (2020).
32. Neal, Z. P. How strong is strong? The challenge of interpreting network edge weights. PLoS ONE 19, e0311614 (2024).
33. Doiron, R. C., Nickel, J. C. & Siemens, D. R. Diagnosis and management of interstitial cystitis/bladder pain syndrome: CUA guideline. Can. Urol. Assoc. J. 19, 92–102 (2025).
34. Yu, W.-R., Jiang, Y.-H., Jhang, J.-F. & Kuo, H.-C. Bladder pain syndrome associated with interstitial cystitis: Recent research and treatment options. Curr. Bladder Dysfunct. Rep. 18, 389–400 (2023).
35. Clemens, J. Q., Erickson, D. R. & Varela, N. P. Diagnosis and treatment of interstitial cystitis/bladder pain syndrome: AUA guideline amendment. J. Urol. 208, 34–42 (2022).
36. Zheng, L. et al. A review of auditing techniques for the Unified Medical Language System. J. Am. Med. Inform. Assoc. 27, 1625–1638 (2020).
37. Altstidl, J. et al. Absence of chest discomfort in type 1 NSTEMI patients: Predictors and impact on outcome. Clin. Res. Cardiol. 114, 1234–1245 (2025).
38. Kumar, A. et al. Chest pain symptoms during myocardial infarction in patients with and without diabetes: A systematic review and meta-analysis. Heart 109, 1516–1524 (2023).
39. Baba, M., Kuroha, M., Wasaki, Y. & Ohwada, S. Effects of mirogabalin on tingling or pins & needles in a phase 3 study of diabetic peripheral neuropathy. J. Jpn. Soc. Pain Clin. 27, 287–295 (2020).
40. Wu, S., Wahle, J. P. & Mohammad, S. The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part Mentions. arXiv:2505.16189 (2025).
41. Borbjerg, M. K. et al. Understanding the impact of diabetic peripheral neuropathy and neuropathic pain on quality of life and mental health in 6,960 people with diabetes. Diabetes Care 48, 588–595 (2025).
42. Tofthagen, C., Visovsky, C., Dominic, S. & McMillan, S. Neuropathic symptoms, physical and emotional well-being, and quality of life at the end of life. Support. Care Cancer 27, 3357–3364 (2019).
43. Hu, Y., Keloth, V. K., Raja, K., Chen, Y. & Xu, H. Towards precise PICO extraction from abstracts of randomized controlled trials using a section-specific learning approach. Bioinformatics 39, btad542 (2023).
44. Engels, G. et al. Clinical pain and functional network topology in Parkinson's disease: A resting-state fMRI study. J. Neural Transm. 125, 1449–1459 (2018).
45. Brusco, M. J., Steinley, D. & Watts, A. L. On maximization of the modularity index in network psychometrics. Behav. Res. Methods 55, 3549–3565 (2023).
46. Hoffman, M., Steinley, D., Gates, K. M., Prinstein, M. J. & Brusco, M. J. Detecting clusters/communities in social networks. Multivar. Behav. Res. 53, 57–73 (2018).
47. Filosi, M., Visintainer, R., Riccadonna, S., Jurman, G. & Furlanello, C. Stability indicators in network reconstruction. PLoS ONE 9, e89815 (2014).
48. Mokhtari, F., Akhlaghi, M. I., Simpson, S. L., Wu, G. & Laurienti, P. J. Sliding window correlation analysis: Modulating window shape for dynamic brain connectivity in resting state. Neuroimage 189, 655–666 (2019).
49. Shakil, S., Lee, C. H. & Keilholz, S. D. Evaluation of sliding window correlation performance for characterizing dynamic functional connectivity and brain states. Neuroimage 133, 111–128 (2016).
50. Daly, C. H. et al. Empirical evaluation of SUCRA-based treatment ranks in network meta-analysis: Quantifying robustness using Cohen's kappa. BMJ Open 9, e024625 (2019).
51. Wongpakaran, N., Wongpakaran, T., Wedding, D. & Gwet, K. L. A comparison of Cohen's Kappa and Gwet's AC1 when calculating inter-rater reliability coefficients: A study conducted with personality disorder samples. BMC Med. Res. Methodol. 13, 61 (2013).
52. Kleinman, A. The Illness Narratives: Suffering, Healing, and the Human Condition (Basic Books, 1988).
53. Good, B. J. Medicine, Rationality, and Experience: An Anthropological Perspective (Cambridge University Press, 1994).
54. Okui, N. Laser treatment for urinary incontinence in elite female athletes analyzed using a discrete mathematics approach. Sci. Rep. 15, 15450 (2025).
55. Okui, N. Innovative decision making tools using discrete mathematics for stress urinary incontinence treatment. Sci. Rep. 14, 9900 (2024).
56. Turner, R. J., Hagoort, K., Meijer, R. J., Coenen, F. & Scheepers, F. E. Bayesian network analysis of antidepressant treatment trajectories. Sci. Rep. 13, 8428 (2023).
57. Plisiecki, H. & Sobieszek, A. Emotion topology: extracting fundamental components of emotions from text using word embeddings. Front. Psychol. 15, 1401084 (2024).
We thank Karen Okui for English language editing and proofreading of the manuscript. We are also grateful to Dr. Machiko Okui for her support in the identification and refinement of pain-related lexical terms. This paper is dedicated to the memory of my late high school friend, mathematician Yasushi Kondoh. Our shared passion for mathematics during those formative years has remained a lasting source of inspiration, ultimately leading to the writing of this work more than four decades later.
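This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
The authors declare no competing interests.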
We used Python libraries with machine learning capabilities for statistical analysis and network visualization. However, no generative AI or AI-assisted technologies were used in the writing or editing of the manuscript text.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Below is the link to the electronic supplementary material.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Sci Rep 15, 29219 (2025). https://doi.org/10.1038/s41598-025-14680-y