Submitted: 2023-06-28
Abstract: Traditional emotional text classification algorithms suffer from emotion feature words whose polarity preference is poorly differentiated and unstable. To address this, an emotional text classification algorithm is proposed that fuses an improved term frequency-inverse document frequency (TF-IDF) model with a dictionary model. First, the frequency distribution and dispersion coefficient of each emotion feature word across corpora of different emotion types are used to measure the differentiation and stability of its polarity preference, yielding a polarity index for the word. Second, this index is used to improve the weights of emotion feature words in the TF-IDF model. Finally, based on the improved TF-IDF model, a supervised classification algorithm with a decision function computes the polarity score of an emotional text, and the harmonic mean of this score and the polarity score obtained from the dictionary model gives the comprehensive polarity score of the text.
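The abstract names two standard quantities without giving formulas: a dispersion coefficient and a harmonic mean of two polarity scores. The sketch below illustrates only their textbook definitions. It assumes that the dispersion coefficient is the coefficient of variation (population standard deviation divided by the mean) and that both polarity scores are positive numbers; the function names and the sample values are hypothetical and are not taken from the paper, and how the paper combines these quantities into its polarity index is not shown here.

```python
from statistics import mean, pstdev


def dispersion_coefficient(values):
    """Coefficient of variation: population std divided by the mean.

    Assumption: this is the 'dispersion coefficient' meant in the abstract.
    Returns 0.0 when the mean is zero to avoid division by zero.
    """
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / m


def harmonic_mean(score_a, score_b):
    """Harmonic mean of two positive scores (hypothetical fusion step)."""
    if score_a <= 0 or score_b <= 0:
        raise ValueError("harmonic mean is only defined here for positive scores")
    return 2 * score_a * score_b / (score_a + score_b)


# Hypothetical inputs: a word's frequency in two sub-corpora, and two scores.
cv = dispersion_coefficient([1, 3])
fused = harmonic_mean(0.8, 0.6)
```

A lower dispersion coefficient means the word's frequency varies less across the samples it is computed over, and the harmonic mean is pulled toward the smaller of the two scores, so a text scores high only when both models agree.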
Keywords: term frequency-inverse document frequency; affective polarity; dispersion coefficient; dictionary model
Article ID: 20241013  CLC number: TP301.6  Document code:
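Funding: Open Research Fund of the Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education, East China Normal University (OP202102).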
Citation:
WANG Kangjing, QIAN Jianghai. An Improved Emotion Classification Algorithm Based on Improved TF-IDF and Dictionary Model[J]. Journal of Shanghai University of Electric Power, 2024, 40(1): 80-86.