Submitted: 2020-03-18
Abstract: This paper surveys the development of text word vectors and pre-trained language models, and systematically organizes and analyzes the core ideas of the key methods. It first describes traditional word vector representation methods and language-model-based text representation methods. It then details research progress on pre-trained language models, including dynamic word vector representations and pre-trained models based on the Transformer architecture. Finally, it points out that exploring more effective ways of fusing multiple modalities, together with transfer learning, will become a development trend in this field.
Keywords: text information processing; word vector; pre-trained language model; Transformer architecture
Article ID: 20204002    CLC number: TP391.1    Document code:
Funding: Natural Science Foundation of Shanghai (19ZR1420800).
Cite this article:
徐菲菲,冯东升.文本词向量与预训练语言模型研究[J].上海电力大学学报,2020,36(4):320-328.
XU Feifei, FENG Dongsheng. A Survey of Research on Word Vectors and Pretrained Language Models[J]. Journal of Shanghai University of Electric Power, 2020, 36(4): 320-328.