Computer Science

Sentence classification algorithm based on multi-kernel support vector machine

  • Kaiyan XIAO,
  • Jie LIAN
  • The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China

Received date: 2022-11-26

Online published: 2023-11-23

Funding

Natural Science Foundation of Shanghai (20ZR1440900)

Cite this article

Kaiyan XIAO, Jie LIAN. Sentence classification algorithm based on multi-kernel support vector machine [J]. Journal of East China Normal University (Natural Science), 2023, 2023(6): 85-94. DOI: 10.3969/j.issn.1000-5641.2023.00.008

Abstract

Mainstream sentence classification algorithms rely on a single word vector model to obtain the text representation, which limits their ability to map text into a feature space. To address this, multi-kernel learning is used to fuse text representations built from several different word vectors and thereby improve classification accuracy. When multi-kernel learning fuses different kernel functions, conventional methods for optimizing the kernel function coefficients suffer from long training times and difficulty in finding a local optimum. A new kernel coefficient optimization method is therefore proposed, which partitions the parameter space and uses breadth-first search to approach the optimal coefficient values step by step. With a support vector machine (SVM) as the classifier, classification experiments were carried out on seven text datasets. The results show that multi-kernel learning classifies significantly better than single-kernel learning, and that the proposed optimization method achieves good classification results with fewer training runs than conventional methods.
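The abstract does not spell out the search procedure, so the following is only a minimal sketch of one possible reading of "parameter space partitioning plus breadth-first search", restricted to two kernels with a single mixing weight. The names `combine_kernels`, `search_weight`, `splits`, `keep`, and `depth` are hypothetical and not taken from the paper. The `score` callback stands in for whatever quality measure is used in practice, for example the validation accuracy of an SVM trained on the combined kernel matrix.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]


def combine_kernels(k_a: Matrix, k_b: Matrix, weight: float) -> Matrix:
    """Return the convex combination weight * k_a + (1 - weight) * k_b."""
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(k_a, k_b)
    ]


def search_weight(
    score: Callable[[float], float],
    splits: int = 4,
    keep: int = 2,
    depth: int = 3,
) -> Tuple[float, float, int]:
    """Level-by-level (breadth-first) refinement of a weight in [0, 1].

    Assumption, not the paper's stated algorithm: every interval on the
    current level is cut into `splits` cells, each cell is scored at its
    midpoint, and only the `keep` best cells are expanded on the next level.
    Returns (best_weight, best_score, number_of_score_evaluations).
    """
    frontier = [(0.0, 1.0)]
    best_weight, best_score = 0.5, float("-inf")
    evaluations = 0
    for _ in range(depth):
        scored = []
        for low, high in frontier:
            step = (high - low) / splits
            for i in range(splits):
                cell = (low + i * step, low + (i + 1) * step)
                mid = 0.5 * (cell[0] + cell[1])
                value = score(mid)  # in practice: train and validate one SVM
                evaluations += 1
                scored.append((value, cell))
                if value > best_score:
                    best_weight, best_score = mid, value
        # Expand only the most promising cells on the next level.
        scored.sort(key=lambda item: item[0], reverse=True)
        frontier = [cell for _, cell in scored[:keep]]
    return best_weight, best_score, evaluations


if __name__ == "__main__":
    # Toy score with a known maximum at 0.3, used in place of SVM accuracy.
    weight, value, runs = search_weight(lambda w: -(w - 0.3) ** 2)
    print(weight, value, runs)
    # Toy 2x2 Gram matrices, purely illustrative.
    print(combine_kernels([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]], weight))
```

With the default settings the sketch calls `score` 20 times, whereas a uniform grid at the same final resolution (cells of width 1/64) would need 64 evaluations. That is the kind of saving in training runs the abstract refers to, although the paper's actual procedure and savings may differ.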
