J* E* C* N* U* N* S*, 2025, Vol. 2025, Issue (5): 87-98. doi: 10.3969/j.issn.1000-5641.2025.05.009
• Innovative Practices of Open Source and AI in Education •
Received: 2025-06-27
Accepted: 2025-08-06
Online: 2025-09-25
Published: 2025-09-25
Contact: Xuesong LU, E-mail: xslu@dase.ecnu.edu.cn
Linna XIE, Xuesong LU. Student employment prediction for digital jobs based on behavior in open-source communities[J]. J* E* C* N* U* N* S*, 2025, 2025(5): 87-98.
Table 1 Example of student-to-repository data structure
Student ID | Student education | Repository ID | Repository API |
0 | "school": "Arizona State University", "start_date": "2011", "end_date": "2015", "degrees": ["bachelors", "masters"], "majors": ["Science in Software Engineering"] | 0 | javax.servlet.http.HttpServlet, org.springframework.boot, org.springframework.web, |
 | | 1 | javax.crypto.Cipher, javax.crypto.KeyGenerator, java.nio.file.Paths, java.util.Base64, org.bouncycastle.jce.provider.BouncyCastleProvide, |
 | | 2 | sklearn.model_selection.train_test_split, sklearn.linear_model.LogisticRegression, sklearn.metrics.accuracy_score, |
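The record layout in Table 1 can be sketched in code as follows. This is a minimal illustration, assuming one student owns several repositories and each repository is described by the APIs it uses; the class names `Student` and `Repository` and the helper methods `edges` and `api_vocabulary` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Repository:
    repo_id: int
    apis: List[str]  # API names used by the repository, as listed in Table 1


@dataclass
class Student:
    student_id: int
    education: Dict[str, object]  # school, dates, degrees, majors
    repositories: List[Repository] = field(default_factory=list)

    def edges(self) -> List[Tuple[int, int]]:
        """Return (student_id, repo_id) pairs, one per owned repository."""
        return [(self.student_id, repo.repo_id) for repo in self.repositories]

    def api_vocabulary(self) -> List[str]:
        """Return the sorted set of distinct APIs across all repositories."""
        return sorted({api for repo in self.repositories for api in repo.apis})


# The single example row of Table 1 (values copied from the table).
student = Student(
    student_id=0,
    education={
        "school": "Arizona State University",
        "start_date": "2011",
        "end_date": "2015",
        "degrees": ["bachelors", "masters"],
        "majors": ["Science in Software Engineering"],
    },
    repositories=[
        Repository(0, [
            "javax.servlet.http.HttpServlet",
            "org.springframework.boot",
            "org.springframework.web",
        ]),
        Repository(1, [
            "javax.crypto.Cipher",
            "javax.crypto.KeyGenerator",
            "java.nio.file.Paths",
            "java.util.Base64",
            # This name appears truncated in the source table; kept as-is.
            "org.bouncycastle.jce.provider.BouncyCastleProvide",
        ]),
        Repository(2, [
            "sklearn.model_selection.train_test_split",
            "sklearn.linear_model.LogisticRegression",
            "sklearn.metrics.accuracy_score",
        ]),
    ],
)

print(student.edges())
print(len(student.api_vocabulary()))
```

The `edges` helper returns the student-to-repository links as plain pairs, which is one convenient form for building a graph over students and repositories; how the paper actually constructs its graph is not shown on this page.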
Table 2 Performance comparison of different models for the employment prediction task
Group | Method | Accuracy | Macro-F1 | Micro-F1 |
ML | SVM | 20.81 | 15.38 | 20.81 |
ML | LR | 22.43 | 16.24 | 22.43 |
ML | RF | 24.46 | 17.65 | 24.46 |
GNN | GAT | 28.37 | 19.87 | 23.69 |
GNN | GraphSAGE | 28.98 | 20.21 | 24.35 |
GNN | GCN | 29.86 | 20.54 | 24.89 |
LLM | Mistral-7B | 34.75 | 27.84 | 34.75 |
LLM | Qwen-7B | 35.62 | 28.52 | 35.62 |
LLM | DeepSeek-R1-Distill-Qwen-7B | 36.41 | 29.04 | 36.41 |
LLM + GNN | LLM-as-Encoder | 42.33 | 35.57 | 37.94 |
LLM + GNN | LLM-as-Explainer | 44.12 | 38.23 | 40.11 |
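The three metrics in Table 2 can be computed as in the sketch below. It assumes a single-label, multi-class setting (each student has exactly one true job label and one predicted label); the function name `classification_scores` and the toy labels are hypothetical and are not taken from the paper. Under that assumption micro-F1 reduces to accuracy, which is the pattern in the ML and LLM rows. The GNN and LLM + GNN rows report micro-F1 values that differ from accuracy; the page does not state how those were computed, so the sketch does not attempt to reproduce them.

```python
from collections import Counter
from typing import Sequence, Tuple


def classification_scores(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, float, float]:
    """Return (accuracy, macro_f1, micro_f1) for single-label predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_count: int, fp_count: int, fn_count: int) -> float:
        denom = 2 * tp_count + fp_count + fn_count
        return 2 * tp_count / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    # Macro-F1: unweighted mean of per-class F1 scores.
    macro_f1 = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro-F1: F1 over counts pooled across all classes.
    micro_f1 = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return accuracy, macro_f1, micro_f1


# Toy labels, invented for illustration only.
y_true = ["backend", "backend", "security", "data", "data", "data"]
y_pred = ["backend", "security", "security", "data", "data", "backend"]

accuracy, macro_f1, micro_f1 = classification_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} micro_f1={micro_f1:.4f}")
```

Macro-F1 weights every class equally, so it falls below accuracy when rare job classes are predicted poorly; that is consistent with the gap between the Accuracy and Macro-F1 columns in every row of Table 2.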