J* E* C* N* U* N* S* ›› 2025, Vol. 2025 ›› Issue (5): 170-182. doi: 10.3969/j.issn.1000-5641.2025.05.016

• Open Source Ecosystem: Development and Governance •

OSS Insight: A platform for open source ecosystem spatiotemporal data analysis and insights

Xiaowei CHEN1, Wei WANG1,*, Fanyu HAN1, Guanglei BAO2, Fei DONG2, Hao HUO2, Chen LIU2

    1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
    2. PingCAP, Beijing 100192, China
  • Received: 2025-01-22 Online: 2025-09-25 Published: 2025-09-25
  • Contact: Wei WANG E-mail: wayne.chen@stu.ecnu.edu.cn; wwang@dase.ecnu.edu.cn

Abstract:

Open source ecosystems contain a wealth of valuable data, yet extracting insights from that data requires new data infrastructure and analytical methods. To address this, OSS Insight was developed. It uses a hybrid transactional/analytical processing (HTAP) database to store and query billions of GitHub events efficiently, and it offers real-time exploration through a visual interface. The platform supports spatiotemporal data analysis, modeling developer behavior and ecosystem evolution, for example by visualizing global contribution patterns. Through integration with large language models (LLMs), it converts natural language into structured query language (SQL) to enable intelligent querying. A case study of Kubernetes demonstrates its capabilities in analyzing developers, project evolution, and organizational collaboration. Experiments show that OSS Insight analyzes large-scale open source data efficiently, and that its LLM-driven interaction simplifies data analysis and provides automated insights.

Key words: open source ecosystem, open source insight, spatiotemporal data analysis, HTAP, LLM
