华东师范大学学报(自然科学版) ›› 2018, Vol. 2018 ›› Issue (1): 103-116.doi: 10.3969/j.issn.1000-5641.2018.01.010

• 计算机科学 • 上一篇    下一篇

IM2:一种改进的MIN/MAX窗口函数优化技术

宋光旋, 赵大鹏, 王晓玲   

  1. 华东师范大学 计算机科学与软件工程学院 上海市高可信计算重点实验室, 上海 200062
  • 收稿日期:2016-10-21 出版日期:2018-01-25 发布日期:2018-01-11
  • 作者简介:宋光旋,男,硕士研究生,研究方向为数据库.E-mail:guangxuan_song@163.com.
  • 基金资助:
    国家自然科学基金(61170085,61472141);上海市重点学科建设项目(B412);上海市可信物联网软件协同创新中心项目(ZF1213)

IM2: Improved MIN/MAX window functions optimizer in relational database

SONG Guang-xuan, ZHAO Da-peng, WANG Xiao-ling   

  1. Shanghai Key Laboratory of Trustworthy Computing, School of Computer Science and Software Engineering, East China Normal University, Shanghai 200062, China
  • Received:2016-10-21 Online:2018-01-25 Published:2018-01-11

摘要: 窗口函数作为一种分析型的OLAP函数加入SQL (Structured Query Language)标准已有十多年,而且随着分析型应用需求的增长窗口函数有着越来越广泛的应用前景.窗口函数的语法非常简单,却可以表达诸如rank、moving average、cumulative sum等复杂的查询.尽管目前主流的商业数据库几乎都支持窗口函数,但是现有的执行策略效率低下,不能满足大批量数据的处理需求.本文主要针对窗口函数中MIN和MAX聚集函数,提出了一种改进的IM2优化策略,可以有效地提升窗口函数的执行效率.本文不仅从时空复杂性理论分析层面进行了证明,而且与已有算法进行了对比实验,证明了本文方法的高效性;另外在目前主流的开源数据库PostgreSQL中实现本文算法,与SQL Server对比有着显著的优化效果.

关键词: window函数, MIN/MAX, 执行优化, PostgreSQL

Abstract: Window functions, also known as analytic OLAP functions, is a part of the SQL standard, and has been extensively studied during the past decade. And the window function has more and more extensive application prospects withthegrowthofthedemandstheanalyticalapplications. Despite its simple syntax, window functions can express many complex queries, such as ranking, moving average, cumulative sum and so on. Although almost all the current mainstream commercial database support window function, the existing implementation strategy is inefficient, and is not suitable for processing big data. In this paper, we propose the IM2 algorithm, an improved algorithm for the MAX/MIN window functions, which effectively improves the efficiency. And we prove the effectiveness of the IM2 algorithm the theoretical complexity analysis. Additionally, we implement the algorithm in PostgreSQL and conduct extensive experiments on real world data to demonstrate the efficiency of the IM2 algorithm.

Key words: window function, MIN/MAX, performance optimization, PostgreSQL

中图分类号: