Journal of East China Normal University (Natural Science) ›› 2006, Vol. 2006 ›› Issue (1): 108-115.

• Computer Science •

Research on Dynamic Synchronization Technology of Webpage Database (Chinese)

XU Wen-tao (许文韬)

  1. School of Information Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2004-12-29 Revised: 2005-09-06 Online: 2006-01-25 Published: 2006-01-25
  • Contact: XU Wen-tao

Abstract:

Most web search engine sites keep local copies of pages from a large number of remote web sites in their databases. To keep these copies consistent with the source pages, the remote pages must be pulled periodically to refresh the local copies, which takes a great deal of time and resources. This article proposes several new policies for synchronizing copies with their source pages, proposes a Poisson model of source page changes, formally defines freshness and fresh time, and studies and compares the performance of the synchronization policies. The results show that the proposed policies considerably improve the freshness of the web page database.

Key words: synchronization technology, web page, database, refresh, search engine
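The article's own definitions are not reproduced on this page, so the following is only a minimal sketch of how freshness under a Poisson change model is commonly made concrete. It assumes a source page changes as a Poisson process with rate `change_rate`, that the local copy is re-fetched at a fixed interval `refresh_interval`, and that freshness means the fraction of time the copy matches the source. The function name and both parameters are hypothetical and do not come from the article.

```python
import math


def expected_freshness(change_rate: float, refresh_interval: float) -> float:
    """Time-averaged probability that a local copy is up to date.

    Assumptions (not taken from the article): the source page changes as a
    Poisson process with rate `change_rate`, and the copy is refreshed every
    `refresh_interval` time units. Right after a refresh the copy is fresh;
    it is still fresh t time units later with probability exp(-rate * t).
    Averaging over one interval gives (1 - exp(-x)) / x with x = rate * interval.
    """
    if change_rate < 0 or refresh_interval <= 0:
        raise ValueError("change_rate must be >= 0 and refresh_interval > 0")
    x = change_rate * refresh_interval
    if x == 0:
        return 1.0  # a page that never changes is always fresh
    # expm1 keeps precision when x is very small
    return -math.expm1(-x) / x


if __name__ == "__main__":
    # Same page (one change per day on average), different refresh intervals in days.
    for interval in (0.25, 1.0, 7.0):
        print(interval, round(expected_freshness(1.0, interval), 4))
```

Under these assumptions, refreshing more often than the page changes yields diminishing returns, which is why the choice among synchronization policies matters when refresh resources are limited.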
