Research on Data Mining for Grid Resource Prediction
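Abstract
The grid is a dynamic, distributed, very large-scale system whose structure and resources change continually over time. To make more efficient use of the computing resources scattered across the grid, predicting the CPU load and network performance of grid resources has become an active research topic at institutions around the world.
     This paper finds that predicting CPU load and network performance alone cannot fully evaluate the future availability of a resource, and it identifies two points neglected in earlier work: 1. Every job has its own job description, which states the job's specific resource requirements, and a job can run only on hosts that satisfy that description. For a single cluster, resources therefore need to be classified by certain resource attributes named in the job description. 2. CPU load is not the only factor that affects a cluster's ability to execute jobs; other important factors, such as the cluster's local job scheduling policy, cannot be ignored. The work therefore focuses on how to classify resources, how to integrate the important factors related to job execution, and how to find the time-dependent patterns in cluster state changes.
     This paper designs and implements the clustering of grid resources, the prediction of grid resource availability, and information publishing and result feedback through MDS (the Monitoring and Discovery System). The analysis and prediction process introduces data mining techniques: resources are partitioned into clusters according to the similarity of their attributes, and the local scheduling policies and the time-dependent patterns of change in resource information are identified. These patterns can then be used to predict the availability of each cluster on the grid for a particular job at some future moment.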
The grid is a new computing and application technology built on the Internet. Its essence is to make the best use of existing hardware and software resources across the network: it supports wide-area sharing of and cooperation among computation, data, storage, information, and knowledge resources, eliminates information isolation, and improves quality of service. The purpose of the grid is to connect geographically dispersed, heterogeneous computing resources through high-speed networks so that they can work together on large-scale application problems, share published wide-area resource information, provide a uniform programming and application interface, and hide hardware boundaries, eventually aggregating all resources on the Internet into one virtual supercomputer.
     The grid is a dynamic system whose resources are always changing. To make the best use of these resources, their future status must be predicted. Resource prediction is therefore important in grid computing and has become an active research topic at institutions around the world.
     Related work observes that contention from resource sharing causes deliverable performance to vary over time. To make the best use of the resources at hand, an application scheduler must predict what performance each resource will deliver. This was the original motivation for predicting grid resource performance. That argument, however, presupposes that job scheduling depends only on load and performance. This paper identifies other factors that strongly affect the evaluation of a cluster's availability, such as local scheduling policies and the resource requirements of different applications.
     Resources in the grid are heterogeneous, their performance varies dynamically, and they are independently managed, which makes the availability of grid resources difficult to predict. 1. Heterogeneity. In hardware, resources differ in system architecture and computing capability. In software, they differ in operating system, local management, and scheduler. 2. Dynamically varying performance. The grid environment is not static, and many factors are unpredictable: resources can become unavailable when a machine or network goes down, and new resources may join the grid at any time. 3. Independent management. Grid resources have, or fall under, their own local managers with their own local scheduling policies. The grid management system must follow those local policies rather than try to change or replace them.
     This paper is part of the National Natural Science Foundation project “Research on Resource Co-allocation and Meta-scheduling algorithm for Cross-domain Parallel Application” and supports the meta-scheduling program's decisions mainly by predicting the availability of grid resources. A survey of the literature shows that most related work, both domestic and international, focuses on predicting system load and performance. After studying and analyzing grid resources and scheduling systems, this paper finds that in a heterogeneous environment system load and performance alone cannot capture the overall availability of a cluster. Building on earlier work, it therefore proposes classifying resources into clusters and taking local policies into account, and it presents a new grid resource availability prediction system built on GDIA (A Scalable Grid Infrastructure for Data Intensive Applications). The system consists of three main modules, each with its own architecture: the information collection module, the resource clustering module, and the resource availability prediction module. It uses MDS (Monitoring and Discovery System) to publish information and upload results. The three modules and the MDS module together make up the grid resource prediction system.
     This paper designs and implements grid resource clustering, grid resource availability prediction, and MDS-based information publishing and result reporting. The analysis and prediction process applies data mining techniques to classify grid resources into clusters and to discover the local policies and the underlying patterns of change in grid resources. These patterns can then be used to predict future resource availability for a given kind of job.
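     As an illustration of grouping resources by attribute similarity, the sketch below runs a plain k-means (Lloyd's algorithm) over a handful of hosts. The abstract does not say which clustering algorithm or which attributes the system uses, so the algorithm choice, the host names, and the two attributes (CPU clock and memory) are all assumptions made for this example.

```python
import math
import random

# Hypothetical host attributes: (CPU clock in GHz, memory in GB).
hosts = {
    "node-a": (2.0, 4.0),
    "node-b": (2.2, 4.0),
    "node-c": (2.1, 8.0),
    "node-d": (3.0, 64.0),
    "node-e": (3.2, 64.0),
}


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd's k-means over equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # leave an empty group's centroid in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids


points = list(hosts.values())
labels, centroids = kmeans(points, k=2)
for name, label in zip(hosts, labels):
    print(name, "-> group", label)
```

     In this toy data the memory attribute dominates the distance, so the small-memory and large-memory hosts end up in separate groups. With real attributes on different scales, normalizing each attribute first would likely be needed.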
     The grid resource prediction system presented in this paper applies data mining. After studying the concepts and algorithms of data mining, the paper finds that data mining can address the shortcomings of related work. The system implements the resource clustering module with a clustering method and the resource availability prediction module with a regression method.
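     To make the regression step concrete, here is a minimal sketch that fits a straight line to past availability observations and extrapolates it to a future time. The abstract names only "regression", so ordinary least squares on a single time feature is an assumption, as are the function names and the sample values of the availability index.

```python
def fit_line(times, values):
    """Ordinary least squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = s_tv / s_tt
    return slope, mean_v - slope * mean_t


def predict(slope, intercept, time):
    """Extrapolate the fitted line and clamp the result to the [0, 1] range."""
    return min(1.0, max(0.0, slope * time + intercept))


# Hypothetical history: hour of observation and availability index in [0, 1].
hours = [0, 1, 2, 3, 4]
availability = [0.9, 0.8, 0.7, 0.6, 0.5]

slope, intercept = fit_line(hours, availability)
forecast = predict(slope, intercept, 6)
print(f"forecast for hour 6: {forecast:.2f}")
```

     A single linear trend is the simplest possible model; periodic load patterns or the effect of local scheduling policies would need additional features.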
     This paper reviews related work on grid resource prediction and identifies two limitations in practical use: 1. it ignores the resource demands that jobs make for their execution; 2. it ignores local policies. Drawing on the data mining literature and its algorithms, the paper uses clustering and regression algorithms to implement grid resource clustering and availability prediction, which addresses these limitations. In addition, it uses MDS to implement information publishing and result reporting. The information collection, resource clustering, and resource prediction modules are relatively independent; together with the MDS module, they form the complete grid resource prediction system.
     The system presented in this paper can adapt to large-scale grid environments. Compared with related work, its innovations are:
     Clustering the grid resources, which helps the meta-scheduler select suitable clusters when a job arrives.
     Establishing an index system that serves as a standard for estimating the availability of grid resources.
     Predicting the availability of grid resources to support the meta-scheduler's decisions.
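     The three points above can be combined in a small sketch: resource groups are first filtered against a job description, and the eligible ones are then ranked by their predicted availability. The field names, the matching rule, and the availability values are hypothetical and are not taken from the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceGroup:
    name: str
    os: str
    memory_gb: int
    cpus: int
    predicted_availability: float  # hypothetical index in [0, 1]


@dataclass(frozen=True)
class JobDescription:
    os: str
    min_memory_gb: int
    min_cpus: int


def matches(group, job):
    """A group is eligible only if it meets every requirement of the job."""
    return (
        group.os == job.os
        and group.memory_gb >= job.min_memory_gb
        and group.cpus >= job.min_cpus
    )


def rank_candidates(groups, job):
    """Filter by job description, then sort by predicted availability (best first)."""
    eligible = [g for g in groups if matches(g, job)]
    return sorted(eligible, key=lambda g: g.predicted_availability, reverse=True)


groups = [
    ResourceGroup("group-1", "linux", 16, 8, 0.40),
    ResourceGroup("group-2", "linux", 64, 32, 0.80),
    ResourceGroup("group-3", "windows", 64, 32, 0.90),
    ResourceGroup("group-4", "linux", 4, 2, 0.95),
]
job = JobDescription(os="linux", min_memory_gb=8, min_cpus=4)

ranked = rank_candidates(groups, job)
print([g.name for g in ranked])
```

     Filtering before ranking reflects the point that a highly available resource is of no use to a job whose description it does not satisfy.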
     The grid resource availability prediction system presented in this paper helps the meta-scheduler discover and organize groups of resources that match a given application according to the resource information. Compared with related work, the system can reduce time costs, raise the success rate of applications, and promptly predict sharp changes in resource availability.
     In future work, the system needs to be run on a much larger grid environment to test its stability and reliability. The clustering and regression algorithms it uses can be tested and replaced according to application needs, and the CPU availability evaluation equation can also be improved.
