Research and Implementation of Enhanced Features for Job Management Systems
Abstract
Since the Network Queuing System (NQS) appeared in 1986, job management systems (JMS) have developed considerably. Many large research organizations and companies abroad treat a JMS as an important technical means of improving production efficiency and resource utilization, and have carried out extensive research and development driven by practical needs. Domestic research in this field, however, has not yet achieved a major breakthrough.
    To address this gap, the author studies job management systems in detail. The layered architecture and functional characteristics of NQS-based job management systems are analyzed and described, and three enhanced features are then examined in depth: job networks, high availability, and security.
    A job network extends the concept of a job. Using a platform-independent job network description language, individual jobs are assembled into a job network according to their dependencies, and the network is submitted to and controlled by the JMS as a whole. A static scheduling algorithm based on a DAG (directed acyclic graph) model of the job network is proposed; it extends the dynamic load-balancing algorithm and provides static load balancing for job networks.
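    As an illustration of the job network idea only, the following minimal Python sketch models a job network as a DAG and builds a static schedule with a simple greedy rule: each job, taken in dependency order, is placed on the node where it can start earliest. The abstract does not specify the thesis's algorithm, so this should not be read as the author's method. The job names, run times, node names, and the functions `topological_order` and `static_schedule` are all hypothetical.

```python
from collections import deque


def topological_order(deps):
    """Return jobs in an order that respects dependencies (Kahn's algorithm).

    deps maps each job name to the list of jobs it depends on.
    Assumes every job, including those without dependencies, is a key.
    """
    indegree = {job: len(parents) for job, parents in deps.items()}
    children = {job: [] for job in deps}
    for job, parents in deps.items():
        for parent in parents:
            children[parent].append(job)
    ready = deque(sorted(job for job, d in indegree.items() if d == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in sorted(children[job]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(deps):
        raise ValueError("job network contains a dependency cycle")
    return order


def static_schedule(deps, cost, nodes):
    """Greedy static schedule: place each job on the node where it starts earliest."""
    node_free = {node: 0 for node in nodes}  # time at which each node becomes idle
    finish = {}                              # finish time of each scheduled job
    plan = {}                                # job -> (node, start, end)
    for job in topological_order(deps):
        # A job cannot start before all of its predecessors have finished.
        earliest = max((finish[p] for p in deps[job]), default=0)
        node = min(nodes, key=lambda n: max(node_free[n], earliest))
        start = max(node_free[node], earliest)
        end = start + cost[job]
        node_free[node] = end
        finish[job] = end
        plan[job] = (node, start, end)
    return plan


# Hypothetical job network: two simulations depend on a preparation step,
# and a report depends on both simulations.
deps = {
    "prepare": [],
    "simulate_a": ["prepare"],
    "simulate_b": ["prepare"],
    "report": ["simulate_a", "simulate_b"],
}
cost = {"prepare": 2, "simulate_a": 4, "simulate_b": 3, "report": 1}
plan = static_schedule(deps, cost, ["node1", "node2"])
```

    Because the whole network is known before submission, a plan of this kind can be computed once in advance, which is what distinguishes static scheduling from load balancing decisions made while jobs are running.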
    Availability is a very important evaluation criterion for a JMS. Building on an analysis of current cluster technology, the author proposes an implementation model for a high-availability (HA) JMS. In this model the JMS runs as a virtual service on a high-availability cluster. When a failure occurs, the service is moved transparently to another node in the cluster (failover); once the failure is cleared, it can return to the original node (failback), which preserves the continuity and availability of running jobs. A concrete implementation on MSCS (Microsoft Cluster Server) following this model is also presented.
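    The failover and failback behavior described above can be pictured with a small state model. The sketch below is a toy illustration under stated assumptions, not the thesis's implementation, and it does not use the MSCS API. The class `VirtualService`, its methods, and the node names are hypothetical.

```python
class VirtualService:
    """A service bound to whichever of two cluster nodes currently hosts it."""

    def __init__(self, name, preferred, standby):
        self.name = name
        self.preferred = preferred  # node that normally hosts the service
        self.standby = standby      # node that takes over on failure
        self.active = preferred     # node hosting the service right now
        self.healthy = {preferred: True, standby: True}

    def report_failure(self, node):
        """Mark a node as failed; move the service if it was running there."""
        self.healthy[node] = False
        if node == self.active:
            other = self.standby if node == self.preferred else self.preferred
            # Failover: clients keep using the same service name.
            self.active = other if self.healthy[other] else None

    def report_recovery(self, node):
        """Mark a node as repaired; return the service to its preferred node."""
        self.healthy[node] = True
        if node == self.preferred:
            self.active = self.preferred  # failback
        elif self.active is None:
            self.active = node            # service was down; restart it here


jms = VirtualService("jms", preferred="nodeA", standby="nodeB")
jms.report_failure("nodeA")
after_failover = jms.active
jms.report_recovery("nodeA")
after_failback = jms.active
```

    Clients address the service by its name rather than by node, which is why a move between nodes can be transparent to them.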
    To address the security weaknesses of traditional job management systems, a security scheme for the Windows NT platform is presented. In addition, a strategy is given for supporting the domain security model that is specific to NT.
