Research and Optimization of Network Traffic Monitoring on Multi-Core Platforms
Abstract
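Network traffic monitoring is an inevitable product of the Internet reaching a certain stage of development: it helps people understand and supervise the Internet, and it also helps them serve it better. As link bandwidth keeps growing and new services keep emerging, the software systems used for traffic monitoring face unprecedented pressure. Every user of a traffic monitoring system wants results that are both fast and accurate, which places high demands on system performance. When the processing capacity of a monitoring system cannot keep up with fast-arriving traffic, performance problems appear, such as lost protocol state, missed key information, and abnormal program termination, and the system becomes unusable. People have therefore begun looking for new hardware platforms with greater computing power at an acceptable cost, hoping to speed up the software.
     The development of general-purpose multi-core processors has drawn the attention of many researchers eager to improve software performance, because these processors offer strong parallel computing capability at an attractive price. By integrating multiple cores on one chip while maintaining or reducing overall energy consumption, a multi-core processor provides a genuine high-speed concurrent computing engine. From the processor's side, compute-intensive applications are needed to exploit that performance and avoid wasting computing resources. Multi-core processors thus guarantee the computing capability, show researchers a direction for improving software performance, and lead more people to believe that software performance can be multiplied on this increasingly common platform.
     This work arises from the intersection of these two lines of development. It combines network traffic monitoring with multi-core processors and studies the performance optimization of traffic monitoring systems on multi-core platforms. This is a recently emerged research direction and an intra-disciplinary field that spans traffic monitoring technology, computer architecture, and operating systems. On this basis, the thesis optimizes several typical traffic monitoring systems for multi-core platforms and summarizes the methods and characteristics of the optimization process. The main research content is as follows.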
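     (1) A study of the software parallel optimization techniques commonly used on multi-core platforms.
     Drawing on knowledge of parallel computing, processor instructions, and multi-core performance optimization, together with a survey of a large body of literature, the thesis identifies eight commonly used methods for optimizing software performance on multi-core platforms. The eight methods approach multi-core optimization from different angles and can be applied to different types of systems. This summary is a useful reference for studying the performance optimization of traffic monitoring systems on multi-core platforms.
     (2) A study of the general architecture of traffic monitoring systems, and a proposed set of evaluation indicators for their performance optimization on multi-core platforms.
     The thesis summarizes the general architecture of traffic monitoring systems and uses it to show their typical functional components. On this basis, it proposes a set of multi-core optimization evaluation indicators, divided into a core indicator and auxiliary indicators. There is only one core indicator: system throughput. The auxiliary indicators fall into three categories, which evaluate the distribution of system overhead, the scheduling method, and the optimization effect. These indicators can guide and support the work of performance optimization.
     (3) A study, using a concrete system as an example, of the characteristics and methods of multi-core optimization for network protocol analysis systems.
     Building on the general architecture and the evaluation indicators, the thesis selects a widely used category from the general architecture, network protocol analysis systems, for multi-core optimization. Within this category, the self-developed GTPAS (GPRS Tunnel Protocol Analysis System) serves as the concrete example: the thesis analyzes its baseline performance, locates its performance bottlenecks, and proposes a multi-core optimization strategy that takes the characteristics of general-purpose multi-core processors into account. Experiments show that with 7 cores devoted to computation, the throughput of the optimized system reaches 391.73% of the throughput before optimization, effectively increasing the system's processing capacity. Based on this process, the thesis then summarizes the characteristics specific to optimizing network protocol analysis systems on multi-core platforms.
     (4) A study, using a concrete system as an example, of the characteristics and methods of multi-core optimization for network content monitoring systems.
     After studying the multi-core optimization of network protocol analysis systems, the thesis selects another representative category from the general architecture, network content monitoring systems, for multi-core optimization. The example system is ITCMS (Internet Traffic Content Monitoring System), a self-developed and typical network content monitoring system. After assessing the baseline performance of ITCMS and analyzing its performance bottlenecks, the thesis proposes a multi-core optimization strategy suited to it. Experiments show that with 7 cores devoted to computation, the throughput of the optimized system reaches 436.10% of the throughput before optimization. Based on this process, the thesis summarizes the characteristics of multi-core optimization for network content monitoring systems and compares them with those found for network protocol analysis systems.
     (5) A study of packet reception performance on multi-core platforms, and a performance optimization of packet reassembly.
     Finally, the thesis studies two common problems in this field. The study of packet reception performance analyzes the maximum reception performance on a multi-core platform of two packet reception methods under Linux: the PF_PACKET socket and Libpcap. It compares the maximum throughput and packet rate of one and two receiving processes, and then analyzes the processor load of one and two receiving processes at maximum reception capacity. The multi-core optimization of packet reassembly raises the throughput of reassembling segmented HTTP packets. The reassembly component of the open-source program Libnids is taken as the optimization target, because its reassembly mechanism is the mainstream approach to packet reassembly and is therefore representative. By analyzing the performance bottleneck and combining the characteristics of HTTP packets with those of the multi-core platform, the thesis proposes a multi-core parallel optimization strategy. Experiments show that with 2 computing cores, the optimized reassembly throughput reaches 145.37% of the throughput before optimization, which meets the basic needs of common network applications.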
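As a rough reading of the three throughput figures reported in this abstract (391.73% and 436.10% with 7 cores, 145.37% with 2 cores), the ratio of optimized to original throughput can be treated as a speedup, and dividing it by the number of cores gives a per-core efficiency. The sketch below only does that arithmetic. It assumes the pre-optimization baseline effectively used a single core, which the abstract does not state, and the function names are illustrative.

```python
def speedup(percent_of_baseline: float) -> float:
    """Convert 'optimized throughput as a percentage of the original' to a ratio."""
    return percent_of_baseline / 100.0


def efficiency(percent_of_baseline: float, cores: int) -> float:
    """Speedup per core. Assumes a single-core baseline (not stated in the abstract)."""
    return speedup(percent_of_baseline) / cores


# Figures as reported in the abstract: (percent of baseline, cores used).
reported = {
    "GTPAS (protocol analysis)": (391.73, 7),
    "ITCMS (content monitoring)": (436.10, 7),
    "Libnids reassembly": (145.37, 2),
}

for name, (pct, cores) in reported.items():
    print(f"{name}: {speedup(pct):.2f}x on {cores} cores, "
          f"efficiency {efficiency(pct, cores):.1%}")
```

Under that assumption, all three results are sub-linear, which is the usual outcome when some of the packet-processing path stays serial or shared between cores.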
Network traffic monitoring becomes necessary once the Internet reaches a certain stage of development. It helps people understand and supervise the Internet, and it also helps them provide better service. With growing network bandwidth and the emergence of new applications, traffic monitoring systems face unprecedented pressure. Everyone expects a traffic monitoring system to work fast and give accurate results, which places high demands on its performance. When network traffic exceeds its processing capacity, a traffic monitoring system suffers performance problems such as lost protocol state, missed key information, and abnormal program termination, which make the system unavailable. People therefore want new solutions that improve the processing performance of traffic monitoring systems.
     General-purpose multi-core processors have strong parallel computing capability. They provide high-speed concurrent computing engines by integrating multiple cores on one chip while maintaining or reducing overall energy consumption. Multi-core processors, in turn, need compute-intensive applications to avoid wasting computing resources. They not only guarantee computing capability but also show researchers a direction for performance optimization.
     This thesis lies at the intersection of the two areas above and draws on network traffic monitoring, computer architecture, and operating systems. We optimize the performance of traffic monitoring systems on multi-core platforms by combining traffic monitoring technology with multi-core technology. The main research is as follows:
     We study parallel optimization techniques for software on multi-core platforms, including parallel computing, processor instructions, and multi-core performance optimization. From a large number of references, we summarize eight methods of performance optimization on multi-core platforms. These methods can be applied to different kinds of software to improve performance, and they are a valuable reference for optimizing traffic monitoring systems on multi-core platforms.
     We summarize a general architecture of traffic monitoring systems to show their functional components, and we propose a set of evaluation indicators for optimizing their performance on multi-core platforms. The set consists of one key indicator and several assistant indicators. The key indicator is system throughput. The assistant indicators fall into three categories, which evaluate system overhead, scheduling strategies, and the optimization effect. These indicators help guide performance improvement on multi-core platforms.
     We take a specific system from the category of network protocol analysis systems, named 'GTP Analysis System', as a case study. We analyze its basic performance, find its performance bottleneck, and propose optimization strategies for the multi-core platform. Experimental results show that with seven cores used for computation, system throughput reaches 401.73% of its value before optimization. On this basis, we summarize the methods and features of performance optimization suited to network protocol analysis systems on multi-core platforms.
     After studying network protocol analysis systems, we choose another category, network traffic content monitoring systems, and take a specific system from it, ITCMS (Internet Traffic Content Monitoring System), as a case study. As before, we evaluate the basic performance of ITCMS, analyze its performance bottleneck, and propose optimization strategies. Experimental results show that with seven cores used for computation, system throughput reaches 436.10% of its value before optimization. We then summarize the methods and features of performance optimization suited to this category on multi-core platforms and compare the optimization features of the two categories.
     In the last part of this thesis, we analyze packet capture performance on a multi-core platform and optimize a packet reassembly system for that platform. For packet capture, we choose two common approaches, PF_PACKET socket capture and Libpcap capture. The analysis has two aspects: the maximum packet capture rate and throughput on the multi-core platform, and the CPU load at the maximum capture rate. For both aspects, we measure a single capture process and two capture processes separately. For packet reassembly, we choose Libnids as the case study. We analyze its performance bottleneck and parallelize the reassembly system according to the features of the HTTP protocol. With two cores used for computation, the throughput of the reassembly system reaches 145.37% of its value before optimization, which meets the basic requirements of common network applications.
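The abstract does not spell out the parallel reassembly strategy. One common way to spread TCP stream reassembly over several cores, shown here only as an assumed sketch and not as the thesis's method or the Libnids API, is to hash each packet's connection endpoints so that both directions of a connection always reach the same worker. Each worker can then keep its reassembly state private, without locks. The names `flow_key`, `worker_for`, `Reassembler`, and `dispatch` are hypothetical, and the workers are plain objects rather than processes pinned to cores.

```python
import zlib
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Direction-independent key: both directions of a connection give one key."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def worker_for(key, n_workers):
    """Stable worker index. crc32 is used because str hash() varies between runs."""
    return zlib.crc32(repr(key).encode()) % n_workers


class Reassembler:
    """Per-worker state. Ignores overlaps, retransmissions and sequence wraparound."""

    def __init__(self):
        self.segments = defaultdict(dict)  # (src, dst) -> {seq: payload}

    def add(self, src, dst, seq, payload):
        self.segments[(src, dst)][seq] = payload

    def stream(self, src, dst):
        """Return the contiguous bytes of one direction, stopping at the first gap."""
        segs = self.segments[(src, dst)]
        if not segs:
            return b""
        out = b""
        expected = min(segs)
        for seq in sorted(segs):
            if seq != expected:
                break
            out += segs[seq]
            expected = seq + len(segs[seq])
        return out


def dispatch(packets, workers):
    """Route each (src_ip, src_port, dst_ip, dst_port, seq, payload) to its worker."""
    for src_ip, src_port, dst_ip, dst_port, seq, payload in packets:
        key = flow_key(src_ip, src_port, dst_ip, dst_port)
        worker = workers[worker_for(key, len(workers))]
        worker.add((src_ip, src_port), (dst_ip, dst_port), seq, payload)


# Usage: three out-of-order segments of one HTTP request, two workers.
workers = [Reassembler(), Reassembler()]
dispatch(
    [
        ("10.0.0.1", 40000, "10.0.0.2", 80, 0, b"GET /"),
        ("10.0.0.1", 40000, "10.0.0.2", 80, 14, b"\r\n\r\n"),
        ("10.0.0.1", 40000, "10.0.0.2", 80, 5, b" HTTP/1.1"),
    ],
    workers,
)
owner = workers[worker_for(flow_key("10.0.0.1", 40000, "10.0.0.2", 80), 2)]
print(owner.stream(("10.0.0.1", 40000), ("10.0.0.2", 80)))
```

Because the key is symmetric, a request and its response land on the same worker, which matters for HTTP, where the two directions are interpreted together.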
