Research on Memory Power Optimization Based on System Spatio-Temporal Behavior Characteristics
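Abstract
With advances in semiconductor process technology, chip integration density keeps rising, and power consumption now constrains the development of entire systems. Excessive power consumption causes local hot spots on the chip, which challenges system performance, cost, reliability, and lifetime. Memory is a bottleneck of the whole system, and demand for higher memory bandwidth and capacity keeps growing, so memory power consumption has become one of the most active topics in academia and industry in recent years.
     This dissertation aims to balance system performance, power, and fairness. Spanning multiple layers of the computer system, it studies key problems including system behavior analysis, the memory management system, and thread scheduling, and provides practical solutions for memory power control. The main research contents are as follows: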
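     1. A memory management system based on system spatio-temporal behavior
     Memory systems already support power-state control, but such control requires that some memory modules be idle so that they can enter a low-power state. Because the memory management system allocates and reclaims memory, the key to power-state control lies in how it distributes data across physical memory. We therefore first analyze how different memory address mapping schemes affect data distribution and memory power, and then introduce a power-aware memory management system accordingly. Second, we characterize memory-access behavior from the perspective of tasks, and use it to guide the memory management system in dynamically adjusting its allocation and release policies to match the resource demands of different stages of a task's lifetime.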
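     2. A behavior-guided task grouping method
     Considering only a task's own resource demands while ignoring the constraints of the application scenario makes it hard to fully exploit system behavior. We therefore target the Android operating system and take full account of how application scenarios affect task behavior. While preserving system responsiveness and fairness, we group threads with similar behavioral characteristics and schedule them as a unit, which reduces cache contention on multi-core architectures.
     3. Characterizing system memory-access behavior
     Memory dynamic voltage and frequency scaling (DVFS) is an effective power control mechanism, but it differs from memory power-state management: data distribution offers little guidance for it, and it depends mainly on a task's runtime resource demands. Existing studies of task behavior typically characterize it with low-level architectural event parameters from performance counters (the PMU), and so overlook the influence of higher system layers. Studies based on task phase behavior usually divide execution by time slices or fixed instruction counts, but such a division may be incomplete with respect to the functional logic of the executed instruction sequence, and thus breaks up the task's own behavior. We therefore start from the computer system as a whole, span multiple system layers (architecture, operating system, application scenario, and the task itself), and propose the "function event", which is both functionally complete and bounded in time, as a tool for characterizing task behavior, ultimately forming a system DVFS framework.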
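     Characterizing task behavior is the key factor in system-level power optimization. Compared with existing work, the contributions of this research are:
     1. It evaluates the impact of memory architecture on memory power behavior and, on that basis, proposes a power-aware memory management system.
     2. It defines a task's runtime behavior by characterizing its demand for memory resources, and introduces a memory management framework for performance/power optimization that accounts for behavioral differences across periods of a task's lifetime and for interference between tasks.
     3. By analyzing how Android application scenarios and the Android programming model affect task behavior, it provides a theoretical basis for partitioning threads into "thread groups", and mitigates cache contention on multi-core architectures while preserving system fairness and responsiveness.
     4. It proposes function events as a tool for characterizing task behavior, uses system calls and Android messages as the carriers of function events, and ultimately forms an operating-system-level DVFS framework.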
With the rapid development of semiconductor technology and the increasing integration density of chips, power consumption has come to significantly constrain improvements in system performance. High power consumption raises chip temperature and directly affects system performance, cost, reliability, and lifetime. Memory is a bottleneck of the whole system, and its bandwidth and capacity demands keep growing. Memory power has therefore become a hot topic in both academia and industry.
     This dissertation focuses on balancing system performance, power, and fairness in sharing resources. Working across the operating system and the system architecture, we address key theories and techniques in task behavior analysis, memory management, and thread-group scheduling, which allow us to propose effective memory power management solutions. The primary contents are as follows:
     1. Memory Management System Based on System Behaviors
     Memory modules currently provide several power states, and the system must have idle memory modules in order to exploit these states fully. Since the memory management system is in charge of allocating and releasing resources, the key to controlling these power states lies in how the memory manager distributes data across physical memory chips. We therefore first analyze the impact of memory address mapping schemes on data distribution and system power, and then introduce a power-aware memory management system. Second, from the viewpoint of application behavior, we propose a memory manager framework to improve power efficiency, which adopts different policies to meet the varying resource demands over a thread's lifetime.
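     The idea of steering data placement so that some memory modules stay idle can be illustrated with a minimal sketch. It is not the dissertation's allocator: the `Rank` model, the `allocate` function, and the "fill active ranks first" policy are assumptions chosen only to show why placement determines which modules can enter a low-power state.

```python
from dataclasses import dataclass


@dataclass
class Rank:
    """Hypothetical model of one memory rank as a pool of page frames."""
    rank_id: int
    capacity: int      # total page frames in this rank
    used: int = 0      # page frames currently allocated

    @property
    def free(self) -> int:
        return self.capacity - self.used

    @property
    def idle(self) -> bool:
        # A rank holding no data is a candidate for a low-power state.
        return self.used == 0


def allocate(ranks, pages):
    """Place `pages` frames, filling already-active ranks before idle ones.

    Returns {rank_id: frames_taken}; raises MemoryError if space runs out.
    """
    if pages > sum(r.free for r in ranks):
        raise MemoryError("not enough free frames")
    # Active ranks sort before idle ones; among active ranks, the one with
    # the least free space is filled first so data stays consolidated.
    order = sorted(ranks, key=lambda r: (r.idle, r.free))
    placement = {}
    for rank in order:
        if pages == 0:
            break
        take = min(rank.free, pages)
        if take:
            rank.used += take
            placement[rank.rank_id] = take
            pages -= take
    return placement
```

     With three ranks of which one is empty, a small request is served entirely from the two active ranks and the empty rank remains idle. A real allocator would also have to respect the address mapping scheme, which decides whether consecutive pages land in the same rank at all.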
     2. Thread-Behavior-Directed Group Scheduler
     Task behavior cannot be fully exploited by considering only a task's own demands while ignoring the constraints of the usage scenario. Therefore, to reduce cache contention on multi-core platforms, we first illustrate the impact of Android scenarios on thread behavior, and then propose grouping threads with similar behavior and scheduling each group as a unit, while preserving system response speed and fairness.
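     As a rough illustration of grouping threads with similar behavior, the sketch below clusters threads by the distance between per-thread feature vectors. The abstract does not say which features or which grouping rule the dissertation uses, so the feature tuples, the `threshold` parameter, and the greedy rule are all assumptions.

```python
def group_threads(features, threshold):
    """Greedily group threads by behavioral similarity.

    `features` maps a thread id to a tuple of numbers (hypothetical
    metrics such as cache miss rate). A thread joins the first group
    whose founding member is within `threshold` (Euclidean distance);
    otherwise it starts a new group.
    """
    groups = []  # list of (founding_vector, [thread_ids])
    for tid, vec in features.items():
        for founder, members in groups:
            dist = sum((a - b) ** 2 for a, b in zip(founder, vec)) ** 0.5
            if dist <= threshold:
                members.append(tid)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((vec, [tid]))
    return [members for _, members in groups]
```

     Each resulting group could then be handed to the scheduler as one unit. Responsiveness and fairness constraints, which the text treats as requirements, are outside this sketch.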
     3. Descriptive Methods of System Behaviors
     Dynamic voltage and frequency scaling (DVFS) is an effective way to save memory power, but it differs from the power-state management above. Data distribution alone gives little guidance for DVFS; frequencies are usually scaled according to a thread's runtime resource demands. Many works characterize task behavior using PMU events at the hardware level, but their granularity is so fine that they miss high-level information. Other works that distinguish thread phase behavior normally adopt a time slice or a fixed instruction count as the basic unit. Such units can be incomplete from the perspective of a thread's functional logic, so the division may be arbitrary and break the thread's natural behavior. Hence, we introduce the function event as a tool to describe task behavior across all levels, including the system architecture, the operating system, and applications. Finally, a DVFS framework is built on function events.
     Characterizing thread behavior plays a key role in managing system power. Compared with other works, the contributions and innovations of this dissertation include:
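     To make the contrast with fixed time slices concrete, the following sketch chooses a memory frequency per function event rather than per interval. The `FunctionEvent` fields, the frequency table, and the thresholds are hypothetical values chosen for illustration and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class FunctionEvent:
    """One functionally complete unit of work (hypothetical fields)."""
    name: str            # e.g. a system call or an Android message
    duration_ms: float   # how long the event ran
    mem_accesses: int    # memory accesses observed during the event


# Assumed set of available memory frequencies, lowest to highest.
FREQS_MHZ = (400, 800, 1600)


def pick_frequency(event, low=100.0, high=1000.0):
    """Map an event's memory-access rate to one of FREQS_MHZ.

    `low` and `high` are assumed thresholds in accesses per millisecond.
    """
    if event.duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    rate = event.mem_accesses / event.duration_ms
    if rate < low:
        return FREQS_MHZ[0]
    if rate < high:
        return FREQS_MHZ[1]
    return FREQS_MHZ[2]
```

     Because the decision is taken at event boundaries, one logical operation is never split between two frequency settings, which is the property the text attributes to function events.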
     1. We evaluate the impact of memory architecture on system power behavior and then propose a power-aware memory management system.
     2. By analyzing threads' runtime resource demands, we introduce a performance/power-directed memory management framework that exploits both behavioral variation within a thread and interactions between threads.
     3. The impact of Android scenarios and the Android programming model on thread behavior provides the theoretical basis for the thread-group scheme. We then propose a thread-behavior group scheduler that reduces multi-core cache contention while preserving system response speed and thread fairness.
     4. This dissertation proposes the function event as a tool to characterize task behavior, and uses system calls and Android messages as its carriers. Finally, an operating-system-level DVFS scheme is built on function events.
