Dynamic Page Migration on ccNUMA Platforms Guided by Hardware Tracing
dc.contributor.advisor | Frank Mueller, Committee Chair | en_US |
dc.contributor.advisor | Xiasong Ma, Committee Member | en_US |
dc.contributor.advisor | Vincent W. Freeh, Committee Member | en_US |
dc.contributor.author | Thakkar, Vivek | en_US |
dc.date.accessioned | 2010-04-02T17:57:45Z | |
dc.date.available | 2010-04-02T17:57:45Z | |
dc.date.issued | 2008-08-14 | en_US |
dc.degree.discipline | Computer Science | en_US |
dc.degree.level | thesis | en_US |
dc.degree.name | MS | en_US |
dc.description.abstract | Non-uniform memory architectures with cache coherence (ccNUMA) are becoming increasingly common, not just for large-scale high-performance platforms but also in the context of multi-core architectures. Under ccNUMA, data placement may influence overall application performance significantly, as references resolved locally to a processor/core impose lower latencies than remote ones. This work develops a novel hardware-assisted dynamic page migration scheme based on automated tracing of the memory references made by application threads. The developed framework leverages the performance monitoring capabilities of contemporary x86 microprocessors to efficiently extract an approximate trace of memory accesses. This information, along with multi-level hop latencies, is used to decide page affinity, i.e., the node to which a page is bound. After determining affinities, page migration is initiated using Linux kernel mechanisms. All of this automation is done in user space and is transparent to the main application. Experiments show that this method, although based on lossy tracing and subject to system configuration limitations on the trace hardware, can efficiently and effectively improve local data availability at run time. It leads to an average wall-clock execution time saving of over 14% on AMD Opterons with a 1.3x/1.6x access penalty to non-local memory, with minimal page migration overhead due to advances in modern memory interconnect technologies. To the best of our knowledge, this is the first experimental study on a popular platform, a combination of x86 processors and the Linux operating system. | en_US |
dc.identifier.other | etd-08082008-131342 | en_US |
dc.identifier.uri | http://www.lib.ncsu.edu/resolver/1840.16/694 | |
dc.rights | I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. | en_US |
dc.subject | Dynamic Page Migration | en_US |
dc.subject | PMU | en_US |
dc.subject | PEBS | en_US |
dc.subject | ccNUMA | en_US |
dc.subject | ISA | en_US |
dc.subject | Perfmon2 | en_US |
dc.subject | Microarchitecture | en_US |
dc.title | Dynamic Page Migration on ccNUMA Platforms Guided by Hardware Tracing | en_US |
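The affinity decision described in the abstract (sampled memory accesses combined with hop latencies to pick a home node for each page) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `choose_affinity`, the 4-node topology, and the sample data are hypothetical, and only the 1.3x/1.6x penalties come from the abstract. The sketch assumes each page should be bound to the node that minimizes the latency-weighted cost of its sampled accesses.

```python
from collections import defaultdict

# Hypothetical 4-node topology: LATENCY[src][home] is the relative cost
# for a thread on node `src` to access memory homed on node `home`.
# 1.0 = local, 1.3 = one hop, 1.6 = two hops (penalties from the abstract).
LATENCY = [
    [1.0, 1.3, 1.3, 1.6],
    [1.3, 1.0, 1.6, 1.3],
    [1.3, 1.6, 1.0, 1.3],
    [1.6, 1.3, 1.3, 1.0],
]


def choose_affinity(samples, latency):
    """Pick a home node per page from (page, accessing_node) samples.

    The trace may be lossy; each sample simply counts as one access.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for page, node in samples:
        counts[page][node] += 1

    affinity = {}
    for page, per_node in counts.items():
        def cost(home):
            # Total latency-weighted cost if the page lived on `home`.
            return sum(n * latency[src][home] for src, n in per_node.items())

        affinity[page] = min(range(len(latency)), key=cost)
    return affinity


# Hypothetical sampled trace: page address and the node that touched it.
trace = [(0x1000, 0)] * 3 + [(0x1000, 3)] + [(0x2000, 3)] * 5 + [(0x2000, 1)]
print(choose_affinity(trace, LATENCY))
```

On Linux, the resulting page-to-node mapping could then be handed to a kernel migration interface such as the `move_pages(2)` system call. Whether the thesis uses that exact interface is not stated in this record.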