Measuring and Improving TLB Performance for Linux GUI Applications
dc.contributor.advisor | Matthias Stallmann, Committee Member | en_US |
dc.contributor.advisor | William Cohen, Committee Co-Chair | en_US |
dc.contributor.advisor | Edward Gehringer, Committee Chair | en_US |
dc.contributor.author | Konireddygari, Sreekanth | en_US |
dc.date.accessioned | 2010-04-02T18:10:06Z | |
dc.date.available | 2010-04-02T18:10:06Z | |
dc.date.issued | 2009-04-07 | en_US |
dc.degree.discipline | Computer Science | en_US |
dc.degree.level | thesis | en_US |
dc.degree.name | MS | en_US |
dc.description.abstract | Modern GUI applications rely on a large number of dynamically linked shared libraries to reduce the applications' memory footprint and to avoid recompilation when newer versions of the shared libraries become available. However, each shared library needs separate memory pages, and the pages are laid out in virtual memory in essentially random fashion. This leads to increased conflict misses in the translation lookaside buffer (TLB) and diminishes the application's performance. Previous work has measured TLB performance using standard benchmarks, which consist of applications with statically linked libraries. Statically linked programs tend to use fewer libraries and don't require separate pages for each library; hence they place much less stress on the TLB. We added a TLB simulator to Valgrind, a Linux binary instrumentation tool, and used Dogtail, a testing framework for GUI applications, to measure TLB behavior for several dynamically linked GUI applications: the document-viewing application Evince, the word processor Abiword, the image viewer Gthumb, and the calculator Gcalctool. Analysis of TLB miss data from the Valgrind instrumentation showed that some shared library pages repeatedly conflicted with pages from other shared libraries for TLB entries. Moving just a single highly contended shared library to a different place in the virtual address space reduced TLB conflict misses by up to 14.7%. Average reductions across applications were greater in larger TLBs, with the highest average reduction, 7.5%, in the 512-entry TLB, the largest studied. We then investigated a more comprehensive relocation strategy based on call-graph information. Results demonstrated greater improvement in miss ratios for instruction TLBs, the average reduction being 9.6%. | en_US |
dc.identifier.other | etd-03192008-173943 | en_US |
dc.identifier.uri | http://www.lib.ncsu.edu/resolver/1840.16/2072 | |
dc.rights | I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. | en_US |
dc.subject | shared library | en_US |
dc.subject | gui | en_US |
dc.subject | performance | en_US |
dc.subject | TLB | en_US |
dc.subject | linux | en_US |
dc.subject | processor | en_US |
dc.title | Measuring and Improving TLB Performance for Linux GUI Applications | en_US |
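The conflict-miss effect described in the abstract can be illustrated with a small sketch. This is not the thesis's Valgrind-based simulator; it is a minimal, hypothetical model that assumes 4 KiB pages, a direct-mapped 64-entry TLB indexed by the low bits of the virtual page number, and two made-up library base addresses. The names `TLB` and `count_misses` are illustrative only.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class TLB:
    """Toy set-associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, entries=64, ways=1):
        self.ways = ways
        self.num_sets = entries // ways
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, vaddr):
        """Look up a virtual address; return True on a hit, False on a miss."""
        vpn = vaddr >> PAGE_SHIFT              # virtual page number
        entries = self.sets[vpn % self.num_sets]
        self.accesses += 1
        if vpn in entries:
            entries.move_to_end(vpn)           # mark as most recently used
            return True
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)        # evict least recently used
        entries[vpn] = True
        return False


def count_misses(lib_a_base, lib_b_base, rounds=100, pages=4,
                 entries=64, ways=1):
    """Alternate between the first `pages` pages of two hypothetical libraries."""
    tlb = TLB(entries, ways)
    for _ in range(rounds):
        for p in range(pages):
            tlb.access(lib_a_base + (p << PAGE_SHIFT))
            tlb.access(lib_b_base + (p << PAGE_SHIFT))
    return tlb.misses


# Hypothetical base addresses: both map page 0 to TLB set 0, so they collide.
conflicting = count_misses(0x10000000, 0x20000000)
# Shifting the second library by 8 pages moves it to sets 8..11.
relocated = count_misses(0x10000000, 0x20000000 + (8 << PAGE_SHIFT))
print(conflicting, relocated)  # prints: 800 8
```

In this toy setup, every access misses while the two libraries share TLB sets, and only the eight compulsory misses remain once one library is moved. Real TLBs differ in associativity, indexing, and replacement policy, so the numbers here show the mechanism only and say nothing about the measured reductions reported in the abstract.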