A Scalable Architecture For Hardware Acceleration of Large Sparse Matrix Calculations
dc.contributor.advisor | Dr. Paul Franzon, Committee Chair | en_US |
dc.contributor.advisor | Dr. Gianluca Lazzi, Committee Member | en_US |
dc.contributor.advisor | Dr. Michael Steer, Committee Member | en_US |
dc.contributor.author | Hamlett, Matthew Issiah | en_US |
dc.date.accessioned | 2010-04-02T18:07:25Z | |
dc.date.available | 2010-04-02T18:07:25Z | |
dc.date.issued | 2007-08-01 | en_US |
dc.degree.discipline | Computer Engineering | en_US |
dc.degree.level | thesis | en_US |
dc.degree.name | MS | en_US |
dc.description.abstract | The task of implementing the Jacobi method has been examined in several research works over the years. The Jacobi method is considered the iterative method best suited to implementation on FPGAs because of its inherent parallelism and lack of data dependencies. In this work, we look specifically at solving very large matrix equations of the form Ax = b, where A is a sparse matrix with dimensions of 1 million x 1 million and 6 entries per row, x is the vector being solved for, and b is a known vector. All data is in 64-bit IEEE-754 floating-point format. Previous work in this area has implemented the Jacobi method using only on-chip memory accesses, greatly limiting the size of the matrices that can be solved. By using external memory, we present a design that is practical and can be used to accelerate various engineering and scientific problems today. In this design, we also implement the resources necessary for multiple FPGAs to be used in a distributed manner to tackle larger problems. Our design gives a peak floating-point performance of 1.8 GFLOPS and a sustained floating-point performance of 1.18 GFLOPS, a speedup factor of roughly 2.95 compared to the sustained performance typically seen on today's general-purpose computers for this type of problem. To obtain this peak floating-point performance, we present a group of memory interfaces capable of supplying a sustained total data rate of 20 Gb/s. (A minimal software sketch of the Jacobi iteration appears after this record.) | en_US |
dc.identifier.other | etd-11062006-023159 | en_US |
dc.identifier.uri | http://www.lib.ncsu.edu/resolver/1840.16/1781 | |
dc.rights | I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. | en_US |
dc.subject | Jacobi | en_US |
dc.subject | Sparse | en_US |
dc.subject | Matrices | en_US |
dc.subject | FPGA | en_US |
dc.subject | 64-bit | en_US |
dc.subject | Floating-Point | en_US |
dc.title | A Scalable Architecture For Hardware Acceleration of Large Sparse Matrix Calculations | en_US |
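The abstract describes the Jacobi method applied to Ax = b, with the sparse matrix A stored at a fixed six nonzeros per row in 64-bit floating point. Below is a minimal software sketch of one Jacobi sweep under those assumptions; the ELLPACK-style layout (flat col/val arrays with the diagonal stored among the six entries) and the name jacobi_sweep are illustrative choices, not taken from the thesis.

    /* One Jacobi sweep for Ax = b, assuming an ELLPACK-style sparse layout
     * with a fixed number of stored entries per row (6, per the abstract).
     * All values are 64-bit IEEE-754 doubles, matching the hardware design. */
    #include <stddef.h>

    #define NNZ_PER_ROW 6

    /* x_new[i] = (b[i] - sum over j != i of A[i][j] * x[j]) / A[i][i].
     * col[] holds the column index of each stored entry, val[] its value;
     * the diagonal A[i][i] is assumed to be present and nonzero. */
    static void jacobi_sweep(size_t n,
                             const size_t *col, const double *val,
                             const double *b, const double *x,
                             double *x_new)
    {
        for (size_t i = 0; i < n; ++i) {
            double diag = 0.0;
            double off_sum = 0.0;
            for (size_t k = 0; k < NNZ_PER_ROW; ++k) {
                size_t j = col[i * NNZ_PER_ROW + k];
                double a = val[i * NNZ_PER_ROW + k];
                if (j == i)
                    diag = a;            /* diagonal term, used as divisor */
                else
                    off_sum += a * x[j]; /* off-diagonal contribution */
            }
            x_new[i] = (b[i] - off_sum) / diag;
        }
    }

Each x_new[i] reads only the previous iterate x and row i of A, so every row update is independent of the others; this is the absence of data dependencies that the abstract credits for the method's suitability to parallel FPGA implementation.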