ULEDS-SVMs: Upper/Lower Limits and Error Data Supported Support Vector Machines

dc.contributor.advisor Jon Doyle, Committee Chair en_US
dc.contributor.advisor John Blondin, Committee Member en_US
dc.contributor.advisor Robert Funderlic, Committee Member en_US
dc.contributor.author Sun, Xuejun en_US
dc.date.accessioned 2010-04-02T18:12:36Z
dc.date.available 2010-04-02T18:12:36Z
dc.date.issued 2004-11-18 en_US
dc.identifier.other etd-10292003-085539 en_US
dc.identifier.uri http://www.lib.ncsu.edu/resolver/1840.16/2355
dc.description.abstract A Support Vector Machine, ULEDS-SVMs, was developed for classification in data domains that contain limits or errors. Data with upper or lower limits differ from missing data: they provide constraints at a certain level in data classification and modeling. Data with errors may be regarded as the special case in which an upper and a lower limit bound an attribute on both sides. Data of this kind are widespread, from scientific measurements to databases that result from integrating and merging sources of differing quality. Including these data in training, rather than dropping them or arbitrarily filling in some value, is highly desirable because they provide useful constraints in machine learning. A simple enhanced 1R algorithm is described that may be able to handle data in such a domain, and whose principle may be extendable to other machine learning methods, but it is not favored because of its time complexity. The treatment of such data by Support Vector Machines (SVMs) is, however, very promising. We provided the mathematical foundation for this kind of problem by recognizing the concepts of feasibility for training, testing, and predicting in SVMs. Algorithms that use the resulting theorems were described. To apply ULEDS-SVMs, we integrated an astronomical data set (CHDF-N) based on the Chandra Deep Field (CDF) and Hubble Deep Field (HDF) North observations. Classification of these astronomical objects is of interest for the study of the formation and evolution of galaxies in the deep universe. This direction contains the deepest observations made with the largest astronomical facilities currently available. We used CHDF-N as a test bed for the ULEDS-SVMs algorithms, implemented in Matlab. The separation between stars and extragalactic objects reaches 100% accuracy; without the limit data in the extragalactic class, the position of the separating plane would be more ambiguous.
Training and testing with a leave-one-out partition achieved 82% accuracy for the separation of galaxies and active galactic nuclei (AGNs). This is better than the 72.4% accuracy of the conventional R-log(F_x) plot separation method commonly used in the astronomical community. The prediction rate increased from 49.6% with conventional SVMs to 75.5% with ULEDS-SVMs. en_US
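The abstract treats a value with an upper limit, a lower limit, or an error bar as a constraint on where the true value may lie. The Python sketch below illustrates only that data representation; it is not the thesis's algorithm (which the abstract says was implemented in Matlab), and the names `Interval`, `decision_range`, and `certain_label` are hypothetical. It assumes a linear decision function w.x + b and reports a label only when every point allowed by the limits falls on the same side.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical container: the true value lies somewhere in [lo, hi]."""
    lo: float
    hi: float


def exact(v):
    return Interval(v, v)


def upper_limit(v):
    # Only an upper bound is known.
    return Interval(-math.inf, v)


def lower_limit(v):
    # Only a lower bound is known.
    return Interval(v, math.inf)


def with_error(v, err):
    # An error bar acts as a lower and an upper limit at once.
    return Interval(v - err, v + err)


def decision_range(w, b, x):
    """Smallest and largest value of w.x + b over the box described by x."""
    lo = hi = b
    for wi, xi in zip(w, x):
        if wi == 0:
            continue  # skip to avoid 0 * inf = nan
        a, c = wi * xi.lo, wi * xi.hi
        lo += min(a, c)
        hi += max(a, c)
    return lo, hi


def certain_label(w, b, x):
    """Return +1 or -1 if the sign is fixed over the whole box, else None."""
    lo, hi = decision_range(w, b, x)
    if lo > 0:
        return 1
    if hi < 0:
        return -1
    return None


# Assumed weights, for illustration only.
w, b = [1.0, -2.0], 0.5
sample = [with_error(3.0, 0.5), upper_limit(1.0)]
print(certain_label(w, b, sample))
```

In this sketch a limit never has to be replaced by an arbitrary fill-in value: the sample either constrains the decision fully or is reported as undetermined.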
dc.rights I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to NC State University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. en_US
dc.subject upper limit en_US
dc.subject lower limit en_US
dc.subject Machine Learning en_US
dc.subject Data Mining en_US
dc.subject Application: Chandra and Hubble Field en_US
dc.subject SVMs en_US
dc.subject Support Vector Machines en_US
dc.subject Domain: data with error en_US
dc.title ULEDS-SVMs: Upper/Lower Limits and Error Data Supported Support Vector Machines en_US
dc.degree.name MS en_US
dc.degree.level thesis en_US
dc.degree.discipline Computer Science en_US

Files in this item

Files Size Format
etd.pdf 1.132Mb PDF