Data Reduction and Model Selection with Wavelet Transforms

Date

2000-11-07

Abstract

With modern technology, massive quantities of data are being collected continuously. The purpose of our research has been to develop a method for data reduction and model selection applicable to large data sets and replicated data. We propose a novel wavelet shrinkage method by introducing a new model selection criterion. The proposed shrinkage rule has at least two advantages over current shrinkage methods. First, it is adaptive to the smoothness of the signal regardless of whether it has a sparse wavelet representation, since we consider both the deterministic and the stochastic cases. The wavelet decomposition not only captures the signal components of a pure signal, but also de-noises and extracts these components for a signal contaminated by external influences. Second, the proposed method allows for fine "tuning" based on the particular data at hand. Our simulation study shows that the methods based on the model selection criterion achieve lower mean square error (MSE) than the methods currently known. Two aspects make wavelet analysis the analytical tool of choice. First, the largest-in-magnitude wavelet coefficients in the discrete wavelet transform (DWT) of the data extract the relevant information, while discarding the rest eliminates the noise component. Second, the DWT admits a fast algorithm of computational complexity O(n). For the deterministic case we derive a bound on the approximation error of the nonlinear wavelet estimate determined by the largest-in-magnitude discrete wavelet coefficients. Upper bounds for the approximation error and for the rate of increase of the number of wavelet coefficients in the model are obtained for the new wavelet shrinkage estimate. When the signal comes from a stochastic process, bounds are found for the MSE and for the bias of its estimate. A corrected version of the model selection criterion is introduced and some of its properties are studied. The new wavelet shrinkage is employed in the case of replicated data.
An algorithm for model selection is proposed, based on which a manufacturing process can be automatically supervised for quality and efficiency. We apply it to two real-life examples.
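
The idea of keeping only the largest-in-magnitude DWT coefficients can be illustrated with a minimal sketch. It is not the model selection criterion of the dissertation, which the abstract does not specify. The Haar wavelet, the fixed number k of retained coefficients, and the function names haar_dwt, haar_idwt and keep_largest are all assumptions made here for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """Orthonormal Haar DWT of a sequence whose length is a power of two.

    Each pass halves the working vector, so total work is n/2 + n/4 + ... = O(n).
    Returns (approx, details), with details ordered from finest to coarsest scale.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in x]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt, rebuilding the signal from coarsest to finest scale."""
    current = list(approx)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.extend(((a + d) / SQRT2, (a - d) / SQRT2))
        current = rebuilt
    return current

def keep_largest(details, k):
    """Zero every detail coefficient except the k largest in magnitude.

    Assumption: k is fixed by the caller; a data-driven criterion would choose it.
    """
    ranked = sorted(
        ((abs(c), lvl, i) for lvl, d in enumerate(details) for i, c in enumerate(d)),
        reverse=True,
    )
    keep = {(lvl, i) for _, lvl, i in ranked[:k]}
    return [
        [c if (lvl, i) in keep else 0.0 for i, c in enumerate(d)]
        for lvl, d in enumerate(details)
    ]

# A step signal with a small alternating perturbation standing in for noise.
clean = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
noisy = [1.1, 0.9, 1.1, 0.9, 5.1, 4.9, 5.1, 4.9]

approx, details = haar_dwt(noisy)
denoised = haar_idwt(approx, keep_largest(details, k=1))
```

In this toy case the single retained coefficient carries the jump in the step, and the discarded fine-scale coefficients carry only the perturbation, so denoised matches clean up to floating-point rounding. For real data the choice of k is exactly the model selection problem the abstract addresses.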

Degree

PhD

Discipline

Statistics