Peer-Reviewed Journal Details
Mandatory Fields
Lalor, GC;Zhang, CS
2001
December
Science of the Total Environment
Multivariate outlier detection and remediation in geochemical databases
Published
WOS: 42
Optional Fields
MULTIPLE OUTLIERS
281
99
109
In this study, outliers are classified into three types: (1) range outliers; (2) spatial outliers; and (3) relationship outliers, defined as observations that fall outside of the values expected from correlation within the dataset. The multivariate methods of principal component analysis (PCA), multiple regression analysis (MRA) and an autoassociation neural network (AutoNN) method are applied to a dataset comprising 203 samples of rare earth element (REE) concentrations in soils of Jamaica, which shows the expected good correlations between the elements. PCA is shown to be effective in detection of high-value range outliers, while AutoNN and MRA are effective in detection of relationship outliers. A backpropagation neural network was used to predict the 'expected values' of the outliers. Four obvious relationship outliers with unexpectedly low Sm concentrations were selected as an example for remediation. The predicted Sm values were confirmed on remeasurement. Neural network methods, with the advantages of being model-free and effective in solving non-linear relationship problems, appear to provide an automated and effective way for the quality control of environmental databases. (C) 2001 Elsevier Science B.V. All rights reserved.
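The idea of a relationship outlier described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only a regression-based check in the spirit of MRA, uses synthetic data in place of the Jamaican REE dataset, and substitutes the regression prediction for the backpropagation network that the paper used to estimate expected values. The function name `relationship_outliers`, the log transform, the robust (median absolute deviation) scaling, and the threshold of 3 are all assumptions made for illustration.

```python
import numpy as np

def relationship_outliers(X, target, threshold=3.0):
    """Flag rows whose value in column `target` departs from the value
    predicted by a least-squares fit on the remaining columns.

    Works on log-transformed concentrations (an assumption of this sketch).
    Returns the flagged row indices and the predicted log values.
    """
    logX = np.log(X)
    y = logX[:, target]
    others = np.delete(logX, target, axis=1)
    A = np.column_stack([np.ones(len(logX)), others])  # intercept + predictors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    predicted = A @ coef
    residuals = y - predicted
    # Robust centre and scale, so the outliers themselves do not inflate
    # the spread estimate used to standardise the residuals.
    centre = np.median(residuals)
    scale = 1.4826 * np.median(np.abs(residuals - centre))
    z = (residuals - centre) / scale
    return np.flatnonzero(np.abs(z) > threshold), predicted

# Synthetic stand-in data: three strongly correlated "element" columns.
rng = np.random.default_rng(0)
n = 203
base = rng.lognormal(mean=0.0, sigma=0.5, size=n)
X = np.column_stack(
    [base * k * (1 + rng.normal(0, 0.02, n)) for k in (10.0, 20.0, 4.0)]
)

true_value = X[5, 2]   # keep the uncorrupted value for comparison
X[5, 2] *= 0.2         # inject an unexpectedly low value in column 2

flagged, predicted_log = relationship_outliers(X, target=2)
expected_value = np.exp(predicted_log[5])  # regression-based "expected value"
print("flagged rows:", flagged)
print("expected vs. original value at row 5:", expected_value, true_value)
```

The corrupted sample stays well inside the overall range of column 2, so a range check alone would not necessarily catch it; it stands out only because it breaks the correlation with the other columns, which is what distinguishes a relationship outlier from a range outlier.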
0048-9697
Grant Details
Publication Themes