Geological Society, London, Special Publications

Geological Society, London, Special Publications; 2006; v. 264; p. 1-10;
DOI: 10.1144/GSL.SP.2006.264.01.01
© 2006 Geological Society of London

Compositional data and their analysis: an introduction

V. Pawlowsky-Glahn¹ & J. J. Egozcue²

¹ Departament Informàtica i Matemàtica Aplicada, Universitat de Girona, Campus Montilivi, P4, E-17071 Girona, Spain (vera.pawlowsky@udg.es)
² Departament Matemàtica Aplicada III, Universitat Politècnica de Catalunya, Jordi Girona Salgado 1-3, C2, E-08034 Barcelona, Spain

Compositional data are data that carry only relative information: they are parts of some whole. In most cases they are recorded as closed data, i.e. data summing to a constant such as 100%; whole-rock geochemical data are a classic example. Compositional data have particular properties that preclude the application of standard statistical techniques to such data in raw form. Standard techniques are designed for data that are free to range from −∞ to +∞. Compositional data are always positive and, when given in closed form, range only from 0 to 100 or some other constant. If one component increases, others must, perforce, decrease, whether or not there is a genetic link between these components. As a result, standard statistical analysis of the relationships between raw components or parts in a compositional dataset is clouded by spurious effects. Although such analyses may give apparently interpretable results, they are, at best, approximations and need to be treated with considerable circumspection. The methods outlined in this volume are based on the premise that it is the relative variation of components which is of interest, rather than absolute variation. Log-ratios of components provide the natural means of studying compositional data. In this contribution the basic terms and operations are introduced using simple numerical examples to illustrate their computation and to familiarize the reader with their use.
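As an informal illustration of the two ideas named above, closure to a constant sum and log-ratios of components, the following minimal Python sketch may help. It is not taken from the paper: the function names `closure` and `clr`, the choice of the centred log-ratio as the particular log-ratio, and the three-part sample values are all assumptions made for illustration only.

```python
import math


def closure(parts, total=100.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean.

    One of several possible log-ratio representations (an assumption here;
    the abstract does not single out a specific one).
    """
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical three-part composition in arbitrary units (illustrative values).
raw = [12.0, 3.0, 5.0]

closed = closure(raw)      # the same composition expressed in percent
clr_raw = clr(raw)         # log-ratios computed from the raw amounts
clr_closed = clr(closed)   # log-ratios computed from the closed data

print(closed)
print(clr_raw)
print(clr_closed)
```

Under these assumptions, the log-ratio coordinates come out the same (up to floating-point rounding) whether they are computed from the raw amounts or from the closed percentages, which is one way of seeing that log-ratios depend only on the relative information in the data.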