Questions tagged [compositional-data]

Refers to variables representing fractions of a total, i.e. all lying in the interval $[0,1]$ and necessarily summing to one. Analysis of such data is often called compositional data analysis.

Compositional data pertain to the relative proportions of a whole. For example,

  • Each data point may correspond to a rock composed of three different minerals; a rock of which 10% is the first mineral, 30% is the second, and the remaining 60% is the third would correspond to the triple [0.1, 0.3, 0.6]; a data set would contain one such triple for each rock in a sample of rocks.
    (Wikipedia)

This is subtly different from binomial or multinomial data, which can also yield proportions that lie exclusively in $[0, 1]$ (or could be represented as counts out of a total), but where the proportions come from discrete events that could have been one category or another.

The analysis of such data requires special methods, mostly based on log ratios.
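
As a rough illustration of what "based on log ratios" can mean, here is a minimal sketch of the centered log-ratio (clr) transform applied to the rock example above. The function names `closure`, `clr`, and `clr_inverse` are hypothetical helpers written for this sketch, not part of any particular library, and the code assumes every part is strictly positive (zeros would need separate treatment before taking logs).

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centered log-ratio: log of each part relative to the geometric mean.

    Assumes all parts are strictly positive.
    """
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

def clr_inverse(coords):
    """Map clr coordinates back to a composition that sums to one."""
    return closure([math.exp(c) for c in coords])

rock = [0.1, 0.3, 0.6]      # the mineral triple from the example
coords = clr(rock)          # unconstrained values whose sum is zero
back = clr_inverse(coords)  # recovers the original proportions
```

The clr coordinates sum to zero and do not depend on the scale of the input, so `clr([10, 30, 60])` gives the same result as `clr([0.1, 0.3, 0.6])`. Other log-ratio transforms (additive, isometric) follow the same idea with a different choice of reference.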

Additional resources can be found at http://www.compositionaldata.com/

169 questions
5
votes
1 answer

How do you interpret parameters from logratio analysis of compositional data?

In compositional data analysis as studied by John Aitchison, the analogue of simple linear regression is $$Y_i = \alpha\oplus\beta\odot X_i \oplus \epsilon_i$$ Here, $Y_i = [y_1,...,y_n]$ is the response, $X_i \in \mathbb{R}$ is a covariate,…
4
votes
0 answers

Calculating distance with compositional and non-compositional data

I have demographic data across different districts/neighbourhoods, and would like to find, for a given district, which is its most similar peer district across multiple variables such as size (total population), race, nationality etc. The idea would…
4
votes
1 answer

Compositional data analysis - what's the "method"?

Let $\textbf{X} = (X_1, \ldots, X_n)$ be a vector of responses, where $X_i = (p_1, \ldots, p_k)$ is itself a vector of probabilities. What method does one use to analyze such data? I want the logic/steps/ideas/concepts behind the method outlined in…
Margin
  • 41
3
votes
1 answer

Overview of compositional data analysis

I just need a very short summary of what the standard way to deal with compositional data is. I've skimmed pages in a 500-page long book on the topic, and I didn't really gather much. I would like to know the general idea before I delve into the…
Markis
  • 31
2
votes
0 answers

Is this a multivariate regression problem or separate univariate regression problem?

I have data on the percentage of their 24h day that animals spend doing certain activities. One response for one animal may look like: 30% spent hunting, 30% spent sleeping, 30% spent eating, 10% spent mating. And then of course we have many…
Maikel
  • 21
  • 1
1
vote
1 answer

How would you analyse dependence of proportions?

We are given a compositional data set, where the response is $$Y = [y_1, ..., y_n], \sum y_i = 1, y_i \in [0,1]$$ I intend to do regression, however, prior to that, I would like to get a feel of the codependence structure of $Y$. What is the right…
doso
  • 11
1
vote
0 answers

Are there potential pitfalls to compositional data with many components?

I currently have a data set where the response is compositional with many components. I am considering lumping some of the components together. I believe this will make it easier to spot the potential effect that some independent variables may have…
Nura
  • 11
1
vote
1 answer

Why can't I just analyse compositional data using regular multivariate analysis?

For a data set, I have compositional response variables: probabilities that sum to 1. Why can't I just analyse this using a linear model or, alternatively, a generalized linear model in which I use the Dirichlet distribution instead? Currently I am…
Marggie
  • 21
0
votes
0 answers

Centered log-ratio transformation of overlapping compositions

I have a total of $N$ units that are assigned to $D$ classes based on certain combinations of characteristics. If each unit is assigned to only one class, then the problem reduces to the usual compositional data with $D$ compositions, which are…
Alemu
  • 125
0
votes
0 answers

Replacement of true zeros in compositional data

How does one go about replacing true zeros in compositional data sets? There is a lot of information on handling rounded zeros, such as multiplicative replacement (MultRepl in R), but not on true zeros.
Kyle
  • 21
0
votes
0 answers

Compositional data analysis

I have one independent categorical variable "Period" (day or night) and 4 dependent continuous variables "Behaviours" (walking, lying, standing, grazing) which are compositional in nature (add to 100% for a given period). Original data for…
Kyle
  • 21