
Ok this is a very simple question but one which has been troubling me...

Let's say that I want to calculate the equilibrium constant K for a chemical reaction where A -> B:

K = [A]/[B]

where [A] is the concentration of A and [B] is the concentration of B at equilibrium

I have measured the quantities [A] and [B] and thus have two sets of data.

To calculate K should I:

i) Take the ratio of [A] and [B] at each data point and then calculate a mean average of all of the K values?

OR ii) Plot a graph of [A] against [B] and use the gradient of the trendline as the value of K?

My instinct tells me it is the former as the straight line plot on excel has a non-zero intercept (which is not reasonable)

If it is option i) can you explain to me why it is so?

Chemist
  • The best solution is probably neither: it should depend on the error structure of the measurement process. Presumably there is some measurement error in both [A] and [B]. The crux of the matter concerns the possibility of correlations among the errors in each data point. A clear analysis (from you) of the nature of the measurement errors would help identify appropriate ways to proceed. Perhaps you could share some thoughts about that? – whuber Feb 22 '16 at 18:44

1 Answer

My instinct tells me it is the former as the straight line plot on excel has a non-zero intercept

This may be due to regression attenuation. In the following example we generate $[A] = [B] \sim U(1,2)$ and then add gamma-distributed noise to each of the two variables.

The image below shows that the situation is symmetric in $x$ and $y$, but the fitted line is not: its slope is pulled below 1 and its intercept lies above zero. (See also Effect of switching response and explanatory variable in simple linear regression.)

example of attenuation

set.seed(1)
n = 100
u = runif(n,1,2)
x = rgamma(n,20,20/u)
y = rgamma(n,20,20/u)

plot(x,y, xlim = c(0,3),ylim=c(0,3))
mod = lm(y~x)
lines(c(-1,4), mod$coefficients[1]+ mod$coefficients[2]*c(-1,4), lty = 2)

mean(y/x) ## result: 1.045009


The alternative, averaging the values $y/x$, is not perfect either, as it can be biased as well (there are some formulas to reduce this bias).

In the example above the mean of $y/x$ is approximately $1.045$
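One way to see where a value near 1.05 comes from, assuming (as in the simulation above) that $x$ and $y$ are independent given $u$ and each is gamma distributed with shape $k = 20$ and rate $k/u$, and using the fact that a gamma variable with shape $k > 1$ has $E[1/x] = \text{rate}/(k-1)$:

$$
E\left[\frac{y}{x} \,\middle|\, u\right]
= E[y \mid u]\; E\left[\frac{1}{x} \,\middle|\, u\right]
= u \cdot \frac{k/u}{k-1}
= \frac{k}{k-1}
= \frac{20}{19} \approx 1.053
$$

The result does not depend on $u$, so under these assumptions the mean of $y/x$ is expected to sit about 5% above the true ratio of 1 no matter how many data points are collected.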

Below we see a histogram for a thousand repetitions of the above simulation. It shows that the mean of $y/x$ is on average around 1.05 instead of the true value 1.00.

histogram of mean(y/x)
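The code behind this histogram is not shown in the answer. A minimal sketch of how it might be reproduced, reusing the same data-generating step as above, is given here; the names `n_rep` and `mean_of_ratios` and the choice of 30 histogram bins are assumptions made for illustration.

```r
set.seed(1)
n_rep <- 1000   # number of repeated simulations (assumed from "a thousand repetitions")
n <- 100        # data points per simulation, as in the example above

# For each repetition, regenerate the data and store mean(y/x)
mean_of_ratios <- replicate(n_rep, {
  u <- runif(n, 1, 2)                        # true common value of [A] and [B]
  x <- rgamma(n, shape = 20, rate = 20 / u)  # noisy measurement with mean u
  y <- rgamma(n, shape = 20, rate = 20 / u)  # second noisy measurement with mean u
  mean(y / x)
})

hist(mean_of_ratios, breaks = 30, main = "mean(y/x)")
mean(mean_of_ratios)  # average of the estimator over all repetitions
```

The average over repetitions should land close to 1.05 rather than 1, which is the bias the histogram illustrates.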


A simple alternative is to use the ratio of the averages of $y$ and $x$, estimating the reaction ratio as $\hat{r} = \bar{y}/\bar{x}$.

histogram of mean(y)/mean(x)
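As with the previous histogram, the generating code is not given. The following is a sketch under the same simulation assumptions, with `n_rep` and `ratio_of_means` as illustrative names, showing how the ratio-of-means estimator could be checked over repeated simulations.

```r
set.seed(1)
n_rep <- 1000   # number of repeated simulations (assumed)
n <- 100        # data points per simulation, as in the example above

# For each repetition, regenerate the data and store mean(y)/mean(x)
ratio_of_means <- replicate(n_rep, {
  u <- runif(n, 1, 2)
  x <- rgamma(n, shape = 20, rate = 20 / u)
  y <- rgamma(n, shape = 20, rate = 20 / u)
  mean(y) / mean(x)
})

hist(ratio_of_means, breaks = 30, main = "mean(y)/mean(x)")
mean(ratio_of_means)  # average of the estimator over all repetitions
```

In this setup the average should be much closer to the true value 1 than the mean of the individual ratios, because the noise in $x$ is averaged out before the division takes place.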