There are really three issues here:
1) In a random-effects model with $k=2$, how well does the CI for $\mu$ work when using the standard Wald-type method and when using the Wald-type method with the Knapp and Hartung adjustment?
My suggestion would be to explore this using a simulation study. Here would be one way of approaching this:
library(metafor)

mu    <- 0.8    # true average effect
tau2  <- 0.4    # amount of heterogeneity
iters <- 10000  # number of simulation iterations

ci1 <- rep(NA, iters)  # coverage indicator: standard Wald-type CI
ci2 <- rep(NA, iters)  # coverage indicator: CI with Knapp-Hartung adjustment

for (i in 1:iters) {
   vi  <- runif(2, .05, 1.5)             # sampling variances of the two studies
   yi  <- rnorm(2, mu, sqrt(tau2 + vi))  # observed effect estimates
   res <- rma(yi, vi)                    # RE model without the adjustment
   ci1[i] <- res$ci.lb <= mu && res$ci.ub >= mu
   res <- rma(yi, vi, knha=TRUE)         # RE model with the adjustment
   ci2[i] <- res$ci.lb <= mu && res$ci.ub >= mu
}

mean(ci1)  # empirical coverage without the adjustment
mean(ci2)  # empirical coverage with the adjustment
So, I am simulating two sampling variances from a uniform distribution (of course, you could try some other distribution here) and then two estimates from a normal distribution, with variance equal to the amount of heterogeneity plus the sampling variance. I then fit the random-effects model without the Knapp and Hartung adjustment and again with the adjustment, and each time I check whether the CI captures $\mu$. If I run this, I get:
> mean(ci1)
[1] 0.8955
> mean(ci2)
[1] 0.9485
So, the coverage of the CI with the adjustment is essentially nominal, while the CI without the adjustment undercovers. Of course, one should start examining this in a more systematic manner (varying $\mu$, $\tau^2$, $k$, and the distribution of the sampling variances), but what you will find is that the adjustment works rather well at creating a CI with more or less nominal coverage, while the CI without the adjustment does not.
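As a sketch of such a more systematic check, one could wrap the simulation in a function and then vary the number of studies (the function name and default values here are just illustrative choices):

```r
library(metafor)

# estimate the empirical coverage of the Knapp-Hartung CI for a given k
coverage_sim <- function(k, mu = 0.8, tau2 = 0.4, iters = 1000) {
   covered <- logical(iters)
   for (i in 1:iters) {
      vi  <- runif(k, .05, 1.5)             # sampling variances
      yi  <- rnorm(k, mu, sqrt(tau2 + vi))  # observed effect estimates
      res <- rma(yi, vi, knha = TRUE)
      covered[i] <- res$ci.lb <= mu && res$ci.ub >= mu
   }
   mean(covered)
}

# e.g., coverage across several values of k:
sapply(c(2, 5, 10), coverage_sim)
```

The same function could then be run over a grid of $\mu$, $\tau^2$, and variance distributions to map out where each method breaks down.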
2) Why is the CI with the adjustment so incredibly wide?
As the results for your particular example show, the estimated value of $\mu$ is the same whether or not you use the adjustment (the adjustment has no influence on the estimate itself), and the corresponding standard errors differ only slightly. But the CI for $\mu$ is much wider when using the adjustment. The reason for that is simple: without the adjustment, the CI is computed with $$\hat{\mu} \pm 1.96 \cdot SE[\hat{\mu}],$$ and with the adjustment with $$\hat{\mu} \pm t_{.975; k-1} \cdot SE[\hat{\mu}],$$ where $t_{.975; k-1}$ is the 97.5th percentile of a t-distribution with $k-1$ degrees of freedom. So, with $k=2$, we actually have a t-distribution with 1 degree of freedom, which, interestingly, happens to be a (standard) Cauchy distribution. At any rate, the 97.5th percentile is then
> qt(.975, df=1)
[1] 12.7062
So, instead of $\pm 1.96$, the CI is computed with $\pm 12.7062$. No surprise that it is so incredibly wide.
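To see how much wider this makes the interval for the same standard error, one can compare the two critical values directly:

```r
# ratio of the t-based to the normal-based critical value
qt(.975, df=1) / qnorm(.975)  # about 6.48
```

So, for a given $SE[\hat{\mu}]$, the adjusted CI with $k=2$ is roughly 6.5 times as wide as the unadjusted one.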
3) Does it even make sense to fit a random-effects model with $k=2$?
There is of course no right or wrong answer here and the issue is debatable. However, keep in mind that the goal of a random-effects model is to make an inference about the average true effect in a larger population of (hypothetical) studies. Doing so based on just two realizations from that population is risky at best (some may even say foolish). As the simulation above shows, using the adjustment does give us a CI with more or less nominal coverage when $k=2$, but that comes at a heavy price: the CI is rather uninformative. I think that is a proper reflection of the degree of uncertainty in such an endeavor, though.