
Some references first:

  1. How is approximately unbiased bootstrap better than a regular bootstrap with regards to hierarchical clustering?
  2. Suzuki & Shimodaira (2004), "An Application of Multiscale Bootstrap Resampling to Hierarchical Clustering of Microarray Data: How Accurate are these Clusters?" https://www.researchgate.net/publication/228851295_An_Application_of_Multiscale_Bootstrap_Resampling_to_Hierarchical_Clustering_of_Microarray_Data_How_Accurate_are_these_Clusters
  3. pvclust reference manual (CRAN): https://cran.r-project.org/web/packages/pvclust/pvclust.pdf

For starters, I don't think this is a duplicate question: I'm not asking what the term "approximately unbiased p-value" stands for. As I understand it, it has the same meaning as a bootstrap value but is derived from multistep multiscale resampling; it still relates to the number of bootstrap trees that contain the same 'type of cluster' (I'm not sure it is the same node as in the standard bootstrap procedure). So even though my question might look like a duplicate of the question I referenced above, there is still no clear answer to it there.
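
To state my understanding a bit more precisely (and please correct me if this is off): following Shimodaira's multiscale bootstrap theory, pvclust repeats the bootstrap at several sample sizes $n' = n \cdot r$, records the bootstrap probability at each scale $\sigma^2 = n/n'$, fits the linear model below, and takes AU from the formal extrapolation to $\sigma^2 = -1$, where $v$ and $c$ are the fitted signed-distance and curvature terms:

$$\sigma\,\Phi^{-1}\!\left(1-\mathrm{BP}_{\sigma^2}\right)\;\approx\;v+c\,\sigma^{2},\qquad \mathrm{BP}=1-\Phi(v+c),\qquad \mathrm{AU}=1-\Phi(v-c).$$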

My question arises from a practical problem I'm facing. I used pvclust to calculate Bootstrap Probabilities (BP) and Approximately Unbiased p-values (AU). The data consists of a few objects and many descriptors.
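
Here is a minimal sketch of the kind of call I'm talking about (on simulated data, not my real matrix; the names `mat` and `fit` are just placeholders, and I'm assuming the objects to be clustered are in the columns, since pvclust clusters the columns of its input):

    library(pvclust)

    ## Toy data in the same shape as mine: many descriptors (rows), few objects (columns)
    set.seed(1)
    mat <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
    colnames(mat) <- paste0("obj", 1:10)

    fit <- pvclust(mat,
                   method.hclust = "average",
                   method.dist   = "correlation",
                   nboot         = 1000)   # bootstrap replicates per scale

    fit$edges[, c("au", "bp")]   # AU p-values and raw bootstrap probabilities per edge
    plot(fit)                    # dendrogram annotated with AU (red) and BP (green)
    pvrect(fit, alpha = 0.95)    # boxes around clusters with AU >= 0.95

The `v` and `c` columns of `fit$edges` are the fitted terms from the model above, which is where the gap between AU and BP ultimately comes from.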

The striking difference between BP and AU made me question the reliability of this type of validation in my case: AU values are usually over 90% while BP values are very low (around 30%). Is there any guideline on which approach is considered more reliable, and in which case? I'm only using one 'type' of data, which means my descriptors are not totally independent. Does that make the AU values meaningless? Is BP more accurate in this simple case?

Mirko
