Most Popular

1500 questions
35 votes • 3 answers

What stop-criteria for agglomerative hierarchical clustering are used in practice?

I have found extensive literature proposing all sorts of criteria (e.g. Glenn et al. 1985 (pdf) and Jung et al. 2002 (pdf)). However, most of these are not that easy to implement (at least from my perspective). I am using scipy.cluster.hierarchy to…
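One practical stop criterion available directly in scipy.cluster.hierarchy is cutting the dendrogram either at a distance threshold or at a fixed number of clusters with `fcluster`. A minimal sketch on toy data (the data and thresholds here are illustrative, not from the question):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated blobs as toy data
X = np.vstack([rng.normal(0, 0.5, (20, 2)),
               rng.normal(5, 0.5, (20, 2))])

Z = linkage(X, method="ward")

# Stop criterion 1: cut the dendrogram at a cophenetic-distance threshold
labels_dist = fcluster(Z, t=5.0, criterion="distance")

# Stop criterion 2: simply request a fixed number of clusters
labels_k = fcluster(Z, t=2, criterion="maxclust")

print(len(set(labels_k)))  # 2
```

Both criteria are "stop rules" in the sense that they decide where merging ends; choosing the threshold itself is the harder problem the literature above addresses.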
35 votes • 1 answer

What are some useful guidelines for GBM parameters?

What are some useful guidelines for testing parameters (i.e. interaction depth, minchild, sample rate, etc.) using GBM? Let's say I have 70-100 features, a population of 200,000 and I intend to test interaction depth of 3 and 4. Clearly I need to do…
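A common way to structure such parameter testing is a cross-validated grid search. A hedged sketch using scikit-learn's `GradientBoostingClassifier` (the question likely refers to R's gbm package, where the parameter names differ: interaction depth maps to `max_depth`, minchild to `min_samples_leaf`, sample rate to `subsample`; the data here is a small synthetic stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic stand-in for the 200,000-row problem
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Grid over the parameters the question names
param_grid = {
    "max_depth": [3, 4],     # "interaction depth"
    "subsample": [0.5, 0.8], # "sample rate"
}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid,
    cv=3,
)
search.fit(X, y)
print(search.best_params_["max_depth"] in (3, 4))  # True
```

With 70–100 features and 200,000 rows, a full grid is expensive, which is exactly why heuristic guidelines for narrowing the grid are worth asking about.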
35 votes • 3 answers

Things to consider about masters programs in statistics

It is admission season for graduate schools. I (and many students like me) am now trying to decide which statistics program to pick. What are some things those of you who work with statistics suggest we consider about masters programs in…
35 votes • 11 answers

Why is generating 8 random bits uniform on [0, 255]?

I am generating 8 random bits (either a 0 or a 1) and concatenating them together to form an 8-bit number. A simple Python simulation yields a uniform distribution on the discrete set [0, 255]. I am trying to justify why this makes sense in my…
glassy • 481
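The justification the question is after: each of the 2⁸ = 256 bit patterns arises with probability (1/2)⁸, since the 8 bits are independent and fair, so the concatenated integer is uniform on {0, …, 255}. A quick simulation sketch:

```python
import random

random.seed(0)

def rand_byte():
    # Concatenate 8 independent fair bits into one integer:
    # each of the 2**8 = 256 patterns has probability (1/2)**8,
    # so the result is uniform on {0, ..., 255}.
    value = 0
    for _ in range(8):
        value = (value << 1) | random.getrandbits(1)
    return value

samples = [rand_byte() for _ in range(100_000)]
print(min(samples), max(samples))  # 0 255
```

With 100,000 draws, every one of the 256 values appears many times, and a histogram is flat up to sampling noise.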
35 votes • 7 answers

Convolutional Layers: To pad or not to pad?

The AlexNet architecture uses zero-padding as shown in the picture. However, there is no explanation in the paper of why this padding is introduced. The Stanford CS 231n course teaches that we use padding to preserve the spatial size: I am curious if that is the…
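The size-preservation claim is easy to check with the standard convolution output-size formula, out = ⌊(n + 2p − k)/s⌋ + 1:

```python
def conv_out_size(n, k, p, s):
    """Spatial output size of a convolution:
    n = input size, k = kernel size, p = zero-padding, s = stride."""
    return (n + 2 * p - k) // s + 1

# 'Same' padding (p = (k-1)/2) preserves spatial size at stride 1:
print(conv_out_size(32, k=3, p=1, s=1))  # 32
# Without padding the feature map shrinks by k-1 per layer:
print(conv_out_size(32, k=3, p=0, s=1))  # 30
# AlexNet's first layer: 227x227 input, 11x11 kernel, stride 4:
print(conv_out_size(227, k=11, p=0, s=4))  # 55
```

Without padding, stacking many layers would shrink the feature map to nothing, and border pixels would contribute to fewer outputs than central ones.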
35 votes • 4 answers

How to measure smoothness of a time series in R?

Is there a good way to measure the smoothness of a time series in R? For example, -1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0 is much smoother than -1, 0.8, -0.6, 0.4, -0.2, 0, 0.2, -0.4, 0.6, -0.8, 1.0 although they have the same mean and…
agmao • 451
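One simple heuristic (not the only possible measure) is the standard deviation of the first differences, which is small for the smooth series and large for the oscillating one. Sketched here in Python on the question's two example series; the R analogue would be `sd(diff(x))`:

```python
import numpy as np

def diff_sd(x):
    # Standard deviation of first differences: near zero for a
    # smooth, steadily changing series; large for an oscillating one.
    return np.std(np.diff(x))

smooth = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0])
jagged = np.array([-1, 0.8, -0.6, 0.4, -0.2, 0, 0.2, -0.4, 0.6, -0.8, 1.0])

print(diff_sd(smooth) < diff_sd(jagged))  # True
```

The smooth series has constant differences of 0.2, so its diff-SD is essentially zero, even though both series share the same mean and range.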
35 votes • 2 answers

PCA in numpy and sklearn produces different results

Am I misunderstanding something? This is my code using sklearn: import numpy as np import matplotlib.pyplot as plt from mpl_toolkits.mplot3d import Axes3D from sklearn import decomposition from sklearn import datasets from sklearn.preprocessing…
aceminer • 1,043
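Two usual causes of such discrepancies are centering (sklearn's PCA centers the data internally; a raw SVD of the uncentered matrix does not) and sign ambiguity (each principal axis is defined only up to ±1). A pure-numpy sketch, assuming toy data, showing the two routes agree up to sign once both operate on centered data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data with three well-separated variance scales
X = rng.normal(size=(100, 3)) @ np.diag([2.0, 1.0, 0.5])

# Route 1: SVD of the *centered* data (what sklearn's PCA does internally)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

# Route 2: eigendecomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]  # sort by decreasing variance

# The axes agree only up to sign, so compare absolute values
print(np.allclose(np.abs(Vt.T), np.abs(eigvecs), atol=1e-8))  # True
```

If one library flips a component's sign relative to the other, projections differ by a sign too, which often looks like "different results" when the decompositions are in fact equivalent.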
35 votes • 3 answers

Is whitening always good?

A common pre-processing step for machine learning algorithms is whitening of data. It seems like it is always good to do whitening since it de-correlates the data, making it simpler to model. When is whitening not recommended? Note: I'm referring to…
Ran • 1,626
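For concreteness, PCA whitening rotates onto the covariance eigenbasis and rescales each direction to unit variance; because it divides by the square root of each eigenvalue, near-zero-variance (noise) directions get amplified enormously, which is one standard case where whitening is not recommended. A minimal numpy sketch on toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated toy data
X = rng.normal(size=(1000, 2)) @ np.array([[2.0, 1.0], [0.0, 1.0]])

# PCA whitening: rotate onto the covariance eigenbasis and rescale
# each direction to unit variance. Dividing by sqrt(eigenvalue)
# blows up directions whose eigenvalue is near zero.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W = eigvecs / np.sqrt(eigvals)  # whitening matrix
Xw = Xc @ W

# The whitened covariance is (approximately) the identity
print(np.allclose(np.cov(Xw, rowvar=False), np.eye(2), atol=1e-8))  # True
```

In practice a small constant is often added to the eigenvalues before the division precisely to keep those low-variance directions from dominating.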
35 votes • 3 answers

How to build the final model and tune probability threshold after nested cross-validation?

Firstly, apologies for posting a question that has already been discussed at length here, here, here, here, here, and for reheating an old topic. I know @DikranMarsupial has written about this topic at length in posts and journal papers, but I'm…
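Separate from how the final model is built, the threshold-tuning half of the question can be sketched simply: collect out-of-fold probabilities from cross-validation, then pick the threshold that maximizes a chosen metric (F1 here) on that held-out data. A numpy-only illustration with hypothetical toy probabilities (all names and values are illustrative):

```python
import numpy as np

def f1(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def tune_threshold(y_true, proba):
    # Scan candidate thresholds over the out-of-fold probabilities
    # and keep the one that maximizes F1 on the held-out data.
    candidates = np.unique(proba)
    scores = [f1(y_true, (proba >= t).astype(int)) for t in candidates]
    return candidates[int(np.argmax(scores))]

# Toy out-of-fold probabilities: positives score higher on average
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p = np.array([0.1, 0.2, 0.35, 0.6, 0.55, 0.7, 0.8, 0.9])
t = tune_threshold(y, p)
print(0 < t < 1)  # True
```

The key point in the nested-CV discussions is that the threshold must be tuned on predictions the model did not train on, otherwise its estimated performance is optimistic.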
35 votes • 3 answers

What is the most accurate way of determining an object's color?

I have written a computer program that can detect coins in a static image (.jpeg, .png, etc.) using some standard techniques for computer vision (Gaussian blur, thresholding, Hough transform, etc.). Using the ratios of the coins picked up from a…
35 votes • 5 answers

Think like a Bayesian, check like a frequentist: what does that mean?

I am looking at some lecture slides on a data science course which can be found here: https://github.com/cs109/2015/blob/master/Lectures/01-Introduction.pdf I, unfortunately, cannot see the video for this lecture and at one point on the slide, the…
Luca • 4,650
35 votes • 2 answers

Raw residuals versus standardised residuals versus studentised residuals - what to use when?

This looks like a similar question and didn't get many responses. Omitting tests such as Cook's D, and just looking at residuals as a group, I am interested in how others use residuals when assessing goodness-of-fit. I use the raw residuals: in a…
Michelle • 3,900
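For reference, the three flavours differ only in scaling: raw residuals are e_i; standardized (internally studentized) residuals divide by s·√(1−h_ii) using the leverage h_ii; externally studentized residuals instead use the residual standard deviation s_(i) computed with observation i left out. A numpy sketch on toy regression data, using the standard leave-one-out variance identity:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta                                  # raw residuals
H = X @ np.linalg.inv(X.T @ X) @ X.T              # hat matrix
h = np.diag(H)                                    # leverages
s2 = e @ e / (n - p)                              # residual variance

standardized = e / np.sqrt(s2 * (1 - h))
# Externally studentized: leave-one-out variance via the identity
# (n-p-1) * s2_(i) = SSE - e_i^2 / (1 - h_i)
s2_i = (e @ e - e**2 / (1 - h)) / (n - p - 1)
studentized = e / np.sqrt(s2_i * (1 - h))

print(standardized.shape == studentized.shape == (30,))  # True
```

Externally studentized residuals follow a t distribution under the model, which is what makes them convenient for flagging outliers; raw residuals have unequal variances across observations, which is the usual argument against using them directly.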
35 votes • 3 answers

How and why does Batch Normalization use moving averages to track the accuracy of the model as it trains?

I was reading the batch normalization (BN) paper (1) and didn't understand the need to use moving averages to track the accuracy of the model and even if I accepted that it was the right thing to do, I don't understand what they are doing…
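For context: the moving averages in BN do not track accuracy at all; they maintain running estimates of each feature's mean and variance during training so that, at inference time, a single example can be normalized without needing a batch. A numpy sketch of the exponential-moving-average update (the momentum value is a common default, not from the paper's excerpt):

```python
import numpy as np

rng = np.random.default_rng(0)
momentum = 0.9            # decay of the running estimates
running_mean = np.zeros(4)
running_var = np.ones(4)

for _ in range(500):
    batch = rng.normal(loc=3.0, scale=2.0, size=(32, 4))
    # During training, normalization itself uses the *batch*
    # statistics; the running averages are only updated on the side:
    running_mean = momentum * running_mean + (1 - momentum) * batch.mean(axis=0)
    running_var = momentum * running_var + (1 - momentum) * batch.var(axis=0)

# At inference the running estimates stand in for batch statistics
print(np.allclose(running_mean, 3.0, atol=0.5))  # True
```

The averages converge to the population mean (3.0) and variance (4.0) of the feature stream, which is exactly what a lone test example should be normalized by.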
35 votes • 2 answers

If the Epanechnikov kernel is theoretically optimal when doing Kernel Density Estimation, why isn't it more commonly used?

I have read (for example, here) that the Epanechnikov kernel is optimal, at least in a theoretical sense, when doing kernel density estimation. If this is true, then why does the Gaussian show up so frequently as the default kernel, or in many…
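For reference, the Epanechnikov kernel is K(u) = 0.75·(1 − u²) on |u| ≤ 1: optimal for mean integrated squared error, but compactly supported and not smooth at |u| = 1, which is part of why the infinitely differentiable Gaussian is often preferred in practice. A minimal numpy KDE sketch (bandwidth and data are illustrative):

```python
import numpy as np

def epanechnikov_kde(x_grid, data, h):
    # Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on |u| <= 1;
    # the estimate is the average of scaled kernels centered at
    # each data point: (1 / (n h)) * sum_i K((x - x_i) / h).
    u = (x_grid[:, None] - data[None, :]) / h
    K = 0.75 * np.clip(1 - u**2, 0, None)
    return K.mean(axis=1) / h

rng = np.random.default_rng(0)
data = rng.normal(size=500)
grid = np.linspace(-5, 5, 1001)
density = epanechnikov_kde(grid, data, h=0.5)

# The estimate is a valid density: nonnegative everywhere
print(bool(np.all(density >= 0)))  # True
```

Numerically integrating `density` over the grid gives approximately 1, as a density should; the practical point is that the efficiency loss from using a Gaussian instead is small.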
35 votes • 2 answers

How to plot decision boundary of a k-nearest neighbor classifier from Elements of Statistical Learning?

I want to generate the plot described in the book "The Elements of Statistical Learning: Data Mining, Inference, and Prediction", Second Edition, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (ElemStatLearn). The plot is: I am wondering how I…
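The general recipe behind that figure is to classify every point of a dense grid over the plotting region and draw the contour where the predicted label changes. A self-contained numpy sketch with a hand-rolled k-NN vote on toy two-class data (the book's figure uses k = 15 on its "mixture" dataset, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy two-class training data standing in for the book's mixture data
X = np.vstack([rng.normal(-1, 0.7, (30, 2)), rng.normal(1, 0.7, (30, 2))])
y = np.repeat([0, 1], 30)

def knn_predict(grid_pts, X, y, k=15):
    # Label each grid point by majority vote among its k nearest
    # training points (squared Euclidean distance; odd k avoids ties).
    d2 = ((grid_pts[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return (y[nearest].mean(axis=1) > 0.5).astype(int)

# Dense grid over the plotting region
xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
grid = np.column_stack([xx.ravel(), yy.ravel()])
labels = knn_predict(grid, X, y).reshape(xx.shape)

print(labels.shape)  # (100, 100)
```

Passing `xx`, `yy`, and `labels` to `plt.contourf` (or `plt.contour` at level 0.5) then renders the filled regions and the decision boundary, with the training points scattered on top.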