Questions tagged [descriptive-statistics]

Descriptive statistics summarize features of a sample.

Common descriptive statistics include the mean and standard deviation, particular quantiles such as the median and quartiles, the maximum and minimum, the range and interquartile range, and five-number summaries. With multiple variables, they may also include correlations and crosstabs.

Descriptive statistics may include visual displays such as boxplots, histograms and scatterplots.
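As a rough illustration of the summaries listed above, the following sketch computes them with Python's standard `statistics` module. The `describe` helper and the sample values are made up for this example, and the quartiles use the module's "inclusive" interpolation rule, which is only one of several quartile conventions in use.

```python
import statistics


def describe(values):
    """Return common descriptive statistics for a sample of at least two numbers."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),  # sample SD (n - 1 denominator)
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }


sample = [2, 4, 4, 5, 7, 9, 11]  # hypothetical data, not from any question here
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The five-number summary is the `min`, `q1`, `median`, `q3`, `max` subset of this dictionary. Other quartile rules can give slightly different `q1` and `q3` values on small samples.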

1814 questions
72 votes, 12 answers

What does orthogonal mean in the context of statistics?

In other contexts, orthogonal means "at right angles" or "perpendicular". What does orthogonal mean in a statistical context? Thanks for any clarifications.
pmgjones
13 votes, 3 answers

Standard measure of clumpiness?

I have a lot of data and I want to do something which seems very simple. In this large set of data, I am interested in how much a specific element clumps together. Let's say my data is an ordered set like this: {A,C,B,D,A,Z,T,C...}. Let's say I want…
Alan H.
11 votes, 2 answers

Why is the coefficient of variation not valid when using data with positive and negative values?

I can't seem to find a definitive answer to my question. My data consists of several plots with measured means varying from 0.27 to 0.57. In my case, all data values are positive, but the measurement itself is based on a ratio of reflectance values…
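One common explanation, sketched here under the assumption that the coefficient of variation (CV) is defined as the sample standard deviation divided by the mean, is that the CV depends on where zero sits on the measurement scale. The helper name and the numbers below are made up for illustration and do not come from the question.

```python
import statistics


def coefficient_of_variation(values):
    """Sample SD divided by the mean; undefined when the mean is zero."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ZeroDivisionError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


positive = [8.0, 10.0, 12.0]           # hypothetical all-positive data
shifted = [x - 9.0 for x in positive]  # same spread, now mixing signs

cv_positive = coefficient_of_variation(positive)
cv_shifted = coefficient_of_variation(shifted)
print(cv_positive, cv_shifted)
```

Both samples have the same standard deviation, yet the CV changes by a factor of ten because the shift moves the mean close to zero. With mixed-sign data the mean can be arbitrarily close to zero or negative, so the ratio stops describing relative spread.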
10 votes, 4 answers

What is the interpretation of interquartile range?

I have daily measurements of nitrogen dioxide for one year (365 days) and the interquartile range (IQR) is 24 micrograms per cubic meter. What does "24" mean in this context, apart from the definition of IQR which is the difference between the 25th and…
user2742
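A minimal sketch of one way to read an IQR: it is the width of the interval between the first and third quartiles, which holds roughly the middle half of the observations. The values below are made up (they are not the asker's nitrogen dioxide data), and the "inclusive" quartile rule is an assumption of this example.

```python
import statistics

# Hypothetical daily readings in micrograms per cubic meter.
values = [12, 18, 25, 31, 36, 40, 47, 55, 70]

q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Count how many observations fall between the quartiles (inclusive).
inside = [x for x in values if q1 <= x <= q3]
share_inside = len(inside) / len(values)

print(f"Q1={q1}, Q3={q3}, IQR={iqr}, share within [Q1, Q3]={share_inside:.2f}")
```

Under this reading, an IQR of 24 would mean the central half of the daily values spans a band 24 units wide, regardless of how extreme the highest and lowest days are.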
9 votes, 1 answer

Hamming distance for strings with different lengths

I am working on a project that involves computing similarity metrics between strings. I would like to know whether it is possible to use Hamming distance on strings with different lengths, and if possible, how to go about it. I step by step…
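For context, the Hamming distance counts mismatched positions and is only defined for strings of equal length. The sketch below shows that definition next to Levenshtein (edit) distance, a different metric that is often substituted when lengths differ. It is an illustration of the two definitions, not the method recommended in the thread's answers, and both function names are chosen for this example.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(hamming_distance("karolin", "kathrin"))
print(levenshtein_distance("kitten", "sitting"))
```

Padding the shorter string to equal length is another workaround sometimes used, but the result then depends on the alignment chosen, which is why an edit distance is often preferred.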
7 votes, 2 answers

Conceptual understanding of standard deviation vs average distance from the mean

I always understood standard deviation to be the average distance of the observations from the mean. But when I generated a standard normal distribution N(0,1) with n = 1,000,000 in Excel, and took the average of all negative observations and the…
Jon
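The gap the asker describes can be reproduced with a small simulation. This is a sketch using Python's standard library rather than Excel, with a fixed seed and a smaller sample size chosen for this example. For a standard normal variable the mean absolute deviation is known to be sqrt(2/pi), about 0.798, while the standard deviation is 1.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
draws = [rng.gauss(0.0, 1.0) for _ in range(n)]

mean = sum(draws) / n
# Standard deviation: square root of the average squared distance from the mean.
sd = math.sqrt(sum((x - mean) ** 2 for x in draws) / n)
# Mean absolute deviation: the plain average distance from the mean.
mean_abs_dev = sum(abs(x - mean) for x in draws) / n

theoretical_ratio = math.sqrt(2 / math.pi)  # E|Z| for Z ~ N(0, 1)
print(f"sd={sd:.3f}, mean_abs_dev={mean_abs_dev:.3f}, theory={theoretical_ratio:.3f}")
```

Squaring before averaging gives large deviations more weight, so the standard deviation is never smaller than the average distance from the mean.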
6 votes, 1 answer

Inequality involving interquartile range and standard deviation

Suppose I have a finite set of observations $x_{i}$, $i = 1, 2, \ldots, n$. Are there any inequalities relating the standard deviation and the interquartile range?
QTY
6 votes, 4 answers

Assessing the accuracy of a deterministic mathematical model

How can I assess the accuracy of the output of a deterministic mathematical model? For example, a climate model can predict the mean annual temperature (MAT) for a specific location. I can use the model to predict thirty years of MAT in New York City,…
Steven
5 votes, 2 answers

What are statistical methods for comparing different brackets?

I am not particularly familiar with statistics and am looking at methods for analysing numbers that have been broken into different "brackets" or "groups" for different entities. Consider three companies. A announces that they have 10 factories…
5 votes, 2 answers

Interpreting table 1 in clinical research papers

I am new to clinical research and am starting off by reading some clinical papers. In one paper, I came across a table like the one shown below. Am I right to understand that the value of HbA1c can range from 6.2 to 9.2? Is that the meaning of (1.5)? Similarly…
The Great
5 votes, 2 answers

How to create an index

I have 5 categories, each divided into the subcategories low, medium and high. An object can belong to one or more of these categories with a number between 1 and 100 in each subcategory, but the sum for each category cannot exceed 100.…
johannes
5 votes, 3 answers

How to identify a section of data with different characteristics

Consider this dataset, where the x-axis is some sort of "time" measure, and the y-axis is some sort of "value" measure. E.g. "time taken for website to respond" vs. "time of day", hypothetically speaking, of course ;) It is very clear to the human…
Brondahl
5 votes, 1 answer

Is there a "smooth" version of a median of a data set?

So I'm thinking about how one would compare sizes of past civilizations in terms of population size and territory, and imagining that the distributions are basically continuous variables. I know there is a clear sense in which one can compute their…
Addem
4 votes, 1 answer

Five-number summary and mean

Why was the mean not included in the five-number summary when it was first conceived? What was the motivation for choosing the sample minimum and maximum, the lower and upper quartiles, and the median?
hans-t
3 votes, 0 answers

Who is the best writer among statisticians?

I'd like to learn from their writing style. Could you recommend a statistician who writes well?
user67275