3

I am studying an introductory course in statistics, Essentials of Statistics. The author mentioned that histograms are used to represent the frequency distribution of continuous data. Immediately afterwards, he explained how to detect skewness in the data using histograms. Later he highlighted some other types of plots and graphs, including the bar chart.

What is missing for me: Which graphs represent quantitative discrete data? If a bar chart is used, is it possible to use it to detect skewness in quantitative discrete data?

Alexis
  • 29,850
Nizar
  • 867
  • It’s fine; histogram away with quantitative discrete data. Depending on the range of values, you might find it easy to do the binning better than the software default, however. – Dave Oct 10 '20 at 14:07
  • I think it's important to differentiate between 1) the method that is used to determine the representation of the property/feature/characteristic of the data you're analyzing and 2) the way that representation is displayed. For example, nothing stops you from displaying a histogram as a - please don't do this - pie chart. To answer your question: The skewness of a random variable describes the degree of asymmetry about the mean value. As a consequence, the bar plot of the corresponding histogram will have a trend towards the left or right side. – applesoup Nov 18 '21 at 21:35
  • I wonder what the distinction between "continuous" and "discrete" data might be here. These terms refer correctly to random variables or distributions, but in this context are not relevant characteristics of data. A batch of data is a bunch of numbers. Whether you choose to model them using continuous or discrete random variables has no bearing on whether you can construct a histogram, bar chart, or any other graphic representation of them. – whuber Apr 23 '22 at 22:30

3 Answers

1

You can represent univariate discrete data well using a bar plot, where the value of the variable is on the horizontal axis and the frequency/proportion of outcomes is on the vertical axis. This type of plot is essentially a type of histogram for discrete data.$^\dagger$ As for diagnosing skewness in the data, this should be reasonably evident from visual inspection of the bar plot in most cases, but it might be hard to diagnose in some difficult cases. You can supplement a visual assessment of skewness by computing the sample skewness for the data as one of your descriptive statistics.


$^\dagger$ Technically speaking, a bar plot for a univariate discrete variable (taking on integer values) is a histogram that uses "bins" that each contain an individual discrete outcome (i.e., a single integer), with the axis for the histogram taken only over discrete outcomes rather than a continuum.
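As a rough illustration of the two steps described above (one bar per distinct value, plus a sample skewness to back up the visual impression), here is a minimal sketch in plain Python. The data set and the function names `frequency_table`, `sample_skewness`, and `text_bar_plot` are hypothetical, and the skewness formula used is the simple moment coefficient $g_1 = m_3 / m_2^{3/2}$, which is only one of several definitions in use.

```python
from collections import Counter


def frequency_table(data):
    """Return sorted (value, count) pairs: one 'bin' per distinct discrete value."""
    return sorted(Counter(data).items())


def sample_skewness(data):
    """Moment coefficient of skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


def text_bar_plot(data):
    """Render a crude horizontal bar plot, one row per observed value."""
    rows = [f"{value:>3} | {'#' * count} ({count})"
            for value, count in frequency_table(data)]
    return "\n".join(rows)


# Hypothetical count data with a long right tail (assumed for illustration only).
counts_data = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 5]

print(text_bar_plot(counts_data))
print(f"sample skewness: {sample_skewness(counts_data):.3f}")
```

For these made-up data the bars shrink towards the larger values and the computed skewness is positive, so the numerical summary agrees with the visual impression of right skew. Values with zero observations (here, 4) simply do not get a row in this sketch; a real plot would normally leave a gap for them.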

Ben
  • 124,856
0

Does the author specify which definition of skewness is being used? There are several, and they normally do not rely on looking at histograms.

I am guessing that the author is proposing a (vague) definition of skewness that you could apply to discrete quantitative variables. You could plot a histogram for this and proceed as if you were dealing with a continuous variable.

0

You can also use a histogram to represent discrete data, including quantities like counts. A histogram represents, by the area of each bar, either the frequency or probability of values of $x$ (for a discrete variable) or, for a continuous variable, the probability that $x$ falls within a given range of values, with the bar height giving the probability density.
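To make the "by area" idea concrete, here is a minimal sketch in plain Python, using hypothetical data and a hypothetical helper `histogram_density`. It computes bar heights as densities (proportion divided by bin width), so that the areas of the bars, not their heights, add up to 1. With unit-width bins centred on the integers, height and proportion coincide, which is why a bar plot of proportions and a histogram look the same for integer-valued data.

```python
def histogram_density(data, edges):
    """Return per-bin density = (count / n) / bin_width for half-open bins [lo, hi).

    Assumes every observation falls inside [edges[0], edges[-1]).
    """
    n = len(data)
    densities = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(1 for x in data if lo <= x < hi)
        densities.append(count / n / (hi - lo))
    return densities


# Hypothetical count data (assumed for illustration only).
data = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 5]

# Unit-width bins centred on each integer: bar height equals the proportion.
unit_edges = [-0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
unit_density = histogram_density(data, unit_edges)

# Wider bins (width 2): bar height is proportion / 2, but areas still sum to 1.
wide_edges = [-0.5, 1.5, 3.5, 5.5]
wide_density = histogram_density(data, wide_edges)

for name, edges, dens in [("unit", unit_edges, unit_density),
                          ("wide", wide_edges, wide_density)]:
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    total_area = sum(d * w for d, w in zip(dens, widths))
    print(name, [round(d, 3) for d in dens], "total area:", round(total_area, 3))
```

Changing the bin width changes the bar heights but not the total area, which is the sense in which a histogram depicts probability by area.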

A bar chart, on the other hand, is a graph relating two different variables (e.g. $x$ is number of oranges eaten per capita per year, and $y$ is country). So you do not really use a bar chart to represent the distribution of a single variable.

Alexis
  • 29,850
  • In my opinion, it doesn't make sense to compare "histogram" and "bar graph". The histogram is a general way to find some representation of discrete data. While it has become common practice to display that representation in the form of line or bar graphs, plotting is not part of determining the "histogram". A bar graph as I see it is just a way to visualize some discrete, i.e. categorical, data by means of bars of different lengths. – applesoup Nov 18 '21 at 21:27
  • @applesoup I think I agree: one is a one-variable graph, the other a two (or more) variable graph. – Alexis Nov 18 '21 at 21:56
  • Your characterization of histograms is incorrect: they represent probability density, not probability per se, which means they depict probability by means of areas rather than heights of bars. In this fashion histograms differ from the bar charts used to show probability mass for discrete distributions. In these respects I think most of the points made in this answer confuse the issues rather than clarify them. – whuber Apr 23 '22 at 22:27
  • @whuber I was being sloppy. How do you find my edit? – Alexis Apr 24 '22 at 15:29
  • Someone who already has a clear idea of histograms and bar charts might follow this, but phrases like "frequency or probability by area over values of x" are difficult to parse and might have inconsistent or unintended interpretations. I continue to disagree with the second paragraph: a bar chart is a standard way to depict a probability mass function ("distribution of a single variable"). – whuber Apr 24 '22 at 16:54