8

I am trying to create a histogram data using following query:

SELECT FLOOR(Max_Irrad/10) AS bucket, COUNT(*) AS COUNT
FROM marctest.test_summarynimish
where Lcu_name='Allegro'
  and Lcu_Mode='Standard'
GROUP  BY bucket;

following is the result that i am getting:

bucket count
 0        3
 4        3
 5       12
 7        6
 8        3
10        3

now the bucket field is the range or bin used in the histogram. I want to create a bucket values with consistent range, for eg starting from 0,4,8,12.... and so on.. Is there any way to achieve this in mysql? This is how I am expecting to have as result:

 bucket count
 0        3
 4       21
 8        6
Mosty Mostacho
  • 41,294
  • 14
  • 93
  • 121
NIMISH DESHPANDE
  • 453
  • 3
  • 10
  • 30

2 Answers2

7

I think we can use the following general form to create a general histogram:

select (x div 4) * 4 as NewX, count(*) as NewY from histogram
group by NewX

Where x is the real x value of the x axis and count(*) is the real y value. The number 4 is the size amount of the x values we want to group. This means we will group all x values in groups of 4 (e.g.: group 1 is 0, 1, 2, 3; group 2 is 4, 5, 6, 7, and so on). The count of each item in the group will become the NewY value

You can play with this here

Applying this logic to your query this would be:

select (floor(Max_Irrad/10) div 4) * 4 as NewX, count(*) as NewY
from marctest.test_summarynimish
where Lcu_name='Allegro' and Lcu_Mode='Standard'
group by NewX

Let me know if you have any trouble or doubt about this.

Mosty Mostacho
  • 41,294
  • 14
  • 93
  • 121
  • 1
    The solution you provide is working thanks a lot.... :) that would be really very helpful – NIMISH DESHPANDE Feb 11 '12 at 14:47
  • 1
    Above method is correct except it will not report buckets having zero samples. – LionHeart Dec 06 '12 at 05:58
  • Also, if you're bin size is a non-integer it won't work. You'll need instead floor(value/binSize)*binSize for non-integer bin sizes. – juacala Feb 12 '14 at 15:57
  • @LionHeart That makes sense. If there are no rows that contain data then it is not possible to display those missing rows, right? :) If you want to know how to generate data in MySQL I've written a long explanation with a couple of alternatives in [this other question](http://stackoverflow.com/questions/10034668/how-to-generate-data-in-mysql). Anyway, the missing data can be easily and more efficiently zero-filled when formatting the results in the histogram. – Mosty Mostacho Feb 12 '14 at 18:43
0

Just make your buckets bigger by dividing Max_Irrad by 40 instead of 10.

David Grayson
  • 79,096
  • 23
  • 144
  • 182