How can I plot a histogram from a CSV file that contains all the data in a single column? I need to plot the values against the number of times each one is repeated.

aseth

1 Answer

There is a widely used trick for building a histogram in gnuplot. If your data is in the file mydata.csv, you can try something like:

binwidth=1                          # here you can set the bin width 
bin(x,width)=width*floor(x/width)   # here the binning function
plot "mydata.csv" using (bin($1,binwidth)):(1.0) smooth freq with boxes

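This builds the histogram with a bin width of your choosing: every value is rounded down to the start of its bin, and smooth freq sums the 1.0 contributed by each value that lands in the same bin.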
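To see what the binning function does, it can be evaluated directly at the gnuplot prompt. This is only a small sketch; the sample values 3.7, 4.0 and -0.2 are made up for illustration and are not taken from the question's data.

```gnuplot
binwidth = 1
bin(x,width) = width*floor(x/width)

# each value is mapped to the left edge of the bin it falls in
print bin(3.7, binwidth)    # 3.7 lies in the bin that starts at 3
print bin(4.0, binwidth)    # 4.0 is the start of the next bin, at 4
print bin(-0.2, binwidth)   # floor rounds down, so this lies in the bin that starts at -1

# optional: draw each box exactly one bin wide
set boxwidth binwidth
```

Note that "with boxes" centers each box on the x value it receives, so with this function the box for a bin is drawn centered on the bin's left edge.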
For finer control over where the binning starts and ends, you can try the following, as suggested for example here:

Min = 1.0  # where binning starts
Max = 12.0 # where binning ends
n = 11 # the number of bins
width = (Max-Min)/n # the bin width; evaluates to 1.0 here
bin(x,width) = width*(floor((x-Min)/width)+0.5) + Min
plot "mydata.csv" using (bin($1,width)):(1.0) smooth freq with boxes
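A short worked check of this second binning function may help. It is a sketch with made-up sample values (3.7 and 1.0); the set boxwidth and set xrange lines are optional additions, not part of the original recipe.

```gnuplot
Min = 1.0
Max = 12.0
n = 11
width = (Max-Min)/n
bin(x,width) = width*(floor((x-Min)/width)+0.5) + Min

# 3.7 lies in [3,4): floor((3.7-1)/1) = 2, so the result is 1*(2+0.5)+1 = 3.5
print bin(3.7, width)
# 1.0 is the lower limit: floor(0) = 0, so the result is 1.5, the center of [1,2)
print bin(1.0, width)

set boxwidth width     # boxes as wide as one bin
set xrange [Min:Max]   # restrict the x axis to the interval being binned
```

The extra 0.5 moves each value to the center of its bin rather than the left edge, so the boxes, which gnuplot centers on their x value, cover exactly the interval they count.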
Hastur
  • Now I am able to plot the histogram, but I need to normalize the number of occurrences. I took help from another forum to define a variable sum, but I am not able to plot the normalized values. I am using these commands: set ylabel 'f/n_f'; set xlabel 'k_{ij}^{n}/k_{0}^{n}'; set xtics 0.2; set ytics 2; binwidth = 0.1; set boxwidth binwidth; sum = 0; s(x) = ((sum=sum+1), 0); bin(x, width) = width*floor(x/width) + binwidth/2.0; plot "mydata.csv" u (bin($1, binwidth)):(1.0/(binwidth*sum)) smooth freq w boxes – aseth Jun 23 '15 at 11:05
  • As you can see, posting many lines of code in a comment rarely comes out clean. BTW, multiple questions should be split into separate posts. This will help other people with similar doubts. :) – Hastur Jun 23 '15 at 11:11
  • From what I understand, you never call s(x) on your data file in an earlier plot, so the value of sum is never updated. Under Linux or OS X you can set it with a system call to wc -l mydata.csv or with a call to awk... Post another question and it will be simpler for me to give you an answer. – Hastur Jun 23 '15 at 11:24
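
One possible way to do the normalization discussed in the comments, sketched here as an assumption rather than as the approach the commenters settled on: count the data points with gnuplot's stats command (available from gnuplot 4.6) instead of the sum/s(x) counter, then divide each contribution by that count. The variable name N is introduced here for illustration.

```gnuplot
binwidth = 0.1
set boxwidth binwidth
bin(x,width) = width*floor(x/width) + width/2.0

# count the data points first; stats stores the count in STATS_records
stats "mydata.csv" using 1 nooutput
N = STATS_records

# each point contributes 1/(binwidth*N), so the box areas add up to 1
plot "mydata.csv" using (bin($1,binwidth)):(1.0/(binwidth*N)) smooth freq with boxes
```

Using 1.0/N instead of 1.0/(binwidth*N) would give relative frequencies that add up to 1 rather than a density. The wc -l call mentioned in the comment above is an alternative way to obtain the count on systems where it is available.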