I am analysing CPU usage on a large network. To do that, I was given a large Excel sheet. It contains batchID (we dedicate a CPU to run that task), startTime and endTime (we know the CPU is fully occupied during this time).
Based on this data, I need to understand how many batches are running at any particular instant. I will therefore use a chart with the x-axis being the time and the y-axis being the count of batches running at each time instant.
The whole file is over 15000 rows covering two days of data. Here is a fraction of it.
BATCHID startTime endTime
560062 13/10/2011 11:59:23 13/10/2011 11:59:26
560061 13/10/2011 08:59:18 13/10/2011 08:59:21
560060 13/10/2011 05:59:21 13/10/2011 05:59:30
560059 13/10/2011 02:59:34 13/10/2011 02:59:43
560058 13/10/2011 01:57:24 13/10/2011 01:57:29
560057 13/10/2011 01:57:24 13/10/2011 01:57:28
560056 12/10/2011 23:59:19 12/10/2011 23:59:28
560055 12/10/2011 20:59:21 12/10/2011 20:59:30
560054 12/10/2011 18:02:13 12/10/2011 18:02:22
560053 12/10/2011 18:02:13 12/10/2011 18:02:21
560052 12/10/2011 18:02:12 12/10/2011 18:02:21
560051 12/10/2011 18:02:07 12/10/2011 18:02:16
560050 12/10/2011 18:02:03 12/10/2011 18:02:11
560049 12/10/2011 18:02:10 12/10/2011 18:02:19
560048 12/10/2011 18:02:11 12/10/2011 18:02:16
560047 12/10/2011 18:02:09 12/10/2011 18:02:13
560046 12/10/2011 18:02:04 12/10/2011 18:02:13
560045 12/10/2011 18:02:12 12/10/2011 18:02:21
Requirements:
- We need an array to contain the time slice data. The slices could be every 1 minute or every 5 minutes. If we need to analyse two days at 1-minute intervals, we would need 2880 data points for the x-axis.
- Because many jobs could be running at any instant, we'll need to devise a mechanism to count the number of running batches in each time slice.
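To make the two requirements concrete, here is a minimal sketch of the counting I have in mind. It is written in Python purely for illustration, since the same logic should carry over to any of the tools listed below. The names parse_rows, count_running and SAMPLE are my own, and I am assuming that a batch counts as running in a slice if its [startTime, endTime] interval overlaps that slice at any point.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M:%S"  # day/month/year, as in the sheet

# A few rows copied from the extract above (header line included on purpose).
SAMPLE = [
    "BATCHID startTime endTime",
    "560054 12/10/2011 18:02:13 12/10/2011 18:02:22",
    "560053 12/10/2011 18:02:13 12/10/2011 18:02:21",
    "560052 12/10/2011 18:02:12 12/10/2011 18:02:21",
    "560051 12/10/2011 18:02:07 12/10/2011 18:02:16",
    "560050 12/10/2011 18:02:03 12/10/2011 18:02:11",
    "560049 12/10/2011 18:02:10 12/10/2011 18:02:19",
    "560048 12/10/2011 18:02:11 12/10/2011 18:02:16",
    "560047 12/10/2011 18:02:09 12/10/2011 18:02:13",
    "560046 12/10/2011 18:02:04 12/10/2011 18:02:13",
    "560045 12/10/2011 18:02:12 12/10/2011 18:02:21",
]


def parse_rows(lines):
    """Turn whitespace-separated text rows into (batch_id, start, end) tuples."""
    rows = []
    for line in lines:
        parts = line.split()
        # Skip the header and anything that is not "id date time date time".
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        start = datetime.strptime(parts[1] + " " + parts[2], FMT)
        end = datetime.strptime(parts[3] + " " + parts[4], FMT)
        rows.append((int(parts[0]), start, end))
    return rows


def count_running(rows, first, last, step):
    """Return [(slice_start, count)] for slices starting at first, first+step, ... up to last.

    Assumption: a batch is counted in a slice if it overlaps [t, t + step).
    """
    result = []
    t = first
    while t <= last:
        running = sum(1 for _, start, end in rows if start < t + step and end >= t)
        result.append((t, running))
        t += step
    return result


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    t = datetime(2011, 10, 12, 18, 2, 12)
    # One 1-second slice at 18:02:12.
    print(count_running(rows, t, t, timedelta(seconds=1)))
```

Most batches in the extract last only a few seconds, so with 1-minute slices the count is really "batches active at some point during that minute" rather than "batches active at one instant". This brute-force version compares every row against every slice, which may be slow for 15000 rows and 2880 slices, so a faster approach is what I am hoping for.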
I suspect Excel 2003 can't do a good job, as the number of columns is limited to 256.
I welcome any advice on how to perform this task efficiently in Octave/MATLAB, Oracle PL/SQL, R or a Bash script.
