I am working with time-series data; a sample of the dataframe is shown below.

Input data:

| date             | Device_Id | value |
| ---------------- | --------- | ----- |
| 28-12-2018 00:00 | d1        | 0.014 |
| 28-12-2018 00:15 | d1        | 0.012 |
| 28-12-2018 00:30 | d1        | 0.012 |
| 28-12-2018 00:45 | d1        | 0.014 |
| 28-12-2018 01:00 | d1        | 0.012 |
| 28-12-2018 01:15 | d1        | 0.012 |
| 28-12-2018 01:30 | d1        | 0.012 |
| 28-12-2018 01:45 | d1        | 0.012 |
| 28-12-2018 02:00 | d1        | 0.014 |
| 28-12-2018 02:15 | d1        | 0.012 |
| ...              | d1        | ...   |
| ...              | ...       | ...   |
| 31-03-2019 23:45 | d2        | ...   |

Expected output:

| date             | Device_Id | value |
| ---------------- | --------- | ----- |
| 28-12-2018 00:00 | d1        | 0.014 |
| 28-12-2018 00:15 | d1        | 0.012 |
| 28-12-2018 00:30 | d1        | 0.012 |
| 28-12-2018 00:45 | d1        | 0.014 |
| 28-12-2018 01:00 | d1        | 0     |
| 28-12-2018 01:15 | d1        | 0     |
| 28-12-2018 01:30 | d1        | 0     |
| 28-12-2018 01:45 | d1        | 0     |
| 28-12-2018 02:00 | d1        | 0     |
| 28-12-2018 02:15 | d1        | 0     |
| ...              | d1        | ...   |
| ...              | ...       | ...   |
| 31-03-2019 23:45 | d2        | ...   |

I want to set the values in the original dataframe to zero for a given Device_Id and date, for times between 1 am and 6 am. I have tried several approaches but could not get the desired result. Below is the code I have tried.

data1['value']=data1.loc[(data1['Device_Id'].str.contains('d1') & data1['date'].str.contains('28-12-2018')), 'value'].between_time('01:00:00', '06:00:00') = 0

The above code raises the error "can't assign to function call". After that, I tried the following.

data1['value']=data1.loc[(data1['Device_Id'].str.contains('d1') & data1['date'].str.contains('28-12-2018')), 'value'].between_time('01:00:00', '06:00:00') * 0

This runs, but it does not update the original dataframe as intended.

1 Answer

Set `date` as a DatetimeIndex, then use DatetimeIndex.indexer_between_time to get the positions of the rows whose time of day falls within the range:

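import numpy as np
import pandas as pd

# if necessary, parse the day-first strings into datetimes
# data1['date'] = pd.to_datetime(data1['date'], format='%d-%m-%Y %H:%M')
data1 = data1.set_index('date')

mask = data1['Device_Id'].str.contains('d1') & (data1.index.normalize() == pd.Timestamp('2018-12-28'))
idx = data1[mask].index.indexer_between_time('01:00:00', '06:00:00')

# assign by position, so other devices sharing the same timestamps are left untouched
rows = np.flatnonzero(mask.to_numpy())[idx]
data1.iloc[rows, data1.columns.get_loc('value')] = 0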

print (data1)
                    Device_Id  value
date                                
2018-12-28 00:00:00        d1  0.014
2018-12-28 00:15:00        d1  0.012
2018-12-28 00:30:00        d1  0.012
2018-12-28 00:45:00        d1  0.014
2018-12-28 01:00:00        d1  0.000
2018-12-28 01:15:00        d1  0.000
2018-12-28 01:30:00        d1  0.000
2018-12-28 01:45:00        d1  0.000
2018-12-28 02:00:00        d1  0.000
2018-12-28 02:15:00        d1  0.000
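
One thing to watch with the index-based approach: when several devices share the same timestamps, the `date` index is not unique, and a label-based `.loc[timestamps, 'value']` assignment selects every row carrying those labels, whatever its device. The sketch below uses made-up data for two devices (the names `by_label` and `by_pos` are only for illustration) to compare label-based and positional assignment:

```python
import numpy as np
import pandas as pd

# Made-up sample: two devices sharing the same 15-minute timestamps.
stamps = pd.date_range("2018-12-28 00:00", periods=8, freq="15min")
df = pd.DataFrame({
    "date": list(stamps) * 2,
    "Device_Id": ["d1"] * 8 + ["d2"] * 8,
    "value": 0.012,
}).set_index("date")

mask = (df["Device_Id"] == "d1").to_numpy()
# Positions, within the d1 rows only, whose time of day is 01:00 to 06:00.
idx = df[mask].index.indexer_between_time("01:00:00", "06:00:00")

# Label-based: every row whose timestamp matches is selected, d2 included.
by_label = df.copy()
by_label.loc[by_label[mask].index[idx], "value"] = 0

# Positional: map idx back to row positions in the full frame.
by_pos = df.copy()
rows = np.flatnonzero(mask)[idx]
by_pos.iloc[rows, by_pos.columns.get_loc("value")] = 0

# Count how many d2 rows were zeroed by each variant.
is_d2 = (df["Device_Id"] == "d2").to_numpy()
d2_zeroed_by_label = int((by_label["value"].to_numpy()[is_d2] == 0).sum())
d2_zeroed_by_pos = int((by_pos["value"].to_numpy()[is_d2] == 0).sum())
print(d2_zeroed_by_label, d2_zeroed_by_pos)
```

Assigning by position (or with a boolean mask over the full frame) keeps the update limited to the rows that were actually selected.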

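A simpler option, on the original frame with `date` still a column, is Series.between with full timestamps. It returns a boolean mask, so the values can be set directly through it:

# if necessary, parse the day-first strings into datetimes
# data1['date'] = pd.to_datetime(data1['date'], format='%d-%m-%Y %H:%M')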

mask = (data1['Device_Id'].str.contains('d1') & 
        data1['date'].between('28-12-2018 01:00:00', '28-12-2018 06:00:00'))

data1.loc[mask, 'value'] = 0
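
A self-contained way to check the mask approach against several devices is sketched below. The helper name `zero_window` and the sample data are hypothetical, not part of pandas or of the original dataset. It uses an exact match (`.eq`) rather than `str.contains`, on the assumption that device IDs should match in full: `str.contains('d1')` is a substring test and would also select IDs such as `d10`, which may or may not be what happened in the data discussed in the comments.

```python
import pandas as pd

def zero_window(df, device_id, start, end):
    """Set `value` to 0 for one device between two timestamps (inclusive).

    Hypothetical helper for illustration; modifies `df` in place and
    returns the number of rows changed.
    """
    mask = df["Device_Id"].eq(device_id) & df["date"].between(start, end)
    df.loc[mask, "value"] = 0
    return int(mask.sum())

# Made-up sample: "d1" and "d10" share timestamps, and "d10" contains "d1".
stamps = pd.date_range("2018-12-28 00:00", periods=8, freq="15min")
data1 = pd.DataFrame({
    "date": list(stamps) * 2,
    "Device_Id": ["d1"] * 8 + ["d10"] * 8,
    "value": 0.012,
})

changed = zero_window(
    data1, "d1",
    pd.Timestamp("2018-12-28 01:00"), pd.Timestamp("2018-12-28 06:00"),
)

# For comparison: how many rows a substring match on 'd1' would select.
substring_hits = int(data1["Device_Id"].str.contains("d1").sum())
print(changed, substring_hits)
```

Passing `pd.Timestamp` objects for the bounds avoids any ambiguity about day-first date strings.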
jezrael
  • I have a problem with the above code. It sets the values of all devices within the specified time to zero. If I pass 'd1', only the values associated with 'd1' should be updated, but the values of all devices are updated. – Rajesh Ahir May 03 '21 at 05:00
  • @RajeshAhir - Can you change the data sample so the problem can be seen? – jezrael May 03 '21 at 05:01
  • @RajeshAhir - With the sample data it works perfectly, so it must be a data-related problem. – jezrael May 03 '21 at 05:02
  • No, I am working on the same dataset, which contains many devices such as d1, d2, d3, d4, ... and their associated values. – Rajesh Ahir May 03 '21 at 05:02
  • @RajeshAhir - OK, so can you change the data to show the problem? For me it works as needed: only `d1` between `'28-12-2018 01:00:00', '28-12-2018 06:00:00'` is changed. – jezrael May 03 '21 at 05:04
  • @RajeshAhir - Is `dates_list[i][j]` correct? Maybe it should be `dates_list[j]`, I guess. – jezrael May 03 '21 at 05:09