I have a dataset with 20 rows and 2500 columns. Each column is a unique product and each row is one measurement in a time series, so each of the 2500 products is measured 20 times.

My data is stored in a DataFrame. For every column (product), I want to record the row number (index) at which a specific condition (such as x > 3) is met for the first time, so that I end up with an array.

I tried using loops and iterrows() but could not get it to work.

P.S.: I have used idxmax() to get the row index of the maximum value, but this time I want the index of the first cell where a condition is met, stopping at that point.

Cœur
meliksahturker

1 Answer

Simply use .gt followed by .idxmax. df.gt(3) produces a boolean DataFrame, and idxmax then returns, for each column, the index of the first True, i.e. the first row where your condition is met.

import pandas as pd
import numpy as np

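# Reproducible sample data: 20 rows x 2500 columns of random integers
# from 1 to 4 (the upper bound of randint is exclusive)
np.random.seed(12)
df = pd.DataFrame(np.random.randint(1,5,(20,2500)))

# Per column: index of the first row where the value is greater than 3
df.gt(3).idxmax()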
#0        0
#1        0
#2        4
#3        4
#4        1
#...
#2496     8
#2497     0
#2498     5
#2499     1
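
One thing to watch for: if a column never meets the condition, idxmax on the all-False boolean column still returns the first index label, so that column looks the same as one where the condition is met in the very first row. A minimal sketch of one way to tell the two apart is below. The small frame and the names mask, first_hit and first_hit_or_nan are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame (hypothetical data, not the 20 x 2500 frame above)
df = pd.DataFrame({
    "a": [1, 5, 2, 7],   # condition first met at row 1
    "b": [1, 2, 3, 3],   # condition never met
    "c": [4, 0, 0, 0],   # condition met at row 0
})

mask = df.gt(3)              # boolean frame: True where x > 3
first_hit = mask.idxmax()    # first True per column; 0 for all-False columns too

# Keep the result only for columns that contain at least one True;
# the other columns become NaN (the Series is upcast to float).
first_hit_or_nan = first_hit.where(mask.any())

print(first_hit.tolist())         # [1, 0, 0]
print(first_hit_or_nan.tolist())  # [1.0, nan, 0.0]

# The question asks for an array in the end:
result = first_hit_or_nan.to_numpy()
```

With the default RangeIndex the index label and the row position coincide. If the frame has a different index, idxmax returns labels rather than positions.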
ALollz