0

In Python, I have a DataFrame that looks like the following, all the way down to about 5000 samples:

[image: screenshot of the DataFrame]

I was wondering, in pandas, how do I remove 3 out of every 4 data points in my DataFrame?

Gary

2 Answers

4

To obtain a random sample of a quarter of your DataFrame, you could use

test4.sample(frac=0.25)

or, to specify the exact number of rows

test4.sample(n=1250)
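
As a minimal sketch of the sampling approach above: the DataFrame built here is a hypothetical stand-in for test4 (the real data is not shown in the question), and the `random_state` value is an arbitrary choice to make the draw repeatable. Note that `sample` returns rows in shuffled order, so `sort_index()` is used to put them back in their original order, which probably matters if the rows are a time series.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test4: 5000 rows with a single 'Accel' column.
test4 = pd.DataFrame({"Accel": np.arange(5000, dtype=float)})

# Keep a random quarter of the rows; random_state makes the draw repeatable.
quarter = test4.sample(frac=0.25, random_state=0)

# sample() shuffles the rows, so restore the original row order.
quarter = quarter.sort_index()
```

This keeps one row in four on average, but the kept rows are not evenly spaced.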

If your purpose is to build training, validation, and testing data sets, then see this question.

unutbu
  • It's not related to machine learning; it's related to this question: https://stackoverflow.com/questions/45337886/valueerror-cannot-copy-sequence-to-array-axis-in-python-for-matplotlib-animatio/45342706#45342706 – Gary Jul 27 '17 at 20:49
1

If you want to select every 4th point, then you can do the following. This will select rows 0, 4, 8, ...:

test4.iloc[::4, :]['Accel']
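
A small sketch of the slicing approach above, assuming a toy 20-row frame in place of the real test4. The 'Time' column is hypothetical and only there to show that `iloc[::4]` keeps every column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test4 (the real frame has about 5000 rows).
test4 = pd.DataFrame({
    "Accel": np.arange(20, dtype=float),
    "Time": np.arange(20) * 0.01,  # assumed extra column, for illustration only
})

# Keep every 4th row (positions 0, 4, 8, ...), dropping the 3 rows in between.
every_fourth = test4.iloc[::4]

# The same selection restricted to the 'Accel' column.
accel_only = test4["Accel"].iloc[::4]
```

Unlike random sampling, this keeps evenly spaced rows in their original order. If a fresh 0, 1, 2, ... index is wanted afterwards, `every_fourth.reset_index(drop=True)` would provide it.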
nanojohn