10

I've searched the pandas documentation and cookbook recipes and it's clear you can round to the nearest decimal place easily using dataframe.columnName.round(decimalplace).

How do you do this with larger numbers?

Example, I have a column of housing prices and I want them rounded to the nearest 10000 or 1000 or whatever.

df.SalesPrice.WhatDoIDo(1000)? 
AK47
  • 8,646
  • 6
  • 37
  • 61
Angelo
  • 865
  • 1
  • 8
  • 14
  • 2
    Possible duplicate of [How do I round to the nearest ten?](https://stackoverflow.com/questions/39824914/how-do-i-round-to-the-nearest-ten) – Andy Ray Dec 23 '17 at 01:37
  • 1
    Divide by 1000, round, multiply by 1000 – Barmar Dec 23 '17 at 01:39
  • Lots of languages have lots of iterative ways to solve this. While there are no shortage of "how to round" questions on SO, I was looking for a specific pandas way to leverage the efficiency of this framework. – Angelo Dec 23 '17 at 13:47

5 Answers5

13

By using the notation df.ColumnName.round(), you are actually calling pandas.Series.round, the documentation of which specifies:

decimals : int

Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.

So you can do:

df = pd.DataFrame({'val':[1,11,130,670]})
df.val.round(decimals=-2)

This produces the output:

0      0
1      0
2    100
3    700
Name: val, dtype: int64

decimals=-3 rounds to the 1000s, and so on. Notably, it also works using pandas.DataFrame.round(), though the documentation doesn't tell you:

df = pd.DataFrame({'val':[1,11,130,670], 'x':[1,11,150,900]})
df.round({'val':-2})

This will round the column val to the nearest 100, but leave x alone.

Community
  • 1
  • 1
PaSTE
  • 3,495
  • 19
  • 23
2

Function round does accept negative values for cases in which you want to specify precision to the left of the decimal point:

dataframe.columnName.round(-3)

Example:

>>> pd.Series([1, 500, 500.1, 999, 1500, 1501, 946546]).round(-3)
0         0.0
1         0.0
2      1000.0
3      1000.0
4      2000.0
5      2000.0
6    947000.0
dtype: float64
grovina
  • 2,869
  • 17
  • 25
1

You can try this

df = pd.DataFrame({'val':[1,11,130,670]})
10**df.val.astype(str).str.len()
Out[27]: 
0      10
1     100
2    1000
3    1000
Name: val, dtype: int64
BENY
  • 296,997
  • 19
  • 147
  • 204
1

Another interesting "hack" is this: Let's say you want to round off to the nearest 100s. You can add 50, then divide by 100, convert to integer, multiply back by 100.

df = pd.DataFrame({'val':[1005,1299,1301,4109]})
df.val.round(-2) # Proper way
((df.val+50)/100).astype(int)*100 # Hack

Gives you this, as desired:

[1000, 1300, 1300, 4100]
FatihAkici
  • 4,161
  • 1
  • 25
  • 47
0

My favorite, dynamic way to do this:

ds: pd.Series to "round"
x: int/float of the power to round

# Define rounding lambda function:
my_rounder = lambda ds, x: ((ds + 0.5*10**x) // 10**x) * 10**x

# Apply lambda function to "prices" values:
housing_df["rounded_prices"] = my_rounder(housing_df["prices"], 3)

# If you need to force/ensure no decimal:
housing_df["rounded_prices"] = housing_df["rounded_prices"].apply(int)

Alternative floor rounder:

my_floor_rounder = lambda ds, x: (ds // 10**x) * 10**x

Breakdown:

print(housing_df["prices"].head())

year
2010    372560.0
2011    374507.0
2012    376454.0
2013    378401.0
2014    380348.0
Name: prices, dtype: float64
    
# This step can be omitted if you're finding the floor:
step_up = housing_df["prices"] + 0.5*10**3
print(step_up.head())

year
2010    373060.0
2011    375007.0
2012    376954.0
2013    378901.0
2014    380848.0
Name: prices, dtype: float64

thsnd = step_up // 10**3
print(thsnd.head())

year
2010    373.0
2011    375.0
2012    376.0
2013    378.0
2014    380.0
Name: prices, dtype: float64

rounded = thsnd * 10**3
print(rounded.head())

year
2010    373000.0
2011    375000.0
2012    376000.0
2013    378000.0
2014    380000.0
Name: prices, dtype: float64

int_rounded = rounded.apply(int)
print(int_rounded.head())

year
2010    373000
2011    375000
2012    376000
2013    378000
2014    380000
Name: prices, dtype: int64
JGarcia
  • 37
  • 5