2

I am new to pandas and would like to filter a DataFrame so that it contains only the top 5 values of a column. What is the best way to get those 5 values, starting from the code below?

My Code:

cheese_top5 = cheese[cheese.year >= 2016]
AMC
Youkesen
  • What type of variable are the values? Are they years, integers etc....? – thefragileomen Nov 23 '17 at 20:15
  • It is a dataset from which I have to select the top favorite names. I have already tried many ways but have not found the solution. Now I am trying this one: bnames_top5 = bnames.sort_values('year') bnames_top5[bnames_top5 >= 2011] I just want to filter the top 5. – Youkesen Nov 23 '17 at 20:25
  • How can I select just 10 rows and 3 columns? The whole CSV file has 1891894 rows × 4 columns. – Youkesen Nov 23 '17 at 20:27
  • 1
    please provide a small (3-7 rows) reproducible sample data set and your desired data set. Please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit your post correspondingly. – MaxU - stop genocide of UA Nov 23 '17 at 20:28
  • 2. Exploring Trends in Names. One of the first things we want to do is to understand naming trends. Let us start by figuring out the top five most popular male and female names for this decade (born 2011 and later). Do you want to make any guesses? Go on, be a sport!! In [120]: # bnames_top5: A dataframe with top 5 popular male and female names for the decade bnames_top5 = bnames.sort_values('year') bnames_top5[bnames_top5 >= 2011] – Youkesen Nov 23 '17 at 20:30
  • 3
    Where is your data? What is your expected output? I want to see 5-10 rows of your data along with what your desired output is. Look at how to give a [mcve] and learn [ask]. Thanks. – cs95 Nov 23 '17 at 22:29

5 Answers

12

I think what you are looking for is:

cheese.sort_values(by=['Name of column'], ascending=False).head(5)

To say anything more, we need to see a sample of your data.
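
Since no sample data was posted, here is a minimal sketch of the sort-then-head approach on made-up data. The `name` and `sales` columns and all values are assumptions for illustration only; `year` and the `>= 2016` filter come from the question. Note that `sort_values` sorts in ascending order by default, so `ascending=False` is needed for the largest values to come first.

```python
import pandas as pd

# Hypothetical sample data: the "name" and "sales" columns are assumptions,
# not the asker's real columns.
cheese = pd.DataFrame({
    "name": ["brie", "gouda", "feta", "cheddar", "edam", "swiss", "colby"],
    "year": [2015, 2016, 2016, 2017, 2017, 2018, 2018],
    "sales": [50, 30, 80, 20, 90, 60, 10],
})

# Keep only rows from 2016 onwards, as in the question.
recent = cheese[cheese.year >= 2016]

# Sort descending so the largest values come first, then take the first 5 rows.
top5_sorted = recent.sort_values(by="sales", ascending=False).head(5)
print(top5_sorted)
```

The result keeps every column of the original rows, which is useful when you need the names that go with the top values.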

Mark
5

You can use the pandas method nlargest:

df['column'].nlargest(n=5)

Reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.nlargest.html
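
As a small illustration on made-up data (the `name` column and all values are assumptions), `Series.nlargest` returns only the 5 largest values of one column, while `DataFrame.nlargest`, the method the reference above documents, returns the full rows ordered by that column.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "name": list("abcdefg"),
    "column": [3, 9, 1, 7, 5, 8, 2],
})

# Series.nlargest: the 5 largest values of a single column (original index kept).
top_values = df["column"].nlargest(n=5)

# DataFrame.nlargest: the 5 rows with the largest values in "column".
top_rows = df.nlargest(5, "column")

print(top_values)
print(top_rows)
```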

lux7
1
import pandas as pd
df = pd.read_csv('911.csv')
df['zip'].value_counts().head(5)
StupidWolf
0
dataframe_name['field_name'].value_counts().head(5)
Tomerikoo
  • 1
    Generally, answers are much more helpful if they include an explanation of what the code is intended to do, and why that solves the problem without introducing others. – DCCoder Sep 19 '20 at 03:27
0

To get the 5 most frequently occurring values, use df['column'].value_counts().head(5). The solution provided by @lux7, df['column'].nlargest(n=5), instead returns the 5 largest values in the column (the values themselves, not how many times they appear).
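
The difference between the two is easiest to see side by side. This is a sketch on an invented Series; the values are assumptions chosen so that the most frequent values and the largest values differ.

```python
import pandas as pd

# Hypothetical data: small values occur often, large values occur rarely.
s = pd.Series([1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 40, 40, 50], name="column")

# value_counts() counts occurrences and sorts by count, most frequent first.
# The index holds the values, the data holds how often each one appears.
most_frequent = s.value_counts().head(5)

# nlargest() ignores frequency and returns the 5 largest values themselves.
largest = s.nlargest(n=5)

print(most_frequent)
print(largest)
```

Here `most_frequent` is led by 1 (five occurrences), while `largest` is led by 50, which occurs only once.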