pandas read_json: "If using all scalar values, you must pass an index"

Question

I have some difficulty in importing a JSON file with pandas.

import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

This is the error that I get:

ValueError: If using all scalar values, you must pass an index

A simplified version of the file looks like this:

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

It is from the machine learning course of the University of Washington on Coursera. You can find the file here.

This is much more a pandas question than it is a JSON question -- you wouldn't have this specific error in any context that *didn't* involve pandas, but you **can** get this specific error without JSON being involved. — Charles Duffy, Jul 14 '16 at 17:44
See for instance, http://stackoverflow.com/questions/17839973/construct-pandas-dataframe-from-values-in-variables -- a question with the same error, but no JSON involved. — Charles Duffy, Jul 14 '16 at 17:46
It is expecting a list, so wrapping the object in one will work: `pd.DataFrame([{"biennials": 522004, "lb915": 116290}])`. — Aung, Jun 30 '17 at 08:06

score 72 · Accepted Answer · answered Jul 14 '16 at 18:05

Try

ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')

That file contains only key-value pairs whose values are scalars. You can convert the resulting Series to a dataframe with ser.to_frame('count').

You can also do something like this:

import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)

Now data is a dictionary. You can pass it to a dataframe constructor like this:

df = pd.DataFrame({'count': data})
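
As a rough sketch of the two routes above, the snippet below uses an in-memory string (`raw`, a hypothetical three-key excerpt of the file) wrapped in `io.StringIO` in place of the real file, so it runs without the course data. Both routes should give one row per word and a single `count` column.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for the contents of the JSON file in the question.
raw = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'

# Route 1: let pandas parse the object as a Series, then name the column.
ser = pd.read_json(io.StringIO(raw), typ="series")
df_from_series = ser.to_frame("count")

# Route 2: parse with the json module and nest the dict under a column name.
data = json.loads(raw)
df_from_dict = pd.DataFrame({"count": data})

print(df_from_series)
print(df_from_dict)
```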

score 19 · Answer 2 · answered Apr 17 '19 at 05:42

You can do as @ayhan mentions, which will give you a column-based format.

Or you can enclose the object in [ ] (source), as shown below, to get a row format. That is convenient if you are loading multiple values and planning to use a matrix for your machine learning models.

df = pd.DataFrame([data])
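
To make the difference between the two layouts concrete, here is a small sketch that builds both from the same dict. The two-key `data` dict is a hypothetical excerpt, not the full file.

```python
import pandas as pd

# Hypothetical two-key excerpt of the dictionary loaded from the file.
data = {"biennials": 522004, "lb915": 116290}

# Row format: the list wrapper makes the dict a single record,
# so each key becomes a column.
row_df = pd.DataFrame([data])

# Column format: nesting the dict under a name makes each key
# an index label in one column.
col_df = pd.DataFrame({"count": data})

print(row_df.shape)  # (1, 2)
print(col_df.shape)  # (2, 1)
```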

score 6 · Answer 3 · answered Apr 14 '17 at 04:56

I think what is happening is that the data in

map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')

is being read as a string instead of as JSON, so

{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}

is actually

'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'

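Since a string is a scalar, pandas cannot build a dataframe from it directly. You have to convert it to a dict first, which is exactly what the other answer does.

The best way is to call json.loads on the string to convert it to a dict, and then load that into pandas:

import json
with open('people_wiki_map_index_to_word.json') as f:
    myfile = f.read()
jsonData = json.loads(myfile)
# Wrap the dict in a list; a bare dict of scalars raises the same ValueError.
df = pd.DataFrame([jsonData])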
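
As a comment on the question notes, the error is raised by the DataFrame constructor itself and does not need JSON at all. A minimal sketch, assuming a hypothetical two-key dict in place of the real file, that reproduces the message and shows the `index=[0]` alternative the message hints at:

```python
import pandas as pd

# Hypothetical two-key excerpt; every value is a scalar.
data = {"biennials": 522004, "lb915": 116290}

message = None
try:
    # With only scalar values, pandas cannot infer how many rows to build.
    pd.DataFrame(data)
except ValueError as exc:
    message = str(exc)
    print(message)

# Passing an explicit index tells pandas to build exactly one row.
df = pd.DataFrame(data, index=[0])
print(df)
```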

Thanks for the explanation, I have been looking for the reason for some time already. — Marine Galantin, Jun 02 '20 at 23:39

score 2 · Answer 4 · answered Oct 02 '21 at 13:06

{
"biennials": 522004,
"lb915": 116290
}

df = pd.read_json('values.json')

pd.read_json expects a list for each key, like this:

{
"biennials": [522004],
"lb915": [116290]
}

When the values are scalars instead, it returns an error saying

If using all scalar values, you must pass an index.

So you can resolve this by specifying the 'typ' argument in pd.read_json. The documented values are 'frame' and 'series'; use 'series' here:

map_index_to_word = pd.read_json('Datasets/people_wiki_map_index_to_word.json', typ='series')

score 0 · Answer 5 · answered May 10 '21 at 16:28

For example, suppose cat values.json shows

{
"name": "Snow",
"age": "31"
}

df = pd.read_json('values.json')

Chances are you will end up with this error: If using all scalar values, you must pass an index

Pandas looks for a list or dictionary in each value, something like this values.json:

{
"name": ["Snow"],
"age": ["31"]
}

So try doing this instead, where report_file is the path to your JSON file. You can later convert the result to HTML with to_html():

df = pd.DataFrame([pd.read_json(report_file, typ='series')])
result = df.to_html()

score 0 · Answer 6 · answered Apr 01 '22 at 06:41

I solved this by wrapping the JSON object in an array, like so:

[{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}]
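
For what it is worth, here is a sketch of reading such an array-wrapped document with `pd.read_json`. It assumes a hypothetical two-key excerpt held in memory rather than the edited file on disk, and passes `orient="records"` explicitly to state that each array element is one row.

```python
import io

import pandas as pd

# Hypothetical excerpt of the file after wrapping the object in [ ].
wrapped = '[{"biennials": 522004, "lb915": 116290}]'

# Each element of the array becomes one row; the keys become columns.
df = pd.read_json(io.StringIO(wrapped), orient="records")

print(df.shape)  # (1, 2)
```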
