0

I have a dataframe df as shown below

   ID               Aisle            Residence        HomePhone        CellPhone
   ------------------------------------------------------------------------------
0  1245,3214        A1, A2, A3, A4   Home             NaN              888888888
1  5674             B2,B3            Cell             777777777        999999999

Expected outcome:

{
 "0":{
    {
      "column": "ID",
      "values": [
        "1245",
        "3214"
      ]
    },
    {
      "column": "Aisle",
      "values": [
        "A1",
        "A2",
        "A3",
        "A4"
      ]
    },
    {
      "column": "Residence",
      "values": [
        "Home"
      ]
    },
    {
      "column": "HomePhone",
      "values": []
    },
    {
      "column": "CellPhone",
      "values": [
        "888888888"
      ]
    }
   },
"1":{
    {
      "column": "ID",
      "values": [
        "5674"
      ]
    },
    {
      "column": "Aisle",
      "values": [
        "B2",
        "B3"
      ]
    },
    {
      "column": "Residence",
      "values": [
        "Cell"
      ]
    },
    {
      "column": "HomePhone",
      "values": [
        "777777777"
      ]
    },
    {
      "column": "CellPhone",
      "values": [
        "999999999"
      ]
    }
   },

I have 2 rows which are 0 and 1, and the json has the information under them for each row. So basically, I want to add attributes and assign them to values that are column names, such as "column":"Aisle","values":["A1","B1,...]"

Also, I have a restriction that the column names always change (ID, Aisle, Residence...etc) and the number of columns vary, so I cannot hard code columns when converting from DF to JSON.

pppery
  • 3,550
  • 19
  • 28
  • 41
Ash A
  • 127
  • 1
  • 1
  • 10

1 Answers1

1

I think this is the closest thing to what you are looking for.Use DataFrame.to_json:

df2=df.copy()
df2[['ID','Aisle']]=df2[['ID','Aisle']].apply(lambda x: x.replace(' ','').str.split(','))
print(df2)

             ID                Aisle Residence    HomePhone  CellPhone
0  [1245, 3214]  [A1,  A2,  A3,  A4]      Home          NaN  888888888
1        [5674]             [B2, B3]      Cell  777777777.0  999999999

df2.T.to_json()

Output:

'{"0":{"ID":["1245","3214"],"Aisle":["A1"," A2"," A3"," A4"],"Residence":"Home","HomePhone":null,"CellPhone":888888888},"1":{"ID":["5674"],"Aisle":["B2","B3"],"Residence":"Cell","HomePhone":777777777.0,"CellPhone":999999999}}'

You can also try this:

def split_func(x):
    try:
        return x.replace(' ','').str.split(',') 
    except:
        return x
df2=df2.apply(split_func)

but keep in mind that all str type cells would be converted to list

df2.T.to_json()

'{"0":{"ID":["1245","3214"],"Aisle":["A1","A2","A3","A4"],"Residence":["Home"],"HomePhone":null,"CellPhone":888888888},"1":{"ID":["5674"],"Aisle":["B2","B3"],"Residence":["Cell"],"HomePhone":777777777.0,"CellPhone":999999999}}'
ansev
  • 28,746
  • 5
  • 11
  • 29