0

I have a dataset with some customer information, with one column containing device codes (identifying the device used). I need to translate this codes into actual model names. I also have a second table with a column holding device codes (same as the first table) and another column holding the corresponding model names. I know it may seem trivial, I have managed to translate codes into models by using a for loop, .loc method and conditional substitution, but I'm looking for a more structured solution.

Here's an extract of the data.

df = pd.DataFrame(
    {
        'Device_code': ['SM-A520F','SM-A520F','iPhone9,3','LG-H860', 'WAS-LX1A', 'WAS-LX1A']
    }
)
transcription_table=pd.DataFrame(
    {
        'Device_code': ['SM-A520F','SM-A520X','iPhone9,3','LG-H860', 'WAS-LX1A', 'XT1662','iPhone11,2'],
        'models': ['Galaxy A5(2017)','Galaxy A5(2017)','iPhone 7','LG G5', 'P10 lite', 'Motorola Moto M','iPhone XS']
    }
)

Basically I need to obtain the explicit model of the device every time there's a match between the device_code column of the two tables, and overwrite the device_code of the first table (df) with the actual model name (or, it can be written on the same row into a newly created column, this is less of a problem).

Thank you for your help.

ALEX
  • 37
  • 4

2 Answers2

2

Turn your transcription_table into an actual mapping (aka a dictionary) and then use Series.map:

transcription_dict = dict(transcription_table.values)
df['models'] = df['Device_code'].map(transcription_dict)
print(df)

output:

  Device_code           models
0    SM-A520F  Galaxy A5(2017)
1    SM-A520F  Galaxy A5(2017)
2   iPhone9,3         iPhone 7
3     LG-H860            LG G5
4    WAS-LX1A         P10 lite
5    WAS-LX1A         P10 lite
Shubham Sharma
  • 52,812
  • 6
  • 20
  • 45
jfaccioni
  • 6,606
  • 1
  • 8
  • 23
0

This is just one solution:

# Dictionary that maps device codes to models
mapping = transcription_table.set_index('Device_code').to_dict()['models']

# Apply mapping to a new column in the dataframe
# If no match is found, None will be filled in
df['Model'] = df['Device_code'].apply(lambda x: mapping.get(x))
sdcbr
  • 6,671
  • 3
  • 26
  • 41