I have a text file of Clash Royale statistics from Kaggle. It is in the format of Python dictionaries. I am having trouble working out how to read it from the file in a meaningful way, and I am curious what the best way to do this is. It is a fairly complex dict containing lists.
Original dataset here: https://www.kaggle.com/s1m0n38/clash-royale-matches-dataset
    {'players': {'right': {'deck': [['Mega Minion', '9'], ['Electro Wizard', '3'], ['Arrows', '11'], ['Lightning', '5'], ['Tombstone', '9'], ['The Log', '2'], ['Giant', '9'], ['Bowler', '5']], 'trophy': '4258', 'clan': 'TwoFiveOne', 'name': 'gpa raid'}, 'left': {'deck': [['Fireball', '9'], ['Archers', '12'], ['Goblins', '12'], ['Minions', '11'], ['Bomber', '12'], ['The Log', '2'], ['Barbarians', '12'], ['Royal Giant', '13']], 'trophy': '4325', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12'}
    {'players': {'right': {'deck': [['Ice Spirit', '10'], ['Valkyrie', '9'], ['Hog Rider', '9'], ['Inferno Tower', '9'], ['Goblins', '12'], ['Musketeer', '9'], ['Zap', '12'], ['Fireball', '9']], 'trophy': '4237', 'clan': 'The Wolves', 'name': 'TITAN'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4296', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['1', '0'], 'time': '2017-07-12'}
    {'players': {'right': {'deck': [['Miner', '3'], ['Ice Golem', '9'], ['Spear Goblins', '12'], ['Minion Horde', '12'], ['Inferno Tower', '8'], ['The Log', '2'], ['Skeleton Army', '6'], ['Fireball', '10']], 'trophy': '4300', 'clan': '@LA PERLA NEGRA', 'name': 'Victor'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4267', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['0', '1'], 'time': '2017-07-12'}
3 Answers:
According to the synopsis of this dataset on Kaggle, each dictionary represents a match between two players. I thought it would make sense for each row of the dataframe to represent all the features of a single match.
This can be accomplished in a few short steps.
First, store all of the match dictionaries (each row of the Kaggle dataset) in a list:

    matches = [
        {'players': {'right': {'deck': [['Mega Minion', '9'], ['Electro Wizard', '3'], ['Arrows', '11'], ['Lightning', '5'], ['Tombstone', '9'], ['The Log', '2'], ['Giant', '9'], ['Bowler', '5']], 'trophy': '4258', 'clan': 'TwoFiveOne', 'name': 'gpa raid'}, 'left': {'deck': [['Fireball', '9'], ['Archers', '12'], ['Goblins', '12'], ['Minions', '11'], ['Bomber', '12'], ['The Log', '2'], ['Barbarians', '12'], ['Royal Giant', '13']], 'trophy': '4325', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12'},
        {'players': {'right': {'deck': [['Ice Spirit', '10'], ['Valkyrie', '9'], ['Hog Rider', '9'], ['Inferno Tower', '9'], ['Goblins', '12'], ['Musketeer', '9'], ['Zap', '12'], ['Fireball', '9']], 'trophy': '4237', 'clan': 'The Wolves', 'name': 'TITAN'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4296', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['1', '0'], 'time': '2017-07-12'},
        {'players': {'right': {'deck': [['Miner', '3'], ['Ice Golem', '9'], ['Spear Goblins', '12'], ['Minion Horde', '12'], ['Inferno Tower', '8'], ['The Log', '2'], ['Skeleton Army', '6'], ['Fireball', '10']], 'trophy': '4300', 'clan': '@LA PERLA NEGRA', 'name': 'Victor'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4267', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['0', '1'], 'time': '2017-07-12'}
    ]

Then create a dataframe from that list, which automatically populates the columns holding the type, time and result of the match:

    import pandas as pd

    df = pd.DataFrame(matches)

Finally, use some simple logic to populate the columns holding the deck, trophy, clan and name of the left and right players of the match:

    sides = ['right', 'left']
    player_keys = ['deck', 'trophy', 'clan', 'name']
    for side in sides:
        for key in player_keys:
            # pull each nested player field out into its own column
            df[side + '_' + key] = df['players'].apply(lambda x: x[side][key])
    df = df.drop('players', axis=1)  # no longer need this after populating the other columns
    df = df.iloc[:, ::-1]  # made sense to display columns in order of player info from left to right,
                           # followed by general match info at the far right of the dataframe

The resulting dataframe looks like this (the order of the last three columns can vary with the pandas version):

       left_name  left_clan  left_trophy  left_deck                                          right_name  right_clan       right_trophy  right_deck                                         type    time        result
    0  Supr4      battusai   4325         [[Fireball, 9], [Archers, 12], [Goblins, 12], ...  gpa raid    TwoFiveOne       4258          [[Mega Minion, 9], [Electro Wizard, 3], [Arrow...  ladder  2017-07-12  [2, 0]
    1  Supr4      battusai   4296         [[Royal Giant, 13], [Ice Wizard, 2], [Bomber, ...  TITAN       The Wolves       4237          [[Ice Spirit, 10], [Valkyrie, 9], [Hog Rider, ...  ladder  2017-07-12  [1, 0]
    2  Supr4      battusai   4267         [[Royal Giant, 13], [Ice Wizard, 2], [Bomber, ...  Victor      @LA PERLA NEGRA  4300          [[Miner, 3], [Ice Golem, 9], [Spear Goblins, 1...  ladder  2017-07-12  [0, 1]
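The column-populating step can also be sketched in plain Python, assuming every match has the same four player keys as the sample rows. `flatten_match` is a hypothetical helper name introduced here for illustration, not part of pandas; it builds one flat dict per match, and the decks in `sample` are trimmed for brevity.

```python
# Hypothetical helper (not a pandas API): flatten one match dict
# into a single-level dict with left_*/right_* player fields.
SIDES = ("left", "right")
PLAYER_KEYS = ("name", "clan", "trophy", "deck")


def flatten_match(match):
    """Return a flat dict with player fields first, then match info."""
    flat = {}
    for side in SIDES:
        for key in PLAYER_KEYS:
            flat[f"{side}_{key}"] = match["players"][side][key]
    # Copy the top-level match fields (type, result, time) unchanged.
    for key, value in match.items():
        if key != "players":
            flat[key] = value
    return flat


# One match in the same shape as the sample data (decks shortened).
sample = {
    "players": {
        "right": {"deck": [["Giant", "9"]], "trophy": "4258",
                  "clan": "TwoFiveOne", "name": "gpa raid"},
        "left": {"deck": [["Fireball", "9"]], "trophy": "4325",
                 "clan": "battusai", "name": "Supr4"},
    },
    "type": "ladder", "result": ["2", "0"], "time": "2017-07-12",
}

row = flatten_match(sample)
print(list(row))  # column names in the order the helper emits them
```

Passing `[flatten_match(m) for m in matches]` to `pd.DataFrame` should give the same set of columns, with the order fixed by the helper rather than by reversing the frame.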
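Read in the file, test.txt, which will be rows of dictionaries.
The rows are not JSON (they are single-quoted Python literals) and do not need to be converted to that format.
Each row is read from the file as a str.
Convert each row from str to dict with ast.literal_eval.
Convert the list of dictionaries to a dataframe with pandas.json_normalize.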
    import pandas as pd
    from ast import literal_eval

    # read in the file
    with open('test.txt', 'r', encoding='utf-8') as f:
        # use a list comprehension to convert each row from str to dict
        list_of_rows = [literal_eval(row) for row in f]

    # convert to a dataframe
    df = pd.json_normalize(list_of_rows)

    # display(df)

       type    result  time        players.right.deck                                                                                                                         players.right.trophy  players.right.clan  players.right.name  players.left.deck                                                                                                               players.left.trophy  players.left.clan  players.left.name
    0  ladder  [2, 0]  2017-07-12  [[Mega Minion, 9], [Electro Wizard, 3], [Arrows, 11], [Lightning, 5], [Tombstone, 9], [The Log, 2], [Giant, 9], [Bowler, 5]]               4258                  TwoFiveOne          gpa raid            [[Fireball, 9], [Archers, 12], [Goblins, 12], [Minions, 11], [Bomber, 12], [The Log, 2], [Barbarians, 12], [Royal Giant, 13]]   4325                 battusai           Supr4
    1  ladder  [1, 0]  2017-07-12  [[Ice Spirit, 10], [Valkyrie, 9], [Hog Rider, 9], [Inferno Tower, 9], [Goblins, 12], [Musketeer, 9], [Zap, 12], [Fireball, 9]]             4237                  The Wolves          TITAN               [[Royal Giant, 13], [Ice Wizard, 2], [Bomber, 12], [Knight, 12], [Fireball, 9], [Barbarians, 12], [The Log, 2], [Archers, 12]]  4296                 battusai           Supr4
    2  ladder  [0, 1]  2017-07-12  [[Miner, 3], [Ice Golem, 9], [Spear Goblins, 12], [Minion Horde, 12], [Inferno Tower, 8], [The Log, 2], [Skeleton Army, 6], [Fireball, 10]]  4300                  @LA PERLA NEGRA     Victor              [[Royal Giant, 13], [Ice Wizard, 2], [Bomber, 12], [Knight, 12], [Fireball, 9], [Barbarians, 12], [The Log, 2], [Archers, 12]]  4267                 battusai           Supr4
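As a small illustration of why `ast.literal_eval` is used here rather than a JSON parser, the sketch below parses two shortened rows from an in-memory string that stands in for `test.txt`. The shortened rows and the `parse_rows` name are assumptions made for this example, and skipping blank lines is an extra safeguard that is not part of the answer's code.

```python
import io
import json
from ast import literal_eval

# Stand-in for test.txt: one Python dict literal per line (shortened rows),
# with a blank line in the middle to show the safeguard.
fake_file = io.StringIO(
    "{'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12'}\n"
    "\n"
    "{'type': 'ladder', 'result': ['1', '0'], 'time': '2017-07-12'}\n"
)


def parse_rows(lines):
    """Convert each non-blank line from str to dict with literal_eval."""
    return [literal_eval(line) for line in lines if line.strip()]


rows = parse_rows(fake_file)

# The same kind of text is rejected by a JSON parser because JSON
# requires double-quoted strings.
try:
    json.loads("{'type': 'ladder'}")
    is_json = True
except json.JSONDecodeError:
    is_json = False

print(len(rows), rows[0]["result"], is_json)  # 2 ['2', '0'] False
```

The resulting `rows` list has the same shape as `list_of_rows` above, so it could be handed to `pd.json_normalize` in the same way.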