training_data = [
    ['Green', 3, 'Apple'],
    ['Yellow', 3, 'Apple'],
    ['Red', 1, 'Grape'],
    ['Red', 1, 'Grape'],
    ['Yellow', 3, 'Lemon']
]

def unique_values(df, col):
    return set([row[col] for row in df])

unique_values(training_data, 1)

Output: {1, 3}

I want to be able to do this, but with a pandas DataFrame instead of a list.
3 Answers:
Like this?
>>> import pandas as pd
>>> training_data = [
...     ['Green', 3, 'Apple'],
...     ['Yellow', 3, 'Apple'],
...     ['Red', 1, 'Grape'],
...     ['Red', 1, 'Grape'],
...     ['Yellow', 3, 'Lemon']
... ]
>>> df = pd.DataFrame(training_data, columns=['color', 'number', 'fruit'])
>>> df.head()
    color  number  fruit
0   Green       3  Apple
1  Yellow       3  Apple
2     Red       1  Grape
3     Red       1  Grape
4  Yellow       3  Lemon
>>> df.number.unique()
array([3, 1])
You can use Series.unique to find the unique values in a column.
Create a DataFrame from your list like this:

In [1974]: import pandas as pd

In [1975]: df = pd.DataFrame(training_data, columns=['color', 'number', 'fruit'])

In [1986]: df
Out[1986]:
    color  number  fruit
0   Green       3  Apple
1  Yellow       3  Apple
2     Red       1  Grape
3     Red       1  Grape
4  Yellow       3  Lemon

Then define your function like this:

In [1983]: def unique_values(df, col):
      ...:     return df[col].unique().tolist()
      ...:

Run your function like this:

In [1988]: unique_values(df, 'color')
Out[1988]: ['Green', 'Yellow', 'Red']

In [1989]: unique_values(df, 'fruit')
Out[1989]: ['Apple', 'Grape', 'Lemon']

In [1990]: unique_values(df, 'number')
Out[1990]: [3, 1]
def unique_values(df, column):
    return set(df[column])

This also worked after turning the list into a DataFrame. Thank you both!
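For anyone comparing the two approaches in this thread, here is a minimal sketch of how they differ. The names unique_values_set and unique_values_list are made up for illustration only. The set version gives an unordered set like the original list-based function, while the Series.unique version keeps the values in order of first appearance.

```python
import pandas as pd

training_data = [
    ['Green', 3, 'Apple'],
    ['Yellow', 3, 'Apple'],
    ['Red', 1, 'Grape'],
    ['Red', 1, 'Grape'],
    ['Yellow', 3, 'Lemon'],
]
df = pd.DataFrame(training_data, columns=['color', 'number', 'fruit'])


# Hypothetical name: the set-based version from the comment above.
def unique_values_set(df, column):
    return set(df[column])


# Hypothetical name: the Series.unique version from the answer above.
def unique_values_list(df, column):
    return df[column].unique().tolist()


as_set = unique_values_set(df, 'color')
as_list = unique_values_list(df, 'color')

print(as_list)                 # ['Green', 'Yellow', 'Red']
print(as_set == set(as_list))  # True: same members, only the container differs
```

If the order of the values matters to you, the list version is probably the safer choice; if you only need membership tests, the set is fine.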
You can apply Series.unique to every column at once:
df.agg(pd.Series.unique)
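If you would rather not depend on how agg combines per-column results of different lengths, a dict comprehension is one explicit alternative. This is only a sketch; the variable names unique_per_column and counts are my own, not from the answer above.

```python
import pandas as pd

training_data = [
    ['Green', 3, 'Apple'],
    ['Yellow', 3, 'Apple'],
    ['Red', 1, 'Grape'],
    ['Red', 1, 'Grape'],
    ['Yellow', 3, 'Lemon'],
]
df = pd.DataFrame(training_data, columns=['color', 'number', 'fruit'])

# Map each column name to its unique values, in order of first appearance.
unique_per_column = {col: df[col].unique().tolist() for col in df.columns}
print(unique_per_column)

# DataFrame.nunique gives only the number of distinct values per column.
counts = df.nunique()
print(counts)
```

The dict keeps one plain Python list per column, so columns with different numbers of unique values are not a problem.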