Converting between pandas DataFrames and dicts
1. Converting a dict to a DataFrame
The right conversion depends on the shape of the dict. The main method is DataFrame.from_dict, documented as follows:
- pandas.DataFrame.from_dict
- classmethod DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)
- Construct DataFrame from dict of array-like or dicts.
- Creates DataFrame object from dictionary by columns or by index allowing dtype specification.
- Parameters
- data [dict] Of the form {field : array-like} or {field : dict}.
- orient [{'columns', 'index'}, default 'columns'] The "orientation" of the data. If the keys of the passed dict should be the columns of the resulting DataFrame, pass 'columns' (default). Otherwise if the keys should be rows, pass 'index'.
- dtype [dtype, default None] Data type to force, otherwise infer.
- columns [list, default None] Column labels to use when orient='index'. Raises a ValueError if used with orient='columns'.
- Returns
- DataFrame
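1.1 The dict values are scalars (not list-like)
1. from_dict
With from_dict you must set orient='index'; otherwise a ValueError is raised. In other words, the dict keys cannot be used as columns here, because scalar values give pandas no row index to build on.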
import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}
df = pd.DataFrame.from_dict(dic, orient='index')
print(df)
Out:
            0
name      abc
age        18
job   teacher
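To make the error concrete, here is a minimal sketch of what happens with the default orient. Wrapping the dict in a list to get a one-row DataFrame is an alternative not covered in the original text; it is shown only as an illustration.

```python
import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}

# With the default orient='columns', all-scalar values raise a ValueError.
try:
    pd.DataFrame.from_dict(dic)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Illustrative alternative: a list holding one dict becomes a single row,
# with the dict keys as column names.
one_row = pd.DataFrame([dic])
print(one_row)
```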
2. Manual conversion
Convert the dict to a Series, turn the Series into a DataFrame, then reset the index and rename the columns.
dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}
df = pd.DataFrame(pd.Series(dic), columns=['value'])
df = df.reset_index().rename(columns={'index': 'key'})
print(df)
Out:
    key    value
0  name      abc
1   age       18
2   job  teacher
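The same key/value table can probably be built in one step from the dict's items. This is a sketch of an alternative, not the approach the original text describes; the column names 'key' and 'value' are simply reused from the example above.

```python
import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}

# Each (key, value) pair becomes one row; columns names both fields at once.
df = pd.DataFrame(list(dic.items()), columns=['key', 'value'])
print(df)
```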
1.2 The dict values are lists
1.2.1 When orient is not specified, the keys become column names by default (column orientation).
dic = {'color': ['blue', 'green', 'orange', 'yellow'], 'size': [15, 20, 20, 25]}
df = pd.DataFrame.from_dict(dic)
print(df)
Out:
    color  size
0    blue    15
1   green    20
2  orange    20
3  yellow    25
1.2.2 When orient='index' is specified, the keys become row labels (row orientation).
dic = {'color': ['blue', 'green', 'orange', 'yellow'], 'size': [15, 20, 20, 25]}
df = pd.DataFrame.from_dict(dic, orient='index', columns=list('ABCD'))
print(df)
Out:
          A      B       C       D
color  blue  green  orange  yellow
size     15     20      20      25
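Summary:
Whatever orient is set to, that is the role the dict keys play.
For example, with orient='index' the dict keys become the row index.
1.3 The dict values are dicts
1.3.1 With the default orient, the outer keys are used as columns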
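The rule in the summary can be checked by building the same dict both ways and comparing labels and shapes. This is a small sketch; the variable names by_columns and by_index are illustrative.

```python
import pandas as pd

dic = {'color': ['blue', 'green', 'orange', 'yellow'], 'size': [15, 20, 20, 25]}

by_columns = pd.DataFrame.from_dict(dic)                 # keys -> column names
by_index = pd.DataFrame.from_dict(dic, orient='index')   # keys -> row labels

print(list(by_columns.columns), by_columns.shape)
print(list(by_index.index), by_index.shape)
```

The two results hold the same values with rows and columns swapped.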
dic = {'Jack': {'hobby': 'football', 'age': 19}, 'Tom': {'hobby': 'basketball', 'age': 24}, 'Lucy': {'hobby': 'swimming', 'age': 20}, 'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic)
print(df)
Out:
           Jack         Tom      Lucy  Lily
age          19          24        20  21.0
hobby  football  basketball  swimming   NaN
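This is the dict-of-dicts form: the outer keys become the columns, the keys of the inner dicts become the row labels, and missing values are filled with NaN. The row order shown comes from the pandas version used when this was written; other versions may list the rows in a different order.
1.3.2 With orient='index', the inner keys become the columns and the outer keys become the index
Changing orient from its default 'columns' to 'index' makes the inner keys the DataFrame's columns and the outer keys its index. Here too, the row order in the output may differ depending on the pandas version.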
dic = {'Jack': {'hobby': 'football', 'age': 19}, 'Tom': {'hobby': 'basketball', 'age': 24}, 'Lucy': {'hobby': 'swimming', 'age': 20}, 'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic, orient='index')
print(df)
Out:
           hobby  age
Jack    football   19
Lily         NaN   21
Lucy    swimming   20
Tom   basketball   24
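Note:
With a dict of dicts and orient='index', the columns argument can no longer be used to rename the columns. If columns is given, it only selects by label from the columns that already exist; a label that does not exist comes out as an all-NaN column.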
dic = {'Jack': {'hobby': 'football', 'age': 19}, 'Tom': {'hobby': 'basketball', 'age': 24}, 'Lucy': {'hobby': 'swimming', 'age': 20}, 'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic, orient='index', columns=['age', 'A'])
print(df)
Out:
      age    A
Jack   19  NaN
Lily   21  NaN
Lucy   20  NaN
Tom    24  NaN
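If the goal really is to rename the columns of a nested-dict result, one option is to call rename afterwards. This is a sketch under that assumption; the new labels 'sport' and 'years' are made up for illustration.

```python
import pandas as pd

dic = {'Jack': {'hobby': 'football', 'age': 19},
       'Tom': {'hobby': 'basketball', 'age': 24}}

df = pd.DataFrame.from_dict(dic, orient='index')

# rename() changes the labels; it does not select or drop columns.
renamed = df.rename(columns={'hobby': 'sport', 'age': 'years'})
print(renamed)
```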
2. Converting a DataFrame to a dict
The official documentation for DataFrame.to_dict:
- pandas.DataFrame.to_dict
- DataFrame.to_dict(orient='dict', into=<class 'dict'>)
- Convert the DataFrame to a dictionary.
- The type of the key-value pairs can be customized with the parameters (see below).
- Parameters
- orient [str {'dict', 'list', 'series', 'split', 'records', 'index'}] Determines the type of the values of the dictionary.
• 'dict' (default) : dict like {column -> {index -> value}}
• 'list' : dict like {column -> [values]}
• 'series' : dict like {column -> Series(values)}
• 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
• 'records' : list like [{column -> value}, ... , {column -> value}]
• 'index' : dict like {index -> {column -> value}}
Abbreviations are allowed. s indicates series and sp indicates split.
- into [class, default dict] The collections.abc.Mapping subclass used for all Mappings in the return value. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized.
- Returns
- dict, list or collections.abc.Mapping. Return a collections.abc.Mapping object representing the DataFrame. The resulting transformation depends on the orient parameter.
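The into parameter is the only one the examples here do not exercise, so a brief sketch may help. It assumes a small two-row DataFrame made up for illustration and shows both forms the documentation mentions: passing a class, and passing an initialized defaultdict.

```python
from collections import OrderedDict, defaultdict

import pandas as pd

df = pd.DataFrame({'col_1': [5, 6], 'col_2': [0.35, 0.96]}, index=['row1', 'row2'])

# Passing the class: outer and inner mappings are both OrderedDict.
ordered = df.to_dict(into=OrderedDict)

# A defaultdict has to be passed as an initialized instance, not as the class.
dd = df.to_dict(orient='list', into=defaultdict(list))

print(type(ordered).__name__, type(dd).__name__)
```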
In practice only one argument, orient, needs to be set, and the structure of the result depends on its value. The documentation lists six options, one of which returns a list rather than a dict:
- orient='dict' (the default) returns a dict of the form {column: {index: value}};
- orient='list' returns a dict of the form {column: [values]};
- orient='series' returns a dict of the form {column: Series(values)};
- orient='split' returns a dict of the form {'index': [index], 'columns': [columns], 'data': [values]};
- orient='records' returns a list of the form [{column: value}, ..., {column: value}];
- orient='index' returns a dict of the form {index: {column: value}}.
Here value is a value in the table, column is a column name, and index is a row label.
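A quick way to see which orient returns which container is to loop over all six values and record the type of each result. This is a sketch using the same sample data as the examples that follow; the name kinds is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

# Map each orient to the type name of the object that to_dict returns.
kinds = {orient: type(df.to_dict(orient=orient)).__name__
         for orient in ['dict', 'list', 'series', 'split', 'records', 'index']}
print(kinds)
```

Only 'records' yields a list; the other five yield a dict.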
df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]}, index=['row1', 'row2', 'row3'])
print(df)
Out:
      col_1  col_2
row1      5   0.35
row2      6   0.96
row3      7   0.55
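2.1 orient='list'
{column: [values]}
The resulting dict has the column names as keys and, for each key, the list of that column's values.
result = df.to_dict(orient='list')  # keep df itself intact for the later examples
print(result)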
Out:
{'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]}
2.2 orient='dict'
{column: {index: value}}
result = df.to_dict(orient='dict')
print(result)
Out:
{'col_1': {'row1': 5, 'row2': 6, 'row3': 7}, 'col_2': {'row1': 0.35, 'row2': 0.96, 'row3': 0.55}}
2.3 orient='series'
{column: Series(values)}
The only difference between orient='series' and orient='list' is that each value here is a Series rather than a list.
result = df.to_dict(orient='series')
print(result)
Out:
{'col_1': row1    5
row2    6
row3    7
Name: col_1, dtype: int64, 'col_2': row1    0.35
row2    0.96
row3    0.55
Name: col_2, dtype: float64}
2.4 orient='split'
{'index': [index], 'columns': [columns], 'data': [values]}
orient='split' produces three key-value pairs, one each for the row labels, the column names, and the data; every value is a list.
result = df.to_dict(orient='split')
print(result)
Out:
{'index': ['row1', 'row2', 'row3'], 'columns': ['col_1', 'col_2'], 'data': [[5, 0.35], [6, 0.96], [7, 0.55]]}
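Because the three keys line up with the data, index and columns arguments of the DataFrame constructor, a 'split' dict can be turned back into a DataFrame. This is a sketch of that round trip, which the original text does not cover; parts and rebuilt are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

parts = df.to_dict(orient='split')

# Feed the three lists back into the constructor by name.
rebuilt = pd.DataFrame(data=parts['data'],
                       index=parts['index'],
                       columns=parts['columns'])
print(rebuilt)
```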
2.5 orient='records'
[{column: value}, {column: value}, ..., {column: value}]
Note that orient='records' returns a list, not a dict. Each element maps every column name to the value in one row; the row labels are not included.
result = df.to_dict(orient='records')
print(result)
Out:
[{'col_1': 5, 'col_2': 0.35}, {'col_1': 6, 'col_2': 0.96}, {'col_1': 7, 'col_2': 0.55}]
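The advantage of this layout is that the column-to-value dict for any single row is easy to get. For example, to get the {column: value} data for the row at position 1 (the second row):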
print(df.to_dict('records')[1])
Out:
{'col_1': 6, 'col_2': 0.96}
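Since the records carry no row labels, rebuilding a DataFrame from them gives a default integer index. The sketch below illustrates this; restoring the labels by assigning the original index is an assumption about what a reader might want, not something the original text does.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

records = df.to_dict(orient='records')

# A list of dicts becomes one row per dict; the index is 0, 1, 2.
rebuilt = pd.DataFrame(records)
print(list(rebuilt.index))

# The labels have to be put back separately if they matter.
rebuilt.index = df.index
print(rebuilt)
```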
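2.6 orient='index'
{index: {column: value}}
orient='index' is the mirror image of orient='dict': the outer keys are the row labels, and each maps to that row's column-to-value pairs. Lookups work much like orient='records', but by row label instead of position: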
print(df.to_dict('index'))
Out:
{'row1': {'col_1': 5, 'col_2': 0.35}, 'row2': {'col_1': 6, 'col_2': 0.96}, 'row3': {'col_1': 7, 'col_2': 0.55}}
Look up the column-to-value dict for the row labeled row1:
print(df.to_dict('index')['row1'])
Out:
{'col_1': 5, 'col_2': 0.35}
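This orient pairs naturally with from_dict(orient='index') from section 1, since both treat the outer keys as row labels. A sketch of the round trip, with by_row and rebuilt as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

by_row = df.to_dict(orient='index')

# Outer keys go back to the index, inner keys back to the columns.
rebuilt = pd.DataFrame.from_dict(by_row, orient='index')
print(rebuilt)
```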