
Converting between DataFrames and dicts in pandas

Updated: 2024-04-03 11:44:04 Author: craftsman2020

pandas is a very powerful Python library for data processing and analysis. This article walks through converting between DataFrames and dicts in pandas, with example code for each case.

1. Converting a dict to a DataFrame

The right conversion depends on the shape of the dict. The main method is DataFrame.from_dict; its official documentation reads:

pandas.DataFrame.from_dict
- classmethod DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)
- Construct DataFrame from dict of array-like or dicts.
- Creates DataFrame object from dictionary by columns or by index allowing dtype specification.
- Parameters
  - data [dict] Of the form {field : array-like} or {field : dict}.
  - orient [{'columns', 'index'}, default 'columns'] The "orientation" of the data. If the
    keys of the passed dict should be the columns of the resulting DataFrame, pass 'columns' (default). Otherwise if the keys should be rows, pass 'index'.
  - dtype [dtype, default None] Data type to force, otherwise infer.
  - columns [list, default None] Column labels to use when orient='index'. Raises
    a ValueError if used with orient='columns'.
- Returns
  - DataFrame
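
The interaction between orient and columns described in the documentation above can be checked with a short sketch. The dict `data` and the labels 'x' and 'y' are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, not taken from the article.
data = {'a': [1, 2], 'b': [3, 4]}

# Passing columns together with orient='columns' is rejected.
try:
    pd.DataFrame.from_dict(data, orient='columns', columns=['x', 'y'])
    raised = False
except ValueError:
    raised = True
print(raised)

# With orient='index' the keys become row labels and columns names the positions.
by_index = pd.DataFrame.from_dict(data, orient='index', columns=['x', 'y'])
print(by_index)
```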

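1.1 The dict values are scalars (non-iterable objects)

1. from_dict

With from_dict you must set orient='index'; otherwise pandas raises a ValueError, because all-scalar values give it no row index to build from. In other words, the dict keys cannot be used as columns here.

import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}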
df = pd.DataFrame.from_dict(dic, orient='index')
print(df)

Out:

            0
name      abc
age        18
job   teacher
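
The error mentioned above can be reproduced in a few lines. The sketch below also shows one alternative that is not part of the original article: wrapping the dict in a list, which produces a single row with the keys as columns.

```python
import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}

# With the default orient='columns', all-scalar values leave pandas without a row index.
try:
    pd.DataFrame.from_dict(dic)
    raised = False
except ValueError:
    raised = True
print(raised)

# Alternative (an addition, not the article's method): one dict = one row.
row_df = pd.DataFrame([dic])
print(row_df)
```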

2. Manual conversion

Convert the dict to a Series first, turn the Series into a DataFrame, then reset the index and rename the columns.

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}
df = pd.DataFrame(pd.Series(dic), columns=['value'])
df = df.reset_index().rename(columns={'index': 'key'})
print(df)

Out:

    key    value
0  name      abc
1   age       18
2   job  teacher
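
The same key/value table can probably be built more directly from the dict's items. This is a sketch of an alternative, not the approach the article uses; the column names 'key' and 'value' simply mirror the output above.

```python
import pandas as pd

dic = {'name': 'abc', 'age': 18, 'job': 'teacher'}

# Each (key, value) pair becomes one row; columns names the two positions.
df = pd.DataFrame(list(dic.items()), columns=['key', 'value'])
print(df)
```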

1.2 The dict values are lists

1.2.1 When orient is not specified, the keys become the column names by default (column-wise layout).

dic = {'color': ['blue', 'green', 'orange', 'yellow'], 'size': [15, 20, 20, 25]}
df = pd.DataFrame.from_dict(dic)
print(df)

Out:

    color  size
0    blue    15
1   green    20
2  orange    20
3  yellow    25

1.2.2 With orient='index', the keys become the row labels (row-wise layout).

dic = {'color': ['blue', 'green', 'orange', 'yellow'], 'size': [15, 20, 20, 25]}
df = pd.DataFrame.from_dict(dic, orient='index', columns=list('ABCD'))
print(df)

Out:

          A      B       C       D
color  blue  green  orange  yellow
size     15     20      20      25

Summary:
The dict keys become whatever orient names.
For example, with orient='index' the dict keys become the row index.
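
One practical consequence of this rule, sketched here with made-up data: lists of unequal length cannot be laid out as columns, but with orient='index' each list becomes a row and the shorter rows are padded with missing values.

```python
import pandas as pd

# Hypothetical ragged data: the two lists have different lengths.
dic = {'a': [1, 2, 3], 'b': [4, 5]}

# As columns, the lengths must match, so this fails.
try:
    pd.DataFrame.from_dict(dic)
    columns_ok = True
except ValueError:
    columns_ok = False
print(columns_ok)

# As rows, the shorter list is padded with a missing value.
df = pd.DataFrame.from_dict(dic, orient='index')
print(df)
```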

1.3 The dict values are dicts

1.3.1 With the default orient, the keys are used as columns

dic = {'Jack': {'hobby': 'football', 'age': 19},
       'Tom': {'hobby': 'basketball', 'age': 24},
       'Lucy': {'hobby': 'swimming', 'age': 20},
       'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic)
print(df)

Out:

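           Jack         Tom      Lucy  Lily
age          19          24        20  21.0
hobby  football  basketball  swimming   NaN

This is the dict-of-dicts form: the keys of the outer dict become the columns, the keys of the inner dicts become the row labels, and missing entries are filled with NaN. Depending on the pandas version, the rows may appear in insertion order rather than sorted as shown here.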

1.3.2 With orient='index', the inner keys become the columns and the outer keys become the index

Changing orient from its default 'columns' to 'index' makes the inner keys the DataFrame's columns and the outer keys its index.

dic = {'Jack': {'hobby': 'football', 'age': 19},
       'Tom': {'hobby': 'basketball', 'age': 24},
       'Lucy': {'hobby': 'swimming', 'age': 20},
       'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic, orient='index')
print(df)

Out:

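           hobby  age
Jack    football   19
Lily         NaN   21
Lucy    swimming   20
Tom   basketball   24

Note:
With a dict of dicts and orient='index', the columns argument cannot be used to rename columns. If columns is given, it only selects from the columns that already exist in the resulting DataFrame; any other name yields a column filled with NaN.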

dic = {'Jack': {'hobby': 'football', 'age': 19},
       'Tom': {'hobby': 'basketball', 'age': 24},
       'Lucy': {'hobby': 'swimming', 'age': 20},
       'Lily': {'age': 21}}
df = pd.DataFrame.from_dict(dic, orient='index', columns=['age', 'A'])
print(df)

Out:

      age    A
Jack   19  NaN
Lily   21  NaN
Lucy   20  NaN
Tom    24  NaN
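
Since columns only selects in this case, one way to end up with different column names is to rename after construction. The following is a sketch using DataFrame.rename; the new names 'H' and 'A' are arbitrary choices for illustration.

```python
import pandas as pd

dic = {'Jack': {'hobby': 'football', 'age': 19},
       'Tom': {'hobby': 'basketball', 'age': 24},
       'Lucy': {'hobby': 'swimming', 'age': 20},
       'Lily': {'age': 21}}

# Build first, then rename the existing columns explicitly.
df = pd.DataFrame.from_dict(dic, orient='index')
df = df.rename(columns={'hobby': 'H', 'age': 'A'})
print(df)
```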

2. Converting a DataFrame to a dict

Official documentation for DataFrame.to_dict:

pandas.DataFrame.to_dict
- DataFrame.to_dict(orient='dict', into=<class 'dict'>)
- Convert the DataFrame to a dictionary.
- The type of the key-value pairs can be customized with the parameters (see below).
- Parameters
  - orient [str {'dict', 'list', 'series', 'split', 'records', 'index'}] Determines the type of the
    values of the dictionary.
    • 'dict' (default) : dict like {column -> {index -> value}}
    • 'list' : dict like {column -> [values]}
    • 'series' : dict like {column -> Series(values)}
    • 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
    • 'records' : list like [{column -> value}, ... , {column -> value}]
    • 'index' : dict like {index -> {column -> value}}
    Abbreviations are allowed. s indicates series and sp indicates split.
  - into [class, default dict] The collections.abc.Mapping subclass used for all Mappings
    in the return value. Can be the actual class or an empty instance of the mapping
    type you want. If you want a collections.defaultdict, you must pass it initialized.
- Returns
  - dict, list or collections.abc.Mapping Return a collections.abc.Mapping object representing the DataFrame. The resulting transformation depends on the orient parameter.
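
The into parameter described above can be illustrated with a small sketch. The frame below is a shortened stand-in for illustration; the point is that a mapping class can be passed as-is, while a defaultdict has to be passed as an initialized instance.

```python
from collections import OrderedDict, defaultdict

import pandas as pd

df = pd.DataFrame({'col_1': [5, 6], 'col_2': [0.35, 0.96]},
                  index=['row1', 'row2'])

# A mapping class can be passed directly; nested mappings use it too.
ordered = df.to_dict(into=OrderedDict)

# A defaultdict must be passed initialized (here with list as default factory).
records = df.to_dict(orient='records', into=defaultdict(list))

print(type(ordered).__name__, type(records[0]).__name__)
```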
In practice only one argument, orient, needs to be set, but the structure of the result depends on its value. The official documentation lists six options, one of which returns a list rather than a dict:
- orient='dict' (the default) gives a dict of the form {column: {index: value}};
- orient='list' gives a dict of the form {column: [values]};
- orient='series' gives a dict of the form {column: Series(values)};
- orient='split' gives a dict of the form {'index': [index], 'columns': [columns], 'data': [values]};
- orient='records' gives a list of the form [{column: value}, ..., {column: value}];
- orient='index' gives a dict of the form {index: {column: value}}.
Note: above, value stands for a cell value in the table, column for a column name, and index for a row label.
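
As a quick check of the list above, the following sketch calls to_dict with each of the six orient values and records the type of the result. The name `result_types` is only an illustrative helper.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

orients = ['dict', 'list', 'series', 'split', 'records', 'index']

# Map each orient to the type name of what to_dict returns for it.
result_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}
print(result_types)
```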

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]}, index=['row1', 'row2', 'row3'])
print(df)

Out:

      col_1  col_2
row1      5   0.35
row2      6   0.96
row3      7   0.55

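2.1 orient='list'

{column: [values]}
The resulting dict has the column names as keys and, for each column, a list of its values.

# Store the result under a new name so that df stays a DataFrame.
result = df.to_dict(orient='list')
print(result)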

Out:

{'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]}
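
Note that the 'list' form carries no row labels. A round trip back to a DataFrame, sketched below, therefore comes back with a default integer index unless the labels are supplied again.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

d = df.to_dict(orient='list')

# Rebuilding from the dict alone loses the row labels.
plain = pd.DataFrame.from_dict(d)
print(list(plain.index))

# Supplying the original index restores the frame.
restored = pd.DataFrame(d, index=df.index)
print(restored.equals(df))
```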

2.2 orient='dict'

{column: {index: value}}

# Store the result under a new name so that df stays a DataFrame.
result = df.to_dict(orient='dict')
print(result)

Out:

{'col_1': {'row1': 5, 'row2': 6, 'row3': 7}, 'col_2': {'row1': 0.35, 'row2': 0.96, 'row3': 0.55}}
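
Because this form is keyed first by column and then by row label, a single cell can be looked up with two keys, and the nested dict can be passed straight back to the DataFrame constructor. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

d = df.to_dict(orient='dict')

# Column first, then row label.
cell = d['col_2']['row2']
print(cell)

# The nested dict keeps both sets of labels, so it rebuilds the frame.
restored = pd.DataFrame(d)
print(restored)
```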

2.3 orient='series'

{column: Series(values)}
The only difference between orient='series' and orient='list' is that each value here is a Series rather than a list.

# Store the result under a new name so that df stays a DataFrame.
result = df.to_dict(orient='series')
print(result)

Out:

{'col_1': row1 5
row2 6
row3 7
Name: col_1, dtype: int64, 'col_2': row1 0.35
row2 0.96
row3 0.55
Name: col_2, dtype: float64}
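
Since each value is a Series, it keeps the row labels and supports Series methods directly. The following sketch shows this; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

cols = df.to_dict(orient='series')

# Each value is a pandas Series indexed by the original row labels.
total = cols['col_1'].sum()
largest = cols['col_2'].max()
print(total, largest)
```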

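2.4 orient='split'

{'index': [index], 'columns': [columns], 'data': [values]}; orient='split' yields three key-value pairs, one each for the row labels, the column names, and the values, and every value is a list.

# Store the result under a new name so that df stays a DataFrame.
result = df.to_dict(orient='split')
print(result)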

Out:

{'index': ['row1', 'row2', 'row3'], 'columns': ['col_1', 'col_2'], 'data': [[5, 0.35], [6, 0.96], [7, 0.55]]}
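
The three pieces of the 'split' form line up with the data, index and columns arguments of the DataFrame constructor, so a round trip can be sketched as follows.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

parts = df.to_dict(orient='split')

# Feed the three lists back into the constructor.
restored = pd.DataFrame(parts['data'],
                        index=parts['index'],
                        columns=parts['columns'])
print(restored)
```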

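2.5 orient='records'

[{column: value}, {column: value}, ..., {column: value}]; note that orient='records' returns a list, not a dict. It contains one dict per row, mapping every column name to that row's value:

# Store the result under a new name so that df stays a DataFrame.
result = df.to_dict(orient='records')
print(result)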

Out:

[{'col_1': 5, 'col_2': 0.35}, {'col_1': 6, 'col_2': 0.96}, {'col_1': 7, 'col_2': 0.55}]

The advantage of this form is that the {column: value} dict for any given row is easy to get. For example, for the row at position 1 (the second row):

print(df.to_dict('records')[1])

Out:

{'col_1': 6, 'col_2': 0.96}
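
A list of records can also be passed back to the DataFrame constructor. The records carry no row labels, so this sketch supplies the original index explicitly.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

records = df.to_dict(orient='records')

# One dict per row; the index argument reattaches the row labels.
restored = pd.DataFrame(records, index=df.index)
print(restored)
```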

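2.6 orient='index'

{index: {column: value}}

orient='index' is the transpose of orient='dict': it maps each row label to that row's {column: value} dict. Lookups work much as with orient='records', except that rows are keyed by label rather than by position: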

print(df.to_dict('index'))

Out:

{'row1': {'col_1': 5, 'col_2': 0.35}, 'row2': {'col_1': 6, 'col_2': 0.96}, 'row3': {'col_1': 7, 'col_2': 0.55}}

To get the {column: value} dict for the row labeled row1:

print(df.to_dict('index')['row1'])

Out:

{'col_1': 5, 'col_2': 0.35}
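
This form pairs naturally with from_dict(orient='index') from section 1, which gives a round trip that keeps the row labels. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'col_1': [5, 6, 7], 'col_2': [0.35, 0.96, 0.55]},
                  index=['row1', 'row2', 'row3'])

d = df.to_dict(orient='index')

# Outer keys become the index again, inner keys the columns.
restored = pd.DataFrame.from_dict(d, orient='index')
print(restored)
```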
