pandas調(diào)整列的順序以及添加列的實現(xiàn)

更新時間：2021年03月27日 10:19:05 作者：python小工具

這篇文章主要介紹了pandas調(diào)整列的順序以及添加列的實現(xiàn)操作，具有很好的參考價值，希望對大家有所幫助。一起跟隨小編過來看看吧

在對excel的操作中，調(diào)整列的順序以及添加一些列也是經(jīng)常用到的，下面我們用pandas實現(xiàn)這一功能。

1、調(diào)整列的順序

>>> df = pd.read_excel(r'D:/myExcel/1.xlsx')
>>> df
  A B C D
0  bob 12 78 87
1 millor 15 92 21
>>> df.columns
Index(['A', 'B', 'C', 'D'], dtype='object')
# 這是最簡單常用的一種方法，相當于指定列名讓pandas
# 從df中獲取
>>> df[['A', 'D', 'C', 'B']]
  A D C B
0  bob 87 78 12
1 millor 21 92 15
# 這也是可以的
>>> df[['A', 'A', 'A', 'A']]
  A  A  A  A
0  bob  bob  bob  bob
1 millor millor millor millor

2、添加某一列或者某幾列

（1）直接添加

>>> df['E']=[1, 2]
>>> df
  A B C D E
0  bob 12 78 87 1
1 millor 15 92 21 2

（2）調(diào)用assign方法。該方法善于根據(jù)已有的列添加新的列，通過基本運算，或者調(diào)用函數(shù)

>>> df
  A B C D
0  bob 12 78 87
1 millor 15 92 21
# 其中E是列名，根據(jù)B列-C列的值得到
>>> df.assign(E=df['B'] - df['C'])
  A B C D E
0  bob 12 78 87 -66
1 millor 15 92 21 -77
# 添加兩列也可以
>>> df.assign(E=df['B'] - df['C'], F=df['B'] * df['C'])
  A B C D E  F
0  bob 12 78 87 -66 936
1 millor 15 92 21 -77 1380

哈哈，以上就是pandas關于調(diào)整列的順序以及新增列的用法

補充：pandas修改DataFrame中的列名&調(diào)整列的順序

修改列名：

直接調(diào)用接口：

df.rename()

看一下接口中的定義：

 def rename(self, *args, **kwargs):
  """
  Alter axes labels.
  Function / dict values must be unique (1-to-1). Labels not contained in
  a dict / Series will be left as-is. Extra labels listed don't throw an
  error.
  See the :ref:`user guide <basics.rename>` for more.
  Parameters
  ----------
  mapper, index, columns : dict-like or function, optional
   dict-like or functions transformations to apply to
   that axis' values. Use either ``mapper`` and ``axis`` to
   specify the axis to target with ``mapper``, or ``index`` and
   ``columns``.
  axis : int or str, optional
   Axis to target with ``mapper``. Can be either the axis name
   ('index', 'columns') or number (0, 1). The default is 'index'.
  copy : boolean, default True
   Also copy underlying data
  inplace : boolean, default False
   Whether to return a new DataFrame. If True then value of copy is
   ignored.
  level : int or level name, default None
   In case of a MultiIndex, only rename labels in the specified
   level.
  Returns
  -------
  renamed : DataFrame
  See Also
  --------
  pandas.DataFrame.rename_axis
  Examples
  --------
  ``DataFrame.rename`` supports two calling conventions
  * ``(index=index_mapper, columns=columns_mapper, ...)``
  * ``(mapper, axis={'index', 'columns'}, ...)``
  We *highly* recommend using keyword arguments to clarify your
  intent.
  >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
  >>> df.rename(index=str, columns={"A": "a", "B": "c"})
   a c
  0 1 4
  1 2 5
  2 3 6
 
  >>> df.rename(index=str, columns={"A": "a", "C": "c"})
   a B
  0 1 4
  1 2 5
  2 3 6
 
  Using axis-style parameters
 
  >>> df.rename(str.lower, axis='columns')
   a b
  0 1 4
  1 2 5
  2 3 6
 
  >>> df.rename({1: 2, 2: 4}, axis='index')
   A B
  0 1 4
  2 2 5
  4 3 6
  """
  axes = validate_axis_style_args(self, args, kwargs, 'mapper', 'rename')
  kwargs.update(axes)
  # Pop these, since the values are in `kwargs` under different names
  kwargs.pop('axis', None)
  kwargs.pop('mapper', None)
  return super(DataFrame, self).rename(**kwargs)

注意：

一個*，輸入可以是數(shù)組、元組，會把輸入的數(shù)組或元組拆分成一個個元素。

兩個*，輸入必須是字典格式

示例：

>>>import pandas as pd
>>>a = pd.DataFrame({'A':[1,2,3], 'B':[4,5,6], 'C':[7,8,9]})
>>> a 
 A B C
0 1 4 7
1 2 5 8
2 3 6 9 
 
#將列名A替換為列名a，B改為b，C改為c
>>>a.rename(columns={'A':'a', 'B':'b', 'C':'c'}, inplace = True)
>>>a
 a b c
0 1 4 7
1 2 5 8
2 3 6 9

調(diào)整列的順序：

如：

>>> import pandas
>>> dict_a = {'user_id':['webbang','webbang','webbang'],'book_id':['3713327','4074636','26873486'],'rating':['4','4','4'],
'mark_date':['2017-03-07','2017-03-07','2017-03-07']}
 
>>> df = pandas.DataFrame(dict_a) # 從字典創(chuàng)建DataFrame
>>> df # 創(chuàng)建好的df列名默認按首字母順序排序，和字典中的先后順序并不一樣，字典中'user_id','book_id','rating','mark_date'
 
 book_id mark_date rating user_id
0 3713327 2017-03-07 4 webbang
1 4074636 2017-03-07 4 webbang
2 26873486 2017-03-07 4 webbang

直接修改列名：

>>> df = df[['user_id','book_id','rating','mark_date']] # 調(diào)整列順序為'user_id','book_id','rating','mark_date'
>>> df
 
 user_id book_id rating mark_date
0 webbang 3713327 4 2017-03-07
1 webbang 4074636 4 2017-03-07
2 webbang 26873486 4 2017-03-07

就可以了。

以上為個人經(jīng)驗，希望能給大家一個參考，也希望大家多多支持腳本之家。如有錯誤或未考慮完全的地方，望不吝賜教。

您可能感興趣的文章: