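A Detailed Look at Data Frames in R
A data frame is a table, or two-dimensional array-like structure, in which each column contains the values of one variable and each row contains one set of values from each column.
A data frame has the following properties.
- Column names should be non-empty.
- Row names should be unique.
- The data stored in a data frame can be of numeric, factor, or character type, among others.
- Each column should contain the same number of data items.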
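As a minimal sketch of how R enforces two of these properties, the snippet below tries to build data frames that violate them and records whether an error is raised. The helper `fails` and the toy columns `a` and `b` are illustrative assumptions and do not appear in the original example.

```r
# Hypothetical helper: returns TRUE if evaluating `expr` raises an error.
fails <- function(expr) {
  tryCatch({
    force(expr)
    FALSE
  }, error = function(e) TRUE)
}

# Columns of length 3 and 2 cannot be combined into one data frame.
unequal_fails <- fails(data.frame(a = 1:3, b = c("x", "y")))

# Duplicate row names are rejected as well.
dup_rownames_fails <- fails(data.frame(a = 1:2, row.names = c("r", "r")))

print(unequal_fails)
print(dup_rownames_fails)
```

Note that data.frame() recycles a shorter vector when its length divides the longest column evenly, so a length mismatch only fails when it does not.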
Creating a data frame

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Print the data frame.
    print(emp.data)

Running the code above produces the following result:

      emp_id emp_name salary start_date
    1      1     Rick 623.30 2012-01-01
    2      2      Dan 515.20 2013-09-23
    3      3 Michelle 611.00 2014-11-15
    4      4     Ryan 729.00 2014-05-11
    5      5     Gary 843.25 2015-03-27
Getting the structure of a data frame
The str() function shows the structure of a data frame.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Get the structure of the data frame.
    str(emp.data)

Running the code above produces the following result:

    'data.frame':   5 obs. of  4 variables:
     $ emp_id    : int  1 2 3 4 5
     $ emp_name  : chr  "Rick" "Dan" "Michelle" "Ryan" ...
     $ salary    : num  623 515 611 729 843
     $ start_date: Date, format: "2012-01-01" "2013-09-23" "2014-11-15" "2014-05-11" ...
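The pieces of information that str() prints can also be retrieved one at a time. The following is a small sketch using the base functions nrow(), ncol(), and class(); the variable names `n_rows`, `n_cols`, and `col_classes` are chosen here for illustration only.

```r
# Rebuild the example data frame so the snippet runs on its own.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)

# Number of observations (rows) and variables (columns).
n_rows <- nrow(emp.data)
n_cols <- ncol(emp.data)

# Class of each column, as a named character vector.
col_classes <- sapply(emp.data, class)

print(c(rows = n_rows, cols = n_cols))
print(col_classes)
```

This can be handy when the structure has to be checked programmatically rather than read from the console.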
Summary of the data in a data frame
Applying the summary() function returns a statistical summary of the data along with the nature of each column.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Print the summary.
    print(summary(emp.data))

Running the code above produces the following result:

         emp_id    emp_name             salary        start_date
     Min.   :1   Length:5           Min.   :515.2   Min.   :2012-01-01
     1st Qu.:2   Class :character   1st Qu.:611.0   1st Qu.:2013-09-23
     Median :3   Mode  :character   Median :623.3   Median :2014-05-11
     Mean   :3                      Mean   :664.4   Mean   :2014-01-14
     3rd Qu.:4                      3rd Qu.:729.0   3rd Qu.:2014-11-15
     Max.   :5                      Max.   :843.2   Max.   :2015-03-27
Extracting data from a data frame
Use column names to extract specific columns from a data frame.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Extract specific columns.
    result <- data.frame(emp.data$emp_name, emp.data$salary)
    print(result)

Running the code above produces the following result:

      emp.data.emp_name emp.data.salary
    1              Rick          623.30
    2               Dan          515.20
    3          Michelle          611.00
    4              Ryan          729.00
    5              Gary          843.25
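The column names in that result carry an `emp.data.` prefix because the columns were passed to data.frame() as unnamed `$` expressions. One alternative, sketched below, is to index the data frame with a character vector of column names, which keeps the original names. This is a variation on the example above, not the approach it uses.

```r
# Rebuild the example data frame so the snippet runs on its own.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)

# Select columns by name; the result keeps the names emp_name and salary.
result <- emp.data[, c("emp_name", "salary")]
print(names(result))
print(result)
```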
Extract the first two rows with all columns.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Extract first two rows.
    result <- emp.data[1:2, ]
    print(result)

Running the code above produces the following result:

      emp_id emp_name salary start_date
    1      1     Rick  623.3 2012-01-01
    2      2      Dan  515.2 2013-09-23
Extract the 3rd and 5th rows with the 2nd and 4th columns.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Extract 3rd and 5th row with 2nd and 4th column.
    result <- emp.data[c(3, 5), c(2, 4)]
    print(result)

Running the code above produces the following result:

      emp_name start_date
    3 Michelle 2014-11-15
    5     Gary 2015-03-27
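The same `[rows, columns]` indexing also accepts a logical condition for the rows and column names for the columns. The sketch below is an illustrative extension of the positional example; the salary threshold of 700 and the variable names `high_paid` and `names_only` are assumptions made for the demonstration.

```r
# Rebuild the example data frame so the snippet runs on its own.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)

# Rows chosen by a logical condition, columns chosen by name.
high_paid <- emp.data[emp.data$salary > 700, c("emp_name", "salary")]
print(high_paid)

# Selecting a single column returns a plain vector by default;
# drop = FALSE keeps the result as a one-column data frame.
names_only <- emp.data[c(3, 5), "emp_name", drop = FALSE]
print(names_only)
```

Row names in the results are the original row positions, just as in the positional example.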
Expanding a data frame
A data frame can be expanded by adding columns and rows.
Adding a column
Simply assign a column vector under a new column name.

    # Create the data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      stringsAsFactors = FALSE
    )
    # Add the "dept" column.
    emp.data$dept <- c("IT", "Operations", "IT", "HR", "Finance")
    v <- emp.data
    print(v)

Running the code above produces the following result:

      emp_id emp_name salary start_date       dept
    1      1     Rick 623.30 2012-01-01         IT
    2      2      Dan 515.20 2013-09-23 Operations
    3      3 Michelle 611.00 2014-11-15         IT
    4      4     Ryan 729.00 2014-05-11         HR
    5      5     Gary 843.25 2015-03-27    Finance
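The new column vector has to fit the number of rows in the data frame. The sketch below, on a shortened version of the example data, shows an assignment that R rejects because the lengths are incompatible, followed by a single value that is recycled to every row. The column `reviewed` is a hypothetical name used only for this illustration.

```r
# A shortened version of the example data frame.
emp.data <- data.frame(
  emp_id = c(1:5),
  emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  stringsAsFactors = FALSE
)

# Two values cannot fill five rows, so the assignment raises an error.
outcome <- tryCatch({
  emp.data$dept <- c("IT", "HR")
  "assigned"
}, error = function(e) "rejected")
print(outcome)

# A single value is recycled to all rows (hypothetical column).
emp.data$reviewed <- FALSE
print(emp.data)
```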
Adding rows
To add more rows permanently to an existing data frame, the new rows must have the same structure as the existing data frame, and the two are combined with the rbind() function.
In the example below, we create a data frame containing the new rows and bind it to the existing data frame to produce the final data frame.

    # Create the first data frame.
    emp.data <- data.frame(
      emp_id = c(1:5),
      emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
      salary = c(623.3, 515.2, 611.0, 729.0, 843.25),
      start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                             "2014-05-11", "2015-03-27")),
      dept = c("IT", "Operations", "IT", "HR", "Finance"),
      stringsAsFactors = FALSE
    )
    # Create the second data frame.
    emp.newdata <- data.frame(
      emp_id = c(6:8),
      emp_name = c("Rasmi", "Pranab", "Tusar"),
      salary = c(578.0, 722.5, 632.8),
      start_date = as.Date(c("2013-05-21", "2013-07-30", "2014-06-17")),
      dept = c("IT", "Operations", "Finance"),
      stringsAsFactors = FALSE
    )
    # Bind the two data frames.
    emp.finaldata <- rbind(emp.data, emp.newdata)
    print(emp.finaldata)

Running the code above produces the following result:

      emp_id emp_name salary start_date       dept
    1      1     Rick 623.30 2012-01-01         IT
    2      2      Dan 515.20 2013-09-23 Operations
    3      3 Michelle 611.00 2014-11-15         IT
    4      4     Ryan 729.00 2014-05-11         HR
    5      5     Gary 843.25 2015-03-27    Finance
    6      6    Rasmi 578.00 2013-05-21         IT
    7      7   Pranab 722.50 2013-07-30 Operations
    8      8    Tusar 632.80 2014-06-17    Finance
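To illustrate what "the same structure" means for rbind(), here is a small sketch on toy data. For data frames, rbind() matches columns by name rather than by position, and it raises an error when the column counts differ. The data frames `a`, `b`, and `short` are hypothetical and exist only for this demonstration.

```r
# Two toy data frames with the same columns in a different order.
a <- data.frame(id = 1:2, name = c("x", "y"), stringsAsFactors = FALSE)
b <- data.frame(name = "z", id = 3L, stringsAsFactors = FALSE)

# Columns are matched by name, so the differing order is not a problem.
combined <- rbind(a, b)
print(combined)

# A data frame with a missing column cannot be bound.
short <- data.frame(id = 4L)
outcome <- tryCatch({
  rbind(a, short)
  "bound"
}, error = function(e) "rejected")
print(outcome)
```

Checking names(a) against names(b) before binding is one simple way to catch a mismatch early.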