
Splitting a Dataset in Python (Training and Test Sets)

Updated: 2023-05-19 17:06:07   Author: Python小丸子
This article explains in detail how to split a dataset into a training set and a test set in Python. Each step of the implementation is covered, so readers who are interested can follow along.

The steps are explained one at a time first; the complete code is at the end.

Import the modules:

import os
from shutil import copy, rmtree
import random

Create a folder (if the folder already exists, it is deleted and recreated empty):

def make_file(file_path: str):
    if os.path.exists(file_path):
        rmtree(file_path)
    os.makedirs(file_path)
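
Because make_file calls rmtree first, anything already in the target folder is lost. The sketch below illustrates this against a temporary directory; the folder name demo and the file old.txt are hypothetical and used only for illustration.

```python
import os
import tempfile
from shutil import rmtree


def make_file(file_path: str):
    # Same helper as above: remove the folder if it exists, then recreate it empty.
    if os.path.exists(file_path):
        rmtree(file_path)
    os.makedirs(file_path)


def demo_make_file() -> list:
    # Hypothetical demo: create a folder holding one file, then call make_file on it.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo")
        os.makedirs(target)
        with open(os.path.join(target, "old.txt"), "w") as f:
            f.write("stale")
        make_file(target)
        # The folder still exists, but the old file is gone.
        return os.listdir(target)


print(demo_make_file())  # []
```

For this reason it is safer to point the output location at a folder that holds nothing you want to keep.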

Set the split ratio. This article uses 0.1, which means the validation set takes 10% of the whole dataset:

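random.seed(0)  # fix the seed so the same split is produced on every run
split_rate = 0.1  # fraction of each class that goes to the validation set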
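
How many images a class contributes to the validation set follows from int(num * split_rate), which truncates. A minimal sketch of that arithmetic and of the effect of a fixed seed is given below; the helper names val_count and pick_validation and the file names are assumptions made for illustration, not part of the original script.

```python
import random


def val_count(num_images: int, split_rate: float) -> int:
    # int() truncates toward zero, so very small classes may get no validation images.
    return int(num_images * split_rate)


def pick_validation(names: list, split_rate: float, seed: int = 0) -> list:
    # A private Random instance keeps this demo independent of the global random state.
    rng = random.Random(seed)
    return rng.sample(names, k=val_count(len(names), split_rate))


names = ["img_{}.jpg".format(i) for i in range(25)]  # hypothetical file names

print(val_count(25, 0.1))  # 2
print(val_count(9, 0.1))   # 0
# The same seed and the same input list give the same selection.
print(pick_validation(names, 0.1) == pick_validation(names, 0.1))  # True
```

A class with fewer than 10 images therefore ends up with an empty validation folder at a ratio of 0.1.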

Dataset locations: set the source folder and a new folder that will hold the split dataset.

data_path = r'D:\chengxu\data\caodi'  # where the source dataset is stored, one subfolder per class
data_root = r'D:\chengxu\data\cd'  # where the generated training and validation sets are written
data_class = [cla for cla in os.listdir(data_path)]
print("Data classes:")
print(data_class)  # print the class names
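
os.listdir returns every entry in data_path, including any stray files, and each entry is later treated as a class folder. A possible variant, shown here as a sketch, keeps only subdirectories; the function name list_classes and the sample folder names are assumptions for illustration.

```python
import os
import tempfile


def list_classes(data_path: str) -> list:
    # Keep only subdirectories; sort so the order does not depend on the file system.
    return sorted(
        name for name in os.listdir(data_path)
        if os.path.isdir(os.path.join(data_path, name))
    )


def demo_list_classes() -> list:
    # Hypothetical layout: two class folders plus one file that is not a class.
    with tempfile.TemporaryDirectory() as tmp:
        for cla in ("grass", "soil"):
            os.makedirs(os.path.join(tmp, cla))
        with open(os.path.join(tmp, "readme.txt"), "w") as f:
            f.write("not a class")
        return list_classes(tmp)


print(demo_list_classes())  # ['grass', 'soil']
```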

Create the training set folders:

train_data_root = os.path.join(data_root, "train")  # the training set folder is named train
make_file(train_data_root)
for num_class in data_class:
    make_file(os.path.join(train_data_root, num_class))

Create the validation set folders (the held-out set that the title calls the test set):

val_data_root = os.path.join(data_root, "val")  # the validation set folder is named val
make_file(val_data_root)
for num_class in data_class:
    make_file(os.path.join(val_data_root, num_class))

Split the data:

for num_class in data_class:
    num_class_path = os.path.join(data_path, num_class)
    images = os.listdir(num_class_path)
    num = len(images)
    # Randomly pick the files for the validation set; a set makes the membership test fast
    val_index = set(random.sample(images, k=int(num * split_rate)))
    for index, image in enumerate(images):
        data_image_path = os.path.join(num_class_path, image)
        if image in val_index:
            # Copy files assigned to the validation set into the matching class folder
            val_new_path = os.path.join(val_data_root, num_class)
            copy(data_image_path, val_new_path)
        else:
            # Copy files assigned to the training set into the matching class folder
            train_new_path = os.path.join(train_data_root, num_class)
            copy(data_image_path, train_new_path)
        # Progress bar, updated after every file; printing inside the loop also
        # avoids using index when a class folder is empty
        print("\r[{}] splitting [{}/{}]".format(num_class, index + 1, num), end="")
    print()
print("Split complete")
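
The steps above can also be wrapped in a function, which makes the split easy to try on a small throwaway dataset before running it on real data. The following is a sketch under stated assumptions, not the article's code: the function name split_dataset, its parameters, and the demo folder layout are all hypothetical.

```python
import os
import random
import tempfile
from shutil import copy, rmtree


def split_dataset(data_path: str, data_root: str, split_rate: float = 0.1, seed: int = 0) -> None:
    """Copy each class folder under data_path into data_root/train and data_root/val."""
    rng = random.Random(seed)
    for cla in sorted(os.listdir(data_path)):
        cla_path = os.path.join(data_path, cla)
        if not os.path.isdir(cla_path):
            continue  # skip stray files next to the class folders
        images = sorted(os.listdir(cla_path))
        val_names = set(rng.sample(images, k=int(len(images) * split_rate)))
        # Recreate the two output folders for this class, discarding old contents.
        for subset in ("train", "val"):
            target = os.path.join(data_root, subset, cla)
            if os.path.exists(target):
                rmtree(target)
            os.makedirs(target)
        for image in images:
            subset = "val" if image in val_names else "train"
            copy(os.path.join(cla_path, image), os.path.join(data_root, subset, cla))


def demo_split() -> dict:
    # Hypothetical dataset: two classes with 20 dummy files each.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "out")
        for cla in ("a", "b"):
            os.makedirs(os.path.join(src, cla))
            for i in range(20):
                with open(os.path.join(src, cla, "{}.jpg".format(i)), "w") as f:
                    f.write("x")
        split_dataset(src, dst)
        # Count the files that actually landed in each output folder.
        return {
            cla: (len(os.listdir(os.path.join(dst, "train", cla))),
                  len(os.listdir(os.path.join(dst, "val", cla))))
            for cla in ("a", "b")
        }


print(demo_split())  # {'a': (18, 2), 'b': (18, 2)}
```

Each tuple is (training files, validation files), so 20 files at a ratio of 0.1 give 18 and 2.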

Complete code:

import os
from shutil import copy, rmtree
import random


def make_file(file_path: str):
    # If the folder already exists, delete it and recreate it empty
    if os.path.exists(file_path):
        rmtree(file_path)
    os.makedirs(file_path)


random.seed(0)  # fix the seed so the same split is produced on every run

# Put 10% of the dataset into the validation set
split_rate = 0.1
# Where the source dataset is stored; consider creating a data folder next to the
# script and placing the dataset to split inside it
data_path = r'D:\chengxu\data\caodi'
# Where the generated training and validation sets are written
data_root = r'D:\chengxu\data\cd'

data_class = [cla for cla in os.listdir(data_path)]
print("Data classes:")
print(data_class)

# Create the folder that holds the training set
train_data_root = os.path.join(data_root, "train")  # the training set folder is named train
make_file(train_data_root)
for num_class in data_class:
    # Create one folder per class
    make_file(os.path.join(train_data_root, num_class))

# Create the folder that holds the validation set
val_data_root = os.path.join(data_root, "val")  # the validation set folder is named val
make_file(val_data_root)
for num_class in data_class:
    # Create one folder per class
    make_file(os.path.join(val_data_root, num_class))

for num_class in data_class:
    num_class_path = os.path.join(data_path, num_class)
    images = os.listdir(num_class_path)
    num = len(images)
    # Randomly pick the files for the validation set; a set makes the membership test fast
    val_index = set(random.sample(images, k=int(num * split_rate)))
    for index, image in enumerate(images):
        data_image_path = os.path.join(num_class_path, image)
        if image in val_index:
            # Copy files assigned to the validation set into the matching class folder
            val_new_path = os.path.join(val_data_root, num_class)
            copy(data_image_path, val_new_path)
        else:
            # Copy files assigned to the training set into the matching class folder
            train_new_path = os.path.join(train_data_root, num_class)
            copy(data_image_path, train_new_path)
        # Progress bar, updated after every file
        print("\r[{}] splitting [{}/{}]".format(num_class, index + 1, num), end="")
    print()

print("Split complete")
