An Analysis of torch.utils.data in PyTorch (Recommended)
PyTorch provides a toolkit for working with datasets, torch.utils.data. This article briefly introduces the structure of that package. The content below is translated and compiled from the official PyTorch documentation.
Overview
The core of PyTorch's dataset-handling package torch.utils.data is the DataLoader class. Its constructor signature is

    DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None, *, prefetch_factor=2, persistent_workers=False)

It constructs an iterable object, loader, which represents the dataset after "processing". The processing is determined by the constructor arguments and includes:
- Setting the order in which data is loaded (via the shuffle or sampler argument)
- Batching the data (via the batch_size, batch_sampler, collate_fn, and drop_last arguments)
- Multi-process loading, memory pinning, and so on (not covered here)
Once a DataLoader object loader has been constructed, data can be loaded with

    for data in loader:
        ...  # data is one group of samples from the dataset, already converted to Tensor

By default PyTorch performs auto-batching, in which case data corresponds to one batch.
loader can be thought of as a generator whose definition depends on the case (the concepts that appear here are explained in the sections that follow):

    ## auto-batching enabled
    # map-style dataset
    for indices in batch_sampler:
        yield collate_fn([dataset[i] for i in indices])

    # iterable-style dataset
    dataset_iter = iter(dataset)
    for indices in batch_sampler:
        yield collate_fn([next(dataset_iter) for _ in indices])

    ## auto-batching disabled (batch_size=None and batch_sampler=None)
    # map-style dataset
    for index in sampler:
        yield collate_fn(dataset[index])

    # iterable-style dataset
    for data in iter(dataset):
        yield collate_fn(data)
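To make the map-style, auto-batching case concrete, here is a minimal pure-Python sketch of the same control flow. It is not PyTorch's implementation: batch_indices and simple_loader are hypothetical helper names introduced only for illustration, and list stands in for a real collate_fn.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def batch_indices(indices: Iterable[int], batch_size: int, drop_last: bool) -> Iterator[List[int]]:
    """Group a stream of indices into lists of batch_size (a stand-in for a batch sampler)."""
    batch: List[int] = []
    for idx in indices:
        batch.append(idx)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, shorter batch unless the caller asked to drop it.
    if batch and not drop_last:
        yield batch


def simple_loader(dataset: Sequence, batch_sampler: Iterable[List[int]], collate_fn: Callable = list):
    """Map-style case with auto-batching: fetch every index, then collate the samples."""
    for indices in batch_sampler:
        yield collate_fn([dataset[i] for i in indices])


dataset = ["a", "b", "c", "d", "e"]
batches = list(simple_loader(dataset, batch_indices(range(len(dataset)), 2, drop_last=False)))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

The sampler decides which indices are visited and in what order, the batch sampler groups them, and collate_fn turns each group of samples into one batch object.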
Datasets
The required constructor argument dataset represents a dataset. There are two main kinds:
- map-style datasets: subclasses of torch.utils.data.Dataset that override __getitem__ and __len__, so that any item in the dataset can be accessed at random
- iterable-style datasets: subclasses of torch.utils.data.IterableDataset, which are iterable objects
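The two kinds of dataset can be contrasted with a small sketch. The class names SquaresDataset and CountingStream are made up for this illustration. Both are fed to a DataLoader with its default settings apart from batch_size.

```python
import torch
from torch.utils.data import DataLoader, Dataset, IterableDataset


class SquaresDataset(Dataset):
    """Map-style: supports len() and random access by integer index."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        return torch.tensor(index * index)


class CountingStream(IterableDataset):
    """Iterable-style: samples are produced one after another, with no indexing."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))


map_batches = [b.tolist() for b in DataLoader(SquaresDataset(5), batch_size=2)]
stream_batches = [b.tolist() for b in DataLoader(CountingStream(5), batch_size=2)]
print(map_batches)     # [[0, 1], [4, 9], [16]]
print(stream_batches)  # [[0, 1], [2, 3], [4]]
```

With the map-style dataset the loader chooses the indices. With the iterable-style dataset it simply consumes the stream in the order the dataset produces it.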
Data loading order
Defining a sampler manually
The loading order can be set manually through the sampler argument. A sampler is an iterable whose successive values are the keys/indices of the next items to load. It should be an instance of a subclass of the generic class torch.utils.data.Sampler[int] that overrides __iter__ and __len__. Specifically:
- The constructor __init__(self, data_source, *args) takes as an argument a dataset data_source that implements __len__
- __iter__ returns an iterator over integers; each value it produces is the key/index of the next item to load
- __len__ returns the total number of items to load
Note, however, that a sampler can only be defined for map-style datasets, because iterable-style datasets do not necessarily support random access.
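As an illustration of the three methods listed above, the following sketch defines a sampler that walks a dataset backwards. ReverseSampler is a hypothetical name, not a PyTorch class, and a plain list stands in for a map-style dataset.

```python
from torch.utils.data import DataLoader, Sampler


class ReverseSampler(Sampler[int]):
    """Hypothetical sampler that yields indices from the last item to the first."""

    def __init__(self, data_source):
        # Only len(data_source) is needed to generate the indices.
        self.data_source = data_source

    def __iter__(self):
        return iter(range(len(self.data_source) - 1, -1, -1))

    def __len__(self):
        return len(self.data_source)


dataset = [10, 20, 30, 40]  # a list supports indexing and len(), so it can act as a map-style dataset
loader = DataLoader(dataset, sampler=ReverseSampler(dataset), batch_size=2)
reversed_batches = [batch.tolist() for batch in loader]
print(reversed_batches)  # [[40, 30], [20, 10]]
```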
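Using built-in samplers
The module torch.utils.data.sampler defines several built-in samplers, which are usually sufficient. When the sampler argument is omitted, shuffle=False selects SequentialSampler, which loads the whole dataset in order, and shuffle=True selects RandomSampler, which loads the whole dataset in a randomly shuffled order. Note that sampler and shuffle are mutually exclusive: passing a custom sampler together with shuffle=True raises an error.
For the other samplers, see the module's source code.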
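The behaviour of the two default samplers can be checked directly, as in this small sketch. The variable names are illustrative, and the final block only demonstrates that a sampler combined with shuffle=True is rejected.

```python
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

data_source = list(range(6))

sequential_order = list(SequentialSampler(data_source))
random_order = list(RandomSampler(data_source))

print(sequential_order)                          # [0, 1, 2, 3, 4, 5]
print(sorted(random_order) == sequential_order)  # True: a permutation of the same indices

# Passing a sampler together with shuffle=True is rejected by the constructor.
try:
    DataLoader(data_source, sampler=SequentialSampler(data_source), shuffle=True)
    conflict_detected = False
except ValueError:
    conflict_detected = True
print(conflict_detected)  # True
```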
Batching data
Training a neural network usually requires splitting the data into mini-batches. PyTorch provides auto-batching out of the box, and batching can also be customized through the batch_size, batch_sampler, drop_last, and collate_fn arguments.
Using a batch sampler
Once batching is in effect, PyTorch loads several items at a time (batch_size of them), so a batch sampler takes the place of the ordinary sampler.
A custom batch sampler can be supplied through the batch_sampler argument. A batch_sampler is typically an instance of torch.utils.data.BatchSampler. In the PyTorch source code, this class inherits from Sampler[List[int]] and wraps a sampler.
- The constructor signature is __init__(self, sampler, batch_size: int, drop_last: bool), where
  - sampler is an iterable, the sampler being wrapped
  - batch_size is the number of items in each batch
  - drop_last says whether to drop the final batch when it has fewer than batch_size items
- __iter__ returns an iterator; each value it produces is a List[int] holding the keys/indices of the next batch
- __len__ returns the total number of batches
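The effect of batch_size and drop_last is easiest to see by listing what a BatchSampler yields. The sketch below wraps a SequentialSampler over ten indices; the variable names are illustrative.

```python
from torch.utils.data import BatchSampler, SequentialSampler

sampler = SequentialSampler(range(10))

keep_last = BatchSampler(sampler, batch_size=3, drop_last=False)
drop_last = BatchSampler(sampler, batch_size=3, drop_last=True)

keep_batches = list(keep_last)
drop_batches = list(drop_last)

print(keep_batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
print(drop_batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(len(keep_last), len(drop_last))  # 4 3
```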
Please note:
- If a custom batch_sampler is given, the sampler, shuffle, batch_size, and drop_last arguments must not be specified as well
- If batch_sampler is not given but batch_size is not None, the DataLoader constructor automatically builds a batch sampler from the custom sampler (or the built-in sampler selected by shuffle) together with the batch_size and drop_last arguments
- If batch_sampler is not given and batch_size is set to None, auto-batching is disabled and each load yields a single item.
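The difference between enabled and disabled auto-batching shows up in the shape of what the loader yields. This is a small sketch on a toy TensorDataset; the variable names are illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(4))  # four scalar samples: 0, 1, 2, 3

# Auto-batching enabled: each iteration yields a batch of two samples.
batched = [x for (x,) in DataLoader(dataset, batch_size=2)]

# Auto-batching disabled: each iteration yields one individual sample.
unbatched = [x for (x,) in DataLoader(dataset, batch_size=None)]

print([tuple(x.shape) for x in batched])    # [(2,), (2,)]
print([tuple(x.shape) for x in unbatched])  # [(), (), (), ()]
```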
Modifying collate_fn
The collate_fn argument specifies how each batch of data is preprocessed. The module torch.utils.data._utils defines two default collate_fn functions:
- default_convert: used when auto-batching is disabled; converts each individual item to torch.Tensor
- default_collate: used when auto-batching is enabled; collates each batch into torch.Tensor
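A custom collate_fn is useful when the default collation cannot stack the samples, for example when sequences have different lengths. The sketch below pads each batch to a common length. pad_collate is a hypothetical helper written for this illustration, and it relies on torch.nn.utils.rnn.pad_sequence.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Variable-length 1-D tensors; a list of tensors acts as a map-style dataset here.
sequences = [
    torch.tensor([1, 2, 3]),
    torch.tensor([4]),
    torch.tensor([5, 6]),
    torch.tensor([7, 8, 9, 10]),
]


def pad_collate(batch):
    """Hypothetical collate_fn: pad the sequences in a batch and return their true lengths."""
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths


loader = DataLoader(sequences, batch_size=2, collate_fn=pad_collate)
padded_batches = [(padded.tolist(), lengths.tolist()) for padded, lengths in loader]
print(padded_batches)
# [([[1, 2, 3], [4, 0, 0]], [3, 1]), ([[5, 6, 0, 0], [7, 8, 9, 10]], [2, 4])]
```

Returning the original lengths alongside the padded tensor is one common design choice, since downstream code often needs to mask out the padding.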
Extended reading: an introduction to torch.utils.data.Dataset in PyTorch, with practical examples
I. Preface
Training a model usually starts with getting the data in and preprocessing it. PyTorch provides two useful tools for this: the torch.utils.data.Dataset class and the torch.utils.data.DataLoader class.
The workflow is to first wrap the raw data in a torch.utils.data.Dataset, then pass that Dataset as an argument to torch.utils.data.DataLoader to obtain a data loader. On each iteration the loader returns one batch of data for the model to train on.
Combining torch.utils.data.Dataset with DataLoader in this way yields a data iterator that outputs one batch per training step and can apply preprocessing or data augmentation as the batch is produced.
This article focuses on torch.utils.data.Dataset. For DataLoader, see my other article: [PyTorch] torch.utils.data.DataLoader: a brief introduction and usage.
The end of this article gives practical code that uses torch.utils.data.Dataset together with DataLoader to process data.
II. What is torch.utils.data.Dataset
1. What is it for?
- PyTorch provides a data-reading mechanism made up of two classes: torch.utils.data.Dataset and DataLoader.
- To define a custom way of reading data, subclass torch.utils.data.Dataset and wrap the result in a DataLoader.
- torch.utils.data.Dataset is a class. By overriding the methods defined on it, many different ways of reading and preprocessing data can be implemented.
2. What does it look like?
The source code of torch.utils.data.Dataset:

    class Dataset(object):
        """An abstract class representing a Dataset.

        All other datasets should subclass it. All subclasses should override
        ``__len__``, that provides the size of the dataset, and ``__getitem__``,
        supporting integer indexing in range from 0 to len(self) exclusive.
        """

        def __getitem__(self, index):
            raise NotImplementedError

        def __len__(self):
            raise NotImplementedError

        def __add__(self, other):
            return ConcatDataset([self, other])
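The docstring, restated:
An abstract class representing a dataset.
All other datasets should subclass it. Every subclass should override __len__, which provides the size of the dataset, and __getitem__, which supports integer indexing from 0 (inclusive) to len(self) (exclusive).
In other words:
Dataset is an abstract dataset class and the parent of all other dataset classes (every other dataset class should inherit from it). A subclass must override __len__ and __getitem__: __len__ reports the size of the dataset, and __getitem__ retrieves an item by its index.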
III. Defining your own dataset class by subclassing torch.utils.data.Dataset
torch.utils.data.Dataset is the abstract class that represents a custom dataset. To define your own dataset class, subclass it and override at least two methods, __len__ and __getitem__:
- __len__ returns the size of the dataset
- __getitem__ returns the item at a given index
The following is a simple dataset that returns torch.Tensor items:
    from torch.utils.data import Dataset
    import torch


    class TensorDataset(Dataset):
        # TensorDataset inherits from Dataset and overrides __init__, __getitem__, and __len__.
        # It wraps a pair of tensors (data, target) as a tensor dataset:
        # a sample can be fetched by index, and len() gives the dataset size.
        def __init__(self, data_tensor, target_tensor):
            self.data_tensor = data_tensor
            self.target_tensor = target_tensor

        def __getitem__(self, index):
            return self.data_tensor[index], self.target_tensor[index]

        def __len__(self):
            return self.data_tensor.size(0)  # size(0) is the size of the tensor's first dimension


    # Generate data
    data_tensor = torch.randn(4, 3)  # 4 rows, 3 columns, drawn from a normal distribution
    print(data_tensor)
    target_tensor = torch.rand(4)  # 4 elements, drawn from a uniform distribution
    print(target_tensor)

    # Wrap the data in a Dataset (using the TensorDataset class above)
    tensor_dataset = TensorDataset(data_tensor, target_tensor)

    # Items can be accessed by index
    print('tensor_dataset[0]: ', tensor_dataset[0])

    # len() returns the number of items
    print('len of tensor_dataset: ', len(tensor_dataset))

Output (the values differ from run to run because the data is random):
tensor([[ 0.8618, 0.4644, -0.5929],
[ 0.9566, -0.9067, 1.5781],
[ 0.3943, -0.7775, 2.0366],
[-1.2570, -0.3859, -0.3542]])
tensor([0.1363, 0.6545, 0.4345, 0.9928])
tensor_dataset[0]: (tensor([ 0.8618, 0.4644, -0.5929]), tensor(0.1363))
len of tensor_dataset: 4
IV. Why define your own dataset class?
Because defining your own dataset class and overriding its methods lets you implement all kinds of custom data-reading schemes.
For example, __init__ can be overridden to read a CSV file with pd.read_csv:

    from torch.utils.data import Dataset
    import pandas as pd  # used to read the CSV data


    # Subclass Dataset to define a custom dataset class, mydataset
    class mydataset(Dataset):
        def __init__(self, csv_file):  # self is required; other parameters depend on the program, e.g. (self, *inputs)
            self.csv_data = pd.read_csv(csv_file)

        def __len__(self):
            return len(self.csv_data)

        def __getitem__(self, idx):
            data = self.csv_data.values[idx]
            return data


    data = mydataset('spambase.csv')
    print(data[3])
    print(len(data))

Output:
[0.000e+00 0.000e+00 0.000e+00 0.000e+00 6.300e-01 0.000e+00 3.100e-01
6.300e-01 3.100e-01 6.300e-01 3.100e-01 3.100e-01 3.100e-01 0.000e+00
0.000e+00 3.100e-01 0.000e+00 0.000e+00 3.180e+00 0.000e+00 3.100e-01
0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
1.370e-01 0.000e+00 1.370e-01 0.000e+00 0.000e+00 3.537e+00 4.000e+01
1.910e+02 1.000e+00]
4601
Key points:
- A custom dataset class must inherit from Dataset.
- It must implement the necessary magic methods:
  - __init__ reads the data file.
  - __getitem__ supports access to the data by index.
  - __len__ returns the size of the custom dataset, which makes later iteration convenient.
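The key points above can be tried without any file on disk. The following sketch reads a tiny made-up CSV from an in-memory string. CsvDataset, CSV_TEXT, and the column names are assumptions made for this illustration and are unrelated to spambase.csv.

```python
import io

import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset

# Made-up data: two feature columns and one label column.
CSV_TEXT = "f1,f2,label\n0.1,0.2,0\n0.3,0.4,1\n0.5,0.6,0\n"


class CsvDataset(Dataset):
    """Hypothetical CSV-backed dataset that returns (features, label) pairs."""

    def __init__(self, csv_source, label_column):
        # __init__ reads the data once and keeps it in memory as tensors.
        frame = pd.read_csv(csv_source)
        self.labels = torch.tensor(frame[label_column].to_numpy(), dtype=torch.long)
        self.features = torch.tensor(
            frame.drop(columns=label_column).to_numpy(), dtype=torch.float32
        )

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


dataset = CsvDataset(io.StringIO(CSV_TEXT), label_column="label")
features, labels = next(iter(DataLoader(dataset, batch_size=2)))
print(len(dataset), tuple(features.shape), labels.tolist())  # 3 (2, 2) [0, 1]
```

Converting to tensors once in __init__ keeps __getitem__ cheap. For files too large for memory, reading lazily inside __getitem__ would be the alternative.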
V. In practice: reading and iterating over a dataset with torch.utils.data.Dataset + DataLoader
Example 1
The dataset spambase.csv is the Spambase spam e-mail dataset from the UCI Machine Learning Repository. Each record has 57 features and 1 label.

    import torch.utils.data as Data
    import pandas as pd  # used to read the CSV data
    import torch


    # Subclass Dataset to define a custom dataset class, mydataset
    class mydataset(Data.Dataset):
        def __init__(self, csv_file):  # self is required; other parameters depend on the program, e.g. (self, *inputs)
            data_csv = pd.read_csv(csv_file)  # read the data
            # Drop the last column (the label); the CSV header names it '58'
            self.csv_data = data_csv.drop(columns='58')

        def __len__(self):
            return len(self.csv_data)

        def __getitem__(self, idx):
            data = self.csv_data.values[idx]
            return data


    data = mydataset('spambase.csv')
    x = torch.tensor(data[:5])  # the first five samples
    y = torch.tensor([1, 1, 1, 1, 1])  # labels
    torch_dataset = Data.TensorDataset(x, y)  # wrap the given tensors as a dataset
    loader = Data.DataLoader(  # draws batch_size samples from the dataset at a time
        dataset=torch_dataset,  # torch TensorDataset format
        batch_size=2,  # mini-batch size
        shuffle=True,  # whether to shuffle the data (shuffling is usually better)
        num_workers=2,  # load the data with 2 worker subprocesses
    )


    def show_batch():
        for step, (batch_x, batch_y) in enumerate(loader):
            print("step:{}, batch_x:{}, batch_y:{}".format(step, batch_x, batch_y))


    if __name__ == '__main__':  # guard needed on platforms that spawn worker processes (e.g. Windows)
        show_batch()

Output:
step:0, batch_x:tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 6.3000e-01, 0.0000e+00,
3.1000e-01, 6.3000e-01, 3.1000e-01, 6.3000e-01, 3.1000e-01, 3.1000e-01,
3.1000e-01, 0.0000e+00, 0.0000e+00, 3.1000e-01, 0.0000e+00, 0.0000e+00,
3.1800e+00, 0.0000e+00, 3.1000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 1.3500e-01, 0.0000e+00, 1.3500e-01, 0.0000e+00, 0.0000e+00,
3.5370e+00, 4.0000e+01, 1.9100e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 6.3000e-01, 0.0000e+00,
3.1000e-01, 6.3000e-01, 3.1000e-01, 6.3000e-01, 3.1000e-01, 3.1000e-01,
3.1000e-01, 0.0000e+00, 0.0000e+00, 3.1000e-01, 0.0000e+00, 0.0000e+00,
3.1800e+00, 0.0000e+00, 3.1000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 1.3700e-01, 0.0000e+00, 1.3700e-01, 0.0000e+00, 0.0000e+00,
3.5370e+00, 4.0000e+01, 1.9100e+02]], dtype=torch.float64), batch_y:tensor([1, 1])
step:1, batch_x:tensor([[2.1000e-01, 2.8000e-01, 5.0000e-01, 0.0000e+00, 1.4000e-01, 2.8000e-01,
2.1000e-01, 7.0000e-02, 0.0000e+00, 9.4000e-01, 2.1000e-01, 7.9000e-01,
6.5000e-01, 2.1000e-01, 1.4000e-01, 1.4000e-01, 7.0000e-02, 2.8000e-01,
3.4700e+00, 0.0000e+00, 1.5900e+00, 0.0000e+00, 4.3000e-01, 4.3000e-01,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
7.0000e-02, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 1.3200e-01, 0.0000e+00, 3.7200e-01, 1.8000e-01, 4.8000e-02,
5.1140e+00, 1.0100e+02, 1.0280e+03],
[6.0000e-02, 0.0000e+00, 7.1000e-01, 0.0000e+00, 1.2300e+00, 1.9000e-01,
1.9000e-01, 1.2000e-01, 6.4000e-01, 2.5000e-01, 3.8000e-01, 4.5000e-01,
1.2000e-01, 0.0000e+00, 1.7500e+00, 6.0000e-02, 6.0000e-02, 1.0300e+00,
1.3600e+00, 3.2000e-01, 5.1000e-01, 0.0000e+00, 1.1600e+00, 6.0000e-02,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 6.0000e-02, 0.0000e+00, 0.0000e+00,
1.2000e-01, 0.0000e+00, 6.0000e-02, 6.0000e-02, 0.0000e+00, 0.0000e+00,
1.0000e-02, 1.4300e-01, 0.0000e+00, 2.7600e-01, 1.8400e-01, 1.0000e-02,
9.8210e+00, 4.8500e+02, 2.2590e+03]], dtype=torch.float64), batch_y:tensor([1, 1])
step:2, batch_x:tensor([[ 0.0000, 0.6400, 0.6400, 0.0000, 0.3200, 0.0000, 0.0000,
0.0000, 0.0000, 0.0000, 0.0000, 0.6400, 0.0000, 0.0000,
0.0000, 0.3200, 0.0000, 1.2900, 1.9300, 0.0000, 0.9600,
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.7780, 0.0000, 0.0000, 3.7560, 61.0000,
278.0000]], dtype=torch.float64), batch_y:tensor([1])
There are 5 samples in total and batch_size is 2, so the data is split into three batches of sizes 2, 2, and 1.
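The batch count follows from ceil(5 / 2) = 3 when the last short batch is kept, and floor(5 / 2) = 2 when drop_last=True. The sketch below checks this on placeholder tensors shaped like the example (5 rows of 57 features); the zeros and ones are stand-in values, not the Spambase data.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data with the same shape as the example: 5 samples, 57 features, 1 label each.
dataset = TensorDataset(torch.zeros(5, 57), torch.ones(5, dtype=torch.long))

keep = DataLoader(dataset, batch_size=2, drop_last=False)
drop = DataLoader(dataset, batch_size=2, drop_last=True)

sizes_keep = [len(batch_y) for _, batch_y in keep]
sizes_drop = [len(batch_y) for _, batch_y in drop]

print(len(keep), sizes_keep)  # 3 [2, 2, 1]
print(len(drop), sizes_drop)  # 2 [2, 2]
print(math.ceil(5 / 2), 5 // 2)  # 3 2
```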
Example 2: advanced

    import torch.utils.data as Data
    import pandas as pd  # used to read the CSV data
    import numpy as np


    # Subclass Dataset to define a custom dataset class, mydataset
    class mydataset(Data.Dataset):
        def __init__(self, csv_file):  # self is required; other parameters depend on the program, e.g. (self, *inputs)
            # Read the data
            frame = pd.read_csv(csv_file)
            spam = frame[frame['58'] == 1]
            ham = frame[frame['58'] == 0]
            # Drop column '58'; drop returns a new DataFrame and leaves the original unchanged
            SpamNew = spam.drop(columns='58')
            HamNew = ham.drop(columns='58')
            # Data: stack the two arrays to form X
            self.csv_data = np.vstack([np.array(SpamNew), np.array(HamNew)])
            # Labels: build the label array y
            self.Label = np.array([1] * len(spam) + [0] * len(ham))

        def __len__(self):
            return len(self.csv_data)

        def __getitem__(self, idx):
            data = self.csv_data[idx]
            label = self.Label[idx]
            return data, label


    data = mydataset('spambase.csv')
    print(len(data))
    loader = Data.DataLoader(  # draws batch_size samples from the dataset at a time
        dataset=data,  # the custom map-style dataset defined above
        batch_size=460,  # mini-batch size
        shuffle=True,  # whether to shuffle the data (shuffling is usually better)
        num_workers=2,  # load the data with 2 worker subprocesses
    )


    def show_batch():
        for step, (batch_x, batch_y) in enumerate(loader):
            print("step:{}, batch_x:{}, batch_y:{}".format(step, batch_x, batch_y))


    if __name__ == '__main__':  # guard needed on platforms that spawn worker processes (e.g. Windows)
        show_batch()

Output:
4601
step:0, batch_x:tensor([[0.0000e+00, 2.4600e+00, 0.0000e+00, ..., 2.1420e+00, 1.0000e+01,
7.5000e+01],
[0.0000e+00, 0.0000e+00, 1.6000e+00, ..., 2.0650e+00, 1.2000e+01,
9.5000e+01],
[0.0000e+00, 0.0000e+00, 3.6000e-01, ..., 3.7220e+00, 2.0000e+01,
2.6800e+02],
...,
[7.7000e-01, 3.8000e-01, 7.7000e-01, ..., 1.4619e+01, 5.2500e+02,
9.2100e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0000e+00, 1.0000e+00,
5.0000e+00],
[4.0000e-01, 1.8000e-01, 3.2000e-01, ..., 3.3050e+00, 1.8100e+02,
1.6130e+03]], dtype=torch.float64), batch_y:tensor([0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1,
0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0,
1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0,
0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0,
1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0,
0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0,
0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1,
0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0,
0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1,
1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1,
0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1,
0, 1, 0, 1])
step:1, batch_x:tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0000e+00, 1.0000e+00,
2.0000e+00],
[4.9000e-01, 0.0000e+00, 7.4000e-01, ..., 3.9750e+00, 4.7000e+01,
4.8500e+02],
[0.0000e+00, 0.0000e+00, 7.1000e-01, ..., 4.0220e+00, 9.7000e+01,
5.4300e+02],
...,
[0.0000e+00, 1.4000e-01, 1.4000e-01, ..., 5.3310e+00, 8.0000e+01,
1.0290e+03],
[0.0000e+00, 0.0000e+00, 3.6000e-01, ..., 3.1760e+00, 5.1000e+01,
2.7000e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.1660e+00, 2.0000e+00,
7.0000e+00]], dtype=torch.float64), batch_y:tensor([0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0,
0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0,
1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0,
0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0,
1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0,
0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0,
1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1,
0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0,
0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1,
1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1,
1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1,
1, 0, 0, 0])
step:2, batch_x:tensor([[0.0000e+00, 0.0000e+00, 1.4700e+00, ..., 3.0000e+00, 3.3000e+01,
1.7700e+02],
[2.6000e-01, 4.6000e-01, 9.9000e-01, ..., 1.3235e+01, 2.7200e+02,
1.5750e+03],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.0450e+00, 6.0000e+00,
4.5000e+01],
...,
[4.0000e-01, 0.0000e+00, 0.0000e+00, ..., 1.1940e+00, 5.0000e+00,
1.2900e+02],
[2.6000e-01, 0.0000e+00, 0.0000e+00, ..., 1.8370e+00, 1.1000e+01,
1.5800e+02],
[5.0000e-02, 0.0000e+00, 1.0000e-01, ..., 3.7150e+00, 1.0700e+02,
1.3860e+03]], dtype=torch.float64), batch_y:tensor([1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0,
1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0,
0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0,
0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1,
0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0,
0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0,
1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1,
1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0,
1, 1, 0, 0])
step:3, batch_x:tensor([[2.6000e-01, 0.0000e+00, 5.3000e-01, ..., 2.6460e+00, 7.7000e+01,
1.7200e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.4280e+00, 5.0000e+00,
1.7000e+01],
[3.4000e-01, 0.0000e+00, 1.7000e+00, ..., 6.6700e+02, 1.3330e+03,
1.3340e+03],
...,
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0000e+00, 1.0000e+00,
7.0000e+00],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.7010e+00, 2.0000e+01,
1.8100e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 4.0000e+00, 1.1000e+01,
3.6000e+01]], dtype=torch.float64), batch_y:tensor([0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0,
1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1,
0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0,
1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0,
0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0,
1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0,
0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1,
0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1,
0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1,
1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0,
1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0,
1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0,
1, 0, 0, 1])
step:4, batch_x:tensor([[ 0.0000, 0.0000, 0.3100, ..., 5.7080, 138.0000, 274.0000],
[ 0.0000, 0.0000, 0.3400, ..., 2.2570, 17.0000, 158.0000],
[ 1.0400, 0.0000, 0.0000, ..., 1.0000, 1.0000, 17.0000],
...,
[ 0.0000, 0.0000, 0.0000, ..., 4.0000, 12.0000, 28.0000],
[ 0.3300, 0.0000, 0.0000, ..., 1.7880, 6.0000, 93.0000],
[ 0.0000, 14.2800, 0.0000, ..., 1.8000, 5.0000, 9.0000]],
dtype=torch.float64), batch_y:tensor([1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1,
0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1,
0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0,
1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0,
0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0,
1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1,
1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0,
0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0,
1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1,
0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1,
1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0,
1, 1, 0, 0])
step:5, batch_x:tensor([[7.0000e-01, 0.0000e+00, 1.0500e+00, ..., 1.1660e+00, 1.3000e+01,
1.8900e+02],
[0.0000e+00, 3.3600e+00, 1.9200e+00, ..., 6.1370e+00, 1.0700e+02,
1.7800e+02],
[5.4000e-01, 0.0000e+00, 1.0800e+00, ..., 5.4540e+00, 6.8000e+01,
1.8000e+02],
...,
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 3.8330e+00, 9.0000e+00,
2.3000e+01],
[6.0000e-02, 6.5000e-01, 7.1000e-01, ..., 4.7420e+00, 1.1700e+02,
1.3420e+03],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.6110e+00, 1.2000e+01,
4.7000e+01]], dtype=torch.float64), batch_y:tensor([1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1,
1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1,
0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0,
0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1,
1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1,
1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1,
0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0,
0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0,
1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0,
0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 1, 1, 1])
step:6, batch_x:tensor([[0.0000e+00, 1.4280e+01, 0.0000e+00, ..., 1.8000e+00, 5.0000e+00,
9.0000e+00],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.9280e+00, 1.5000e+01,
5.4000e+01],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0692e+01, 6.5000e+01,
1.3900e+02],
...,
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.5000e+00, 5.0000e+00,
2.4000e+01],
[7.6000e-01, 1.9000e-01, 3.8000e-01, ..., 3.7020e+00, 4.5000e+01,
1.0700e+03],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.0000e+00, 1.2000e+01,
8.8000e+01]], dtype=torch.float64), batch_y:tensor([0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1,
0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1,
0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1,
1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0,
0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0,
0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0,
0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1,
0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0,
0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1,
1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 0, 1, 0])
step:7, batch_x:tensor([[0.0000e+00, 2.7000e-01, 0.0000e+00, ..., 5.8020e+00, 4.3000e+01,
4.1200e+02],
[0.0000e+00, 3.5000e-01, 7.0000e-01, ..., 3.6390e+00, 6.1000e+01,
3.1300e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.5920e+00, 7.0000e+00,
1.2900e+02],
...,
[8.0000e-02, 1.6000e-01, 8.0000e-02, ..., 2.7470e+00, 8.6000e+01,
1.9950e+03],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.6130e+00, 1.1000e+01,
7.1000e+01],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.9110e+00, 1.5000e+01,
6.5000e+01]], dtype=torch.float64), batch_y:tensor([0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0,
0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0,
1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1,
0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0,
1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1,
1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0,
0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0,
1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1,
0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1,
0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1,
1, 0, 0, 0])
step:8, batch_x:tensor([[1.7000e-01, 0.0000e+00, 1.7000e-01, ..., 1.7960e+00, 1.2000e+01,
4.5800e+02],
[3.7000e-01, 0.0000e+00, 6.3000e-01, ..., 1.1810e+00, 4.0000e+00,
1.0400e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0000e+00, 1.0000e+00,
7.0000e+00],
...,
[2.3000e-01, 0.0000e+00, 4.7000e-01, ..., 2.4200e+00, 1.2000e+01,
3.3400e+02],
[0.0000e+00, 0.0000e+00, 1.2900e+00, ..., 1.3500e+00, 4.0000e+00,
2.7000e+01],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.3730e+00, 1.1000e+01,
1.6900e+02]], dtype=torch.float64), batch_y:tensor([1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1,
0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0,
1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1,
1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0,
0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0,
0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0,
1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0,
0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1,
1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1,
0, 0, 0, 0])
steop:9, batch_x:tensor([[0.0000e+00, 6.3000e-01, 0.0000e+00, ..., 2.2150e+00, 2.2000e+01,
1.1300e+02],
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.0000e+00, 1.0000e+00,
5.0000e+00],
[0.0000e+00, 0.0000e+00, 2.0000e-01, ..., 1.1870e+00, 1.1000e+01,
1.1400e+02],
...,
[0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 2.3070e+00, 1.6000e+01,
3.0000e+01],
[5.1000e-01, 4.3000e-01, 2.9000e-01, ..., 6.5900e+00, 7.3900e+02,
2.3330e+03],
[6.8000e-01, 6.8000e-01, 6.8000e-01, ..., 2.4720e+00, 9.0000e+00,
8.9000e+01]], dtype=torch.float64), batch_y:tensor([0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0,
0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0,
0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0,
0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0,
0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1,
0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1,
0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0,
1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0,
1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0,
1, 1, 1, 1])
steop:10, batch_x:tensor([[0.0000e+00, 2.5000e-01, 7.5000e-01, 0.0000e+00, 1.0000e+00, 2.5000e-01,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 2.5000e-01,
1.2500e+00, 0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 1.2500e+00,
2.5100e+00, 0.0000e+00, 1.7500e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00,
0.0000e+00, 0.0000e+00, 2.5000e-01, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 4.2000e-02, 0.0000e+00, 0.0000e+00,
1.2040e+00, 7.0000e+00, 1.1800e+02]], dtype=torch.float64), batch_y:tensor([0])
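There are 4601 samples in total. With batch_size = 460 they are split into 11 batches: the first 10 batches (steps 0 to 9) contain 460 samples each, and the last batch (step 10) contains the single remaining sample.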
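The batch count above can be checked with a little arithmetic. The following is a minimal sketch in plain Python, not part of PyTorch: the helper name `batch_sizes` is hypothetical, and it only mimics how sequential batching splits a dataset, assuming the `drop_last` parameter behaves as documented (the final incomplete batch is dropped when it is `True`).

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of each batch produced by simple sequential batching.

    Hypothetical helper for illustration only; it does not call PyTorch.
    """
    # Number of full batches, and how many samples are left over
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    # Keep the final, smaller batch unless drop_last is requested
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


sizes = batch_sizes(4601, 460)
print(len(sizes), sizes[0], sizes[-1])              # 11 460 1
print(len(batch_sizes(4601, 460, drop_last=True)))  # 10
```

If a trailing batch of size 1 is undesirable (for example, for layers that need more than one sample per batch), passing `drop_last=True` to `DataLoader` would leave only the 10 full batches.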