MNIST Handwritten Digit Recognition with Deep Learning (99% Accuracy, Code Included)

Updated: 2024-06-25 16:52:47  Author: 什么都不太會的研究生

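This article walks through recognizing the MNIST handwritten digit dataset with deep learning (99% accuracy, code included). I hope it serves as a useful reference; if you spot mistakes or anything I have overlooked, corrections are welcome.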

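1. MNIST dataset overview

1.1 Basic introduction

MNIST is probably the dataset most often used when learning deep learning.

It contains 70,000 images of handwritten digits: 60,000 training images and 10,000 test images. The training set is made up of digits written by 250 different people, half of them high school students and half of them staff (Census Bureau employees). The test set contains handwriting in the same proportions, and its writers are guaranteed to be different from those of the training set.

Each image is 28×28 pixels, and the dataset stores each image as a one-dimensional vector of 28×28 = 784 values.

Each image shows a handwritten digit from 0 to 9 in white on a black background. When stored, black is represented as 0 and the white strokes as floating-point values in the range 0 to 1.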
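To make the storage layout concrete, here is a minimal, dependency-free sketch of the two steps described above: scaling 8-bit pixel values into the 0 to 1 range and flattening a 2D image into a 1D vector. The tiny 2×3 "image" is made-up data standing in for a real 28×28 digit.

```python
# A tiny 2x3 "image" stands in for a 28x28 MNIST digit (hypothetical data).
raw = [
    [0, 128, 255],
    [0, 0, 64],
]

# Scale 8-bit pixel values (0-255) into floats in [0, 1].
scaled = [[pixel / 255 for pixel in row] for row in raw]

# Flatten row by row into a one-dimensional vector.
flat = [pixel for row in scaled for pixel in row]

print(len(flat))   # 6 for this toy image
print(28 * 28)     # 784 for a real MNIST image
```

The same idea applies to a real digit: 28 rows of 28 pixels become one vector of 784 values.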

1.2 Downloading the dataset

1) Download from the official site

The MNIST dataset can be downloaded from: http://yann.lecun.com/exdb/mnist/

The page lists four files:

Training images: train-images-idx3-ubyte.gz
Training labels: train-labels-idx1-ubyte.gz
Test images: t10k-images-idx3-ubyte.gz
Test labels: t10k-labels-idx1-ubyte.gz

Download these four files and place them in the folder where you need them. Do not decompress them; keep them exactly as downloaded.
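If you are curious what is inside these .gz files, the sketch below reads the header of a gzip-compressed IDX3 image file using only the standard library. The header layout (four big-endian 32-bit integers, with magic number 2051 for image files) comes from the published MNIST file format, not from this article. The helper name read_idx_image_header and the synthetic demo file are my own assumptions, used so that the sketch runs without downloading anything.

```python
import gzip
import os
import struct
import tempfile


def read_idx_image_header(path):
    """Read the 16-byte header of a gzip-compressed IDX3 image file."""
    with gzip.open(path, "rb") as f:
        # Four big-endian unsigned 32-bit integers: magic, count, rows, cols.
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
    if magic != 2051:  # 0x00000803 marks an unsigned-byte, 3-dimensional IDX file
        raise ValueError("not an IDX3 image file: magic={}".format(magic))
    return count, rows, cols


# Build a tiny synthetic file (2 blank 28x28 images) so the sketch runs offline.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo-images-idx3-ubyte.gz")
with gzip.open(demo_path, "wb") as f:
    f.write(struct.pack(">IIII", 2051, 2, 28, 28))
    f.write(bytes(2 * 28 * 28))

header = read_idx_image_header(demo_path)
print(header)  # (2, 28, 28)
```

Pointing the same function at train-images-idx3-ubyte.gz should report the image count and the 28×28 image size.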

2) Import via code

Run the code below from your project folder. It automatically checks whether the dataset already exists and downloads it if it does not.

# Download the dataset
from torchvision import datasets, transforms

train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True, transform=transforms.ToTensor())

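Parameters:

datasets.MNIST: the torchvision.datasets.MNIST class that ships with PyTorch's torchvision package; it loads the dataset
train=True: load the training split (train=False loads the test split)
transform: the preprocessing we define, applied to each image as it is loaded
download=True: download the dataset automatically if it is not found under the root directory (root)

When the data is downloaded automatically this way, the files end up under: <project folder>/data/MNIST/raw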

2. Code

2.1 Folder layout

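test: test images I drew myself
main: the main script
model: the trained model parameters, generated automatically
data: the dataset folder

2.2 Results

After about 14 epochs, the model's recognition accuracy exceeds 99%.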

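2.3 Code

1) Imports and preprocessing

I added quite a few comments while learning, and the code uses files that were already downloaded. If you use your own files, change the paths accordingly.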

import os
import matplotlib.pyplot as plt
import torch
from PIL import Image
from torch import nn
from torch.nn import Conv2d, Linear, ReLU
from torch.nn import MaxPool2d
from torchvision import transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader


# Dataset: base class for building a dataset; __init__ initializes the data and labels
# __getitem__: returns one sample and its label
# __len__: returns the size of the dataset
# DataLoader: data loading class; batches and iterates over an already constructed Dataset
# torchvision: vision library with pretrained models, dataset loaders, and image transforms (crop, rotate, etc.)
# torchtext: text-processing toolkit that converts different kinds of files into datasets

# Preprocessing: Compose chains the transform steps together
transform = transforms.Compose([
    transforms.ToTensor(),  # convert grayscale pixel values (0~255) to a Tensor in 0~1 for later processing
    # transforms.Normalize((0.1307,), (0.3081,)),  # standardize to roughly mean 0, variance 1; mean: per-channel mean, std: per-channel standard deviation, inplace: whether to operate in place
])
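The commented-out Normalize step applies (x - mean) / std to every pixel in each channel. The sketch below reproduces that arithmetic with plain tensor operations instead of calling transforms.Normalize, using the two constants from the comment above. The 2×2 "image" is made-up data.

```python
import torch

mean, std = 0.1307, 0.3081  # the values from the commented-out Normalize line

# A fake 1x2x2 "image" already scaled to [0, 1], as ToTensor would produce.
image = torch.tensor([[[0.0, 1.0], [0.1307, 0.5]]])

# The per-channel formula Normalize applies: (x - mean) / std.
normalized = (image - mean) / std

print(normalized)
```

A black pixel (0) maps to about -0.42 and a pure white pixel (1) to about 2.82, so the inputs are no longer confined to the 0 to 1 range.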

2) Loading the dataset

# Load the datasets
# Training set
train_data = MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_data, batch_size=64, shuffle=True)
# transform: the preprocessing rules applied to the loaded data; shuffle: whether to randomize the sample order
# Test set
test_data = MNIST(root="./data", train=False, transform=transform, download=True)
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=False)  # order does not affect accuracy

train_data_size = len(train_data)
test_data_size = len(test_data)
print("Training set size: {}".format(train_data_size))
print("Test set size: {}".format(test_data_size))

3) Building the model

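The model consists of two convolutional layers, two pooling layers, and three fully connected layers. ReLU is the activation function, applied after each pooling step.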
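It helps to trace the feature-map sizes by hand to see why the first fully connected layer takes 320 inputs. The sketch below assumes the layer settings used in this article's model: 5×5 convolutions with stride 1 and no padding, 2×2 max pooling, and 20 output channels from the second convolution. The helper conv_out is a hypothetical name introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window over a square input."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # MNIST input: 1 x 28 x 28
size = conv_out(size, kernel=5)             # conv1 (5x5, no padding) -> 24
size = conv_out(size, kernel=2, stride=2)   # maxpool1 (2x2)          -> 12
size = conv_out(size, kernel=5)             # conv2 (5x5, no padding) -> 8
size = conv_out(size, kernel=2, stride=2)   # maxpool2 (2x2)          -> 4

flat_features = 20 * size * size            # 20 output channels from conv2
print(size, flat_features)                  # 4 320
```

If you change a kernel size or add padding, rerun the trace and adjust the input size of the first Linear layer to match.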

class MnistModel(nn.Module):
    def __init__(self):
        super(MnistModel, self).__init__()
        self.conv1 = Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1, padding=0)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(in_channels=10, out_channels=20, kernel_size=5, stride=1, padding=0)
        self.maxpool2 = MaxPool2d(2)
        self.linear1 = Linear(320, 128)
        self.linear2 = Linear(128, 64)
        self.linear3 = Linear(64, 10)
        self.relu = ReLU()

    def forward(self, x):
        x = self.relu(self.maxpool1(self.conv1(x)))
        x = self.relu(self.maxpool2(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.linear3(x)

        return x

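# Loss function: CrossEntropyLoss
model = MnistModel()  # instantiate the model
criterion = nn.CrossEntropyLoss()   # cross-entropy loss, equivalent to Softmax + Log + NLLLoss
# Softmax turns the model outputs into probabilities over the 10 classes; Log turns products into sums, which simplifies computation and preserves monotonicity
# NLLLoss computes the loss; targets are class indices, so no manual one-hot encoding is needed

# SGD optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.14)  # lr: learning rate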
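The comment about CrossEntropyLoss being Softmax + Log + NLLLoss can be checked numerically. This small sketch uses random, made-up logits and compares the one-step loss with the two-step version from torch.nn.functional.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # fake raw model outputs: 4 samples, 10 classes
target = torch.tensor([3, 0, 9, 1])    # class indices, not one-hot vectors

loss_direct = F.cross_entropy(logits, target)
loss_two_step = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_direct, loss_two_step))  # True
```

This is also why the model's forward pass returns raw scores without a softmax at the end: the loss function applies it internally.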

4) Training the model

During training, the parameters are saved to a pkl file every 100 batches. If a pkl file already exists at that path, the next run loads the saved parameters and continues training from there. On the first run there are no saved parameters yet; the file is created automatically.

# Model training
def train():
    for index, data in enumerate(train_loader):  # fetch a batch of training data and its labels
        inputs, target = data   # inputs: batch of images, target: labels
        y_predict = model(inputs)  # forward pass
        loss = criterion(y_predict, target)
        optimizer.zero_grad()  # reset the gradients
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters
        if index % 100 == 0:  # every 100 batches, save the model and print the loss
            torch.save(model.state_dict(), "./model/model.pkl")   # save the model parameters
            torch.save(optimizer.state_dict(), "./model/optimizer.pkl")
            print("Batch: {}, loss: {}".format(index, loss.item()))

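5) Loading the model

On the first run, an empty model folder must already exist, because torch.save does not create missing directories.

# Load the model
if os.path.exists('./model/model.pkl'):
    model.load_state_dict(torch.load("./model/model.pkl"))  # load the saved model parameters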
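The training code saves both the model and the optimizer state, while the loading code above restores only the model. If you also want to resume the optimizer, one possible approach is sketched below. It is an illustration under my own assumptions, not the article's method: it uses a small nn.Linear as a stand-in model, adds momentum so the optimizer has state worth saving, and writes a single hypothetical checkpoint.pt file to a temporary folder.

```python
import os
import tempfile

import torch
from torch import nn

# A small stand-in model; the article's MnistModel would be used the same way.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One dummy step so the optimizer has state (momentum buffers) worth saving.
model(torch.randn(8, 4)).sum().backward()
optimizer.step()

ckpt_dir = os.path.join(tempfile.mkdtemp(), "model")
os.makedirs(ckpt_dir, exist_ok=True)   # create the folder before saving
ckpt_path = os.path.join(ckpt_dir, "checkpoint.pt")  # hypothetical file name

torch.save({"model": model.state_dict(), "optimizer": optimizer.state_dict()}, ckpt_path)

# Restore into fresh objects, as a later run would.
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.SGD(restored_model.parameters(), lr=0.1, momentum=0.9)
if os.path.exists(ckpt_path):
    checkpoint = torch.load(ckpt_path)
    restored_model.load_state_dict(checkpoint["model"])
    restored_optimizer.load_state_dict(checkpoint["optimizer"])

print(torch.equal(model.weight, restored_model.weight))
```

With plain SGD and no momentum, as used in this article, the optimizer holds almost no state, so restoring only the model weights gives nearly the same result.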

6) Testing the model

# Model testing
def test():
    correct = 0     # number of correct predictions
    total = 0   # total number of samples
    with torch.no_grad():   # no gradients are needed during testing
        for data in test_loader:
            inputs, target = data
            output = model(inputs)   # 10 scores (logits) per image; the highest score is the predicted digit
            max_score, predict = torch.max(output, dim=1)    # returns a tuple: the maximum value and its index
            total += target.size(0)  # target has shape (batch_size,); size(0) gives the batch size
            correct += (predict == target).sum().item()  # predict and target both have shape (batch_size,); sum counts the matches
        print("Test accuracy: %.6f" % (correct / total))
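The model outputs raw scores (logits), not probabilities. The sketch below shows, on a few made-up logits, how softmax turns them into probabilities and how the accuracy count in test() works.

```python
import torch

# Fake logits for 3 samples and 4 classes (hypothetical values).
logits = torch.tensor([
    [2.0, 0.5, 0.1, -1.0],
    [0.0, 3.0, 0.2, 0.1],
    [1.0, 0.9, 0.8, 2.5],
])
target = torch.tensor([0, 1, 2])

probs = torch.softmax(logits, dim=1)       # each row now sums to 1
confidence, predict = probs.max(dim=1)     # top probability and its class index

correct = (predict == target).sum().item()
accuracy = correct / target.size(0)

print(predict.tolist())   # [0, 1, 3]
print(accuracy)
```

Softmax does not change which class has the highest score, so accuracy is the same whether you take the maximum over logits or over probabilities.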

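7) Function for recognizing your own handwritten digit image (optional)

This part loads the trained pkl model to test your own data. To test your own handwritten image you therefore need a trained pkl file, and you should not call train() or test().

Note: the image must also be a white digit on a black background at 28*28 pixels; otherwise it will not be recognized.

def test_mydata():
    image = Image.open('./test/test_two.png')   # load the custom handwritten image
    image = image.resize((28, 28))   # resize to 28*28
    image = image.convert('L')  # convert to grayscale
    transform = transforms.ToTensor()
    image = transform(image)
    image = image.view(1, 1, 28, 28)   # add the batch dimension: (1, 1, 28, 28)
    with torch.no_grad():   # inference only, no gradients needed
        output = model(image)
    probs = torch.softmax(output, dim=1)   # convert logits to probabilities
    probability, predict = torch.max(probs, dim=1)
    print("Predicted digit: %d, probability: %.2f" % (predict.item(), probability.item()))
    plt.title("Predicted digit: {}".format(predict.item()))
    plt.imshow(image.squeeze(), cmap='gray')
    plt.show()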
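If your own picture is a dark digit on light paper, it needs to be inverted before it resembles the MNIST data. The sketch below shows one way this could be done with Pillow. The helper to_mnist_style and its brightness threshold of 127 are my own assumptions, and the "photo" is a synthetic image drawn in code so that the example runs without any files.

```python
from PIL import Image, ImageDraw, ImageOps, ImageStat


def to_mnist_style(image):
    """Convert a PIL image to 28x28 grayscale with a dark background."""
    image = image.convert("L").resize((28, 28))
    # Heuristic (an assumption, not from the article): a bright average
    # means a light background, so invert to get white strokes on black.
    if ImageStat.Stat(image).mean[0] > 127:
        image = ImageOps.invert(image)
    return image


# Synthetic stand-in for a photo of a digit: black stroke on white paper.
photo = Image.new("RGB", (56, 56), "white")
ImageDraw.Draw(photo).line([(28, 8), (28, 48)], fill="black", width=6)

digit = to_mnist_style(photo)
print(digit.size, digit.getpixel((0, 0)))
```

The returned image can then go through transforms.ToTensor() exactly as in test_mydata().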

8) Training and testing on the MNIST data

I modified the messages printed during training. The number of epochs is set to 15 here, and the saved pkl model parameters are updated on every run. For more detailed training output, see the relevant tutorials.

# Train and test
if __name__ == '__main__':
    for i in range(15):  # run 15 epochs of training and testing
        print("-------- Epoch {} started --------".format(i + 1))
        train()
        test()

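9) Testing your own handwritten digit image (optional)

This part works together with the test_mydata() function: it loads the trained pkl model to test your own data. You therefore need a trained pkl file, and you should not call train() or test().

Note: as before, the image must be a white digit on a black background at 28*28 pixels; otherwise it will not be recognized.

# Main entry point for testing your own image
if __name__ == '__main__':
    test_mydata()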

Put all the code into your editor in order, install the required packages, and it should run without problems.

The complete code is below:

import os
import matplotlib.pyplot as plt
import torch
from PIL import Image
from torch import nn
from torch.nn import Conv2d, Linear, ReLU
from torch.nn import MaxPool2d
from torchvision import transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader


# Dataset: base class for building a dataset; __init__ initializes the data and labels
# __getitem__: returns one sample and its label
# __len__: returns the size of the dataset
# DataLoader: data loading class; batches and iterates over an already constructed Dataset
# torchvision: vision library with pretrained models, dataset loaders, and image transforms (crop, rotate, etc.)
# torchtext: text-processing toolkit that converts different kinds of files into datasets

# Preprocessing: Compose chains the transform steps together
transform = transforms.Compose([
    transforms.ToTensor(),  # convert grayscale pixel values (0~255) to a Tensor in 0~1 for later processing
    # transforms.Normalize((0.1307,), (0.3081,)),  # standardize to roughly mean 0, variance 1; mean: per-channel mean, std: per-channel standard deviation, inplace: whether to operate in place
])

# Normalize performs: image = (image - mean) / std
# input[channel] = (input[channel] - mean[channel]) / std[channel]

# Load the datasets
# Training set
train_data = MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = DataLoader(dataset=train_data, batch_size=64, shuffle=True)
# transform: the preprocessing rules applied to the loaded data; shuffle: whether to randomize the sample order
# Test set
test_data = MNIST(root="./data", train=False, transform=transform, download=True)
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=False)  # order does not affect accuracy

train_data_size = len(train_data)
test_data_size = len(test_data)
print("Training set size: {}".format(train_data_size))
print("Test set size: {}".format(test_data_size))
# print(test_data)
# print(train_data)


class MnistModel(nn.Module):
    def __init__(self):
        super(MnistModel, self).__init__()
        self.conv1 = Conv2d(in_channels=1, out_channels=10, kernel_size=5, stride=1, padding=0)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(in_channels=10, out_channels=20, kernel_size=5, stride=1, padding=0)
        self.maxpool2 = MaxPool2d(2)
        self.linear1 = Linear(320, 128)
        self.linear2 = Linear(128, 64)
        self.linear3 = Linear(64, 10)
        self.relu = ReLU()

    def forward(self, x):
        x = self.relu(self.maxpool1(self.conv1(x)))
        x = self.relu(self.maxpool2(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.linear1(x)
        x = self.linear2(x)
        x = self.linear3(x)

        return x


# Loss function: CrossEntropyLoss
model = MnistModel()  # instantiate the model
criterion = nn.CrossEntropyLoss()   # cross-entropy loss, equivalent to Softmax + Log + NLLLoss
# Softmax turns the model outputs into probabilities over the 10 classes; Log turns products into sums, which simplifies computation and preserves monotonicity
# NLLLoss computes the loss; targets are class indices, so no manual one-hot encoding is needed

# SGD optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(model.parameters(), lr=0.14)  # lr: learning rate


# Model training
def train():
    for index, data in enumerate(train_loader):  # fetch a batch of training data and its labels
        inputs, target = data   # inputs: batch of images, target: labels
        y_predict = model(inputs)  # forward pass
        loss = criterion(y_predict, target)
        optimizer.zero_grad()  # reset the gradients
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters
        if index % 100 == 0:  # every 100 batches, save the model and print the loss
            torch.save(model.state_dict(), "./model/model.pkl")   # save the model parameters
            torch.save(optimizer.state_dict(), "./model/optimizer.pkl")
            print("Batch: {}, loss: {}".format(index, loss.item()))

# Load the model
if os.path.exists('./model/model.pkl'):
    model.load_state_dict(torch.load("./model/model.pkl"))  # load the saved model parameters


# Model testing
def test():
    correct = 0     # number of correct predictions
    total = 0   # total number of samples
    with torch.no_grad():   # no gradients are needed during testing
        for data in test_loader:
            inputs, target = data
            output = model(inputs)   # 10 scores (logits) per image; the highest score is the predicted digit
            max_score, predict = torch.max(output, dim=1)    # returns a tuple: the maximum value and its index
            total += target.size(0)  # target has shape (batch_size,); size(0) gives the batch size
            correct += (predict == target).sum().item()  # predict and target both have shape (batch_size,); sum counts the matches
        print("Test accuracy: %.6f" % (correct / total))


# Train and test
if __name__ == '__main__':
    for i in range(15):  # run 15 epochs of training and testing
        print("-------- Epoch {} started --------".format(i + 1))
        train()
        test()


def test_mydata():
    image = Image.open('./test/test_two.png')   # load the custom handwritten image
    image = image.resize((28, 28))   # resize to 28*28
    image = image.convert('L')  # convert to grayscale
    transform = transforms.ToTensor()
    image = transform(image)
    image = image.view(1, 1, 28, 28)   # add the batch dimension: (1, 1, 28, 28)
    with torch.no_grad():   # inference only, no gradients needed
        output = model(image)
    probs = torch.softmax(output, dim=1)   # convert logits to probabilities
    probability, predict = torch.max(probs, dim=1)
    print("Predicted digit: %d, probability: %.2f" % (predict.item(), probability.item()))
    plt.title("Predicted digit: {}".format(predict.item()))
    plt.imshow(image.squeeze(), cmap='gray')
    plt.show()

# Main entry point for testing your own image
# if __name__ == '__main__':
#     test_mydata()