One-Click Cleanup of Duplicate and Useless Files in Baidu Cloud Drive with Python

Updated: 2022-08-09 10:07:42  Author: Mr数据杨

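Is your Baidu Cloud Drive nearly full, with no tool at hand to weed out the mass of duplicate and useless files? This article shows a simple Python approach: export the drive's directory listing and use it to work out which files can go.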

Getting the cloud drive cache directory

Use Everything to locate the cloud drive's cache db file, then copy it into the same directory as the script.
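If you would rather do this step from Python than with Everything, the search and copy can be sketched as below. This is only an illustration: the helper names `find_cache_files` and `copy_newest_cache` are hypothetical, the file name is the one the script in this article opens, and the folder to search is something you have to supply yourself, since the article does not say where the client keeps its cache.

```python
import shutil
from pathlib import Path

# File name taken from the script in this article
CACHE_NAME = "BaiduYunCacheFileV0.db"


def find_cache_files(search_root, name=CACHE_NAME):
    """Return every file called `name` under search_root, newest first."""
    matches = [p for p in Path(search_root).rglob(name) if p.is_file()]
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)


def copy_newest_cache(search_root, dest_dir="."):
    """Copy the most recently modified cache file into dest_dir."""
    matches = find_cache_files(search_root)
    if not matches:
        raise FileNotFoundError(f"{CACHE_NAME} not found under {search_root}")
    # copy2 returns the path of the new file inside dest_dir
    return Path(shutil.copy2(matches[0], dest_dir))
```

Working on a copy rather than the live file means the client's own cache is never touched by the script.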

Organizing the cloud drive data

The file turns out to be a sqlite3 database, so open it in Navicat first and take a look.

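Every file on the drive, together with its path, is stored in the cache_file table. Exporting the table directly from the GUI can run into problems, so we use pandas to process the data instead.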
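If Navicat is not available, the same inspection can be done with Python's built-in sqlite3 module. The sketch below lists every table in a database with its column names; `describe_tables` is a hypothetical helper name, not something from the original script.

```python
import sqlite3


def describe_tables(con):
    """Map each table name to its list of column names."""
    tables = [row[0] for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        info = con.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in info]
    return schema
```

Calling `describe_tables(sqlite3.connect('BaiduYunCacheFileV0.db'))` should show cache_file among the tables, along with the columns that the export script has to name.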

Exporting the cloud drive data

My drive produced 40MB of directory data, which is a headache just to look at.

Data cleanup

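Export the drive's directory data to Excel, then process it however you need. The code is very short. If you like working in Python, process the data with pandas; if that feels difficult, work on it directly in Excel.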

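import sqlite3
import pandas as pd

# Open the cache database that was copied next to this script
con = sqlite3.connect('BaiduYunCacheFileV0.db')
cursor = con.cursor()
# Read every row of the cache_file table
cursor.execute("select * from cache_file")
values = cursor.fetchall()
con.close()

# Column names of cache_file, in table order
columns = ["id","fid","parent_path","server_filename","file_size","md5","isdir","category","server_mtime","local_mtime","reserved1","reserved2","reserved3","reserved4","reserved5","reserved6","reserved7","reserved8","reserved9"]
df = pd.DataFrame(values, columns=columns)
# Writing .xlsx needs an Excel engine such as openpyxl to be installed
df.to_excel("data.xlsx")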
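As a variation on the script above, pandas can read the table itself and take the column names from the database, so the long hand-typed list is not needed. The sketch below is one way to do that, not the author's method; the helper names `load_cache_file` and `export_listing` are hypothetical, and CSV is used for the export only because it tends to be faster to write than .xlsx for a few hundred thousand rows.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def load_cache_file(db_path, table="cache_file"):
    """Load one table into a DataFrame, taking column names from the database."""
    with closing(sqlite3.connect(db_path)) as con:
        return pd.read_sql_query(f'SELECT * FROM "{table}"', con)


def export_listing(df, out_path):
    """Write the listing as CSV; utf-8-sig helps Excel detect the encoding."""
    df.to_csv(out_path, index=False, encoding="utf-8-sig")
```

For example, `export_listing(load_cache_file('BaiduYunCacheFileV0.db'), 'data.csv')` would produce a file that opens in Excel with non-ASCII file names intact.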

Extracting duplicate files

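Baidu Cloud Drive has no corresponding API for this. You could drive the web page with a crawler-style script to delete the duplicates, but that makes accidental deletions easy, so it is safer to compile the list of files to handle by hand and then act on it.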

Duplicates are detected by file name. Once you have the result, handle the files however you see fit.

df["server_filename"].duplicated()

0         False
1         False
2         False
3         False
4         False
          ...  
379563    False
379564    False
379565     True
379566     True
379567    False
Name: server_filename, Length: 379568, dtype: bool
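Note that `duplicated()` with its default `keep="first"` marks only the second and later occurrences of a name, so the first copy of each file is reported as False. Passing `keep=False` marks every member of a repeated name, which is handy when you want to see all copies side by side. A small illustration on made-up file names:

```python
import pandas as pd

# Made-up file names, only to show how the keep argument changes the mask
names = pd.Series(["a.txt", "b.txt", "a.txt", "a.txt", "c.txt"])

later_copies = names.duplicated()          # default keep="first"
all_copies = names.duplicated(keep=False)  # marks every copy of a repeated name

print(later_copies.tolist())  # [False, False, True, True, False]
print(all_copies.tolist())    # [True, False, True, True, False]
```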


df[df["server_filename"].duplicated()]["server_filename"]
188             WE_rk_nos06.txt
252                   django.po
254                   django.po
255                   django.po
256                   django.po
                  ...          
378517                video.mp4
378518            top_level.txt
378543    Blog_articleinfo.xlsx
379565                     apps
379566             职业培训规划.mmap
Name: server_filename, Length: 152409, dtype: object
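Matching on name alone flags files such as django.po that legitimately appear in many folders, as well as directory entries such as apps. A stricter check, sketched below, compares the md5 and file_size columns and skips directories. It rests on assumptions the article does not confirm: that md5 holds a content hash, and that isdir is 0 for regular files. The helper name `duplicate_groups` is hypothetical.

```python
import pandas as pd


def duplicate_groups(df):
    """Return file rows whose (md5, file_size) pair occurs more than once."""
    # Assumption: isdir == 0 means a regular file
    files = df[df["isdir"] == 0]
    # Rows without a hash cannot be compared, so leave them out
    files = files.dropna(subset=["md5"])
    # keep=False marks every copy, so each group is shown in full
    mask = files.duplicated(subset=["md5", "file_size"], keep=False)
    return files[mask].sort_values(["md5", "parent_path", "server_filename"])
```

Calling `duplicate_groups(df)` on the DataFrame built earlier returns the candidate rows sorted so that identical files sit next to each other, which makes it easier to decide by path which copy to keep.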

