Automatically Annotating Image CAPTCHAs with Python and LabelMe

Updated: 2024-12-31 15:30:10  Author: XMYX-0

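This article shows how to annotate image CAPTCHAs automatically with Python and LabelMe. The main steps are image preprocessing, OCR, and generating the annotation file. With PaddleOCR doing the recognition, CAPTCHA characters can be labeled quickly and with far less manual effort.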

Automatic annotation of image CAPTCHAs with Python and LabelMe

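Labeling image CAPTCHAs by hand is slow, tedious work. This article describes how to automate it with Python and LabelMe: PaddleOCR recognizes the text, and the script then writes an annotation file in LabelMe format, which greatly reduces the manual effort.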

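Environment setup

Required tools

Python 3.7+
PaddleOCR (text recognition)
OpenCV (image processing)
LabelMe (annotation tool)

Installing dependencies

Install the required libraries with the command below. PaddleOCR also depends on the PaddlePaddle framework (the paddlepaddle package), which may need to be installed separately: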

pip install paddleocr labelme opencv-python

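Implementing automatic annotation

Automatic annotation consists of the following steps:

Load the image: read the image file and make sure it loads correctly.
Preprocess the image: convert the CAPTCHA to grayscale and binarize it to improve recognition.
OCR: use PaddleOCR to get the text in the CAPTCHA and its position.
Generate the annotation file: build a LabelMe-format JSON file from the OCR results.

Core code

The complete auto-annotation script follows: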

import json
import os
import cv2
from paddleocr import PaddleOCR
def auto_label_image(image_path, output_path):
    # Check that the file exists
    if not os.path.exists(image_path):
        print(f"Error: File not found: {image_path}")
        return
    # Load the image
    image = cv2.imread(image_path)
    if image is None:
        print(f"Error: Failed to load image. Check the file path or format: {image_path}")
        return
    # Preprocess: convert to grayscale, then binarize with a fixed threshold
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary_image = cv2.threshold(gray_image, 128, 255, cv2.THRESH_BINARY)
    # Save the preprocessed image (optional, useful for debugging)
    preprocessed_path = os.path.join(output_path, "processed_image.jpg")
    cv2.imwrite(preprocessed_path, binary_image)
    # Initialize OCR
    ocr = PaddleOCR(use_angle_cls=True, lang='en')
    # Run OCR (the result layout used below is the one PaddleOCR 2.x returns)
    results = ocr.ocr(preprocessed_path)
    if not results or not results[0]:
        print(f"No text detected in the image: {image_path}")
        return
    # Get the image size
    image_height, image_width = image.shape[:2]
    # Build the annotation JSON
    label_data = {
        "version": "4.5.7",
        "flags": {},
        "shapes": [],
        "imagePath": os.path.basename(image_path),
        "imageData": None,
        "imageHeight": image_height,
        "imageWidth": image_width,
    }
    # Walk through the OCR results
    for line in results[0]:
        points = line[0]  # box corners: [top-left, top-right, bottom-right, bottom-left]
        text = line[1][0]  # recognized text
        shape = {
            "label": text,
            "points": [points[0], points[2]],  # top-left and bottom-right corners
            "group_id": None,
            "shape_type": "rectangle",
            "flags": {}
        }
        label_data["shapes"].append(shape)
    # Save the annotation JSON, named after the image whatever its extension
    base_name = os.path.splitext(os.path.basename(image_path))[0]
    json_path = os.path.join(output_path, base_name + ".json")
    with open(json_path, 'w', encoding='utf-8') as f:
        json.dump(label_data, f, indent=4)
    print(f"Saved LabelMe annotation: {json_path}")
# Example
image_path = r"C:\Users\wangzq\Desktop\images\captcha.jpg"
output_path = "./annotations"
os.makedirs(output_path, exist_ok=True)
auto_label_image(image_path, output_path)

How the core logic works

Image preprocessing

To improve OCR accuracy, the CAPTCHA image is converted to grayscale and then binarized:

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, binary_image = cv2.threshold(gray_image, 128, 255, cv2.THRESH_BINARY)

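Binarization can suppress background noise and make the characters stand out more clearly. The threshold of 128 is a fixed midpoint and may need tuning for a particular CAPTCHA style.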
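
A fixed threshold will not suit every CAPTCHA. One common alternative is Otsu's method, which derives the threshold from the image histogram. The sketch below is only a plain-Python illustration of that idea on a flat list of grayscale values; the function name otsu_threshold and the sample pixel values are made up for this example and are not part of the original script. In practice OpenCV can apply the same method when cv2.THRESH_BINARY + cv2.THRESH_OTSU is passed as the flag to cv2.threshold with a threshold argument of 0.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t
    sum_bg = 0  # sum of the intensities of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical grayscale values: dark characters on a light background
pixels = [30] * 50 + [40] * 50 + [200] * 60 + [210] * 40
t = otsu_threshold(pixels)
# Same rule as cv2.THRESH_BINARY: values above the threshold become 255
binary = [255 if p > t else 0 for p in pixels]
print(t, binary.count(0), binary.count(255))
```

Whether an adaptive threshold actually helps depends on the CAPTCHA style, so it is worth comparing the saved processed_image.jpg for both variants.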

OCR

PaddleOCR runs text detection and recognition on the image and returns the detection boxes together with the recognized text:

ocr = PaddleOCR(use_angle_cls=True, lang='en')
results = ocr.ocr(preprocessed_path)

If results is empty, or its first element is empty, OCR did not detect any text.
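
To make the result structure concrete, here is a small sketch that flattens an OCR result and drops low-confidence lines. It assumes the layout the script relies on (one list per image, each entry being a box followed by a text and confidence pair), which is what PaddleOCR 2.x returns; other versions may differ. The helper name parse_ocr_lines, the min_confidence parameter, and the sample data are hypothetical and hand-written, not real OCR output.

```python
def parse_ocr_lines(results, min_confidence=0.5):
    """Flatten a PaddleOCR-2.x-style result into (text, confidence, box) tuples."""
    if not results or not results[0]:
        return []  # nothing detected
    parsed = []
    for box, (text, confidence) in results[0]:
        if confidence >= min_confidence:
            parsed.append((text, float(confidence), box))
    return parsed


# Hand-written sample in the assumed layout (confidence values are invented)
sample = [[
    [[[6.0, 1.0], [68.0, 1.0], [68.0, 21.0], [6.0, 21.0]], ("OZLQ", 0.93)],
    [[[1.0, 1.0], [4.0, 1.0], [4.0, 5.0], [1.0, 5.0]], ("x", 0.12)],
]]

for text, confidence, box in parse_ocr_lines(sample):
    # box[0] is the top-left corner, box[2] the bottom-right corner
    print(text, confidence, box[0], box[2])
```

Filtering on confidence is optional, but for auto-generated labels it can keep obviously spurious detections out of the annotation file.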

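Generating the annotation file

From the OCR results, the script builds an annotation file in LabelMe format. The key fields are:

shapes: the annotation boxes, each with its position and the corresponding text.
imageHeight and imageWidth: the image dimensions in pixels.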
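
The script takes the first and third corner of each detection box as the rectangle. When a box is slightly tilted, those two corners do not enclose all four points, so one possible refinement is to use the axis-aligned bounding box instead. The helper below is a sketch of that idea; the name quad_to_rectangle_shape and the sample coordinates are assumptions made for illustration.

```python
import json


def quad_to_rectangle_shape(quad, label):
    """Build a LabelMe rectangle shape that encloses all four corners of quad."""
    xs = [float(x) for x, _ in quad]
    ys = [float(y) for _, y in quad]
    return {
        "label": label,
        "points": [[min(xs), min(ys)], [max(xs), max(ys)]],
        "group_id": None,
        "shape_type": "rectangle",
        "flags": {},
    }


# Hypothetical, slightly tilted box: top-left, top-right, bottom-right, bottom-left
quad = [[6.0, 3.0], [68.0, 1.0], [69.0, 19.0], [7.0, 21.0]]
shape = quad_to_rectangle_shape(quad, "OZLQ")
print(json.dumps(shape, indent=4))
```

Converting every coordinate with float() also keeps the values JSON-serializable if the OCR library hands back NumPy numbers.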

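Results

Preprocessed image: the preprocessed image is saved in output_path as processed_image.jpg.
Annotation file: a .json file with the same base name as the image is written to output_path. Because imageData is null, LabelMe locates the image through imagePath relative to the JSON file, so the image should sit in the same folder as the JSON when the annotation is opened.
No-text notice: if no text is detected, the script prints No text detected in the image.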

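Extensions and improvements

Model adaptation

If the CAPTCHA uses a more complex character set, consider training a dedicated model to replace the general-purpose PaddleOCR model.

Batch processing

To handle many CAPTCHA images, the script can be extended to process a whole folder: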

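# input_folder: directory that holds the CAPTCHA images
for image_file in sorted(os.listdir(input_folder)):
    if not image_file.lower().endswith((".jpg", ".jpeg", ".png")):
        continue  # skip non-image files
    image_path = os.path.join(input_folder, image_file)
    auto_label_image(image_path, output_path)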
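
For larger batches it can help to keep going when a single image fails and to report what happened at the end. The following is a minimal sketch of such a wrapper, using only the standard library. The names label_folder and IMAGE_SUFFIXES are hypothetical, and the labeling function is passed in as a parameter so the wrapper does not depend on any particular OCR setup.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually in use
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def label_folder(input_folder, output_folder, label_fn):
    """Call label_fn(image_path, output_folder) for each image file.

    Returns two lists of file names: (done, failed).
    """
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    done, failed = [], []
    for path in sorted(Path(input_folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore subfolders and non-image files
        try:
            label_fn(str(path), str(out_dir))
            done.append(path.name)
        except Exception as exc:  # keep going if one image fails
            print(f"Failed on {path.name}: {exc}")
            failed.append(path.name)
    return done, failed
```

With the script above, a call could look like label_folder("./images", "./annotations", auto_label_image), where "./images" is a placeholder path. Since auto_label_image creates a new PaddleOCR instance on every call, a long batch run would likely be faster if the OCR object were created once and reused.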

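Other annotation types

The code currently produces only rectangle annotations. To support polygons, set shape_type to polygon and supply the corresponding point coordinates.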
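
As a sketch of what that change might look like, the helper below keeps all four corners of a detection box and emits them as a polygon shape. The function name quad_to_polygon_shape and the sample coordinates are assumptions for illustration; the field names mirror the rectangle shapes the script already writes.

```python
def quad_to_polygon_shape(quad, label):
    """Build a LabelMe polygon shape from the four corners of a detection box."""
    return {
        "label": label,
        # Keep every corner, in order, instead of only two opposite ones
        "points": [[float(x), float(y)] for x, y in quad],
        "group_id": None,
        "shape_type": "polygon",
        "flags": {},
    }


# Hypothetical box: top-left, top-right, bottom-right, bottom-left
quad = [[6.0, 3.0], [68.0, 1.0], [69.0, 19.0], [7.0, 21.0]]
polygon = quad_to_polygon_shape(quad, "OZLQ")
print(polygon["shape_type"], len(polygon["points"]))
```

A polygon follows a tilted or skewed box more closely than a rectangle, which may matter if the annotations are later used to train a detector.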

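Summary

This article walked through automatic annotation of image CAPTCHAs with Python and LabelMe, from image preprocessing to generating the annotation file. With PaddleOCR handling recognition, CAPTCHA characters can be labeled quickly, saving a good deal of time and effort.

Test

Running the script produced the following JSON files: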

{
    "version": "4.5.7",
    "flags": {},
    "shapes": [
        {
            "label": "OZLQ",
            "points": [
                [
                    6.0,
                    1.0
                ],
                [
                    68.0,
                    21.0
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        }
    ],
    "imagePath": "captcha.png",
    "imageData": null,
    "imageHeight": 22,
    "imageWidth": 76
}

{
    "version": "4.5.7",
    "flags": {},
    "shapes": [
        {
            "label": "3081",
            "points": [
                [
                    6.0,
                    1.0
                ],
                [
                    63.0,
                    21.0
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
        }
    ],
    "imagePath": "captcha.png",
    "imageData": null,
    "imageHeight": 22,
    "imageWidth": 76
}
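
Auto-generated labels are worth a quick sanity check before they are used. The sketch below loads a LabelMe-style dictionary and reports missing keys, empty labels, and points outside the image. The function name check_annotation and the list of required keys are assumptions based on the fields that appear in the output above, not an official LabelMe schema.

```python
import json

# Keys that appear in the JSON written by the script (assumed to be required)
REQUIRED_KEYS = {"version", "flags", "shapes", "imagePath",
                 "imageData", "imageHeight", "imageWidth"}


def check_annotation(data):
    """Return a list of problems found in a LabelMe-style dict (empty if none)."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(data))]
    width = data.get("imageWidth", 0)
    height = data.get("imageHeight", 0)
    for i, shape in enumerate(data.get("shapes", [])):
        if not shape.get("label"):
            problems.append(f"shape {i}: empty label")
        for x, y in shape.get("points", []):
            if not (0 <= x <= width and 0 <= y <= height):
                problems.append(f"shape {i}: point ({x}, {y}) outside image")
    return problems


# Condensed copy of the first annotation shown above
sample = json.loads("""
{"version": "4.5.7", "flags": {},
 "shapes": [{"label": "OZLQ", "points": [[6.0, 1.0], [68.0, 21.0]],
             "group_id": null, "shape_type": "rectangle", "flags": {}}],
 "imagePath": "captcha.png", "imageData": null,
 "imageHeight": 22, "imageWidth": 76}
""")
print(check_annotation(sample))
```

A check like this only catches structural problems; whether the recognized text is correct still has to be verified against the image.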

The problem is still fairly complex at this stage and needs further study.
