Python | OpenCV and Deep Learning Framework Integration: Worked Examples

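📚 Introduction

In the previous post, OpenCV and Deep Learning Framework Integration, we sorted out the common computer-vision terms and where each tool fits: what PyTorch, TensorFlow, Keras, YOLO, ResNet, and MobileNet are and what each one does.
This post moves on to practice. We classify an image with PyTorch (ResNet18) and with TensorFlow/Keras (MobileNetV2), using OpenCV to load and prepare the image, and print the Top-5 predictions.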

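🎨 Sample Image

Source: Pexels - Car Image, a stock photo that is free to download and use.
Content: a desert scene in which a vehicle is the main subject, which makes it a convenient test image for classification models.
After downloading, rename the file to car.jpg and place it in the project's assets/ directory.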

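🔎 How It Works

OpenCV: handles image capture, preprocessing, and display.
Deep learning framework: handles model loading, inference, and fine-tuning.
Integration: the framework produces the inference results, and OpenCV draws bounding boxes or text on the image.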

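📂 Loading and Using the Models

PyTorch

Load pretrained models (ResNet, MobileNet, and others) from torchvision.models.
The models can be fine-tuned or used as the basis for a custom network architecture.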

TensorFlow/Keras

Load pretrained models (MobileNetV2, ResNet50, and others) from tf.keras.applications.
The API is approachable and well suited to quick prototyping and deployment.

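🧠 Functions and Parameters

📌 `torchvision.models.resnet18()`

Loads a pretrained ResNet18 model:

model = resnet18(weights=ResNet18_Weights.DEFAULT)

weights: the pretrained weights to load. Prefer ResNet18_Weights.DEFAULT over the older pretrained=True, which is deprecated and emits a warning.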

📌 `weights.transforms()`

Returns the preprocessing pipeline recommended for the chosen weights:

preprocess = weights.transforms()

The returned transform applies the resize, normalization, and related steps that match how the model was trained, so you do not have to specify those parameters by hand.

📌 `torch.topk()`

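Returns the N highest-probability classes:

top5 = torch.topk(probs, k)

probs: the probability vector obtained by applying softmax to the model output.
k: how many results to return, for example 5.
The return value is a named tuple with values and indices fields.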
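
For intuition, the softmax-then-top-k step can be reproduced in plain NumPy. This is an illustrative sketch rather than how PyTorch implements it; softmax and top_k are hypothetical helper names and the logits are made up.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)  # subtracting the max avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum()


def top_k(probs, k):
    """Return (values, indices) of the k largest entries, largest first."""
    indices = np.argsort(probs)[::-1][:k]
    return probs[indices], indices


logits = np.array([2.0, 1.0, 0.1, -1.0])  # made-up scores for four classes
probs = softmax(logits)
values, indices = top_k(probs, 2)
print(indices)  # [0 1]
```

The indices are what you use to look up the class names; the values are the matching probabilities.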

📌 `MobileNetV2()`

Loads a pretrained MobileNetV2 model:

model = MobileNetV2(weights="imagenet")

weights: set to "imagenet" to load the ImageNet pretrained weights.

📌 `preprocess_input()`

Normalizes the input image:

x = preprocess_input(x)

Scales pixel values to the range the model expects (for MobileNetV2, from [0, 255] to [-1, 1]). Call it before model.predict().
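
For MobileNetV2 the scaling amounts to a single arithmetic step. The sketch below reproduces it in NumPy for illustration only; scale_to_minus_one_one is a hypothetical name, and real code should keep calling preprocess_input.

```python
import numpy as np


def scale_to_minus_one_one(x):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return x.astype(np.float32) / 127.5 - 1.0


pixels = np.array([0, 127.5, 255], dtype=np.float32)
scaled = scale_to_minus_one_one(pixels)
print(scaled.tolist())  # [-1.0, 0.0, 1.0]
```

Other Keras models use different conventions, which is why each model module ships its own preprocess_input.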

📌 `decode_predictions()`

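Decodes the classification output:

results = decode_predictions(preds, top=5)[0]

preds: the output of model.predict().
top: how many of the highest-scoring predictions to return.
The [0] selects the results for the first (here, the only) image in the batch.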
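
The shape of the decoded output is easier to see with a toy version. The sketch below mimics the behavior on a four-class table; decode_top and CLASS_INDEX are hypothetical, the IDs are placeholders rather than real ImageNet IDs, and the scores are made up.

```python
import numpy as np

# Hypothetical class table in the (class_id, class_name) shape that the
# decoded results use; the IDs are placeholders, not real ImageNet IDs.
CLASS_INDEX = {
    0: ("id_0", "minivan"),
    1: ("id_1", "jeep"),
    2: ("id_2", "minibus"),
    3: ("id_3", "convertible"),
}


def decode_top(preds, top=2):
    """For each row of preds, list (class_id, class_name, score), best first."""
    decoded = []
    for row in preds:
        best = np.argsort(row)[::-1][:top]
        decoded.append([CLASS_INDEX[int(i)] + (float(row[i]),) for i in best])
    return decoded


preds = np.array([[0.7, 0.1, 0.15, 0.05]])  # one image, made-up scores
results = decode_top(preds, top=2)[0]       # [0] picks the first image
for class_id, label, prob in results:
    print(f"{label}: {prob:.4f}")
```

Because the outer list has one entry per image, a batch of several images yields several Top-N lists.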

💻 Example: PyTorch Integration (ResNet18)

# pytorch_resnet18.py
import cv2
import torch
from PIL import Image
from torchvision.models import resnet18, ResNet18_Weights

# Load the pretrained model and switch it to inference mode
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights)
model.eval()

# Preprocessing pipeline that matches these weights
preprocess = weights.transforms()

# OpenCV loads images as BGR; convert to RGB before handing off to PIL
img = cv2.imread("assets/car.jpg")
if img is None:
    raise FileNotFoundError("Could not read assets/car.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
pil_img = Image.fromarray(img_rgb)

# Add a batch dimension: (3, H, W) -> (1, 3, H, W)
input_tensor = preprocess(pil_img).unsqueeze(0)

with torch.no_grad():
    outputs = model(input_tensor)
    probs = torch.nn.functional.softmax(outputs[0], dim=0)

# Take the top 5 classes
top5 = torch.topk(probs, 5)

for idx, score in zip(top5.indices, top5.values):
    label = weights.meta["categories"][int(idx)]
    print(f"{label}: {score.item():.4f}")

Figure: classifying the image with PyTorch ResNet18. The Top-5 predictions show how the model's confidence is distributed.

📊 ResNet18 Top-5 Predictions

Rank	Label	Probability
1	minivan	0.4953
2	jeep	0.2708
3	beach wagon	0.1988
4	minibus	0.0080
5	limousine	0.0033

💻 Example: TensorFlow/Keras Integration (MobileNetV2)

# tensorflow_mobilenetv2.py
import cv2
import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input, decode_predictions

# Load MobileNetV2 with the ImageNet pretrained weights
model = MobileNetV2(weights="imagenet")

# OpenCV loads images as BGR; the model expects RGB at 224x224
img = cv2.imread("assets/car.jpg")
if img is None:
    raise FileNotFoundError("Could not read assets/car.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_resized = cv2.resize(img_rgb, (224, 224))

# Add a batch dimension, then scale the pixels to [-1, 1]
x = np.expand_dims(img_resized, axis=0).astype(np.float32)
x = preprocess_input(x)

preds = model.predict(x)

# Show the top five predictions
results = decode_predictions(preds, top=5)[0]
for imagenet_id, label, prob in results:
    print(f"{label}: {prob:.4f}")

Figure: classifying the image with TensorFlow/Keras MobileNetV2, showing the top 5 predictions.

📊 MobileNetV2 Top-5 Predictions

Rank	Label	Probability
1	minivan	0.8437
2	beach_wagon	0.1082
3	minibus	0.0051
4	convertible	0.0050
5	car_wheel	0.0044

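🛠️ Applications Combined with OpenCV

Real-time classification: capture camera frames with OpenCV, pass each frame to the framework for inference, and label the frame with the predicted class.
Batch classification: read a folder of images with OpenCV, run inference on each, and write the results to a CSV file.
Visualizing results: draw the Top-5 predictions as text or a bar chart on top of the original image with OpenCV.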

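⚠️ Caveats

Version compatibility: Python 3.8–3.11 is recommended. On 3.12 and later some package versions may not install cleanly, so check which Python versions your PyTorch and TensorFlow releases support.
GPU support: to use CUDA, install the PyTorch/TensorFlow build that matches your CUDA version. Both frameworks also run on CPU, only more slowly.
Limits of classification models: ResNet, MobileNet, and similar classifiers label the image as a whole and do not locate objects, so they suit images with a single main subject. To detect multiple objects, use a detection model such as YOLO.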
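
A quick environment check along these lines can save some debugging. This is a minimal sketch using only the standard library plus an optional PyTorch import; installed and pick_device are hypothetical helper names.

```python
import importlib.util
import sys


def installed(package):
    """True if the package can be found in the current environment."""
    return importlib.util.find_spec(package) is not None


def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if installed("torch"):
        import torch
        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


print("Python:", ".".join(map(str, sys.version_info[:3])))
for name in ("cv2", "torch", "tensorflow"):
    print(f"{name}: {'installed' if installed(name) else 'missing'}")
device = pick_device()
print("PyTorch device:", device)
```

With PyTorch, the chosen device is then applied with model.to(device) and input_tensor.to(device) before inference.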

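📊 Use Cases

Quality inspection: on a factory line, a classification model decides whether each product passes, with OpenCV capturing the live images.
Plant and animal identification: upload a single photo and let ResNet/MobileNet identify the species.
Retail product recognition: photograph a single product, let the model output its category, and display the result with OpenCV.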

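🎯 Conclusion

In this post we loaded pretrained models with PyTorch and TensorFlow/Keras, classified a vehicle image, and obtained the Top-5 predictions.
The next post, Model Training and Fine-Tuning, covers how to train and fine-tune deep learning models to optimize them for a specific scenario.

📖 If you run into questions along the way, or want to explore related topics, revisit the Python | OpenCV series guide for the full chapter list so you can quickly find what you need.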

Note: this post draws on the following references
OpenCV official documentation: Tutorials
OpenCV official documentation: Python Tutorials
PyTorch official installation guide
TensorFlow/Keras official documentation
Pexels: free stock photos and videos

📄 The public sections of this post are licensed under CC BY 3.0 TW: they may be freely reposted and quoted with attribution.

J.J.'s Blogs

J.J. Huang · 2026-03-03 · Python OpenCV · 07. Object Detection and Recognition
