YOLO Inference and Deployment - 배움 에이아이

This hands-on guide uses a trained YOLO model to detect objects in images and video, then converts the model to ONNX format for deployment.

Image Inference

from ultralytics import YOLO

# Load the trained model
model = YOLO('runs/detect/safety_helmet/weights/best.pt')

# Run inference on a single image
results = model('test.jpg', conf=0.5, iou=0.45)

# Access the results (one Results object per input image)
for result in results:
    for box in result.boxes:
        cls_id = int(box.cls)
        cls_name = result.names[cls_id]
        conf = float(box.conf)
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"{cls_name}: {conf:.2f} [{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")

# Save the annotated result image
results[0].save(filename='result.jpg')
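Once the per-box values are extracted as in the loop above, a common next step is to summarize them. The following is a minimal sketch, assuming the detections have already been reduced to plain (class name, confidence) pairs; the helper `count_by_class` and the class names `helmet` and `head` are hypothetical and are not part of Ultralytics.

```python
from collections import Counter


def count_by_class(detections, min_conf=0.5):
    """Count detections per class name, keeping only those at or above min_conf."""
    return Counter(name for name, conf in detections if conf >= min_conf)


# Illustrative values only, not real model output
detections = [("helmet", 0.91), ("helmet", 0.48), ("head", 0.77)]
counts = count_by_class(detections, min_conf=0.5)
print(dict(counts))
```

Filtering here has the same effect as a stricter `conf` argument at inference time, but it lets you try several thresholds without re-running the model.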

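Video Inference

# Run inference on a video file
results = model('video.mp4', stream=True)  # stream=True returns a generator, which saves memory

for frame_idx, result in enumerate(results):
    print(f"Frame {frame_idx}: {len(result.boxes)} detections")

# Save the annotated video automatically
# (without stream=True, the results for every frame are kept in memory)
results = model('video.mp4', save=True)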
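The memory benefit of a streamed result only holds if the consuming code does not store every frame itself. The sketch below shows the pattern with plain Python: it consumes a lazy iterable once and keeps only running totals. The function `summarize_stream` and the generator `fake_counts` are hypothetical stand-ins; in practice the input could be something like `(len(r.boxes) for r in results)`.

```python
def summarize_stream(per_frame_counts):
    """Consume a lazy iterable of per-frame detection counts, keeping only running totals."""
    frames = 0
    frames_with_detections = 0
    total = 0
    for n in per_frame_counts:
        frames += 1
        total += n
        if n > 0:
            frames_with_detections += 1
    return {
        "frames": frames,
        "frames_with_detections": frames_with_detections,
        "total_detections": total,
    }


def fake_counts():
    # Stand-in for per-frame detection counts from a streamed video
    yield from [0, 2, 1, 0, 3]


summary = summarize_stream(fake_counts())
print(summary)
```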

Batch Inference

# Run inference on every image in a folder
results = model(
    source='test_images/',
    conf=0.5,
    save=True,
    save_txt=True,   # also save label files
    project='results',
    name='predictions',
)

# Convert each result to a DataFrame
# (recent Ultralytics versions return name, class, confidence and a nested
#  box column with x1, y1, x2, y2, not xmin/ymin/xmax/ymax columns)
for result in results:
    df = result.to_df()
    print(df)
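With `save_txt=True`, each image gets a text file of labels in YOLO format: one line per box, `class x_center y_center width height`, with coordinates normalized to the image size (a confidence column is appended only if `save_conf=True` is also set). The sketch below converts one such line back to pixel corner coordinates; the helper name `yolo_line_to_xyxy` is hypothetical, and the sample line is illustrative rather than real output.

```python
def yolo_line_to_xyxy(line, img_w, img_h):
    """Convert one 'class xc yc w h' label line (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    cls_id = int(parts[0])
    # Only the first four numbers after the class are box values;
    # an optional trailing confidence column is ignored here.
    xc, yc, w, h = (float(v) for v in parts[1:5])
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return cls_id, x1, y1, x2, y2


# Illustrative label line for a 640x480 image
print(yolo_line_to_xyxy("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480))
```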

Visualizing Inference Results (supervision)

import cv2
import supervision as sv
from ultralytics import YOLO

model = YOLO('best.pt')
image = cv2.imread('test.jpg')
results = model(image)[0]

# Convert the Ultralytics result to supervision Detections
detections = sv.Detections.from_ultralytics(results)

# Create annotators for bounding boxes and labels
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

# Build one "class confidence" label per detection
labels = [
    f"{results.names[int(cls)]} {conf:.2f}"
    for cls, conf in zip(detections.class_id, detections.confidence)
]

# Draw on a copy so the original image stays untouched
annotated = box_annotator.annotate(scene=image.copy(), detections=detections)
annotated = label_annotator.annotate(scene=annotated, detections=detections, labels=labels)

cv2.imwrite('annotated.jpg', annotated)

ONNX Conversion

# Convert PyTorch → ONNX
model = YOLO('best.pt')
model.export(
    format='onnx',
    imgsz=640,
    simplify=True,    # simplify/optimize the ONNX graph
    dynamic=False,    # fixed input shape (batch and image size)
)
# → creates best.onnx next to best.pt

# Run inference with the ONNX model
onnx_model = YOLO('best.onnx')
results = onnx_model('test.jpg')
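Loading `best.onnx` through `YOLO(...)` handles preprocessing for you. If the exported model is run from another runtime instead, the fixed 640×640 input means the image has to be resized and padded first, and predicted boxes mapped back afterwards. The following is a simplified sketch of letterbox-style arithmetic, assuming centered padding; `letterbox_params` and `to_original` are hypothetical helpers and may differ in rounding from Ultralytics' own preprocessing.

```python
def letterbox_params(img_w, img_h, target=640):
    """Return (scale, pad_x, pad_y) to fit an image into a target x target square, aspect ratio kept."""
    scale = min(target / img_w, target / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x = (target - new_w) // 2  # padding added on the left
    pad_y = (target - new_h) // 2  # padding added on the top
    return scale, pad_x, pad_y


def to_original(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from letterboxed coordinates back to the original image."""
    x1, y1, x2, y2 = box
    return (
        (x1 - pad_x) / scale,
        (y1 - pad_y) / scale,
        (x2 - pad_x) / scale,
        (y2 - pad_y) / scale,
    )


scale, pad_x, pad_y = letterbox_params(1280, 720)
print(scale, pad_x, pad_y)
print(to_original((100, 240, 300, 340), scale, pad_x, pad_y))
```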

Other Export Formats

# TensorRT (optimized for NVIDIA GPUs)
model.export(format='engine', imgsz=640, half=True)

# CoreML (Apple devices)
model.export(format='coreml', imgsz=640)

# TFLite (mobile/edge)
model.export(format='tflite', imgsz=640)

# OpenVINO (Intel hardware)
model.export(format='openvino', imgsz=640)

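Inference Parameters

Parameter	Default	Description
`conf`	0.25	Minimum confidence threshold
`iou`	0.7	IoU threshold for NMS
`imgsz`	640	Input image size
`half`	False	FP16 (half-precision) inference
`device`	None (auto)	GPU/CPU selection
`max_det`	300	Maximum detections per image
`classes`	None	Detect only the listed classes

# Detect specific classes only (e.g., class 0, which is person in COCO-pretrained models)
results = model('image.jpg', classes=[0])

# FP16 inference (faster on supported GPUs)
results = model('image.jpg', half=True)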
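To make the interaction between `conf` and `iou` concrete, here is a minimal pure-Python sketch of confidence filtering followed by greedy non-maximum suppression. It is class-agnostic and unoptimized, so it illustrates the idea rather than reproducing Ultralytics' implementation; the function names and sample boxes are made up for the example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf=0.25, iou_thres=0.7):
    """Return indices of boxes kept after confidence filtering and greedy NMS."""
    # Drop low-confidence boxes, then visit the rest from highest to lowest score
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 100, 100), (5, 5, 105, 105), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.3]
print(nms(boxes, scores))                 # default thresholds
print(nms(boxes, scores, iou_thres=0.9))  # more tolerant of overlap
print(nms(boxes, scores, conf=0.5))       # stricter confidence filter
```

A lower `iou` value suppresses overlapping boxes more aggressively, while a higher `conf` value removes weak detections before suppression even starts.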

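Export Format Comparison

Format	Target hardware	Speedup (approx.)	File size
PyTorch (.pt)	GPU/CPU	Baseline	Baseline
ONNX	General purpose	1.5~2x	Similar
TensorRT	NVIDIA GPU	3~5x	Smaller
CoreML	Apple devices	2~3x	Smaller
TFLite	Mobile/edge	2~3x	Smaller
OpenVINO	Intel CPU	2~3x	Smaller

For server deployment, choose ONNX or TensorRT; for mobile, choose TFLite or CoreML. The speedups above are rough figures and vary with the model, hardware, and precision. For detailed optimization steps, see the ONNX/TensorRT conversion document.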
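The recommendation above can be captured as a small lookup so that a deployment script picks the export format from its target. This is only a sketch: the target names (`server_cpu`, `android`, and so on) and the helper `pick_export_format` are hypothetical, and splitting mobile into Android and iOS is an assumption based on CoreML targeting Apple devices. The returned strings are the `format` values used in the export calls earlier.

```python
# Hypothetical deployment targets mapped to export `format` values
EXPORT_FORMAT_BY_TARGET = {
    "server_cpu": "onnx",
    "server_nvidia_gpu": "engine",  # TensorRT
    "android": "tflite",
    "ios": "coreml",
    "intel": "openvino",
}


def pick_export_format(target):
    """Return the export format string for a deployment target, or raise ValueError."""
    try:
        return EXPORT_FORMAT_BY_TARGET[target]
    except KeyError:
        known = ", ".join(sorted(EXPORT_FORMAT_BY_TARGET))
        raise ValueError(f"unknown target {target!r}; expected one of: {known}") from None


print(pick_export_format("server_nvidia_gpu"))
```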

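Troubleshooting

Inference is too slow

1) Enable FP16 inference with half=True (GPU only). 2) Use a smaller model (yolo11n). 3) Reduce the image size (640→480). 4) Convert to TensorRT, which can speed up GPU inference by roughly 3~5x.

ONNX conversion fails with an error

Install the dependencies with pip install onnx onnxruntime. If the export fails with simplify=True, retry with simplify=False.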
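Before and after applying any of these changes, it helps to measure latency the same way each time. The sketch below times an arbitrary callable with a few warm-up calls, since the first calls often include one-time setup cost. The helper `measure_latency_ms` is hypothetical, and the workload is a stand-in; in practice you might pass something like `lambda: model('image.jpg', verbose=False)`.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=3, runs=10):
    """Call fn repeatedly and return (median latency in ms, list of all timed samples)."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), samples


# Stand-in workload; replace with a real inference call
median_ms, samples = measure_latency_ms(lambda: sum(range(10_000)))
print(f"median latency: {median_ms:.3f} ms over {len(samples)} runs")
```

The median is used rather than the mean so that a single slow run does not dominate the result.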
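A quick way to confirm that the dependencies are visible to the interpreter you are actually running is to check for them with the standard library. This is a minimal sketch; the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["onnx", "onnxruntime"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("onnx and onnxruntime are importable")
```

If a package shows up as missing right after installing it, the `pip` command likely belongs to a different Python environment than the one running the export.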
