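Output Parsing

By default, an LLM responds with free-form text. To use that response in an application, you need to convert it into structured data such as JSON or typed objects. This document walks through several ways to obtain structured output reliably.

Prerequisites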

pip install openai anthropic pydantic instructor

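Hands-on

Using JSON mode

Setting OpenAI's response_format parameter to {"type": "json_object"} constrains the model to emit syntactically valid JSON. It does not enforce any particular set of fields, and the output can still be cut off if the response hits the token limit.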

from openai import OpenAI
import json

client = OpenAI()

# Enable JSON mode
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        {
            "role": "system",
            "content": (
                "You are an NLP analysis tool. "
                "Always respond in JSON format."
            )
        },
        {
            "role": "user",
            "content": (
                "Analyze the sentiment of the following sentence: "
                "'This movie was truly moving, and the acting was excellent.'\n\n"
                "Include the following fields: sentiment, confidence, keywords"
            )
        }
    ],
)

# Parse the JSON string into a Python dict
result = json.loads(response.choices[0].message.content)
print(json.dumps(result, ensure_ascii=False, indent=2))

Example output:

{
  "sentiment": "positive",
  "confidence": 0.95,
  "keywords": ["moving", "excellent"]
}

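When you use JSON mode, the messages must contain an explicit instruction to produce JSON, typically a line such as "Respond in JSON" in the system prompt. If you set response_format without any such instruction, the API may return an error.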
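JSON mode guarantees syntax, not completeness: if generation stops at the token limit, the returned string can end in the middle of an object. The sketch below is a small guard around json.loads that checks the finish reason before parsing. The function and exception names are illustrative assumptions and are not part of the OpenAI SDK; only the "length" finish reason comes from the API.

```python
import json


class TruncatedOutputError(ValueError):
    """Raised when the model stopped before finishing its JSON."""


def parse_json_mode_output(content, finish_reason):
    """Parse a JSON-mode reply, refusing output that was cut off."""
    # "length" means the token limit was reached, so the JSON may be incomplete
    if finish_reason == "length":
        raise TruncatedOutputError("response hit the token limit")
    if not content:
        raise ValueError("empty response content")
    data = json.loads(content)
    # JSON mode is expected to return an object, not a bare list or scalar
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object, got {type(data).__name__}")
    return data
```

With a chat completion response, this would be called as parse_json_mode_output(response.choices[0].message.content, response.choices[0].finish_reason).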

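Structured Outputs (schema-constrained)

OpenAI's Structured Outputs feature takes a predefined JSON Schema and guarantees that the model output conforms to it. This gives stricter control than JSON mode, which only guarantees valid JSON syntax.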

import json

from openai import OpenAI

client = OpenAI()

# Pass a JSON Schema through response_format
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "sentiment_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "negative", "neutral"],
                        "description": "Sentiment classification result"
                    },
                    "confidence": {
                        "type": "number",
                        "description": "Confidence in the range 0.0 to 1.0"
                    },
                    "reasoning": {
                        "type": "string",
                        "description": "Explanation of the rationale"
                    },
                    "keywords": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Key sentiment keywords"
                    }
                },
                "required": ["sentiment", "confidence", "reasoning", "keywords"],
                "additionalProperties": False,
            }
        }
    },
    messages=[
        {"role": "system", "content": "You analyze the sentiment of the given text."},
        {"role": "user", "content": "Text to analyze: 'Shipping was so slow that I was disappointed, but the product quality was fine.'"}
    ],
)

# The content is guaranteed to match the schema, but it is still a JSON string
result = json.loads(response.choices[0].message.content)
print(json.dumps(result, ensure_ascii=False, indent=2))

Example output:

{
  "sentiment": "neutral",
  "confidence": 0.72,
  "reasoning": "Dissatisfaction with shipping (negative) and satisfaction with quality (positive) are mixed, so the overall sentiment is judged neutral.",
  "keywords": ["slow", "disappointed", "fine"]
}

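In strict: true mode, the model output always conforms to the defined schema. This mode requires every property to be listed in required and additionalProperties to be false; combined with enum, these let you pin down the output format precisely.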
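Because a schema that leaves a property out of required or omits additionalProperties: false is rejected in strict mode, it can help to check the schema locally before sending it. The following is a minimal sketch of such a check; find_strict_schema_problems is a hypothetical helper, and it covers only these two rules, not every restriction of the supported JSON Schema subset.

```python
def find_strict_schema_problems(schema, path="$"):
    """Return a list of reasons the schema would not qualify for strict mode."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        required = set(schema.get("required", []))
        # Rule 1: every declared property must also appear in "required"
        missing = sorted(set(props) - required)
        if missing:
            problems.append(f"{path}: not listed in 'required': {missing}")
        # Rule 2: extra keys must be forbidden explicitly
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: 'additionalProperties' must be False")
        # Nested objects have to satisfy the same rules
        for name, sub in props.items():
            problems.extend(find_strict_schema_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(find_strict_schema_problems(schema["items"], f"{path}[]"))
    return problems
```

An empty list means the schema passes both checks; each entry otherwise names the JSON path of the offending object.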

Validating with Pydantic models

Pydantic automates type validation, range checks, and custom validation rules.

import json
from typing import Literal

from openai import OpenAI
from pydantic import BaseModel, Field, ValidationError, field_validator

# Define the Pydantic model
class SentimentResult(BaseModel):
    """Sentiment analysis result model"""
    sentiment: Literal["positive", "negative", "neutral"] = Field(
        description="Sentiment classification result"
    )
    confidence: float = Field(
        ge=0.0, le=1.0,
        description="Confidence in the range 0.0 to 1.0"
    )
    reasoning: str = Field(
        min_length=10,
        description="Rationale for the judgment, at least 10 characters"
    )
    keywords: list[str] = Field(
        min_length=1,
        description="Key sentiment keywords (at least one)"
    )

    @field_validator("keywords")
    @classmethod
    def keywords_not_empty(cls, v: list[str]) -> list[str]:
        """Drop blank keywords and require that at least one remains."""
        cleaned = [k for k in v if k.strip()]
        if not cleaned:
            raise ValueError("keywords must contain at least one non-blank string")
        return cleaned

# LLM call followed by Pydantic validation
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": (
                "You analyze the sentiment of the given text. Respond in JSON.\n"
                "Fields: sentiment (positive/negative/neutral), "
                "confidence (0.0 to 1.0), reasoning (rationale), "
                "keywords (array of key terms)"
            )
        },
        {
            "role": "user",
            "content": "Analyze: 'This course is really easy to follow and the exercises are well designed!'"
        }
    ],
)

# Parse the JSON, then validate it with Pydantic
try:
    raw_data = json.loads(response.choices[0].message.content)
    result = SentimentResult(**raw_data)
    print(f"Sentiment: {result.sentiment}")
    print(f"Confidence: {result.confidence}")
    print(f"Reasoning: {result.reasoning}")
    print(f"Keywords: {result.keywords}")
except (json.JSONDecodeError, ValidationError) as e:
    print(f"Validation failed: {e}")

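Using the instructor library

instructor integrates Pydantic with LLM APIs so that a call returns a validated model instance directly, which makes it one of the most convenient ways to get structured output.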

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Literal

# Patch the client with instructor
client = instructor.from_openai(OpenAI())

# Define the Pydantic model
class MovieReview(BaseModel):
    """Movie review analysis result"""
    title: str = Field(description="Movie title")
    sentiment: Literal["positive", "negative", "mixed"] = Field(
        description="Overall sentiment"
    )
    score: float = Field(ge=1.0, le=10.0, description="Rating from 1 to 10")
    pros: list[str] = Field(description="List of strengths")
    cons: list[str] = Field(description="List of weaknesses")
    summary: str = Field(description="One-line summary")

# The output is returned already shaped to the Pydantic model
review = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=MovieReview,  # instructor-specific parameter
    messages=[
        {
            "role": "user",
            "content": (
                "Analyze the following review:\n"
                "'Interstellar is visually overwhelming and its scientific accuracy is excellent. "
                "That said, the plot in the second half can be somewhat hard to follow. "
                "The score by Hans Zimmer is among the best ever.'"
            )
        }
    ],
)

# review is already a Pydantic model instance
print(f"Movie: {review.title}")
print(f"Sentiment: {review.sentiment}")
print(f"Score: {review.score}")
print(f"Pros: {review.pros}")
print(f"Cons: {review.cons}")
print(f"Summary: {review.summary}")

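Key features provided by instructor:

Feature	Description
Automatic retries	Retries automatically when validation fails (up to 3 attempts by default, adjustable with max_retries)
Validation feedback	Passes the error message back to the LLM to prompt a correction
Streaming	Returns partial objects incrementally
Multiple providers	Unified support for OpenAI, Anthropic, Cohere, and others

Using it with Anthropic: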

import instructor
import anthropic
from pydantic import BaseModel

# Patch the Anthropic client
client = instructor.from_anthropic(anthropic.Anthropic())

class Summary(BaseModel):
    main_topic: str
    key_points: list[str]
    word_count: int

result = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    response_model=Summary,
    messages=[
        {"role": "user", "content": "Summarize the key ideas of the Transformer architecture."}
    ],
)
print(result)

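Retry strategy for parsing failures

When LLM output does not match the expected format, you can retry systematically by feeding the error back to the model.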

import json
from openai import OpenAI
from pydantic import BaseModel, ValidationError, Field
from typing import Literal

client = OpenAI()

class ExtractedEntity(BaseModel):
    """An entity extracted from text"""
    name: str = Field(description="Entity name")
    type: Literal["PERSON", "ORG", "LOCATION", "DATE"] = Field(
        description="Entity type"
    )
    context: str = Field(description="Context in which the entity appears")

class ExtractionResult(BaseModel):
    """Entity extraction result"""
    entities: list[ExtractedEntity]
    text_length: int = Field(description="Length of the source text")

def extract_with_retry(
    text: str,
    max_retries: int = 3,
) -> ExtractionResult:
    """Extract structured output, retrying with error feedback on failure."""
    messages = [
        {
            "role": "system",
            "content": (
                "Extract entities from the text and return them as JSON.\n"
                "Format: {\"entities\": [{\"name\": \"...\", \"type\": "
                "\"PERSON|ORG|LOCATION|DATE\", \"context\": \"...\"}], "
                "\"text_length\": N}"
            )
        },
        {"role": "user", "content": f"Text to extract from: '{text}'"},
    ]

    last_error = None
    for attempt in range(max_retries):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            response_format={"type": "json_object"},
            messages=messages,
        )

        # content may be None, so fall back to an empty string
        raw_content = response.choices[0].message.content or ""
        try:
            raw_data = json.loads(raw_content)
            # model_validate also rejects non-object JSON with a ValidationError
            return ExtractionResult.model_validate(raw_data)

        except (json.JSONDecodeError, ValidationError) as e:
            last_error = e
            print(f"[Attempt {attempt + 1}] Parsing failed: {e}")

            # Feed the error back to the LLM so it can correct its output
            messages.append({"role": "assistant", "content": raw_content})
            messages.append({
                "role": "user",
                "content": (
                    f"The output is invalid. Error: {str(e)}\n"
                    "Please try again with correctly formatted JSON."
                )
            })

    raise ValueError(f"Maximum retries exceeded. Last error: {last_error}")

# Usage example
result = extract_with_retry(
    "Samsung Electronics chairman Lee Jae-yong held a press conference in Seoul on March 15, 2024."
)
for entity in result.entities:
    print(f"  {entity.type}: {entity.name} ({entity.context})")

The instructor library has this retry logic built in: validation errors are fed back to the LLM automatically and you receive the corrected response. In most cases, using instructor is the most efficient option.
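If you do keep a hand-written loop, separating the retry logic from the API client makes it possible to exercise it offline with a stand-in model. The sketch below assumes two hypothetical callables supplied by the caller: call_llm, which wraps the chat API and returns the raw reply text, and validate, which raises ValueError for unacceptable data. Neither name comes from a library.

```python
import json


def retry_structured(call_llm, validate, messages, max_retries=3):
    """Call the model until `validate` accepts the parsed JSON.

    call_llm(messages) -> str : hypothetical wrapper around the chat API
    validate(data) -> object  : raises ValueError when the data is invalid
    """
    history = list(messages)  # copy so the caller's list is not mutated
    last_error = None
    for _ in range(max_retries):
        raw = call_llm(history)
        try:
            return validate(json.loads(raw))
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            # Show the model its own output plus the error so it can correct it
            history.append({"role": "assistant", "content": raw})
            history.append({
                "role": "user",
                "content": f"Invalid output: {exc}. Return corrected JSON only.",
            })
    raise ValueError(f"gave up after {max_retries} attempts: {last_error}")
```

In a test, call_llm can be a function that replays canned replies, which lets you verify the feedback messages without any network access.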

Hands-on: structured data extraction

A comprehensive exercise that extracts structured information from a complex business document.

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional
from enum import Enum

client = instructor.from_openai(OpenAI())

class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

class ActionItem(BaseModel):
    """An action item extracted from meeting minutes"""
    task: str = Field(description="The task to be done")
    assignee: str = Field(description="Name of the person responsible")
    deadline: Optional[str] = Field(None, description="Deadline (YYYY-MM-DD)")
    priority: Priority = Field(description="Priority")

class MeetingMinutes(BaseModel):
    """Structured meeting minutes"""
    title: str = Field(description="Meeting title")
    date: str = Field(description="Meeting date and time")
    participants: list[str] = Field(description="List of participants")
    summary: str = Field(description="Summary of the key points (3 sentences or fewer)")
    decisions: list[str] = Field(description="Key decisions")
    action_items: list[ActionItem] = Field(description="Follow-up action items")

# Unstructured meeting minutes text
raw_minutes = """
March 20, 2024 NLP Project Weekly Meeting

Attendees: Team Lead Kim, Assistant Manager Park, Researcher Lee, Intern Choi

Team Lead Kim: How did the sentiment analysis model perform last week?
Assistant Manager Park: We reached F1 0.87 with a KoBERT-based model, above our target of 0.85.
Team Lead Kim: Good. Please prepare the production deployment by next week.
Researcher Lee: The RAG pipeline chunk size is being optimized. It should be done by this Friday.
Intern Choi: Data labeling is 80% complete.

Decisions:
- Deploy the sentiment analysis model next Wednesday
- Finish optimizing the RAG pipeline by Friday, then test next week
- Retrain the model once labeling is complete
"""

# Structured extraction with instructor
minutes = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=MeetingMinutes,
    messages=[
        {
            "role": "system",
            "content": "You extract structured information from meeting minutes."
        },
        {"role": "user", "content": f"Meeting minutes:\n{raw_minutes}"},
    ],
)

# Print the result
print(f"Meeting: {minutes.title}")
print(f"Date: {minutes.date}")
print(f"Participants: {', '.join(minutes.participants)}")
print(f"\nSummary: {minutes.summary}")
print("\nDecisions:")
for d in minutes.decisions:
    print(f"  - {d}")
print("\nAction items:")
for item in minutes.action_items:
    print(f"  [{item.priority.value}] {item.task} → {item.assignee} (deadline: {item.deadline})")

Troubleshooting

json.loads() raises JSONDecodeError

The LLM sometimes wraps the JSON in a Markdown code block (```json ... ```).

Using response_format={"type": "json_object"} makes the model output pure JSON only.
If it still fails, strip the code fence with a regular expression: re.sub(r'```json?\n?|```', '', text).strip()
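The fence-stripping step can be wrapped in a small helper so that every call site parses replies the same way. The version below is a sketch with one deliberate difference from the one-liner above: the pattern is anchored to the start and end of the text, so triple backticks that appear inside a JSON string value are left alone. loads_lenient is a hypothetical name, and the helper does not handle prose placed before or after the fence.

```python
import json
import re

# Matches an opening fence at the very start or a closing fence at the very end
_FENCE = re.compile(r"^\s*```(?:json)?\s*\n?|\n?\s*```\s*$", re.IGNORECASE)


def loads_lenient(text):
    """Parse JSON that may be wrapped in a Markdown code fence."""
    return json.loads(_FENCE.sub("", text).strip())
```

Input without a fence passes through unchanged, so the helper can replace json.loads directly.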

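Pydantic ValidationError occurs repeatedly

This happens when the model fails to follow the schema exactly.

Include an exact JSON example in the system prompt.
Using gpt-4o or a more capable model tends to improve schema adherence.
Use the automatic retry feature of the instructor library.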

Import error after installing instructor

instructor requires openai >= 1.0 and pydantic >= 2.0.

pip install --upgrade openai pydantic instructor

Enum fields are filled with invalid values

List the allowed values explicitly in the system prompt. With Structured Outputs in strict: true mode, enum constraints are enforced.
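One way to keep the prompt and the model definition from drifting apart is to generate the list of allowed values from the Enum itself. This is a minimal sketch: allowed_values_hint is a hypothetical helper, the wording of the hint is only a suggestion, and Priority mirrors the enum used in the meeting minutes exercise.

```python
from enum import Enum


class Priority(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def allowed_values_hint(field_name, enum_cls):
    """Build a prompt sentence listing every value the enum accepts."""
    values = ", ".join(f'"{member.value}"' for member in enum_cls)
    return f"{field_name} must be exactly one of: {values}."


# Append the generated hint to the task description
system_prompt = (
    "Extract action items from the meeting minutes. "
    + allowed_values_hint("priority", Priority)
)
```

Adding a member to Priority then updates the prompt automatically.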

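Next steps

Guardrails

Learn how to verify the safety of parsed output and filter harmful content.

Cost optimization

Covers strategies for reducing output tokens and saving cost with structured responses.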
