How to Extract Speech from a YouTube Video as Text and Summarize It

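This post shows how to extract the audio from a YouTube video, transcribe it to text, and summarize the result. It uses the yt_dlp, whisper, and langchain libraries to automate the process, with a local model served through Ollama for the summarization step.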

ref. https://www.youtube.com/watch?v=oyotmQ-vQLs&t=729s

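1. Install the required libraries

Run the following command to install the required libraries. The langchain-ollama package is included because step 5 imports langchain_ollama. The audio extraction in step 2 and whisper also need the ffmpeg executable available on your PATH.

pip install yt-dlp openai-whisper langchain langchain-ollama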
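
Since ffmpeg is not a Python package, pip cannot install it, so a quick pre-flight check can save a failed run later. The following is a minimal sketch, assuming the import names yt_dlp, whisper, langchain, and langchain_ollama; the helper name missing_prerequisites is hypothetical.

```python
import importlib.util
import shutil


def missing_prerequisites(modules, executables):
    """Return the names from both lists that cannot be found on this machine."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(exe) is None:
            missing.append(exe)
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites(
        ["yt_dlp", "whisper", "langchain", "langchain_ollama"],
        ["ffmpeg"],
    )
    print("Missing:", problems if problems else "nothing")
```

An empty list means every module and executable was found.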

2. Extract audio from the YouTube video

First, use yt_dlp to download the video's audio track and convert it to MP3.

import yt_dlp

youtube_link = "https://youtu.be/mCyY8pQDpJM"

ydl_opts = {
    'format': 'bestaudio/best',  # prefer the best audio-only stream
    'outtmpl': 'downloaded_audio.%(ext)s',  # output file name template
    'postprocessors': [{
        "key": "FFmpegExtractAudio",  # convert the download with ffmpeg
        "preferredcodec": "mp3",
        "preferredquality": "320",  # target bitrate in kbps
    }],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([youtube_link])
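
Step 3 opens downloaded_audio.mp3 because the %(ext)s placeholder in outtmpl ends up as the extension of the converted audio once FFmpegExtractAudio has run. The sketch below keeps the options and the expected file name in sync. The helper build_audio_options is hypothetical and only assembles the same dictionary shown above; it assumes the codec name doubles as the file extension, which holds for mp3, wav, and flac but not for every codec.

```python
def build_audio_options(basename="downloaded_audio", codec="mp3", quality="320"):
    """Return (ydl_opts, expected_path) for an audio-only yt_dlp download."""
    ydl_opts = {
        "format": "bestaudio/best",
        # yt_dlp fills in %(ext)s itself; the post-processor then converts to `codec`
        "outtmpl": f"{basename}.%(ext)s",
        "postprocessors": [{
            "key": "FFmpegExtractAudio",
            "preferredcodec": codec,
            "preferredquality": quality,
        }],
    }
    # Assumption: the codec name is also the file extension (true for mp3, wav, flac)
    return ydl_opts, f"{basename}.{codec}"


ydl_opts, audio_path = build_audio_options()
print(audio_path)
```

The returned audio_path can then be passed to the transcription step instead of a hard-coded file name.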

3. Transcribe the audio to text

Use OpenAI's whisper to transcribe the audio file.

import whisper

model = whisper.load_model("small")  # a smaller model trades some accuracy for speed
result = model.transcribe("downloaded_audio.mp3")
text = result["text"]
print("Transcribed text:", text[:500])  # print the first 500 characters
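
Besides "text", the dictionary returned by transcribe also has a "segments" list whose entries carry start and end times in seconds alongside their text. The following is a minimal sketch that turns such a list into timestamped lines. The sample data is made up for illustration, and the helpers format_timestamp and format_segments are hypothetical.

```python
def format_timestamp(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Render whisper-style segment dicts as '[MM:SS - MM:SS] text' lines."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return lines


# Made-up segments in the shape of result["segments"]
sample = [
    {"start": 0.0, "end": 4.2, "text": " First sentence."},
    {"start": 4.2, "end": 65.0, "text": " Second sentence."},
]
for line in format_segments(sample):
    print(line)
```

With a real transcription, the call would be format_segments(result["segments"]).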

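4. Split the text into smaller chunks

To keep each LLM call at a manageable size, split the long transcript with RecursiveCharacterTextSplitter. Each chunk holds at most 1,000 characters and shares at most 200 characters with its neighbor.

from langchain.schema.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
# split_text already returns chunks within chunk_size, so wrapping them in
# Document objects is enough; a second split_documents pass is not needed
split_docs = [Document(page_content=x) for x in text_splitter.split_text(text)]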
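
To see what chunk_size and chunk_overlap control, here is a simplified fixed-window splitter in plain Python. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to break at separators such as paragraph and line breaks); it only illustrates the size and overlap parameters. The function name sliding_chunks is hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each window starts chunk_size - chunk_overlap characters after the previous one, which is why neighboring chunks repeat part of each other's text.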

5. Summarize the text

Use LangChain's MapReduceDocumentsChain to summarize the text: the map step summarizes each chunk on its own, and the reduce step merges those partial summaries into a final one. The example assumes an Ollama server is running locally with the exaone3.5 model already pulled. These chain classes are deprecated in recent LangChain releases; the LangChain migration guide for map-reduce chains describes the replacement.

from langchain.chains import MapReduceDocumentsChain
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

llm = ChatOllama(model="exaone3.5", temperature=0)

# Map step (partial summaries)
map_template = """Summarize the main points of the following document:
{docs}
"""
map_prompt = ChatPromptTemplate([("human", map_template)])
map_chain = LLMChain(llm=llm, prompt=map_prompt)

# Reduce step (final summary)
reduce_template = """Combine the following summaries into a final summary:
{doc_summaries}
"""
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

# Joins the partial summaries into one string and passes it to the reduce prompt
combine_documents_chain = StuffDocumentsChain(llm_chain=reduce_chain, document_variable_name="doc_summaries")

map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=map_chain,
    reduce_documents_chain=combine_documents_chain,
    document_variable_name="docs",
    return_intermediate_steps=False,
)

summary = map_reduce_chain.run(split_docs)
print("Summary:", summary)
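
The map and reduce steps can also be illustrated without LangChain. In this sketch, llm is any callable that takes a prompt string and returns a string. fake_llm is a made-up stand-in so the example runs offline, and map_reduce_summarize is a hypothetical helper rather than part of any library.

```python
def map_reduce_summarize(chunks, llm, map_template, reduce_template):
    """Summarize each chunk on its own (map), then summarize the summaries (reduce)."""
    # Map: one independent LLM call per chunk
    partial_summaries = [llm(map_template.format(docs=chunk)) for chunk in chunks]
    # Reduce: join the partial summaries and ask for one final summary
    joined = "\n\n".join(partial_summaries)
    return llm(reduce_template.format(doc_summaries=joined))


def fake_llm(prompt):
    """Stand-in for a real model: echoes the first 20 characters of the last line."""
    return prompt.strip().splitlines()[-1][:20]


summary = map_reduce_summarize(
    ["alpha beta gamma", "delta epsilon"],
    fake_llm,
    map_template="Summarize:\n{docs}",
    reduce_template="Combine:\n{doc_summaries}",
)
print(summary)
```

Because every map call is independent of the others, the chunks could be processed in any order or in parallel; only the reduce call needs all partial summaries at once.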

6. Check the result

Running the steps above transcribes the audio extracted from the YouTube video and prints a summary of its main points.

This approach makes it quick to summarize long videos and grasp their key content.

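'''
ref:
1. Extracting audio from a YouTube video: https://theonly1.tistory.com/2479
2. Extracting text from audio: https://m.blog.naver.com/baemsu/223239058462?recommendTrackingCode=2
3. Summarizing text: https://python.langchain.com/docs/versions/migrating_chains/map_reduce_chain/

YouTube video = "https://youtu.be/mCyY8pQDpJM?si=sd324t9fTg9EDZBM" (1 min 40 s)
'112 출동' 경찰관 흉기 피습..범인은 실탄에 맞아 사망 (2025.02.26/12MBC뉴스)
(Title in English: Police officer responding to a 112 call attacked with a knife; the attacker died after being hit by a live round)
'''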

from langchain_ollama import ChatOllama
import yt_dlp
import whisper
from langchain.schema.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
# from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain

llm = ChatOllama(model="exaone3.5", temperature=0)

# Extract audio from the YouTube video
youtube_link = "https://youtu.be/mCyY8pQDpJM?si=sd324t9fTg9EDZBM"

ydl_opts = {
    'format': 'bestaudio/best',
    'outtmpl': 'downloaded_audio.%(ext)s',
    'postprocessors': [{
        "key": "FFmpegExtractAudio",
        "preferredcodec": "mp3",
        "preferredquality": "320",
    }],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([youtube_link])

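# Extract text from the audio
# Transcription can take a long time; passing verbose=False to transcribe()
# displays a progress bar while it runs
model = whisper.load_model("tiny")  # "tiny", "base", "small", "medium", "large"
result = model.transcribe("downloaded_audio.mp3")

# # Length of the transcribed text
# print("Length of the transcribed text: ", len(result["text"]))
# # First 1000 characters of the transcribed text
# print("First 1000 characters of the transcribed text: ", result["text"][:1000])
# # Segment information
# print("Segment info: ", result["segments"][:1])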

# Summarize the text
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,   # desired chunk size
    chunk_overlap=200  # overlap between consecutive chunks
)
# split_text already returns chunks within chunk_size, so wrapping them in
# Document objects is enough; a second split_documents pass is not needed
split_docs = [Document(page_content=x) for x in text_splitter.split_text(result["text"])]
print("split_docs: ", split_docs)

# Build the map-reduce chain
from langchain.chains import MapReduceDocumentsChain, ReduceDocumentsChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Map
# Map prompt, e.g. "Write a concise summary of the following: {docs}."
map_template = """The following is a set of documents.
{docs}
Based on this list of documents, please identify the main themes."""
map_prompt = ChatPromptTemplate([("human", map_template)])

# Map chain
map_chain = LLMChain(llm=llm, prompt=map_prompt)
# map_chain = map_prompt | llm | StrOutputParser()
# reduce_chain = reduce_prompt | llm | StrOutputParser()

# Reduce
# Reduce prompt
# reduce_template = """
# The following is a set of summaries:
# {docs}
# Take these and distill it into a final, consolidated summary
# of the main themes.
# """
# The final answer should be a single paragraph of about 100 words in Korean.
reduce_template = """You are a reporter. Do not use filler phrases such as "Of course". The following is a set of summaries:
{doc_summaries}
Based on these summaries, please walk through the main themes in roughly five paragraphs, in a friendly tone."""
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])

# Reduce chain
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
    llm_chain=reduce_chain, document_variable_name="doc_summaries"
)

# Combines and iteratively reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
    # This is the final chain that is called.
    combine_documents_chain=combine_documents_chain,
    # Used if the documents exceed the context for `StuffDocumentsChain`
    collapse_documents_chain=combine_documents_chain,
    # The maximum number of tokens to group documents into.
    token_max=2000,
)

# Combining documents by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
    # Map chain
    llm_chain=map_chain,
    # Reduce chain
    reduce_documents_chain=reduce_documents_chain,
    # The variable name in the llm_chain to put the documents in
    document_variable_name="docs",
    # Return the results of the map steps in the output
    return_intermediate_steps=False,
)

sum_result = map_reduce_chain.run(split_docs)
print(sum_result)

# response = llm.invoke("<content>" + assistant_content + "</content> Write the <content> in Korean.")
# print(response)
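
ReduceDocumentsChain adds a collapse step: when the mapped summaries together exceed token_max, they are grouped, each group is condensed, and the process repeats until everything fits into one final call. The following is a simplified sketch of that idea, not the library's implementation. It assumes character counts as a stand-in for token counts, and the names group_by_budget, collapse_then_combine, and fake_condense are hypothetical.

```python
def group_by_budget(texts, budget):
    """Greedily pack texts into groups whose total length stays within budget."""
    groups, current, size = [], [], 0
    for text in texts:
        if current and size + len(text) > budget:
            groups.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        groups.append(current)
    return groups


def collapse_then_combine(texts, condense, budget, max_rounds=10):
    """Condense groups of texts until they fit the budget, then condense once more."""
    for _ in range(max_rounds):  # guard against a condense step that never shrinks
        if sum(len(t) for t in texts) <= budget:
            break
        texts = [condense(group) for group in group_by_budget(texts, budget)]
    return condense(texts)


def fake_condense(group):
    """Made-up stand-in for an LLM call: keep the first 3 characters of each text."""
    return "".join(t[:3] for t in group)


result = collapse_then_combine(
    ["aaaaaa", "bbbbbb", "cccccc", "dddddd"], fake_condense, budget=12
)
print(result)
```

In the script above, the same StuffDocumentsChain serves as both the collapse chain and the final combine chain, which corresponds to using a single condense function for both roles here.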

Author: 김영국
