🎙 Speech-to-Text (STT) with Whisper – Windows FFmpeg & GPU Setup Guide

Python + Whisper + FFmpeg + Scoop + GPU (optional), all covered in one place!

✅ What this post covers

  • Installing and running Whisper
  • Installing FFmpeg (using PowerShell + Scoop)
  • Handling both GPU and CPU environments
  • Setting up a Python virtual environment
  • Saving the transcription results for Excel

📋 Prerequisites

Item | Description
Python 3.10 or 3.11 | 3.12 is not recommended because of compatibility issues
NVIDIA GPU (optional) | Device for GPU acceleration (e.g., RTX 4070)
FFmpeg | Required for audio preprocessing
PowerShell | Needed to install Scoop
pip | Python package manager (included by default)

🧱 STEP 1. Install Python (3.11 recommended)

Python 3.11 download link

PowerShell
python --version
py --version
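
The version requirement can also be checked from inside Python. The following is a minimal sketch, assuming the 3.10/3.11 range recommended in the prerequisites; the helper name `is_supported_python` is hypothetical and not part of any library.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True for Python 3.10 or 3.11, the versions this guide recommends."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in (10, 11)


if __name__ == "__main__":
    print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")
    if not is_supported_python():
        # Assumption: anything outside 3.10/3.11 is only flagged, not blocked.
        print("Warning: this guide recommends Python 3.10 or 3.11.")
```

Running this with the interpreter you plan to use for the virtual environment shows right away whether you picked the intended installation.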

🌱 STEP 2. Create a virtual environment

If you see the following error when activating, prefix the command with .\ as in step 2 below.

Error message: "The command activate was not found, but does exist in the current location. Windows PowerShell does not load commands from the current location by default."

If PowerShell refuses to run scripts at all, run PowerShell as administrator and enter Set-ExecutionPolicy Unrestricted.

PowerShell
## Create the virtual environment
python -m venv myenv
- You can use any name in place of myenv.

## How to enter the virtual environment
1. Move to the Scripts folder of the virtual environment you created.
-> cd myenv\Scripts
2. Enter the activate command.
-> .\activate

After entering it, the prompt changes to something like this:
(myenv) C:…>
→ That means you have successfully entered the virtual environment!
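
Besides looking at the prompt, you can confirm the activation from Python itself. This is a small sketch that relies on the standard `sys.prefix` and `sys.base_prefix` attributes; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtualenv())
print("Interpreter in use:", sys.executable)
```

If the interpreter path does not point into your environment's Scripts folder, the activation did not take effect in this terminal.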

🔌 STEP 3. Install FFmpeg (using Scoop)

PowerShell
# Allow PowerShell to run scripts for the current user
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser

# Install Scoop, a package manager for Windows
irm get.scoop.sh | iex
scoop install ffmpeg
ffmpeg -version
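
Since Whisper depends on FFmpeg for audio preprocessing, it can help to check from Python that the executable is actually reachable. A minimal sketch using only the standard library; `find_ffmpeg` is a hypothetical helper name.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path of the ffmpeg executable on PATH, or None."""
    return shutil.which("ffmpeg")


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH.")
else:
    # Show only the first line of `ffmpeg -version`, which holds the version string.
    completed = subprocess.run(
        [ffmpeg_path, "-version"], capture_output=True, text=True
    )
    lines = completed.stdout.splitlines()
    print(lines[0] if lines else ffmpeg_path)
```

If this prints the "not found" message right after installation, try a newly opened terminal, since PATH changes are not picked up by terminals that were already open.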

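🧠 STEP 4. Install PyTorch

GPU:

PowerShell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

CPU:

PowerShell
pip install torch torchvision torchaudio

Verify:

Python
import torch

# True only when a CUDA build of PyTorch is installed and a GPU is visible
print(torch.cuda.is_available())

# Query the GPU name only when CUDA is available; on CPU-only setups this call raises an error
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))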
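
To support both GPU and CPU machines with one script, the availability check can drive the device choice. The sketch below is one way to do it, not the only one; `pick_device` is a hypothetical helper, and the fallback to CPU when PyTorch is missing is an assumption made so the snippet runs anywhere.

```python
def pick_device(cuda_available):
    """Return the device string to hand to Whisper: "cuda" or "cpu"."""
    return "cuda" if cuda_available else "cpu"


try:
    import torch

    cuda_available = torch.cuda.is_available()
except ImportError:
    # Assumption: treat a missing PyTorch install as "no GPU".
    cuda_available = False

device = pick_device(cuda_available)
print(f"Selected device: {device}")
# The result can then be passed on, e.g. whisper.load_model("medium", device=device)
```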

📦 STEP 5. Install Whisper

PowerShell
pip install openai-whisper

🧪 STEP 6. Whisper CLI usage example

PowerShell
whisper "C:\Users\YOU\Desktop\audio.wav" --model medium --language Korean
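
If you want to launch the same CLI from a script, the command can be assembled as an argument list. This is a sketch under assumptions: `build_whisper_command` is a hypothetical helper, and the `--output_dir` and `--output_format` options are added here for illustration on top of the flags used above.

```python
def build_whisper_command(audio_path, model="medium", language="Korean",
                          output_dir=".", output_format="txt"):
    """Build the argument list for the whisper CLI without running it."""
    return [
        "whisper", audio_path,
        "--model", model,
        "--language", language,
        "--output_dir", output_dir,
        "--output_format", output_format,
    ]


# The path mirrors the placeholder used in this guide; replace YOU with your user name.
command = build_whisper_command(r"C:\Users\YOU\Desktop\audio.wav")
print(" ".join(command))
# To actually run it: import subprocess; subprocess.run(command, check=True)
```

Passing a list rather than one long string avoids quoting problems when the audio path contains spaces.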

📊 STEP 7. STT with Python code + saving for Excel

Python
import whisper
import pandas as pd
import datetime
import os

def format_time(seconds):
    # Convert seconds to H:MM:SS, dropping fractions of a second
    return str(datetime.timedelta(seconds=int(seconds)))

# Load the Whisper model on the CPU
print("Loading Whisper model on CPU...")
model = whisper.load_model("medium", device="cpu")

# Folder containing the audio files (the r prefix prevents backslash path errors)
folder = r"audio"

results = []

# Find and transcribe the audio files in the folder
for file in os.listdir(folder):
    if file.endswith(('.wav', '.mp3', '.m4a')):
        path = os.path.join(folder, file)
        print(f"Transcribing: {file}")
        result = model.transcribe(path, language="ko")

        # Collect the text of each time segment
        segments = []
        for seg in result['segments']:
            start, end = format_time(seg['start']), format_time(seg['end'])
            text = seg['text'].strip()
            segments.append(f'[{start} - {end}] "{text}"')

        results.append({
            "filename": file,
            "transcript": "\n".join(segments)
        })

# Get the current date in YYYYMMDD format
current_date = datetime.datetime.now().strftime("%Y%m%d")

# Build the output file name
file_name = f"{current_date}_results.csv"
output_path = os.path.join('.', file_name)  # save in the current folder

# Save the DataFrame as CSV (cp949 so Excel on Korean Windows shows the text correctly)
df = pd.DataFrame(results)
df.to_csv(output_path, index=False, encoding='cp949')

print(f"\n✅ Transcription complete! Results saved to: {output_path}")
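
The post-processing part of the script can be tried without Whisper installed. The sketch below feeds made-up segments through the same timestamp formatting and writes the CSV with `utf-8-sig` instead of `cp949`. That encoding is an alternative, not what the script above does: `cp949` raises `UnicodeEncodeError` on characters it cannot represent, while a UTF-8 file with a BOM is generally detected correctly by Excel. The names `segments_to_text`, `save_csv`, and the sample data are hypothetical.

```python
import csv
import datetime
import os
import tempfile


def format_time(seconds):
    """Convert seconds to H:MM:SS, dropping fractions of a second."""
    return str(datetime.timedelta(seconds=int(seconds)))


def segments_to_text(segments):
    """Turn Whisper-style segments into one '[start - end] "text"' line each."""
    lines = []
    for seg in segments:
        start, end = format_time(seg["start"]), format_time(seg["end"])
        text = seg["text"].strip()
        lines.append(f'[{start} - {end}] "{text}"')
    return "\n".join(lines)


def save_csv(rows, path):
    """Write rows to a CSV with a UTF-8 BOM so Excel can detect the encoding."""
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=["filename", "transcript"])
        writer.writeheader()
        writer.writerows(rows)


# Made-up segments standing in for result["segments"] from model.transcribe()
fake_segments = [
    {"start": 0.0, "end": 4.2, "text": " 안녕하세요"},
    {"start": 4.2, "end": 75.9, "text": " 반갑습니다"},
]
transcript = segments_to_text(fake_segments)
print(transcript)

# Written to the temp folder so the demo does not clutter the working directory
out_path = os.path.join(tempfile.gettempdir(), "sample_results.csv")
save_csv([{"filename": "sample.wav", "transcript": transcript}], out_path)
print("Saved to:", out_path)
```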

❗ Common errors & fixes

Error message | Fix
'ffmpeg' is not recognized as an internal or external command | Install via Scoop, or install manually and add it to the PATH environment variable
torch.cuda.is_available() = False | The GPU build of PyTorch is not installed. Reinstall with the cu121 build
%1 is not a valid Win32 application | Conflict between Python 3.12 and the GPU build of PyTorch. 3.10 to 3.11 recommended
FileNotFoundError | Specify the full path to the file
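
For the FileNotFoundError case, resolving the path to an absolute one before handing it to Whisper makes the failure easier to diagnose. A minimal sketch with `pathlib`; `resolve_audio_path` and the sample path `audio/sample.wav` are hypothetical.

```python
from pathlib import Path


def resolve_audio_path(path_str):
    """Return the absolute path of an audio file, or raise FileNotFoundError."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        # Include the resolved path so it is clear where Python actually looked.
        raise FileNotFoundError(f"Audio file not found: {path}")
    return path


try:
    print(resolve_audio_path("audio/sample.wav"))
except FileNotFoundError as err:
    print(err)
```

The printed absolute path shows which folder the relative path was resolved against, which is usually the terminal's current directory rather than the script's folder.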

🔗 Reference links

  • Whisper GitHub
  • Scoop official site

🔊 Korean speech samples

  • 📚 AI-Hub – Korean lecture speech data: high-quality lecture speech data based on EBS educational videos. Suitable for AI training or for producing educational content.