
Converting HWP to PDF

by _viper_ 2025. 4. 13.

Overview

  • There is no Python library for converting HWP to PDF in a Linux environment.
  • Hancom's paid API might make the conversion possible, but it requires the Hancom integrated viewer to be installed on the server.
  • The Ubuntu package libreoffice, combined with an extension, can do the conversion, so the approach here is to install the required packages on Linux and then run a CLI command.

 

Installation and testing

1. Install LibreOffice

sudo apt update && sudo apt install libreoffice -y
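
Before moving on, it can help to confirm that the LibreOffice launcher is actually on PATH. The following is a minimal sketch, not part of the original setup; the function name find_libreoffice and the candidate names are assumptions chosen for illustration (the package usually provides both libreoffice and soffice).

```python
import shutil

def find_libreoffice(candidates=("libreoffice", "soffice")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

if __name__ == "__main__":
    # Prints the resolved path, or a hint that the install step did not succeed.
    print(find_libreoffice() or "LibreOffice not found on PATH")
```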

2. Install Korean fonts

sudo apt install fonts-nanum fonts-noto-cjk fonts-unfonts-core
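
Missing Korean fonts tend to show up as empty boxes in the generated PDF, so a quick check after this step may save time. The sketch below assumes the fontconfig tool fc-list is available (it usually is on Ubuntu); the helper names parse_families and korean_font_families are hypothetical.

```python
import subprocess

def parse_families(fc_list_output):
    """Split `fc-list` family output into a sorted list of unique family names."""
    names = set()
    for line in fc_list_output.splitlines():
        # One font can report several comma-separated (localized) family names.
        for name in line.split(","):
            name = name.strip()
            if name:
                names.add(name)
    return sorted(names)

def korean_font_families():
    """Return the font families that fontconfig reports as covering Korean."""
    try:
        result = subprocess.run(
            ["fc-list", ":lang=ko", "family"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []  # fc-list missing or failed; treated as "nothing found"
    return parse_families(result.stdout)

if __name__ == "__main__":
    families = korean_font_families()
    print(families if families else "No Korean fonts reported by fc-list")
```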

3. Install the LibreOffice extension for HWP support (H2Orestart; the .oxt file is assumed to be already downloaded to /tmp)

libreoffice --headless --norestore --nofirststartwizard --accept="socket,host=localhost,port=2002;urp;" --nodefault --nologo &
sleep 10
sudo unopkg add --shared /tmp/H2Orestart-0.7.0.oxt
pkill -f soffice
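
To confirm that the extension was registered, the output of unopkg list --shared can be searched for its name. This is only a sketch under assumptions: it relies on the extension name appearing somewhere in that listing, and the helper names extension_listed and shared_extension_installed are hypothetical.

```python
import subprocess

def extension_listed(listing_text, name="H2Orestart"):
    """Case-insensitive check for the extension name in `unopkg list` output."""
    return name.lower() in listing_text.lower()

def shared_extension_installed(name="H2Orestart"):
    """Run `unopkg list --shared` and report whether the name appears in it."""
    try:
        result = subprocess.run(
            ["unopkg", "list", "--shared"],
            capture_output=True, text=True, timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # unopkg missing or hung; treated as "not installed"
    # Assumption: the listing mentions the extension by name (stdout or stderr).
    return extension_listed(result.stdout + result.stderr, name)

if __name__ == "__main__":
    print("H2Orestart installed:", shared_extension_installed())
```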

4. Conversion command

libreoffice --headless --infilter=Hwp2002_File --convert-to pdf:writer_pdf_Export --outdir ./output sample.hwp
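
With --convert-to and --outdir, LibreOffice writes the result into the output directory under the input's base name with the new extension, so the command above should produce ./output/sample.pdf. A small sketch of that naming rule, using a hypothetical helper expected_pdf_path:

```python
from pathlib import Path

def expected_pdf_path(hwp_file, output_dir):
    """Return where the converted PDF should appear for a given input file."""
    # Same base name as the input, with a .pdf suffix, inside output_dir.
    return Path(output_dir) / (Path(hwp_file).stem + ".pdf")

if __name__ == "__main__":
    print(expected_pdf_path("sample.hwp", "./output"))
```

Checking that this path exists after the command finishes is a simple way to tell whether the conversion actually produced a file.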
 

5. Python example that runs the command after the setup above

import os
import subprocess

def convert_hwp_to_pdf(hwp_file, output_dir="../data/output"):
    """
    Convert an HWP file to PDF using LibreOffice.

    :param hwp_file: path of the HWP file to convert
    :param output_dir: directory where the converted PDF is saved (default: ../data/output)
    """
    # Create the output directory if it does not exist
    os.makedirs(output_dir, exist_ok=True)

    # Run LibreOffice; passing an argument list avoids shell quoting problems
    # with spaces or special characters in either path
    subprocess.run(
        ["libreoffice", "--headless", "--infilter=Hwp2002_File",
         "--convert-to", "pdf:writer_pdf_Export",
         "--outdir", str(output_dir), str(hwp_file)],
        check=True,  # raise CalledProcessError on a non-zero exit code
    )

# Usage example
hwp_file = "../data/sample.hwp"
convert_hwp_to_pdf(hwp_file)
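
For a whole directory of files, the same command can be wrapped in a loop with a timeout and a check that each PDF was really written. This is a sketch rather than a tested production routine: build_command, convert_all, the runner parameter, and the 120-second timeout are all assumptions introduced for illustration.

```python
import subprocess
from pathlib import Path

def build_command(hwp_file, output_dir):
    """Build the LibreOffice argument list for one file (same flags as step 4)."""
    return [
        "libreoffice", "--headless",
        "--infilter=Hwp2002_File",
        "--convert-to", "pdf:writer_pdf_Export",
        "--outdir", str(output_dir),
        str(hwp_file),
    ]

def convert_all(input_dir, output_dir, runner=subprocess.run, timeout=120):
    """Convert every *.hwp file in input_dir; return (converted, failed) lists."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    converted, failed = [], []
    for hwp in sorted(Path(input_dir).glob("*.hwp")):
        try:
            runner(build_command(hwp, output_dir), check=True, timeout=timeout)
        except (subprocess.SubprocessError, OSError):
            # Non-zero exit, timeout, or missing libreoffice binary.
            failed.append(hwp)
            continue
        pdf = output_dir / (hwp.stem + ".pdf")
        # The exit code alone may not reflect failure, so check the file too.
        (converted if pdf.exists() else failed).append(hwp)
    return converted, failed

if __name__ == "__main__":
    ok, bad = convert_all("../data", "../data/output")
    print(f"converted: {len(ok)}, failed: {len(bad)}")
```

The runner parameter exists only so the loop can be exercised without LibreOffice installed; in normal use the default subprocess.run is what executes the command.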
 
