# Install the dependency first (shell command, not Python):
#   pip3 install pdfminer2

import io

from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage


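def convert_pdf_to_txt(path):
    """Extract the text of every page in the PDF at `path` and return it as a string."""
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()  # default layout-analysis parameters
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    password = ""      # empty string: the PDF is not password-protected
    maxpages = 0       # 0 means no page limit
    caching = True     # cache shared resources such as fonts
    pagenos = set()    # empty set: process all pages

    try:
        # The context manager closes the file even if parsing fails.
        with open(path, 'rb') as fp:
            for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages,
                                          password=password,
                                          caching=caching,
                                          check_extractable=True):
                interpreter.process_page(page)
        text = retstr.getvalue()
    finally:
        device.close()
        retstr.close()
    return text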

if __name__ == "__main__":
    path = "beautiful soup.pdf"
    print(convert_pdf_to_txt(path))
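
The function above returns the whole document as one string. If you need the text page by page, a minimal sketch like the one below may help. It assumes that the text converter ends each page with a form feed character (`\f`), which is how pdfminer's `TextConverter` behaves in the versions I am aware of; check the output of your installed version before relying on it. The helper name `split_pages` is hypothetical and not part of pdfminer.

```python
def split_pages(text):
    """Split extracted text into one string per page.

    Assumption: pages are separated by form feed characters ('\f').
    """
    pages = text.split("\f")
    # A trailing form feed leaves an empty final chunk; drop it.
    if pages and pages[-1].strip() == "":
        pages.pop()
    # Trim the surrounding whitespace of each page.
    return [page.strip() for page in pages]


# Stand-in for the string returned by convert_pdf_to_txt(path).
sample = "First page\n\n\fSecond page\n\f"
for number, page in enumerate(split_pages(sample), start=1):
    print(number, repr(page))
```

With a real file you would pass the result of `convert_pdf_to_txt(path)` instead of `sample`.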