[Israel.pm] How to get TEXT from PDF ?

Roey Almog (Infoneto Ltd) almog at infoneto.co.il
Sun Jun 28 06:56:32 PDT 2009


Offer

Thanks, but I have already tried it. You get the file structure with
all the PDF metadata etc.,
but the text itself is encoded somehow, so it does not help.

I thought that I could get away with a module that does what I need, but
it seems it only works for simple PDFs.
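
To illustrate what "encoded somehow" can mean: one common reason (I cannot say
for sure it is the cause with my files) is that page content streams are
stored zlib-compressed (/Filter /FlateDecode), so the raw output looks like
binary noise. Below is a minimal sketch using only the core Compress::Zlib
module. The content stream and the helper simple_text are made up for the
example; it only handles plain "(string) Tj" operators, which is roughly why
this kind of approach works for simple PDFs and breaks on ones that use TJ
arrays, hex strings, or custom font encodings.

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress uncompress);

# Hypothetical page content stream: two text-showing operators (Tj).
my $content = "BT /F1 12 Tf 72 720 Td (Hello) Tj ( World) Tj ET";

# Roughly what a PDF using /FlateDecode stores: zlib-compressed bytes.
my $stored = compress($content);

# Return the literal strings shown by Tj operators in a decoded stream.
# Deliberately naive: ignores TJ arrays, hex strings, escapes and
# font encodings (hypothetical helper, not part of any module).
sub simple_text {
    my ($stream) = @_;
    return join '', $stream =~ /\(([^()\\]*)\)\s*Tj/g;
}

# Decompress first, then pull out the strings.
my $decoded = uncompress($stored);
print simple_text($decoded), "\n";
```

On a real file you would first have to locate each page's stream and its
filters, which is the part a module is supposed to do for you.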

Roey

> On Sun, Jun 28, 2009 at 2:14 PM, Offer Kaye<offer.kaye at gmail.com> wrote:
> > On Sun, Jun 28, 2009 at 12:36 PM, Roey Almog (Infoneto Ltd) wrote:
> >> Hi,
> >>
> >> I tried using CAM::PDF to get text out of PDF's in the following way:
> >>
> >
> > I haven't used it myself, but http://search.cpan.org/dist/PDF-API2
> > seems to be fairly up to date.
> > Try out the "stringify" method. I would guess something like:
> >
> > use PDF::API2;
> > my $pdf = PDF::API2->open('demo.pdf');
> > my $page = $pdf->openpage(1);
> > print $pdf->stringify;
> >
> > Cheers,
> > --
> > Offer Kaye

