See remarks on PDF::API2 in this book review of Perl Graphics Programming.
Etymon's PJ class library, coded in Java, includes a command-line utility, pjscript. Jens Vonderheide was also enthusiastic about it. As of early 2012, Etymon seems to be off-line.
PyPDF2 is Phaseit's fork of pyPdf. Both pyPdf and PyPDF2 are open-source pure-Python libraries which concentrate on manipulation of existing PDF instances.
ReportLab is an ambitious, industrial-strength library largely focused on precise creation of PDF. Understand clearly that it has an open-source, no-charge base, but also a for-fee "ReportLab PLUS" extension of that base. ReportLab PLUS involves a relatively large cost and a relatively large extension of capabilities and services.
from pageCatcher import copyPages
from reportlab.pdfgen import canvas

def makeAppendedResult(result, first_source, second_source):
    # Write the pages of first_source, then those of second_source, to result.
    c = canvas.Canvas(result)
    copyPages(first_source, c)
    copyPages(second_source, c)
    c.showOutline()
    c.save()
The import_HTML I use is this:
# In response to a correspondent's comment, I replied:
# "Bleah; ignore the Python.  I'll comment it to make this
# clear:  the point is just that HTML->PDF is achieved as
# HTML->PS->PDF, the second step is canonical, and the
# first is done with a specific command-line tool."

# Copyright Kyler Laird 2001.
# Freely redistributable.
#
# Import from HTML.
def import_HTML(self, html, color=0, style=None, landscape=0, number=0):
    infile = self._write_string_to_tmpfile(html, ext='HTML')
    self.outfile = self._mktemp('ps')
    options = []
    if number:
        options.append('--number')
        options.append('--startno %d' % number)
    if landscape:
        options.append('--landscape')
    if color:
        options.append('--colour')
    if style:
        stylefile = self._write_string_to_tmpfile(style, ext='style')
        # options.append('--style "%s"' % (style))
        options.append('-f "%s"' % (stylefile))
    command_string = "html2ps %s -o %s %s" % (string.join(options, ' '), self.outfile, infile)
    self._run(command_string)
    return

There are several ways to render HTML as PS.
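For readers who want the HTML->PS->PDF idea without the surrounding class, here is a minimal stand-alone sketch. It is not Kyler's code: the function names are hypothetical, the html2ps options are only the ones that appear in import_HTML, and the second step assumes that Ghostscript's ps2pdf wrapper is the PS->PDF tool at hand. Arguments are passed as lists rather than as one shell string, so file names with spaces need no quoting.

```python
import subprocess


def html2ps_command(infile, outfile, number=0, landscape=False,
                    color=False, stylefile=None):
    """Build the html2ps argument list, using the options seen in import_HTML."""
    command = ["html2ps"]
    if number:
        command += ["--number", "--startno", str(number)]
    if landscape:
        command.append("--landscape")
    if color:
        command.append("--colour")
    if stylefile:
        command += ["-f", stylefile]
    command += ["-o", outfile, infile]
    return command


def ps2pdf_command(psfile, pdffile):
    """Build the PS->PDF step (assumption: Ghostscript's ps2pdf is installed)."""
    return ["ps2pdf", psfile, pdffile]


def html_file_to_pdf(infile, pdffile, **html2ps_options):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    psfile = pdffile + ".ps"  # hypothetical name for the intermediate file
    subprocess.run(html2ps_command(infile, psfile, **html2ps_options), check=True)
    subprocess.run(ps2pdf_command(psfile, pdffile), check=True)
    return pdffile
```

A call such as html_file_to_pdf("page.html", "page.pdf", landscape=True) would then run html2ps followed by ps2pdf, provided both programs are on the PATH.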
I also (episodically) maintain pages on PDF in general, PDF "converters", PDF generation, ...
In December 2001, I published a breezy introduction to no-cost PDF resources for my "Open Sources" column. I also wrote "Yes You Can" (August 2002), "Low-cost PDF" (April 2003), "PDF for C and C++ Developers" (October 2003), and ... For more information on the products described there, start with the home pages of PDFlib, PJ, and ReportLab. I'll probably write more on ReportLab programming and business strategy throughout 2002, perhaps beginning with a piece on PDF security; write me if there's a particular aspect you want me to cover. Note that the Ohio Department of Transportation's open-source JavaPDF is another product worth considering along with PDFlib, PJ, ReportLab, and all of CPAN's PDF directory.
I recommend reading "Kyler Laird's PDF utilities" both for the usefulness of the tools and hyperlinks available there, and also for the correct engineering commentary. Dave Touretzky maintains a "Gallery of Adobe Remedies" with more comprehensive information on PDF security, including a pointer to a Perl script which decrypts PDF.