Pdfbox
A BaseX interface for Pdfbox version 3. It is packaged in the EXPath format and is tested against BaseX 10.7 and 11.7. Note: the current release (v0.1.5) also works on BaseX 9.7.
- The Pdfbox 3 FAQ may be useful.
Features
The features focus on extracting information from PDFs rather than creation or editing.
- read the PDF page count.
- read any PDF outline and return it as map(s) or XML.
- read page labels.
- read page text.
- save a PDF page range to a new PDF.
- save an image of a rendered PDF page.
Install
Pre-built pdfbox-x.y.z.xar files are available on the releases page. They can be installed using the standard repository functions or using the GUI.
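As a minimal sketch of the repository-function route, the query below uses repo:install from the BaseX Repository Module. The path is a placeholder (an assumption, not a real location), and x.y.z stands for whichever release was downloaded.

```xquery
(: Install the downloaded package into the BaseX repository.
   The path is hypothetical; point it at the file you downloaded. :)
repo:install("path/to/pdfbox-x.y.z.xar")

(: Running repo:list() as a separate query afterwards should show
   the package among the installed ones. :)
```

The same step is available from the BaseX command line as REPO INSTALL followed by the path.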
Usage
import module namespace pdfbox="org.expkg_zone58.Pdfbox3";

(: open the PDF and collect the text of every page :)
pdfbox:with-pdf("...path/to/pdf.pdf",
  function($pdf){
    (1 to pdfbox:page-count($pdf)) ! pdfbox:page-text($pdf, .)
  }
)
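A variation on the same pattern, offered as a sketch rather than a documented recipe: it wraps each page's text in an XML element so the page number travels with the text. It uses only the three functions shown above, and it assumes that pdfbox:with-pdf returns the result of the function it is given and that pdfbox:page-text returns a string. The element names pdf and page are made up for illustration.

```xquery
import module namespace pdfbox="org.expkg_zone58.Pdfbox3";

pdfbox:with-pdf("...path/to/pdf.pdf",
  function($pdf){
    let $count := pdfbox:page-count($pdf)
    return
      (: hypothetical wrapper elements, not part of the module :)
      <pdf pages="{$count}">{
        for $n in 1 to $count
        return <page n="{$n}">{ pdfbox:page-text($pdf, $n) }</page>
      }</pdf>
  }
)
```

Keeping all access to $pdf inside the callback follows the shape of the original example, which suggests the document handle is only meant to be used within that function.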
Build
scripts/make-xar.xq packages the required jars and xqm files to a xar file in the dist folder.
Action support
The workflow ci-basex.yaml builds and tests the package. It can be used as an action on GitHub or on a local Gitea installation.