Pdfbox
A BaseX interface for Pdfbox version 3. It is packaged using the EXPath format and is tested against BaseX 10.7 and 11.7. Note: the current release (v0.1.5) also works on BaseX 9.7.
- The Pdfbox 3 FAQ may be useful.
Features
The features focus on extracting information from PDFs rather than creation or editing.
- read the PDF page count.
- read any PDF outline and return it as map(s) or XML.
- read page labels.
- read page text.
- save a PDF page range to a new PDF.
- save an image of a rendered PDF page.
Install
Pre-built pdfbox-x.y.z.xar files are available on the releases page. They can be installed using the standard repository functions or using the GUI.
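As a sketch of the repository route, the BaseX Repository Module can install a downloaded package from XQuery. The file path below is a placeholder for wherever the release file was saved; substitute the actual version number.

```xquery
(: Install the downloaded package. The path is a placeholder, not a real location. :)
repo:install("path/to/pdfbox-x.y.z.xar")
```

Running repo:list() as a separate query afterwards should show the package among those installed. The same step is available from the BaseX command line as REPO INSTALL followed by the path.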
Usage
import module namespace pdfbox="org.expkg_zone58.Pdfbox3";

pdfbox:with-pdf("...path/to/pdf.pdf",
  function($pdf){
    (1 to pdfbox:page-count($pdf)) ! pdfbox:page-text($pdf, .)
  }
)
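The same pattern extends to per-page processing. The following sketch uses only the three functions shown above and assumes that pdfbox:page-text returns a single string; the element and attribute names (page, n, chars) are made up for illustration.

```xquery
import module namespace pdfbox="org.expkg_zone58.Pdfbox3";

(: Report the length of the extracted text for each page. :)
pdfbox:with-pdf("...path/to/pdf.pdf",
  function($pdf){
    for $n in 1 to pdfbox:page-count($pdf)
    (: assumption: page-text returns a single string for page $n :)
    let $text := pdfbox:page-text($pdf, $n)
    return <page n="{$n}" chars="{string-length($text)}"/>
  }
)
```

Keeping all work inside the function passed to pdfbox:with-pdf means the document handle $pdf is only used within that call.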
Build
scripts/make-xar.xq packages the required jars and xqm files into a xar file in the dist folder.
Action support
The workflow ci-basex.yaml builds and tests the package. It can be used as an action on GitHub or on a local Gitea installation.