From 6dca7f38877c90f67d02d85d31ed78fbf88686df Mon Sep 17 00:00:00 2001
From: Andy Bunce
Date: Wed, 4 Jun 2025 16:18:22 +0100
Subject: [PATCH] [mod] ver 0.4.0

---
 changelog.md | 4 ++++
 docs/xqdoc/annotations.html | 2 +-
 docs/xqdoc/imports.html | 2 +-
 docs/xqdoc/index.html | 4 ++--
 docs/xqdoc/modules/F000001/index.html | 33 +++++++++++++++-----------
 docs/xqdoc/modules/F000001/xqdoc.xml | 23 +++++++++++-------
 docs/xqdoc/modules/F000001/xqparse.xml | 11 +++++----
 docs/xqdoc/restxq.html | 2 +-
 docs/xqdoc/validation-report.xml | 2 +-
 docs/xqdoc/xqdoca.xml | 2 +-
 package.json | 2 +-
 samples.pdf/readme.md | 12 +++++-----
 12 files changed, 58 insertions(+), 41 deletions(-)

diff --git a/changelog.md b/changelog.md
index e410a5f..217bb34 100644
--- a/changelog.md
+++ b/changelog.md
@@ -1,3 +1,7 @@
+# 0.4.0 2025-06-04
+* ADD Label access
+* various renames
+* Doc updates
 # 0.3.6 2025-05-31
 * Add metadata function
 * rename page-size->page-media-box
diff --git a/docs/xqdoc/annotations.html b/docs/xqdoc/annotations.html
index f45c54b..f1c7141 100644
--- a/docs/xqdoc/annotations.html
+++ b/docs/xqdoc/annotations.html
@@ -8,4 +8,4 @@
Contents
  1. Summary
  2. Annotations
    1. 2.1 http://www.w3.org/2012/xquery

Summary

This project uses 1 annotation namespaces.

Related documents
ViewDescriptionFormat
reportIndex of sourcesxhtml
restxqSummary of REST interfacexhtml
importsSummary of import usagexhtml
imports-diagProject wide module imports as html mermaid class diagramhtml5
imports-diag.mmdProject wide module imports as a mermaid class diagramtext
xqdoca.xmlxqDocA run configuration report (XML)xml
xqdoc-validatevalidate generated xqdoc filesxml

Annotations

2.1 http://www.w3.org/2012/xquery

private
\ No newline at end of file +   on Wednesday, 4th June 2025

\ No newline at end of file diff --git a/docs/xqdoc/imports.html b/docs/xqdoc/imports.html index f602dd3..06a69c6 100644 --- a/docs/xqdoc/imports.html +++ b/docs/xqdoc/imports.html @@ -6,4 +6,4 @@ Contents
  1. Summary
  2. Imports

    Summary

    Lists all modules imported.

    Related documents
    ViewDescriptionFormat
    reportIndex of sourcesxhtml
    restxqSummary of REST interfacexhtml
    imports-diagProject wide module imports as html mermaid class diagramhtml5
    imports-diag.mmdProject wide module imports as a mermaid class diagramtext
    annotationsSummary of XQuery annotation usexhtml
    xqdoca.xmlxqDocA run configuration report (XML)xml
    xqdoc-validatevalidate generated xqdoc filesxml

    Imports (0)

    \ No newline at end of file +   on Wednesday, 4th June 2025

    \ No newline at end of file diff --git a/docs/xqdoc/index.html b/docs/xqdoc/index.html index 09f2ec5..83d6779 100644 --- a/docs/xqdoc/index.html +++ b/docs/xqdoc/index.html @@ -6,9 +6,9 @@ 1 XQuery source files, and uses 1 annotation namespaces.

    This document was built from source folder C:/Users/mrwhe/git/expkg-zone58/pdfbox/src/ on - Wednesday, 4th June 2025.

    Related documents
    ViewDescriptionFormat
    reportIndex of sourcesxhtml
    restxqSummary of REST interfacexhtml
    importsSummary of import usagexhtml
    imports-diagProject wide module imports as html mermaid class diagramhtml5
    imports-diag.mmdProject wide module imports as a mermaid class diagramtext
    annotationsSummary of XQuery annotation usexhtml
    xqdoca.xmlxqDocA run configuration report (XML)xml
    xqdoc-validatevalidate generated xqdoc filesxml

    XQuery Main (0)

    None

    XQuery Library (1)

    UriPrefixDescriptionUseAMetrics
    org.expkg_zone58.Pdfbox3pdfbox + Wednesday, 4th June 2025.

    Related documents
    ViewDescriptionFormat
    reportIndex of sourcesxhtml
    restxqSummary of REST interfacexhtml
    importsSummary of import usagexhtml
    imports-diagProject wide module imports as html mermaid class diagramhtml5
    imports-diag.mmdProject wide module imports as a mermaid class diagramtext
    annotationsSummary of XQuery annotation usexhtml
    xqdoca.xmlxqDocA run configuration report (XML)xml
    xqdoc-validatevalidate generated xqdoc filesxml

    XQuery Main (0)

    None

    XQuery Library (1)

    UriPrefixDescriptionUseAMetrics
    org.expkg_zone58.Pdfbox3pdfbox A BaseX 10.7+ interface to pdfbox3 https://...
    0
    Library
    ↖0
    P
    V#1
    F#37

    File view (1)

    Annotation namespaces (1)

    A total of 8 annotations are defined.

    http://www.w3.org/2012/xquery

    \ No newline at end of file +   on Wednesday, 4th June 2025

    \ No newline at end of file diff --git a/docs/xqdoc/modules/F000001/index.html b/docs/xqdoc/modules/F000001/index.html index 2be907e..d8bbd3c 100644 --- a/docs/xqdoc/modules/F000001/index.html +++ b/docs/xqdoc/modules/F000001/index.html @@ -1,7 +1,7 @@ src - xqDocA - xqDocA

    org.expkg_zone58.Pdfbox3  library module
    P

    Summary

    +

    org.expkg_zone58.Pdfbox3

    1. 1 Summary
    2. 2 Imports
    3. 3 Variables
      1. 3.1$pdfbox:property-map
        P
    4. 4 Functions
      1. 4.1binary
      2. 4.2bookmark
        P
      3. 4.3bookmark-xml
        P
      4. 4.4close
      5. 4.5do-until
        P
      6. 4.6extract-range
      7. 4.7find-page
      8. 4.8gregToISO
        P
      9. 4.9label-as-map
      10. 4.10label-as-string
      11. 4.11labels-as-map
      12. 4.12labels-as-string
      13. 4.13labels-by-page
      14. 4.14metadata
      15. 4.15number-of-bookmarks
      16. 4.16number-of-labels
      17. 4.17number-of-pages
      18. 4.18open
      19. 4.19outline
        P
      20. 4.20outline-xml
      21. 4.21outline_
        P
      22. 4.22page-labels
      23. 4.23page-media-box
      24. 4.24page-render
      25. 4.25page-text
      26. 4.26pdf-save
      27. 4.27property
      28. 4.28property-names
      29. 4.29read-stream
        P
      30. 4.30report
      31. 4.31report-save
      32. 4.32specification
      33. 4.33version
      34. 4.34with-pdf
    5. 5 Namespaces
    6. 6 RestXQ
    7. 7 Source

    Summary

    A BaseX 10.7+ interface to pdfbox3 https://pdfbox.apache.org/ , requires pdfbox jars on classpath, in lib/custom or xar @@ -16,7 +16,7 @@ refer to the same concept. Also label and (page)range are used interchangably&#x Defines a map from property names to evaluation method. Keys are property names, values are sequences of functions to get property value starting from a $pdf object. -
    Type
    References 15 functions from 3 modules
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getAuthor#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getCreationDate#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getCreator#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getKeywords#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getModificationDate#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getProducer#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getSubject#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getTitle#1
    • {java:org.apache.pdfbox.pdmodel.PDDocument}getDocumentInformation#1
    • pdfbox:gregToISO#1
    • pdfbox:labels-as-strings#1
    • pdfbox:number-of-bookmarks#1
    • pdfbox:number-of-labels#1
    • pdfbox:number-of-pages#1
    • pdfbox:specification#1
    Annotations (1)
    %private()
    Source ( 36 lines)
    variable $pdfbox:property-map:=map{
    +
    Type
    References 15 functions from 3 modules
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getAuthor#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getCreationDate#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getCreator#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getKeywords#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getModificationDate#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getProducer#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getSubject#1
    • {java:org.apache.pdfbox.pdmodel.PDDocumentInformation}getTitle#1
    • {java:org.apache.pdfbox.pdmodel.PDDocument}getDocumentInformation#1
    • pdfbox:gregToISO#1
    • pdfbox:labels-as-string#1
    • pdfbox:number-of-bookmarks#1
    • pdfbox:number-of-labels#1
    • pdfbox:number-of-pages#1
    • pdfbox:specification#1
    Annotations (1)
    %private()
    Source ( 37 lines)
    variable $pdfbox:property-map:=map{
       "#pages": pdfbox:number-of-pages#1,
     
       "#bookmarks": pdfbox:number-of-bookmarks#1,
    @@ -50,7 +50,8 @@ values are sequences of functions to get property value starting from a $pdf obj
       "modificationDate":  (PDDocument:getDocumentInformation#1,
                             PDDocumentInformation:getModificationDate#1,
                             pdfbox:gregToISO#1),
    -   "labels":      pdfbox:labels-as-strings#1                     
    +
    +   "labels":      pdfbox:labels-as-string#1                     
     }

    Functions

    4.1 pdfbox:binary

    Arities: #1

    Summary
    Create binary representation of $pdf object as xs:base64Binary
    Signatures
    pdfbox:binary ( @@ -158,7 +159,7 @@ as map(*) }

    4.10 pdfbox:label-as-string

    Arities: #2

    Summary
    label for $page formated as string, empty if none
    Signatures
    pdfbox:label-as-string ( - $pagelabels, $page as xs:integer ) as xs:string?
    Parameters
    • pagelabels as 
    • page as xs:integer
    Return
    • xs:string ?
    Referenced by 1 functions from 1 modules
    References 7 functions from 3 modules
    • {http://www.w3.org/2005/xpath-functions}empty#1
    • {http://www.w3.org/2005/xpath-functions}exists#1
    • {http://www.w3.org/2005/xpath-functions}string-join#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getPrefix#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getStart#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getStyle#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabels}getPageLabelRange#2
    Source ( 15 lines)
    function pdfbox:label-as-string($pagelabels,$page as  xs:integer)
    +			$pagelabels, $page as xs:integer ) as xs:string?
    Parameters
    • pagelabels as 
    • page as xs:integer
    Return
    • xs:string ?
    Referenced by 1 functions from 1 modules
    References 7 functions from 3 modules
    • {http://www.w3.org/2005/xpath-functions}empty#1
    • {http://www.w3.org/2005/xpath-functions}exists#1
    • {http://www.w3.org/2005/xpath-functions}string-join#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getPrefix#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getStart#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabelRange}getStyle#1
    • {java:org.apache.pdfbox.pdmodel.common.PDPageLabels}getPageLabelRange#2
    Source ( 15 lines)
    function pdfbox:label-as-string($pagelabels,$page as  xs:integer)
     as xs:string?{
       let $label:=PDPageLabels:getPageLabelRange($pagelabels,$page)
       return  if(empty($label))
    @@ -182,16 +183,17 @@ as map(*)*{
       return  $pagelabels
               !(0 to pdfbox:number-of-pages($pdf)-1)
               !pdfbox:label-as-map($pagelabels,.)
    -}

    4.12 pdfbox:labels-as-strings

    Arities: #1

    Summary
    -sequence of label ranges defined in PDF as formatted strings
    Signatures
    pdfbox:labels-as-strings +}

    4.12 pdfbox:labels-as-string

    Arities: #1

    Summary
    +sequence of label ranges defined in PDF as formatted strings +
    Signatures
    pdfbox:labels-as-string ( - $pdf as item() ) as xs:string
    Parameters
    • pdf as item()
    Return
    • xs:string
    Referenced by 0 functions from 0 modules
      References 3 functions from 2 modules
      Source ( 9 lines)
      function pdfbox:labels-as-strings($pdf as item())
      +			$pdf as item() ) as xs:string
      Parameters
      • pdf as item()
      Return
      • xs:string a custom representation of the labels e.g "0-*Cover,1r,11D"
      Referenced by 0 functions from 0 modules
        References 3 functions from 2 modules
        Source ( 9 lines)
        function pdfbox:labels-as-string($pdf as item())
         as xs:string{
           let $pagelabels:=PDDocument:getDocumentCatalog($pdf)
                            =>PDDocumentCatalog:getPageLabels()
           return $pagelabels
                  !(0 to pdfbox:number-of-pages($pdf)-1)
        -         !pdfbox:label-as-string($pagelabels,.)=>string-join(",")
        +         !pdfbox:label-as-string($pagelabels,.)=>string-join("
")
                     
         }

        4.13 pdfbox:labels-by-page

        Arities: #1

        Summary
        pageLabel for every page from derived from page-ranges @@ -246,7 +248,7 @@ as xs:integer }

        4.17 pdfbox:number-of-pages

        Arities: #1

        Summary
        Number of pages in PDF
        Signatures
        pdfbox:number-of-pages ( - $pdf as item() ) as xs:integer
        Parameters
        • pdf as item()
        Return
        • xs:integer
        Referenced by 2 functions from 1 modules
        References 1 functions from 1 modules
        • {java:org.apache.pdfbox.pdmodel.PDDocument}getNumberOfPages#1
        Source ( 4 lines)
        function pdfbox:number-of-pages($pdf as item())
        +			$pdf as item() ) as xs:integer
        Parameters
        • pdf as item()
        Return
        • xs:integer
        Referenced by 2 functions from 1 modules
        References 1 functions from 1 modules
        • {java:org.apache.pdfbox.pdmodel.PDDocument}getNumberOfPages#1
        Source ( 4 lines)
        function pdfbox:number-of-pages($pdf as item())
         as xs:integer{
           PDDocument:getNumberOfPages($pdf)
         }

        4.18 pdfbox:open

        Arities: #1#2

        Summary
@@ -646,7 +648,8 @@ declare %private variable $pdfbox:property-map:=map{
   "modificationDate":  (PDDocument:getDocumentInformation#1,
                         PDDocumentInformation:getModificationDate#1,
                         pdfbox:gregToISO#1),
-   "labels":      pdfbox:labels-as-strings#1
+
+   "labels":      pdfbox:labels-as-string#1
 };
 (:~ Defined property names, sorted :)
@@ -870,14 +873,16 @@ as xs:string*
   =>PDPageLabels:getLabelsByPageIndices()
 };
-(:~ sequence of label ranges defined in PDF as formatted strings :)
-declare function pdfbox:labels-as-strings($pdf as item())
+(:~ sequence of label ranges defined in PDF as formatted strings
+@return a custom representation of the labels e.g "0-*Cover,1r,11D"
+:)
+declare function pdfbox:labels-as-string($pdf as item())
 as xs:string{
   let $pagelabels:=PDDocument:getDocumentCatalog($pdf)
                    =>PDDocumentCatalog:getPageLabels()
   return $pagelabels
          !(0 to pdfbox:number-of-pages($pdf)-1)
-         !pdfbox:label-as-string($pagelabels,.)=>string-join(",")
+         !pdfbox:label-as-string($pagelabels,.)=>string-join("
")
 };
@@ -988,4 +993,4 @@ declare %private function pdfbox:do-until(
 };
        \ No newline at end of file +   on Wednesday, 4th June 2025

        \ No newline at end of file diff --git a/docs/xqdoc/modules/F000001/xqdoc.xml b/docs/xqdoc/modules/F000001/xqdoc.xml index ecff44c..1e807fb 100644 --- a/docs/xqdoc/modules/F000001/xqdoc.xml +++ b/docs/xqdoc/modules/F000001/xqdoc.xml @@ -1,4 +1,4 @@ -2025-06-04T10:09:22.636+01:001.1org.expkg_zone58.Pdfbox3pdfbox +2025-06-04T16:17:13.527+01:001.1org.expkg_zone58.Pdfbox3pdfbox A BaseX 10.7+ interface to pdfbox3 https://pdfbox.apache.org/ , requires pdfbox jars on classpath, in lib/custom or xar @@ -182,7 +182,8 @@ declare %private variable $pdfbox:property-map:=map{ "modificationDate": (PDDocument:getDocumentInformation#1, PDDocumentInformation:getModificationDate#1, pdfbox:gregToISO#1), - "labels": pdfbox:labels-as-strings#1 + + "labels": pdfbox:labels-as-string#1 }; (:~ Defined property names, sorted :) @@ -406,14 +407,16 @@ as xs:string* =>PDPageLabels:getLabelsByPageIndices() }; -(:~ sequence of label ranges defined in PDF as formatted strings :) -declare function pdfbox:labels-as-strings($pdf as item()) +(:~ sequence of label ranges defined in PDF as formatted strings +@return a custom representation of the labels e.g "0-*Cover,1r,11D" +:) +declare function pdfbox:labels-as-string($pdf as item()) as xs:string{ let $pagelabels:=PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() return $pagelabels !(0 to pdfbox:number-of-pages($pdf)-1) - !pdfbox:label-as-string($pagelabels,.)=>string-join(",") + !pdfbox:label-as-string($pagelabels,.)=>string-join("
") }; @@ -526,7 +529,7 @@ declare %private function pdfbox:do-until( Defines a map from property names to evaluation method. Keys are property names, values are sequences of functions to get property value starting from a $pdf object. -org.expkg_zone58.Pdfbox3number-of-pagesorg.expkg_zone58.Pdfbox3number-of-bookmarksorg.expkg_zone58.Pdfbox3number-of-labelsorg.expkg_zone58.Pdfbox3specificationjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetTitlejava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetAuthorjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetCreatorjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetProducerjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetSubjectjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetKeywordsjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetCreationDateorg.expkg_zone58.Pdfbox3gregToISOjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetModificationDateorg.expkg_zone58.Pdfbox3gregToISOorg.expkg_zone58.Pdfbox3labels-as-stringsvariable $pdfbox:property-map:=map{ 
+org.expkg_zone58.Pdfbox3number-of-pagesorg.expkg_zone58.Pdfbox3number-of-bookmarksorg.expkg_zone58.Pdfbox3number-of-labelsorg.expkg_zone58.Pdfbox3specificationjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetTitlejava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetAuthorjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetCreatorjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetProducerjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetSubjectjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetKeywordsjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetCreationDateorg.expkg_zone58.Pdfbox3gregToISOjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentInformationjava:org.apache.pdfbox.pdmodel.PDDocumentInformationgetModificationDateorg.expkg_zone58.Pdfbox3gregToISOorg.expkg_zone58.Pdfbox3labels-as-stringvariable $pdfbox:property-map:=map{ "#pages": pdfbox:number-of-pages#1, "#bookmarks": pdfbox:number-of-bookmarks#1, @@ -560,7 +563,8 @@ values are sequences of functions to get property value starting from a $pdf obj "modificationDate": (PDDocument:getDocumentInformation#1, PDDocumentInformation:getModificationDate#1, pdfbox:gregToISO#1), - "labels": pdfbox:labels-as-strings#1 + + "labels": pdfbox:labels-as-string#1 } "With-document" pattern: open pdf,apply $fn function, close pdf creates a local pdfobject and ensures it is closed after use @@ -814,13 +818,14 @@ as xs:string* =>PDDocumentCatalog:getPageLabels() =>PDPageLabels:getLabelsByPageIndices() } -sequence of label ranges defined in PDF as formatted 
stringspdfbox:labels-as-stringsfunction pdfbox:labels-as-strings ( $pdf as item() ) as xs:string { let $pagelabels:=PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() return $pagelabels !(0 to pdfbox:number-of-pages($pdf)-1) !pdfbox:label-as-string($pagelabels,.)=>string-join(",") }pdfitem()xs:stringjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentCatalogorg.expkg_zone58.Pdfbox3number-of-pagesorg.expkg_zone58.Pdfbox3label-as-stringfunction pdfbox:labels-as-strings($pdf as item()) +sequence of label ranges defined in PDF as formatted strings +a custom representation of the labels e.g "0-*Cover,1r,11D"pdfbox:labels-as-stringfunction pdfbox:labels-as-string ( $pdf as item() ) as xs:string { let $pagelabels:=PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() return $pagelabels !(0 to pdfbox:number-of-pages($pdf)-1) !pdfbox:label-as-string($pagelabels,.)=>string-join("
") }pdfitem()xs:stringjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentCatalogorg.expkg_zone58.Pdfbox3number-of-pagesorg.expkg_zone58.Pdfbox3label-as-stringfunction pdfbox:labels-as-string($pdf as item()) as xs:string{ let $pagelabels:=PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() return $pagelabels !(0 to pdfbox:number-of-pages($pdf)-1) - !pdfbox:label-as-string($pagelabels,.)=>string-join(",") + !pdfbox:label-as-string($pagelabels,.)=>string-join("
") } get pagelabels existpdfbox:page-labelsfunction pdfbox:page-labels ( $pdf ) { PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() }pdfjava:org.apache.pdfbox.pdmodel.PDDocumentgetDocumentCatalogfunction pdfbox:page-labels($pdf) diff --git a/docs/xqdoc/modules/F000001/xqparse.xml b/docs/xqdoc/modules/F000001/xqparse.xml index 890a7c5..c1564e9 100644 --- a/docs/xqdoc/modules/F000001/xqparse.xml +++ b/docs/xqdoc/modules/F000001/xqparse.xml @@ -177,7 +177,8 @@ options.format="bmp jpg png gif" etc, options.scale= 1 is 72 dpi?? :) "modificationDate": (PDDocument:getDocumentInformation#1, PDDocumentInformation:getModificationDate#1, pdfbox:gregToISO#1), - "labels": pdfbox:labels-as-strings#1 + + "labels": pdfbox:labels-as-string#1 }; (:~ Defined property names, sorted :) @@ -401,14 +402,16 @@ The returned sequence will contain at MOST as much entries as the document has p =>PDPageLabels:getLabelsByPageIndices() }; -(:~ sequence of label ranges defined in PDF as formatted strings :) -declare function pdfbox:labels-as-strings($pdf as item()) +(:~ sequence of label ranges defined in PDF as formatted strings +@return a custom representation of the labels e.g "0-*Cover,1r,11D" +:) +declare function pdfbox:labels-as-string($pdf as item()) as xs:string{ let $pagelabels:=PDDocument:getDocumentCatalog($pdf) =>PDDocumentCatalog:getPageLabels() return $pagelabels !(0 to pdfbox:number-of-pages($pdf)-1) - !pdfbox:label-as-string($pagelabels,.)=>string-join(",") + !pdfbox:label-as-string($pagelabels,.)=>string-join("
") }; diff --git a/docs/xqdoc/restxq.html b/docs/xqdoc/restxq.html index 813eace..0584079 100644 --- a/docs/xqdoc/restxq.html +++ b/docs/xqdoc/restxq.html @@ -7,4 +7,4 @@ Contents
        1. 1 Summary
        2. 2 Rest Paths

        Summary

        No RESTXQ usage

        Related documents
        ViewDescriptionFormat
        reportIndex of sourcesxhtml
        importsSummary of import usagexhtml
        imports-diagProject wide module imports as html mermaid class diagramhtml5
        imports-diag.mmdProject wide module imports as a mermaid class diagramtext
        annotationsSummary of XQuery annotation usexhtml
        xqdoca.xmlxqDocA run configuration report (XML)xml
        xqdoc-validatevalidate generated xqdoc filesxml

        Rest interface paths

        \ No newline at end of file +   on Wednesday, 4th June 2025

\ No newline at end of file
diff --git a/docs/xqdoc/validation-report.xml b/docs/xqdoc/validation-report.xml
index 5e59a2d..4ee2c65 100644
--- a/docs/xqdoc/validation-report.xml
+++ b/docs/xqdoc/validation-report.xml
@@ -1 +1 @@
-valid
\ No newline at end of file
+valid
\ No newline at end of file
diff --git a/docs/xqdoc/xqdoca.xml b/docs/xqdoc/xqdoca.xml
index 71c6a98..1bacc45 100644
--- a/docs/xqdoc/xqdoca.xml
+++ b/docs/xqdoc/xqdoca.xml
@@ -1,4 +1,4 @@
-0.9.1docs/xqdoc/
+0.9.1docs/xqdoc/
 report
 restxq
 imports
diff --git a/package.json b/package.json
index 586c591..ccaffd3 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "pdfbox",
-  "version": "0.3.8",
+  "version": "0.4.0",
   "description": "A BaseX interface to Apache Pdfbox version 3",
   "main": "src/Pdfbox3.xqm",
   "homepage": "https://github.com/expkg-zone58/pdfbox#readme",
diff --git a/samples.pdf/readme.md b/samples.pdf/readme.md
index 74c37a2..1cc64e0 100644
--- a/samples.pdf/readme.md
+++ b/samples.pdf/readme.md
@@ -3,11 +3,11 @@
 ## Sources
 | Name | bookmarks | labels | password |source |
 |------|-----------|--------|----------|---|
-|[BaseX100.pdf](BaseX100.pdf)||☑||https://files.basex.org/releases/10.0/BaseX100.pdf|
-|[icelandic-dictionary.pdf](icelandic-dictionary.pdf)|☑|| |http://css4.pub/2015/icelandic/dictionary.pdf|
-|[page-numbers.pdf](page-numbers.pdf)||☑||https://www.w3.org/WAI/WCAG22/working-examples/pdf-page-numbers/page-numbers|
-|[page-numbers-password.pdf](page-numbers-password.pdf)||☑|☑(password)|https://www.w3.org/WAI/WCAG22/working-examples/pdf-page-numbers/page-numbers|
-|[Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans](Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans-Final-Report-November-2021.pdf)|☑|||https://www.lse.ac.uk/News/News-Assets/PDFs/2021/Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans-Final-Report-November-2021.pdf|
-|[Legal RAG Hallucinations](Legal_RAG_Hallucinations.pdf)|☑|||https://law.stanford.edu/wp-content/uploads/2024/05/Legal_RAG_Hallucinations.pdf|
+|[BaseX100.pdf](BaseX100.pdf)||✅||https://files.basex.org/releases/10.0/BaseX100.pdf|
+|[icelandic-dictionary.pdf](icelandic-dictionary.pdf)|✅|| |http://css4.pub/2015/icelandic/dictionary.pdf|
+|[page-numbers.pdf](page-numbers.pdf)||✅||https://www.w3.org/WAI/WCAG22/working-examples/pdf-page-numbers/page-numbers|
+|[page-numbers-password.pdf](page-numbers-password.pdf)||✅|✅(password)|https://www.w3.org/WAI/WCAG22/working-examples/pdf-page-numbers/page-numbers|
+|[Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans](Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans-Final-Report-November-2021.pdf)|✅|||https://www.lse.ac.uk/News/News-Assets/PDFs/2021/Sentience-in-Cephalopod-Molluscs-and-Decapod-Crustaceans-Final-Report-November-2021.pdf|
+|[Legal RAG Hallucinations](Legal_RAG_Hallucinations.pdf)|✅|||https://law.stanford.edu/wp-content/uploads/2024/05/Legal_RAG_Hallucinations.pdf|