textproc/py-pdf
Pure-python PDF library
Branch: CURRENT
Version: 5.6.0
Package name: py312-pdf-5.6.0
Maintainer: pkgsrc-users
pypdf is a free and open-source pure-python PDF library capable of
splitting, merging, cropping, and transforming the pages of PDF
files. It can also add custom data, viewing options, and passwords
to PDF files. pypdf can retrieve text and metadata from PDFs as
well.
Filesize: 4906.005 KB
Version history:
- (2025-06-10) Updated to version: py312-pdf-5.6.0
- (2025-05-18) Updated to version: py312-pdf-5.5.0
- (2025-04-23) Updated to version: py312-pdf-5.4.0
- (2025-03-03) Updated to version: py312-pdf-5.3.1
- (2025-02-23) Updated to version: py312-pdf-5.3.0nb1
- (2025-02-12) Updated to version: py312-pdf-5.3.0
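
The package names in this list combine a Python-version prefix (py312), the package base name, the upstream version, and an optional nbN suffix that pkgsrc appends when PKGREVISION is bumped. The sketch below shows one way to split and order such names. The PkgName type, the parse_pkgname helper, and the regular expression are hypothetical, and the ordering is a simplification rather than pkgsrc's own version-comparison logic.

```python
import re
from typing import NamedTuple


class PkgName(NamedTuple):
    base: str                 # e.g. "py312-pdf"
    version: tuple[int, ...]  # e.g. (5, 3, 0)
    revision: int             # PKGREVISION from the "nbN" suffix, 0 if absent


# Assumption: purely numeric dotted versions, optionally followed by "nbN".
_PKG_RE = re.compile(r"^(?P<base>.+)-(?P<version>\d+(?:\.\d+)*)(?:nb(?P<rev>\d+))?$")


def parse_pkgname(name: str) -> PkgName:
    """Split a package name such as 'py312-pdf-5.3.0nb1' into its parts."""
    match = _PKG_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized package name: {name!r}")
    version = tuple(int(part) for part in match["version"].split("."))
    return PkgName(match["base"], version, int(match["rev"] or 0))


names = [
    "py312-pdf-5.3.0",
    "py312-pdf-5.6.0",
    "py312-pdf-5.3.0nb1",
    "py312-pdf-5.3.1",
    "py312-pdf-5.5.0",
    "py312-pdf-5.4.0",
]

# Order by upstream version first, then by package revision.
ordered = sorted(names, key=lambda n: parse_pkgname(n)[1:])
newest = ordered[-1]
```

Sorting on (version, revision) places 5.3.0nb1 after 5.3.0 and before 5.3.1, matching the order of the history above.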
CVS history:
2025-06-10 07:40:40 by Thomas Klausner | Files touched by this commit (2)
Log message:
py-pdf: update to 5.6.0.
New Features (ENH)
Add basic support for JBIG2 by using jbig2dec (#3163) by @stefan6419846
Bug Fixes (BUG)
Fix crashes by removing unnecessary line (#3293) by @larsga
Add delimiters to NameObject.renumber_table (#3286) by @ztravis
Robustness (ROB)
Handle DecodeParms being a NullObject (#3285) by @stefan6419846
Code Style (STY)
Update to mypy 1.16.0 (#3300) by @stefan6419846
|
2025-05-18 12:34:25 by Thomas Klausner | Files touched by this commit (2)
Log message:
py-pdf: update to 5.5.0.
New Features (ENH)
Add support for IndirectObject.iter (#3228) by @bryan-brancotte
Allow filtering by font when removing text (#3216) by @samuelbradshaw
Bug Fixes (BUG)
Add missing named destinations being ByteStringObjects (#3282) by @stefan6419846
Get font information more reliably when removing text (#3252) by @samuelbradshaw
T* 2D Translation consistent with PDF 1.7 Spec (#3250) by @hackowitz-af
Add font stack to q/Q operations in layout mode (#3225) by @hackowitz-af
Avoid completely hiding image loading issues like exceeding image size limits (#3221) by @stefan6419846
Using compress_identical_objects on transformed content duplicates differing content (#3197) by @danio
Consider BlackIs1 parameter for CCITTFaxDecode filter (#3196) by @stefan6419846
Robustness (ROB)
Deal with insufficient cm matrix during text extraction (#3283) by @stefan6419846
Allow merging when annotations miss D entry (#3281) by @stefan6419846
Fix merging documents if there are no Dests (#3280) by @stefan6419846
Fix crash on malformed action in outline (#3278) by @larsga
Fix compression issues for removed images which might be None (#3246) by @stefan6419846
Attempt to deal with non-rectangular FlateDecode streams (#3245) by @stefan6419846
Handle some None values for broken PDF files (#3230) by @stefan6419846
Developer Experience (DEV)
Multiple style improvements by @j-t-1
Update ruff to 0.11.0 by @stefan6419846
Maintenance (MAINT)
Conform ASCIIHexDecode implementation to specification (#3274) by @j-t-1
Modify comments of filters that do not use decode_parms (#3260) by @j-t-1
Code Style (STY)
Simplify warnings & debugging in layout mode text extraction (#3271) by @hackowitz-af
Standardize mypy assert statements (#3276) by @j-t-1
|
2025-05-04 03:06:27 by Nia Alarie | Files touched by this commit (1)
Log message:
py-pdf: Support non-Rust version of py-cryptography
Discovered through a failure on OpenBSD in drecklypkg CI: this package
should be buildable on platforms for which there is no Rust
bootstrap.
|
2025-04-20 23:11:53 by Thomas Klausner | Files touched by this commit (2)
Log message:
py-pdf: update to 5.4.0.
New Features (ENH)
Add support for IndirectObject.__contains__ (#3155) by @noamkush
Bug Fixes (BUG)
Fix detection of inline images followed by names or numbers (#3173) by @stefan6419846
Robustness (ROB)
Consider root objects without catalog type as fallback (#3175) by @stefan6419846
Raise proper error on infinite loop when reading objects (#3169) by @stefan6419846
Documentation (DOC)
Mention memory consumption of text extraction (#3168) by @stefan6419846
Developer Experience (DEV)
Upgrade to ruff 0.10.0 (#3191) by @stefan6419846
|
2025-03-03 14:06:43 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
py-pdf: updated to 5.3.1
5.3.1
Bug Fixes (BUG)
Use the correct name StandardEncoding for the predefined cmap
Handle inline images containing EI sequences
Fix check box value which should be name object
Fix stream position on inline image fallback extraction
Fix object count for incremental writer
Robustness (ROB)
Avoid index errors on empty lines in xref table
Improve handling of LZW decoder table overflow
Ignore non-numbers for width when building font width map
Avoid negative seek values when reading partially broken files
Documentation (DOC)
Fixed PageObject.images example usage for replacing image
|
2025-02-23 21:44:52 by Thomas Klausner | Files touched by this commit (2)
Log message:
py-pdf: adapt for flit_core 3.11.
Bump PKGREVISION.
|
2025-02-12 13:12:49 by Adam Ciarcinski | Files touched by this commit (3)
Log message:
py-pdf: updated to 5.3.0
Version 5.3.0, 2025-02-09
New Features (ENH)
- Handle attachments in /Kids and provide object-oriented API
Bug Fixes (BUG)
- Handle annotations being None on merging
Robustness (ROB)
- Prevent excessive layout mode text output from Type3 fonts
Documentation (DOC)
- stefan6419846 becomes BDFL of pypdf
- Tidy the visitor function description
Developer Experience (DEV)
- Remove ignoring multiple Ruff rules
- Remove unused mutmut configuration
Testing (TST)
- Fix warning assertions to use `pytest.warns()`
|
2025-01-27 15:00:11 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
py-pdf: updated to 5.2.0
Version 5.2.0, 2025-01-26
Deprecations (DEP)
- Deprecate with replacement CCITParameters
- Correct deprecation of interiour_color
New Features (ENH)
- Support alternative (U)F names for embedded file retrieval
- Adding support for reading .metadata.keywords
Bug Fixes (BUG)
- Handle further Tf operators in text extraction layout mode
- Ensure `add_metadata` can deal with `_info = None`
- Handle IndirectObject in CCITTFaxDecode filter
- Handle chained colorspace for inline images when no filter is set
- Avoid extracting inline images twice and dropping other operators
- Fixed reference of value with `str.__new__` in TextStringObject
- Handle indirect objects in font width calculations
- Title sometimes is bytes and not str
- Fix undefined variable for text extraction (regression)
- Don't close stream passed to PdfWriter.write()
Robustness (ROB)
- Handle zero height fonts when extracting text
- Deal with content streams not containing streams
- Gracefully handle some text operators when the operands are missing
- Fall back to non-Adobe Ascii85 format for missing end markers
- Ignore odd-length strings when processing cmap lines
- Skip annotation destination being NullObject in PdfWriter
- Skip destination page being None in PdfWriter
- Fix infinite loop case when reading null objects within an Array
- Fixing infinite loop in ArrayObject read_from_stream
Documentation (DOC)
- Add note about default line colors
Developer Experience (DEV)
- Remove ignoring Ruff rule PGH004
- Tidy ignore array in tool.ruff.lint
- Move Windows CI to Python 3.13
- Move to Ubuntu 22.04
Maintenance (MAINT)
- Fix formatting of warning message and include exception message
- Narrow return type for `ContentStream.operations`
Testing (TST)
- Fix image similarity for upcoming Ubuntu 24.04
- Replace broken Apache Tika Corpora urls
Code Style (STY)
- Add form feed to WHITESPACES
- Lots of small internal changes
|