2021-10-07 17:09:00 by Nia Alarie | Files touched by this commit (1033) |
Log message:
www: Remove SHA1 hashes for distfiles
|
2020-10-03 20:11:59 by Adam Ciarcinski | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: updated to 4.9.3
4.9.3:
* Implemented a significant performance optimization to the process of
searching the parse tree.
|
2020-09-29 20:47:30 by Adam Ciarcinski | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: updated to 4.9.2
4.9.2
* Fixed a bug that caused too many tags to be popped from the tag
stack during tree building, when encountering a closing tag that had
no matching opening tag.
* Fixed a bug that inconsistently moved elements over when passing
a Tag, rather than a list, into Tag.extend().
* Specify the soupsieve dependency in a way that complies with
PEP 508. Patch by Mike Nerone.
* Change the signatures for BeautifulSoup.insert_before and insert_after
(which are not implemented) to match PageElement.insert_before and
insert_after, quieting warnings in some IDEs.
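A minimal sketch of the Tag.extend() case mentioned above, assuming bs4 4.9.2 or later and the built-in html.parser. The markup is made up for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<a><b>1</b><c>2</c></a><d></d>", "html.parser")

# Passing a Tag (rather than a list) to extend() moves all of that
# tag's children into the target tag.
soup.d.extend(soup.a)

moved = [child.name for child in soup.d.contents]
left_behind = list(soup.a.contents)
print(moved, left_behind)
```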
|
2020-05-27 15:00:40 by Adam Ciarcinski | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: updated to 4.9.1
4.9.1:
* Added a keyword argument 'on_duplicate_attribute' to the
BeautifulSoupHTMLParser constructor (used by the html.parser tree
builder) which lets you customize the handling of markup that
contains the same attribute more than once, as in:
<a href="url1" href="url2">
* Added a distinct subclass, GuessedAtParserWarning, for the warning
issued when BeautifulSoup is instantiated without a parser being
specified.
* Added a distinct subclass, MarkupResemblesLocatorWarning, for the
warning issued when BeautifulSoup is instantiated with 'markup' that
actually seems to be a URL or the path to a file on
disk.
* The new NavigableString subclasses (Stylesheet, Script, and
TemplateString) can now be imported directly from the bs4 package.
* If you encode a document with a Python-specific encoding like
'unicode_escape', that encoding is no longer mentioned in the final
XML or HTML document. Instead, encoding information is omitted or
left blank.
* Fixed test failures when run against soupsieve 2.0.
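A short sketch of 'on_duplicate_attribute' and the new warning subclass, assuming bs4 4.9.1 or later with html.parser. The callback name 'collect' is hypothetical. The keyword is passed through the BeautifulSoup constructor here, which forwards it to the tree builder.

```python
import warnings

from bs4 import BeautifulSoup, GuessedAtParserWarning

markup = '<a href="url1" href="url2">link</a>'

# Default behaviour: the last occurrence of the attribute wins.
last_wins = BeautifulSoup(markup, "html.parser").a["href"]

# "ignore" keeps the first occurrence and drops later ones.
first_wins = BeautifulSoup(
    markup, "html.parser", on_duplicate_attribute="ignore"
).a["href"]


def collect(attrs_so_far, key, value):
    # Hypothetical callback: gather every duplicate value into a list.
    if not isinstance(attrs_so_far[key], list):
        attrs_so_far[key] = [attrs_so_far[key]]
    attrs_so_far[key].append(value)


all_values = BeautifulSoup(
    markup, "html.parser", on_duplicate_attribute=collect
).a["href"]

# The distinct subclass makes it possible to silence only this warning.
warnings.filterwarnings("ignore", category=GuessedAtParserWarning)

print(last_wins, first_wins, all_values)
```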
|
2020-04-28 23:16:14 by David H. Gutteridge | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: update to 4.9.0
4.9.0 (20200405)
* Added PageElement.decomposed, a new property which lets you
check whether you've already called decompose() on a Tag or
NavigableString.
* Embedded CSS and JavaScript are now stored in distinct Stylesheet and
Script objects (NavigableString subclasses), which are ignored by methods
like get_text(). This feature is not supported by the html5lib
treebuilder. [bug=1868861]
* Added a Russian translation by 'authoress' to the repository.
* Fixed an unhandled exception when formatting a Tag that had been
decomposed. [bug=1857767]
* Fixed a bug that happened when passing a Unicode filename containing
non-ASCII characters as markup into Beautiful Soup, on a system that
allows Unicode filenames. [bug=1866717]
* Added a performance optimization to PageElement.extract(). Patch by
Arthur Darcet.
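A rough illustration of PageElement.decomposed and of Stylesheet/Script handling, assuming bs4 4.9.0 or later with html.parser (not html5lib, per the note above). The sample document is invented.

```python
from bs4 import BeautifulSoup
from bs4.element import Script, Stylesheet

markup = (
    "<html><head><style>p { color: red }</style>"
    "<script>var x = 1;</script></head>"
    "<body><p>Hello</p></body></html>"
)
soup = BeautifulSoup(markup, "html.parser")

# Embedded CSS and JavaScript get their own string classes...
style_is_stylesheet = isinstance(soup.style.string, Stylesheet)
script_is_script = isinstance(soup.script.string, Script)

# ...and are skipped when collecting the document's text.
visible_text = soup.get_text()

# decomposed reports whether decompose() has already been called.
paragraph = soup.p
before = paragraph.decomposed
paragraph.decompose()
after = paragraph.decomposed

print(repr(visible_text), before, after)
```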
|
2020-01-08 22:08:26 by Adam Ciarcinski | Files touched by this commit (3) | |
Log message:
py-beautifulsoup4: updated to 4.8.2
4.8.2:
* Added Python docstrings to all public methods of the most commonly
used classes.
* Added a Chinese translation by Deron Wang and a Brazilian Portuguese
translation by Cezar Peixeiro to the repository.
* Fixed two deprecation warnings.
* The html.parser tree builder now correctly handles DOCTYPEs that are
not uppercase.
* PageElement.select() now returns a ResultSet rather than a regular
list, making it consistent with methods like find_all().
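A small check of the select() return type, assuming bs4 4.8.2 or later with the soupsieve package installed. The markup is illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet

soup = BeautifulSoup("<ul><li>a</li><li>b</li></ul>", "html.parser")

selected = soup.select("li")   # CSS selector lookup
found = soup.find_all("li")    # classic lookup

# Both calls hand back a ResultSet, which still behaves like a list.
same_type = isinstance(selected, ResultSet) and isinstance(found, ResultSet)
texts = [li.get_text() for li in selected]

print(same_type, texts)
```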
|
2019-10-15 19:21:35 by Adam Ciarcinski | Files touched by this commit (3) | |
Log message:
py-beautifulsoup4: updated to 4.8.1
4.8.1:
* When the html.parser or html5lib parsers are in use, Beautiful Soup
will, by default, record the position in the original document where
each tag was encountered. This includes line number (Tag.sourceline)
and position within a line (Tag.sourcepos). Based on code by Chris
Mayo.
* When instantiating a BeautifulSoup object, it's now possible to
provide a dictionary ('element_classes') of the classes you'd like to be
instantiated instead of Tag, NavigableString, etc.
* Fixed the definition of the default XML namespace when using
lxml 4.4. Patch by Isaac Muse.
* Fixed a crash when pretty-printing tags that were not created
during initial parsing.
* Copying a Tag preserves information that was originally obtained from
the TreeBuilder used to build the original Tag.
* Raise an explanatory exception when the underlying parser
completely rejects the incoming markup.
* Avoid a crash when trying to detect the declared encoding of a
Unicode document.
* Avoid a crash when unpickling certain parse trees generated
using html5lib on Python 3.
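A sketch of the position tracking and the 'element_classes' dictionary described above, assuming bs4 4.8.1 or later with html.parser. TracedString is a hypothetical subclass made up for this example.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

markup = "<p>one</p>\n<p>two</p>"
soup = BeautifulSoup(markup, "html.parser")

# Each tag remembers the line and in-line position where it was found.
positions = [(p.sourceline, p.sourcepos) for p in soup.find_all("p")]


class TracedString(NavigableString):
    """Hypothetical NavigableString subclass, for illustration only."""


# element_classes maps a stock class to the class to instantiate instead.
custom = BeautifulSoup(
    markup, "html.parser", element_classes={NavigableString: TracedString}
)
string_type = type(custom.p.string)

print(positions, string_type.__name__)
```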
|
2019-07-21 10:05:32 by Adam Ciarcinski | Files touched by this commit (3) | |
Log message:
py-beautifulsoup4: updated to 4.8.0
4.8.0:
This release focuses on making it easier to customize Beautiful Soup's
input mechanism (the TreeBuilder) and output mechanism (the Formatter).
* You can customize the TreeBuilder object by passing keyword
arguments into the BeautifulSoup constructor. Those keyword
arguments will be passed along into the TreeBuilder constructor.
The main reason to do this right now is to change which
attributes are treated as multi-valued attributes (the way 'class'
is treated by default). You can do this with the
'multi_valued_attributes' argument.
* The role of Formatter objects has been greatly expanded. The Formatter
class now controls the following:
- The function to call to perform entity substitution. (This was
previously Formatter's only job.)
- Which tags should be treated as containing CDATA and have their
contents exempt from entity substitution.
- The order in which a tag's attributes are output.
- Whether or not to put a '/' inside a void element, e.g. '<br/>' vs. '<br>'
All preexisting code should work as before.
* Added a new method to the API, Tag.smooth(), which consolidates
multiple adjacent NavigableString elements.
* &apos; (which is valid in XML, XHTML, and HTML 5, but not HTML 4) is always
recognized as a named entity and converted to a single quote.
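A brief sketch touching three of the 4.8.0 items: 'multi_valued_attributes', the void-element slash controlled by the formatter, and Tag.smooth(). It assumes bs4 4.8.0 or later with html.parser, and uses the built-in "html5" formatter name rather than a custom Formatter subclass.

```python
from bs4 import BeautifulSoup

markup = '<p class="lead intro">A one</p><br>'

# By default 'class' is split into a list of values.
default = BeautifulSoup(markup, "html.parser")
class_default = default.p["class"]

# multi_valued_attributes=None turns that splitting off.
single = BeautifulSoup(markup, "html.parser", multi_valued_attributes=None)
class_single = single.p["class"]

# The formatter decides whether a void element is written with a slash.
slash_form = default.br.decode()
html5_form = default.br.decode(formatter="html5")

# smooth() merges adjacent NavigableString elements into one.
default.p.append(", a two")
pieces_before = len(default.p.contents)
default.smooth()
pieces_after = len(default.p.contents)

print(class_default, class_single, slash_form, html5_form)
```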
|
2019-01-08 10:30:44 by Adam Ciarcinski | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: updated to 4.7.1
4.7.1:
* Fixed a significant performance problem introduced in 4.7.0.
* Fixed an incorrectly raised exception when inserting a tag before or
after an identical tag.
* Beautiful Soup will no longer try to keep track of namespaces that
are not defined with a prefix; this can confuse soupsieve.
* Tried even harder to avoid the deprecation warning originally fixed in
4.6.1.
|
2019-01-02 11:36:08 by Adam Ciarcinski | Files touched by this commit (2) | |
Log message:
py-beautifulsoup4: updated to 4.7.0
4.7.0:
* Beautiful Soup's CSS Selector implementation has been replaced by a
dependency on Isaac Muse's SoupSieve project (the soupsieve package
on PyPI). The good news is that SoupSieve has a much more robust and
complete implementation of CSS selectors, resolving a large number
of longstanding issues. The bad news is that from this point onward,
SoupSieve must be installed if you want to use the select() method.
You don't have to change anything if you installed Beautiful Soup
through pip (SoupSieve will be automatically installed when you
upgrade Beautiful Soup) or if you don't use CSS selectors from
within Beautiful Soup.
SoupSieve documentation: https://facelessuser.github.io/soupsieve/
* Fixed a number of problems with the tree builder that produced
trees that were superficially okay but fell apart when bits
were extracted.
* Fixed a problem with the tree builder in which elements that
contained no content (such as empty comments and all-whitespace
elements) were not being treated as part of the tree.
* Fixed a problem with multi-valued attributes where the value
contained whitespace.
* Clarified ambiguous license statements in the source code. Beautiful
Soup is released under the MIT license, and has been since 4.4.0.
* This file has been renamed from NEWS.txt to CHANGELOG.
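A minimal example of select() and select_one(), which rely on soupsieve from 4.7.0 onward. It assumes both packages are installed; the markup and selectors are invented for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    '<div id="menu"><a href="/a">A</a><a href="/b" class="ext">B</a></div>',
    "html.parser",
)

# Child combinator: direct <a> children of the element with id "menu".
links = [a["href"] for a in soup.select("#menu > a")]

# Class selector and a structural pseudo-class.
external = soup.select_one("a.ext")
second = soup.select_one("a:nth-of-type(2)")

print(links, external["href"], second is external)
```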
|