2019-07-21 10:05:32 by Adam Ciarcinski | Files touched by this commit (3) |  |
Log message:
py-beautifulsoup4: updated to 4.8.0
4.8.0:
This release focuses on making it easier to customize Beautiful Soup's
input mechanism (the TreeBuilder) and output mechanism (the Formatter).
* You can customize the TreeBuilder object by passing keyword
arguments into the BeautifulSoup constructor. Those keyword
arguments will be passed along into the TreeBuilder constructor.
The main reason to do this right now is to change how which
attributes are treated as multi-valued attributes (the way 'class'
is treated by default). You can do this with the
'multi_valued_attributes' argument.
* The role of Formatter objects has been greatly expanded. The Formatter
class now controls the following:
- The function to call to perform entity substitution. (This was
previously Formatter's only job.)
- Which tags should be treated as containing CDATA and have their
contents exempt from entity substitution.
- The order in which a tag's attributes are output.
- Whether or not to put a '/' inside a void element, e.g. '<br/>' vs \
'<br>'
All preexisting code should work as before.
* Added a new method to the API, Tag.smooth(), which consolidates
multiple adjacent NavigableString elements.
* ' (which is valid in XML, XHTML, and HTML 5, but not HTML 4) is always
recognized as a named entity and converted to a single quote.
|
2019-01-08 10:30:44 by Adam Ciarcinski | Files touched by this commit (2) |  |
Log message:
py-beautifulsoup4: updated to 4.7.1
4.7.1:
* Fixed a significant performance problem introduced in 4.7.0.
* Fixed an incorrectly raised exception when inserting a tag before or
after an identical tag.
* Beautiful Soup will no longer try to keep track of namespaces that
are not defined with a prefix; this can confuse soupselect.
* Tried even harder to avoid the deprecation warning originally fixed in
4.6.1.
|
2019-01-02 11:36:08 by Adam Ciarcinski | Files touched by this commit (2) |  |
Log message:
py-beautifulsoup4: updated to 4.7.0
4.7.0:
* Beautiful Soup's CSS Selector implementation has been replaced by a
dependency on Isaac Muse's SoupSieve project (the soupsieve package
on PyPI). The good news is that SoupSieve has a much more robust and
complete implementation of CSS selectors, resolving a large number
of longstanding issues. The bad news is that from this point onward,
SoupSieve must be installed if you want to use the select() method.
You don't have to change anything lf you installed Beautiful Soup
through pip (SoupSieve will be automatically installed when you
upgrade Beautiful Soup) or if you don't use CSS selectors from
within Beautiful Soup.
SoupSieve documentation: https://facelessuser.github.io/soupsieve/
* Fix a number of problems with the tree builder that caused
trees that were superficially okay, but which fell apart when bits
were extracted.
* Fixed a problem with the tree builder in which elements that
contained no content (such as empty comments and all-whitespace
elements) were not being treated as part of the tree.
* Fixed a problem with multi-valued attributes where the value
contained whitespace.
* Clarified ambiguous license statements in the source code. Beautiful
Soup is released under the MIT license, and has been since 4.4.0.
* This file has been renamed from NEWS.txt to CHANGELOG.
|
2018-08-14 09:26:20 by Adam Ciarcinski | Files touched by this commit (2) |  |
Log message:
py-beautifulsoup4: updated to 4.6.3
4.6.3:
* Exactly the same as 4.6.2. Re-released to make the README file
render properly on PyPI.
4.6.2:
* Fix an exception when a custom formatter was asked to format a void
element
|
2018-08-02 17:31:03 by Adam Ciarcinski | Files touched by this commit (2) |  |
Log message:
py-beautifulsoup4: updated to 4.6.1
4.6.1:
* Stop data loss when encountering an empty numeric entity, and
possibly in other cases.
* Preserve XML namespaces introduced inside an XML document, not just
the ones introduced at the top level.
* Added a new formatter, "html5", which represents void elements
as "<element>" rather than "<element/>".
* Fixed a problem where the html.parser tree builder interpreted
a string like "&foo " as the character entity "&foo;"
* Correctly handle invalid HTML numeric character entities
which reference code points that are not Unicode code points. Note
that this is only fixed when Beautiful Soup is used with the
html.parser parser -- html5lib already worked and I couldn't fix it
with lxml.
* Improved the warning given when no parser is specified.
* When markup contains duplicate elements, a select() call that
includes multiple match clauses will match all relevant
elements.
* Fixed code that was causing deprecation warnings in recent Python 3
versions.
* Fixed a Windows crash in diagnose() when checking whether a long
markup string is a filename.
* Stopped HTMLParser from raising an exception in very rare cases of
bad markup.
* Fixed a bug where find_all() was not working when asked to find a
tag with a namespaced name in an XML document that was parsed as
HTML.
* You can get finer control over formatting by subclassing
bs4.element.Formatter and passing a Formatter instance into (e.g.)
encode().
* You can pass a dictionary of `attrs` into
BeautifulSoup.new_tag. This makes it possible to create a tag with
an attribute like 'name' that would otherwise be masked by another
argument of new_tag.
* Clarified the deprecation warning when accessing tag.fooTag, to cover
the possibility that you might really have been looking for a tag
called 'fooTag'.
|
2017-09-03 10:53:18 by Thomas Klausner | Files touched by this commit (165) |
Log message:
Follow some redirects.
|
2017-05-09 22:05:18 by Adam Ciarcinski | Files touched by this commit (3) |
Log message:
= 4.6.0 (20170507) =
* Added the `Tag.get_attribute_list` method, which acts like `Tag.get` for
getting the value of an attribute, but which always returns a list,
whether or not the attribute is a multi-value attribute.
* It's now possible to use a tag's namespace prefix when searching,
e.g. soup.find('namespace:tag')
* Improved the handling of empty-element tags like <br> when using the
html.parser parser.
* HTML parsers treat all HTML4 and HTML5 empty element tags (aka void
element tags) correctly.
* Namespace prefix is preserved when an XML tag is copied. Thanks
to Vikas for a patch and test.
|
2017-02-12 05:01:39 by Wen Heping | Files touched by this commit (3) |
Log message:
Update to 4.5.3
Upstream changes:
= 4.5.3 (20170102) =
* Fixed foster parenting when html5lib is the tree builder. Thanks to
Geoffrey Sneddon for a patch and test.
* Fixed yet another problem that caused the html5lib tree builder to
create a disconnected parse tree. [bug=1629825]
= 4.5.2 (20170102) =
* Apart from the version number, this release is identical to
4.5.3. Due to user error, it could not be completely uploaded to
PyPI. Use 4.5.3 instead.
|
2016-08-09 16:40:54 by Leonardo Taccari | Files touched by this commit (3) |
Log message:
Update www/py-beautifulsoup4 to 4.5.1
Changes:
= 4.5.1 (20160802) =
* Fixed a crash when passing Unicode markup that contained a
processing instruction into the lxml HTML parser on Python
3. [bug=1608048]
= 4.5.0 (20160719) =
* Beautiful Soup is no longer compatible with Python 2.6. This
actually happened a few releases ago, but it's now official.
* Beautiful Soup will now work with versions of html5lib greater than
0.99999999. [bug=1603299]
* If a search against each individual value of a multi-valued
attribute fails, the search will be run one final time against the
complete attribute value considered as a single string. That is, if
a tag has class="foo bar" and neither "foo" nor \
"bar" matches, but
"foo bar" does, the tag is now considered a match.
This happened in previous versions, but only when the value being
searched for was a string. Now it also works when that value is
a regular expression, a list of strings, etc. [bug=1476868]
* Fixed a bug that deranged the tree when a whitespace element was
reparented into a tag that contained an identical whitespace
element. [bug=1505351]
* Added support for CSS selector values that contain quoted spaces,
such as tag[style="display: foo"]. [bug=1540588]
* Corrected handling of XML processing instructions. [bug=1504393]
* Corrected an encoding error that happened when a BeautifulSoup
object was copied. [bug=1554439]
* The contents of <textarea> tags will no longer be modified when the
tree is prettified. [bug=1555829]
* When a BeautifulSoup object is pickled but its tree builder cannot
be pickled, its .builder attribute is set to None instead of being
destroyed. This avoids a performance problem once the object is
unpickled. [bug=1523629]
* Specify the file and line number when warning about a
BeautifulSoup object being instantiated without a parser being
specified. [bug=1574647]
* The `limit` argument to `select()` now works correctly, though it's
not implemented very efficiently. [bug=1520530]
* Fixed a Python 3 ByteWarning when a URL was passed in as though it
were markup. Thanks to James Salter for a patch and
test. [bug=1533762]
* We don't run the check for a filename passed in as markup if the
'filename' contains a less-than character; the less-than character
indicates it's most likely a very small document. [bug=1577864]
= 4.4.1 (20150928) =
* Fixed a bug that deranged the tree when part of it was
removed. Thanks to Eric Weiser for the patch and John Wiseman for a
test. [bug=1481520]
* Fixed a parse bug with the html5lib tree-builder. Thanks to Roel
Kramer for the patch. [bug=1483781]
* Improved the implementation of CSS selector grouping. Thanks to
Orangain for the patch. [bug=1484543]
* Fixed the test_detect_utf8 test so that it works when chardet is
installed. [bug=1471359]
* Corrected the output of Declaration objects. [bug=1477847]
= 4.4.0 (20150703) =
Especially important changes:
* Added a warning when you instantiate a BeautifulSoup object without
explicitly naming a parser. [bug=1398866]
* __repr__ now returns an ASCII bytestring in Python 2, and a Unicode
string in Python 3, instead of a UTF8-encoded bytestring in both
versions. In Python 3, __str__ now returns a Unicode string instead
of a bytestring. [bug=1420131]
* The `text` argument to the find_* methods is now called `string`,
which is more accurate. `text` still works, but `string` is the
argument described in the documentation. `text` may eventually
change its meaning, but not for a very long time. [bug=1366856]
* Changed the way soup objects work under copy.copy(). Copying a
NavigableString or a Tag will give you a new NavigableString that's
equal to the old one but not connected to the parse tree. Patch by
Martijn Peters. [bug=1307490]
* Started using a standard MIT license. [bug=1294662]
* Added a Chinese translation of the documentation by Delong .w.
New features:
* Introduced the select_one() method, which uses a CSS selector but
only returns the first match, instead of a list of
matches. [bug=1349367]
* You can now create a Tag object without specifying a
TreeBuilder. Patch by Martijn Pieters. [bug=1307471]
* You can now create a NavigableString or a subclass just by invoking
the constructor. [bug=1294315]
* Added an `exclude_encodings` argument to UnicodeDammit and to the
Beautiful Soup constructor, which lets you prohibit the detection of
an encoding that you know is wrong. [bug=1469408]
* The select() method now supports selector grouping. Patch by
Francisco Canas [bug=1191917]
Bug fixes:
* Fixed yet another problem that caused the html5lib tree builder to
create a disconnected parse tree. [bug=1237763]
* Force object_was_parsed() to keep the tree intact even when an element
from later in the document is moved into place. [bug=1430633]
* Fixed yet another bug that caused a disconnected tree when html5lib
copied an element from one part of the tree to another. [bug=1270611]
* Fixed a bug where Element.extract() could create an infinite loop in
the remaining tree.
* The select() method can now find tags whose names contain
dashes. Patch by Francisco Canas. [bug=1276211]
* The select() method can now find tags with attributes whose names
contain dashes. Patch by Marek Kapolka. [bug=1304007]
* Improved the lxml tree builder's handling of processing
instructions. [bug=1294645]
* Restored the helpful syntax error that happens when you try to
import the Python 2 edition of Beautiful Soup under Python
3. [bug=1213387]
* In Python 3.4 and above, set the new convert_charrefs argument to
the html.parser constructor to avoid a warning and future
failures. Patch by Stefano Revera. [bug=1375721]
* The warning when you pass in a filename or URL as markup will now be
displayed correctly even if the filename or URL is a Unicode
string. [bug=1268888]
* If the initial <html> tag contains a CDATA list attribute such as
'class', the html5lib tree builder will now turn its value into a
list, as it would with any other tag. [bug=1296481]
* Fixed an import error in Python 3.5 caused by the removal of the
HTMLParseError class. [bug=1420063]
* Improved docstring for encode_contents() and
decode_contents(). [bug=1441543]
* Fixed a crash in Unicode, Dammit's encoding detector when the name
of the encoding itself contained invalid bytes. [bug=1360913]
* Improved the exception raised when you call .unwrap() or
.replace_with() on an element that's not attached to a tree.
* Raise a NotImplementedError whenever an unsupported CSS pseudoclass
is used in select(). Previously some cases did not result in a
NotImplementedError.
* It's now possible to pickle a BeautifulSoup object no matter which
tree builder was used to create it. However, the only tree builder
that survives the pickling process is the HTMLParserTreeBuilder
('html.parser'). If you unpickle a BeautifulSoup object created with
some other tree builder, soup.builder will be None. [bug=1231545]
|
2015-11-04 03:47:43 by Alistair G. Crooks | Files touched by this commit (758) |
Log message:
Add SHA512 digests for distfiles for www category
Problems found locating distfiles:
Package haskell-cgi: missing distfile haskell-cgi-20001206.tar.gz
Package nginx: missing distfile array-var-nginx-module-0.04.tar.gz
Package nginx: missing distfile encrypted-session-nginx-module-0.04.tar.gz
Package nginx: missing distfile headers-more-nginx-module-0.261.tar.gz
Package nginx: missing distfile nginx_http_push_module-0.692.tar.gz
Package nginx: missing distfile set-misc-nginx-module-0.29.tar.gz
Package nginx-devel: missing distfile echo-nginx-module-0.58.tar.gz
Package nginx-devel: missing distfile form-input-nginx-module-0.11.tar.gz
Package nginx-devel: missing distfile lua-nginx-module-0.9.16.tar.gz
Package nginx-devel: missing distfile nginx_http_push_module-0.692.tar.gz
Package nginx-devel: missing distfile set-misc-nginx-module-0.29.tar.gz
Package php-owncloud: missing distfile owncloud-8.2.0.tar.bz2
Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden). All existing
SHA1 digests retained for now as an audit trail.
|