Subject: CVS commit: pkgsrc/www/py-beautifulsoup
From: D'Arcy J.M. Cain
Date: 2008-09-05 17:46:51
Message id: 20080905154651.355B3175D0@cvs.netbsd.org
Log Message:
Add BeautifulSoup package.
Beautiful Soup is a Python HTML/XML parser designed for quick turnaround
projects like screen-scraping. Three features make it powerful:
1. Beautiful Soup won't choke if you give it bad markup. It yields a parse
tree that makes approximately as much sense as your original document. This
is usually good enough to collect the data you need and run away.
2. Beautiful Soup provides a few simple methods and Pythonic idioms for
navigating, searching, and modifying a parse tree: a toolkit for dissecting
a document and extracting what you need. You don't have to create a custom
parser for each application.
3. Beautiful Soup automatically converts incoming documents to Unicode and
outgoing documents to UTF-8. You don't have to think about encodings, unless
the document doesn't specify an encoding and Beautiful Soup can't autodetect
one. Then you just have to specify the original encoding.
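The three points above can be illustrated with a minimal sketch. It does not
use Beautiful Soup itself. It relies only on Python's standard html.parser
module as a stand-in, so the LinkCollector class, the to_unicode helper, its
fallback_encoding parameter, and the sample page are all hypothetical names
made up for this illustration. The package's own API differs and is richer.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values from <a> tags.

    html.parser does not raise on unclosed or mismatched tags, which
    stands in here for the "won't choke on bad markup" behaviour.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def to_unicode(raw, fallback_encoding="latin-1"):
    """Hypothetical helper: decode bytes as UTF-8, else use the
    encoding supplied by the caller (the "specify the original
    encoding" case)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback_encoding)


# Bad markup: unclosed <p>, <b>, and <a> tags, a stray </i>, and a
# Latin-1 encoded byte (0xE9) that is not valid UTF-8.
raw_page = b'<p>Caf\xe9 <b>menu</i> <a href="/lunch">lunch<a href="/dinner">dinner'

text = to_unicode(raw_page)          # incoming document -> Unicode
collector = LinkCollector()
collector.feed(text)                 # parse despite the broken markup
collector.close()

links = collector.links              # the data we came for
utf8_output = text.encode("utf-8")   # outgoing document -> UTF-8
```

With the sample page above, links holds the two href values in document
order, and utf8_output is the same markup re-encoded as UTF-8.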
Files: