NOTICE: This package has been removed from pkgsrc.
wip/hs-text: Efficient packed Unicode text type
Branch: CURRENT
Version: 1.1.1.2
Package name: hs-text-1.1.1.2
Maintainer: pho
An efficient packed, immutable Unicode text type (both strict and
lazy), with a powerful loop fusion optimization framework.
The Text type represents Unicode character strings, in a time and
space-efficient manner. This package provides text processing
capabilities that are optimized for performance critical use, both in
terms of large data quantities and high speed.
The Text type provides character-encoding, type-safe case conversion
via whole-string case conversion functions. It also provides a range
of functions for converting Text values to and from ByteStrings, using
several standard encodings.
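
As a rough illustration of the ByteString conversions described above, the following sketch round-trips a Text value through UTF-8 with `encodeUtf8` and `decodeUtf8'` from Data.Text.Encoding. The helper names `roundTrip` and `decodeBytes` and the sample strings are invented for this example and are not part of the package.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: decode UTF-8 bytes, turning a decoding error into
-- a String. decodeUtf8' returns Left on invalid input instead of throwing.
decodeBytes :: B.ByteString -> Either String T.Text
decodeBytes bytes = either (Left . show) Right (TE.decodeUtf8' bytes)

-- Hypothetical helper: encode a Text to UTF-8 bytes, then decode it again.
roundTrip :: T.Text -> Either String T.Text
roundTrip = decodeBytes . TE.encodeUtf8

main :: IO ()
main = do
  -- Ten characters; the two non-ASCII ones are written as numeric escapes.
  let sample = T.pack "na\239ve caf\233"
  print (T.length sample)                   -- 10 characters
  print (B.length (TE.encodeUtf8 sample))   -- 12 bytes: two characters take 2 bytes each
  print (roundTrip sample == Right sample)  -- True
  -- 0xFF never occurs in well-formed UTF-8, so decoding yields a Left.
  putStrLn (either (const "invalid UTF-8") T.unpack (decodeBytes (B.pack [0xFF, 0xFE])))
```

Using `decodeUtf8'` rather than `decodeUtf8` keeps the failure case in the type, which is usually preferable when the bytes come from an untrusted source.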
Efficient locale-sensitive text IO is also supported.
These modules are intended to be imported qualified, to avoid name
clashes with Prelude functions, e.g.
import qualified Data.Text as T
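
A minimal sketch of what the qualified import looks like in practice, assuming only functions exported by Data.Text and Data.Text.IO. The helper `shout` and the sample strings are hypothetical and exist only for this example.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical helper: trim surrounding whitespace and upper-case the rest.
-- T.toUpper works on the whole string because Unicode case conversion
-- can change the number of characters.
shout :: T.Text -> T.Text
shout = T.toUpper . T.strip

main :: IO ()
main = do
  let greeting = T.pack "  hello, pkgsrc  "
  TIO.putStrLn (shout greeting)
  -- T.length and the Prelude's length coexist thanks to the qualified import.
  print (T.length (shout greeting), length (T.unpack greeting))
  -- One character can upper-case to several: the German sharp s becomes "SS".
  print (T.length (T.toUpper (T.pack "stra\223e")))
```

Without the `qualified` keyword, names such as `length`, `words`, and `filter` would clash with their Prelude counterparts.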
To use an extended and very rich family of functions for working with
Unicode text (including normalization, regular expressions,
non-standard encodings, text breaking, and locales), see the text-icu
package: http://hackage.haskell.org/package/text-icu
Required to run: [lang/ghc7]
Master sites:
SHA1: 6f506f729b7171c37c97d5a12d784bdbd628a935
RMD160: dad79f8f057942e5942a1e23f9ec34104e2b0cba
Filesize: 134.796 KB
Version history:
- (2014-05-20) Package deleted from pkgsrc
- (2014-05-09) Updated to version: hs-text-1.1.1.2
- (2014-05-08) Updated to version: hs-text-1.1.1.1
- (2013-12-05) Updated to version: hs-text-1.0.0.0
- (2013-11-18) Package has been reborn
- (2013-11-18) Package deleted from pkgsrc
CVS history:
2014-05-18 23:33:25 by Ryosuke Moro | Files touched by this commit (119)
Log message:
Remove hs-data-default-class, hs-dlist, hs-text, hs-utf8-string,
imported to pkgsrc/devel.
|
2014-05-08 22:18:06 by Ryosuke Moro | Files touched by this commit (2)
Log message:
Update to 1.1.1.2
Changes from https://github.com/bos/text:
1.1.1.2
- updated upperBound to return upper instead of lower bound from size hint
|
2014-05-08 02:38:31 by Ryosuke Moro | Files touched by this commit (4)
Log message:
Update to 1.1.1.1
pkgsrc changes: static -> dynamic
changelog:
1.1.1.1
- changelog -> changelog.md
1.1.1.0
* The Data.Data instance now allows gunfold to work, via a virtual
pack constructor
* dropEnd, takeEnd: new functions
* Comparing the length of a Text against a number can now
short-circuit in more cases
1.1.0.1
* streamDecodeUtf8: fixed gh-70, did not return all unconsumed bytes
in single-byte chunks
1.1.0.0
* encodeUtf8: Performance is improved by up to 4x.
* encodeUtf8Builder, encodeUtf8BuilderEscaped: new functions,
available only if bytestring >= 0.10.4.0 is installed, that allow
very fast and flexible encoding of a Text value to a bytestring
Builder.
As an example of the performance gain to be had, the
encodeUtf8BuilderEscaped function helps to double the speed of JSON
encoding in the latest version of aeson! (Note: if all you need is a
plain ByteString, encodeUtf8 is still the faster way to go.)
* All of the internal module hierarchy is now publicly exposed. If a
module is in the .Internal hierarchy, or is documented as internal,
use at your own risk - there are no API stability guarantees for
internal modules!
1.0.0.1
* decodeUtf8: Fixed a regression that caused us to incorrectly
identify truncated UTF-8 as valid (gh-61)
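
To illustrate two of the changelog entries above, here is a hedged sketch that uses `takeEnd`/`dropEnd` (new in 1.1.1.0) and `encodeUtf8Builder` (new in 1.1.0.0, and per the changelog only available with bytestring >= 0.10.4.0). The helpers `splitSuffix` and `quoted` are hypothetical names made up for this example, and the code assumes a GHC recent enough for the Prelude to export `mconcat`.

```haskell
module Main (main) where

import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: split a Text into everything but the last n
-- characters, and the last n characters.
splitSuffix :: Int -> T.Text -> (T.Text, T.Text)
splitSuffix n t = (T.dropEnd n t, T.takeEnd n t)

-- Hypothetical helper: wrap a Text in double quotes as UTF-8 bytes,
-- assembling the output with a bytestring Builder. No escaping is done.
quoted :: T.Text -> BL.ByteString
quoted t =
  BB.toLazyByteString
    (mconcat [BB.char7 '"', TE.encodeUtf8Builder t, BB.char7 '"'])

main :: IO ()
main = do
  -- Separate the package name prefix from the 7-character version string.
  print (splitSuffix 7 (T.pack "hs-text-1.1.1.2"))
  -- Two quote bytes plus five UTF-8 bytes for the four-character word.
  print (BL.length (quoted (T.pack "caf\233")))
```

As the changelog itself notes, `encodeUtf8` remains the better choice when a plain ByteString is all that is needed; the Builder variant pays off when the encoded text is one piece of a larger output.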
|
2014-04-15 12:59:40 by Ryosuke Moro | Files touched by this commit (53)
Log message:
- ready for HASKELL_ENABLE_HADDOCK_DOCUMENTATION= yes
|
2013-12-05 04:05:29 by PHO / phonohawk | Files touched by this commit (3)
Log message:
Upstream update to text-1.0.0.0
|
2013-09-14 02:27:21 by Ryosuke Moro | Files touched by this commit (4)
Log message:
Update to 0.11.3.1
changes:
0.11.3.1
- Make Data.Text.Unsafe public, bump version
0.11.3.0
- Drop last vestige of restreamUtf8
- Add a copy function
This fixes https://github.com/bos/text/issues/48
- Drop restreamUtf8 function that is no longer used (https://github.com/bos/text/issues/44)
- Fix printing of hex Integers (https://github.com/bos/text/issues/47)
- Replace the few last uses of div with quot
- Undo an overflow bug I introduced with quotRem
- Handle Int8 overflow
- Compare Show instance performance
- Shave off another 6ns for negative integers with quotRem
- Replace uses of quot and rem with quotRem
Astonishingly (at least to me), this improves performance by almost 30% for
large integers.
- Backport integer builder benchmarks
- Switch to a faster decimal algorithm
This is about 25% faster than its predecessor for large numbers.
- Benchmark some bigger numbers
- Backed out changeset bb9a0e19421e, since it was slow
- A more straightforward (and slower) countDigits
This is a few percent slower than the tail-recursive version for numbers of
more than one digit.
- Replace countDigits with a faster, more complex version
This is taken from Andrei's "Three Optimization Tips for C++" post:
https://www.facebook.com/notes/facebook-engineering/three-optimization-tips-for-c/10151361643253920
It improves performance by up to 15%.
- Replace a use of div with quot
- Add LLVM support for benchmarks
- Update some comments and whitespace
- Cast to widest fixed integer to avoid truncation trouble
- Write straight into the dest buffer
- Float ensureFree way out
- Add a countDigits function
- Refactor Builder into Builder and Builder.Internal modules
rename : Data/Text/Lazy/Builder.hs => Data/Text/Lazy/Builder/Internal.hs
- Use unsafeDupablePerformIO where possible
unsafeDupablePerformIO is much faster than unsafePerformIO and can be
used safely as long as the underlying operation is pure and we're fine
risking duplicating it in a multi-core scenario. unsafeDupablePerformIO
helps performance a lot on short strings, where the overhead of
unsafePerformIO dominates.
- Add benchmarks for decodeUtf8'
Also make it possible to run the Pure benchmark with a very short input
string. This lets us test the constant overheads in functions, such as
the one added by unsafePerformIO in decodeUtf8.
- Document internal units and representation
- Try to sort out benchmark build with GHC 7.6
- Fix benchmarks with older bytestring
- Fix test build with older bytestring
- Ensure that an encoding error handler's result is safe
- Get in-place tests working "properly"
- Merge pull request #18 from hvr/pull-req-16
Add new `Data.Text.Encoding.decodeLatin1` ISO-8859-1 decoding function
- Merge pull request #36 from deian/master
Mark top-level modules Trustworthy
- Turn one error into a CAF
- Make streaming cons strict in its first argument
- Drop some more overhead from unstreamChunks
- First of many CAFs to be NOINLINEd :-(
- When unstreaming, we know the first chunk is not empty
- Lazy Text: reduce memory allocation during unstreaming
- A few simple bang patterns help performance a little
- Merge
- Optimize latin1-to-UTF16 C-implementation by using 32-bit loads
- Add `Data.Text.Lazy.Encoding.decodeLatin1` ISO-8859-1 decoding function
See https://github.com/bos/text/commit/7c06306bd5b7382cb101f8632b5a1fc50697fe94 for
more information
- Add new `Data.Text.Encoding.decodeLatin1` ISO-8859-1 decoding function
This has about an order of magnitude lower runtime and/or call-overhead as
compared to the more generic `text-icu` approach, e.g. according to criterion
with GHC 7.4.1 on Linux/x86_64:
* 12 times faster for empty input strings,
* 6 times faster for 16-byte strings, and
* 3 times faster for 1024-byte strings.
`decodeLatin1` is also faster compared to using `decodeUtf8` for plain ASCII:
* 2 times faster for 16-byte input strings,
* ~40% faster for 1024-byte strings.
- nits
- kill PatternSignatures warning
- Top-level interfaces are safe, marked trustworthy
- Fix documentation for hGetChunk
- Hoist out duplicated catchError definitions :-(
- Merge
- Redefine pack to fuse better
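
Two items from the log above lend themselves to a short sketch: the `decodeLatin1` function for ISO-8859-1 input, and the integer builders in Data.Text.Lazy.Builder.Int that the decimal and hexadecimal entries refer to. The helper `describe` and the sample values are hypothetical, the sketch does not reproduce any of the benchmarks mentioned, and it assumes a GHC recent enough for the Prelude to export `mconcat`.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TB
import qualified Data.Text.Lazy.Builder.Int as TBI

-- Hypothetical helper: render a number in decimal and hexadecimal.
-- The hexadecimal builder rejects negative input, so this sketch is
-- only meant for non-negative values.
describe :: Int -> TL.Text
describe n =
  TB.toLazyText
    (mconcat [TBI.decimal n, TB.fromString " = 0x", TBI.hexadecimal n])

main :: IO ()
main = do
  -- In ISO-8859-1 every byte is exactly one character, so decoding cannot fail.
  let latin1Bytes = B.pack [0x63, 0x61, 0x66, 0xE9]
  print (TE.decodeLatin1 latin1Bytes == T.pack "caf\233")
  -- For plain ASCII input, the Latin-1 and UTF-8 decoders agree.
  let ascii = B.pack [0x74, 0x65, 0x78, 0x74]
  print (TE.decodeLatin1 ascii == TE.decodeUtf8 ascii)
  putStrLn (TL.unpack (describe 255))
```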
|
2013-01-15 13:03:31 by PHO / phonohawk | Files touched by this commit (3)
Log message:
Upstream update to text-0.11.2.3
|
2012-03-04 06:01:50 by PHO / phonohawk | Files touched by this commit (2)
Log message:
Upstream update to text-0.11.1.13
|