./devel/R-bit, Class for vectors of 1-bit booleans


Branch: CURRENT, Version: 4.0.4, Package name: R-bit-4.0.4, Maintainer: minskim

Bitmapped vectors of booleans (no NAs), with coercion from and to logicals,
integers and integer subscripts, plus fast boolean operators and fast
summary statistics. With 'bit' vectors you can store true binary
booleans {FALSE,TRUE} at the expense of 1 bit only; on a 32-bit
architecture this means a factor of 32 less RAM and roughly a factor of 32
more speed on boolean operations. Due to the overhead of R calls, the
actual speed gain depends on the size of the vector: expect gains for
vectors of more than 10000 elements. Even for one-time boolean operations
it can pay off to convert to bit; the pay-off is obvious when such
components are used more than once. Reading from and writing to bit is
approximately as fast as accessing standard logicals, mostly due to R's
time for memory allocation. The package allows working with pre-allocated
memory for return values by calling .Call() directly: when evaluating
the speed of C access with pre-allocated vector memory, copying from
bit to logical requires only 70% of the time for copying from logical
to logical, and copying from logical to bit comes at a performance
penalty of 150%. The package now contains further classes for
representing logical selections: 'bitwhich' for very skewed selections
and 'ri' for selecting ranges of values for chunked processing. All
three index classes can be used for subsetting 'ff' objects (ff-2.1-0
and higher).
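
The factor-of-32 claim above follows from packing 32 booleans into one
machine word, so that a single integer operation combines 32 elements at
once. The following is a minimal Python sketch of that idea only; it is not
the package's R or C code, and the names pack_bits, unpack_bits, and_bits
and count_true are hypothetical helpers invented for this illustration.

```python
WORD_BITS = 32  # assumed word size, matching the 32-bit example in the description


def pack_bits(values):
    """Pack booleans into a list of 32-bit integer words (hypothetical helper)."""
    values = list(values)
    words = [0] * ((len(values) + WORD_BITS - 1) // WORD_BITS)
    for i, flag in enumerate(values):
        if flag:
            words[i // WORD_BITS] |= 1 << (i % WORD_BITS)
    return words, len(values)


def unpack_bits(words, length):
    """Expand packed words back into a list of Python booleans."""
    return [bool((words[i // WORD_BITS] >> (i % WORD_BITS)) & 1) for i in range(length)]


def and_bits(a, b):
    """Element-wise AND: each integer operation covers up to 32 booleans."""
    return [x & y for x, y in zip(a, b)]


def count_true(words):
    """Summary statistic: number of TRUE elements, counted word by word."""
    return sum(bin(w).count("1") for w in words)


words, n = pack_bits([True, False, True, True])
# words == [13]: bits 0, 2 and 3 of the first word are set
```

The sketch keeps the vector length next to the words because the last word
is usually only partly used.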


Required to run:
[math/R]

Required to build:
[pkgtools/cwrappers]

CVS history:


   2021-10-26 12:20:11 by Nia Alarie | Files touched by this commit (3016)
Log message:
archivers: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Could not be committed due to merge conflict:
devel/py-traitlets/distinfo

The following distfiles were unfetchable (note: some may be only fetched
conditionally):

./devel/pvs/distinfo pvs-3.2-solaris.tgz
./devel/eclipse/distinfo eclipse-sourceBuild-srcIncluded-3.0.1.zip
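
Double-checking a distfile against two digests, as described in the log
message above, can be sketched in Python with the standard hashlib module.
This is only an illustration of computing BLAKE2s and SHA512 in one pass; it
does not reproduce the pkgsrc tooling or the distinfo file format, and the
function name file_digests is hypothetical.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return BLAKE2s and SHA512 hex digests of a file, read in chunks."""
    blake2s = hashlib.blake2s()
    sha512 = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size blocks so large distfiles need not fit in memory.
        for block in iter(lambda: handle.read(chunk_size), b""):
            blake2s.update(block)
            sha512.update(block)
    return {"BLAKE2s": blake2s.hexdigest(), "SHA512": sha512.hexdigest()}
```

Comparing both values against the recorded ones guards against a mistake in
either single digest.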
   2021-10-07 15:44:44 by Nia Alarie | Files touched by this commit (3017)
Log message:
devel: Remove SHA1 hashes for distfiles
   2021-06-07 01:44:37 by Makoto Fujiwara | Files touched by this commit (2)
Log message:
(devel/R-bit) Update 1.1.14 to 4.0.4

        CHANGES IN bit VERSION 4.0.4
USER VISIBLE CHANGES

    o copy() and reverse() have been renamed to
    copy_vector() and reverse_vector() to avoid
    naming conflict with data.table

        CHANGES IN bit VERSION 4.0.3
BUG FIXES
    o temporarily removed link to clone.ff
      to satisfy CRAN checks

        CHANGES IN bit VERSION 4.0.2
USER VISIBLE CHANGES

    o Vignettes no longer execute ff code
      for ff versions prior to 4.0.0

BUG FIXES
    o NA could crash bit_extract_unsorted
    o DESCRIPTION URL now points to GitHub

        CHANGES IN bit VERSION 4.0.1
USER VISIBLE CHANGES

    o bbatch now checks input N >= 0, B > 0
      and returns batchsize b in 1..N

BUG FIXES
    o NA could crash bit_extract_unsorted

        CHANGES IN bit VERSION 4.0.0

NEW FEATURES

    o new superclass ?booltype now allows proper method
      dispatch even for two user defined booleans, e.g. (bit | bitwhich)
    o new ordinal 'booltypes' nobool < logical < bit < bitwhich < which < ri
      and diagnostic functions booltype() and is.booltype()
    o bitwhich now has methods for [[ [ [[<- and [<-
    o new functions 'c', '==', '!=', '|', '&', 'xor' for .booltype
    o new function bitwhich_representation() to inspect the bitwhich
      representation without the cost of unclass()
    o new method 'is' for .which, .ri, .hi (and .booltype)
    o new coercion generic as.booltype with .default method
    o new coercion method as.logical.which
    o new generic as.ri with methods for .ri and .default (lossy)
    o new methods rep, rev, as.character and str for .bit and .bitwhich
    o new methods all, any, min, max, range, sum, summary for .booltype, .which
    o new method anyNA for all booltypes
    o new dummy method 'is.na' for .bit, .bitwhich
    o new function in.bitwhich much faster than %in%
    o new integer sorting function bitsort() using bit_sort() or bit_sort_unique()
      which can be by an order of magnitude faster than radix sorts
      or falling back to one of countsort(), quicksort2(), quicksort3()
    o new symmetric set function symdiff
    o new functions copy(), reverse() for copying and reversing integer vectors
    o new helper functions range_na(), range_nanozero(), range_sortna()
      join multiple tasks in one go
    o new fast unary functions for integers: bit_unique, bit_duplicated,
      bit_anyDuplicated, bit_sumDuplicated
    o new fast binary functions for integers: bit_in, bit_intersect, bit_union,
      bit_setequal, bit_symdiff, bit_setdiff, bit_rangediff
    o new fast unary functions for sorted integers: merge_rev,
      merge_unique, merge_duplicated, merge_anyDuplicated, merge_sumDuplicated,
      merge_first, merge_last
    o new fast binary functions for sorted integers:
      merge_firstin, merge_firstnotin, merge_lastin, merge_lastnotin,
      merge_match, merge_in, merge_notin,
      merge_union, merge_intersect, merge_setdiff, merge_symdiff,
      merge_setequal
    o new even faster binary functions when the first argument is a range
      of integers:
      merge_rangein, merge_rangenotin, merge_rangesect, merge_rangediff
    o new function firstNA substantially faster than which.max(is.na(x))
    o new function getsetattr() does setattr() but returns the old attr()
    o new function get_length() directly returns LENGTH(SEXP)
      circumventing all method dispatch for length()
    o new methods rlepack.integer, rleunpack.rlepack anyDuplicated.rlepack
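
The bitmap-based sorting mentioned in the list above (bitsort() using
bit_sort_unique()) rests on a simple idea: mark each value in a bitmap
indexed by value, then read the bitmap back in order. The Python sketch
below illustrates that idea under stated assumptions; it is not the
package's C implementation, the function name bitmap_sort_unique is
hypothetical, and a Python integer stands in for the bit vector.

```python
def bitmap_sort_unique(values):
    """Return the distinct integers in `values` in ascending order."""
    values = list(values)
    if not values:
        return []
    low, high = min(values), max(values)
    mask = 0  # bit i is set when the value low + i has been seen
    for v in values:
        mask |= 1 << (v - low)
    # Scanning the bitmap from the lowest bit yields sorted, unique values.
    return [low + i for i in range(high - low + 1) if (mask >> i) & 1]
```

The memory needed grows with the range of the values rather than their
count, so this approach suits dense integer data better than sparse data.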

USER VISIBLE CHANGES

    o license has been extended from GPL-2 to GPL-2 | GPL-3
    o S3methods are no longer exported in NAMESPACE
      (except for .booltype)
    o class bitwhich
    - now is a fully functional alternative to bit vectors
    - has argument order changed to (maxindex, x, poslength)
    - its internal representation of bitwhich(0) has been changed
      from FALSE to logical() and from unsorted to sorted integers
    o class 'which' now carries an attribute 'maxindex' if available
    o as.which() and bitwhich() now filter zeroes and store data unique(sort(x))
    o as.which() now has methods for .which, .logical, .integer and .numeric
      instead of .default.
    o bit() and bitwhich() now behave more like logical(), without
      arguments they return objects of length zero
    o as.bit, as.bitwhich and as.which now have methods for class NULL
      such that for example as.bit(c()) will return bit(0)
      (wish of Martijn Schuemle)
    o binary operators now allow for different lengths
      and recycle instead of throwing an error
    o xor.default now keeps the original definition of xor() and uses
      a new method xor.logical to speed-up logicals
    o the generics poslength and maxindex have been moved from package ff
      with methods now for .default, .logical, .bit, .bitwhich, .which, .ri
    o old method chunk.default has been renamed to chunks and now returns with names
      (for backward compatibility chunk() with named arguments behaves as before)
    o new method chunk.default calls chunks() along the length(x)
      using typeof(x) or vmode(x), this replaces chunk.bit from package ff
    o clone.default now uses R's C-function duplicate()
      and clone.list has been removed
    o intisasc() and intisdesc() have a new argument
      na.method=c("none","break","skip") to specify tie handling
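
The 'bitwhich' items above describe storing sorted, unique positions for
skewed selections. Why that is compact can be sketched in Python: list only
the positions of the rarer value and remember which value was listed. This
is an illustration only and does not reproduce the package's internal
representation; the names to_skewed and from_skewed and the explicit
stores_true flag are assumptions made for this sketch.

```python
def to_skewed(flags):
    """Return (maxindex, stores_true, positions) with 1-based sorted positions.

    positions lists the rarer value: TRUE elements when stores_true is True,
    FALSE elements otherwise.
    """
    n = len(flags)
    true_pos = [i + 1 for i, f in enumerate(flags) if f]
    false_pos = [i + 1 for i, f in enumerate(flags) if not f]
    if len(true_pos) <= len(false_pos):
        return n, True, true_pos
    return n, False, false_pos


def from_skewed(maxindex, stores_true, positions):
    """Expand the compact form back into a full list of booleans."""
    listed = set(positions)
    return [(i in listed) == stores_true for i in range(1, maxindex + 1)]
```

A vector with a handful of TRUE values among millions of FALSE values then
costs only a handful of integers plus its length.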

TESTING and DOCUMENTATION

    o there are many more regression tests now
    o testing uses package testthat
    o documentation uses package roxygen2 now
    o new vignettes bit-demo, bit-usage and bit-performance

BUG FIXES

    o assignment functions '[<-.bit' now behave like '[<-.logical' when it
      comes to NAs or ZEROs in subscripts
    o length<-.bit no longer tries to access memory before it is allocated
    o as.bit.bitwhich now handles non-positive bitwhich correctly
    o many functions/variables in bit.c are now declared static
      (thanks to Brian Ripley)
   2019-08-08 21:53:58 by Brook Milligan | Files touched by this commit (189) | Package updated
Log message:
Update all R packages to canonical form.

The canonical form [1] of an R package Makefile includes the
following:

- The first stanza includes R_PKGNAME, R_PKGVER, PKGREVISION (as
  needed), and CATEGORIES.

- HOMEPAGE is not present but defined in math/R/Makefile.extension to
  refer to the CRAN web page describing the package.  Other relevant
  web pages are often linked from there via the URL field.

This updates all current R packages to this form, which will make
regular updates _much_ easier, especially using pkgtools/R2pkg.

[1] http://mail-index.netbsd.org/tech-pkg/2019/08/02/msg021711.html
   2019-07-31 17:05:50 by Brook Milligan | Files touched by this commit (1) | Package updated
Log message:
R-bit: update to canonical form of an R package.
   2018-07-28 16:40:53 by Brook Milligan | Files touched by this commit (126)
Log message:
Remove MASTER_SITES= from individual R package Makefiles.

Each R package should include ../../math/R/Makefile.extension, which also
defines MASTER_SITES.  Consequently, it is redundant for the individual
packages to do the same.  Package-specific definitions also prevent
redefining MASTER_SITES in a single common place.
   2018-07-04 10:14:00 by Wen Heping | Files touched by this commit (2)
Log message:
Update to 1.1.14

Upstream changes:
CHANGES IN bit VERSION 1.1-14

BUG FIXES

    o bit[i] and bit[i]<-v now check for non-positive integers,
      which prevents a segfault on bit[NA] or bit[NA]<-v

        CHANGES IN bit VERSION 1.1-13

USER VISIBLE CHANGES

    o logical NA is now mapped to bit FALSE as in ff booleans
    o extractor function '[.bit' with positive numeric subscripts
      (integer, double, bitwhich) now behaves like '[.logical' and returns
      NA for out-of-bound requests and no element for 0
    o extractor function '[[.bit' with positive numeric (integer, double,
      bitwhich) subscripts now behaves like '[[.logical' and throws an error
      for out-of-bound requests
    o extractor function '[.bit' with range index subscripts (ri)
      now behaves like '[[.bit' and throws an error
      for out-of-bound requests
    o assignment functions '[<-.bit' and '[[<-.bit' with positive numeric
      (integer, double, bitwhich) subscripts now behave like '[<-.logical' and
      '[[<-.logical' and silently increase vector length if necessary
    o assignment function '[<-.bit' with range index subscripts (ri) now
      behaves like '[[<-.bit' and silently increases vector length if necessary
    o rlepack() is now a generic with a method for class 'integer'
    o rleunpack() is now a generic with a method for class 'rlepack'
    o unique.rlepack() now gives correct results for unordered sequences
    o anyDuplicated.rlepack() now returns the position of the first
      duplicate and gives correct results for unordered sequences

TUNING

    o The package can now be compiled with 64-bit words instead of 32-bit
      words; since we only measured a minor speedup, we left 32-bit as the
      default.

BUG FIXES

    o extractor and assignment functions now check for legal (positive)
      subscript bounds, hence illegally large subscripts or zero no longer
      cause memory violations
   2018-03-06 18:06:28 by Min Sik Kim | Files touched by this commit (3)
Log message:
devel/R-bit: Import version 1.1.12
