./sysutils/agedu, Utility for tracking down wasted disk space

Branch: CURRENT, Version: 20160920.853cea9, Package name: agedu-20160920.853cea9, Maintainer: pkgsrc-users

Suppose you're running low on disk space. You need to free some up, by finding
something that's a waste of space and deleting it (or moving it to an archive
medium). How do you find the right stuff to delete, that saves you the maximum
space at the cost of minimum inconvenience?

Unix provides the standard du utility, which scans your disk and tells you which
directories contain the largest amounts of data. That can help you narrow your
search to the things most worth deleting.
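For instance, du can summarize each subdirectory and sort the results so the largest appear first (a sketch; the paths below are illustrative):

```shell
# Build a small example tree (names are illustrative).
mkdir -p /tmp/du-demo/big /tmp/du-demo/small
dd if=/dev/zero of=/tmp/du-demo/big/blob bs=1024 count=512 2>/dev/null
echo hello > /tmp/du-demo/small/note

# Summarize each subdirectory in kilobytes, largest first.
du -sk /tmp/du-demo/* | sort -rn
```

du -sk prints one total per argument; piping through sort -rn puts the biggest directories at the top of the list.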

However, that only tells you what's big. What you really want to know is what's
too big. By itself, du won't let you distinguish between data that's big because
you're doing something that needs it to be big, and data that's big because you
unpacked it once and forgot about it.

Most Unix file systems, in their default mode, helpfully record when a file was
last accessed. Not just when it was written or modified, but when it was even
read. So if you generated a large amount of data years ago, forgot to clean it
up, and have never used it since, then it ought in principle to be possible to
use those last-access time stamps to tell the difference between that and a
large amount of data you're still using regularly.
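You can inspect these last-access time stamps yourself: the -u flag to ls displays access time (atime) instead of modification time. (A sketch; note that file systems mounted with noatime or relatime may not update atime on every read.)

```shell
# Create a file, then read it so its access time updates.
touch /tmp/atime-demo
cat /tmp/atime-demo > /dev/null

# -u makes ls display (and sort by) last-access time instead of
# modification time.
ls -lu /tmp/atime-demo
```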

agedu does the same disk scan as du, but it also records the last-access times
of everything it scans. It then builds an index that lets it efficiently
generate reports summarizing the results for each subdirectory.
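A typical session might look like the following sketch, based on agedu's documented options (the paths and the 2y age threshold are examples; -f names the index file):

```shell
# Run agedu only if it is installed; otherwise say so.
if command -v agedu >/dev/null 2>&1; then
    # Scan a tree and write the index to /tmp/agedu.dat.
    agedu -s /tmp -f /tmp/agedu.dat
    # Text report of data last accessed two or more years ago.
    agedu -t /tmp -f /tmp/agedu.dat -a 2y
    # agedu -w -f /tmp/agedu.dat   # or serve an interactive web report
else
    echo "agedu not installed"
fi
```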


Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: 29577771d36f8cc4537f7564ac521f5dfc082b72
RMD160: 476d76503022b02578b6b152a033b3372a330187
Filesize: 172.62 KB

Version history:


CVS history:


   2016-12-15 11:11:39 by Amitai Schleier | Files touched by this commit (2) | Package updated
Log message:
Update to 20160920.853cea9. From the changelog:

- Revise versioning system to be date-based.
- Pedantic changes to capitalisation of byte/kilobyte/megabyte etc.
- Clarify the --cgi usage a bit, and give an example.
- Rearrange documentation of -S, -L and -D.

   2015-11-04 02:32:42 by Alistair G. Crooks | Files touched by this commit (499)
Log message:
Add SHA512 digests for distfiles for sysutils category

Problems found with existing digests:
	Package memconf distfile memconf-2.16/memconf.gz
	b6f4b736cac388dddc5070670351cf7262aba048 [recorded]
	95748686a5ad8144232f4d4abc9bf052721a196f [calculated]

Problems found locating distfiles:
	Package dc-tools: missing distfile dc-tools/abs0-dc-burn-netbsd-1.5-0-gae55ec9
	Package ipw-firmware: missing distfile ipw2100-fw-1.2.tgz
	Package iwi-firmware: missing distfile ipw2200-fw-2.3.tgz
	Package nvnet: missing distfile nvnet-netbsd-src-20050620.tgz
	Package syslog-ng: missing distfile syslog-ng-3.7.2.tar.gz

Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden).  All existing
SHA1 digests retained for now as an audit trail.

   2014-02-14 14:48:40 by Amitai Schlair | Files touched by this commit (2)
Log message:
Distfile changed in place, but seems stable now.

   2014-01-31 16:32:03 by Amitai Schlair | Files touched by this commit (2) | Package updated
Log message:
Update to r10126. From the svn log:

* Fix handling of IPv6 address literals.

With luck, this distfile won't change in place.

   2014-01-16 19:11:37 by Amitai Schlair | Files touched by this commit (2) | Package updated
Log message:
Update to r9723 (no changelog). Some highlights from the svn log:

* Add the --files option, to list individual files in the various
  reporting modes.
* Flexibly report sizes in Kb, Mb, Gb etc as appropriate. The
  previous fixed Mb was inconvenient at both ends. Original patch
  from James Beal, though I've polished it pretty much into
  unrecognisability.
* Make the existing -d (depth) option apply to the -H (static HTML
  report) mode, transforming its output from a single HTML file
  giving a report for one directory with no crosslinks to a collection
  of HTML files with crosslinks between them.
* Introduce a --cgi mode, to make it easy to plumb agedu's web
  reporting into an existing web server as an alternative to running
  a dedicated one of its own.
* Switch all the HTML-based reporting modes (the internal httpd, the CGI
  mode and the dump of static HTML files) to using URIs and filenames
  based on the text of the pathname being reported on, rather than
  its numeric index in the data file. The aim is that sub-URIs
  should remain valid when the data is updated - if, for instance,
  you're running the agedu CGI script permanently and changing the
  data file under it every so often.
* Suggestion from James Beal: support a '--title' option to override the
  'agedu:' prefix at the start of the title of output web pages.

   2013-04-07 22:49:45 by Blue Rats | Files touched by this commit (91)
Log message:
Edited DESCR in the case of:
 File too long (should be no more than 24 lines).
 Line too long (should be no more than 80 characters).
 Trailing empty lines.
 Trailing white-space.
Truncated the long files as best as possible while preserving the most info
contained in them.

   2012-10-23 21:51:39 by Aleksej Saushev | Files touched by this commit (447)
Log message:
Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.

   2009-06-07 15:48:21 by Thomas Klausner | Files touched by this commit (4) | Imported package
Log message:
Initial import of agedu-8590:

Suppose you're running low on disk space. You need to free some
up, by finding something that's a waste of space and deleting it
(or moving it to an archive medium). How do you find the right
stuff to delete, that saves you the maximum space at the cost of
minimum inconvenience?

Unix provides the standard du utility, which scans your disk and
tells you which directories contain the largest amounts of data.
That can help you narrow your search to the things most worth
deleting.

However, that only tells you what's big. What you really want to
know is what's too big. By itself, du won't let you distinguish
between data that's big because you're doing something that needs
it to be big, and data that's big because you unpacked it once and
forgot about it.

Most Unix file systems, in their default mode, helpfully record
when a file was last accessed. Not just when it was written or
modified, but when it was even read. So if you generated a large
amount of data years ago, forgot to clean it up, and have never
used it since, then it ought in principle to be possible to use
those last-access time stamps to tell the difference between that
and a large amount of data you're still using regularly.

agedu is a program which does this. It does basically the same sort
of disk scan as du, but it also records the last-access times of
everything it scans. Then it builds an index that lets it efficiently
generate reports giving a summary of the results for each subdirectory,
and then it produces those reports on demand.