www/seaward
Crawler which searches for links or a specified word on a website
Branch: CURRENT
Version: 1.0.3
Package name: seaward-1.0.3
Maintainer: pin

Seaward is a crawler used to discover every link on a web page and its linked
pages without duplicates, or to search for a word starting from the given URL.
To save the links to a file, run 'seaward <URL> --silent > file.txt'. If you
experience many timeout errors, try a higher timeout with '-t'.
With the '-d 0' option you crawl only the web page passed in the URL parameter;
with '-d 1' you also crawl the pages linked from it (always within the same
website), and so on.
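The depth option behaves like a breadth-first crawl with a depth cap. A minimal Python sketch of that idea follows; the `links_of` callback and the toy site graph are hypothetical stand-ins for fetching a page and extracting its links, not seaward's actual Rust implementation:

```python
from collections import deque

def crawl(start, links_of, max_depth):
    """Depth-limited BFS: depth 0 visits only `start`, depth 1 adds
    the pages it links to, and so on, never visiting a URL twice."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links past the requested depth
        for link in links_of(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

# Toy link graph standing in for a small website.
site = {"a": ["b", "c"], "b": ["c", "d"], "c": [], "d": []}
crawl("a", site.get, 0)  # ["a"]            (like '-d 0')
crawl("a", site.get, 1)  # ["a", "b", "c"]  (like '-d 1')
```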
Master sites:
Filesize: 429.992 KB
Version history:
- (2024-02-23) Updated to version: seaward-1.0.3
- (2024-02-09) Updated to version: seaward-1.0.2
- (2023-10-25) Updated to version: seaward-1.0.1nb1
- (2023-07-13) Package added to pkgsrc.se, version seaward-1.0.1 (created)
CVS history:
2024-02-23 18:25:27 by pin | Files touched by this commit (3)
Log message:
www/seaward: update to 1.0.3
- It is now possible to "strictly" crawl a URL; run --help for more
  information
- The code now handles more anomalies (less panics! yay)
- Link fragments (#) are now removed to avoid revisiting the same page with
different tags
- Printouts used for logging information now follow the format
  "[logging level] ..."
- The crate "ctrlc" has been replaced by tokio::signal
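The fragment-removal change above can be illustrated with Python's standard library. This mirrors the behaviour described in the log entry (dropping '#' fragments so one page is not revisited under different anchors); it is not seaward's actual Rust code:

```python
from urllib.parse import urldefrag

def canonical(url):
    # Strip the '#fragment' part so URLs that differ only by anchor
    # are treated as the same page during crawling.
    return urldefrag(url).url

canonical("https://example.org/page#section-2")
# → "https://example.org/page"
```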
2024-02-09 14:58:23 by pin | Files touched by this commit (3)
Log message:
www/seaward: update to 1.0.2
- The heuristic used to find the optimum timeout has been changed:
  instead of an average, the maximum time registered in the request samples
  plus a margin is now used.
- Request samples timings are shown during execution (disable them with --silent).
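The revised heuristic can be sketched in a few lines of Python. The margin value below is an assumption chosen for illustration; the log entry only says a margin is added to the maximum sampled time:

```python
def pick_timeout(samples_ms, margin_ms=250):
    # Maximum observed request time plus a safety margin, replacing
    # the old average-based estimate. margin_ms=250 is illustrative,
    # not the value seaward actually uses.
    return max(samples_ms) + margin_ms

pick_timeout([120, 340, 210])  # → 590
```

Using the maximum rather than the mean means a single slow sample raises the timeout enough to cover it, which should reduce spurious timeout errors on sites with uneven response times.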
2023-10-25 00:11:51 by Thomas Klausner | Files touched by this commit (2298)
Log message:
*: bump for openssl 3
2023-07-13 22:42:35 by pin | Files touched by this commit (5)
Log message:
www/seaward: import package
Seaward is a crawler used to discover every link on a web page and its linked
pages without duplicates or to search for a word starting from the given URL.
If you want to save the links inside a file, you can run
'seaward <URL> --silent > file.txt', and if you experience many timeout errors
try using a higher timeout with '-t'.
With the '-d 0' option you crawl only the web page passed in the URL parameter,
with '-d 1' also the pages linked to it (always within the same web site) and
so on.