# aquatic-crawler ![Build](https://github.com/YGGverse/aquatic-crawler/actions/workflows/build.yml/badge.svg) [![Dependencies](https://deps.rs/repo/github/YGGverse/aquatic-crawler/status.svg)](https://deps.rs/repo/github/YGGverse/aquatic-crawler) [![crates.io](https://img.shields.io/crates/v/aquatic-crawler.svg)](https://crates.io/crates/aquatic-crawler)

Crawler for the [Aquatic](https://github.com/greatest-ape/aquatic) BitTorrent tracker, based on the [librqbit](https://github.com/ikatson/rqbit/tree/main/crates/librqbit) API.

> [!NOTE]
> Project in development!

## Roadmap

> [!TIP]
> For details on all implemented features, see the [Options](#options) section.

* Info-hash versions
  * [x] 1
  * [ ] 2
* Import sources
  * [x] IPv4 / IPv6 info-hash JSON/API (requires [PR#233](https://github.com/greatest-ape/aquatic/pull/233))
    * [x] local file path (`--infohash-file`)
    * [ ] remote URL
* Export options
  * [x] file system (`--storage`)
    * [x] resolve an info-hash to its `.torrent` file (`--export-torrents`)
    * [x] download content files matching a regex pattern (`--preload-regex`)
  * [x] RSS feed (includes resolved torrent metadata and magnet links for download)
  * [ ] [Manticore](https://github.com/manticoresoftware/manticoresearch-rust) full-text search
  * [ ] SQLite

## Install

1. `git clone https://github.com/YGGverse/aquatic-crawler.git && cd aquatic-crawler`
2. `cargo build --release`
3.
   `sudo install target/release/aquatic-crawler /usr/local/bin/aquatic-crawler`

## Usage

``` bash
aquatic-crawler --infohash /path/to/info-hash-ipv4.json \
                --infohash /path/to/info-hash-ipv6.json \
                --infohash /path/to/another-source.json \
                --tracker udp://host1:port \
                --tracker udp://host2:port \
                --preload /path/to/directory \
                --enable-tcp
```

### Options

``` bash
-d, --debug
        Debug level
          * `e` - error
          * `i` - info
          * `t` - trace (run with `RUST_LOG=librqbit=trace`)
        [default: ei]
    --infohash
        Absolute path(s) or URL(s) to import info-hashes from the Aquatic tracker JSON/API
          * PR#233 feature
    --tracker
        Define custom tracker(s) to preload the `.torrent` file info
    --initial-peer
        Define initial peer(s) to preload the `.torrent` file info
    --export-torrents
        Save resolved torrent files to the given directory
    --export-rss
        File path to export the RSS feed to
    --export-rss-title
        Custom title for the RSS feed (channel)
        [default: aquatic-crawler]
    --export-rss-link
        Custom link for the RSS feed (channel)
    --export-rss-description
        Custom description for the RSS feed (channel)
    --enable-dht
        Enable DHT resolver
    --enable-tcp
        Enable TCP connection
    --enable-upnp-port-forwarding
        Enable UPnP
    --enable-upload
        Enable upload (share received bytes with the BitTorrent network)
    --preload
        Directory path to store preloaded data (e.g.
        `.torrent` files)
    --preload-clear
        Clear data collected in previous crawl sessions on start
    --preload-regex
        Preload only files matching the regex pattern
        (without this option, files are only listed, not preloaded)
          * see also `preload_max_filesize`, `preload_max_filecount` options
          * requires the `storage` argument

        Example: filter by image extension

            --preload-regex '(png|gif|jpeg|jpg|webp)$'

    --preload-total-size
        Stop the crawler when the given total size of preloaded files is reached
    --preload-max-filesize
        Max total size of preloaded files per torrent (matching `preload_regex`)
    --preload-max-filecount
        Max count of preloaded files per torrent (matching `preload_regex`)
    --proxy-url
        Use `socks5://[username:password@]host:port`
    --peer-connect-timeout
    --peer-read-write-timeout
    --peer-keep-alive-interval
    --index-capacity
        Estimated info-hash index capacity
        [default: 1000]
    --add-torrent-timeout
        Max time to handle each torrent
        [default: 10]
    --sleep
        Crawl loop delay in seconds
        [default: 300]
    --upload-limit
        Limit upload speed (b/s)
    --download-limit
        Limit download speed (b/s)
-h, --help
        Print help (see a summary with '-h')
-V, --version
        Print version
```
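## Example: full crawl session

As a sketch only, a fuller invocation combining the import, preload, and export options documented above might look like the following. Every path, tracker address, size limit, and feed title here is a placeholder for illustration, not a default:

``` bash
# Sketch: resolve .torrent files, preload only image files, publish an RSS feed.
# All paths and endpoints below are placeholders; adjust them to your setup.
aquatic-crawler --infohash /path/to/info-hash-ipv4.json \
                --tracker udp://tracker.example.org:2710 \
                --enable-dht \
                --enable-tcp \
                --preload /var/lib/aquatic-crawler \
                --preload-regex '(png|gif|jpeg|jpg|webp)$' \
                --preload-max-filesize 10000000 \
                --export-torrents /var/lib/aquatic-crawler/torrents \
                --export-rss /var/www/feed.xml \
                --export-rss-title 'Tracker feed' \
                --sleep 600
```

The crawler loops on its own (`--sleep` sets the delay between passes), so a single long-running invocation like this is enough; no external scheduler is required.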