yggverse | a20d466bd0 | fix resolved host request | 2024-04-07 20:14:45 +03:00
yggverse | bc9fc470e6 | make custom resolver optionally required to continue the crawl #15 | 2024-04-07 04:57:02 +03:00
yggverse | 1b8bcb084a | implement DNS resolver with memory cache feature #15 | 2024-04-07 03:54:55 +03:00
yggverse | 0559ae3a58 | fix undefined variable, minor optimization | 2024-04-06 02:48:25 +03:00
yggverse | f12c897d34 | update yggverse/net version api | 2024-04-06 02:18:52 +03:00
yggverse | f39868350c | skip gemini tags in body index | 2024-04-05 07:04:50 +03:00
yggverse | deeb186c27 | fix namespace | 2024-04-05 05:55:03 +03:00
yggverse | 512e859033 | remove chunk attribute, fix getCode object | 2024-04-05 05:23:07 +03:00
yggverse | dc60f0376f | use yo-tools-php library | 2024-04-03 18:12:32 +03:00
yggverse | 1f96ca8a2c | init gemini protocol implementation | 2024-04-03 17:07:28 +03:00
yggverse | 488e090f97 | compare compressed snap file size instead of document download size | 2024-04-01 15:55:00 +03:00
yggverse | f8c8aacf95 | fix size data type conversion | 2024-04-01 15:48:39 +03:00
yggverse | 231b7be50d | remove debug | 2024-03-28 22:01:34 +02:00
yggverse | 3fbde71313 | fix variable name | 2024-03-28 21:59:01 +02:00
yggverse | 22156d230d | fix snaps filename selection | 2024-03-28 21:42:37 +02:00
yggverse | c618714cd2 | fix regex rule | 2024-03-27 05:10:01 +02:00
yggverse | 21c6eb18dc | fix variable name | 2024-03-27 05:02:14 +02:00
yggverse | d07025e5ee | fix str_starts_with attribute | 2024-03-27 05:00:47 +02:00
yggverse | a3f2ab0aa2 | use document ID as the snap location | 2024-03-27 04:57:35 +02:00
yggverse | 27564c4fbc | add collision events debug | 2024-03-27 04:27:49 +02:00
yggverse | 5705f452cc | update config folding | 2024-03-24 18:26:59 +02:00
yggverse | d44bf90fe3 | stop crawler on network connection lost #11 | 2024-03-24 18:15:26 +02:00
yggverse | b475b4e61b | remove mime update on progress function execute #10 | 2024-03-23 16:21:57 +02:00
yggverse | 686479e7f1 | disable ranked pages index first | 2024-03-23 15:55:09 +02:00
yggverse | 0872e66e15 | remove global constant declaration | 2024-03-23 15:47:59 +02:00
yggverse | 7cf10079c6 | update mime on progress function event | 2024-03-23 03:31:27 +02:00
yggverse | 3a28bf5967 | reset index time | 2024-03-23 03:26:25 +02:00
yggverse | 722de9175a | reset index time | 2024-03-23 03:25:36 +02:00
yggverse | 62149220b9 | update http code even progress function fails | 2024-03-23 03:16:01 +02:00
yggverse | 34fe26fcf9 | disable document autodelete | 2024-03-23 03:15:01 +02:00
yggverse | c4df3f3237 | improve notice level debug | 2024-03-23 01:00:49 +02:00
yggverse | 3a9efeabc5 | add snaps update by timeout feature | 2024-03-23 00:47:08 +02:00
yggverse | ebeef559ba | rename index action dependencies | 2024-03-22 23:50:39 +02:00
yggverse | fef2b1abec | implement reindex by request feature | 2024-03-22 22:50:52 +02:00
yggverse | fae43d54e5 | enable xhtml parser | 2024-03-22 19:11:27 +02:00
yggverse | f2dbd1599c | fix tags replacement condition | 2024-03-22 03:02:57 +02:00
yggverse | 5e4494c9e8 | use PHP 8 str_starts_with function | 2024-03-21 18:47:11 +02:00
yggverse | 900e3a453f | Disable keywords collection from headers as body index enabled | 2024-03-21 03:46:58 +02:00
yggverse | 1f3ee435e9 | fix custom encoding conversion | 2024-03-21 03:38:46 +02:00
yggverse | e09440b44a | strip code content | 2024-03-21 00:38:24 +02:00
yggverse | b5cd219f47 | strip css content from index | 2024-03-21 00:34:25 +02:00
yggverse | 3884f375d4 | save document body text to index | 2024-03-20 19:31:56 +02:00
ghost | 1c2e8dafb2 | collect keywords from document headers | 2024-01-23 02:49:52 +02:00
ghost | cfbc84cbaf | sort queue by rank asc | 2024-01-23 02:19:35 +02:00
ghost | db9dc8d4ba | force results to string | 2024-01-23 01:55:28 +02:00
ghost | 50dc9d315a | add rank field | 2024-01-22 22:56:36 +02:00
ghost | 6f4abe4729 | set crc32url as document id | 2024-01-22 22:52:37 +02:00
ghost | 93baed4b90 | delete deprecated documents with HTTP code not 200 on second scan | 2023-12-20 08:44:35 +02:00
ghost | 33cc778999 | crawl newest pages by rand in queue | 2023-12-10 00:29:18 +02:00
ghost | 35ad144a9e | add stripos url rules for crawl snaps | 2023-12-02 22:15:44 +02:00