Reference
Configuration
KToolBox Configuration
Attributes:
| Name | Type | Description | Default |
|---|---|---|---|
| `api` | `APIConfiguration` | Kemono API Configuration | `APIConfiguration()` |
| `downloader` | `DownloaderConfiguration` | File Downloader Configuration | `DownloaderConfiguration()` |
| `job` | `JobConfiguration` | Download jobs Configuration | `JobConfiguration()` |
| `logger` | `LoggerConfiguration` | Logger configuration | `LoggerConfiguration()` |
| `ssl_verify` | `bool` | Enable SSL certificate verification for Kemono API server and download server | `True` |
| `json_dump_indent` | `int` | Indent of JSON file dump | `4` |
| `use_uvloop` | `bool` | Use uvloop/winloop for asyncio performance optimization. Uses winloop on Windows and uvloop on Unix-like systems for better concurrent performance. Install winloop on Windows with | `True` |

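
The nested layout above (for example, `api` holding an `APIConfiguration`) maps naturally onto delimiter-separated setting names. The sketch below is a stdlib-only illustration, not KToolBox's own loader: it assumes settings arrive as environment-style names with a `KTOOLBOX_` prefix and `__` between nesting levels, which should be checked against the project's configuration guide. `parse_overrides` and `example_env` are hypothetical names.

```python
from typing import Any, Dict, Mapping


def parse_overrides(
    environ: Mapping[str, str],
    prefix: str = "KTOOLBOX_",  # assumed prefix, not confirmed by this reference
    delimiter: str = "__",      # assumed nesting delimiter
) -> Dict[str, Any]:
    """Turn flat, prefixed setting names into a nested dict of raw string values."""
    overrides: Dict[str, Any] = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated variable, ignore it
        parts = key[len(prefix):].lower().split(delimiter)
        node = overrides
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return overrides


# Hypothetical values; the attribute names come from the tables in this reference.
example_env = {
    "KTOOLBOX_API__TIMEOUT": "10.0",
    "KTOOLBOX_JOB__COUNT": "8",
    "KTOOLBOX_SSL_VERIFY": "False",
    "PATH": "/usr/bin",
}
overrides = parse_overrides(example_env)
```

The values stay as strings here; converting them to the types listed in the tables (`float`, `int`, `bool`) is left to whatever validation layer consumes them.
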
APIConfiguration
Kemono API Configuration
Attributes:
| Name | Type | Description | Default |
|---|---|---|---|
| `scheme` | `Literal['http', 'https']` | Kemono API URL scheme | `'https'` |
| `netloc` | `str` | Kemono API URL netloc | `'kemono.cr'` |
| `statics_netloc` | `str` | URL netloc of Kemono server for static files (e.g. images) | `'img.kemono.cr'` |
| `files_netloc` | `str` | URL netloc of Kemono server for post files | `'kemono.cr'` |
| `path` | `str` | Kemono API URL root path | `'/api/v1'` |
| `timeout` | `float` | API request timeout | `5.0` |
| `retry_times` | `int` | API request retry times (when request failed) | `3` |
| `retry_interval` | `float` | Seconds of API request retry interval | `2.0` |
| `session_key` | `str` | Session key that can be found in cookies after a successful login | `''` |

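
To make the roles of `scheme`, `netloc`, and `path` concrete, here is a minimal sketch that joins the default values into an API URL using only the standard library. How KToolBox assembles URLs internally is not stated in this reference, so `build_api_url` and the `/posts` endpoint are illustrative assumptions.

```python
from urllib.parse import urlunparse

# Defaults taken from the APIConfiguration table.
scheme = "https"
netloc = "kemono.cr"
path = "/api/v1"


def build_api_url(endpoint: str = "") -> str:
    """Join the configured scheme, netloc, and root path with an endpoint (hypothetical helper)."""
    return urlunparse((scheme, netloc, path + endpoint, "", "", ""))


base_url = build_api_url()           # the API root
posts_url = build_api_url("/posts")  # "/posts" is only an example endpoint
```
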
DownloaderConfiguration
File Downloader Configuration
Attributes:
| Name | Type | Description | Default |
|---|---|---|---|
| `scheme` | `Literal['http', 'https']` | Downloader URL scheme | `'https'` |
| `timeout` | `float` | Downloader request timeout | `30.0` |
| `encoding` | `str` | Charset for filename parsing and post | `'utf-8'` |
| `buffer_size` | `int` | Number of bytes of file I/O buffer for each downloading file | `20480` |
| `chunk_size` | `int` | Number of bytes of chunk of downloader stream | `1024` |
| `temp_suffix` | `str` | Temp filename suffix of downloading files | `'tmp'` |
| `retry_times` | `int` | Downloader retry times (when download failed) | `10` |
| `retry_stop_never` | `bool` | Never stop downloader from retrying (when download failed) | `False` |
| `retry_interval` | `float` | Seconds of downloader retry interval | `3.0` |
| `tps_limit` | `float` | Maximum connections established per second | `5.0` |
| `use_bucket` | `bool` | Enable local storage bucket mode | `False` |
| `bucket_path` | `Path` | Path of local storage bucket | `Path('./.ktoolbox/bucket_storage')` |
| `reverse_proxy` | `str` | Reverse proxy format for download URL. Customize the filename format by inserting an empty | `'{}'` |
| `keep_metadata` | `bool` | Keep the file metadata when downloading files (e.g. last modified time, etc.) | `True` |

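
The three retry settings interact, so a small sketch may help. It is not KToolBox's implementation: it assumes `retry_times` counts additional attempts after the first failure and that `retry_stop_never` simply removes that limit. `run_with_retry` and `flaky` are hypothetical names.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retry(
    task: Callable[[], T],
    retry_times: int = 10,
    retry_interval: float = 3.0,
    retry_stop_never: bool = False,
) -> T:
    """Call task until it succeeds, or until the retry budget is used up."""
    failures = 0
    while True:
        try:
            return task()
        except Exception:
            failures += 1
            # Assumption: retry_stop_never overrides the retry_times limit.
            if not retry_stop_never and failures > retry_times:
                raise
            time.sleep(retry_interval)


attempts: List[int] = []


def flaky() -> str:
    """Simulated download that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated failure")
    return "done"


result = run_with_retry(flaky, retry_times=5, retry_interval=0.0)
```
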
PostStructureConfiguration
Post path structure model
- Default:

      ..
      ├─ content.txt
      ├─ external_links.txt
      ├─ {id}_{}.png (file)
      ├─ post.json (metadata)
      ├─ attachments
      │    ├─ 1.png
      │    └─ 2.png
      └─ revisions
           ├─ <PostStructure>
           │    ├─ ...
           │    └─ ...
           └─ <PostStructure>
                ├─ ...
                └─ ...

- Available properties for `file`:

  | Property | Type |
  |---|---|
  | `id` | String |
  | `user` | String |
  | `service` | String |
  | `title` | String |
  | `added` | Date |
  | `published` | Date |
  | `edited` | Date |

Attributes:
| Name | Type | Description | Default |
|---|---|---|---|
| `attachments` | `Path` | Sub path of attachment directory | `Path('attachments')` |
| `content` | `Path` | Sub path of post content file | `Path('content.txt')` |
| `external_links` | `Path` | Sub path of external links file (for cloud storage links found in content) | `Path('external_links.txt')` |
| `file` | `str` | The format of the post | `'{id}_{}'` |
| `revisions` | `Path` | Sub path of revisions directory | `Path('revisions')` |

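
The default `file` value `'{id}_{}'` looks like a Python format string, and the default tree shows it rendered as `{id}_{}.png`. The sketch below assumes the empty `{}` receives the original file's base name and that the extension is appended afterwards; this reference does not confirm either detail, and `render_file_name` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def render_file_name(file_format: str, original_name: str, **properties: str) -> str:
    """Fill the empty {} with the original stem and named fields with post properties."""
    original = PurePosixPath(original_name)
    # Assumption: the extension is kept unchanged and appended after formatting.
    return file_format.format(original.stem, **properties) + original.suffix


# "12345" and "cover.png" are made-up sample values.
name = render_file_name("{id}_{}", "cover.png", id="12345")
```
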
JobConfiguration
Download jobs Configuration
- Available properties for `post_dirname_format` and `filename_format`:

  | Property | Type |
  |---|---|
  | `id` | String |
  | `user` | String |
  | `service` | String |
  | `title` | String |
  | `added` | Date |
  | `published` | Date |
  | `edited` | Date |

- Available properties for `year_dirname_format` and `month_dirname_format`:

  | Property | Type |
  |---|---|
  | `year` | String |
  | `month` | String |

- Python Format Specification Mini-Language reference:
  https://docs.python.org/3.13/library/string.html#format-specification-mini-language

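
As a rough illustration of what the format mini-language allows in these settings, the sketch below formats a date property with `strftime`-style codes and reproduces the default `month_dirname_format`. The sample post values are invented. The example passes `year` and `month` as integers because the `02d` specifier in the default only works on numbers, which is an assumption about how KToolBox supplies them.

```python
from datetime import datetime

# Hypothetical post properties; the names mirror the property tables above.
post = {
    "id": "12345",
    "title": "Sketchbook",
    "published": datetime(2024, 3, 5),
}

# A date property accepts strftime-style codes after the colon.
post_dirname = "[{published:%Y-%m-%d}] {title}".format(**post)

# Default month_dirname_format; integers are assumed so that `02d` zero-pads.
month_dirname = "{year}-{month:02d}".format(year=2024, month=3)
```

Unused properties (here `id`) are simply ignored by `str.format`, so a format string can use any subset of the available names.
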
Attributes:
| Name | Type | Description | Default |
|---|---|---|---|
| `count` | `int` | Number of coroutines for concurrent download | `4` |
| `include_revisions` | `bool` | Include and download revision posts when available | `False` |
| `post_dirname_format` | `str` | Customize the post directory name format, you can use some of the properties in | `'{title}'` |
| `post_structure` | `PostStructureConfiguration` | Post path structure | `PostStructureConfiguration()` |
| `mix_posts` | `bool` | Save all files from different posts at same path in creator directory. It would not create any post directory, and | `False` |
| `sequential_filename` | `bool` | Rename attachments in numerical order, e.g. | `False` |
| `sequential_filename_excludes` | `Set[str]` | File extensions to exclude from sequential naming when | `Field(default_factory=set)` |
| `filename_format` | `str` | Customize the filename format by inserting an empty | `'{}'` |
| `allow_list` | `Set[str]` | Download files which match these patterns (Unix shell-style), e.g. | `Field(default_factory=set)` |
| `block_list` | `Set[str]` | Not to download files which match these patterns (Unix shell-style), e.g. | `Field(default_factory=set)` |
| `extract_content` | `bool` | Extract post content and save to separate file (filename was defined in | `False` |
| `extract_content_images` | `bool` | Extract images from post content and download them. | `False` |
| `extract_external_links` | `bool` | Extract external file sharing links from post content and save to separate file (filename was defined in | `False` |
| `external_link_patterns` | `List[str]` | Regex patterns for extracting external links. | `['https?://drive\\.google\\.com/[^\\s]+', 'https?://docs\\.google\\.com/[^\\s]+', 'https?://mega\\.nz/[^\\s]+', 'https?://mega\\.co\\.nz/[^\\s]+', 'https?://(?:www\\.)?dropbox\\.com/[^\\s]+', 'https?://db\\.tt/[^\\s]+', 'https?://onedrive\\.live\\.com/[^\\s]+', 'https?://1drv\\.ms/[^\\s]+', 'https?://(?:www\\.)?mediafire\\.com/[^\\s]+', 'https?://(?:www\\.)?wetransfer\\.com/[^\\s]+', 'https?://we\\.tl/[^\\s]+', 'https?://(?:www\\.)?sendspace\\.com/[^\\s]+', 'https?://(?:www\\.)?4shared\\.com/[^\\s]+', 'https?://(?:www\\.)?zippyshare\\.com/[^\\s]+', 'https?://(?:www\\.)?uploadfiles\\.io/[^\\s]+', 'https?://(?:www\\.)?box\\.com/[^\\s]+', 'https?://(?:www\\.)?pcloud\\.com/[^\\s]+', 'https?://disk\\.yandex\\.[a-z]+/[^\\s]+', 'https?://[^\\s]*(?:file\|upload\|share\|download\|drive\|storage)[^\\s]*\\.[a-z]{2,4}/[^\\s]+']` |
| `group_by_year` | `bool` | Group posts by year in separate directories based on published date | `False` |
| `group_by_month` | `bool` | Group posts by month in separate directories based on published date (requires `group_by_year`) | `False` |
| `year_dirname_format` | `str` | Customize the year directory name format. Available properties: | `'{year}'` |
| `month_dirname_format` | `str` | Customize the month directory name format. Available properties: | `'{year}-{month:02d}'` |
| `keywords` | `Set[str]` | Keywords to filter posts by title (case-insensitive) | `Field(default_factory=set)` |
| `keywords_exclude` | `Set[str]` | Keywords to exclude posts by title (case-insensitive) | `Field(default_factory=set)` |
| `download_file` | `bool` | Download post file (usually cover image). Set to `False` to skip file downloads. | `True` |
| `download_attachments` | `bool` | Download post attachments. Set to `False` to skip attachment downloads. | `True` |
| `min_file_size` | `Optional[int]` | Minimum file size in bytes to download. Files smaller than this will be skipped. Set to `None` to disable minimum size filtering. | `None` |
| `max_file_size` | `Optional[int]` | Maximum file size in bytes to download. Files larger than this will be skipped. Set to `None` to disable maximum size filtering. | `None` |

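
The filtering options (`allow_list`, `block_list`, `min_file_size`, `max_file_size`) can be pictured as one predicate per file. The following is a minimal sketch rather than KToolBox's actual logic: it assumes an empty `allow_list` allows everything, that `block_list` wins over `allow_list`, and it uses case-sensitive `fnmatchcase` so the result does not depend on the operating system. `should_download` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from typing import Optional, Set


def should_download(
    filename: str,
    size: int,
    allow_list: Set[str],
    block_list: Set[str],
    min_file_size: Optional[int] = None,
    max_file_size: Optional[int] = None,
) -> bool:
    """Return True when a file passes the pattern and size filters."""
    # Assumption: an empty allow_list means "no restriction".
    if allow_list and not any(fnmatchcase(filename, p) for p in allow_list):
        return False
    # Assumption: a block_list match rejects the file even if it was allowed.
    if any(fnmatchcase(filename, p) for p in block_list):
        return False
    # None disables the corresponding size check, as described in the table.
    if min_file_size is not None and size < min_file_size:
        return False
    if max_file_size is not None and size > max_file_size:
        return False
    return True


# Made-up file names and sizes for illustration.
keep_jpg = should_download("cover.jpg", 2048, {"*.jpg"}, set())
skip_psd = should_download("source.psd", 2048, {"*.jpg"}, set())
skip_small = should_download("thumb.jpg", 10, set(), set(), min_file_size=1024)
```
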