Reference

Configuration

KToolBox Configuration

Attributes:

Name Type Description Default
api APIConfiguration

Kemono API Configuration

APIConfiguration()
downloader DownloaderConfiguration

File Downloader Configuration

DownloaderConfiguration()
job JobConfiguration

Download jobs Configuration

JobConfiguration()
logger LoggerConfiguration

Logger configuration

LoggerConfiguration()
ssl_verify bool

Enable SSL certificate verification for Kemono API server and download server

True
json_dump_indent int

Indentation width used when dumping JSON files

4
use_uvloop bool

Use uvloop/winloop for asyncio performance optimization. Uses winloop on Windows and uvloop on Unix-like systems for better concurrent performance. Install winloop on Windows with pip install ktoolbox[winloop], or uvloop on Unix with pip install ktoolbox[uvloop].

True
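
As a rough illustration of the platform rule described for use_uvloop, the sketch below picks the event loop package by platform and falls back to None when the optional package is not installed. The helper names loop_package_name and load_fast_loop are hypothetical and are not part of KToolBox.

```python
import importlib
import sys
from types import ModuleType
from typing import Optional


def loop_package_name(platform: str = sys.platform) -> str:
    """Return the package named in the description for this platform."""
    # sys.platform is "win32" on Windows; anything else is treated as Unix-like here.
    return "winloop" if platform.startswith("win") else "uvloop"


def load_fast_loop(platform: str = sys.platform) -> Optional[ModuleType]:
    """Import the optional package, or return None if it is not installed."""
    try:
        return importlib.import_module(loop_package_name(platform))
    except ImportError:
        # Caller can keep using the standard asyncio event loop.
        return None
```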

APIConfiguration

Kemono API Configuration

Attributes:

Name Type Description Default
scheme Literal['http', 'https']

Kemono API URL scheme

'https'
netloc str

Kemono API URL netloc

'kemono.cr'
statics_netloc str

URL netloc of Kemono server for static files (e.g. images)

'img.kemono.cr'
files_netloc str

URL netloc of Kemono server for post files

'kemono.cr'
path str

Kemono API URL root path

'/api/v1'
timeout float

API request timeout

5.0
retry_times int

Number of API request retries (when a request fails)

3
retry_interval float

Interval between API request retries, in seconds

2.0
session_key str

Session key that can be found in cookies after a successful login

''
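
The scheme, netloc and path attributes together describe the API root URL. A minimal sketch of how they could be combined with the standard library is shown below; api_root and api_url are hypothetical helpers, and the endpoint string in the usage lines is a placeholder rather than a documented Kemono route.

```python
from urllib.parse import urlunsplit


def api_root(scheme: str = "https", netloc: str = "kemono.cr", path: str = "/api/v1") -> str:
    """Join scheme, netloc and root path into one URL (no query, no fragment)."""
    return urlunsplit((scheme, netloc, path, "", ""))


def api_url(endpoint: str, **root_parts: str) -> str:
    """Append an endpoint path to the API root, avoiding a doubled slash."""
    return api_root(**root_parts) + "/" + endpoint.lstrip("/")


print(api_root())                          # https://kemono.cr/api/v1
print(api_url("/placeholder/endpoint"))    # https://kemono.cr/api/v1/placeholder/endpoint
```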

DownloaderConfiguration

File Downloader Configuration

Attributes:

Name Type Description Default
scheme Literal['http', 'https']

Downloader URL scheme

'https'
timeout float

Downloader request timeout

30.0
encoding str

Charset used for filename parsing and for saving post content and external_links

'utf-8'
buffer_size int

Size in bytes of the file I/O buffer for each file being downloaded

20480
chunk_size int

Size in bytes of each chunk read from the downloader stream

1024
temp_suffix str

Temporary filename suffix for files being downloaded

'tmp'
retry_times int

Number of downloader retries (when a download fails)

10
retry_stop_never bool

Never stop the downloader from retrying when a download fails (retry_times is ignored when enabled)

False
retry_interval float

Interval between downloader retries, in seconds

3.0
tps_limit float

Maximum connections established per second

5.0
use_bucket bool

Enable local storage bucket mode

False
bucket_path Path

Path of local storage bucket

Path('./.ktoolbox/bucket_storage')
reverse_proxy str

Reverse proxy format for the download URL. Customize the URL format by inserting an empty {} to represent the original URL. For example, https://example.com/{} becomes https://example.com/https://n1.kemono.su/data/66/83/xxxxx.jpg, and https://example.com/?url={} becomes https://example.com/?url=https://n1.kemono.su/data/66/83/xxxxx.jpg

'{}'
keep_metadata bool

Keep the file metadata when downloading files (e.g. last modified time, etc.)

True
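
Two of the downloader settings above lend themselves to a short sketch: reverse_proxy substitution and the retry options. The code below is an illustration only, not KToolBox's implementation. It assumes that retry_times counts retries after the first attempt (the reference does not say), and apply_reverse_proxy and download_with_retries are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def apply_reverse_proxy(reverse_proxy: str, url: str) -> str:
    """Substitute the original URL into the empty {} placeholder."""
    return reverse_proxy.format(url)


def download_with_retries(
    task: Callable[[], T],
    retry_times: int = 10,
    retry_interval: float = 3.0,
    retry_stop_never: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on failure according to the retry settings."""
    failures = 0
    while True:
        try:
            return task()
        except Exception:
            failures += 1
            # Assumption: retry_times is the number of retries after the first try.
            # With retry_stop_never enabled, the limit is ignored entirely.
            if not retry_stop_never and failures > retry_times:
                raise
            sleep(retry_interval)


print(apply_reverse_proxy("https://example.com/{}", "https://n1.kemono.su/data/66/83/xxxxx.jpg"))
# https://example.com/https://n1.kemono.su/data/66/83/xxxxx.jpg
```

Passing sleep as a parameter keeps the sketch easy to exercise without real waiting.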

PostStructureConfiguration

Post path structure model

  • Default:

    ..
    ├─ content.txt
    ├─ external_links.txt
    ├─ {id}_{}.png (file)
    ├─ post.json (metadata)
    ├─ attachments
    │    ├─ 1.png
    │    └─ 2.png
    └─ revisions
         ├─ <PostStructure>
         │    ├─ ...
         │    └─ ...
         └─ <PostStructure>
              ├─ ...
              └─ ...
    

  • Available properties for file

    Property Type
    id String
    user String
    service String
    title String
    added Date
    published Date
    edited Date
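
The default layout in the tree above can be sketched with pathlib by joining each sub path onto a post directory. This is an illustration under the assumption that every sub path is resolved relative to the post directory; post_layout is a hypothetical helper and "MyPost" is a made-up directory name.

```python
from pathlib import PurePosixPath
from typing import Dict


def post_layout(
    post_dir: str,
    attachments: str = "attachments",
    content: str = "content.txt",
    external_links: str = "external_links.txt",
    revisions: str = "revisions",
) -> Dict[str, PurePosixPath]:
    """Resolve each sub path of the post structure against the post directory."""
    root = PurePosixPath(post_dir)
    return {
        "attachments": root / attachments,
        "content": root / content,
        "external_links": root / external_links,
        "revisions": root / revisions,
    }


for name, path in post_layout("MyPost").items():
    print(f"{name}: {path}")
```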

Attributes:

Name Type Description Default
attachments Path

Sub path of attachment directory

Path('attachments')
content Path

Sub path of post content file

Path('content.txt')
external_links Path

Sub path of external links file (for cloud storage links found in content)

Path('external_links.txt')
file str

The format of the post file filename (the file is not an attachment; each post has only one file, usually the cover image). Customize the filename format by inserting an empty {} to represent the basic filename. You can use some of the properties in Post. For example, {title}_{} could result in filenames like TheTitle_Stelle_lv5_logo.gif, TheTitle_ScxHjZIdxt5cnjaAwf3ql2p7.jpg, etc. You can also use the Python Format Specification Mini-Language. For example, {title:.6}_{} shortens the title to 6 characters, turning HiEveryoneThisIsALongTitle_ScxHjZIdxt5cnjaAwf3ql2p7.jpg into HiEver_ScxHjZIdxt5cnjaAwf3ql2p7.jpg

'{id}_{}'
revisions Path

Sub path of revisions directory

Path('revisions')
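
To make the file format concrete, the sketch below applies Python's str.format the way the description suggests: the empty {} receives the basic filename and the named fields receive Post properties. How KToolBox fills the template internally is an assumption here, and format_post_file is a hypothetical helper.

```python
def format_post_file(file_format: str, basic_filename: str, **post_properties: str) -> str:
    """Fill the empty {} with the basic filename and named fields with Post properties."""
    return file_format.format(basic_filename, **post_properties)


# Default format: the post id is prepended to the basic filename.
print(format_post_file("{id}_{}", "cover.png", id="123123"))
# 123123_cover.png

# The ".6" precision truncates the title to its first 6 characters.
print(format_post_file(
    "{title:.6}_{}",
    "ScxHjZIdxt5cnjaAwf3ql2p7.jpg",
    title="HiEveryoneThisIsALongTitle",
))
# HiEver_ScxHjZIdxt5cnjaAwf3ql2p7.jpg
```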

JobConfiguration

Download jobs Configuration

  • Available properties for post_dirname_format and filename_format

    Property Type
    id String
    user String
    service String
    title String
    added Date
    published Date
    edited Date
  • Available properties for year_dirname_format and month_dirname_format

    Property Type
    year String
    month String
  • Python Format Specification Mini-Language reference:

    https://docs.python.org/3.13/library/string.html#format-specification-mini-language
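
The format strings listed above follow Python's str.format syntax, so their effect can be previewed with plain Python. The sketch below is an illustration, not KToolBox code: render is a hypothetical helper, the property values are sample data, and year and month are passed as integers (an assumption, made so that a specifier such as {month:02d} applies).

```python
def render(fmt: str, **properties: object) -> str:
    """Fill a format string with the given properties; unused properties are ignored."""
    return fmt.format(**properties)


post = {"id": "123123", "user": "234234", "title": "TheTitle", "published": "2024-1-1"}

print(render("[{published}]{id}", **post))                      # [2024-1-1]123123
print(render("{user}_{published}_{title}", **post))             # 234234_2024-1-1_TheTitle
print(render("{title:.6}", title="HiEveryoneThisIsALongTitle")) # HiEver
print(render("Year_{year}", year=2024))                         # Year_2024
print(render("{year}-{month:02d}", year=2024, month=1))         # 2024-01
```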

Attributes:

Name Type Description Default
count int

Number of coroutines for concurrent download

4
include_revisions bool

Include and download revision posts when available

False
post_dirname_format str

Customize the post directory name format. You can use some of the properties in Post. For example, [{published}]{id} could result in a dirname like [2024-1-1]123123, and {user}_{published}_{title} could result in a dirname like 234234_2024-1-1_TheTitle. You can also use the Python Format Specification Mini-Language. For example, {title:.6} shortens the title to 6 characters, turning HiEveryoneThisIsALongTitle into HiEver

'{title}'
post_structure PostStructureConfiguration

Post path structure

PostStructureConfiguration()
mix_posts bool

Save all files from different posts to the same path in the creator directory. No post directories are created, and CreatorIndices are not recorded.

False
sequential_filename bool

Rename attachments in numerical order, e.g. 1.png, 2.png, ...

False
sequential_filename_excludes Set[str]

File extensions to exclude from sequential naming when sequential_filename is enabled. Files with these extensions will keep their original names. e.g. [".psd", ".zip", ".mp4"]

Field(default_factory=set)
filename_format str

Customize the filename format by inserting an empty {} to represent the basic filename. As with post_dirname_format, you can use some of the properties in Post. For example, {title}_{} could result in filenames like TheTitle_b4b41de2-8736-480d-b5c3-ebf0d917561b, TheTitle_af349b25-ac08-46d7-98fb-6ce99a237b90, etc. You can also combine it with sequential_filename. For instance, [{published}]_{} could result in filenames like [2024-1-1]_1.png, [2024-1-1]_2.png, etc. The Python Format Specification Mini-Language is supported as well. For example, {title:.6} shortens the title to 6 characters, turning HiEveryoneThisIsALongTitle into HiEver

'{}'
allow_list Set[str]

Download files which match these patterns (Unix shell-style), e.g. ["*.png"]

Field(default_factory=set)
block_list Set[str]

Do not download files that match these patterns (Unix shell-style), e.g. ["*.psd","*.zip"]

Field(default_factory=set)
extract_content bool

Extract post content and save it to a separate file (the filename is defined in config.job.post_structure.content)

False
extract_content_images bool

Extract images from post content and download them.

False
extract_external_links bool

Extract external file sharing links from post content and save them to a separate file (the filename is defined in config.job.post_structure.external_links)

False
external_link_patterns List[str]

Regex patterns for extracting external links.

['https?://drive\\.google\\.com/[^\\s]+', 'https?://docs\\.google\\.com/[^\\s]+', 'https?://mega\\.nz/[^\\s]+', 'https?://mega\\.co\\.nz/[^\\s]+', 'https?://(?:www\\.)?dropbox\\.com/[^\\s]+', 'https?://db\\.tt/[^\\s]+', 'https?://onedrive\\.live\\.com/[^\\s]+', 'https?://1drv\\.ms/[^\\s]+', 'https?://(?:www\\.)?mediafire\\.com/[^\\s]+', 'https?://(?:www\\.)?wetransfer\\.com/[^\\s]+', 'https?://we\\.tl/[^\\s]+', 'https?://(?:www\\.)?sendspace\\.com/[^\\s]+', 'https?://(?:www\\.)?4shared\\.com/[^\\s]+', 'https?://(?:www\\.)?zippyshare\\.com/[^\\s]+', 'https?://(?:www\\.)?uploadfiles\\.io/[^\\s]+', 'https?://(?:www\\.)?box\\.com/[^\\s]+', 'https?://(?:www\\.)?pcloud\\.com/[^\\s]+', 'https?://disk\\.yandex\\.[a-z]+/[^\\s]+', 'https?://[^\\s]*(?:file|upload|share|download|drive|storage)[^\\s]*\\.[a-z]{2,4}/[^\\s]+']
group_by_year bool

Group posts by year in separate directories based on published date

False
group_by_month bool

Group posts by month in separate directories based on published date (requires group_by_year)

False
year_dirname_format str

Customize the year directory name format. Available properties: year. e.g. {year} > 2024, Year_{year} > Year_2024

'{year}'
month_dirname_format str

Customize the month directory name format. Available properties: year, month. e.g. {year}-{month} > 2024-01, {year}_{month} > 2024_01

'{year}-{month:02d}'
keywords Set[str]

Keywords to filter posts by title (case-insensitive)

Field(default_factory=set)
keywords_exclude Set[str]

Keywords to exclude posts by title (case-insensitive)

Field(default_factory=set)
download_file bool

Download post file (usually cover image). Set to False to skip file downloads.

True
download_attachments bool

Download post attachments. Set to False to skip attachment downloads.

True
min_file_size Optional[int]

Minimum file size in bytes to download. Files smaller than this will be skipped. Set to None to disable minimum size filtering.

None
max_file_size Optional[int]

Maximum file size in bytes to download. Files larger than this will be skipped. Set to None to disable maximum size filtering.

None
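
Several of the job options above act as filters, and a small sketch may help show how they could interact. The code below is an illustration only: it assumes that an empty allow_list permits every file and that block_list is checked after allow_list, neither of which is stated in this reference. should_download and extract_external_links are hypothetical helpers, and the two patterns are taken from the external_link_patterns default.

```python
import re
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def should_download(
    filename: str,
    size: int,
    allow_list: Iterable[str] = (),
    block_list: Iterable[str] = (),
    min_file_size: Optional[int] = None,
    max_file_size: Optional[int] = None,
) -> bool:
    """Apply the Unix shell-style pattern filters, then the size limits."""
    allow = list(allow_list)
    # Assumption: an empty allow_list lets every filename through.
    if allow and not any(fnmatchcase(filename, p) for p in allow):
        return False
    if any(fnmatchcase(filename, p) for p in block_list):
        return False
    if min_file_size is not None and size < min_file_size:
        return False
    if max_file_size is not None and size > max_file_size:
        return False
    return True


def extract_external_links(content: str, patterns: Iterable[str]) -> List[str]:
    """Collect unique regex matches, in pattern order and then text order."""
    found: List[str] = []
    for pattern in patterns:
        for match in re.finditer(pattern, content):
            if match.group(0) not in found:
                found.append(match.group(0))
    return found


patterns = [r"https?://drive\.google\.com/[^\s]+", r"https?://mega\.nz/[^\s]+"]
text = "Files: https://mega.nz/file/abc123 and https://drive.google.com/file/d/xyz/view thanks"
print(extract_external_links(text, patterns))
print(should_download("art.psd", 4096, block_list={"*.psd", "*.zip"}))  # False
```

fnmatchcase is used instead of fnmatch so that matching is case-sensitive on every platform; that choice is part of the sketch, not a documented behavior.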

LoggerConfiguration

Logger configuration

Attributes:

Name Type Description Default
path Optional[Path]

Path to save logs; None disables log file output

None
level Union[str, int]

Log filter level

logging.getLevelName(logging.DEBUG)
rotation Union[str, int, time, timedelta]

Log rotation

'1 week'
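
The default for level is whatever logging.getLevelName(logging.DEBUG) returns, which is the string 'DEBUG'. Because level accepts either a string or an integer, the sketch below shows one way both forms could be normalized to a level name. normalize_level is a hypothetical helper, not a KToolBox function.

```python
import logging
from typing import Union


def normalize_level(level: Union[str, int]) -> str:
    """Return a level name for either a numeric level or a level string."""
    if isinstance(level, int):
        # Standard numeric levels map to their registered names.
        return logging.getLevelName(level)
    return level.upper()


print(normalize_level(logging.DEBUG))  # DEBUG
print(normalize_level("info"))         # INFO
```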