sist2

Lightning-fast file system indexer and search tool.


Product Overview

Sist2 is a fast file system indexer and search tool that can scan files stored inside archives (such as zip, tar, and 7z) as if they were directly in the file system. It also scans archives nested inside other archives, which makes it well suited to indexing and searching large collections of files.
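
To make the idea of scanning "archives inside archives" concrete, here is a minimal Python sketch. It is an illustration only, not how Sist2 itself is implemented: it handles zip files only, buffers each member in memory, and the helper names walk_zip and make_zip are hypothetical.

```python
import io
import zipfile


def walk_zip(data, prefix=""):
    """Yield the virtual path of every file in a zip, descending into nested zips."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            payload = archive.read(info)
            # If the member is itself a zip, recurse and treat it like a directory.
            if zipfile.is_zipfile(io.BytesIO(payload)):
                yield from walk_zip(payload, prefix=path + "/")
            else:
                yield path


def make_zip(members):
    """Build an in-memory zip from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Demo: a zip that contains a regular file and another zip.
inner = make_zip({"notes.txt": b"hello"})
outer = make_zip({"docs/readme.md": b"# readme", "backup.zip": inner})
paths = sorted(walk_zip(outer))
print(paths)
```

Each nested file gets a path as if the inner archive were a folder, which is the behaviour the overview describes. A real indexer would stream members rather than hold them in memory.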

Main Features

  • Archive Scanning: Sist2 can scan files stored in archive formats (zip, tar, 7z) as if they were directly in the file system.
  • Recursive Scanning: Supports scanning of archives inside archives.
  • OCR Support: Provides OCR for ebook and image file types through the options --ocr-lang, --ocr-images, and --ocr-ebooks.
  • Language Support: Comes with common OCR languages (hin, jpn, eng, fra, rus, spa, chi_sim, deu, pol) pre-installed. Multiple languages can be combined with the + separator (for example, eng+fra).
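
As a rough illustration of how the OCR options above might be combined, the Python sketch below assembles a command line. The option names come from the feature list; the `scan` subcommand, the trailing path argument, and the helper build_scan_command are assumptions made for this example, so check the tool's own help output before relying on them.

```python
import shlex


def build_scan_command(path, ocr_langs=(), ocr_images=False, ocr_ebooks=False):
    """Assemble an argument list for a hypothetical `sist2 scan` invocation."""
    argv = ["sist2", "scan"]  # assumption: `scan` subcommand, not stated in this overview
    if ocr_langs:
        # Multiple OCR languages are joined with the + separator.
        argv += ["--ocr-lang", "+".join(ocr_langs)]
    if ocr_images:
        argv.append("--ocr-images")
    if ocr_ebooks:
        argv.append("--ocr-ebooks")
    argv.append(path)  # assumption: the directory to scan is passed last
    return argv


command = build_scan_command(
    "/data/library", ocr_langs=["eng", "fra"], ocr_images=True, ocr_ebooks=True
)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed to subprocess.run without shell quoting problems.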

Limitations

  • Limited support for parsing media files in formats that require seeking (e.g. .gif, or .mp4 with fragmented metadata).
  • Archives are scanned sequentially by a single thread. On systems where Sist2 is not I/O bound, scans might be faster when larger archives are split into smaller parts.

Overall, Sist2 is suited to large-scale file management tasks that involve indexing and searching files stored in archive formats.

Related

Concrete 5 CMS
bitmagnet
UVDesk
Calibre
SearXNG
LibreX
Digibunch
DirectoryLister
ElasticSearch
sish