scrapy_cffi keeps a Scrapy-style crawler architecture and supports async execution with both HTTP and WebSocket requests. CLI support is minimal; the Python API is the recommended way to run spiders.
The real highlights are its utility extensions:
• JSON Extractor – handles standard, embedded, and malformed JSON
• Media Downloader – segmented downloads for videos and large files
• Async Database Managers – Redis, MySQL, MongoDB with automatic retry and reconnection
• Multi-process RPC – quickly register functions, classes, and objects for rapid prototyping without MQ/Redis
These utilities can be used independently or combined into full async crawlers, offering flexibility, rapid prototyping, and easy extensibility.
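To make the "embedded JSON" case concrete, here is a minimal standard-library sketch of pulling JSON values out of surrounding text such as an HTML script tag. It is not scrapy_cffi's own API: the function name `extract_json_objects` is hypothetical, and the sketch assumes the library does something comparable internally. It only skips fragments that fail to parse and does not attempt to repair malformed JSON.

```python
import json


def extract_json_objects(text):
    """Return every JSON object or array that can be decoded from `text`.

    Hypothetical helper for illustration only; not part of scrapy_cffi.
    """
    decoder = json.JSONDecoder()
    found = []
    pos = 0
    while pos < len(text):
        # Jump to the next character that could open a JSON object or array.
        starts = [i for i in (text.find("{", pos), text.find("[", pos)) if i != -1]
        if not starts:
            break
        start = min(starts)
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the value plus the index just past its end.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON at this position; move one character forward.
            pos = start + 1
            continue
        found.append(value)
        pos = end
    return found


html = '<script>var data = {"id": 7, "tags": ["a", "b"]};</script> noise [1, 2] {broken'
results = extract_json_objects(html)
print(results)
```

Scanning with `raw_decode` avoids fragile regular expressions for nested braces, because the JSON parser itself decides where each value ends.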