Websucker Crawler Agent
			
		
Websucker
Agent for crawling (sucking) the Web
Features
- Crawling of best domains
- Crawling of unvisited domains
- Text mining
- Evaluation of domains
- Daily report
- Database summary
 
Requirements
- Python 3
- A running Cassandra 3.11 instance
- Beanstalkd (optional), used as a work queue
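
Before installing, it can help to confirm that the required services are actually listening. The following is a minimal sketch, not part of Websucker itself: it assumes Cassandra and Beanstalkd run on localhost with their usual default ports (9042 for Cassandra's CQL native transport, 11300 for Beanstalkd), and the helper names `is_reachable` and `check_services` are hypothetical. A successful TCP connection only shows that something accepts connections on that port; it does not verify the service version.

```python
import socket

# Assumed defaults (not taken from the Websucker configuration):
# Cassandra's CQL native transport on 9042, Beanstalkd on 11300.
CASSANDRA_ADDR = ("127.0.0.1", 9042)
BEANSTALKD_ADDR = ("127.0.0.1", 11300)


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(services):
    """Map each service name to whether its (host, port) accepts connections."""
    return {
        name: is_reachable(host, port)
        for name, (host, port) in services.items()
    }


if __name__ == "__main__":
    status = check_services({
        "cassandra": CASSANDRA_ADDR,
        "beanstalkd (optional)": BEANSTALKD_ADDR,
    })
    for name, ok in status.items():
        print(f"{name}: {'reachable' if ok else 'not reachable'}")
```

If Cassandra is reported as not reachable, start it (or adjust the address in the sketch) before running the crawler. Beanstalkd is optional, so a negative result there only matters if you intend to use the work queue.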
 
Installation
Create and activate a virtual environment:
python -m virtualenv ./venv
source ./venv/bin/activate
Install the package:
pip install https://git.kemt.fei.tuke.sk/dano/websucker-pip/archive/master.zip
Usage
Show the command-line help:
websuck --help