# Websucker
Agent for sucking the Web
## Features
- Crawling of best domains
- Crawling of unvisited domains
- Text mining
- Evaluation of domains
- Daily report
- Database Summary
## Requirements
- Python 3
- A running Cassandra 3.11 instance
- Beanstalkd (optional), used as the work queue
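
Before installing, it can help to confirm that the required services are reachable. The script below is a minimal sketch and not part of Websucker. It assumes Cassandra listens on its default native-transport port 9042 and Beanstalkd on its default port 11300, both on localhost. The names `is_port_open` and `SERVICES` are hypothetical and exist only for this illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Assumed defaults, not read from any Websucker configuration.
SERVICES = {
    "Cassandra": ("127.0.0.1", 9042),
    "Beanstalkd (optional)": ("127.0.0.1", 11300),
}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        state = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{name} at {host}:{port}: {state}")
```

An open port only shows that something is listening. It does not verify the Cassandra version or that the expected keyspace exists. Adjust the hosts and ports if your services run elsewhere.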
## Installation
Create and activate a virtual environment:

    python -m virtualenv ./venv
    source ./venv/bin/activate

Install the package:

    pip install https://git.kemt.fei.tuke.sk/dano/websucker-pip/archive/master.zip

## Usage
Show the command-line help:

    websuck --help