# Network Monitoring System - Docker Multi-Container Application
## Overview
This project implements a lightweight **Network Monitoring System** using a multi-container Docker architecture. The application continuously monitors network connectivity and latency to predefined targets (public DNS servers) and displays real-time monitoring data through an interactive web dashboard.
The system demonstrates core Docker concepts including:
- Multi-container orchestration with Docker Compose
- Custom Docker image creation from Dockerfiles
- Container networking and inter-service communication
- Persistent data storage using Docker named volumes
- Service isolation and containerization
- Port mapping and exposure control
---
## Application Description
### Purpose
The Network Monitoring System is a simplified Network Operations Center (NOC) that provides real-time visibility into network connectivity and performance metrics. It continuously probes network targets and visualizes the results through a web dashboard.
### Functionality
**Core Features:**
1. **Network Probing**
- Continuous ICMP ping requests to multiple targets
- Monitoring targets include:
- Google DNS: `8.8.8.8`
- Cloudflare DNS: `1.1.1.1`
- Quad9 DNS: `9.9.9.9`
- Measures network latency (response time in milliseconds)
- Detects host reachability and availability
2. **Data Processing & Storage**
- Backend API receives probe data
- Processes and validates monitoring metrics
- Stores data persistently in SQLite database
- Maintains historical records across container restarts
3. **Web Dashboard**
- Real-time visualization of network status
- Shows last known state of each monitored target
- Displays latency metrics and availability status
- Accessible via web browser at `http://localhost:5000`
### System Behavior
- Probe service runs network checks every 10 seconds
- Backend API listens for probe updates and processes them
- Frontend dashboard updates to reflect latest monitoring data
- Data persists even if containers are stopped and restarted
- All services communicate over isolated Docker network
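The probe loop described above can be sketched in Python. This is a minimal illustration, not the project's actual source: the function names (`parse_latency_ms`, `probe`) and the `TARGETS` table are assumptions, while the targets, `BACKEND_URL`, `PROBE_INTERVAL`, and the `/metrics` endpoint follow this README.

```python
import re
import subprocess
import time

import requests  # HTTP client used to report results to the backend

# Assumed names; targets and configuration values come from this README.
TARGETS = {"8.8.8.8": "Google DNS", "1.1.1.1": "Cloudflare DNS", "9.9.9.9": "Quad9 DNS"}
BACKEND_URL = "http://backend:5001"
PROBE_INTERVAL = 10  # seconds between probe rounds


def parse_latency_ms(ping_output: str):
    """Extract the round-trip time in ms from `ping -c 1` output, or None."""
    match = re.search(r"time[=<]([\d.]+) ?ms", ping_output)
    return float(match.group(1)) if match else None


def probe(target: str):
    """Ping a target once; return (reachable, latency_ms)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", target],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return False, None
    return True, parse_latency_ms(result.stdout)


def main():
    while True:
        for ip in TARGETS:
            reachable, latency = probe(ip)
            # Report each result to the backend API
            requests.post(f"{BACKEND_URL}/metrics", json={
                "target": ip, "reachable": reachable, "latency_ms": latency,
            }, timeout=5)
        time.sleep(PROBE_INTERVAL)


if __name__ == "__main__":
    main()
```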
---
## Virtual Networks and Named Volumes
### Docker Network Architecture
**Network Name:** `monitoring-network`
**Type:** Custom Bridge Network (user-defined)
**Purpose:** Isolates application containers from the host and enables service-to-service communication
**Connected Services:**
- `probe` service: Generates monitoring data
- `backend` service: Processes and stores data
- `frontend` service: Displays data via web interface
**Communication Features:**
- Automatic DNS resolution: Services reference each other by name
- Isolated from default Docker bridge network
- Host machine accesses frontend via port mapping: `5000:5000`
**Network Topology:**
```
┌────────────────────────────────────────────┐
│        monitoring-network (Bridge)         │
│                                            │
│  ┌───────────┐            ┌───────────┐    │
│  │   probe   │────POST───▶│  backend  │    │
│  │ container │            │ container │    │
│  └───────────┘            └───────────┘    │
│                                 ▲          │
│  ┌───────────┐                  │          │
│  │ frontend  │───────GET───────┘          │
│  │ container │                             │
│  └───────────┘                             │
│        │ (port 5000)                       │
└────────┼───────────────────────────────────┘
         │ (exposed to host)
    localhost:5000
```
### Persistent Volumes
**Volume Name:** `monitoring-data`
**Type:** Named Volume (Docker-managed)
**Mount Path:** `/data` inside backend container
**Purpose:** Persists SQLite database file across container lifecycle events
**Data Preservation:**
- Survives container stop/start cycles
- Survives container removal (volume independent of container)
- Accessible by backend service for read/write operations
- Database remains intact when application is paused
**Volume Details:**
- Driver: local
- Capacity: Limited by host filesystem
- Backup: Locate the volume with `docker volume inspect`, then archive its contents using `docker run` with a temporary container
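The backup approach mentioned above can be sketched as follows. This is an illustrative sequence, not a script shipped with the project; the archive filename is an assumption:

```bash
# Show the volume's metadata (driver, mountpoint on the host)
docker volume inspect monitoring-data

# Archive the volume contents to the current directory using a
# throwaway container that mounts the volume read-only
docker run --rm \
  -v monitoring-data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/monitoring-data.tar.gz -C /data .
```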
---
## Container Configuration
### Container Specifications
#### 1. Probe Container
- **Image:** Custom image built from `Dockerfile.probe`
- **Container Name:** `network-probe`
- **Port Mapping:** Internal only (not exposed to host)
- **Network:** monitoring-network
- **Restart Policy:** `unless-stopped`
- **Environment Variables:**
- `BACKEND_URL`: Set to `http://backend:5001` for internal communication
- `PROBE_INTERVAL`: Interval between probes (10 seconds)
- **Volumes:** None
- **Resources:** No specific limits defined (uses host defaults)
- **Dependencies:** Requires `backend` service to be running
#### 2. Backend Container
- **Image:** Custom image built from `Dockerfile.backend`
- **Container Name:** `network-backend`
- **Port Mapping:** Internal only; the API listens on `5001` within the Docker network (not published to the host)
- **Network:** monitoring-network
- **Restart Policy:** `unless-stopped`
- **Environment Variables:**
- `FLASK_ENV`: Set to `development`
- `DATABASE_PATH`: `/data/monitoring.db`
- **Volumes:**
- Named volume `monitoring-data` mounted at `/data`
- **Resources:** No specific limits defined
- **Startup:** Flask API server on port 5001
- **Dependencies:** None (but required by probe and frontend)
#### 3. Frontend Container
- **Image:** Custom image built from `Dockerfile.frontend`
- **Container Name:** `network-frontend`
- **Port Mapping:** `5000:5000` (exposed to host)
- **Network:** monitoring-network
- **Restart Policy:** `unless-stopped`
- **Environment Variables:**
- `BACKEND_API_URL`: Set to `http://backend:5001` for internal communication
- **Volumes:** None
- **Resources:** No specific limits defined
- **Startup:** Static file server on port 5000
- **Dependencies:** Requires `backend` service for API calls
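The specifications above correspond to a Compose file roughly like the following. This is a hedged sketch assembled from the names and values in this README; the actual `docker-compose.yaml` may differ in detail:

```yaml
services:
  backend:
    build:
      context: .
      dockerfile: Dockerfile.backend
    container_name: network-backend
    environment:
      FLASK_ENV: development
      DATABASE_PATH: /data/monitoring.db
    volumes:
      - monitoring-data:/data
    networks:
      - monitoring-network
    restart: unless-stopped

  probe:
    build:
      context: .
      dockerfile: Dockerfile.probe
    container_name: network-probe
    environment:
      BACKEND_URL: http://backend:5001
      PROBE_INTERVAL: "10"
    networks:
      - monitoring-network
    restart: unless-stopped
    depends_on:
      - backend

  frontend:
    build:
      context: .
      dockerfile: Dockerfile.frontend
    container_name: network-frontend
    ports:
      - "5000:5000"
    environment:
      BACKEND_API_URL: http://backend:5001
    networks:
      - monitoring-network
    restart: unless-stopped
    depends_on:
      - backend

networks:
  monitoring-network:
    name: monitoring-network

volumes:
  monitoring-data:
    name: monitoring-data
```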
### Configuration Methods
**1. Docker Compose (Primary Method)**
- All configurations defined in `docker-compose.yaml`
- Environment variables specified in service definitions
- Volume mounts declared in service sections
- Network configuration defined at compose level
**2. Environment Variables**
- Used for dynamic configuration without rebuilding images
- Set in `docker-compose.yaml` under `environment` section
- Read by application startup scripts
**3. Startup Arguments**
- Python services accept command-line arguments
- Flask runs with host `0.0.0.0` and specified port
- Nginx configured via configuration file in container
**4. Configuration Files**
- Nginx config embedded in `Dockerfile.frontend`
- Flask app configuration in Python source code
- Database initialization handled by Flask ORM
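As an illustration of the embedded Nginx configuration, a minimal server block might resemble the following (the actual file lives in `Dockerfile.frontend`; paths and directive values here are assumptions):

```nginx
server {
    listen 5000;                      # dashboard port exposed by the container
    root   /usr/share/nginx/html;     # static HTML/CSS/JS files
    index  index.html;

    location / {
        try_files $uri $uri/ =404;    # serve static assets only
    }
}
```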
---
## List of Containers Used
### Container Summary Table
| Container Name | Image Type | Base Image | Port | Purpose | Service |
|---|---|---|---|---|---|
| `network-probe` | Custom | python:3.9-slim | Internal | Network monitoring probe sending ICMP pings | probe |
| `network-backend` | Custom | python:3.9-slim | 5001 | Flask API receiving probe data and managing database | backend |
| `network-frontend` | Custom | nginx:alpine | 5000 | Web server serving dashboard UI and assets | frontend |
### Detailed Container Descriptions
**1. network-probe (Probe Service)**
*Description:* The probe container continuously monitors network connectivity by sending ICMP ping requests to predefined external targets and reporting results to the backend.
*Functionality:*
- Runs Python application that implements probing logic
- Sends HTTP POST requests to backend API with monitoring data
- Executes ping operations to DNS servers (8.8.8.8, 1.1.1.1, 9.9.9.9)
- Measures response latency in milliseconds
- Handles unreachable targets gracefully
- Reports monitoring data every 10 seconds
*Image Contents:*
- Python 3.9 runtime
- `requests` library for HTTP communications
- Standard system utilities for network operations
- Minimal footprint for efficient resource usage
**2. network-backend (Backend Service)**
*Description:* The backend container provides a RESTful API for receiving monitoring data from the probe and serving it to the frontend while maintaining persistent storage.
*Functionality:*
- Runs Flask web framework application
- Exposes `/metrics` endpoint for receiving probe data (POST requests)
- Exposes `/metrics` endpoint for retrieving data (GET requests)
- Manages SQLite database (`monitoring.db`) on persistent volume
- Stores monitoring records with timestamps
- Returns latest monitoring status to frontend clients
- Handles data validation and error conditions
*Image Contents:*
- Python 3.9 runtime
- Flask web framework
- SQLite3 database library
- Additional Python dependencies as needed
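A hedged sketch of the backend's Flask application follows. The `/metrics` endpoint paths and `DATABASE_PATH` variable match this README; the table schema, column names, and helper functions are assumptions for illustration:

```python
import os
import sqlite3

from flask import Flask, jsonify, request

# Falls back to a local file when DATABASE_PATH is unset (e.g. outside Docker)
DB_PATH = os.environ.get("DATABASE_PATH", "monitoring.db")

app = Flask(__name__)


def init_db():
    """Create the metrics table if it does not exist yet."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS metrics (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   target TEXT NOT NULL,
                   reachable INTEGER NOT NULL,
                   latency_ms REAL,
                   ts TEXT DEFAULT CURRENT_TIMESTAMP
               )"""
        )


@app.route("/metrics", methods=["POST"])
def post_metrics():
    """Receive one probe result and store it."""
    data = request.get_json(silent=True)
    if not data or "target" not in data:
        return jsonify({"error": "invalid payload"}), 400
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "INSERT INTO metrics (target, reachable, latency_ms) VALUES (?, ?, ?)",
            (data["target"], int(data.get("reachable", False)), data.get("latency_ms")),
        )
    return jsonify({"status": "ok"}), 201


@app.route("/metrics", methods=["GET"])
def get_metrics():
    """Return the latest record per target for the dashboard."""
    with sqlite3.connect(DB_PATH) as conn:
        rows = conn.execute(
            """SELECT target, reachable, latency_ms, MAX(ts)
               FROM metrics GROUP BY target"""
        ).fetchall()
    return jsonify([
        {"target": t, "reachable": bool(r), "latency_ms": l, "ts": ts}
        for t, r, l, ts in rows
    ])


init_db()
```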
**3. network-frontend (Frontend Service)**
*Description:* The frontend container serves a web-based dashboard that displays real-time network monitoring data fetched from the backend API.
*Functionality:*
- Runs Nginx web server for high performance
- Serves static HTML/CSS/JavaScript files
- Hosts dashboard UI accessible at port 5000
- Implements JavaScript fetch API calls to backend
- Displays monitoring status with color coding
- Refreshes data at regular intervals (typically every 5 seconds)
- Responsive design suitable for different screen sizes
*Image Contents:*
- Nginx Alpine Linux distribution (minimal base)
- Nginx web server configuration
- HTML/CSS dashboard files
- JavaScript code for frontend logic and API communication
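The dashboard's fetch-and-render cycle can be sketched as below. Element IDs, CSS class names, and the shape of the `/metrics` response are assumptions based on this README, not the project's actual source:

```javascript
// Build table rows from the metrics returned by the backend API.
// Each metric is assumed to look like: { target, reachable, latency_ms }.
function renderRows(metrics) {
  return metrics
    .map((m) => {
      const status = m.reachable ? "Reachable" : "Unreachable";
      const cls = m.reachable ? "ok" : "down"; // green vs red status badge
      const latency = m.latency_ms == null ? "-" : `${m.latency_ms} ms`;
      return `<tr class="${cls}"><td>${m.target}</td>` +
             `<td>${status}</td><td>${latency}</td></tr>`;
    })
    .join("\n");
}

// Fetch the latest metrics and redraw the table body.
async function refresh() {
  const resp = await fetch("/metrics");
  const metrics = await resp.json();
  document.querySelector("#targets tbody").innerHTML = renderRows(metrics);
}

// setInterval(refresh, 5000);  // auto-refresh every 5 seconds
```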
---
## Instructions
### 1. Preparing the Application
**Command:** `./prepare-app.sh`
**What it does:**
- Creates custom Docker bridge network `monitoring-network`
- Creates named volume `monitoring-data` for database persistence
- Builds three custom Docker images from Dockerfiles:
- `network-probe:latest` from `Dockerfile.probe`
- `network-backend:latest` from `Dockerfile.backend`
- `network-frontend:latest` from `Dockerfile.frontend`
- Performs any necessary initialization tasks
**Prerequisites:**
- Docker daemon must be running
- Sufficient disk space for image layers (~1.5 GB)
- User must have Docker permissions
**Expected Output:**
```
Preparing app...
Creating network: monitoring-network
Creating volume: monitoring-data
Building probe image...
Building backend image...
Building frontend image...
Setup complete!
```
**Time Required:** 2-5 minutes (varies based on system performance and internet speed)
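The steps above suggest a script roughly like the following. This is a hedged reconstruction from the README's description, not the shipped `prepare-app.sh`:

```bash
#!/bin/sh
set -e

echo "Preparing app..."
docker network create monitoring-network 2>/dev/null || true
docker volume create monitoring-data

docker build -t network-probe:latest    -f Dockerfile.probe    .
docker build -t network-backend:latest  -f Dockerfile.backend  .
docker build -t network-frontend:latest -f Dockerfile.frontend .
echo "Setup complete!"
```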
---
### 2. Launching the Application
**Command:** `./start-app.sh`
**What it does:**
- Starts all three containers in detached mode
- Configures automatic restart on failure (`unless-stopped` policy)
- Establishes communication between containers via Docker network
- Mounts persistent volume to backend container
- Maps port 5000 from frontend container to host
**Prerequisites:**
- `prepare-app.sh` must have been run successfully
- Port 5000 must be available on host machine
- Docker daemon must be running
**Expected Output:**
```
Running app...
Starting probe container...
Starting backend container...
Starting frontend container...
The app is available at http://localhost:5000
```
**Startup Sequence:**
1. Backend container starts first and initializes database
2. Frontend container starts and serves dashboard
3. Probe container starts and begins monitoring
**Verification:**
- Check running containers: `docker ps`
- Access dashboard: Open browser to `http://localhost:5000`
- Should see network monitoring dashboard with target status
**Time Required:** 30-60 seconds for all containers to be ready
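A sketch of what `start-app.sh` likely does, following the startup sequence above. The `docker start || docker run` pattern covers both first launch and resuming stopped containers; the `--network-alias backend` flag is an assumption needed so the other services can reach the backend by the name `backend`:

```bash
#!/bin/sh
set -e

echo "Running app..."
docker start network-backend 2>/dev/null || docker run -d --name network-backend \
  --network monitoring-network --network-alias backend \
  --restart unless-stopped \
  -v monitoring-data:/data \
  -e FLASK_ENV=development -e DATABASE_PATH=/data/monitoring.db \
  network-backend:latest

docker start network-frontend 2>/dev/null || docker run -d --name network-frontend \
  --network monitoring-network \
  --restart unless-stopped \
  -p 5000:5000 \
  -e BACKEND_API_URL=http://backend:5001 \
  network-frontend:latest

docker start network-probe 2>/dev/null || docker run -d --name network-probe \
  --network monitoring-network \
  --restart unless-stopped \
  -e BACKEND_URL=http://backend:5001 -e PROBE_INTERVAL=10 \
  network-probe:latest

echo "The app is available at http://localhost:5000"
```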
---
### 3. Pausing the Application
**Command:** `./stop-app.sh`
**What it does:**
- Gracefully stops all running containers
- Preserves all data in named volumes
- Preserves Docker network configuration
- Allows containers to be restarted without data loss
**Important Notes:**
- Containers are **stopped** but not **removed**
- Database data persists in named volume
- Network and volume remain available
- Paused containers consume minimal resources
**Expected Output:**
```
Stopping app...
Stopping probe container...
Stopping backend container...
Stopping frontend container...
All containers stopped.
```
**Data Preservation:**
- Application state is maintained in database
- Monitoring history is preserved
- Restarting with `start-app.sh` resumes with previous state
**Verification:**
- Check stopped containers: `docker ps -a`
- Verify volume still exists: `docker volume ls | grep monitoring-data`
- Verify network still exists: `docker network ls | grep monitoring-network`
**Time Required:** 10-20 seconds
---
### 4. Resuming After Pause
**Command:** `./start-app.sh` (same as initial launch)
**Behavior:**
- Restarts the existing containers rather than creating new ones
- Reconnects containers to existing network
- Mounts existing volume with preserved data
- Previous monitoring data is immediately available
**No Data Loss:**
- All metrics collected before pause are retained
- Database state is exactly as it was before stopping
- Frontend dashboard displays previous monitoring history
---
### 5. Deleting the Application
**Command:** `./remove-app.sh`
**What it does:**
- Stops all running containers
- Removes all stopped containers
- Removes custom Docker network `monitoring-network`
- Removes named volume `monitoring-data` (database deleted)
- Removes custom Docker images
**Important:** This command is **destructive** and **cannot be undone**:
- All monitoring data is permanently deleted
- All containers are removed
- Network and volumes are removed
- The application can be recreated from scratch by running `prepare-app.sh` again
**Expected Output:**
```
Removing app...
Stopping containers...
Removing containers...
Removing network...
Removing volume...
Removing images...
Removed app. All traces deleted.
```
**After Removal:**
- System is returned to clean state
- To run application again, start with `prepare-app.sh`
- No artifacts or data remain
**Caution:** Before running this command, ensure:
- You have backed up any critical monitoring data
- You will not need historical metrics
- You intend to proceed (the script does not prompt for confirmation)
---
## Viewing the Application in Web Browser
### Access Method
1. **Ensure Application is Running**
```bash
docker ps # Should show three running containers
```
2. **Open Web Browser**
- Use any modern web browser (Chrome, Firefox, Safari, Edge)
- Recommended: Chrome 90+, Firefox 88+, Safari 14+
3. **Navigate to Dashboard**
- URL: `http://localhost:5000`
- Alternative: `http://127.0.0.1:5000`
### Dashboard Features
**Main Display:**
- Table showing monitored DNS servers (targets)
- Current status of each target (reachable/unreachable)
- Last measured latency (response time in milliseconds)
- Last update timestamp
**Target Information:**
- Target IP Address
- Target Name (e.g., "Google DNS")
- Current Status Badge
- Green: Reachable (last ping successful)
- Red: Unreachable (last ping failed)
- Latency Value
- Status Message
**Auto-Refresh:**
- Dashboard automatically updates every 5 seconds
- Data fetched from backend API
- No manual refresh needed
### Troubleshooting Access
**If Page Won't Load:**
1. Verify containers are running: `docker ps`
2. Check frontend logs: `docker logs network-frontend`
3. Verify port 5000 is listening: `netstat -tlnp | grep 5000`
**If Data Shows as Unavailable:**
1. Verify backend container is running: `docker ps | grep backend`
2. Check backend logs: `docker logs network-backend`
3. Wait 10-15 seconds for first probe results
**If Port 5000 is Already in Use:**
1. Find what's using port 5000: `lsof -i :5000` or `netstat -tlnp`
2. Either stop that service or use port mapping modification
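If port 5000 is taken, the host-side port can be remapped without touching the container port. An illustrative adjustment, assuming a `docker run`-style launch of the frontend:

```bash
# Publish host port 8080 instead of 5000; the container still listens on 5000
docker run -d --name network-frontend \
  --network monitoring-network \
  -p 8080:5000 \
  network-frontend:latest
# The dashboard is then reachable at http://localhost:8080
```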
---
## AI Assistance Used in This Project
**1. Flask Backend Development**
- AI generated RESTful API endpoints
- Database schema design and SQLite integration
- Error handling and data validation logic
**2. Frontend Dashboard**
- AI generated HTML/CSS dashboard template
- JavaScript fetch API implementation
- Real-time data refresh and status visualization
**3. Documentation**
- AI reviewed this README for technical accuracy and completeness