Cybersecurity Project

Log Analysis & Alerting Tool

Detecting failed-login bursts, geo-anomalies & rare event spikes

A lightweight Python pipeline that parses SIEM-style JSONL logs, runs rule-based and statistical anomaly checks, and emits readable alerts with timestamps, actors, windows, and thresholds — ready for SOC triage.

Capabilities
  • Streaming parse of JSONL into dicts; ISO8601 timestamp handling.
  • Sliding-window detection for failed-login bursts.
  • Policy check for logins from unexpected countries.
  • Rate-vs-baseline for rare event spikes per event_type.
  • Clear alert blocks with context + sample events.
Flow (top→bottom): Logs → Parser → Rules → Alerts, i.e. load JSONL → parse/normalise → apply rules → emit alerts.
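As an illustration of the parsing capability, here is a minimal sketch of a streaming JSONL reader with ISO8601 handling. The function names `parse_ts` and `iter_events` are hypothetical, and treating naive timestamps as UTC is an assumption; the project's actual parser may differ.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator


def parse_ts(value: str) -> datetime:
    """Parse an ISO8601 timestamp into a timezone-aware UTC datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    ts = datetime.fromisoformat(value)
    if ts.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSONL line, skipping lines that cannot be parsed."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
            record["ts"] = parse_ts(record["ts"])
        except (ValueError, KeyError, TypeError, AttributeError):
            continue  # malformed JSON, or a missing/invalid "ts" field
        yield record
```

Because `iter_events` is a generator, it can be fed an open file handle directly (`iter_events(open(path))`), so the whole log never has to sit in memory.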

Overview

Purpose

Actionable alerts without SIEM bloat

The brief: parse SIEM logs, detect anomalies (failed-login bursts, unusual countries, rare events), and produce clear alerts/reporting. I focused on fast sliding windows and copy-pasteable messages that match real SOC hand-offs.

Tools & Technologies

  • Python 3 (datetime, collections.deque, defaultdict)
  • JSONL input; console alerting (Slack/email hooks planned)
  • Sliding windows + ratio-to-baseline checks
Process

How it works

Parse → Detect → Alert
  1. Parse logs. Stream JSONL lines, parse ISO8601 timestamps, normalise into Python dicts.
  2. Rule 1: Failed-login bursts. Per-source sliding window; alert when failures ≥ 6 within 60 s. Each alert includes the actor and sample events, and a cooldown suppresses repeat alerts to reduce noise.
  3. Rule 2: Unexpected country. On a successful login, alert when country ∉ allow-list (defaults: AU, US, NZ, GB). The alert includes username, app, and source IP.
  4. Rule 3: Rare event spike. For each event_type, compare the short-window rate against a baseline; alert when rate_short ≥ 5× baseline and count ≥ 5.
  5. Alerting. Print structured blocks: rule, window, keys/actors, counts, and samples — ready for ticketing.
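The failed-login rule above can be sketched with one `deque` per source. This is a rough sketch under stated assumptions, not the project's code: the class name `FailedLoginBurst`, the field values `"login"` / `"failure"`, keying on `src_ip`, and the 300 s cooldown are all hypothetical (the write-up gives the 60 s window and the threshold of 6, but not the cooldown length). It also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Optional

WINDOW = timedelta(seconds=60)      # from the rule: 60 s window
THRESHOLD = 6                       # from the rule: >= 6 failures
COOLDOWN = timedelta(seconds=300)   # assumption: cooldown length is not stated


class FailedLoginBurst:
    def __init__(self) -> None:
        self.windows: Dict[str, Deque[dict]] = defaultdict(deque)
        self.last_alert: Dict[str, datetime] = {}

    def observe(self, event: dict) -> Optional[dict]:
        """Feed one event (in time order); return an alert dict or None."""
        # Assumption: failed logins are marked event_type="login", outcome="failure".
        if event.get("event_type") != "login" or event.get("outcome") != "failure":
            return None
        key = event.get("src_ip")
        ts = event["ts"]
        window = self.windows[key]
        window.append(event)
        # Evict events that have fallen out of the 60 s window.
        while window and ts - window[0]["ts"] > WINDOW:
            window.popleft()
        if len(window) < THRESHOLD:
            return None
        # Per-key cooldown: stay quiet if this source alerted recently.
        last = self.last_alert.get(key)
        if last is not None and ts - last < COOLDOWN:
            return None
        self.last_alert[key] = ts
        return {
            "rule": "failed_login_burst",
            "src_ip": key,
            "count": len(window),
            "window_start": window[0]["ts"],
            "window_end": ts,
            "samples": list(window)[:3],
        }
```

Each call to `observe` does a constant amount of work apart from evictions, and every event is evicted at most once, so the cost per event stays small even on long files.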
Data model & thresholds
  • Log record: { ts, event_type, outcome, username, src_ip, country, app, msg } (fields vary per line).
  • Windows: failures (60 s); packet-like counters (5–10 min, configurable).
  • Geo policy: allow-list array; easy to change per environment.
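To show how the allow-list policy could be applied to a record of the shape above, here is a small sketch. The function name `check_country`, the field values `"login"` / `"success"`, and the decision to skip events without a country are assumptions for illustration; only the default allow-list and the reported fields come from the rule description.

```python
from typing import FrozenSet, Optional

# Defaults taken from the rule description; replace per environment.
DEFAULT_ALLOWED: FrozenSet[str] = frozenset({"AU", "US", "NZ", "GB"})


def check_country(event: dict,
                  allowed: FrozenSet[str] = DEFAULT_ALLOWED) -> Optional[dict]:
    """Return an alert dict for a successful login from outside the allow-list."""
    # Assumption: successful logins are marked event_type="login", outcome="success".
    if event.get("event_type") != "login" or event.get("outcome") != "success":
        return None
    country = event.get("country")
    if not isinstance(country, str) or not country:
        return None  # assumption: events without a country are skipped
    if country.upper() in allowed:
        return None
    return {
        "rule": "unexpected_country",
        "country": country.upper(),
        "username": event.get("username"),
        "app": event.get("app"),
        "src_ip": event.get("src_ip"),
        "ts": event.get("ts"),
    }
```

Passing the allow-list as a parameter keeps the policy easy to swap per environment, for example `check_country(event, allowed=frozenset({"AU", "NZ"}))`.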
Edge cases & performance
  • Skips malformed lines; robust timestamp parsing.
  • Per-key cooldowns to avoid duplicate alerts.
  • Deque windows keep memory bounded for long files.
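The bounded-memory point can be illustrated with the rare-event-spike rule, where two deques per event_type hold only the timestamps still inside the windows. This is a sketch under assumptions: the write-up does not say how the baseline is computed, so the 5-minute short window, the 60-minute trailing baseline, the warm-up guard, and the class name `RareEventSpike` are all hypothetical. Only the 5× ratio and the minimum count of 5 come from the rule. Cooldown handling is omitted for brevity, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Optional


class RareEventSpike:
    def __init__(self,
                 short: timedelta = timedelta(minutes=5),      # assumption
                 baseline: timedelta = timedelta(minutes=60),  # assumption
                 ratio: float = 5.0,                           # from the rule
                 min_count: int = 5) -> None:                  # from the rule
        self.short, self.baseline = short, baseline
        self.ratio, self.min_count = ratio, min_count
        # Timestamps inside the short window, per event_type.
        self.recent: Dict[str, Deque[datetime]] = defaultdict(deque)
        # Timestamps inside the baseline window that precedes the short window.
        self.older: Dict[str, Deque[datetime]] = defaultdict(deque)
        self.first_seen: Optional[datetime] = None

    def observe(self, event: dict) -> Optional[dict]:
        ts, etype = event["ts"], event.get("event_type")
        if self.first_seen is None:
            self.first_seen = ts
        recent, older = self.recent[etype], self.older[etype]
        recent.append(ts)
        # Age timestamps from the short window into the baseline, then drop them.
        while recent and ts - recent[0] >= self.short:
            older.append(recent.popleft())
        while older and ts - older[0] >= self.short + self.baseline:
            older.popleft()
        # Warm-up guard (assumption): wait until a full baseline has been seen.
        if ts - self.first_seen < self.short + self.baseline:
            return None
        short_s = self.short.total_seconds()
        base_s = self.baseline.total_seconds()
        if len(recent) < self.min_count:
            return None
        # rate_short >= ratio * rate_baseline, cross-multiplied to avoid division.
        if len(recent) * base_s < self.ratio * len(older) * short_s:
            return None
        return {
            "rule": "rare_event_spike",
            "event_type": etype,
            "count": len(recent),
            "baseline_count": len(older),
            "rate_short": len(recent) / short_s,
            "rate_baseline": len(older) / base_s,
        }
```

Memory per event_type is bounded by the number of events in the short plus baseline windows, regardless of how long the input file is.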

Run & Usage

Zero-config quick run
# From project root
python main.py

# The tool parses 'siem_sample_logs.jsonl' and prints alerts it detects.

Example Alerts

Readable and ticket-ready
  • Failed login burst alert: failed-login burst (≥6 in 60 s) with window and sample events.
  • Unexpected country alert: successful login from a country outside the allow-list.
  • Rare event spike alert: rate spike with short-window rate ≥5× baseline and count ≥5.
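For a sense of what a ticket-ready block might look like in code, here is a hypothetical formatter. The layout, the function name `format_alert`, and its parameters are assumptions made for illustration and are not the tool's actual output format.

```python
from datetime import datetime
from typing import Sequence, Tuple


def format_alert(rule: str,
                 window: Tuple[datetime, datetime],
                 actor: str,
                 count: int,
                 threshold: int,
                 samples: Sequence[str],
                 max_samples: int = 3) -> str:
    """Render one alert as a plain-text block (hypothetical layout)."""
    start, end = window
    lines = [
        f"=== ALERT: {rule} ===",
        f"window    : {start.isoformat()} -> {end.isoformat()}",
        f"actor     : {actor}",
        f"count     : {count} (threshold {threshold})",
        "samples   :",
    ]
    # Cap the number of samples so the block stays short enough for a ticket.
    lines += [f"  - {s}" for s in samples[:max_samples]]
    return "\n".join(lines)
```

Keeping the formatter separate from the detection rules would make it straightforward to add the planned Slack or email outputs later without touching the rules themselves.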

Reflection

What I learned & next steps
  • Takeaway: Sliding windows + baseline ratios surface real issues while limiting noise.
  • Challenges: normalising timestamps across sources; balancing thresholds and cooldowns.
  • Future: pluggable outputs (Slack/email), YAML rule config, JSON/CSV summaries, and unit tests for edge windows.