PDC Spring 2026 — Semester Project

Parallel Detection
of Malicious Activity

MPI-powered log analysis system detecting Backdoors, DoS & Reconnaissance attacks across 2.5M+ network records using distributed parallel computing.

257,673 Records Analyzed
32,669 Attacks Detected
6 MPI Processes
Checksum Verified: OK
Question 1

Parallel Malicious Activity Detection

Q1: Complete
Detect Backdoor, DoS & Reconnaissance
Scan the UNSW-NB15 training set (82,332 records) for Backdoor, DoS, and Reconnaissance attack patterns. Rank 0 reads the full dataset and distributes chunks to each MPI process via point-to-point send/receive. Each process scans its chunk independently, then MPI_Reduce aggregates the global totals to rank 0.
MPI_Bcast MPI_Send MPI_Recv MPI_Reduce MPI_Barrier
→ Assigned to Lubna
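
/// Sketch: Q1 pattern (illustrative)

A minimal sketch of the Q1 flow under two simplifying assumptions not taken from the project code: fixed 512-byte record slots and naive substring matching on the attack label. Rank 0 reads everything, ships chunks with MPI_Send/MPI_Recv, each rank scans locally, and MPI_Reduce sums the counts.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE 512   /* fixed-width record slot -- an assumption, not the real layout */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, np;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    long n = 0;
    char *all = NULL;
    if (rank == 0) {
        /* Rank 0 reads the full dataset into fixed-width slots. */
        FILE *f = fopen("training-set.csv", "r");
        if (!f) MPI_Abort(MPI_COMM_WORLD, 1);
        size_t cap = 1 << 17;
        all = malloc(cap * LINE);
        while (fgets(all + n * LINE, LINE, f))
            if ((size_t)++n == cap) { cap *= 2; all = realloc(all, cap * LINE); }
        fclose(f);
    }

    /* Everyone learns the record count; rank 0 then ships each rank its chunk. */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);
    long base = n / np, rem = n % np;
    long mine = base + (rank < rem ? 1 : 0);
    char *chunk = malloc(mine * LINE);
    if (rank == 0) {
        memcpy(chunk, all, mine * LINE);          /* rank 0 keeps the first chunk */
        long off = mine;
        for (int r = 1; r < np; r++) {
            long cnt = base + (r < rem ? 1 : 0);
            MPI_Send(all + off * LINE, (int)(cnt * LINE), MPI_CHAR, r, 0,
                     MPI_COMM_WORLD);
            off += cnt;
        }
        free(all);
    } else {
        MPI_Recv(chunk, (int)(mine * LINE), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    /* Local scan: naive substring match on the attack label (simplified). */
    long local[3] = {0, 0, 0}, global[3];
    for (long i = 0; i < mine; i++) {
        const char *rec = chunk + i * LINE;
        if      (strstr(rec, "Backdoor"))       local[0]++;
        else if (strstr(rec, "DoS"))            local[1]++;
        else if (strstr(rec, "Reconnaissance")) local[2]++;
    }

    /* Aggregate the per-rank counts on rank 0. */
    MPI_Reduce(local, global, 3, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0)
        printf("Backdoor=%ld DoS=%ld Reconnaissance=%ld\n",
               global[0], global[1], global[2]);

    free(chunk);
    MPI_Finalize();
    return 0;
}
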
/// Results — training-set.csv
82,332 records
Backdoor 583
DoS 4,089
Reconnaissance 3,496

8,168 total malicious records — 9.9% of training set.

Rank 1 reports 0 detections: its slice (rows ~20k–41k) falls in a stretch that is entirely Normal/Generic in the dataset's natural ordering. Global totals were verified against a Python cross-check.

Question 2

Parallel Statistical Analysis

Q2: Complete
Distributed Attack Detection & IP Cross-Checking
Each of the 4 MPI processes reads its own UNSW-NB15 CSV file and tracks suspicious IPs locally. MPI_Scatter distributes work counts, MPI_Reduce aggregates global statistics, MPI_Allreduce shares attack flags, MPI_Gatherv collects per-process IP lists, and MPI_Bcast broadcasts the final deduplicated suspicious-IP list to all processes.
MPI_Scatter MPI_Reduce MPI_Allreduce MPI_Gather MPI_Gatherv MPI_Bcast
→ Assigned to Insharah
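
/// Sketch: Q2 collective pattern (illustrative)

A minimal sketch of the collective choreography only: the per-file scan is stubbed out with fake data, IPs are fixed 16-byte strings, and the work-count MPI_Scatter and rank-0 deduplication are elided for brevity. None of the names here are the project's actual functions or buffers.

#include <mpi.h>
#include <stdio.h>

#define IPLEN 16   /* fixed-width IP string slot -- an assumption */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, np;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    /* Stand-in for each rank's scan of its own CSV segment. */
    char local_ips[4][IPLEN];
    int local_n = 1 + rank % 2;                 /* pretend 1-2 flagged IPs */
    for (int i = 0; i < local_n; i++)
        snprintf(local_ips[i], IPLEN, "59.166.0.%d", rank * 4 + i);

    /* Every rank learns whether ANY rank saw an attack. */
    int local_flag = local_n > 0, global_flag;
    MPI_Allreduce(&local_flag, &global_flag, 1, MPI_INT, MPI_LOR,
                  MPI_COMM_WORLD);

    /* Gatherv needs per-rank byte counts and displacements on the root. */
    int counts[np], displs[np], bytes = local_n * IPLEN, total = 0;
    MPI_Gather(&bytes, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0)
        for (int r = 0; r < np; r++) { displs[r] = total; total += counts[r]; }
    MPI_Bcast(&total, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Collect the variable-length IP lists on rank 0, then broadcast the
     * merged list back to everyone (dedup on rank 0 is elided here). */
    char merged[total > 0 ? total : 1];
    MPI_Gatherv(local_ips, bytes, MPI_CHAR,
                merged, counts, displs, MPI_CHAR, 0, MPI_COMM_WORLD);
    MPI_Bcast(merged, total, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0 && global_flag)
        printf("ATTACK DETECTED: %d suspicious-IP entries gathered\n",
               total / IPLEN);
    MPI_Finalize();
    return 0;
}
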
/// Run Results (np=4, UNSW-NB15)
ATTACK DETECTED
33 Unique Suspicious IPs
116,922 Failed Logins
46,153 Port Scans
84,910 Connections
→ Validation: PASSED — all 4 processes handled distinct log segments
→ Verdict: Potential DDoS / Port-Scanning Attack Detected
→ Top flagged: 59.166.0.x subnet (~11,700 failed logins each), 175.45.176.x (high connections)
Question 3

Serial vs. Parallel.
Where does MPI actually help?

Q3: Complete
Performance Analysis & Benchmarking
Benchmark serial vs. parallel execution on the full UNSW-NB15 combined dataset (257,673 records). Measure speedup, efficiency, and communication overhead as MPI process count scales from 1 to 6. Use MPI_Wtime at each phase to identify bottlenecks — specifically whether MPI_Scatter or computation dominates total time.
MPI_Scatter MPI_Gather MPI_Reduce MPI_Allreduce MPI_Bcast MPI_Wtime
→ Assigned to Haseeb
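
/// Sketch: phase timing with MPI_Wtime (illustrative)

A minimal sketch of the timing harness the description implies: bracket each phase with MPI_Wtime, with barriers so per-phase timestamps are comparable across ranks. The phase bodies are placeholders, not the project's code.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    /* ... MPI_Scatter of record chunks goes here ... */
    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();
    /* ... local scan over this rank's chunk goes here ... */
    MPI_Barrier(MPI_COMM_WORLD);
    double t2 = MPI_Wtime();
    /* ... MPI_Reduce / MPI_Gather of results goes here ... */
    MPI_Barrier(MPI_COMM_WORLD);
    double t3 = MPI_Wtime();

    if (rank == 0)
        printf("scatter=%.6fs compute=%.6fs reduce=%.6fs total=%.6fs\n",
               t1 - t0, t2 - t1, t3 - t2, t3 - t0);
    MPI_Finalize();
    return 0;
}
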
/// Key Finding

Communication-bound workload. MPI_Scatter moves ~125MB (257K lines × 512 bytes) from rank 0 out across the processes, dominating total parallel time.

Computation scales linearly. Local compute drops from 50ms (np=1) to 8.9ms (np=6) — near-perfect computational parallelism.

Amdahl's Law in action. Serial bottleneck (file I/O + scatter) limits theoretical max speedup regardless of process count. Optimization path: MPI-IO parallel file reads.

• Backdoor (Critical, 583): unauthorized access attempts detected
• Denial of Service (High, 4,089): DoS attack patterns identified
• Reconnaissance (Medium, 3,496): scanning & probing activities
/// Benchmark Results — Speedup vs Processes
82,332 records
Processes (np)   T_serial (s)   T_parallel (s)   Speedup   Efficiency   Comm Overhead   Compute Time
1                0.016969       0.044402         0.38x     38.2%         63.8%          0.016030s
2                0.033313       0.068503         0.49x     24.3%        127.0%          0.015799s
3                0.020483       0.062837         0.33x     10.9%        142.7%          0.015415s
4                0.033230       0.044674         0.74x     18.6%        181.4%          0.013329s
5                0.036810       0.044293         0.83x     16.6%        224.4%          0.005903s
6                0.034533       0.044630         0.77x     12.9%        257.8%          0.002977s
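
For reference, the derived columns follow the standard definitions

    S(n) = T_serial / T_parallel(n),    E(n) = S(n) / n

e.g. np=1: 0.016969 / 0.044402 = 0.38x, and 0.38 / 1 = 38.2%. The comm-overhead column appears to aggregate communication time across ranks relative to parallel wall time, which is why it can exceed 100%: the np=4 breakdown below gives (65.6 + 4.0 + 11.3) / 44.7 ≈ 181%.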
Execution Time Comparison (parallel wall time; serial baseline in the table above): np=1 0.044s · np=2 0.069s · np=3 0.063s · np=4 0.045s · np=5 0.044s · np=6 0.045s
MPI Communication Breakdown (np=4): Scatter 65.6ms · Reduce 4.0ms · Allreduce 11.3ms · Gather n/a · Bcast n/a · Compute 13.3ms
Efficiency 17% (np=5)
Best Speedup 0.83x (np=5)
Comm Overhead 224% (np=5)
Key Findings

Communication-bound workload: MPI_Scatter dominates (~70% of parallel time) because 82K lines × 512 bytes = ~40MB must be shipped from rank 0 across the processes.

Computation scales well: Local compute time drops linearly from 16ms (np=1) → 3.0ms (np=6), showing near-perfect computational parallelism.

Amdahl's Law in action: The serial fraction (file I/O + scatter) limits theoretical max speedup regardless of process count.
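
As a rough worked example, take the np=1 row of the table above: compute is ~0.016s of 0.044s parallel time, so the parallelizable fraction is p ≈ 0.36 and

    S(n) = 1 / ((1 - p) + p/n),    S_max = 1 / (1 - p) ≈ 1 / 0.64 ≈ 1.6x

meaning even perfect compute scaling caps speedup near 1.6x until the I/O-plus-scatter fraction shrinks.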

Optimization path: Using MPI file I/O (MPI_File_read) instead of rank-0-reads-then-scatters would significantly reduce overhead.
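
/// Sketch: MPI-IO alternative (illustrative)

A minimal sketch of that optimization, again assuming fixed 512-byte records so byte offsets map directly to record indices. Each rank opens the file collectively and reads only its own byte range with MPI_File_read_at_all, removing the rank-0 read-then-scatter step entirely.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define LINE 512   /* fixed-width record slot -- an assumption */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, np;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "training-set.csv",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    MPI_Offset size;
    MPI_File_get_size(fh, &size);
    long nrec = size / LINE;                  /* fixed-width assumption */
    long base = nrec / np, rem = nrec % np;
    long mine  = base + (rank < rem ? 1 : 0);
    long first = rank * base + (rank < rem ? rank : rem);

    char *buf = malloc(mine * LINE);
    /* Collective read: all ranks hit the file at once, each at its own offset. */
    MPI_File_read_at_all(fh, (MPI_Offset)first * LINE, buf,
                         (int)(mine * LINE), MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    printf("rank %d read records [%ld, %ld)\n", rank, first, first + mine);
    free(buf);
    MPI_Finalize();
    return 0;
}
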

The Dataset

UNSW-NB15

257,673 Combined Records
82,332 Training Records
49 Features per Record
10 Categories (9 attack + Normal)
Attack Categories — Training Set
Normal 37,000
Generic 18,871
Exploits 11,132
Fuzzers 6,062
DoS 4,089
Reconnaissance 3,496
Analysis 677
Backdoor 583
Shellcode 378
Worms 44
Attack Distribution — Combined Set
• Normal 93,000
• Generic 58,871
• Exploits 44,525
• Fuzzers 24,246
• DoS 16,353
• Reconnaissance 13,987
• Analysis 2,677
• Backdoor 2,329
• Shellcode 1,511
• Worms 174
Dataset Files
File              Rows   Size
UNSW-NB15_1.csv   700K   162MB
UNSW-NB15_2.csv   700K   158MB
UNSW-NB15_3.csv   700K   148MB
UNSW-NB15_4.csv   440K   94MB
Training set      82K    15MB
Testing set       175K   31MB
VM Specs
IP      139.84.171.89
CPU     6 vCPU
RAM     16 GB
Disk    300 GB SSD
OS      Ubuntu 22.04
Region  Delhi
MPI     OpenMPI 4.1.2
The Team

Built by three.

Lubna
Q1 — Parallel Detection
Parallel malicious activity detection using MPI_Send/Recv distribution and MPI_Reduce aggregation on 82,332 records.
Complete
Insharah
Q2 — Statistical Analysis
Distributed attack detection and suspicious-IP cross-checking using MPI_Scatter, MPI_Allreduce, MPI_Gatherv, and MPI_Bcast across 4 processes.
Complete
Haseeb
Q3 — Performance Analysis
Serial vs. parallel benchmarking, speedup metrics, communication overhead analysis, dashboard and VM infrastructure.
Complete
Live Terminal

Run it yourself.

/// Live Execution — haseeb@pdc-project:~
pdc@project $ echo "Ready. Click a command above to execute."
Ready. Click a command above to execute.
Commands execute on the live VM (139.84.171.89); results appear here in real time.
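
For an offline run outside the dashboard, the invocation has this shape (the binary name ./q1_detect is hypothetical):

mpirun -np 6 ./q1_detect training-set.csv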