Settings & Profile
Configure your PhishGuard AI preferences and model parameters.
Thresholds are applied to the trained model's probability outputs. Adjust them in phishing_app.py for production use.
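As a rough illustration of how a threshold on the model's probability output might work, the sketch below maps a phishing probability to a coarse risk label. The cut-off values and the name risk_label are hypothetical; the actual thresholds live in phishing_app.py and are not shown on this page.

```python
# Hypothetical cut-offs for illustration only; not taken from phishing_app.py.
SUSPICIOUS_THRESHOLD = 0.50
PHISHING_THRESHOLD = 0.80


def risk_label(phishing_probability: float) -> str:
    """Map a phishing probability in [0, 1] to a coarse risk label."""
    if not 0.0 <= phishing_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if phishing_probability >= PHISHING_THRESHOLD:
        return "phishing"
    if phishing_probability >= SUSPICIOUS_THRESHOLD:
        return "suspicious"
    return "safe"
```

Raising PHISHING_THRESHOLD would reduce false alarms at the cost of missing more borderline messages; lowering it does the opposite.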
Logistic Regression Engine
The core classifier. It combines the weighted word features of an email into an estimated probability that the message is a phishing attempt.
sklearn.linear_model.LogisticRegression
TF-IDF Word Frequency Model
Converts raw email text into numeric feature vectors by weighting how often each term appears in an email against how common it is across all emails. Terms are single words (unigrams) and adjacent word pairs (bigrams).
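To make the "single words plus word pairs" idea concrete, here is a simplified sketch of the terms an n-gram range of (1, 2) produces. It uses plain whitespace splitting, which is an assumption for illustration; TfidfVectorizer's default tokenizer also strips punctuation and drops single-character tokens, and it then applies TF-IDF weighting on top of these terms.

```python
def unigrams_and_bigrams(text: str) -> list[str]:
    """Return the single words followed by the adjacent word pairs of a text."""
    tokens = text.lower().split()
    # Pair each token with its right-hand neighbour to form bigrams.
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


print(unigrams_and_bigrams("Verify your account now"))
# ['verify', 'your', 'account', 'now', 'verify your', 'your account', 'account now']
```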
TfidfVectorizer(ngram_range=(1,2))
Regularization Scale (C = 1.0)
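Controls regularization strength. C is the inverse of that strength, so smaller values regularize more. C = 1.0 (the scikit-learn default) is a moderate setting that helps limit overfitting, so the model generalizes better to emails it has not seen.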
C_parameter = 1.0 (Balanced)
L-BFGS Numerical Optimizer
A memory-efficient quasi-Newton optimization algorithm used to fit the model's weights, which define the decision boundary between safe and phishing emails.
solver='lbfgs' (max_iter=1000)
Balanced Weight Distribution
Automatically weights the benign and malicious classes in inverse proportion to their frequency during training, so the less common class is not drowned out and alerts are less biased toward the majority class.
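The balancing can be sketched with the formula scikit-learn documents for this setting, n_samples / (n_classes * class_count). The helper name balanced_class_weights and the toy labels below are illustrative assumptions, not part of the application.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example: 8 safe emails and 2 phishing emails.
print(balanced_class_weights(["safe"] * 8 + ["phishing"] * 2))
# {'safe': 0.625, 'phishing': 2.5}
```

The rarer class receives the larger weight, so each of its training examples counts for more when the model is fitted.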
class_weight='balanced'
80% Learn / 20% Rigorous Test
Splits the email dataset: 80% is used to train the model, and 20% is held out to evaluate it on emails it has not seen. The split is stratified, so both parts keep the same ratio of safe to phishing emails.
stratified_split(ratio=0.2)
PhishGuard AI is an academic AI cybersecurity project demonstrating email phishing detection using machine learning. The system uses a TF-IDF + Logistic Regression pipeline trained on 18,000+ real-world emails to achieve 98.26% classification accuracy.
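The settings listed on this page could be wired together roughly as follows. This is a minimal sketch, not the project's actual training code: the function names build_pipeline and train_and_score are hypothetical, and train_test_split with stratify is assumed to be what the stratified 80/20 split label refers to.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def build_pipeline():
    """TF-IDF (unigrams + bigrams) followed by logistic regression."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(
            C=1.0, solver="lbfgs", max_iter=1000, class_weight="balanced"
        ),
    )


def train_and_score(texts, labels, seed=42):
    """Fit on a stratified 80% split and return (model, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    model = build_pipeline()
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)
```

After fitting, model.predict_proba(["some email text"]) returns the class probabilities that the thresholds are applied to.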
Version 1.0 • Built with Flask, scikit-learn • Dataset: Phishing Email Dataset