Skip to main content

AutoMod with Artificial Intelligence

Advanced automatic moderation system powered by AI to detect and prevent unwanted behaviors.

Required Tier

ULTRA - All AI features in AutoMod require Server Premium ULTRA

Description​

AI AutoMod uses advanced language models to understand the real context of messages, reducing false positives and detecting sophisticated threats that traditional systems can't detect.

AI Model: OpenAI GPT-4o-mini with Chain of Thought reasoning

AI Features​

1. AI Context Analysis - Word Filter​

What does it do? Analyzes context before penalizing for forbidden words, understanding the difference between malicious and legitimate use.

Features:

  • Detects sarcasm and irony
  • Understands quotes and educational context
  • Distinguishes friendly jokes from real insults
  • Reduces false positives by 70-80%

Examples:

MessageBlock?Reason
"let's kill this game"NOLegitimate gaming context
"I'm going to kill you"YESDirect threat
"killing time playing"NOIdiomatic expression
"I hate mondays lol"NOHarmless humor
"I hate [person]"YESHate speech

How does it work? The AI analyzes every message containing forbidden words and determines if the usage is offensive or harmless. It only blocks when it has 75%+ confidence that it's offensive.

Configuration: Panel → AutoMod → Word Filter → AI Context Analysis (toggle)


2. AI Duplicate Detection + Paraphrasing​

What does it do? Detects duplicate messages even when they're paraphrased. Prevents sophisticated spam that changes words but maintains the same meaning.

Features:

  • Semantic similarity analysis
  • Synonym and variation detection
  • Configurable threshold (60-95%)
  • Message history per user

Detection Examples:

User sendsSpam detected as
"Hello friends" → "Greetings everyone"Paraphrase detected
"Buy here" → "Purchase here"Minimal variation
"Promo link" (x5 with different text)Promotional spam
Natural conversation with othersNOT detected

Threshold:

  • 60%: Very sensitive, may have false positives
  • 75%: Recommended for most servers
  • 90%: Very strict, only nearly identical duplicates

Configuration: Panel → AutoMod → Anti-Spam → AI Duplicate Detection


3. AI Toxicity Detection​

What does it do? Detects advanced toxicity: insults, harassment, threats, hate speech, bullying, inappropriate content.

Toxicity Types:

  • Insults and offensive language
  • Threats and violence
  • Harassment and bullying
  • Hate speech
  • Inappropriate sexual content
  • Excessive aggressiveness

Features:

  • Multilingual (Spanish and English)
  • Confidence score (0-100%)
  • Configurable threshold
  • Actions: delete, warn, timeout, kick, ban

Recommended Threshold:

  • 70%: Balance between precision and sensitivity
  • 80%: Stricter, fewer false positives
  • 60%: More sensitive, may over-moderate

Configuration: Panel → AutoMod → AI Moderation → Toxicity Detection


4. AI Scam/Phishing Detection​

What does it do? Detects Nitro scams, crypto scams, phishing, fake giveaways, malicious links.

Detected Scam Types:

  • Fake Nitro scams
  • Crypto scams and pump & dump
  • Fake giveaways
  • Phishing links
  • Pyramid schemes
  • Malicious promotional spam

Features:

  • Suspicious URL analysis
  • Scam pattern detection
  • Confidence score
  • Automatic prevention

Recommended Threshold:

  • 60%: Recommended (detects most scams)
  • 70%: Stricter
  • 50%: Very sensitive (may have false positives)

Configuration: Panel → AutoMod → AI Moderation → Scam Detection


5. AI Raid Detection​

What does it do? Analyzes new member join patterns to detect automated raids.

Raid Indicators:

  • Multiple simultaneous joins
  • Similar or generated names
  • Very new accounts
  • Default avatars
  • Coordinated behavior

Features:

  • AI pattern analysis
  • Action recommendations
  • Automatic response
  • Confidence score

Automatic Actions:

  • Kick: Expel suspicious users
  • Ban: Ban if high raid confidence
  • Alert: Only alert staff

Configuration: Panel → AutoMod → AI Moderation → Raid Detection


General Configuration​

Enable AI Moderation​

  1. Go to Panel: panel.yumechanbot.com → Your server → AutoMod
  2. Enable AI Moderation: Main AI toggle to ON
  3. Verify ULTRA tier: Required for all AI features
  4. Configure each module: Enable only the ones you need

Adjust Thresholds​

What is a threshold? It's the minimum confidence level the AI needs to take action.

  • Low (50-60%): Very sensitive, more detections but more false positives
  • Medium (70-80%): Recommended, balance between precision and sensitivity
  • High (85-95%): Very strict, only very clear cases

Select Actions​

For each detection type you can configure:

  • Delete: Delete message only
  • Warn: Warn the user
  • Timeout: Temporarily mute (1min - 1 week)
  • Kick: Expel from server
  • Ban: Permanently ban

Best Practices​

Do​

  • Start with high thresholds (75-80%) and adjust as needed
  • Test in test channels before enabling server-wide
  • Review logs regularly to see what the AI detects
  • Adjust gradually if you see too many false positives
  • Combine with manual moderation for best results

Avoid​

  • Don't use thresholds that are too low (causes over-moderation)
  • Don't enable everything at once (test module by module)
  • Don't trust AI 100% (always review doubtful cases)
  • Don't ignore feedback from users about false positives

Gaming Server​

  • AI Context Word Filter (allows gaming jargon)
  • AI Duplicate Detection (prevents promotional spam)
  • Toxicity 70% (allows friendly banter)

Educational Server​

  • AI Context Word Filter (allows educational quotes)
  • Scam Detection (protects students)
  • Toxicity 80% (stricter environment)

Community Server​

  • All modules enabled
  • Balanced thresholds (70-75%)
  • Raid Detection active (full protection)

How the AI Works​

Chain of Thought (CoT) Reasoning​

The AI analyzes messages following a step-by-step reasoning process:

  1. Content Analysis: Reads and understands the message
  2. Context Extraction: Identifies emotions, intent, tone
  3. Signal Evaluation: Looks for violation indicators
  4. Reasoning: Considers alternatives and context
  5. Final Decision: Determines if there's a violation and confidence score

JSON Schema Validation​

All AI responses use enforced JSON schema for:

  • Guaranteed consistent format
  • No invalid responses
  • Easy automatic processing

Temperature 0​

We use temperature 0 for:

  • Maximum consistency in decisions
  • Deterministic responses
  • Less random variability

Costs​

Each AI analysis costs approximately $0.0001-0.0003 (0.01-0.03 cents). With 10,000 daily messages, the cost would be ~$1-3/month.

Privacy​

  • The AI only analyzes the specific reported message
  • It does NOT store or save message content
  • It does NOT have access to historical messages
  • It only returns: {violation: boolean, confidence: number, type: string}

Support​

If you have issues with AI AutoMod:

  1. Review logs in the panel
  2. Verify thresholds and configured actions
  3. Test in a test channel first
  4. Contact support on the official Yume-chan server