Self-Hosted AI Moderation for BuddyPress: OpenAI + Custom Rules

Manual moderation does not scale. A community with 500 active members who each post 50 times a month generates 25,000 pieces of content a month. Reading every post before it goes live is not realistic. Keyword blocklists miss context. Akismet catches spam but not harassment or off-topic noise.

AI moderation addresses this. OpenAI’s Moderation API takes a string of text and returns a flagged/not-flagged decision plus category scores across hate speech, harassment, self-harm, sexual content, violence, and a handful of subcategories. It typically responds in a few hundred milliseconds, is free for API accounts at the time of writing, and is accurate enough to use as a first-pass filter before a human moderator reviews anything flagged.

This tutorial hooks that API into BuddyPress and Jetonomy on WordPress. Every activity post, forum thread, and forum reply runs through moderation before it saves. Flagged activity posts are marked as spam automatically, and flagged forum posts are held for review. A custom rules layer on top lets you add community-specific filters that the generic AI does not cover.

What You Need

WordPress with BuddyPress and Jetonomy active
An OpenAI API key with access to the Moderation API (text-moderation-latest model)
PHP 7.4+ (the code uses WordPress's built-in wp_remote_post function for HTTP requests)
A place to store the API key securely: define it as a constant in wp-config.php, and never store it in the database as plain text

The OpenAI Moderation API is free to use at the time of writing, included with any paid OpenAI API account at no additional charge beyond the account’s base usage. Verify current pricing in the OpenAI documentation before deploying at scale.

Step 1: Add the API Key to wp-config.php

Open your wp-config.php file and add your OpenAI API key as a constant above the line that says /* That's all, stop editing! */:

define( 'OPENAI_API_KEY', 'sk-your-key-here' );

Never hardcode the key inside a plugin file that might be committed to version control. The wp-config.php approach keeps the key out of your code repository, and WordPress also lets you move wp-config.php one directory above the webroot if you want it out of the publicly served directory.

Step 2: The Core Moderation Function

Create a file at wp-content/mu-plugins/bp-ai-moderation.php. Using mu-plugins ensures it loads automatically without needing to be activated and cannot be accidentally deactivated.

<?php
/**
 * BuddyPress + Jetonomy AI Moderation via OpenAI Moderation API
 */

if ( ! defined( 'ABSPATH' ) ) exit;

/**
 * Call the OpenAI Moderation API for a string of content.
 * Returns an array with 'flagged' (bool), 'categories' (array), and 'scores' (array).
 */
function bp_ai_moderate( string $content ): array {
    // Default result, also returned whenever the function fails open.
    $clean = [ 'flagged' => false, 'categories' => [], 'scores' => [] ];

    $text = trim( wp_strip_all_tags( $content ) );
    if ( ! defined( 'OPENAI_API_KEY' ) || '' === $text ) {
        return $clean;
    }

    $response = wp_remote_post(
        'https://api.openai.com/v1/moderations',
        [
            'timeout' => 10,
            'headers' => [
                'Authorization' => 'Bearer ' . OPENAI_API_KEY,
                'Content-Type'  => 'application/json',
            ],
            'body' => wp_json_encode( [
                'model' => 'text-moderation-latest',
                'input' => $text,
            ] ),
        ]
    );

    if ( is_wp_error( $response ) ) {
        // API unreachable: fail open (allow the post, log the error)
        error_log( 'BP AI Moderation API error: ' . $response->get_error_message() );
        return $clean;
    }

    // Non-200 responses (bad key, rate limit) carry no results: log and fail open.
    $status = (int) wp_remote_retrieve_response_code( $response );
    if ( 200 !== $status ) {
        error_log( 'BP AI Moderation API returned HTTP ' . $status );
        return $clean;
    }

    $body   = json_decode( wp_remote_retrieve_body( $response ), true );
    $result = $body['results'][0] ?? null;

    if ( ! is_array( $result ) ) {
        return $clean;
    }

    return [
        'flagged'    => ! empty( $result['flagged'] ),
        'categories' => $result['categories'] ?? [],
        'scores'     => $result['category_scores'] ?? [],
    ];
}

A few implementation decisions are worth explaining. The function fails open on API errors: if OpenAI is unreachable, the post goes through, so an outage never silently blocks legitimate content. You can change this to fail closed if your community requires stricter guarantees, but be aware that a brief API outage will then block all new posts. The 10-second timeout is conservative; the Moderation API typically responds in under 500ms. Tags are stripped before sending so that HTML markup is not evaluated as part of the content.
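The fail-open versus fail-closed choice can be isolated in the response-parsing step. The following is a minimal sketch, not part of the original function: bp_ai_parse_moderation_body and its $fail_closed switch are hypothetical names, and the sample response body uses made-up values in the shape the Moderation API returns.

```php
<?php
/**
 * Turn a raw Moderation API response body into the tutorial's result shape.
 * $fail_closed is a hypothetical switch: when true, an unusable response
 * is treated as flagged so the post is held instead of published.
 */
function bp_ai_parse_moderation_body( string $json, bool $fail_closed = false ): array {
    $body   = json_decode( $json, true );
    $result = is_array( $body ) ? ( $body['results'][0] ?? null ) : null;

    if ( ! is_array( $result ) ) {
        // Unusable response: the policy decides whether the post is held.
        return [
            'flagged'    => $fail_closed,
            'categories' => [],
            'scores'     => [],
            'error'      => true,
        ];
    }

    return [
        'flagged'    => ! empty( $result['flagged'] ),
        'categories' => $result['categories'] ?? [],
        'scores'     => $result['category_scores'] ?? [],
        'error'      => false,
    ];
}

// Illustrative response body; the values are invented for the example.
$sample = '{"results":[{"flagged":true,"categories":{"harassment":true},"category_scores":{"harassment":0.91}}]}';

$ok     = bp_ai_parse_moderation_body( $sample );     // parsed normally
$open   = bp_ai_parse_moderation_body( '', false );   // fail open: not flagged
$closed = bp_ai_parse_moderation_body( '', true );    // fail closed: flagged
```

The extra 'error' key lets a hook distinguish "held because the API was down" from "held because the content was flagged", which matters when moderators review the queue after an outage.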

Step 3: Hook Into BuddyPress Activity

Add the following to the same bp-ai-moderation.php file. This runs before every BuddyPress activity item saves:

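add_action( 'bp_activity_before_save', function( $activity ) {
    // Skip moderation for administrators and moderators
    if ( current_user_can( 'manage_options' ) || current_user_can( 'moderate_comments' ) ) {
        return;
    }

    $content = (string) $activity->content;
    if ( '' === trim( wp_strip_all_tags( $content ) ) ) {
        return;
    }

    $result = bp_ai_moderate( $content );

    if ( $result['flagged'] ) {
        $activity->is_spam = 1;

        // A new activity item has no ID until it is saved, so log the result
        // for moderator review once the save has completed.
        add_action( 'bp_activity_after_save', function( $saved ) use ( $activity, $result ) {
            if ( $saved === $activity && ! empty( $saved->id ) ) {
                bp_activity_update_meta(
                    $saved->id,
                    '_ai_moderation_result',
                    wp_slash( wp_json_encode( $result ) )
                );
            }
        } );
    }
} );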

When is_spam is set on a BuddyPress activity item, BuddyPress treats it as spam: it is hidden from activity feeds but not deleted. Moderators can review it in the BuddyPress spam queue and restore or permanently delete it. The meta entry stores the full API result so moderators can see which categories triggered the flag.
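The hook above treats every flagged item the same way. If you would rather let severity decide between holding an item and marking it as spam, the category scores can drive that choice. The sketch below is one way to do it; the function name and both thresholds are assumptions to tune against your own flagged samples, not values recommended by OpenAI.

```php
<?php
/**
 * Map a moderation result to an action: 'allow', 'hold', or 'spam'.
 * $hold_at and $spam_at are illustrative thresholds (assumptions).
 */
function bp_ai_severity_action( array $result, float $hold_at = 0.5, float $spam_at = 0.9 ): string {
    // Keep only numeric scores so a malformed entry cannot skew max().
    $scores = array_filter( $result['scores'] ?? [], 'is_numeric' );
    $max    = $scores ? (float) max( $scores ) : 0.0;

    if ( $max >= $spam_at ) {
        return 'spam'; // very confident violation
    }
    if ( ! empty( $result['flagged'] ) || $max >= $hold_at ) {
        return 'hold'; // flagged or borderline: send to a human
    }
    return 'allow';
}

$severe     = bp_ai_severity_action( [ 'flagged' => true,  'scores' => [ 'harassment' => 0.95 ] ] );
$borderline = bp_ai_severity_action( [ 'flagged' => true,  'scores' => [ 'harassment' => 0.40 ] ] );
$clean      = bp_ai_severity_action( [ 'flagged' => false, 'scores' => [ 'harassment' => 0.10 ] ] );
```

How each action maps onto a status is up to the calling hook. For forum posts, 'hold' corresponds naturally to pending status; for activity items, both 'hold' and 'spam' may still end up in the spam queue, with the action stored in meta so moderators can sort by it.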

Step 4: Hook Into Jetonomy Forum Posts

Jetonomy forum threads and replies are WordPress custom post types. Hook into the wp_insert_post_data filter to intercept them before they save:

add_filter( 'wp_insert_post_data', function( $data, $postarr ) {
    // Only apply to Jetonomy post types
    $jetonomy_types = [ 'jet_topic', 'jet_reply' ];
    if ( ! in_array( $data['post_type'], $jetonomy_types, true ) ) {
        return $data;
    }

    // Skip moderation for admins
    if ( current_user_can( 'manage_options' ) ) {
        return $data;
    }

    // Only check posts that are being published or submitted for review
    if ( ! in_array( $data['post_status'], [ 'publish', 'pending' ], true ) ) {
        return $data;
    }

    // Post data arrives slashed in this filter, so unslash it before moderation
    $content = wp_unslash( $data['post_content'] . ' ' . $data['post_title'] );
    $result  = bp_ai_moderate( $content );

    if ( $result['flagged'] ) {
        // Hold for review instead of publishing
        $data['post_status'] = 'pending';

        // Attach moderation metadata for the review queue once the post has an ID
        add_action( 'save_post', function( $post_id ) use ( $result, $jetonomy_types ) {
            // Ignore saves of unrelated post types later in the same request
            if ( ! in_array( get_post_type( $post_id ), $jetonomy_types, true ) ) {
                return;
            }
            update_post_meta( $post_id, '_ai_moderation_result', wp_slash( wp_json_encode( $result ) ) );
            update_post_meta( $post_id, '_ai_moderation_flagged', 1 );
        }, 10, 1 );
    }

    return $data;
}, 10, 2 );

Note: replace jet_topic and jet_reply with the actual post type slugs Jetonomy registers on your installation. You can find these with wp post-type list via WP-CLI or by inspecting the URL when viewing a forum topic in the WordPress admin.

The Jetonomy hook sends flagged posts to pending status rather than marking them as spam. This is a deliberate choice: forum posts are structured content with titles and body text, and a false positive is more disruptive than a spam mark on an activity item. Pending status puts the post in the standard WordPress review queue, which community managers can access from Posts or the relevant custom post type in the admin.

Step 5: Add a Custom Rules Layer

The OpenAI Moderation API is general-purpose. It catches hate speech and harassment reliably but it does not know your community’s specific rules. A WordPress plugin forum might want to flag posts that include competitor plugin recommendations. A faith-based community might want to flag content that violates their specific code of conduct. A professional community might want to flag posts that share personal contact information.

Add a custom rules function that runs after the AI check:

function bp_ai_custom_rules( string $content ): array {
    $flagged = false;
    $reasons = [];

    $content_lower = strtolower( $content );

    // Example rule: flag posts containing competitor names for moderator review
    $competitors = [ 'pluginname1', 'pluginname2' ];
    foreach ( $competitors as $name ) {
        // strpos() rather than str_contains() keeps this compatible with PHP 7.4
        if ( false !== strpos( $content_lower, $name ) ) {
            $flagged   = true;
            $reasons[] = 'competitor_mention:' . $name;
        }
    }

    // Example rule: flag posts containing email addresses
    if ( preg_match( '/\b[\w.+-]+@[\w-]+\.[\w.]+\b/', $content ) ) {
        $flagged   = true;
        $reasons[] = 'personal_contact_info';
    }

    // Example rule: flag excessive caps (possible spam or shouting).
    // Only applies once there are more than 20 letters, so short posts are ignored.
    $upper_count = preg_match_all( '/[A-Z]/', $content );
    $total_alpha = preg_match_all( '/[a-zA-Z]/', $content );
    if ( $total_alpha > 20 && ( $upper_count / $total_alpha ) > 0.6 ) {
        $flagged   = true;
        $reasons[] = 'excessive_caps';
    }

    return [ 'flagged' => $flagged, 'reasons' => $reasons ];
}

Call this function alongside the AI check, inside the same hook, and combine the results. A post that passes the AI check but fails a custom rule still gets held for review, and vice versa.
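Combining the two results can be as small as a logical OR plus a merged list of reasons. The helper below is a sketch under that assumption; bp_ai_combine_results and the 'ai:' / 'rule:' prefixes are hypothetical names chosen for the example.

```php
<?php
/**
 * Merge the AI result and the custom-rules result into one decision.
 * A post is flagged when either layer flags it.
 */
function bp_ai_combine_results( array $ai, array $custom ): array {
    $reasons = [];

    // Record every AI category that was triggered.
    foreach ( $ai['categories'] ?? [] as $category => $hit ) {
        if ( $hit ) {
            $reasons[] = 'ai:' . $category;
        }
    }

    // Append the custom rule reasons.
    foreach ( $custom['reasons'] ?? [] as $reason ) {
        $reasons[] = 'rule:' . $reason;
    }

    return [
        'flagged' => ! empty( $ai['flagged'] ) || ! empty( $custom['flagged'] ),
        'reasons' => $reasons,
        'scores'  => $ai['scores'] ?? [],
    ];
}

// Example: the AI check passes, but a custom rule catches an email address.
$combined = bp_ai_combine_results(
    [ 'flagged' => false, 'categories' => [ 'harassment' => false ], 'scores' => [ 'harassment' => 0.02 ] ],
    [ 'flagged' => true, 'reasons' => [ 'personal_contact_info' ] ]
);
```

Inside either hook, the call would look like bp_ai_combine_results( bp_ai_moderate( $content ), bp_ai_custom_rules( $content ) ), with the existing flagged check applied to the combined result.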

Cost Math at Scale

The OpenAI Moderation API is currently free for API accounts, charged at $0 per call. That said, it is worth understanding what the cost would be if pricing changes, and what the processing overhead looks like at scale.

Monthly Posts	API Calls	Approx. Cost	Avg. Latency Added
10,000	10,000	~$0 (free tier)	+200–400ms/post
100,000	100,000	~$0 (free tier)	+200–400ms/post
1,000,000	1,000,000	~$0 (free tier)	+200–400ms/post
Any volume	Any volume	Contact OpenAI for enterprise	Async queue recommended

The latency cost is the more practical concern. At 200 to 400ms per post, synchronous moderation is fine for human typing speeds but adds noticeable delay if you are bulk-importing content or running a migration: 100,000 posts at 300ms each is roughly 8 hours of added wait. For high-volume communities, move the API call to an asynchronous queue using the Action Scheduler library (bundled with WooCommerce and also available as a standalone plugin) and set posts to pending status by default, publishing them automatically once the API returns a clean result.
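A rough sketch of that asynchronous flow follows. It assumes Action Scheduler is installed, reuses bp_ai_moderate from Step 2, and would replace the synchronous Step 4 filter for forum posts instead of running next to it. The hook name bp_ai_moderate_post_async, the _ai_moderation_checked meta key, and the post type slugs are assumptions made for the example.

```php
<?php
// Assumed post type slugs, matching the placeholders used in Step 4.
const BP_AI_ASYNC_TYPES = [ 'jet_topic', 'jet_reply' ];

/** Decide the final status once the background check has run. */
function bp_ai_async_status( array $result ): string {
    return empty( $result['flagged'] ) ? 'publish' : 'pending';
}

// Guard so the file can be linted or tested outside WordPress.
if ( function_exists( 'add_action' ) ) {

    // 1. Hold unchecked forum posts from non-admins instead of publishing them.
    add_filter( 'wp_insert_post_data', function ( $data, $postarr ) {
        $post_id = (int) ( $postarr['ID'] ?? 0 );
        $checked = $post_id && get_post_meta( $post_id, '_ai_moderation_checked', true );

        if ( in_array( $data['post_type'], BP_AI_ASYNC_TYPES, true )
            && 'publish' === $data['post_status']
            && ! $checked
            && ! current_user_can( 'manage_options' ) ) {
            $data['post_status'] = 'pending';
        }
        return $data;
    }, 10, 2 );

    // 2. Queue a background check for each held post.
    add_action( 'save_post', function ( $post_id, $post ) {
        if ( ! function_exists( 'as_enqueue_async_action' ) ) {
            return; // Action Scheduler is not loaded.
        }
        if ( in_array( $post->post_type, BP_AI_ASYNC_TYPES, true )
            && 'pending' === $post->post_status
            && ! get_post_meta( $post_id, '_ai_moderation_checked', true ) ) {
            as_enqueue_async_action( 'bp_ai_moderate_post_async', [ $post_id ] );
        }
    }, 10, 2 );

    // 3. Worker: runs outside the request that created the post.
    add_action( 'bp_ai_moderate_post_async', function ( $post_id ) {
        $post = get_post( $post_id );
        if ( ! $post || 'pending' !== $post->post_status ) {
            return;
        }

        $result = bp_ai_moderate( $post->post_title . ' ' . $post->post_content );
        update_post_meta( $post_id, '_ai_moderation_checked', 1 );

        if ( 'publish' === bp_ai_async_status( $result ) ) {
            wp_update_post( [ 'ID' => $post_id, 'post_status' => 'publish' ] );
        } else {
            update_post_meta( $post_id, '_ai_moderation_flagged', 1 );
            update_post_meta( $post_id, '_ai_moderation_result', wp_slash( wp_json_encode( $result ) ) );
        }
    } );
}
```

Two caveats with this sketch. Because bp_ai_moderate fails open, an API outage would publish held posts unchecked; a production version would more likely reschedule the action when the call fails. A post saved several times while pending can also be queued more than once, which is harmless here because the worker exits early once the status has changed.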

EU Option: Self-Hosted Mistral Moderation

If your community is subject to GDPR and you have concerns about sending post content to OpenAI’s US-based servers, a self-hosted alternative is available. Mistral AI’s moderation model can be run on your own infrastructure using Ollama or a similar local inference framework.

The tradeoff: self-hosted inference requires a server with enough GPU memory to run the model (at minimum a VPS with an NVIDIA T4 or A10, which runs around $0.40 to $0.60 per hour on Hetzner or OVH). For communities in Germany, France, or elsewhere in the EU where data residency is a hard requirement, this is the appropriate path. For most communities outside regulated industries, sending moderation text to OpenAI is comparable to any other text-processing API and is covered by OpenAI’s standard data processing addendum.

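To swap Mistral in for OpenAI, replace the API endpoint and authentication in the bp_ai_moderate function. The input/output format is similar enough that most of the remaining code stays the same, but check the response shape of whichever backend you run before relying on fields such as flagged.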
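One way to make that swap a configuration change is to read the endpoint, key, and model from constants and to normalize the response before the hooks see it. This is a sketch only: the three BP_AI_MODERATION_* constants are hypothetical names, no particular self-hosted endpoint URL is implied, and the normalizer assumes the backend returns per-category booleans even when it omits a top-level flagged field.

```php
<?php
/**
 * Resolve the moderation backend from optional wp-config.php constants.
 * BP_AI_MODERATION_ENDPOINT, BP_AI_MODERATION_KEY and BP_AI_MODERATION_MODEL
 * are hypothetical names; the defaults match the OpenAI setup from Step 2.
 */
function bp_ai_moderation_config(): array {
    $fallback_key = defined( 'OPENAI_API_KEY' ) ? OPENAI_API_KEY : '';

    return [
        'endpoint' => defined( 'BP_AI_MODERATION_ENDPOINT' )
            ? BP_AI_MODERATION_ENDPOINT
            : 'https://api.openai.com/v1/moderations',
        'key'      => defined( 'BP_AI_MODERATION_KEY' ) ? BP_AI_MODERATION_KEY : $fallback_key,
        'model'    => defined( 'BP_AI_MODERATION_MODEL' )
            ? BP_AI_MODERATION_MODEL
            : 'text-moderation-latest',
    ];
}

/**
 * Normalize one result entry into the tutorial's shape.
 * If the backend sends no 'flagged' field, derive it from the category booleans.
 */
function bp_ai_normalize_result( array $result ): array {
    $categories = $result['categories'] ?? [];
    $flagged    = array_key_exists( 'flagged', $result )
        ? (bool) $result['flagged']
        : in_array( true, $categories, true );

    return [
        'flagged'    => $flagged,
        'categories' => $categories,
        'scores'     => $result['category_scores'] ?? [],
    ];
}

$config  = bp_ai_moderation_config();
$derived = bp_ai_normalize_result( [ 'categories' => [ 'hate' => true, 'violence' => false ] ] );
```

With these two helpers, bp_ai_moderate would read $config['endpoint'], $config['key'], and $config['model'] instead of the hardcoded values and pass the first result entry through bp_ai_normalize_result before returning it.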

GDPR Considerations

When you send post content to the OpenAI Moderation API, you are sending user-generated text to a third-party processor. Under GDPR, this requires a lawful basis (legitimate interest in keeping your community safe is a common one) and a Data Processing Agreement with OpenAI. OpenAI provides a DPA at platform.openai.com; accept it in your account settings.

Update your community’s privacy policy to disclose that post content is processed by an AI moderation service. Do not send personally identifiable information beyond the post text: strip usernames, email addresses, and user IDs before sending if your content regularly includes them. The moderation function already calls wp_strip_all_tags, which removes identifiers embedded in markup (such as profile links in tag attributes) but not names or addresses written in the visible text.
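A small redaction pass before the API call covers the identifiers that tag stripping leaves behind. The sketch below is an assumption about what such a pass could look like: bp_ai_redact_identifiers is a hypothetical helper, and the patterns are deliberately loose (emails and @mentions only).

```php
<?php
/**
 * Replace email addresses and @mentions with neutral placeholders
 * before the text is sent to a third-party moderation service.
 */
function bp_ai_redact_identifiers( string $text ): string {
    // Email addresses, using the same loose pattern as the custom rules layer.
    $text = preg_replace( '/\b[\w.+-]+@[\w-]+\.[\w.]+\b/', '[email]', $text ) ?? $text;

    // @mentions such as "@jane" or "@jane.doe"; assumes usernames consist of
    // word characters and hyphens, optionally separated by dots.
    $text = preg_replace( '/(?<![\w@])@[\w-]+(?:\.[\w-]+)*/', '[user]', $text ) ?? $text;

    return $text;
}

$redacted = bp_ai_redact_identifiers( 'Contact jane.doe@example.com or ping @jane about it.' );
// $redacted is now 'Contact [email] or ping [user] about it.'
```

Calling this on the text just before it is passed to bp_ai_moderate keeps the stored post unchanged while reducing what leaves your server. Placeholders rather than deletions keep the sentence structure intact, which helps the model judge tone.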

Do not log full post content in the moderation meta beyond what moderators need to review a flagged decision. The _ai_moderation_result meta stores category scores, not the original post text. The post text is already stored in the post content field, so there is no need to duplicate it in meta.

Building a Moderator Dashboard

The default WordPress admin shows pending posts in the standard post list views. For communities with active moderation queues, building a dedicated review interface speeds up the workflow significantly.

A minimal moderator dashboard queries for posts with _ai_moderation_flagged = 1 and displays them with the flag reason and the full post content. Moderators can approve (change status to publish), reject (change status to trash), or escalate (leave in pending and add a note). Each action should clear the flag meta so the item does not reappear in the queue.
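The queue query and the three moderator actions could be sketched as below. The function names and the default post type slugs are assumptions; the meta key matches the one written by the forum hook in Step 4.

```php
<?php
/** Query arguments for the review queue: pending posts carrying the flag meta. */
function bp_ai_flagged_query_args( array $post_types = [ 'jet_topic', 'jet_reply' ], int $per_page = 20 ): array {
    return [
        'post_type'      => $post_types,
        'post_status'    => 'pending',
        'posts_per_page' => $per_page,
        'meta_key'       => '_ai_moderation_flagged',
        'meta_value'     => 1,
        'orderby'        => 'date',
        'order'          => 'ASC', // oldest first, so nothing waits indefinitely
    ];
}

/**
 * Apply a moderator decision and clear the flag.
 * Returns false for an unknown decision and touches nothing in that case.
 */
function bp_ai_resolve_flag( int $post_id, string $decision ): bool {
    if ( 'approve' === $decision ) {
        wp_update_post( [ 'ID' => $post_id, 'post_status' => 'publish' ] );
    } elseif ( 'reject' === $decision ) {
        wp_trash_post( $post_id );
    } elseif ( 'escalate' !== $decision ) {
        return false; // unknown decision
    }

    // 'escalate' leaves the post pending; all three actions clear the flag.
    delete_post_meta( $post_id, '_ai_moderation_flagged' );
    return true;
}

// Inside the admin page callback (WordPress context):
// $queue = new WP_Query( bp_ai_flagged_query_args() );
```

One thing to watch: an approval made by a user without manage_options passes through the Step 4 filter again and would be re-flagged, so the review page should be limited to administrators or the filter should skip posts a moderator has already resolved. The admin page should also verify a nonce and the user's capability before calling bp_ai_resolve_flag.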

We implement this pattern as a custom admin page in BuddyPress development projects. It takes about 4 to 6 hours of development to build a clean review UI on top of the moderation hooks above. The result is a community moderation workflow that catches roughly 95% of policy violations automatically and surfaces the remaining 5% for human review in a single, organized queue.

What This Setup Does Not Cover

AI moderation catches content-level violations: hate speech, harassment, explicit content, self-harm references. It does not catch behavioral patterns: a user who posts individually acceptable content but is systematically targeting another member, or a coordinated campaign of low-grade spam that individually scores below the flagging threshold.

For behavioral moderation, you need user-level signals: post frequency, report rate, new account age, and trust level. BuddyPress’s existing member management tools cover most of this, and Jetonomy’s trust level system adds a permission layer that restricts what new members can do before they have established a track record. The AI moderation layer and the trust level layer work together: each fills the gaps the other leaves.

Getting Started

The code above is a working starting point. Drop it in mu-plugins, add your API key to wp-config.php, and every new BuddyPress activity post and Jetonomy forum post will run through AI moderation immediately.

If you want the full implementation (async queue for high-volume communities, custom moderator dashboard, trust-level integration, and a self-hosted Mistral option for EU deployments), we build this as part of BuddyPress community projects. The moderation stack is one of the first things we configure on any community with more than a few hundred active members. Wbcom Designs provides AI integration services for WordPress communities that need custom moderation pipelines and other AI-driven workflows.
