Your First Protocol Discovery

Walk through a complete protocol discovery workflow -- from capturing traffic to generating a parser.

Overview

QBITEL Bridge's protocol discovery pipeline takes raw network traffic, identifies unknown protocols in it, learns their grammar, and generates parsers. This tutorial covers each step through the HTTP API, using two small HTTP samples as input.

Prerequisites

  • QBITEL Bridge running locally (see Quick Start)
  • API accessible at http://localhost:8000
  • curl or any HTTP client installed

Step 1: Verify the API

Confirm the AI Engine is running and healthy:

curl http://localhost:8000/health

Expected response:

{
  "status": "healthy",
  "version": "1.0.0",
  "components": {
    "ai_engine": "ready",
    "database": "connected",
    "llm": "available"
  }
}
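
For scripted checks, the same probe can be run from Python. This is a minimal sketch that assumes the response has the shape shown above. The helper names is_healthy and fetch_health, and the set of component states treated as good, are illustrative and not part of QBITEL Bridge.

```python
import json
import urllib.request

# Component states treated as "good". These are the three values in the sample
# response; other states may exist, so this set is an assumption.
GOOD_STATES = {"ready", "connected", "available"}


def is_healthy(payload: dict) -> bool:
    """Return True if the overall status and every listed component look healthy."""
    if payload.get("status") != "healthy":
        return False
    components = payload.get("components", {})
    return all(state in GOOD_STATES for state in components.values())


def fetch_health(base_url: str = "http://localhost:8000") -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


# Usage (needs a running instance):
#     print(is_healthy(fetch_health()))
```

Keeping the check separate from the network call means the check can be unit-tested without a running instance.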

Step 2: Prepare Traffic Data

Protocol discovery accepts base64-encoded traffic samples. Encode your captured packets:

# Encode a sample HTTP request.
# printf expands \r\n into real CR LF bytes; echo -n does not do so portably.
printf 'GET /api/users HTTP/1.1\r\nHost: api.example.com\r\n\r\n' | base64

# Output: R0VUIC9hcGkvdXNlcnMgSFRUUC8xLjENCkhvc3Q6IGFwaS5leGFtcGxlLmNvbQ0KDQo=
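
The same encoding can be done in Python, which sidesteps shell differences in how escape sequences such as \r\n are handled. The helper name encode_packet is illustrative. The only library call involved is the standard base64 module.

```python
import base64


def encode_packet(raw: bytes) -> str:
    """Base64-encode one captured message for use in packet_data."""
    return base64.b64encode(raw).decode("ascii")


# Real CR LF bytes, not the two-character sequences backslash-r / backslash-n.
request = b"GET /api/users HTTP/1.1\r\nHost: api.example.com\r\n\r\n"
encoded = encode_packet(request)
print(encoded)

# Round trip: decoding must give back exactly the bytes that were captured.
assert base64.b64decode(encoded) == request
```

Decoding a sample before submitting it is a quick way to confirm that the line endings survived the encoding step.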

Step 3: Submit a Discovery Request

Send the encoded traffic data to the discovery endpoint:

curl -X POST http://localhost:8000/api/v1/discover \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "packet_data": [
      "R0VUIC9hcGkvdXNlcnMgSFRUUC8xLjENCkhvc3Q6IGFwaS5leGFtcGxlLmNvbQ0KDQo=",
      "UE9TVCAvYXBpL2xvZ2luIEhUVFAvMS4xDQpDb250ZW50LVR5cGU6IGFwcGxpY2F0aW9uL2pzb24NCg0K"
    ],
    "metadata": {
      "source": "manual_capture",
      "confidence_threshold": 0.7
    }
  }'
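
A Python equivalent of the request above, as a sketch. The endpoint, headers, and body fields are copied from the curl command. The function names and the timeout are assumptions made for illustration, and the response is returned as decoded JSON without assuming its exact layout.

```python
import json
import urllib.request


def build_discovery_request(packets, api_key, base_url="http://localhost:8000",
                            source="manual_capture", confidence_threshold=0.7):
    """Build (but do not send) the POST /api/v1/discover request."""
    body = {
        "packet_data": list(packets),  # base64 strings, one per captured message
        "metadata": {
            "source": source,
            "confidence_threshold": confidence_threshold,
        },
    }
    return urllib.request.Request(
        f"{base_url}/api/v1/discover",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def discover(packets, api_key):
    """Send the request and return the decoded JSON response."""
    req = build_discovery_request(packets, api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Usage (needs a running instance and a valid key):
#     result = discover(["R0VUIC9hcGkvdXNlcnMgSFRUUC8xLjE="], "your_api_key")
```

Building the request separately from sending it makes the payload easy to inspect or log before any traffic leaves the machine.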

Step 4: Understand the Results

The API returns discovered protocols with grammar rules, parsers, and confidence scores. Key fields in the response:

Field                 Description
discovered_protocols  Array of identified protocols with confidence scores
grammar               Learned PCFG rules with production probabilities
parser                Auto-generated parser with success rate metrics
validation_rules      Regex and structural validation rules
statistics            Traffic statistics: entropy, message lengths, binary ratio
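
One common post-processing step is to keep only the protocols that meet a confidence threshold. The sketch below assumes each entry in discovered_protocols is an object with a numeric score. The key name "confidence", the "name" key, and the sample data are guesses for illustration, so check them against a real response before relying on them.

```python
def confident_protocols(response: dict, threshold: float = 0.7,
                        score_key: str = "confidence") -> list:
    """Return entries of discovered_protocols whose score meets the threshold.

    score_key is a guess at the per-entry field name; adjust it to match the
    actual response.
    """
    entries = response.get("discovered_protocols", [])
    return [e for e in entries if e.get(score_key, 0.0) >= threshold]


# Made-up response fragment, for illustration only.
sample = {
    "discovered_protocols": [
        {"name": "http", "confidence": 0.93},
        {"name": "unknown-1", "confidence": 0.41},
    ]
}
print(confident_protocols(sample))
```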

Step 5: Detect Fields in a Message

Once protocols are discovered, detect individual fields in messages:

curl -X POST http://localhost:8000/api/v1/detect-fields \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "packet_data": "R0VUIC9hcGkvdXNlcnMgSFRUUC8xLjE="
  }'

The response includes detected fields, data types, and structural boundaries.
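
To make "structural boundaries" concrete, here is a hand-written illustration for the sample above, which decodes to an HTTP request line. It is not the service's detection algorithm, and the field names and tuple layout are assumptions. It only shows the kind of byte offsets a field detector might report.

```python
import base64


def split_request_line(encoded: str):
    """Decode a base64 sample and return (name, start, end, value) per field.

    Written by hand for an HTTP request line. Offsets are byte positions in
    the decoded message, with the end offset exclusive.
    """
    raw = base64.b64decode(encoded)
    names = ["method", "target", "version"]
    fields, start = [], 0
    for name, token in zip(names, raw.split(b" ", 2)):
        end = start + len(token)
        fields.append((name, start, end, token.decode("ascii")))
        start = end + 1  # skip the single space delimiter
    return fields


for field in split_request_line("R0VUIC9hcGkvdXNlcnMgSFRUUC8xLjE="):
    print(field)
```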

Step 6: Use the Protocol Copilot

Ask the Protocol Copilot, in natural language, to explain what was discovered:

curl -X POST http://localhost:8000/api/v1/copilot/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_api_key" \
  -d '{
    "query": "Explain the protocol discovered in my last analysis",
    "session_id": "your-session-id"
  }'

How the Discovery Pipeline Works

Behind the scenes, your request passes through these stages:

  1. Statistical Analysis -- extract entropy, byte distributions, and structural patterns
  2. Pattern Extraction -- identify recurring delimiters, headers, and field boundaries
  3. Grammar Learning -- infer a Probabilistic Context-Free Grammar (PCFG) from the traffic
  4. Parser Generation -- automatically generate a parser from the learned grammar
  5. ML Classification -- classify against known protocols using CNN, LSTM, and Random Forest models
  6. Validation -- verify the discovered protocol with structural and semantic checks
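
The first stage can be sketched with standard formulas. The code below computes Shannon entropy, a binary ratio, and message-length statistics for a list of raw messages. The function names and the definition of "binary" (anything outside printable ASCII, tab, CR, and LF) are assumptions, and the pipeline's own implementation may differ.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def binary_ratio(data: bytes) -> float:
    """Fraction of bytes outside printable ASCII, tab, CR, and LF."""
    if not data:
        return 0.0
    textual = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return sum(b not in textual for b in data) / len(data)


def message_stats(messages):
    """Summarize a list of raw messages the way a first analysis pass might."""
    lengths = [len(m) for m in messages]
    joined = b"".join(messages)
    return {
        "count": len(messages),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "entropy": shannon_entropy(joined),
        "binary_ratio": binary_ratio(joined),
    }


samples = [
    b"GET /api/users HTTP/1.1\r\nHost: api.example.com\r\n\r\n",
    b"POST /api/login HTTP/1.1\r\nContent-Type: application/json\r\n\r\n",
]
print(message_stats(samples))
```

Text protocols such as HTTP tend to show a low binary ratio and entropy well below 8 bits per byte, while compressed or encrypted payloads approach 8. That contrast is why these measures are a useful first signal.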

Next Steps