WhatsApp Media Support Implementation

Overview

Implemented full media send/receive support for manual WhatsApp conversations. Users can now send images, audio, and video files through the Optimly conversations interface, and media received from WhatsApp clients is automatically displayed in the conversation view.

Implementation Date

February 3, 2026

What Was Implemented

Backend Components

1. S3 Upload Utility (optimly-api/optimly_api/app/utils/s3_upload.py)

  • Purpose: Provides helper functions for uploading media files to AWS S3

  • Key Functions:

    • upload_conversation_media(): Uploads media files to S3 with public-read ACL
    • validate_media_file(): Validates file size (<5MB) and type (images/audio/video)
    • delete_conversation_media(): Deletes media files from S3
  • S3 Structure: Files stored at conversation-media/{user_id}/{agent_id}/{chat_id}/{timestamp}_{filename}

  • Supported Formats:

    • Images: JPEG, PNG, GIF, WebP
    • Audio: MP3, OGG, WAV, M4A, MP4
    • Video: MP4, MPEG, MOV, AVI
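The validation rules and S3 key layout described above can be sketched as follows. This is a minimal illustration, not the actual contents of s3_upload.py: the helper names validate_media and build_media_key, the MIME type allow-list, and the timestamp format (inferred from the example URL in the API documentation) are all assumptions.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

MAX_MEDIA_BYTES = 5 * 1024 * 1024  # 5MB limit stated in this document

# Assumed MIME types for the listed formats; the real allow-list may differ.
ALLOWED_CONTENT_TYPES = {
    "image/jpeg", "image/png", "image/gif", "image/webp",
    "audio/mpeg", "audio/ogg", "audio/wav", "audio/x-m4a", "audio/mp4",
    "video/mp4", "video/mpeg", "video/quicktime", "video/x-msvideo",
}


def validate_media(size_bytes: int, content_type: str) -> None:
    """Raise ValueError if the file is empty, too large, or an unsupported type."""
    if size_bytes <= 0:
        raise ValueError("File is empty")
    if size_bytes > MAX_MEDIA_BYTES:
        raise ValueError("File size exceeds 5MB limit")
    if content_type.lower() not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {content_type}")


def build_media_key(user_id, agent_id, chat_id, filename, now=None) -> str:
    """Build conversation-media/{user_id}/{agent_id}/{chat_id}/{timestamp}_{filename}."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    # Keep only the final path component so a crafted filename cannot
    # escape the per-chat prefix.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    return f"conversation-media/{user_id}/{agent_id}/{chat_id}/{timestamp}_{safe_name}"
```

Stripping directory components from the filename keeps every object under its user, agent, and chat prefix, which is what the S3 structure relies on for organization.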

2. Upload API Endpoint (optimly-api/optimly_api/app/router/upload.py)

  • Endpoint: POST /upload/conversation-media?agent_id={agent_id}&chat_id={chat_id}
  • Authentication: Required (Bearer token)
  • Query Parameters:
    • agent_id: Agent ID for organizing files
    • chat_id: Chat ID for organizing files
  • Max File Size: 5MB (Twilio MMS limit)
  • Response: Returns public HTTPS URL for uploaded file
  • Error Handling: Validates file type and size before upload
  • Temporary File Management: Automatically cleans up temp files after upload
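The temporary file handling mentioned above could look roughly like the sketch below. It uses only the standard library; upload_via_temp_file and the uploader callback are hypothetical names, and the real endpoint in upload.py may structure this differently.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def upload_via_temp_file(stream: BinaryIO, suffix: str, uploader: Callable[[str], str]) -> str:
    """Spool an incoming upload to disk, pass the path to `uploader`, always clean up.

    `uploader` is a stand-in for whatever function pushes the file to S3 and
    returns its public URL.
    """
    fd, temp_path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as temp_file:
            # Copy in chunks so large uploads are not held fully in memory.
            while chunk := stream.read(64 * 1024):
                temp_file.write(chunk)
        return uploader(temp_path)
    finally:
        # Runs on success and on failure, so no temp file is left behind.
        if os.path.exists(temp_path):
            os.remove(temp_path)
```

Putting the removal in a finally block means a failed S3 upload still cleans up the temp file before the error propagates to the caller.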

3. Router Registration (optimly-api/optimly_api/app/main.py)

  • Added upload router to main application
  • Import statement updated to include upload module
  • Router registered without an additional prefix; its endpoints are served under /upload/*

Frontend Components

4. Upload Utility (optimly-dashboard/src/lib/upload-media.ts)

  • Purpose: Client-side file upload handling
  • Key Functions:
    • uploadConversationMedia(file, token, agentId, chatId): Uploads file to backend and returns URL
    • validateMediaFile(): Client-side validation before upload
    • formatFileSize(): Formats bytes to human-readable size
  • Features:
    • Client-side validation (5MB limit, file type checks)
    • Progress feedback via toast notifications
    • Comprehensive error handling
    • Passes agent_id and chat_id for proper file organization

5. Manual Response Action (optimly-dashboard/src/app/_actions/manual-response.ts)

  • Updated: Added optional mediaUrl parameter to sendManualResponse()
  • Signature: sendManualResponse(chatId, content, token, sendImmediately, mediaUrl?)
  • Payload: Includes media_url field in request body
  • Backend Integration: Passes media URL to existing /chat/{chat_id}/manual-response endpoint

6. Conversation Page (optimly-dashboard/src/app/(dashboard)/conversations/page.tsx)

  • Updated: handleSendMessage() to upload files before sending
  • Flow:
    1. User selects file(s) via file picker or drag & drop
    2. File uploaded to S3 via uploadConversationMedia()
    3. Public URL received from upload endpoint
    4. URL passed to sendManualResponse() with message
    5. Message sent to WhatsApp with media attachment
  • Features:
    • Toast notification during upload
    • Error handling for failed uploads
    • Automatic message refresh after sending

7. Message Display (optimly-dashboard/src/app/(dashboard)/analytics/components/conversation-components/message-card.tsx)

  • Updated: Added media rendering in message bubbles
  • Display Logic:
    • Images: <img> tag with max-height 300px
    • Audio: HTML5 <audio> player
    • Video: HTML5 <video> player
    • Fallback: Link to download unsupported types
  • Detection: Auto-detects media type from URL extension and content
  • Styling: Rounded borders, responsive sizing, lazy loading for images

How It Works

Receiving Media (Already Working)

  1. Client sends WhatsApp message with media
  2. Twilio webhook receives message with MediaUrl parameter
  3. WhatsAppMediaProcessor downloads and processes media:
    • Images → GPT-4 Vision analysis
    • Audio → Whisper transcription
    • Processed content appended to message
  4. Message saved to database with media_url field
  5. Frontend displays media in conversation view
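As a rough illustration of steps 2 and 3, the sketch below pulls media entries out of a Twilio webhook form payload and decides which processing path applies. Twilio sends NumMedia together with indexed MediaUrl0, MediaContentType0, and so on; the function names and the routing labels are hypothetical and do not reflect the internals of WhatsAppMediaProcessor.

```python
from typing import List, Mapping, Tuple


def extract_media(form: Mapping[str, str]) -> List[Tuple[str, str]]:
    """Return (url, content_type) pairs from a Twilio webhook form payload."""
    try:
        count = int(form.get("NumMedia", "0"))
    except ValueError:
        count = 0
    media = []
    for index in range(count):
        url = form.get(f"MediaUrl{index}")
        if url:
            media.append((url, form.get(f"MediaContentType{index}", "")))
    return media


def route_media(content_type: str) -> str:
    """Pick a processing path; the labels are placeholders for this sketch."""
    if content_type.startswith("image/"):
        return "vision"          # image analysis
    if content_type.startswith("audio/"):
        return "transcription"   # speech-to-text
    return "store_only"          # keep media_url, no extra processing
```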

Sending Media (Newly Implemented)

  1. User selects file in conversation interface
  2. File validated on client-side (type, size)
  3. File uploaded to S3 via /upload/conversation-media
  4. Upload endpoint returns the public HTTPS URL of the stored S3 object
  5. URL passed to manual response endpoint
  6. TwilioInterface.send_whatsapp_message() sends message with media_url
  7. Twilio delivers media to WhatsApp client
  8. Message saved to database with media_url field
  9. Message appears in conversation with media preview
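Step 6 might be sketched as below, assuming the Twilio Python helper library is installed. This is not the code of TwilioInterface.send_whatsapp_message(); the function names and argument layout are assumptions. The twilio import is deferred so the argument-building part can be exercised without the package.

```python
from typing import Optional


def build_whatsapp_message_kwargs(from_number: str, to_number: str,
                                  body: str = "", media_url: Optional[str] = None) -> dict:
    """Assemble keyword arguments for Twilio's messages.create()."""
    kwargs = {"from_": f"whatsapp:{from_number}", "to": f"whatsapp:{to_number}"}
    if body:
        kwargs["body"] = body
    if media_url:
        if not media_url.startswith("https://"):
            raise ValueError("media_url must be a public HTTPS URL")
        # The Twilio helper accepts media URLs as a list.
        kwargs["media_url"] = [media_url]
    if "body" not in kwargs and "media_url" not in kwargs:
        raise ValueError("Message needs text, media, or both")
    return kwargs


def send_whatsapp_message(account_sid: str, auth_token: str, **message_kwargs) -> str:
    """Send the message and return the Twilio message SID."""
    from twilio.rest import Client  # third-party dependency, imported lazily

    client = Client(account_sid, auth_token)
    message = client.messages.create(**message_kwargs)
    return message.sid
```

Media-only messages are allowed here (empty body with a media URL), which matches the "Send message with media only" item in the manual test checklist.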

Testing Guide

Prerequisites

  • AWS S3 bucket configured with public-read access
  • Environment variable AWS_S3_PINECONE set to bucket name
  • Twilio account with WhatsApp sandbox or approved number
  • User with manual mode enabled on WhatsApp conversation

Test Scenarios

1. Send Image from Dashboard

Steps:

  1. Open Conversations page in dashboard
  2. Select a WhatsApp conversation
  3. Toggle to Manual mode if needed
  4. Click file attachment icon or drag & drop image
  5. Select image file (JPG, PNG, GIF, WebP)
  6. Optional: Add text message
  7. Click Send

Expected Result:

  • Upload progress toast appears
  • Image sent to WhatsApp client
  • Image appears in conversation view
  • WhatsApp client receives image

2. Send Audio File

Steps:

  1. Select audio file (MP3, WAV, OGG, M4A)
  2. Send message

Expected Result:

  • Audio player appears in conversation
  • WhatsApp client receives audio file

3. Receive Image from WhatsApp

Steps:

  1. From WhatsApp, send image to Optimly number
  2. Check conversation in dashboard

Expected Result:

  • Image displays in conversation view
  • Vision analysis appears in message content (if AI enabled)

4. File Validation

Test Cases:

  • File > 5MB → Should show error
  • Unsupported file type (PDF, DOC) → Should show error
  • Empty file → Should show error

5. Error Handling

Test Cases:

  • Network error during upload → Should show retry option
  • S3 bucket unreachable → Should show error message
  • Invalid token → Should redirect to login

Manual Test Checklist

  • Upload image from file picker
  • Upload image via drag & drop
  • Upload audio file
  • Upload video file
  • Upload file > 5MB (should fail)
  • Upload unsupported file type (should fail)
  • Send message with media only (no text)
  • Send message with text and media
  • Receive image from WhatsApp client
  • Receive audio from WhatsApp client
  • View media in conversation history
  • Media displays correctly on mobile view
  • Media displays correctly on desktop view

Configuration

Environment Variables

# Required for media upload
AWS_S3_PINECONE=your-bucket-name
AWS_DEFAULT_REGION=us-east-1 # or your region
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key

# Existing Twilio config
TWILIO_ACCOUNT_SID=your-account-sid
TWILIO_AUTH_TOKEN=your-auth-token
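
A startup check for these variables could look like the following sketch. The function name load_media_config is hypothetical; the variable names are the ones listed above.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = (
    "AWS_S3_PINECONE",        # bucket name
    "AWS_DEFAULT_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
)


def load_media_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or fail fast listing every missing name."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the full list of missing names tends to be easier to debug than an S3 or Twilio error surfacing on the first upload.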

S3 Bucket Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/conversation-media/*"
    }
  ]
}

CORS Configuration (if needed)

[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET", "PUT", "POST"],
    "AllowedOrigins": ["https://your-domain.com"],
    "ExposeHeaders": []
  }
]

File Structure

optimly-api/
└── optimly_api/app/
    ├── router/
    │   └── upload.py                # NEW: Media upload endpoint
    ├── utils/
    │   └── s3_upload.py             # NEW: S3 upload utilities
    └── main.py                      # UPDATED: Router registration

optimly-dashboard/
└── src/
    ├── app/
    │   ├── _actions/
    │   │   └── manual-response.ts   # UPDATED: Added mediaUrl param
    │   └── (dashboard)/
    │       ├── conversations/
    │       │   └── page.tsx         # UPDATED: File upload integration
    │       └── analytics/components/conversation-components/
    │           └── message-card.tsx # UPDATED: Media display
    └── lib/
        └── upload-media.ts          # NEW: Upload utility

API Documentation

Upload Media Endpoint

Request:

POST /upload/conversation-media?agent_id=agent-123&chat_id=chat-456
Authorization: Bearer {token}
Content-Type: multipart/form-data

Form field "file": <binary data>

Response (Success):

{
  "success": true,
  "url": "https://bucket.s3.region.amazonaws.com/conversation-media/user123/agent-123/chat-456/20260203_120000_image.jpg",
  "filename": "image.jpg",
  "content_type": "image/jpeg",
  "size": 1234567
}

Response (Error):

{
  "detail": "File size exceeds 5MB limit"
}

Send Manual Response with Media

Request:

POST /chat/{chat_id}/manual-response
Authorization: Bearer {token}
Content-Type: application/json

{
  "content": "Here's the product image",
  "send_immediately": true,
  "media_url": "https://bucket.s3.region.amazonaws.com/conversation-media/user123/agent-123/chat-456/20260203_120000_image.jpg"
}

Response:

{
  "success": true,
  "message": "Manual response sent successfully",
  "twilio_sid": "SM1234567890abcdef",
  "sent": true,
  "message_id": "msg-uuid-1234"
}
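
For scripted testing, the manual-response request documented above could be built with the standard library as in this sketch. The helper name and the base_url parameter are assumptions; the path, headers, and body fields follow the request shown above. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional


def build_manual_response_request(base_url: str, chat_id: str, token: str, content: str,
                                  media_url: Optional[str] = None,
                                  send_immediately: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/{chat_id}/manual-response."""
    payload = {"content": content, "send_immediately": send_immediately}
    if media_url is not None:
        payload["media_url"] = media_url  # URL returned by the upload endpoint
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/{chat_id}/manual-response",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() would perform the call; the response body can then be decoded with json.loads().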

Known Limitations

  1. Single File Upload: Currently supports one file per message
  2. File Size: 5MB limit (Twilio MMS restriction)
  3. S3 Public Access: Requires bucket to allow public-read for Twilio
  4. No Progress Bar: Upload shows toast notification only
  5. No Preview Before Send: File uploads immediately when selected
  6. No Media Editing: No cropping, resizing, or filtering

Future Enhancements

Potential Improvements

  • Multiple file attachments per message
  • Image compression before upload
  • Thumbnail generation for videos
  • Upload progress bar
  • Preview media before sending
  • Delete/edit sent media
  • Media gallery view
  • Download media from conversation
  • CloudFront CDN for faster delivery
  • Temporary signed URLs instead of public-read
  • Media analytics (views, downloads)
  • Support for more file types (PDF, documents)

Performance Optimizations

  • Client-side image compression
  • Lazy loading for media in long conversations
  • Media caching strategy
  • Batch upload for multiple files
  • WebP conversion for images

Troubleshooting

Media Not Uploading

  1. Check S3 bucket permissions
  2. Verify AWS credentials are set
  3. Check file size (<5MB)
  4. Verify file type is supported
  5. Check browser console for errors

Media Not Displaying in WhatsApp

  1. Verify URL is publicly accessible
  2. Check Twilio account status
  3. Verify media URL in database
  4. Check Twilio delivery logs
  5. Ensure URL uses HTTPS

Media Not Showing in Dashboard

  1. Check media_url field in database
  2. Verify message record exists
  3. Check browser console for CORS errors
  4. Ensure image URL is valid
  5. Test URL directly in browser

S3 Access Denied

  1. Check IAM user permissions
  2. Verify bucket policy allows public-read
  3. Check bucket CORS configuration
  4. Verify AWS credentials

Database Schema

Message Model

from datetime import datetime
from typing import Optional

from sqlmodel import SQLModel


class Message(SQLModel):
    message_id: str            # Primary key
    chat_id: str               # Foreign key
    content: str               # Text content
    sender: str                # "user" or "assistant"
    source: str                # "whatsapp", "whatsapp_manual", etc.
    media_url: Optional[str]   # Public HTTPS URL of media file ✅
    created_at: datetime
    updated_at: datetime

Security Considerations

  1. Public URLs: Media files are publicly accessible (required for Twilio)
  2. Authentication: Upload endpoint requires valid user token
  3. Validation: File type and size validated on both client and server
  4. User Isolation: Files organized by user_id in S3
  5. Temporary Files: Cleaned up immediately after upload
  6. HTTPS Only: All URLs use HTTPS protocol

Performance Impact

  • Upload Time: ~1-3 seconds for typical image (varies by size/connection)
  • S3 Storage: ~$0.023 per GB per month
  • Bandwidth: ~$0.09 per GB transferred
  • Twilio MMS: Varies by destination country
  • Database: Minimal impact (single URL field per message)

Migration Notes

No database migration required - media_url field already exists in Message model.

Support

For issues or questions:

  • Check documentation in /docs/WHATSAPP_INTEGRATION.md
  • Review error logs in CloudWatch/application logs
  • Test with WhatsApp sandbox first
  • Contact development team

Implementation Complete: All core features implemented and tested. Status: Ready for staging deployment and user testing.