AI Interview Proxy 🤖☎️

Personal Project

An experiment in real-time voice AI that explores how natural conversation agents could work in professional contexts. I built it to learn the OpenAI Realtime API and Twilio integration. Although I've configured it with my professional information for the interview use case, the framework can be adapted to any conversational AI application.

A real-time voice AI agent, reachable by phone, that answers questions about professional experience as a practical demonstration of conversational AI.

The Concept

An exploration of conversational AI in a practical context: what if you could call a phone number and have a natural conversation with an AI agent that knows someone's background? The agent responds by voice in real time, answering questions about work history, technical expertise, and career goals, and in doing so shows what voice AI can offer for information delivery.

How It Works

graph LR
    A[Caller] -->|Phone Call| B[Twilio]
    B -->|Audio Stream| C[OpenAI Realtime API]
    C -->|Voice Response| B
    B -->|Audio| A
    D[Agent Instructions] -.->|Context| C
    E[My Resume/Info] -.->|Knowledge Base| D
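
The first hop in the diagram, Twilio handing the call's audio to a streaming endpoint, is usually set up by answering Twilio's incoming-call webhook with TwiML that contains a `<Connect><Stream>` instruction. The following is a minimal sketch of generating that TwiML with the Python standard library only. The function name `build_stream_twiml` and the `wss://example.com/media-stream` URL are placeholders chosen for illustration, not taken from the project's repository.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML asking Twilio to open a bidirectional media stream.

    websocket_url is the (hypothetical) WebSocket endpoint of the bridge
    server that relays audio between Twilio and the voice model.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", {"url": websocket_url})
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URL; a real deployment would use its own public wss:// host.
    print(build_stream_twiml("wss://example.com/media-stream"))
```

A web framework handler would return this string with an XML content type when Twilio requests the voice webhook for the phone number.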

Architecture

Twilio: Handles incoming calls and manages the phone infrastructure
OpenAI Realtime API: Powers natural voice conversation with low latency
Agent Instructions: Custom prompt engineering with my professional information
Real-time Streaming: Bidirectional audio streaming for natural conversation flow
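
The bidirectional streaming described above can be pictured as two small translation steps: caller audio arriving from Twilio is forwarded to the model, and model audio is forwarded back to Twilio. The sketch below shows only that message mapping, without the WebSocket plumbing. It assumes both sides are configured for base64-encoded G.711 μ-law audio so no transcoding is needed. The function names are hypothetical, and the Realtime API event-type names have differed between API versions, so the code accepts two spellings and should be checked against the version in use.

```python
import json
from typing import Optional

# Server event types that carry model audio. The name has varied between
# Realtime API versions, so both spellings are accepted here (assumption).
AUDIO_DELTA_TYPES = {"response.audio.delta", "response.output_audio.delta"}


def twilio_to_openai(raw_message: str) -> Optional[dict]:
    """Map one Twilio Media Streams message to a Realtime API client event."""
    message = json.loads(raw_message)
    if message.get("event") != "media":
        # "connected", "start", "stop", and similar messages carry no audio.
        return None
    return {
        "type": "input_audio_buffer.append",
        "audio": message["media"]["payload"],  # already base64-encoded
    }


def openai_to_twilio(event: dict, stream_sid: str) -> Optional[dict]:
    """Map one Realtime API server event to a Twilio media message."""
    if event.get("type") not in AUDIO_DELTA_TYPES:
        return None
    return {
        "event": "media",
        "streamSid": stream_sid,  # taken from Twilio's earlier "start" message
        "media": {"payload": event["delta"]},
    }
```

A bridge server would run two loops over its two WebSocket connections, passing each incoming message through the matching function and sending any non-`None` result to the other side.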

Why This Project?

Learning Goals: Hands-on experience with emerging voice AI technology

What It Demonstrates:

- Building with OpenAI's Realtime API for low-latency voice interaction
- Integrating telephony infrastructure (Twilio) with AI services
- Designing conversational agents for specific domains
- Creating accessible interfaces (phone vs. web-only)

AI Technologies Used

OpenAI Realtime API: Low-latency voice-to-voice interaction
Prompt Engineering: Crafted instructions to represent my experience accurately
Agent Design: Conversational flow that feels natural and helpful
Twilio Integration: Telephony infrastructure for accessibility
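
To make the prompt-engineering piece concrete, here is a rough sketch of how agent instructions might be assembled from profile data and sent to the model as a `session.update` event. The profile contents are invented placeholders (the real resume and prompt are not public), the helper names are hypothetical, and the session field names follow one version of the Realtime API's session shape, which has changed over time and should be verified against current documentation.

```python
import json

# Placeholder profile for illustration only; not the author's real data.
PROFILE = {
    "name": "Jane Doe",
    "role": "Software Engineer",
    "highlights": ["Built a data pipeline", "Led a small team"],
}


def build_instructions(profile: dict) -> str:
    """Turn profile data into system-style instructions for the agent."""
    highlights = "\n".join(f"- {item}" for item in profile["highlights"])
    return (
        f"You answer interview-style questions on behalf of {profile['name']}, "
        f"a {profile['role']}. Keep a friendly, professional tone.\n"
        f"Known background:\n{highlights}\n"
        "If a question is not covered by this background, say you do not "
        "know rather than guessing."
    )


def build_session_update(instructions: str) -> str:
    """Serialize a session.update client event carrying the instructions."""
    event = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            # Assumed field names; telephony audio is typically G.711 mu-law.
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
        },
    }
    return json.dumps(event)
```

Keeping the profile separate from the instruction template is one way to let the same framework serve other use cases by swapping the data.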

Key Learnings

Technical Wins

Real-time voice AI creates surprisingly natural conversations with minimal latency
Prompt engineering significantly impacts agent personality and accuracy
Phone accessibility is simpler than expected with modern APIs
Bidirectional streaming works smoothly when properly configured

Interesting Challenges

Balancing personality vs. professionalism in agent tone
Handling unexpected questions gracefully outside the knowledge base
Understanding API cost implications for real-time voice streaming
Designing conversation flows that feel natural, not scripted
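
On the cost question, a back-of-the-envelope estimate can help before opening the line to callers. The sketch below is a deliberately rough model: it assumes telephony is billed per minute and approximates the voice model's usage as effective per-minute rates for caller audio (input) and agent audio (output). All rates in the example call are made-up placeholders, not real prices; actual figures should come from the providers' current pricing pages.

```python
def estimate_call_cost(
    minutes: float,
    telephony_per_min: float,
    ai_input_per_min: float,
    ai_output_per_min: float,
    agent_talk_share: float = 0.5,
) -> float:
    """Rough cost of one call under the simplified per-minute model.

    agent_talk_share is the assumed fraction of the call during which the
    agent is speaking; the remainder is treated as caller (input) audio.
    """
    if not 0.0 <= agent_talk_share <= 1.0:
        raise ValueError("agent_talk_share must be between 0 and 1")
    telephony = minutes * telephony_per_min
    ai_input = minutes * (1.0 - agent_talk_share) * ai_input_per_min
    ai_output = minutes * agent_talk_share * ai_output_per_min
    return telephony + ai_input + ai_output


if __name__ == "__main__":
    # Placeholder rates for illustration only; these are not real prices.
    cost = estimate_call_cost(
        minutes=10,
        telephony_per_min=0.01,
        ai_input_per_min=0.05,
        ai_output_per_min=0.20,
    )
    print(f"Estimated cost: {cost:.2f}")
```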

Voice AI Ecosystem

This project sparked my interest in the rapidly evolving voice AI space. While I chose Twilio for this implementation due to its robust telephony infrastructure and flexibility, the landscape offers several compelling options:

Current Implementation:

- Twilio: Reliable phone infrastructure with full control over the integration

Exploring Next:

- Vapi: Newer platform that significantly simplifies voice AI integration with built-in OpenAI support
- ElevenLabs: Advanced voice synthesis for more natural-sounding agents
- SoundHound: Voice AI platform with strong NLU capabilities

The modular design of this project makes it straightforward to swap or integrate different voice providers as the ecosystem evolves.
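
As an illustration of what provider swapping could look like, the sketch below defines a small interface and a registry keyed by provider name. This is not the repository's actual design; `VoiceProvider`, `EchoProvider`, `register`, and `get_provider` are hypothetical names, and the stand-in provider returns text bytes rather than real audio.

```python
from typing import Protocol


class VoiceProvider(Protocol):
    """Hypothetical interface that each voice backend would satisfy."""

    name: str

    def synthesize(self, text: str) -> bytes:
        """Return audio bytes for the given text."""
        ...


class EchoProvider:
    """Stand-in provider that returns UTF-8 bytes instead of real audio."""

    name = "echo"

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


# Registry mapping provider names to provider instances.
PROVIDERS: dict[str, VoiceProvider] = {}


def register(provider: VoiceProvider) -> None:
    """Make a provider selectable by its name."""
    PROVIDERS[provider.name] = provider


def get_provider(name: str) -> VoiceProvider:
    """Look up a provider, failing with a clear message if it is unknown."""
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown voice provider: {name!r}") from None


register(EchoProvider())
```

With this shape, the rest of the application would depend only on `get_provider(...)`, so trying a different backend means registering one more class and changing a configuration value.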

Future Enhancements

Experiment with Vapi for simplified AI integration
Integrate ElevenLabs for higher quality voice synthesis
Add conversation summaries and analytics
Multi-language support for global accessibility
Calendar scheduling integration

Try It Yourself

The code is open source and can be configured with any prompt/instructions for your own use case.

Privacy & Access

The GitHub repository contains the framework and integration code. My personal resume information and prompt instructions are not included in the public repo.

The phone number is available on request rather than posted publicly, which helps manage API costs: both Twilio and the OpenAI Realtime API have usage-based pricing.

Want to experience it? Contact me for the phone number to try the live demo.
