
AI Interview Proxy 🤖☎️

Personal Project

An experiment in real-time voice AI that explores how natural conversation agents could work in professional contexts. I built it to learn the OpenAI Realtime API and how to integrate it with Twilio. I've configured it with my professional information for the interview use case, but the framework can be adapted to any conversational AI application.

A real-time voice AI agent accessible via phone—demonstrating conversational AI with a practical example of answering questions about professional experience.

The Concept

An exploration of conversational AI in a practical context: what if you could call a phone number and have a natural conversation with an AI agent that knows someone's background? The agent responds in real-time voice, answering questions about work history, technical expertise, and career goals—demonstrating the potential of voice AI for information delivery.

How It Works

```mermaid
graph LR
    A[Caller] -->|Phone Call| B[Twilio]
    B -->|Audio Stream| C[OpenAI Realtime API]
    C -->|Voice Response| B
    B -->|Audio| A
    D[Agent Instructions] -.->|Context| C
    E[My Resume/Info] -.->|Knowledge Base| D
```

Architecture

  • Twilio: Handles incoming calls and manages the phone infrastructure
  • OpenAI Realtime API: Powers natural voice conversation with low latency
  • Agent Instructions: Custom prompt engineering with my professional information
  • Real-time Streaming: Bidirectional audio streaming for natural conversation flow
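
To make the Twilio side of this architecture concrete, here is a minimal sketch of the TwiML a call webhook could return so that Twilio opens a bidirectional Media Stream to a WebSocket server. The `<Connect>` and `<Stream>` verbs are standard TwiML; the function name and the `wss://example.com/media-stream` URL are placeholders I made up for illustration, not names from the project's repository.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML that asks Twilio to stream call audio to websocket_url."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # <Connect><Stream> sets up a two-way audio stream for the call.
    ET.SubElement(connect, "Stream", {"url": websocket_url})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    # Placeholder URL: replace with the address of your own WebSocket server.
    print(build_stream_twiml("wss://example.com/media-stream"))
```

A web framework handler would return this string with an XML content type when Twilio requests the voice webhook for an incoming call.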

Why This Project?

Learning Goals: Hands-on experience with emerging voice AI technology

What It Demonstrates:

  • Building with OpenAI's Realtime API for low-latency voice interaction
  • Integrating telephony infrastructure (Twilio) with AI services
  • Designing conversational agents for specific domains
  • Creating accessible interfaces (phone vs. web-only)

AI Technologies Used

  • OpenAI Realtime API: Low-latency voice-to-voice interaction
  • Prompt Engineering: Crafted instructions to represent my experience accurately
  • Agent Design: Conversational flow that feels natural and helpful
  • Twilio Integration: Telephony infrastructure for accessibility
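
As an illustration of how the prompt and the telephony audio format come together, the sketch below builds a `session.update` event for the Realtime API. The field names follow the Realtime API session shape as I understand it and may differ between API versions, so check them against the current reference. The instruction wording, the `build_session_update` helper, and the sample profile are assumptions for the example, not the project's actual prompt.

```python
import json


def build_session_update(profile_text: str, voice: str = "alloy") -> str:
    """Build a session.update event carrying the agent instructions."""
    # Hypothetical instruction text; the real prompt is not in the public repo.
    instructions = (
        "You answer phone questions about one person's professional "
        "background. Use only the profile below. If a question falls "
        "outside it, say you don't have that information.\n\n"
        f"PROFILE:\n{profile_text}"
    )
    event = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "voice": voice,
            # Twilio Media Streams carry 8 kHz mu-law audio, so both
            # directions are set to G.711 mu-law to avoid transcoding.
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            # Let the server detect when the caller stops speaking.
            "turn_detection": {"type": "server_vad"},
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    print(build_session_update("Ten years of backend engineering."))
```

The explicit fallback sentence in the instructions is one simple way to keep the agent from improvising answers to questions outside its knowledge base.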

Key Learnings

Technical Wins

  • Real-time voice AI creates surprisingly natural conversations with minimal latency
  • Prompt engineering significantly impacts agent personality and accuracy
  • Phone accessibility is simpler than expected with modern APIs
  • Bidirectional streaming works smoothly when properly configured
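
The bidirectional streaming mentioned above mostly comes down to translating messages between two WebSocket connections. The sketch below shows only that translation step, with no networking. It assumes Twilio `media` messages with a base64 payload and the Realtime API event names `input_audio_buffer.append` and `response.audio.delta`; the latter in particular may be named differently depending on the API version, so treat both function names and event names as things to verify rather than as the project's actual code.

```python
import json
from typing import Optional


def twilio_to_openai(raw: str) -> Optional[str]:
    """Convert a Twilio media message into an audio-append event."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return None  # ignore start, stop, mark, and other messages
    return json.dumps(
        {"type": "input_audio_buffer.append", "audio": msg["media"]["payload"]}
    )


def openai_to_twilio(raw: str, stream_sid: str) -> Optional[str]:
    """Convert a Realtime audio delta into a Twilio media message."""
    event = json.loads(raw)
    if event.get("type") != "response.audio.delta":
        return None  # ignore transcripts, session events, and so on
    return json.dumps(
        {
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": event["delta"]},
        }
    )
```

Because both sides exchange base64-encoded audio, the payload can be passed through unchanged as long as the session is configured for the same audio format that Twilio sends.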

Interesting Challenges

  • Balancing personality vs. professionalism in agent tone
  • Handling unexpected questions gracefully outside the knowledge base
  • Understanding API cost implications for real-time voice streaming
  • Designing conversation flows that feel natural, not scripted

Voice AI Ecosystem

This project sparked my interest in the rapidly evolving voice AI space. While I chose Twilio for this implementation due to its robust telephony infrastructure and flexibility, the landscape offers several compelling options:

Current Implementation:

  • Twilio: Reliable phone infrastructure with full control over the integration

Exploring Next:

  • Vapi: Newer platform that significantly simplifies voice AI integration with built-in OpenAI support
  • ElevenLabs: Advanced voice synthesis for more natural-sounding agents
  • SoundHound: Voice AI platform with strong NLU capabilities

The modular design of this project makes it straightforward to swap or integrate different voice providers as the ecosystem evolves.
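
One way such modularity could look in code is a small interface that each voice provider implements, with the rest of the application depending only on that interface. This is a hypothetical sketch: `VoiceBackend`, `EchoBackend`, and `get_backend` are names invented for the example and do not describe the repository's actual structure.

```python
from typing import Dict, Protocol


class VoiceBackend(Protocol):
    """Anything that turns caller audio into response audio."""

    def respond(self, caller_audio: bytes) -> bytes: ...


class EchoBackend:
    """Stand-in backend for local testing: returns the audio unchanged."""

    def respond(self, caller_audio: bytes) -> bytes:
        return caller_audio


# Registry of available backends; a real provider adapter would be added here.
BACKENDS: Dict[str, VoiceBackend] = {"echo": EchoBackend()}


def get_backend(name: str) -> VoiceBackend:
    """Look up a backend by name, failing loudly on unknown names."""
    try:
        return BACKENDS[name]
    except KeyError:
        raise ValueError(f"Unknown voice backend: {name!r}") from None
```

Swapping providers then means registering another class with the same `respond` method and changing one configuration value.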

Future Enhancements

  • Experiment with Vapi for simplified AI integration
  • Integrate ElevenLabs for higher quality voice synthesis
  • Add conversation summaries and analytics
  • Multi-language support for global accessibility
  • Calendar scheduling integration

Try It Yourself

The code is open source and can be configured with any prompt/instructions for your own use case.

Privacy & Access

The GitHub repository contains the framework and integration code. My personal resume information and prompt instructions are not included in the public repo.

The phone number is available upon request rather than posted publicly so that I can manage API costs: both Twilio and the OpenAI Realtime API use usage-based pricing.
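
To show why call volume matters, here is a rough back-of-the-envelope estimator. The rates in the usage example are placeholders, not real Twilio or OpenAI prices, and treating the AI cost as a flat per-minute figure is a simplification, since usage-based pricing may be metered in other units. Check each provider's pricing page for actual numbers.

```python
def estimate_call_cost(
    minutes: float,
    telephony_rate_per_min: float,
    ai_rate_per_min: float,
) -> float:
    """Estimate the cost of one call from per-minute rates."""
    return round(minutes * (telephony_rate_per_min + ai_rate_per_min), 4)


def calls_within_budget(budget: float, cost_per_call: float) -> int:
    """Return how many whole calls fit inside a fixed budget."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")
    return int(budget // cost_per_call)


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    per_call = estimate_call_cost(
        5, telephony_rate_per_min=0.01, ai_rate_per_min=0.10
    )
    print(per_call, calls_within_budget(20.0, per_call))
```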

Want to experience it? Contact me for the phone number to try the live demo.
