AI Interview Proxy
Personal Project
An experiment in real-time voice AI, exploring how natural conversation agents could work in professional contexts. I built it to learn the OpenAI Realtime API and Twilio integration. While I've configured it with my professional information for the interview use case, the framework can be adapted for any conversational AI application.
A real-time voice AI agent accessible via phone, demonstrating conversational AI with a practical example: answering questions about professional experience.
The Concept
An exploration of conversational AI in a practical context: what if you could call a phone number and have a natural conversation with an AI agent that knows someone's background? The agent responds in real-time voice, answering questions about work history, technical expertise, and career goals, which demonstrates the potential of voice AI for information delivery.
How It Works
graph LR
A[Caller] -->|Phone Call| B[Twilio]
B -->|Audio Stream| C[OpenAI Realtime API]
C -->|Voice Response| B
B -->|Audio| A
D[Agent Instructions] -.->|Context| C
E[My Resume/Info] -.->|Knowledge Base| D
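The first hop in the diagram (Twilio handing the call's audio to a streaming endpoint) is typically configured by answering Twilio's incoming-call webhook with TwiML that contains a `<Connect><Stream>` element. The sketch below builds that TwiML with the standard library only. The function name `build_stream_twiml` and the WebSocket URL are illustrative assumptions, not taken from this project's code.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML that connects an incoming call to a bidirectional media stream.

    `websocket_url` is a hypothetical wss:// endpoint served by your own app.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # <Stream> tells Twilio where to open the WebSocket carrying call audio.
    ET.SubElement(connect, "Stream", {"url": websocket_url})
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(build_stream_twiml("wss://example.com/media-stream"))
```

A web framework handler would return this string with an XML content type when Twilio requests the webhook for an incoming call.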
Architecture
- Twilio: Handles incoming calls and manages the phone infrastructure
- OpenAI Realtime API: Powers natural voice conversation with low latency
- Agent Instructions: Custom prompt engineering with my professional information
- Real-time Streaming: Bidirectional audio streaming for natural conversation flow
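The bidirectional streaming in the list above comes down to translating JSON messages between two WebSockets. A minimal sketch of that translation follows, assuming Twilio's Media Streams message shape (`event`, `streamSid`, `media.payload`) and the Realtime API's `input_audio_buffer.append` and audio delta events. The helper names are hypothetical, and the event names are written from memory of the public docs, so verify them against the API version you use.

```python
import json
from typing import Optional

# Assumption: the audio delta event has been named differently across
# Realtime API versions, so both spellings are accepted here.
AUDIO_DELTA_TYPES = {"response.audio.delta", "response.output_audio.delta"}


def twilio_to_openai(twilio_message: str) -> Optional[str]:
    """Convert a Twilio 'media' message into a Realtime audio-append event."""
    msg = json.loads(twilio_message)
    if msg.get("event") != "media":
        return None  # ignore 'start', 'stop', 'mark', and similar events
    return json.dumps(
        {"type": "input_audio_buffer.append", "audio": msg["media"]["payload"]}
    )


def openai_to_twilio(openai_message: str, stream_sid: str) -> Optional[str]:
    """Convert a Realtime audio delta event into a Twilio 'media' message."""
    event = json.loads(openai_message)
    if event.get("type") not in AUDIO_DELTA_TYPES:
        return None  # non-audio events are not forwarded to the caller
    return json.dumps(
        {"event": "media", "streamSid": stream_sid, "media": {"payload": event["delta"]}}
    )
```

Both sides carry base64-encoded audio, so the payload can pass through unchanged as long as the two services are configured for the same audio format.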
Why This Project?
Learning Goals: Hands-on experience with emerging voice AI technology
What It Demonstrates:
- Building with OpenAI's Realtime API for low-latency voice interaction
- Integrating telephony infrastructure (Twilio) with AI services
- Designing conversational agents for specific domains
- Creating accessible interfaces (phone vs. web-only)
AI Technologies Used
- OpenAI Realtime API: Low-latency voice-to-voice interaction
- Prompt Engineering: Crafted instructions to represent my experience accurately
- Agent Design: Conversational flow that feels natural and helpful
- Twilio Integration: Telephony infrastructure for accessibility
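Agent instructions usually reach the Realtime API as a `session.update` event sent once the WebSocket opens. The sketch below shows one plausible payload. The field names follow the session shape recalled from the beta Realtime documentation and may differ in newer API versions. The `g711_ulaw` format is an assumption chosen because phone audio is commonly mu-law encoded, and `build_session_update` is a hypothetical helper.

```python
import json


def build_session_update(instructions: str, voice: str = "alloy") -> str:
    """Build a session.update event carrying the agent's prompt and audio settings.

    Assumption: field names match the beta Realtime API session object.
    """
    return json.dumps(
        {
            "type": "session.update",
            "session": {
                "instructions": instructions,
                "voice": voice,
                # Matching telephony audio on both sides avoids transcoding.
                "input_audio_format": "g711_ulaw",
                "output_audio_format": "g711_ulaw",
            },
        }
    )


if __name__ == "__main__":
    # Placeholder prompt for illustration only.
    print(build_session_update("Answer questions about the candidate's background."))
```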
Key Learnings
Technical Wins
- Real-time voice AI creates surprisingly natural conversations with minimal latency
- Prompt engineering significantly impacts agent personality and accuracy
- Phone accessibility is simpler than expected with modern APIs
- Bidirectional streaming works smoothly when properly configured
Interesting Challenges
- Balancing personality vs. professionalism in agent tone
- Handling unexpected questions gracefully outside the knowledge base
- Understanding API cost implications for real-time voice streaming
- Designing conversation flows that feel natural, not scripted
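The cost point above can be made concrete with a rough per-call estimate. This is a deliberately simplified sketch: it assumes both services can be approximated by a flat per-minute rate, which is not how every provider bills (token-based pricing, for example, varies with how much is said). The function name and all rates are placeholders to be replaced with figures from the providers' current pricing pages.

```python
def estimate_call_cost(
    minutes: float, telephony_rate_per_min: float, ai_rate_per_min: float
) -> float:
    """Estimate the cost of one call under a flat per-minute approximation.

    Both rates are caller-supplied; no real prices are built in.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * (telephony_rate_per_min + ai_rate_per_min), 4)


if __name__ == "__main__":
    # Made-up rates for illustration only; not actual Twilio or OpenAI prices.
    print(estimate_call_cost(10, telephony_rate_per_min=0.01, ai_rate_per_min=0.25))
```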
Voice AI Ecosystem
This project sparked my interest in the rapidly evolving voice AI space. While I chose Twilio for this implementation due to its robust telephony infrastructure and flexibility, the landscape offers several compelling options:
Current Implementation:
- Twilio: Reliable phone infrastructure with full control over the integration
Exploring Next:
- Vapi: Newer platform that significantly simplifies voice AI integration with built-in OpenAI support
- ElevenLabs: Advanced voice synthesis for more natural-sounding agents
- SoundHound: Voice AI platform with strong NLU capabilities
The modular design of this project makes it straightforward to swap or integrate different voice providers as the ecosystem evolves.
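One way such a swap could be structured is a small provider interface plus a registry keyed by name. The sketch below is purely illustrative: `VoiceProvider`, `LoopbackProvider`, and `get_provider` are hypothetical names and do not describe the repository's actual layout.

```python
from typing import Callable, Dict, Protocol


class VoiceProvider(Protocol):
    """Hypothetical interface that every voice backend would satisfy."""

    name: str

    def respond(self, caller_audio: bytes) -> bytes:
        """Take a chunk of caller audio and return audio to play back."""
        ...


class LoopbackProvider:
    """Stand-in backend that echoes audio; useful for wiring tests."""

    name = "loopback"

    def respond(self, caller_audio: bytes) -> bytes:
        return caller_audio


# Registry mapping a config value to a provider factory.
PROVIDERS: Dict[str, Callable[[], VoiceProvider]] = {"loopback": LoopbackProvider}


def get_provider(name: str) -> VoiceProvider:
    """Look up and instantiate a provider by its configured name."""
    try:
        factory = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"Unknown voice provider: {name!r}") from None
    return factory()
```

With this shape, adding a new backend means writing one class and registering it, while the call-handling code only depends on the `respond` method.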
Future Enhancements
- Experiment with Vapi for simplified AI integration
- Integrate ElevenLabs for higher quality voice synthesis
- Add conversation summaries and analytics
- Add multi-language support for global accessibility
- Add calendar scheduling integration
Try It Yourself
The code is open source and can be configured with any prompt/instructions for your own use case.
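Keeping the prompt outside the codebase is one common way to make the instructions configurable. A minimal sketch follows, assuming the instructions live in a text file whose path is supplied through an environment variable. The variable name `AGENT_INSTRUCTIONS_FILE`, the default prompt, and `load_instructions` are hypothetical and not taken from the repository.

```python
import os
from pathlib import Path

# Generic fallback used when no custom prompt file is configured.
DEFAULT_INSTRUCTIONS = "You are a helpful voice assistant. Keep answers brief."


def load_instructions(env_var: str = "AGENT_INSTRUCTIONS_FILE") -> str:
    """Load agent instructions from the file named by `env_var`, if present.

    Falls back to DEFAULT_INSTRUCTIONS when the variable is unset, the file
    is missing, or the file is empty.
    """
    path = os.environ.get(env_var)
    if path and Path(path).is_file():
        text = Path(path).read_text(encoding="utf-8").strip()
        if text:
            return text
    return DEFAULT_INSTRUCTIONS
```

This keeps private prompt content out of version control while the public code still runs with a sensible default.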
Privacy & Access
The GitHub repository contains the framework and integration code. My personal resume information and prompt instructions are not included in the public repo.
The phone number is available upon request rather than posted publicly to manage API costs, since both Twilio and the OpenAI Realtime API have usage-based pricing.
Want to experience it? Contact me for the phone number to try the live demo.