Show HN: ByePhone - An AI assistant to automate tedious phone calls

5 points | gitpullups | 8 days ago | byephone.io

I have a bit of phone anxiety and a ton of dread around making phone calls to restaurants, banks, doctors, and so on.

I thought: AI could do this with a web form turned into a prompt.

The stack started out simple (ElevenLabs for voice + Claude + Twilio), but it actually got rather complex, even though I tried vibe coding most of it.

First off, finding the phone numbers quickly is hard. I do this by scraping the web with a basic DuckDuckGo search and structuring the results with OpenAI calls.

Second, collecting the right information. I’m still struggling a bit with this, but the architecture is:

A) the user puts in a call objective and a business name

B) if keywords are detected, spin up one of the default form categories

C) if not, get structured JSON from gpt-4o-mini and turn it into a React form
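A rough Python sketch of that A/B/C routing, for illustration only. The category names, keywords, field lists, and the `llm_schema` callable are all hypothetical stand-ins rather than the actual ByePhone code; in the real app the fallback would be the gpt-4o-mini call and the result would drive a React form.

```python
import json
from typing import Callable

# Hypothetical default categories; keywords and fields are made up for this sketch.
DEFAULT_FORMS = {
    "restaurant": {
        "keywords": ("reservation", "table", "book"),
        "fields": ["date", "time", "party_size"],
    },
    "doctor": {
        "keywords": ("appointment", "checkup", "doctor"),
        "fields": ["patient_name", "preferred_date", "reason"],
    },
}


def pick_form(objective: str, llm_schema: Callable[[str], str]) -> dict:
    """Return a form spec: a default category on keyword match, else an LLM-built one."""
    text = objective.lower()
    # Step B: a keyword hit selects a predefined form, with no model call needed.
    for category, spec in DEFAULT_FORMS.items():
        if any(keyword in text for keyword in spec["keywords"]):
            return {"category": category, "fields": list(spec["fields"])}
    # Step C: otherwise ask the model for JSON shaped like {"fields": [...]}.
    # llm_schema is a placeholder for whatever client call produces that string.
    data = json.loads(llm_schema(objective))
    fields = [f for f in data.get("fields", []) if isinstance(f, str)]
    return {"category": "custom", "fields": fields}
```

Passing the model call in as a function keeps the keyword path testable without any network access, and dropping non-string field names is a cheap guard against malformed model output.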

The cost of making a single call spun out of control, but luckily Sonnet can handle a lot of the calls and I’m OK paying for Twilio.

Ended up taking months to build my week-long project because of course.

It’s still WIP, so feel free to email me at galcohavy@ucla.edu with any ideas or issues you ran into.

3 comments

PranayKumarJain|7 days ago

Nice work — real-time voice plumbing always looks “simple” until you build it.

A few things that helped us keep cost + complexity sane on similar voice-agent flows:

- Treat the call as a state machine (collect slots -> confirm -> execute). Don’t let the LLM free-run every turn; use small models for routing/slot-filling, escalate only on ambiguity.
- Put hard guardrails on “thinking”: max tokens/turn + short system prompts. It’s shocking how often cost is prompt bloat + retry loops.
- If you’re using Twilio, Media Streams + a streaming STT/TTS loop reduces latency and avoids “LLM per sentence” patterns.
- Phone-number discovery: try a tiered approach (cached business DB / Places API / fallback scrape) and cache aggressively; scraping every time is where it gets gnarly.
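To make the first point concrete, here is a minimal Python sketch of a collect -> confirm -> execute flow. The class, state names, and slot names are assumptions made up for illustration; a real agent would feed `slot_updates` from a small slot-filling model and place the call in the execute step.

```python
from enum import Enum, auto


class State(Enum):
    COLLECT = auto()
    CONFIRM = auto()
    EXECUTE = auto()
    DONE = auto()


class CallFlow:
    """Hypothetical call flow: the code, not the LLM, decides what happens next."""

    def __init__(self, required_slots):
        self.required = list(required_slots)
        self.slots = {}
        self.state = State.COLLECT

    def missing(self):
        """Slots still needed before the flow can move to confirmation."""
        return [s for s in self.required if s not in self.slots]

    def step(self, slot_updates=None, confirmed=None):
        """Advance one turn and return the new state."""
        if self.state is State.COLLECT:
            # Accept only known slots; anything else from the model is ignored.
            updates = slot_updates or {}
            self.slots.update({k: v for k, v in updates.items() if k in self.required})
            if not self.missing():
                self.state = State.CONFIRM
        elif self.state is State.CONFIRM:
            if confirmed is True:
                self.state = State.EXECUTE
            elif confirmed is False:
                # Rejected: start collecting again from scratch.
                self.slots.clear()
                self.state = State.COLLECT
        elif self.state is State.EXECUTE:
            # A real agent would perform the action here before finishing.
            self.state = State.DONE
        return self.state
```

For example, `CallFlow(["date", "time"])` stays in `COLLECT` until both slots arrive, then waits in `CONFIRM` for an explicit yes or no, so the model is only ever asked to fill slots rather than to steer the whole call.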

We build production voice agents at eboo.ai and have hit the same Twilio + latency + cost cliffs — happy to share patterns if you want to compare notes.

gitpullups|7 days ago

is this an ad? written by an LLM?

gitpullups|8 days ago

Still on the roadmap: scheduling phone calls and changing your agent's voice beyond just gender. Both should be done within a few days.