YaraDori | 8 days ago
Two reliability questions: 1) How do you handle UI drift, A/B layouts, and dynamic text (e.g., different font sizes, localization)? Do you anchor on vision + OCR only, or do you also incorporate accessibility tree / UIAutomation hints where possible? 2) Do you have a concept of checkpoints + recovery (e.g., if a tap misfires, can the agent detect that it’s on the wrong screen and roll back / retry)?
I’m working on SkillForge (https://skillforge.expert). We’re doing something adjacent for web UIs: record a workflow once, then replay it later as a reusable skill with checkpoints + retries. Curious whether you’ve considered exporting a “skill file” for iPhone flows (step evidence + approved boundaries), so people can share and reuse automations safely.
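To make the checkpoint question concrete, here is a rough sketch of the kind of loop I mean. Everything in it is hypothetical (`Step`, `run_steps`, the `checkpoint` and `rollback` callbacks); it is not SkillForge's format or any real automation API, just the general shape of "act, verify the screen, roll back and retry if it's wrong":

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One replayable step. All fields are illustrative assumptions."""
    name: str
    action: Callable[[], None]      # e.g. perform a tap
    checkpoint: Callable[[], bool]  # True if the expected screen is showing
    rollback: Callable[[], None]    # e.g. navigate back to the prior screen


def run_steps(steps: List[Step], max_retries: int = 2) -> List[str]:
    """Run steps in order; verify each one and retry after a rollback."""
    log: List[str] = []
    for step in steps:
        for attempt in range(1, max_retries + 2):
            step.action()
            if step.checkpoint():
                log.append(f"{step.name}: ok (attempt {attempt})")
                break
            # Wrong screen: undo the misfire before trying again.
            step.rollback()
            log.append(f"{step.name}: wrong screen, rolled back (attempt {attempt})")
        else:
            # Every attempt failed its checkpoint; stop instead of continuing blind.
            raise RuntimeError(
                f"{step.name}: checkpoint failed after {max_retries + 1} attempts"
            )
    return log
```

The log of attempts is roughly what I mean by "step evidence": a record of what was tried and what the checkpoint saw, which could travel with a shared skill.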