item 40201467

idan | 1 year ago

So, part of the trickiness here is that there are a few different moving pieces that have to cooperate for success to happen.

There needs to be a great UX to elicit context from the human. For anything beyond a trivial task, expecting the AI to read our minds is not a fruitful strategy.

Then there needs to be steerability: it's not enough to get the human to cough up context, you also have to let the human correct the model's understanding of the current state and the job to be done. How do you do that in a way that feels natural?

Finally, all this needs to be defensive against model misses: what happens when the suggestion is wrong? Sure, in the future the models will be better and correct more often. But right now, we need to design for fallibility, and make it cheap to ignore a suggestion when it's wrong.

All of those together add up to a complex challenge that has nothing to do with the prompting, the backend, the model, etc. Figuring out a good UX is EXACTLY how we make it a useful tool, because in our experience, the better a job we do at capturing context and making it steerable, the more it integrates the thinking you would have stopped to do anyway, but which had no rigorous UX to trigger it.


a_t48 | 1 year ago

Yeah, to be clear, I think Copilot Workspace is a great start. I wonder if the future is multi-modal, though. Ignoring how obnoxious it would be to anyone near me, I could foresee narrating my stream of thoughts into the mic while using the keyboard to actually write code. It would still depend on me being able to accurately describe what I want, but it might free me from having to context-switch to writing docs to hint the LLM.

idan | 1 year ago

I mean we explored that a little with Copilot Voice :D https://githubnext.com/projects/copilot-voice/

But yeah, the important part is capturing your intent, regardless of modality. We're very excited about vision, in particular. Say you paste a screenshot or a sketch into your issue...