
Apple releases MGIE, an AI-based image editing model

118 points | gnicholas | 2 years ago | appleinsider.com

108 comments

[+] nickthegreek | 2 years ago
I was not expecting Apple to release any open source generative AI product. So the big players in the download-it-yourself gang are Meta and Apple? I don't think anyone would have taken that bet when OpenAI blew up.
[+] jedberg | 2 years ago
Apple makes a ton of sense. They sell hardware (and services for that hardware). It would make sense that they want to commoditize foundation models while at the same time creating yet another reason to buy their most powerful hardware.

Meta is the real wildcard here. They're betting on "commoditize your complement" and hoping that by making GenAI training nearly free (by doing it for you), none of their competitors can use GenAI models to gain an advantage over them.

[+] frankfrank13 | 2 years ago
They do have quite a few open source projects. I don't think they would open source something like Llama, but something on this scale makes sense.
[+] barrkel | 2 years ago
The code repo this article is based on - https://github.com/apple/ml-mgie - is built around InstructPix2Pix (which is based on Stable Diffusion) and LLaVA (a multi-modal LLM that supports vision input).

It's research code, not a product. I don't think Apple would ever productionize anything with data lineage from Stable Diffusion.
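The two-stage design described above (a LLaVA-style multimodal LLM expands a terse user request into an expressive instruction, which then conditions an InstructPix2Pix-style diffusion editor) can be sketched in plain Python. All names below are illustrative stand-ins, not the actual ml-mgie API:

```python
# Hypothetical sketch of the MGIE-style pipeline; function names and the
# instruction template are invented for illustration, not Apple's code.

def expand_instruction(request: str) -> str:
    # Stand-in for the multimodal LLM: rewrite the terse request into an
    # explicit, visually grounded edit instruction for the diffusion model.
    return f"Edit the photo so that {request.rstrip('.')}."

def edit_image(image: bytes, request: str) -> bytes:
    # Stand-in for the diffusion editor: condition on the expanded
    # instruction and denoise toward the edited result.
    instruction = expand_instruction(request)
    # ... an InstructPix2Pix-style denoising loop would run here,
    # guided by `instruction` and the input pixels ...
    return image  # placeholder: a real editor returns edited pixels

edited = edit_image(b"raw-jpeg-bytes", "the sky looks more dramatic")
```

The point of the split is that the diffusion model never sees the user's vague wording directly; the MLLM acts as a translator from intent to an instruction the editor can follow.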

[+] devinprater | 2 years ago
This is going to be really cool: being able to have pictures described with AI and then edit them using text, like "underline this part of the screenshot," or something like that. Especially with LLaVA 1.6 and Apple's other model that understands spatial parts of images, this should be pretty possible for me, as a blind person, to do. So yeah, I think I'll wait for this September's iPhone.
[+] gcr | 2 years ago
The significant part of this, for me, is not the image model. It's that Apple rarely publishes in the vision field. Back at NeurIPS (2016, I want to say?) Apple held a workshop where they said "we're going to engage more with the academic community, we're going to publish more," but since then only a small number of CV projects have seen the academic light of day. I remember one on detection of flashing lights, a whitepaper/tech report on Hey Siri wakeword detection, some interesting synthetic image generation for iris recognition, and now this.

If I had to guess, I think Apple Research may have some institutional obstacles to getting research work out there. Perhaps this could indicate those walls are continuing to dissolve bit by bit?

[+] imjonse | 2 years ago
They have at least two recent papers (MobileOne and FastViT) with code and weights for low-latency (as measured on iPhone) vision backbones. Also AIM (Autoregressive Image Models) released last month.
[+] jahabrewer | 2 years ago
Important clarification from reading TFA for a second: image _editing_. I can't believe Apple would sully their image with actual generation.
[+] ericmcer | 2 years ago
It feels like Apple has been very quiet about their approach in some of the future spaces (self-driving, VR/AR, AI) but then they suddenly burst out with something like Vision Pro.

It seems like they are biding their time and avoiding the hype, knowing that they don't need to get there first. If they can produce a superior product, their brand will let them dominate those markets whenever they decide to release.

[+] rchaud | 2 years ago
> It seems like they are biding their time and avoiding the hype, knowing that they don't need to get there first

This idea that Apple gets to things late in order to "perfect them" is an artifact of the iPod/iPhone era. Nothing Apple has put out since then has met that bar.

Apple does things low-key these days because it has to. Remember the Apple Maps fiasco? A few memes were all it took to send everyone scurrying back to GMaps for a decade.

[+] gnicholas | 2 years ago
> If they can produce a superior product their brand will let them dominate those markets whenever they decide to release.

I anticipate that Apple's Pro devices (iPhone, MacBooks) will have beefier hardware designed specifically for this use case. My guess is they will sell a much higher proportion of Pro iPhones as a result. They'll probably also sell a lot more new devices overall, contra the trend toward slower upgrade cycles.

At the same time, I think third-party companies will find a way to do pretty decent on-device (and very good cloud-based) AI, so it will be a tradeoff in terms of speed and privacy. If you want the best speed and no data being shared with a cloud-based AI provider, the Pro devices will be aimed at you.

[+] JohnFen | 2 years ago
Apple has always been aware that "first mover advantage" is largely bullshit. "The pioneers get all the arrows." The sweet spot is to be second or third to market.
[+] rujkking | 2 years ago
Someone just noticed Apple's decades-old SOP: watch the aggregate fumble about and fail over and over, then take the few ideas that survive rapid, aimless iteration and add the last mile of polish.

A decade of crappy Palm, WinCE, and easily forgotten Java-based mobile devices came and went before the iPhone.

They know which software people will refuse to accept; it's the hardware experience that matters, and they'll wait until the hardware is there.

Modern hardware is responsible for AI, more reliable networks, and the rest of modern compute. It has little to do with the bloated software stacks we git pull into the data center.

Edit: Also consider that Intel and Apple don't just make a consumer device; they design an advanced manufacturing pipeline. That's the difference between print("hello world") and for index in array: print(contents of index).

That's the basis of their value, why they are propped up by government, and why software-focused startups will always just be pump-and-dump schemes.

[+] seydor | 2 years ago
Yet every review calls the Vision Pro a "work in progress". I don't buy this.

I also wonder why they broke with their naming scheme ("Pro" here isn't the Pro version of any smaller model).

[+] RyanHamilton | 2 years ago
What makes you think the Vision Pro is dominating? Unlike the iPhone etc., it has no distinguishing feature I know of.
[+] _ncyj | 2 years ago
I’m curious why so many AI papers consist mostly of people with Chinese-like names. I’m surprised there aren’t more witch hunts from the far right & skepticism that they can embed CCP propaganda into the models.
[+] Am4TIfIsER0ppos | 2 years ago
Why would we, the right-wing conspiracy nuts, need Chinese researchers to do that when there are so many communists in America and Europe who do it anyway?
[+] dosinga | 2 years ago
Interesting read, but then suddenly there's this passive-aggressive paragraph:

> Apple stock has taken a beating as of late, in part because analysts have loudly proclaimed that the company is behind Meta, Google, and Microsoft in generative AI implementation. It's not clear why this wasn't a problem when it wasn't first to a mobile phone, a tablet, a smartwatch, or a VR headset, but is with generative AI.

somebody is hurting

[+] system16 | 2 years ago
I also found that an odd statement. If you need a “why” analysts may not feel confident about Apple’s ability to deliver when it comes to AI, look no further than Siri and Apple Maps.
[+] pksebben | 2 years ago
I dunno, the tone may be what it is but I kinda see the message re: analysts and their forecasts. Beat the managers, buy the index, yaknow.
[+] gitfan86 | 2 years ago
No one is talking about how Joi is the killer app for the Vision Pro.