Source code? I really, really want to try something like this but I've never done GP and I'm not sure where to start.
In lieu of the source code, can anyone point me to a reference on GP and maybe something about image generation in C? In particular, what's an efficient graphics pipeline for converting all of these polygons to pixels? Something like Bresenham's algorithm for the edges and then additive coloring in the middle? And then how do I convert an RGB pixel array into some reasonable image format? I apologize for my ignorance; I don't even know what to start googling.
I think it would be a good exercise for me to write something like this from scratch on my own, just want some pointers to start.
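In the meantime, here's a minimal sketch of the pieces being asked about: an even-odd scanline fill for one polygon, simple source-over alpha blending, and plain-text PPM output. It's in Python for brevity (all names are made up for illustration, not from the EvoLisa code), but each piece ports almost line-for-line to C.

```python
def fill_polygon(pts, w, h):
    """Even-odd scanline fill: returns the set of (x, y) pixels inside pts."""
    inside = set()
    for y in range(h):
        yc = y + 0.5                      # sample at pixel centers
        xs = []
        for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
            if (y0 <= yc) != (y1 <= yc):  # edge crosses this scanline
                xs.append(x0 + (yc - y0) * (x1 - x0) / (y1 - y0))
        xs.sort()
        for xa, xb in zip(xs[::2], xs[1::2]):   # fill between crossing pairs
            for x in range(max(0, int(xa + 0.5)), min(w, int(xb + 0.5))):
                inside.add((x, y))
    return inside

def blend(dst, src, alpha):
    """Source-over blend of one RGB triple at the given opacity."""
    return tuple(int(round(alpha * s + (1 - alpha) * d)) for d, s in zip(dst, src))

def write_ppm(pixels, w, h):
    """Plain-text (P3) PPM: a tiny header, then one RGB triple per pixel."""
    rows = [" ".join("%d %d %d" % px for px in pixels[y * w:(y + 1) * w])
            for y in range(h)]
    return "P3\n%d %d\n255\n" % (w, h) + "\n".join(rows) + "\n"
```

Getting from an RGB array to a viewable file is then just `open('out.ppm', 'w').write(write_ppm(pixels, w, h))`; PPM is about the simplest image format there is, and most viewers and converters understand it.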
For converting the polys to pixels I render them with OpenGL and then extract the resulting image. I use PPM for both input and output; they're just plain text files, and any decent image editor can convert to or from them.
Had to write my own bitmap processing library, since I couldn't find anything fast enough off the shelf :-D It handles alpha blending, file I/O, etc. (check out the bitmap.lisp and color.lisp files in the repo).
It is not quite a genetic algorithm: there is no population of individuals which get mated and possibly mutated. It's more of a dynamic programming polygon match. Still, the result is impressive and amusing.
Exactly, the 'genetic' part of the term implies some sort of breeding, not just that you can transform your search space into a vector which you call a genome. In fact, the algorithm used seems to be exactly what the Wikipedia article on 'Random Optimization' describes [1].
I would expect that a true GA might work, but not be the best choice. In my semi-related experience, Particle Swarm Optimization [2] works much better for continuous-valued problems.
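For reference, the Random Optimization scheme described in [1] is only a few lines: sample a Gaussian perturbation of the current point and keep it only if the objective improves. A toy sketch (names are mine, not from any of the projects mentioned):

```python
import random

def random_optimization(f, x0, sigma=0.5, iters=2000, seed=0):
    """Random Optimization per the linked article: sample a Gaussian
    perturbation of the current point, keep it only if f improves."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = f(cand)
        if fc < fx:          # minimizing: accept only strict improvement
            x, fx = cand, fc
    return x, fx
```

On a smooth toy objective like the sphere function this walks steadily toward the optimum; EvoLisa's loop has the same accept-if-better shape, just with polygon mutations instead of Gaussian steps.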
Roger Alsing, the author of that old gimmick, here :-)
Some clarifications on my part, 4 years after the post was released:
1) No, this does not qualify as a true GA/GP.
By definition, GP needs to have a computational AST; EvoLisa has a declarative AST.
There is also no crossover in play here.
(See point 3 below for an explanation of this.)
2) Hill climbing or not?
According to Wikipedia, hill climbing only changes _one_ value of the problem-solving vector per generation:
"At each iteration, hill climbing will adjust a single element in X and determine whether the change improves the value of f(X)"
So it doesn't quite fit the hill climbing definition either. Also, the DNA/vector is of dynamic size in EvoLisa, while hill climbing AFAIK uses fixed-size vectors (?)
3) Why wasn't a larger population used, and why no crossover?
This is the part that most of you get wrong: increasing the population size and adding crossover will NOT benefit this specific problem.
The polygons are semi-transparent and overlap; thus, adding/removing polygons will have a high impact on the fitness level, in pretty much every case in the wrong direction.
Let's use words as an example here:
organism1: "The Mona Lisa"
organism2: "La Gioconda"
Both may have similar fitness levels, but completely different sets of polygons (letters in this naive example).
Combining those will very rarely yield an improvement.
E.g. the child (result of org1 and org2), "Lae Mocondisa", is complete nonsense, and the fitness level falls back to pretty much random levels.
Thus, you can just as well use pure mutation instead of crossover here.
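The loop described here is a (1+1) scheme: one parent, one mutated child per generation, and the better of the two survives. A toy sketch of that structure (the genome is just a list of bytes standing in for polygon attributes; all names are mine, not from EvoLisa):

```python
import random

def evolve(target, mutate, fitness, seed=0, generations=5000):
    """(1+1) evolution: mutate the single parent, keep the child only
    if it is no worse. No crossover, no larger population."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 255) for _ in target]
    best = fitness(parent, target)
    for _ in range(generations):
        child = mutate(parent, rng)
        fc = fitness(child, target)
        if fc <= best:              # child replaces parent only if no worse
            parent, best = child, fc
    return parent, best

def pixel_diff(a, b):
    """Fitness: sum of absolute channel differences (lower is better)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def point_mutation(genome, rng):
    """Tweak one randomly chosen 'gene' (here a byte; in EvoLisa it
    would be a polygon vertex, colour channel, or opacity)."""
    child = list(genome)
    child[rng.randrange(len(child))] = rng.randint(0, 255)
    return child
```

Run long enough, the parent's fitness only ever improves, which is exactly the parent-vs-child competition described here.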
If the problem instead had been based on genes that paint individual parts (e.g. a gene for the face, a gene for the background, a gene for the body, etc.),
THEN it would have made sense to use crossover.
In such a case it would be possible to combine a good face gene with a good background gene, and the fitness level would improve.
However, due to the nature of this specific problem where the polygons span the entire image, this is not effective.
And if crossover is not beneficial, then a larger population also gets less interesting, since you cannot combine the individuals.
Increasing the population will only make more separate individuals compete against each other, with no additive effect in any way.
See it like this:
If we have one sprinter running 100 meters, he might complete the run in about 10 seconds.
If we add 1000 sprinters to the population, each of them might complete the run in about 10 seconds each.
Thus, the problem is not solved any faster by adding more individuals.
Also, by increasing the population size, there will be much more data to evaluate for each generation. So even if we can bring down the number of generations needed to solve the problem, the actual real-world time to complete it would increase due to the evaluation logic.
Anyway, nice to see that people still find this somewhat interesting.
It was pretty much a single-evening hack back 4 years ago.
Lots of fun to write, I imagine. A cool demo. But computationally, after nearly a million generations it does pretty well. How does it compare to any other directed computational algorithm, e.g. decimating an accurate tessellation? I've tried genetic algorithms for industrial modeling, and not seen anything close in efficiency. If you have a computer model you don't need genetic algorithms (you can use better modeling techniques); if you have a real-world model, e.g. a starch conversion process, you can't afford a large number of experiments, or even ANY experiments that aren't going to yield pretty-good results.
The polygons overlap with transparency, so 50 polygons could in theory create up to 2^50 unique regions. Think of a Venn diagram with every possible combination of overlapping areas: any subset of the 50 polygons. (This is possible if the Venn areas are not convex.)
Realistically, it looks like the final result has somewhere upwards of 200 unique areas created by various overlaps of the 50 polygons.
I wonder if this could be used to create a 3D sculpture using semi-reflective film which, if you looked at it from a certain angle with light coming from a certain angle, would recreate the image.
Not to be outdone, I wrote my own genetic algorithm in Perl to do the same. It starts from randomly generated triangles, and evolves them to match the given image.
Here is the result of my first trial after 5000 generations: http://imgur.com/L9Odx
For this run, I used 50 triangles, each at 50% alpha (fixed), a GA population size of 200, a crossover rate of 0.91 and a mutation rate of 0.01. It took around 12 hours to run, but that's mainly because I opted to do it in Perl and didn't spend any time optimizing it.
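For comparison with the mutation-only approach discussed above, here is a generational GA skeleton in the same shape as that run. The defaults mirror the stated settings (population 200, crossover 0.91, mutation 0.01), but the structure (tournament selection, single-point crossover, per-gene mutation) is my own sketch, not the actual Perl code:

```python
import random

def run_ga(fitness, genome_len, pop_size=200, crossover_rate=0.91,
           mutation_rate=0.01, generations=100, seed=0):
    """Generational GA sketch: tournament selection, single-point
    crossover, per-gene mutation. Genes are floats in [0, 1)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genome_len)] for _ in range(pop_size)]

    def tournament():
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:        # recombine two parents
                cut = rng.randrange(1, genome_len)
                child = p1[:cut] + p2[cut:]
            else:
                child = list(p1)
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]                  # occasional mutation
            nxt.append(child)
        pop = nxt
    return min(pop, key=fitness)
```

With a real image-matching fitness each genome would encode triangle vertices, colours, and alphas; the toy float genome keeps the sketch self-contained.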
Given that it took more than 900k generations, I couldn't help but think that this would correspond to way more than 22 million years in human evolution.
A human generation is said to be around 25 years.
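The arithmetic behind that estimate, for the record (904,314 generations is the figure quoted elsewhere in the thread):

```python
GENERATIONS = 904314        # generations quoted for the final image
YEARS_PER_GENERATION = 25   # rough length of a human generation

years = GENERATIONS * YEARS_PER_GENERATION
# 904,314 * 25 = 22,607,850: a bit over 22.6 million years
```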
The goal here is to use polygons (a set of 2D points, a color, and an opacity) to best reproduce the original image.
A given number of n-sided polygons represents a choice of basis set. This can be viewed as an optimization problem, where you try to minimize the difference between the rendered polygon image and the original image.
I wonder if this basis set is ideal? That is, is there a basis set you can choose that represents the original image equally well but uses less information?
Each n-sided polygon uses 2n + 4 numbers: 2n for the points and 4 for the color (RGB) and opacity. What is the ideal number of points in the polygon basis?
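That count as a one-liner, to make the basis-size trade-off concrete:

```python
def polygon_params(n_sides):
    """Numbers per polygon: 2n coordinates plus RGB colour and opacity."""
    return 2 * n_sides + 4

# 50 six-sided polygons: 50 * 16 = 800 numbers for the whole image
total = 50 * polygon_params(6)
```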
One could imagine using a set of orthogonal functions to represent the image. Coming up with a good set that isn't overfit to a training set might be a challenge. Perhaps one can make use of features of the human eye to come up with a good basis (maybe similar to how MP3 does this for audio).
That's sort of the conclusion that I came to as well.
For evolving, the best basis is some representation that covers the widest range of perceived images while keeping some similarity between images with similar data sets.
The range of perceived images is a tricky problem in itself. Many images of noise can be perceived to be the same whereas images of a face will look significantly different with a small change to the nose.
The polygon approach is obviously not good at expressing fine textures. It would be interesting to construct the image allowing rendering into different representations of the same frame buffer. Allow drawing directly into a frequency domain for instance.
What you've described is basically what JPEG does using DCT (JPEG-1) or wavelet coefficients (JPEG 2000) as the basis. The advantage with JPEG is that the forward transform is very easy.
You can use whatever basis you want, but I wouldn't call it ideal in any practical sense if you have to run a GA for several hours to encode an image.
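To make the "very easy forward transform" point concrete, here is an orthonormal 1-D DCT-II, the kind of fixed basis JPEG builds on (a from-scratch sketch for clarity; real codecs use fast factored versions and work on 2-D blocks):

```python
import math

def dct2(signal):
    """Orthonormal 1-D DCT-II: with a fixed basis, 'encoding' is a
    direct transform rather than an hours-long search."""
    N = len(signal)
    out = []
    for k in range(N):
        s = sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n, x in enumerate(signal))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out
```

Encoding is then just this transform plus coefficient quantization: one deterministic pass over the data, no population, no generations.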
NASA has used genetic programming to make more efficient antennas. But would it be possible to use genetic algorithms to produce more efficient aircraft wings? And how about more difficult problems like CPU design? Say you want a 3D-stackable CPU consisting of different layers and cooling, but you let the computer design it itself. How about using it to construct more efficient solar cells and wind turbines? Is that possible? How do you make it general enough that the constraints don't constrain the possible solutions too much?
I suspect there would be a lot of edge cases where the algorithm wouldn't yield very satisfactory results. Think about images without broad, similarly coloured areas, like white noise and such. Maybe further research will alleviate this. I'm thinking that dividing the image into tiles, like JPEG does, could help.
Another advantage is that the compressed format would be vectorial instead of raster, so it would provide smooth scaling.
The algorithm requires that you compare the current iteration to the source image, so how does that constitute good compression? Not to mention the final image required 904,314 generations to reach.
From the looks of it the average polygon in the system has about 6 vertices, so at 4 bytes a vertex and 4 bytes for RGBA color that's a total of 28 bytes per poly or 1400 bytes total.
And that's overestimating vertex positioning (at that size, 1 or maybe 2 bytes would suffice). Encoding an image like that would be very slow though.
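Spelling out that byte budget (the per-field sizes are the parent comment's estimate, not a real file format):

```python
def poly_bytes(vertices, bytes_per_vertex=4, color_bytes=4):
    """Bytes per polygon: vertex coordinates plus one RGBA colour."""
    return vertices * bytes_per_vertex + color_bytes

full = 50 * poly_bytes(6)                          # 28 bytes/poly -> 1400 total
compact = 50 * poly_bytes(6, bytes_per_vertex=2)   # 1-byte coords -> 800 total
```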
nemo1618 | 13 years ago:
HN post here: http://news.ycombinator.com/item?id=392036
smanek | 13 years ago:
Here's a video of it 'evolving' a picture of John McCarthy (best individual from each generation): http://www.youtube.com/watch?v=-_VFZ_ON0A8
And here it is doing the Mona Lisa: http://www.youtube.com/watch?v=S1ZPSbImvFE
[1] http://en.wikipedia.org/wiki/Random_optimization
[2] http://en.wikipedia.org/wiki/Particle_swarm_optimization
michaels0620 | 13 years ago:
Q) Is this genetic programming? I think it is a GA, or even a hill climbing algorithm.
A) I will claim that this is GP, due to the fact that the application clones and mutates an executable Abstract Syntax Tree (AST).
Even if the population is small, there is still competition between the parent and the child; the fitter of the two survives.
RogerAlsing | 13 years ago:
That code uses real crossover and a large population in order to crack black-boxed formulas.
crusso | 13 years ago:
Much appreciated.
mattdw | 13 years ago:
http://mattdw.github.com/experiments/image-evo/
It's more strictly a genetic algorithm than the OP, too, as it's mutating a population, and instances age and eventually die.
gregschlom | 13 years ago:
Also worth checking out, the gallery with more paintings: http://rogeralsing.com/2008/12/11/genetic-gallery/
Lerc | 13 years ago:
And some reconstructions from data:
http://screamingduck.com/Lerc/showit.html
http://screamingduck.com/Lerc/showit2.html
calinet6 | 13 years ago:
Can we have this, please? Someone?
yesbabyyes | 13 years ago:
https://news.ycombinator.com/item?id=389727
Discussion about bd's javascript reimplementation:
https://news.ycombinator.com/item?id=392036