The Application of an Idea

October 13, 2009
by David Gilmore (dbg09)

     Last week, we spent most of our time trying to answer a question fundamental to any visual GP project we might end up with: how should we encode the “DNA” of an image to allow for the easy modification, cross-breeding, and mutation that evolution requires? We started by writing a program capable of rendering (and subsequently modifying) any number of polygons, each with a specific set of characteristics: vertices, size, position, rotation, color, transparency, and layering. With a large enough population of polygons it should be possible to approximate any given image closely. This general idea was touched upon by Evolution of the Mona Lisa. Although the author presents it as genetic programming, it most certainly is not. Rather, it’s hill climbing: there is no population (save one parent and one child), and no breeding or biological diversity occurs. Consequently, the program is probably significantly less efficient than an actual GP equivalent would be. While this project is nowhere near where we’d like to end up, it seems to serve as a good starting point for tweaking our image DNA and a good crash course in GP. We believe that if the same problem were properly implemented with a large population featuring mutation and crossover, desirable results could be obtained significantly faster than with the existing model. This shouldn’t take a lot of time once the DNA framework is finished, and it will give us an opportunity to play with genetic programming and build a genetic model suited to image manipulation, to become familiar with the cluster, to fine-tune our DNA system, and perhaps to gain some insight into a future fitness test for our project.
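To make the encoding question concrete, here is a minimal Python sketch of what a polygon-based image “DNA” with mutation and crossover might look like. It is not our actual program: the class and function names, the value ranges, and the choice of which fields mutate are all assumptions made for illustration, and rendering is left out entirely.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Polygon:
    """One gene: a polygon with the characteristics listed above."""
    vertices: int                   # number of vertices
    size: float                     # radius as a fraction of the canvas
    position: Tuple[float, float]   # center in unit coordinates
    rotation: float                 # degrees
    color: Tuple[int, int, int]     # RGB, 0..255 per channel
    alpha: float                    # transparency, 0 (clear) to 1 (opaque)
    layer: int                      # draw order


def random_polygon(rng: random.Random) -> Polygon:
    """Build a polygon with every characteristic chosen at random."""
    return Polygon(
        vertices=rng.randint(3, 8),
        size=rng.uniform(0.01, 0.5),
        position=(rng.random(), rng.random()),
        rotation=rng.uniform(0.0, 360.0),
        color=(rng.randrange(256), rng.randrange(256), rng.randrange(256)),
        alpha=rng.random(),
        layer=rng.randrange(1000),
    )


def mutate(dna: List[Polygon], rng: random.Random, rate: float = 0.05) -> List[Polygon]:
    """With probability `rate` per polygon, re-roll one of a few fields."""
    out = []
    for p in dna:
        if rng.random() < rate:
            field = rng.choice(["position", "rotation", "alpha"])
            if field == "position":
                p = replace(p, position=(rng.random(), rng.random()))
            elif field == "rotation":
                p = replace(p, rotation=rng.uniform(0.0, 360.0))
            else:
                p = replace(p, alpha=rng.random())
        out.append(p)
    return out


def crossover(a: List[Polygon], b: List[Polygon], rng: random.Random) -> List[Polygon]:
    """One-point crossover: the head of parent a joined to the tail of parent b."""
    cut = rng.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]
```

Keeping each polygon immutable means a child never shares mutable state with its parents, which should make breeding across a large population easier to reason about.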

     In discussions about the future of our project, two common themes this week have been Amazon’s Mechanical Turk (suggested by a classmate) and the evolution of art as art. Through extremely preliminary and rough estimates, I’ve determined that if we were to use Mechanical Turk it would cost between $0.0005 and $0.0010 per agent per generation to return viable fitness data. To get usable results, the same images would need to be compared multiple times in order to achieve any sense of objectivity. While the cost may seem minuscule, it quickly adds up; with a population size of 1,000 agents and a total of 100,000 generations, the total cost would be between $50,000 and $100,000. This means that even if we could secure some amount of funding, we could probably run the process only once (and it would take a significant amount of time to get all our data back). This also presupposes that human input is worthwhile and that humans are capable of distinguishing between such minute differences in visual art, especially at the beginning stages when everything is abstract and, probably, terrible.
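The cost estimate is just a product of three numbers, so it is easy to check. With the per-agent rates, population, and generation count quoted above, the product works out to $50,000 to $100,000; the same rates over only 1,000 generations would give $500 to $1,000. The short Python sketch below does the arithmetic with exact decimals; the function name is made up for illustration.

```python
from decimal import Decimal


def turk_cost(cost_per_agent_per_generation: str, population: int, generations: int) -> Decimal:
    """Total cost = rate per agent per generation x agents x generations."""
    return Decimal(cost_per_agent_per_generation) * population * generations


# Figures from the estimate above: 1,000 agents for 100,000 generations.
low = turk_cost("0.0005", 1_000, 100_000)
high = turk_cost("0.0010", 1_000, 100_000)
print(f"${low:,.0f} to ${high:,.0f}")
```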

     The other idea tossed around was evolution as art. There are many ways that this could be accomplished, but one concept was a process similar to the Evolution of the Mona Lisa. The idea would be to have an exhibit set up in a gallery consisting of a display, a button, a printer, and a camera. When the button is pressed, the camera snaps a picture to serve as our fitness test. Immediately a population is created and outsourced to a cluster to strive towards an acceptable image as quickly as possible (with the Evolution of the Mona Lisa this takes around two weeks; we’re hoping GP could optimize it a whole lot more, into the ten-minute range with the cluster). While the generations are processed, an interesting visual presentation is shown on the screen and the viewer is able to see their image evolve from something completely random and purposeless into something with purpose, set to a very deep and philosophical Sagan-esque monologue about the wonder of nature as a computer. At the end of the presentation, when the final image is ready, it’s displayed on screen and the viewer is given a series of printouts: the original image, the final image, a timeline of evolution (maybe the best image every thousand generations?), and perhaps their final image’s DNA. While the display is not in the middle of a presentation, it could cycle through previously completed images and display them to the entire gallery. If we were able to present this in a way that highlighted how cool evolution really is while still being aesthetically interesting and dynamic, I think we could create human-competitive art; it’s definitely something that would catch my eye at a gallery. Also, I’m a nerd, so my opinion on what’s interesting in a gallery could be completely invalid.
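Using the camera’s picture as the fitness test could be as simple as scoring each candidate by how far its pixels are from the photo. The Python sketch below is one assumed way to do that, summing squared per-channel differences; the function name and the plain nested-list image format are illustrative only, and a real version would work on rendered bitmaps.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # RGB, 0..255 per channel
Image = List[List[Pixel]]      # rows of pixels


def pixel_error(candidate: Image, target: Image) -> int:
    """Sum of squared channel differences; 0 means a perfect match."""
    if len(candidate) != len(target) or any(
        len(rc) != len(rt) for rc, rt in zip(candidate, target)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    for row_c, row_t in zip(candidate, target):
        for (r1, g1, b1), (r2, g2, b2) in zip(row_c, row_t):
            total += (r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2
    return total
```

Lower is better here, so a selection step would either minimize this value or negate it to get a fitness to maximize.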

The Evolution of Art: Questions and Concepts

September 29, 2009
by David Gilmore (dbg09)

We’ve been kicking around some ideas about our project, how to technically accomplish it and how to make it as human competitive as possible. Here’s a quick rundown of the current thoughts and questions.

Neural Networks and Machine Learning:

  • Is it possible to create a loose neural network framework that can analyze an image pixel by pixel and make judgements about it with absolutely no training? If so, can these judgements be relevant and useful or are they destined to just be idiosyncratic and purposeless?
  • How important is it to teach the computer what bad art is? Could it also learn from static noise or poorly composed photographs? Is it important that we rate art before feeding it into the network so that it has more to work from than just “bad” or “good”?
  • Is the field of visual art too broad for the neural network to make any reasonable deductions? Should we focus on a specific style of art? If so, what would be the most effective choice?
  • Where the hell do we even start?

Generation and Evolution:

  • After reading about the “evolution” of the Mona Lisa (which is really more hill-climbing than GP), the idea of creating an image (and genetically representing it) as a collection of initially randomly placed (and added) polygons seems like an effective starting point. With a high enough polygon count the art could begin to approach realism and even a low polygon count would create a unique aesthetic style.
  • The fitness function will necessarily need to be linked to our neural network critics’ opinions. This needs to be on a scale – “bad” or “good” will not help us with the determination of the most fit genes in a population.
  • Mutation here may need to be at a higher rate than is common with most genetic programming projects; adding new polygons or changing their characteristics (color, opacity, position) seems crucial for population diversity.
  • Would it be possible or effective to identify specific parts of the image that are especially bad and select them for more heavy mutation or crossover in the reproduction phase?
  • When are we done? A maximum number of iterations or a minimum required fitness score?
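
The two stopping rules in the last question are not mutually exclusive; a loop can check both. The following Python sketch is one hypothetical shape for that loop, with fitness on a numeric scale where higher is better. The function name, the half-population parent pool, and the `elite` parameter are assumptions for illustration, not decisions we have made.

```python
import random
from typing import Callable, List, Optional, Tuple, TypeVar

G = TypeVar("G")  # a genome, e.g. a list of polygons


def evolve(
    population: List[G],
    fitness: Callable[[G], float],
    mutate: Callable[[G], G],
    crossover: Callable[[G, G], G],
    rng: random.Random,
    max_generations: int = 1000,
    target_fitness: Optional[float] = None,
    elite: int = 2,
) -> Tuple[G, int]:
    """Run until the best genome reaches target_fitness or generations run out.

    Returns the best genome found and the number of generations used.
    The population must contain at least two genomes.
    """
    for generation in range(max_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        if target_fitness is not None and fitness(ranked[0]) >= target_fitness:
            return ranked[0], generation           # minimum required score met
        parents = ranked[: max(2, len(ranked) // 2)]  # fitter half breeds
        children = ranked[:elite]                  # carry the best over unchanged
        while len(children) < len(population):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
    return max(population, key=fitness), max_generations  # iteration cap hit
```

Carrying the top few genomes over unchanged means the best score never drops between generations, which makes a minimum-fitness stopping rule easier to trust.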

Human Competitive Success:

  • If we can create something that we feel could potentially be placed in a gallery, how should it be presented? Does stating that it was generated entirely by a computer defeat the human competitive nature of the project? Along the same lines, could a series of images showing how a final piece of visual art evolved from nothing be seen as human competitive art?
  • Lee and I spoke briefly about the idea of evolving descriptions or backgrounds as a companion to the visual art we create. We both attended a lecture by Paul Bloom entitled “But is it art?: A case-study in the cognitive science of pleasure.” In his talk, Bloom emphasized the importance of author intent and meaning in the human appreciation of art, putting it above even aesthetic appeal. A striking example he gave was the piece of art that brings people to tears more than any other: The Rothko Chapel. The chapel contains a wall with several canvases painted entirely dark purple. There is no disturbing imagery or any realism at all. These people were so deeply affected because of what they knew about the artist: Rothko committed suicide not long after completing the paintings. Aesthetically the paintings are simple, but the meaning they convey is multi-dimensional. What if it were possible, using GP, to generate these detailed and emotionally powerful back stories and descriptions? Could that add another dimension to the art and make it more human competitive?

We’re still philosophizing at this point, as it’s important to have all the basic concepts down before starting a technical application. Any input and feedback would be greatly appreciated.
-Ben Gilmore