Visual Synthesis and Esthetic Values

 

Carla Farsi

Associate Professor of Mathematics

Chair, 2005 CU Special Year in Art and Mathematics http://math.colorado.edu/Art&Math/index.html

Department of Mathematics

University of Colorado/Boulder

Boulder CO 80301-0395, USA

farsi@euclid.colorado.edu  http://spot.colorado.edu/~farsi/

 

Kristopher R. Collins

Department of Mathematics

University of Colorado/Boulder

Boulder CO 80301-0395, USA

Kristopher.Collins@colorado.edu http://www.digvisuals.com/genarts/

 

Abstract

 

As primarily visual creatures, we are subject to enormous amounts of sight-stimulus at all times.  In nature we see things that are regarded as "beautiful,” a flower, a sunset, a mountain range.  We also see things that invoke a negative reaction, often one of fear or danger. 

 

We propose to create original software that will synthesize visual material.  This material will be generated using algorithms that will attempt to correlate certain mathematical functions with visually pleasing results.  We hope to find characteristics of appealing visual material that can be quantified in some way.  Symmetry, color balance, color harmonization, self-similarity, smoothness, negative space, sharpness and complexity are only some of the apparently quantifiable areas which seem to play a role in determining if an image is pleasing or not. 

 

After developing our program to allow for a huge range of output, a test will take place. Images will be generated, and then rated by a panel of math students.  For each image, the students will rate the visual appeal.  From this data we hope to find trends that show what kind of parameters generate good results.

 

By synthesizing images, we remove the element of objective familiarity; a flower would presumably be rated as appealing, but would a stylistically similar synthesized image receive the same rating?

 

A carefully implemented program will allow for each unique image to be based upon a savable string of data, or genotype (according to Karl Sims’s terminology).  This project will attempt to collect a decent sized sample base of images, their genotypes, and their ratings. 

 

 

 

 

 

I. Introduction

 

Like other forms of visual art, computer art is subject to certain generalizations or trends that correlate to the piece's appeal.  In language, we have developed words such as "palette", "shape", "composition" and many others to represent some particular visual attribute of the image.  When we look at pieces of art, we know what we like, but it is not always easy to describe why we like it.  Where does the appeal of Mona Lisa lie?  Does it come from her form, the colors, the history, the realism - or a combination?  Certainly all of these and many more play a role in the appeal of such a painting. 

 

The advantage of computer graphics is that it now definitively supercedes all other 2 dimensional art forms in terms of "visual information".  By this I mean that through the technological advances of digital image quality, we can visually represent anything flat in a high resolution, high depth RGB image.  From photographs to drawings and paintings, the digital version can be so detailed that we do not differentiate from the original.  Of course there are things that get lost - texture, size, and environment, to start - but when asked to identify the image on screen, there is still more than enough information to immediately recognize Mona Lisa. 

 

Not only will the recognition get translated, but so will the appeal and feeling derived from the visual aspects of the original.  For this reason, it would follow that the RGB display is more than sufficient canvas to examine core aspects of image appeal.  Of course, the power (and the point) of examining images in digital form is to quantify the visual information in some way. 

 

This study, and the accompanying Mac OSX visual synthesis software (MOTH 3.0a) is an attempt to isolate and quantify very basic elements of image appeal.  By developing a system of image synthesis that allows for a large variety of possible outcomes as well as a quantification of its synthesis routines, we hope to find trends corresponding to image appeal.  Since the space of possible images is so huge, a way is needed to quickly explore lots of possibilities. Hence, the interface design was derived from looking at the interfaces of analog audio synthesizers.  

 

The main advantage to the use of synthesized images as opposed to scans, or digital photographs is that the generated material is a more pure visual specimen.  By this I mean that for example many of the concepts which make Mona Lisa appealing - subject (human face), realism (very realistic), emotion (sad?), history  - are inherently missing from synthesized art.  Without these factors, we are able to better isolate the two things we care most about: color (palette), and composition (how the array is filled).

 

We will ignore the possible influence of the medium of the digital form itself.  Elements such as resolution, artifacts, and incompleteness of the RGB color model, while possibly playing a role in image appeal, are ignored.  Of course the digital image will never be anything but a compartmentalized symbol of the true continuous sensory experience of "real art".  As color depth and resolution approach infinity, the eye is tricked but the original image is in some very valid sense lost altogether.  For this paper it is assumed that a high-resolution scan of Mona Lisa on a 1:1 scale screen carries the same visual information as the original.   

 

 

II. The software basics

 

In visual synthesis, the range of possibilities is immense.  Imagine simply a 720x486x16bit VGA image space.  Its possible values could be nearly anything. From a pure black field to Dali's "Persistence of Memory”.  Every frame from every movie ever made with every combination of every effect, as well as every variation of hue, saturation, and levels.  Then think about every version of those shifted every possible amount along every vector. 

 

To explore this space, a flexible system was needed that could generate a large range of results from relatively simple rules.  Thinking of the image as a 2 dimensional array that needs to be populated in some way, and building on earlier software concepts it was decided that 2 dimensional periodic interference would be a rational choice as the basis for our visual synthesizer.  Using waves in the visual synthesis again follows the analogy to audio synthesizers.

 

Interference can create many interesting results that bear stylistic resonance to many natural phenomena and artistic compositions.   As each point on the plane has a unique coordinate (x,y) a color can be assigned to it based on some function f(x,y).  For example

 

Color = x+y

 

Would plot a pixel at (3,4) of "color 7". 

 

In RGB images there are of course three components of color per pixel (RGB), and it is feasible to determine each one's value through a different equation.  For the purposes of this study however it was decided that a color palette would be determined first, then a single equation could select the color index. 

 

The use of a color palette as opposed to separately treating RGB values is a matter of taste and scope.  From an artistic view, in my experience the use of a predetermined palette (whether discrete or function-based) in generative art can produce results with more sensorally pleasing results.  This knowledge was important because the goal was to generate a system capable of generating pleasing imagery.

 

I think of watching the reflection of a sunset in a gently wavy pond.    If you watch a certain point, at regular phases of the waves' collective interference you see red, at others yellow.  Their interference is an equation which points to a set of colors - the sky.

 

III. Software design

 

For the framework of data generation, a model was invented.  This model was designed as a hypothetical method for generating images that might bear resemblance to natural forms. 

 

Color->Population->Displacement

 

As described above, the Population part of the model used interference techniques to create visual information.  The color palette generation method was developed using oscillators to continue the audio synth analogy.

To maintain smooth continuity and following the concept of interference and natural phenomena, periodic functions were used to define a palette of RGB values.  This method allows for a great variety of results, where the palette can have many interesting fluctuations and harmonies.

 

PaletteR[index] = cos(index / a1 + c1) * b1

PaletteG[index] = cos(index / a2 + c2) * b2

PaletteB[index] = cos(index / a3 + c3) * b3

 

(a,b and c scaled sensibly)

 

After the colors have been defined, and the interference has populated the image, a displacement occurs to further add variety and compositional possibilities.  The method of displacement was an inverse mapping based on vectors.  For each point in the image, its new value would be defined by the pixel pointed to by a vector.  This vector is defined by a function.  So for example, a displacement vector could tell a point to get its new value by rotating 30 degrees around the center of the image. 

 

If these displacement vectors indicate a point off the image coordinates, the point is reflected back into the image along the axis of the border.  This reflection again adds visual variety in the form of symmetry, and rhythm.  Both of these things are very common elements in both nature as well as art, and their inclusion in a visual information synthesizer is welcome.  Further, when this technique is iterated, another great feature of visual synthesis is introduced - self-similarity.   

 

So for each of the three phases: Color, Population and Displacement, a function is defined which takes few arguments, but gives a large range of possible visual results.  The arguments were set to be both controlled through a GUI, and by random population for quick exploration of results.  The image's unique identifier is then encoded in a short genotype, visible below the image screen.

 

After development, testing and refinement a system was settled upon which was able to create a large variety of images and quantify their creation.   The software is code named MOTH and is available at the website listed above.

 

IV.  Experiment

 

While the major portion of this project was the development of a software environment for image synthesis, we also hoped to find some trends in generative image appeal.  This is not a hard scientific study. The goal is more for us as artists to gain some insight into what makes images appealing, and what does not.

 

Any questions involving art and "appeal" will ultimately boil down to taste.  However it is the contention of this paper that given multiple samples of individual tastes it is possible to discern trends reflecting general image appeal. 

 

For this reason it was decided to generate a sequence of images and quantify their visual appeal by group viewings.  The group members would then rate each image.  The images synthesized were to exploit features, which could reveal information about key elements of why people might like them.  For example, image number 24 was simply a hue-rotated version of image 22.  Image 31 was designed to be confusing and would be hypothesized to score poorly.  The ones I liked best were chosen as candidates for high scores.

 

There were also control images planted in the sequence.  Two abstract photographs were placed as grounding elements, hoping to receive reviews of indifference.  There were two fractals inserted in the sequence as well.  Generated by another piece of software, these fractals were placed to break up the sequence with completely different forms, thereby keeping the discerning eye fresh.

 

Again, since this is an artistic venture more than a scientific one, we aren't concerned with the details and implications of image sequencing. Similarly, psychological questions and statistical details are not important. We simply hope to notice trends which will aid us in the development of generative art. 

 

The people who would view these images were to rate each one on a scale of one to five.  One correlates to "highly unappealing", two to "unappealing", three means indifferent, four means "appealing”, and five means "Highly appealing".  This very rough rating system is helpful because instead of people getting caught up in whether an image is a "6" or a "7", for example, they simply react to it and chose the closest rating.

 

The participants were asked to view images for purely their visual appeal.  After displaying each image in sequence (sequence is visible at the end of this paper) and obtaining ratings from the participants, the mean is found to arrive at an approximation of the general appeal of each image.  As of now, approximately 50 people have added data through their ratings.  The data obtained in the experiment is shown here:

 

 

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

Total

Mean

 

30

5

5

5

4

5

4

4

2

4

5

5

4

2

5

5

64

4.27

 

35

4

5

3

4

5

4

5

5

5

4

2

4

3

4

4

61

4.07

 

6

5

4

4

2

4

5

2

3

4

4

5

4

3

5

5

59

3.93

 

7

4

5

3

4

4

5

4

3

5

4

4

4

4

3

3

59

3.93

 

9

5

5

4

5

4

4

3

2

2

5

4

5

2

5

4

59

3.93

 

1

5

3

2

1

4

4

5

4

3

5

4

4

3

5

4

56

3.73

 

11

4

5

3

5

4

3

2

3

4

5

3

3

2

5

5

56

3.73

 

16

4

4

3

4

5

4

3

3

2

5

4

3

2

4

4

54

3.60

 

17

5

2

3

4

4

5

3

4

3

4

5

3

3

3

3

54

3.60

 

20

3

3

3

4

4

3

5

4

3

4

2

5

3

4

4

54

3.60

 

5

5

5

2

4

1

4

3

4

3

5

4

5

3

1

4

53

3.53

 

19

2

3

4

3

2

3

5

3

4

5

4

5

2

3

4

52

3.47

 

21

3

4

4

1

3

5

4

3

4

4

3

5

4

2

3

52

3.47

 

25

2

4

4

3

1

3

5

4

4

3

3

4

4

4

4

52

3.47

 

37

3

5

4

3

4

3

4

2

4

3

3

4

3

3

4

52

3.47

 

28

4

2

4

2

4

4

3

3

3

3

4

5

2

3

5

51

3.40

 

2

3

3

3

4

4

4

5

3

3

4

3

2

4

2

3

50

3.33

 

8

5

3

2

4

2

4

2

4

3

4

3

4

2

4

4

50

3.33

 

13

4

5

3

4

2

4

3

3

1

4

2

4

3

4

4

50

3.33

 

29

4

4

3

3

3

4

3

4

3

4

3

4

2

3

3

50

3.33

 

4

3

5

2

4

3

4

2

4

2

4

3

2

2

5

4

49

3.27

 

27

4

3

3

4

3

3

3

3

2

4

4

4

2

4

3

49

3.27

 

38

4

4

2

4

4

2

2

4

4

4

3

3

3

3

3

49

3.27

 

15

2

2

3

3

3

3

4

2

3

5

3

4

2

4

3

46

3.07

 

33

4

2

3

2

4

2

3

4

4

3

1

3

3

4

4

46

3.07

 

3

4

2

2

5

2

4

1

2

4

3

2

3

4

4

3

45

3.00

 

32

1

5

3

2

3

3

3

4

2

4

3

5

2

3

2

45

3.00

 

12

5

3

2

2

4

3

1

4

3

4

2

3

2

3

3

44

2.93

 

14

2

4

4

4

2

4

4

3

1

2

2

3

2

3

4

44

2.93

 

23

1

5

2

4

1

2

4

2

3

4

2

4

2

4

3

43

2.87

 

40

3

5

2

1

2

2

3

4

3

3

4

3

2

3

2

42

2.80

 

18

5

2

2

2

3

2

2

2

5

2

1

4

2

2

5

41

2.73

 

26

3

2

4

1

4

3

4

2

2

1

1

2

2

3

5

39

2.60

 

39

2

2

4

3

2

2

2

4

5

1

2

2

2

1

4

38

2.53

 

10

2

2

1

1

2

3

3

3

4

2

2

4

3

2

2

36

2.40

 

34

3

2

4

1

1

2

3

3

1

4

1

2

2

3

4

36

2.40

 

36

2

4

3

1

1

2

2

3

5

3

1

2

2

2

3

36

2.40

 

22

2

4

1

1

1

3

3

2

2

2

2

3

3

4

2

35

2.33

 

24

2

5

1

1

1

2

3

2

3

2

2

2

3

3

2

34

2.27

 

31

1

2

1

1

1

2

2

2

1

2

2

3

2

2

3

27

1.80

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Each row represents an image, numbered from 1-40.  Each column represents an individual reviewer.

 

This data could be analyzed many different complex ways.  For this project however, we are leaning further into the philosophy of art than into numerical analysis.  A generative artist should first and foremost look at the average ratings along with the sequence.  They can then view trends that correspond to their own questions about color and composition.  So for now, our focus is maintained on broad trends which ca be further refined in the future. 

 

V. Results and Conclusion

 

One trend is immediately noticeable: the lower rated images tend to have higher frequencies involved in their generation.   High frequencies in the palette generation stage of synthesis such as image 31 yielded images that scored lower.  Conversely, palettes with lower frequencies such as 30, 35 and 9 tended to score better.

 

Also regarding color, palettes with good complimentary balance scored better than those without.  For example, image 7, 20 and others display this property, while 36 does not.

 

Similarly, the factors of the array Population (interference) introduced trends related to frequency.  The higher frequency interference patterns tended to score much lower than the lower frequencies.  For example, images at the lowest 10 ranks all have higher frequencies of interference patterns than most of the higher scoring images.

 

The lower appeal of high-frequency (or "busy") images makes sense in different ways.  First, as an analogy to audio: high frequencies that are out of phase and non-harmonic sound unappealing.  Image 31 was designed with these exact characteristics in mind, hypothesized to score poorly and it did.   Second, as the image detail approaches a sub-resolution level, the bitmap loses its ability to display them in a meaningful fashion.  For example, image 26 shows a very detailed and complex interference pattern.  However, the resolution of the image limits the detail, and the image becomes chaotic and unappealing. 

 

Displacement was similar to the others in its frequency-affected trends.  The high frequency changes in displacement images like 24, and 26 scored lower than slower-changing displacements like 35 and 16.

 

It seems that as a general rule, *the frequencies that seem to provide the best results are those which cycle only a few times over their input range*.  If the parameter goes through a single cycle over its range, then it exploits all of its possible values, but no more.  This slower changing of parameters often correlates to an image with descriptors such as "flowing", "smooth", "painterly", or "balanced".  But the images featuring prominent high frequency attributes received comments such as "messy" and "busy".  The ratings back up these adjectives as quantification of such concepts.

 

There are images in the sequence that seem to defy the trends I have mentioned.  For example, image 6 was predicted to score poorly but it finished ranked third.  It features what I considered ugly colors, and a confusing and busy composition. 

 

Interestingly, the highest scoring image was one of the control images.  It was image 30, a mutation of Mandelbrot Set (escape velocity shading) that is actually my favorite of all the images as well.  While inserted as a control, and developed under a different project (MAX/MSP fractal explorer), its appeal is still relevant in the sequence.  It shows what is in my opinion something beautiful.  With an average rating greater than "Appealing", the data seems to support that opinion.  

 

But it is just numbers.  It’s a set of rules simple enough to code on a single screen.  Yet somehow, after the program iterates enough times, it creates beauty.  A flower, a gently cloudy sky over a fiery sunset, trees in autumn, the ocean, and many of the most stereotypical "beautiful" things imaginable are being driven by complex systems.  While we can't replicate them, we can synthesize things that use models of similar forces in the hope of generating appealing images.

 

So our project was a success.  The goal of creating an interactive visual synthesizer was achieved.  Also, we observed at least two useful general trends in the data that will aid in generative art design.  The software is fun to use, and is a tool to create interesting pieces of abstract generative art. 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

VI. Image Sequence

 

1                                             2                                             3

4                                             5                                             6

7                                             8                                             9

10                                           11                                           12

13                                           14                                           15

16                                           17                                           18

 

 

 

19                               20                                           21

22                                           23                                           24

25                                           26                                           27

 

28                                           29                                           30

31                                           32                                           33

 

 

 

 

 

 

34                                           35                                           36       

37                                           38                                           39

40