From Loop

The Color Project

Can a picture represent 33,000 words? If Martin Wattenberg has anything to do with it, the answer is most certainly yes. Explore the map of tiny rectangles and color with "The Color Project."

The Color Project is a map of 33,000 English nouns. Each tiny rectangle corresponds to a noun and has been assigned a color based on an internet image search for that noun. The words are clustered so that similar words are near each other. Fig. 1 has been annotated to show a few prominent clusters.

Launch the Color Project (For best results, use Internet Explorer.)

Can I pick a color and see what word it corresponds to?

Yes. Click on the "color" button. That will arrange the words by hue and brightness, forming a kind of color chooser. Move the mouse over the rectangles to see what words they correspond to.
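A rough sketch of how such a "color chooser" ordering could be computed, for readers who want to experiment. It is not the applet's actual code: the function names, the use of HSV, and the choice of 24 hue bins are all assumptions made for illustration.

```python
import colorsys
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]  # channel values in the range 0-255


def hue_brightness_key(rgb: RGB, hue_bins: int = 24) -> Tuple[int, float]:
    """Sort key: a coarse hue bin first, then brightness within that bin.

    The number of hue bins is an assumed parameter, not from the article.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, _, v = colorsys.rgb_to_hsv(r, g, b)
    hue_bin = min(int(h * hue_bins), hue_bins - 1)
    return (hue_bin, v)


def arrange_by_color(word_colors: Dict[str, RGB]) -> List[str]:
    """Return the words ordered by hue, and by brightness within each hue."""
    return sorted(word_colors, key=lambda w: hue_brightness_key(word_colors[w]))


# Hypothetical sample data, not taken from the project.
sample = {
    "sky": (0, 0, 255),
    "tomato": (255, 0, 0),
    "brick": (128, 0, 0),
    "grass": (0, 255, 0),
}
ordered = arrange_by_color(sample)
```

Binning the hue before sorting keeps words of similar hue together while still letting brightness vary smoothly within each band.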

Can I type a word and see what color it is?

Yes. Type the word into the "all starting with" text box. As you type, the map will narrow down to show all the words starting with the letters you've typed.

How exactly did you assign colors?

For each word, we performed a Yahoo image search and retrieved 50 image results. (If fewer than 50 images were found, we deemed the word too obscure and discarded it.) Then for each image we averaged the values of the pixels in the middle and then averaged those 50 results. We brightened the colors slightly for display.

Why can't I see anything?

Color Code is a Java applet, a program written in the secure Java language and formatted to run within a web browser. If you can't see the artwork, your browser may not currently support Java. Check Options/Preferences to make sure that Java is enabled. If you don't have Java installed at all, you can download the free plug-in from Sun Microsystems.

What words did you use, and how did you categorize them?

The words are a selection of the nouns found in WordNet. We then used WordNet data on word relationships to create a tree-structured categorization of the words. Our process is a rough approximation but it works well enough to create a meaningful map. Note that we wanted words to appear in only one place, which can lead to some strange results: the part of the map that contains colors doesn't contain "orange," for example, because "orange" is in the citrus fruit section.

Shouldn't you watch your language?

The dictionary of words is designed to present a full picture of the English language. As a result, it includes some words that are not used in polite conversation. We considered filtering the words, but that would create a dishonest portrait. Another source of confusion is the categorization used by WordNet, which encodes some judgments that are not culturally universal.

Why are the word rectangles different sizes?

The size of the rectangle corresponds to the intensity of the color, with bright, intense colors receiving more space. The goal was to emphasize those parts of the language that are most associated with specific colors.
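One way to turn color intensity into rectangle area is sketched below. The exact measure of "intensity" is not specified here, so this sketch assumes HSV saturation multiplied by value, plus a small base weight so that gray words remain visible. Both choices, and all the names, are illustrative only.

```python
import colorsys
from typing import Dict, Tuple

RGB = Tuple[int, int, int]  # channel values in the range 0-255


def intensity(rgb: RGB) -> float:
    """Assumed intensity measure: HSV saturation times value, in 0.0-1.0."""
    r, g, b = (c / 255.0 for c in rgb)
    _, s, v = colorsys.rgb_to_hsv(r, g, b)
    return s * v


def rectangle_weights(word_colors: Dict[str, RGB], base: float = 0.2) -> Dict[str, float]:
    """Share of the map area given to each word.

    `base` (assumed) is a floor so that dull colors still get some space.
    The returned weights sum to 1.
    """
    raw = {word: base + intensity(color) for word, color in word_colors.items()}
    total = sum(raw.values())
    return {word: weight / total for word, weight in raw.items()}


# Hypothetical sample data, not taken from the project.
weights = rectangle_weights({"tomato": (255, 0, 0), "fog": (128, 128, 128)})
```

With these sample values, the saturated red word receives several times the area of the gray one.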

Some of the bright colors don't make sense!

If you're surprised by a word's color, use the mouse menu to do an image search. In some cases ("amnesiac") there may be an association you're not aware of. In others ("throng" or "overrun") a cluster of images with an idiosyncratic usage creates an unexpected result.

Are there other visualizations of language?

Yes; here is a partial list of examples. Matthew Grenby's Gradus plots a 3D historical view of English. Thinkmap's Visual Thesaurus shows a hypnotic view of word relationships. Hugo Liu's Aesthetiscope is an installation that uses color averaging to complement poetry.

About the Author: Martin Wattenberg's work centers on the theme of making the invisible visible. Past projects include the Thinking Machine series, the NameVoyager, The Shape of Song, the Whitney Artport's Idea Line and Apartment. Wattenberg is a researcher at IBM, where he creates new forms of data visualization. He is also known for the SmartMoney.com Map of the Market. He holds a Ph.D. in mathematics from U.C. Berkeley.

  1. Dru Martin, Wed Aug 17, 2005

    Wonderful! It seems it's really Yahoo's portrait of the world, thru the lens of the photographers that made the images in the first place. Sort of mathematical impression of an impression of an impression. A great portrait, and fun way to explore the language. And for me, it's less about language, and more about the things themselves. Sort of like removing the 'word as a symbol of something' altogether.

  2. Dave Zielinski, Thu Aug 25, 2005

    Part of my reaction is - this is extremely interesting and has something worth investigating. I think the map of language is terrific. I also love the different hues of Pasta terms, and the different shade Savoy has compared to Cabbage. My other reaction is - so what. What meaning do I assign to the fact that Sniveling has a very different color than Snivel?

    I think for objects, these results could provide valuable cues for a visual/design language. For abstract terms I think it’s just a roulette wheel of mostly middle-of-the-palette muddy hues.

    But the overall effect is visually impressive and kudos for taking a complex idea to full execution.

  3. Parth Upadhye, Thu Oct 13, 2005

    Great execution. But it's the data that is "muddy". Would like to be able to "control" color assigning. Different functions to extract colors will yield some quite amazing "color" effects. It's like giving me control on how to see the "results".

  4. Héloïse Neefs, Sat Mar 24, 2007

    Pity you cannot launch the program anymore!

  5. Paul Pierog, Fri Jul 13, 2007

    This sounds very interesting to me.

    However, as it cannot be launched, what has happened to it?

  6. Peter Vianden, Wed Dec 19, 2007

    Fantastic! How can I get further access to the project? Hopefully there is a chance for perpetuation!
