We live in an information-rich era. We are quite good at collecting information, but not especially good at figuring out what to do with it: understanding it, learning from it, and conveying its meaning to others.

“What do the paths that millions of visitors take through a Web site look like?” asks author Ben Fry at the beginning of his new book, Visualizing Data. “How do the 3.1 billion A, C, G, and T letters of the human genome compare to those of the chimp or the mouse? Out of a few hundred thousand files on your computer’s hard disk, which ones are taking up the most space, and how often do you use them?”

With all the data we’ve collected, we still don’t have many satisfactory answers to these sorts of questions. “This is the greatest challenge of our information-rich era: how can these questions be answered quickly, if not instantaneously?” says Fry. “We’re getting so good at measuring and recording things, why haven’t we kept up with the methods to understand and communicate this information?”

Fry points out that all of these questions involve a large quantity of data, which makes it extremely difficult to gain a big-picture understanding of its meaning. The problem is compounded by the data’s continually changing nature, which can result from new information being added or older information being continuously refined.

Visualizing Data shows readers how to make use of data as a resource that they might otherwise never tap. Fry teaches a variety of basic visualization principles, including how to choose the right kind of display for specific purposes, and how to provide interactive features that will bring users to a site over and over. Most important, Fry provides an approach for how to understand data: how to get from a sea of numbers to meaningful information.

Readers of his book will discover:

  • The seven stages of visualizing data — acquire, parse, filter, mine, represent, refine, and interact
  • How all data problems begin with a question and end with a narrative construct that provides a clear answer without extraneous details
  • Several example projects with the code to make them work
  • The positive and negative points of each representation discussed, with a focus on customization so that each one best suits what you want to convey about a data set

This book does not provide ready-made “visualizations” that can be plugged into any data set. Instead, with chapters divided by types of data rather than types of display, Fry shows how each visualization conveys the unique properties of the data it represents — why the data was collected, what’s interesting about it, and what stories it can tell.

“I’m trying to address people who want to ask questions, to play with data, to gain an understanding of how to communicate data to others,” says Fry, who explains that the audience for his book is quite broad, and includes Web designers who want to build more complex visualizations than their tools will allow, as well as software engineers who want to become adept at writing software that represents data. “Fundamentally, it’s a book for people who have a data set, a curiosity to explore it, and an idea of what they want to communicate about it.”

Performing sophisticated data analysis no longer requires a research laboratory; a cheap machine and some code are enough. Complex data sets can be accessed, explored, and analyzed by the public in a way that simply was not possible in the past. Visualizing Data shows how.