2006F MAT259 - Visualizing Information


MAT 259 - 2007W		Mark Daggett "Planning Databases for Visualization"

Introduction		This quarter George gave me the opportunity to work on a large scale visualization project, which when completed will visualize the type of images people take, save and share using their cellphones. For his project I am building a web server, which will interface with the Flickr.com API to retrieve images, which viewers have uploaded to Flickr through their cellphone. I am also building database, whose structure should succinctly expresses all the components of the project, yet afford enough flexibility to be extended as the complexity of the project’s visualizations grow. I will also be building voice-activated interface to the server, which affords users the ability to retrieve server data by navigating a series of tree structures using their cellphone keypad or their voice. While many of the tasks I am working on are not expressly visual, they do have the opportunity to empower those who do the visualization if done correctly. The thrust of this paper will be to explain my approach to planning this project’s database to allow the visual designers to have both an easy to use programming interface into the database records and a well-planned structure that will scale well. I will describe my techniques and methodologies for creating the various components under my control. Where relevant, I will explain how my choices affect the visualizations and why I made the decisions that I did.

Methodology		It is easy to get overwhelmed by the details, deadlines, and dependencies when starting a new project. Over the years I have borrowed from other development methodologies to create a process, which works well with my own personality and those, who I collaborate with. Using a methodology even bad ones give you a staring point to tackle projects. Development methodologies are critical for visualization projects, because these projects require an extremely high level of attention to detail. Additionally, the typical visualization project deals with information, which is hard to understand. Using a development methodology allows you to break the large visualization project into discreet tasks, which are achievable instead of becoming overwhelmed by the enormity of the larger problem to be solved. Below is a description of several pillars of my development methodology, which I am using to complete this project. Small Iterations The idea of small iterations (also called "incremental builds", and "iterative releases") is a concept borrowed from the "Extreme Programming" (XP)1 methodology, which is proselytized by very smart people like Kent Beck. One of XP’s aims is to allow programming to start faster. In contrast, other methodologies (like Waterfall) spend much of their time initially gathering requirements. They put heavy emphasis on doing use-case and work-flow diagrams and writing up a project specification. Eventually, these documents are put in front of the client for a sign off. This process is very time consuming, and often times the documents generated through this process are heavily revised or even tossed out as requirements, scope, time frames and budgets change. Obviously, some level of requirements gathering is needed, because this process illuminates the problems implicit in the project. XP attempts to capture just the requirements needed to get started. The rest of the requirements are organically captured through deploying, documenting and discussing the incremental builds. Ideally, an incremental build should be a version of the application that includes only working features. For example, my first incremental build of the server only had the ability to search the Flickr database using their API. By releasing only working code you can ensure that the feedback you get will be relative to the existing features. Otherwise feedback may contain technical and conceptual errors created because of partially implemented future functionality. Reduce Serial Dependencies Complex problems often have elements, which need to be solved in a specific sequence to find a correct solution. Working as an individual means that they alone choose the sequence of development. However, collaborating between many people complicates development because it adds an additional layer of dependency to completing tasks. When developing as a team an effort should be made to work in parallel wherever possible, which means allowing team members to develop without having to wait for other members to finish first. For example, Angus the other develop and I agreed to divide the development process into two halves; the server and the presentation layer. When the presentation layer needs information from the server he will mock up the data into the format, which he desires. This allows him to keep forging ahead without waiting on my server functionality. Additionally, once I am ready to implement the methods needed to support his visualizations I can use his mock data as a template. Reduce Functionality Visualization projects often need to present the retrieved data in a variety of visual contexts. Being able to pivot or drill down into data affords users deeper understandings of what they the visualization is trying to illuminate. However, developing a rich feature set is not easy. Each new function that you add to the application is another choice, which the user must consider before making their choice. Having too many options can make users freeze unable to progress, sometimes called "analysis paralysis". It is essential to release an application with just the most specific features needed to understand the problem. Over time developers can extend the application and add additional functionality which would be "nice to have". Limiting the functionality from outset allows developers to deliver the project sooner and allows users to learn the new software faster. Over time as features are added the same users will only need to learn the new subset. For our project, the team has decided to limit the first release to include only the features which are needed to use the cellphone to upload photos and control the visualization from the cellphone.

Current Implementation		This project will extend well beyond the time-period dedicated to this class. Yet, I have made considerable progress during the time I dedicated to it during this quarter. This section describes the features (part and whole), which I implemented during this quarter. Flickr Search I created a search engine, which can query the Flickr.com API for photos, which match either a full-text query or include a specific tag word. The search engine will return the results into a paginated collection, which means the user does not need to wait for all of the results to be loaded into the server’s memory before the first one can be viewed. Instead the results are paginated over a series of pages. By default the server displays ten photos per page, but this is easily changed. The tag search allows you to combine tags to narrow your result set. For example, show me photos tagged with "cars and corvette and red", which would be a narrower result set than just searching for "cars". Custom Tagging Parsing the photo out of an email sent from a cell phone is a non-trivial task. In earlier iterations of this project one of the stumbling blocks was finding a reliable algorithm that would extract the photo from the incoming message. Fortunately, Flickr.com does this extremely well. However, Flickr’s tagging systems is very specific to their needs and therefore once we have downloaded the user’s photo from Flickr I wrap the Flickr tags inside a custom tagging class, which I wrote. The tagging class I wrote allows many types of objects to be associated with a tag not just photos. Over time as people upload and annotate their content it will be very nice to have the flexibility to then associate those tags with other meta data in the database. Voice XML Call Scripts Operators in call centers often interact with their callers by reading from scripts. These call scripts are decision trees, which help route the caller to the information they were looking for by asking a series of detailed questions, meant to narrow the focus of the call. My first call script class attempts to allow the user to access content from the database by working their way through a menu tree of choices. My first call script gives the user the ability to choose one of three choices. They can hear the credits for the work, they can hear the description of the work, or they can search the database using their keypad. The last option is the most complex and once refined will comprise the main interaction modality for this aspect of the work. Presently, a user can only retrieve the most recent photos and the most recent tags, by selecting either the number one or two on their keypad. The goal is to allow users to search using their voice, where they would speak their search criteria into the phone and the server would then extract the terms from the user’s vocal sentence.

Objects		The objects of the database are the classes I created to wrap specific tables in the database. Each object class defines how columns within the database table are to be treated. For example, some classes enforce validation rules before the server is allowed to save a record. Tags need a "name", which is a column in Tag table, which allows only strings. If a user enters an integer or nothing at all the Tag class, which wraps the tag Table will not allow this object to be saved. Photo A photo is any digitized photo, which a user has captured using their camera or cellphone and uploaded to Flickr. Flickr only stores digital images in the jpg format. Tag Search engines, which create indexes of information use text as the main way to generate the catalog. Tagging content means associating extra keywords to the campaign, which do not appear in the textual description using tags. Tags are especially important for binary data like photos because search engines cannot determine the contents of a photo without some associative meta data, which tags provide. Tags can also be used as a new organizational metaphor. Information is often arranged in predetermined ontologies like ones used in category trees. These tree structures can be rigid and requires that information fit neatly into parent child relationships with one another. Tags allow actors to subvert these rigidly organized structures by organically create "folksomonies", which use tags to link data though collaborative organization. Tagging Tags are polymorphic meaning that they can append many types of data including , users, photos and recordings. To afford this many to many association a tagging object does some magic in the database, which routes the tags to the correct object type. Tags create a "through" relationship between the tagged object and the tag itself. This works much like a proxy. CallScript A CallScript is a hierarchical menu tree, which a user navigates to trigger server functionality. A CallScript is created for each discreet task, which needs to be completed to support the user’s control of the visualization. Some CallScripts will be remotely hosted on a third-party VOIP (voice over internet protocol) server, which handles all of the voice recognition functionality. The voice recognition service is needed to traverse the menu. CallScript fragments will also be hosted on the server I am building. The server will dynamically combine these fragments with results from the database to afford user’s access to the database’s content. Recording Users can record and append verbal comments to a photo, in the same way they may use textual tags. The recorded comments are associated to a maximum of one photo, unlike tags which can be affixed to many other photos. User Any person who wants to upload a photo, record an audio message, control a visualization using their cellphone or append a tag to an object. A user does not have to create an account explicitly like you might if you were signing up for a website like Amazon.com. Instead an account is created for a user, each time they call the voice recognition server. The server uses the caller’s phone number to determine uniqueness of the record within the server.

Object Relational Mappings		The object relational mappings describe the conceptual associations between tables within the database. Modern relational databases like Mysql use unique keys within a database record to infer relations to another record. Sometimes those keys can actually be shared between many records within the database. Shared keys are known as foreign keys because the record’s key column actually acts as a pointer to another record. This process of linking tables through keys is the main method databases use to understand often times semantic relationships in the data. I have included this list below to explain how the various tables relate to one another without having to dig into the database specific technical information. A User has_many Photos A User has_many Tags through Taggings A User has_many Taggings A User has_many Recordings A Recording belongs_to User A Recording belongs_to Photo A Recording has_many Tags through Taggings A Recording has_many Taggings A Tag has_many Users through Taggings A Tag has_many Recordings through Taggings A Tag has_many Photos through Taggings A Tag has_many Taggings A Photo belongs_to User A Photo has_many Recordings A Photo has_many Tags through Taggings

Database Schema		The database schema is a database agnostic representation of the columns and tables which constitute the database. While this schema is a development file it is easy to understand even for a non technical person. I am including this file because it gives a detailed description of the various columns, which make up the various tables. ActiveRecord::Schema.define(:version => 3) do create_table "photos", :force => true do \|t\| t.column "user_id", :integer t.column "flickr_id", :integer t.column "url", :string t.column "tag_aggregate", :text t.column "title", :string t.column "comments", :integer t.column "description", :text t.column "created_on", :datetime t.column "modified_on", :datetime end create_table "recordings", :force => true do \|t\| t.column "user_id", :integer t.column "photo_id", :integer t.column "filename", :string t.column "created_on", :datetime t.column "modified_on", :datetime end create_table "taggings", :force => true do \|t\| t.column "taggable_id", :integer t.column "tag_id", :integer t.column "taggable_type", :string end create_table "tags", :force => true do \|t\| t.column "name", :string end create_table "users", :force => true do \|t\| t.column "name", :string t.column "login", :string t.column "email", :string t.column "crypted_password", :string, :limit => 40 t.column "salt", :string, :limit => 40 t.column "created_at", :datetime t.column "updated_at", :datetime t.column "remember_token", :string t.column "remember_token_expires_at", :datetime t.column "activation_code", :string, :limit => 40 t.column "activated_at", :datetime t.column "phone_area_code", :integer t.column "phone_number", :integer t.column "phone_carrier", :string end

References and Resources		Below are a list of references and resources consulted during the writing on this paper. I also included links to help further introduces in greater detail concepts, which are essential to the completion of the project, yet outside the scope of this writing. Ruby on Rails Introduction http://www.onlamp.com/pub/a/onlamp/2005/01/20/rails.html http://api.rubyonrails.org/ http://corelib.rubyonrails.org/ http://manuals.rubyonrails.com/read/chapter/65 Development Methodologies http://www.extremeprogramming.org/ http://www.extremeprogramming.org/rules.html http://gettingreal.37signals.com/ Ruby on Rails Database Schema & Migration Concepts http://www.oracle.com/technology/pub/articles/saternos-rails.html http://media.rubyonrails.org/video/migrations.mov