Web Design Workshop DIG 4104c Spring 2014 Dr. J. Michael Moshell University of Central Florida Lecture 9: MoranVision: A Case Study www.redbugbblle.com
About David Moran
* Grad student in Digital Media MFA Program
* Gay activist
* Non-driver, fascinated with urban landscape. What are the special problems of pedestrians in Orlando?
* MFA Project: Interactive Photo Exhibition "Dead Quare Walking"
* Concept: Halloween at Parliament House, then walk 15 miles to UCF, all night, with 200+ photos along the way
David is a photographer
* Needed a software delivery environment
* I like to hack software, so I am building the environment for him.
CONCEPT: The viewer is shown a picture and asked to hashtag it. The hashtag is used (in some way) to select the next picture.
The study: what patterns emerge in the hashtags when people react to David's photography?
Design of Basic Architecture
* Front end: HTML5 & JavaScript
* Back end: PHP
* Why? It's what I know, and we want this site to be on the web.
* Front end: arrange the pictures and user interaction. Take in a hashtag, pass it to the back end.
* Back end: compute a cosine distance measure to find the best image, then inform the front end of the choice.
advanceblueprintservice.com
The problems with hashtags
* No easy way to separate out the words, e.g. #sunnyplaceforshadypeople
* How do you find the next picture?
So, we decided to require CamelCase in user input: SunnyPlaceForShadyPeople
(The demo is on a computer, not a mobile device.)
Organizing the Exhibition
Assume you've gathered a word-cloud (list of words) for each of 200 pictures.
How do you organize the pictures? How do you search for the "next picture"?
David's decision: nine "pages" of 4 pictures each. Image groups of about 20 photos per page.
Word Clouds and Distance Metrics
Each picture should have a GROWING WORD-CLOUD as users wander through the exhibition.
Metaphor: how paths are (should be) designed on campuses.
1) Just plant grass and watch where people walk
2) Then put the concrete there.
leveragepoint.typepad.com
Word Clouds and Distance Metrics
So, we need two things:
1) A database structure to associate a large and growing number of words with each picture
2) An algorithm to measure the "distance" between a given hashtag (a small word-cloud) and each picture's word-cloud
Then we will (a) add the hashtag to the CURRENT pic's cloud, and (b) pick as the NEXT picture the one whose cloud is most similar to the hashtag.
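The two steps (a) and (b) can be sketched in a few lines. This is only an illustration under stated assumptions: the slides put this logic in the PHP back end, while the sketch uses JavaScript (the project's front-end language). The names `addTagAndPickNext`, `sharedWordCount`, and the `clouds` object are hypothetical, and skipping the current picture is an assumption the slides do not state.

```javascript
// Hypothetical in-memory store: picture id -> array of words (its word-cloud).
// None of these names come from the project source.
function addTagAndPickNext(clouds, currentId, tagWords, similarity) {
  // (a) Grow the current picture's cloud with the viewer's words.
  clouds[currentId] = (clouds[currentId] || []).concat(tagWords);

  // (b) Score every other picture's cloud against the hashtag.
  let bestId = null;
  let bestScore = 0;
  for (const id of Object.keys(clouds)) {
    if (id === String(currentId)) continue; // assumption: never re-pick the current picture
    const score = similarity(tagWords, clouds[id]);
    if (score > bestScore) {
      bestScore = score;
      bestId = id;
    }
  }
  // nextId stays null when no cloud scores above zero.
  return { nextId: bestId, score: bestScore };
}

// Placeholder similarity: how many hashtag words appear in the cloud.
// Any other measure with the same (tagWords, cloud) signature can be passed instead.
function sharedWordCount(tagWords, cloud) {
  const cloudSet = new Set(cloud);
  return tagWords.filter((word) => cloudSet.has(word)).length;
}
```

Passing the similarity measure as a parameter keeps the bookkeeping separate from the choice of metric, so the placeholder can be swapped out without touching the selection loop.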
Word Clouds and Distance Metrics
Literature research led to a popular distance metric: the Cosine Measure.
1) Take each document and produce a frequency histogram of its words.
2) To measure the distance between two documents, line up their histograms, multiply the matching terms, and add up the products.
Some words are "noise words": to, the, a, etc.
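A minimal sketch of the histogram step, in JavaScript rather than the back end's PHP. The function name `buildHistogram`, the lowercasing, and the decision to skip noise words are assumptions; the slides name only "to", "the", and "a" as noise words, so the list here stops there.

```javascript
// Assumed noise-word list: only the three words the slides mention.
const NOISE_WORDS = new Set(["to", "the", "a"]);

// Count how often each word occurs, ignoring case and noise words.
function buildHistogram(words) {
  // No prototype, so words like "constructor" cannot collide with built-ins.
  const histo = Object.create(null);
  for (const raw of words) {
    const word = raw.toLowerCase(); // assumption: "Sunny" and "sunny" are the same word
    if (NOISE_WORDS.has(word)) continue;
    histo[word] = (histo[word] || 0) + 1;
  }
  return histo;
}
```

For example, `buildHistogram(["The", "distance", "a", "distance", "and"])` yields a histogram with `distance` counted twice and `and` counted once.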
So, Job 1: Build cosine metric
    $histo['a'] = 2;
    $histo['distance'] = 2;
    $histo['and'] = 2;
    // etc.
Idea: use the SHORTER list to search the LONGER one. If a word is not in the shorter list, its product = 0 anyhow.
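Problem: Long vs. Short
The raw sum of products grows with the length of the documents, so long and short word-clouds cannot be compared directly.
The "cosine" must always be between 0 and 1. Analogy: the "angle" between two vectors.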
Solution: Normalization
The "cosine" must always be between 0 and 1. Analogy: the "angle" between two vectors.

    cos(a, b) = dot(a, b) / sqrt( dot(a, a) * dot(b, b) )

And now, if cos(a, b) = 0.0, the two have NO words in common. If cos(a, b) = 1.0, words AND frequencies match perfectly (or the frequencies differ only by a constant factor).
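The normalized measure can be sketched as follows, again in JavaScript rather than the back end's PHP. The function names `count`, `dot`, and `cosine` are illustrative, histograms are assumed to be plain objects mapping word to count, and returning 0 for an empty histogram is an assumption. The dot product walks the shorter histogram, as suggested earlier in the slides.

```javascript
// Look up a word's count, treating missing words as 0.
function count(histo, word) {
  return Object.prototype.hasOwnProperty.call(histo, word) ? histo[word] : 0;
}

// Sum of products of matching terms. Only the shorter histogram is walked,
// because a word missing from it contributes 0 to the sum anyway.
function dot(a, b) {
  const [shorter, longer] =
    Object.keys(a).length <= Object.keys(b).length ? [a, b] : [b, a];
  let sum = 0;
  for (const word of Object.keys(shorter)) {
    sum += shorter[word] * count(longer, word);
  }
  return sum;
}

// Cosine of the "angle" between two histograms: 0 = nothing shared, 1 = same direction.
function cosine(a, b) {
  const norm = Math.sqrt(dot(a, a) * dot(b, b));
  return norm === 0 ? 0 : dot(a, b) / norm; // assumption: an empty histogram scores 0
}
```

With these definitions, `cosine({ sunny: 1 }, { rain: 3 })` is 0, and `cosine({ sunny: 2, place: 2 }, { sunny: 1, place: 1 })` is 1 even though one cloud is "longer" than the other, which is the point of normalizing.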
Next problem: CamelCase HashTags
How to break up such a creature into words? Some quick research led to a regular-expression tool.
This will take CamelCaseHashTags and produce an array like:
    ('C', 'amel', 'C', 'ase', 'H', 'ash', 'T', 'ags');
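The regular-expression tool itself is not shown on the slide, so the following is a sketch in JavaScript and not the project's code. `splitAtCapitals` reproduces the shape of the array above (each capital separated from its tail), and `splitCamelCase` uses a different pattern that keeps each capital attached to its word. Both function names are hypothetical.

```javascript
// Splitting on a captured capital letter keeps the capitals as separate items,
// which gives the same shape as the array shown on the slide.
function splitAtCapitals(tag) {
  return tag.split(/([A-Z])/).filter(Boolean); // drop the empty leading piece
}

// Alternative: match whole words, each starting at a capital letter.
// A leading "#" is stripped, and a lowercase first word is still captured.
function splitCamelCase(tag) {
  const cleaned = tag.replace(/^#/, "");
  return cleaned.match(/[A-Z][a-z0-9]*|[a-z0-9]+/g) || [];
}
```

Here `splitAtCapitals("CamelCaseHashTags")` gives `['C', 'amel', 'C', 'ase', 'H', 'ash', 'T', 'ags']`, while `splitCamelCase("SunnyPlaceForShadyPeople")` gives `['Sunny', 'Place', 'For', 'Shady', 'People']`.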
Designing the Pictionary
We want a structure that allows us to add words without limit to each picture. Here's what we came up with.
INPUT: Excel spreadsheet (provides a "starter kit" of tags)
Pictionary: Output: an "array of arrays". Each item in the wordlist has fields to store WHO added it and what PAGE they were on at the time. (Note the 'whole-tag' version here.)
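The actual structure appears only as an image on the slide, so the following is a guess at its general shape and not the project's layout. The field names `picture`, `wordlist`, `word`, `who`, and `page`, and the helpers `addWord` and `wordsFor`, are all hypothetical. It is written in JavaScript, while the original structure lives in PHP.

```javascript
// Add one word to a picture's wordlist, recording who added it and from which page.
// A new entry is created the first time a picture number is seen.
function addWord(pictionary, picture, word, who, page) {
  let entry = pictionary.find((p) => p.picture === picture);
  if (!entry) {
    entry = { picture, wordlist: [] };
    pictionary.push(entry);
  }
  entry.wordlist.push({ word, who, page });
  return entry;
}

// Return just the words for one picture, e.g. to feed a histogram.
function wordsFor(pictionary, picture) {
  const entry = pictionary.find((p) => p.picture === picture);
  return entry ? entry.wordlist.map((item) => item.word) : [];
}
```

A starter kit could then be loaded with calls such as `addWord(pictionary, 1, "sunny", "starter", 1)`, and visitor tags appended the same way, so each wordlist grows without a fixed limit.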
Skipping much detail, a late-stage issue:
How do we discover .jpg vs .mov, and display each one appropriately?
First attempt: let the back end do it. Wasted some hours. Decided that the back end should NOT KNOW about filetype. Why? Simplicity. Just find the best-match number and pass it forward.
Second attempt: JavaScript must check the file extension and act appropriately.
BUT: JavaScript cannot read local files! That is part of the security model.
** JavaScript can read URLs; it's web-oriented
** We are running in MAMP, so "files are also URLs".
** We track down a function
We wrap it in our own 'file_exists' function.
And ... we build it into a picture-getter, to decide whether a particular file is .mov or .jpg.
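The function the slides refer to is not reproduced in the text, so this is a hypothetical sketch of the same idea and not the project's code. It assumes the pictures are served as `<baseUrl>/<number>.jpg` or `<baseUrl>/<number>.mov` (the naming scheme is a guess) and uses the standard `fetch` API with a HEAD request. `urlExists` and `findMedia` are illustrative names, and the `fetchImpl` parameter exists only so the sketch can be tried without a server.

```javascript
// Resolve to true when a HEAD request for the URL succeeds.
async function urlExists(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url, { method: "HEAD" });
    return response.ok;
  } catch (err) {
    return false; // a network failure is treated as "not there"
  }
}

// Picture-getter: try the .jpg first, then the .mov, and report which kind was found.
async function findMedia(baseUrl, number, fetchImpl = globalThis.fetch) {
  for (const ext of ["jpg", "mov"]) {
    const url = `${baseUrl}/${number}.${ext}`;
    if (await urlExists(url, fetchImpl)) {
      return { url, kind: ext === "mov" ? "video" : "image" };
    }
  }
  return null; // neither file is being served
}
```

This keeps the back end ignorant of file types, as the slides decided: it passes forward only the best-match number, and the front end works out what is actually there.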
Our showpicture function has a bizarre feature ... the <video> tag MUST NOT CONTAIN a newline, or we get the dreaded Mystery Syntax Error.
To show the video, I just ... jammed the HTML for a video tag directly into the node's innerHTML. To my amazement, it worked.
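A sketch of the innerHTML approach, assuming the markup is built as a JavaScript string. The slides do not explain the Mystery Syntax Error. One plausible cause is that a single- or double-quoted JavaScript string literal cannot span source lines, so a line break inside the quoted tag ends the string early. The function name `videoHtml` and the `controls` attribute are assumptions.

```javascript
// Build the markup for a video tag as one single-line string.
// encodeURI also escapes double quotes, so the URL cannot break out of the attribute.
function videoHtml(url) {
  return '<video controls src="' + encodeURI(url) + '"></video>';
}

// In the browser, with `node` standing for whatever element holds the picture:
//   node.innerHTML = videoHtml("pics/7.mov");
```

If the markup does need to be laid out over several lines in the source, a template literal (backticks) is allowed to contain line breaks where a quoted string is not.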
To show the picture (.jpg), more conventionally: I didn't figure out how to add a <video> node as a child. That would have been more legitimate, methinks.
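For comparison, a sketch of the child-node approach that the slide calls more legitimate. It is an assumption about how this could be done, not the project's code. It uses the standard DOM calls `document.createElement` and `appendChild`. The function name `showMedia`, the `{ url, kind }` argument shape, and the `doc` parameter (present only so the function can be tried outside a browser) are hypothetical.

```javascript
// Replace the container's contents with an <img> or <video> node for the given media.
function showMedia(container, media, doc = globalThis.document) {
  const isVideo = media.kind === "video";
  const node = doc.createElement(isVideo ? "video" : "img");
  node.src = media.url;
  if (isVideo) node.controls = true; // assumption: show the browser's playback controls

  container.textContent = ""; // removes whatever was shown before
  container.appendChild(node);
  return node;
}

// In the browser, assuming an element with the hypothetical id "stage":
//   showMedia(document.getElementById("stage"), { url: "pics/7.mov", kind: "video" });
```

Because the node is built from properties and no markup string is involved, the newline problem with the <video> tag does not arise here.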
Demo the work-in-progress
Note: for debugging purposes, if we get no tag-match, I show a nice kitty. In the "real" project, we'll show a random image if there is no match.
For analysis, I show the correlations and word-clouds below (via the return-message)
Status:
* Presenting at Information Fluency Conference next week. Defending thesis project, late March.
* Still gotta get the page-2, etc. working
* Currently using $_SESSION to accumulate tags, but must put them into a database
* Source on course website if you want to look at it, borrow ideas, etc.