

Data Manipulation and Analysis

A selection of my coursework on harvesting, processing, and aggregating data.

Date: Fall 2008
Course: SI 601: Data Retrieval and Analysis Techniques


Some examples of my visualizations


Keywords accompanying "weather" in AOL search term queries

This data set consisted of the AOL search queries that were leaked in 2006. The data was stored in a MySQL database, so our script had to query the database for the search terms that accompanied a given keyword.
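The original script is not reproduced here, so what follows is only a minimal sketch of the idea: pull every stored query that mentions a keyword and count the words that appear next to it. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server, and the table name `searches`, the column name `query`, and the sample rows are all hypothetical.

```python
import sqlite3
from collections import Counter


def co_occurring_keywords(conn, term, limit=10):
    """Count the words that appear alongside `term` in stored search queries."""
    # Hypothetical schema: a table `searches` with one text column `query`.
    cursor = conn.execute(
        "SELECT query FROM searches WHERE query LIKE ?", (f"%{term}%",)
    )
    counts = Counter()
    for (query,) in cursor:
        words = query.lower().split()
        # LIKE also matches substrings ("weathervane"), so check whole words.
        if term in words:
            counts.update(w for w in words if w != term)
    return counts.most_common(limit)


# Made-up rows, only to show the call; the real data lived in MySQL.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE searches (query TEXT)")
demo_conn.executemany(
    "INSERT INTO searches VALUES (?)",
    [("weather chicago",), ("chicago weather forecast",), ("pizza",)],
)
print(co_occurring_keywords(demo_conn, "weather"))
```

With a MySQL driver the same function shape should carry over, though the placeholder syntax and connection setup would differ.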


Top 10 Contributors to U.S. Presidential Candidates

For this assignment, I wrote a script that queried the API of the government transparency website OpenSecrets.org for the top contributors to John McCain and Barack Obama, parsed the returned XML, and formatted the data so it could be visualized. Surprisingly, four of the top contributors donated to both candidates, and all four are financial institutions.
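The parsing step might look roughly like the sketch below. It skips the network request and works on a small inline XML string instead. The element and attribute names (`contributor`, `org_name`, `total`) are assumptions about the response layout, and the organizations and amounts are placeholders, not real figures.

```python
import xml.etree.ElementTree as ET

# Placeholder response; the layout is assumed, the values are invented.
SAMPLE_XML = """<response>
  <contributors cand_name="Candidate A">
    <contributor org_name="Example Bank" total="300"/>
    <contributor org_name="Sample University" total="500"/>
    <contributor org_name="Placeholder Corp" total="100"/>
  </contributors>
</response>"""


def top_contributors(xml_text, limit=10):
    """Return (organization, total) pairs sorted by total, largest first."""
    root = ET.fromstring(xml_text)
    rows = [
        (node.get("org_name"), int(node.get("total")))
        for node in root.iter("contributor")
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def shared_contributors(rows_a, rows_b):
    """Return the organizations that appear in both candidates' lists."""
    return sorted({name for name, _ in rows_a} & {name for name, _ in rows_b})


print(top_contributors(SAMPLE_XML, limit=2))
```

Running `top_contributors` once per candidate and passing both results to `shared_contributors` is one way to surface the organizations that gave to both.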


Number of Movies Made per Genre Worldwide, 1888-2008

To find out which movie genres have become more or less popular since 1888, I wrote a script to parse the flat text file of movie data provided by IMDb. The script relied heavily on regular expressions to extract and format the genre information. The resulting graph shows that genres such as westerns have declined sharply, while action movies have become more and more common.
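As an illustration of the regular-expression approach, here is a minimal sketch that counts titles per year and genre. It assumes each data line looks like `Title (Year)`, followed by one or more tabs and a single genre. That layout is my assumption rather than a documented guarantee, and the titles in the usage example are placeholders.

```python
import re
from collections import Counter

# Assumed line layout: Title (YYYY) <optional extras> TAB(s) Genre
GENRE_LINE = re.compile(
    r"^(?P<title>.+?)\s\((?P<year>\d{4})(?:/[IVX]+)?\)"  # title and year
    r".*?\t+(?P<genre>[\w-]+)\s*$"                       # tabs, then genre
)


def count_genres_by_year(lines):
    """Tally (year, genre) pairs, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = GENRE_LINE.match(line)
        if match:  # headers and unknown years such as (????) are ignored
            counts[(int(match.group("year")), match.group("genre"))] += 1
    return counts


sample_lines = [
    "Example Western (1950)\t\t\tWestern\n",
    "Example Action Film (2005)\t\tAction\n",
]
print(count_genres_by_year(sample_lines))
```

Grouping the resulting keys by year or decade gives the per-genre series that a chart like the one described would need.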