Saturday Night Live on Yahoo Screen – mapping keywords to vast back catalog of sketches and video collections
(No longer live due to deprecation of Yahoo Screen)
Yahoo Screen was planning to release a massive catalog of Saturday Night Live clips spanning the series' entire run. I was approached to create relevant search features to highlight the licensed content.
I was given a list of video collections (recurring characters, hosts, themes, etc.) and an enormous spreadsheet containing video metadata for every item in the catalog. My task: organize this data into discrete sets of keywords that would surface the precise, relevant video result or carousel of results, and investigate existing SNL search coverage to ensure those videos were covered.
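To illustrate the spreadsheet-to-keywords step, here is a minimal Python sketch; the file name and column names (video_id, title, character, host, collection) are hypothetical stand-ins for the real catalog's schema, not the actual tooling we used.

```python
import csv
from collections import defaultdict

# Hypothetical file and column names standing in for the real catalog schema.
triggers = defaultdict(set)

with open("snl_catalog.csv", newline="") as f:
    for row in csv.DictReader(f):
        video_id = row["video_id"]
        # Candidate keywords come from several metadata fields per video.
        for keyword in filter(None, (row["title"], row["character"],
                                     row["host"], row["collection"])):
            triggers[keyword.strip().lower()].add(video_id)

# One matching ID yields a single clip; several yield a carousel.
for keyword, ids in sorted(triggers.items()):
    result_type = "clip" if len(ids) == 1 else "carousel"
    print(f"{keyword!r} -> {result_type}: {sorted(ids)}")
```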
In addition to creating basic query patterns for such a huge catalog, I had some fun creating aliases for the more popular videos, such as “needs more cowbell.” Specific clips and characters were likely to appear in search logs shortly after new episodes aired, so I planned to monitor dashboards and add new videos and keywords on Sunday mornings.
I turned around a huge list of keywords in a matter of days, so the search experience launched at the same time as the Screen content. Post-launch, I trained a teammate to look up video IDs and identify relevant query patterns so we could take advantage of the unique, timely spikes that only SNL produced.
After Yahoo acquired Tumblr, Search leadership asked me to find a way to feature Tumblr content in web search results.
I started out with several things to consider:
Understanding the type of content on Tumblr
Determining what content, if any, could map to real web search user needs
Figuring out what metadata we could extract from Tumblr posts and whether it was enough to work well in our content management platform
Learning as much as we could from what little data the Tumblr team could share with us
Because I found little evidence of existing search-to-Tumblr behavior in Yahoo's logs, and because Tumblr's content was freewheeling and relatively unstructured, we had to experiment.
The first test featured content from specific Tumblr users (celebrities, online personalities, organizations: entities with discrete matching queries) in a simple image carousel. This approach had two limitations. First, only image-type posts could be displayed, so blogs built on text posts, links, and the like appeared with limited results or none at all, despite frequent updates. Second, we could only trigger on keywords with a clear match to a single blog (e.g., Beyonce, ZooBorns). As a result, coverage was low, and leadership tasked us with significantly expanding the experience.
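As an illustration of that single-blog trigger model, the sketch below uses today's public Tumblr API v2 (which requires an api_key); the production system ran on Yahoo's internal platform, but the constraint was the same: trigger only on an exact keyword-to-blog match and keep only photo posts.

```python
import requests

# Hypothetical trigger list: each keyword maps to exactly one blog.
TRIGGERS = {
    "beyonce": "beyonce.tumblr.com",
    "zooborns": "zooborns.tumblr.com",
}

def image_carousel(query: str, api_key: str, limit: int = 10):
    """Return image URLs for a carousel, or None if the query doesn't trigger."""
    blog = TRIGGERS.get(query.strip().lower())
    if blog is None:
        return None
    # Photo-posts endpoint: non-image post types are excluded by design,
    # which is exactly the limitation described above.
    resp = requests.get(
        f"https://api.tumblr.com/v2/blog/{blog}/posts/photo",
        params={"api_key": api_key, "limit": limit},
        timeout=10,
    )
    posts = resp.json()["response"]["posts"]
    return [p["photos"][0]["original_size"]["url"] for p in posts if p.get("photos")]
```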
To expand coverage, I needed to rely on automatic triggering methods that offered far less control over what content appeared in search results. Despite concerns about relevance and quality, we launched a test on a small percentage of search traffic. The initial test had to be taken offline within days because, although the backend team took steps to remove content flagged as “adult,” pornographic results (and worse) slipped through.
Search leadership was determined, however, and resources were provided to dramatically improve the indexing for quality and cleanliness. The backend team also added logic governing when to return content at all, based on timeliness and other factors. A designer was brought in to collaborate on a unique Tumblr template that accounted for the variable content types and included more Tumblr branding (colors, logos). The UX and content improvements launched in a test bucket, and although the metrics weren't impressive, nothing major broke, and the feature launched for all desktop web traffic.
Seeking to experiment further in hopes of improving the feature and better understanding its performance, I took the initiative to categorize the queries that triggered the Tumblr module and identify categories that might be well served by Tumblr content. Using existing keyword lists that mapped roughly to a dozen or so categories, I set up a test-bucket version of the module limited to those categories, with logging for each. I also wanted to see whether other factors affected performance, including where the module appeared on the page (“slotting”) and how consistently it appeared (whether to ignore the backend's display logic). I tracked my experiment's performance against the primary module's weekly, using that data to make small tweaks to each category along the way.
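A minimal sketch of the experiment bucket's logic, assuming toy category keyword lists (the real lists were far larger) and a made-up logging format: classify the query, log the impression by category, and optionally ignore the backend's display logic.

```python
import logging
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tumblr_experiment")

# Toy keyword lists; the real lists were far larger and covered ~a dozen categories.
CATEGORY_KEYWORDS = {
    "food": {"recipe", "cake", "pizza"},
    "books": {"novel", "author", "paperback"},
    "holidays": {"halloween", "christmas", "thanksgiving"},
}

def categorize(query: str) -> Optional[str]:
    """Return the first category whose keyword list overlaps the query tokens."""
    tokens = set(query.lower().split())
    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens & keywords:
            return category
    return None

def show_module(query: str, force_display: bool = True) -> bool:
    """Trigger only for whitelisted categories and log each impression by
    category so per-category performance can be compared weekly.
    force_display=True ignores the backend's display logic, one of the
    consistency variables the experiment varied."""
    category = categorize(query)
    if category is None:
        return False
    log.info("tumblr_module_impression category=%s query=%s", category, query)
    return force_display
```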
The great Tumblr in search experiment ended after about a year and a half, when leadership decided the investment was no longer justifiable. Despite the effort’s ultimate failure, I was recognized for my contribution and creativity.
Key categories in my final experiment did show some lift in performance: food, books, holidays, fictional characters, TV series, and movie series.
The editorial team that worked with Yahoo’s search content management platform was broadly tasked with generating and developing ideas for new search features. One method was to browse government developer sites for APIs and structured data that might map to valuable search user needs. That’s how I learned about the United States Geological Survey’s “latest earthquakes” API.
First, I wanted to determine how often people search for earthquake information and how they formulate such queries. Once internal analytics tools validated that there was sufficient search volume to proceed, I used a tool that let me pull queries by URL; that is, I got a report containing search queries that resulted in a web-result click anywhere on the “earthquake.usgs.gov” domain. I classified these queries as “earthquake data intent” or “not earthquake data intent,” and further by patterns like “6.2 earthquake,” “earthquake in Japan,” and “latest LA earthquakes.”
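A rough sketch of that pattern classification, with hypothetical regular expressions standing in for what was actually manual spreadsheet work:

```python
import re

# Hypothetical rules distilled from the query report; the actual
# classification was done by hand.
PATTERNS = [
    ("magnitude", re.compile(r"\b\d\.\d\s+earthquake\b")),             # "6.2 earthquake"
    ("location", re.compile(r"\bearthquakes?\s+in\s+\w+")),            # "earthquake in Japan"
    ("latest", re.compile(r"\b(latest|recent)\b.*\bearthquakes?\b")),  # "latest LA earthquakes"
]

def classify(query: str) -> str:
    q = query.lower()
    if "earthquake" not in q:
        return "not earthquake data intent"
    for label, pattern in PATTERNS:
        if pattern.search(q):
            return f"earthquake data intent ({label})"
    return "earthquake data intent (other)"

for q in ("6.2 earthquake", "earthquake in Japan",
          "latest LA earthquakes", "san andreas fault map"):
    print(q, "->", classify(q))
```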
Once I identified the main earthquake-intent patterns, I needed to determine how best to handle location information. The USGS API supported location both by a radius around a single point (appropriate for queries containing a city name) and by a bounding box with minimum and maximum latitude and longitude (better suited to countries and regions). I created a list of popular locations for which a bounding box was more useful than a radius and let a global location list handle everything else.
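For illustration, here is how those two location modes look against the USGS FDSN event service as it exists today; the parameter names come from the current public documentation and may differ from the API we used then, and the Japan bounding box and Los Angeles coordinates are approximate examples.

```python
import requests

# Endpoint and parameter names from the current public USGS FDSN event
# service docs; the "latest earthquakes" API we used may have differed.
USGS = "https://earthquake.usgs.gov/fdsnws/event/1/query"

BOUNDING_BOXES = {  # better for countries and regions
    "japan": {"minlatitude": 24.0, "maxlatitude": 46.0,
              "minlongitude": 122.0, "maxlongitude": 146.0},
}
POINTS = {  # better for cities: a point plus a search radius
    "los angeles": {"latitude": 34.05, "longitude": -118.24, "maxradiuskm": 100},
}

def latest_earthquakes(location: str, limit: int = 10) -> list:
    params = {"format": "geojson", "orderby": "time", "limit": limit}
    loc = location.lower()
    params.update(BOUNDING_BOXES.get(loc) or POINTS.get(loc) or {})
    features = requests.get(USGS, params=params, timeout=10).json()["features"]
    return [f["properties"] for f in features]

for quake in latest_earthquakes("japan", limit=3):
    print(quake["mag"], quake["place"])
```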
Finally, the UX proved trickier than expected. I initially wanted to include a map component, which our tool supported, but a front-end bug prevented it from rendering correctly, so I worked with a designer to massage a text-only table into an appealing result that presented all the relevant information. (Because the bug did not affect any high-priority features, the engineering team was unable to devote time to fixing it.)
Earthquakes represented an interesting use case for location handling. The search feature routinely spiked any time the earth moved and appeared alongside news headlines.
Community Season 6 – featured Yahoo Screen video content + keyword list cultivation for TV series knowledge graph
(no longer active due to deprecation of Yahoo Screen)
When Yahoo Screen made its foray into full-length original TV series with Community and others, we were ready to go in search. I made sure the latest-episodes carousel had excellent coverage and that the TV Series Knowledge Graph contained accurate, detailed profiles.
Vertical search experience embedded in web search results – set up in support of Yahoo’s Digital Magazines strategy (Tech, Style, Movies, etc.)
Yahoo launched several new media verticals called “Magazines” without migrating the corresponding vertical search experiences, which were built on an older platform. Instead, a search product manager was tasked with adding vertical content to web search, filtered according to the user’s site of origin, and they enlisted my support to create and launch the necessary features in our search content management platform.
Each search sent from a search box carried more than the query text itself: it included referral information, usually unique to the property or even the page. I used this information to determine when a search experience should appear and to pass variables to the backend. The news backend, which indexed news from hundreds of sources worldwide, including Yahoo’s own sites, could return articles matching any query, filtered by property and sorted by freshness.
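A simplified sketch of that routing, assuming a referral parameter named fr, hypothetical per-Magazine codes, and made-up backend parameter names (q, sort, site); the real values and interfaces were internal and property-specific.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical referral codes; the real values were unique per property/page.
VERTICAL_BY_REFERRAL = {
    "yhs-tech": "tech.yahoo.com",
    "yhs-style": "style.yahoo.com",
    "yhs-movies": "movies.yahoo.com",
}

def backend_params(search_url: str) -> dict:
    """Build news-backend parameters from an incoming search URL: the query
    itself, a freshness sort, and a per-property site filter when the
    referral code maps to a Magazine."""
    qs = parse_qs(urlparse(search_url).query)
    params = {"q": qs.get("p", [""])[0], "sort": "freshness"}
    referral = qs.get("fr", [""])[0]
    if referral in VERTICAL_BY_REFERRAL:
        params["site"] = VERTICAL_BY_REFERRAL[referral]
    return params

print(backend_params("https://search.yahoo.com/search?p=iphone+review&fr=yhs-tech"))
```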
Expanding on a template design already in use for news results in general web search, I created a “vertical search” feature that showed up to 10 results with thumbnails and paginated when more than 10 stories matched a given query. This large search feature appeared above the usual web algorithmic links and any other non-monetized search features.
Product owners also asked for Magazine stories to be highlighted in web search results (in a less aggressive form, of course). To satisfy their primary need, I created simple “navigational” features with a maximum of 3 stories, triggered by searches for Magazine names and featured writers. Because Magazine stories were indexed along with all news content, they could also appear in regular news search results without any extra effort.
No-maintenance, low-effort vertical search launched on all new Magazines. Sites and big-name authors were effectively promoted in web search.
Also created season details experience (no longer functional).
The Knowledge Graph team wanted to develop major feature improvements to entity profiles around People, Movies, and TV Series with competitor parity as a goal. As a TV enthusiast, I volunteered to support TV Series profile enhancements.
Because I had been responsible for migrating the existing TV series experience into our content management platform, I was well acquainted with the data elements and quality we already had and could create a clear list of requirements for replacing and significantly improving them. I was also able to compare this list against the design mocks created for the effort. The initial design did not account for the quality of images actually available in our database, so I leveraged my experience to strike a balance between design requirements and content realities. I was also responsible for UX copy, including data labels and headers.
After the experience had been live for a time, I learned that our Knowledge Graph database now included content I believed to be valuable to search users: streaming links. While competitors already offered this information, they focused on a la carte options and not popular subscription services, where many series could already be found. I sought and received buy-in from product owners to pursue major changes to the TV series experience.
Streaming links were stored at the episode level, not the series or even season level, and our keyword lists were manually generated. Even so, I found a scalable way to serve episode-level links without manually curating trigger lists covering thousands of individual episodes. I also used this data to identify whether a series could be found on the streaming sites Netflix, Hulu, and Amazon, and if so, to construct a series-level link.
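A minimal sketch of that series-level roll-up, assuming a hypothetical record shape for episode links (the real Knowledge Graph storage differed): if any episode of a series has a link on a subscription provider, the series is treated as available there.

```python
from collections import defaultdict

SUBSCRIPTION_PROVIDERS = {"netflix", "hulu", "amazon"}

# Hypothetical record shape: one row per (episode, provider) link.
EPISODE_LINKS = [
    {"series_id": "tv/community", "provider": "hulu",
     "url": "https://www.hulu.com/watch/ep1"},
    {"series_id": "tv/community", "provider": "netflix",
     "url": "https://www.netflix.com/title/70155589"},
]

def series_availability(links):
    """Roll episode-level links up to the series level: if any episode has a
    link on a subscription provider, mark the series available there and
    keep one representative link."""
    available = defaultdict(dict)
    for link in links:
        if link["provider"] in SUBSCRIPTION_PROVIDERS:
            available[link["series_id"]].setdefault(link["provider"], link["url"])
    return dict(available)

print(series_availability(EPISODE_LINKS))
```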
The Knowledge Graph team also wanted to perform a quality audit of their data for several key topics. I served as the lead for TV Series, creating a finite list of elements to review and guidelines for assessing quality, moderating discussion, and authoring a report on the results.
TV Series in web search results went from a very basic snapshot of title, synopsis, network, logo image, and cast to an interactive set of features including the same key base content as well as episodes, seasons, air dates, and streaming links.
While these solutions were not necessarily an ideal user experience, they allowed users to easily click and find information they might want without too much effort.
The results of the quality audit were used to develop systemic improvements to the way the team merged and managed data sources.
2016 US Presidential Election search features, including candidate Knowledge Graph with fundraising data from OpenSecrets.org and polling from Real Clear Politics.
Fact-checked quotes from the PolitiFact Truth-O-Meter.
Where candidates stand on the issues, with researched positions from ProCon.org and manually curated “explore related” suggested queries.
I was approached by search product teams working on distinct experiences around the then-upcoming 2016 U.S. primary election to offer feedback on proposed designs and organize editorial efforts in content curation and quality validation.
Led team effort to develop detailed list of potential features along with timeline, content source(s), and priority. Features based on team knowledge and real user search data/query patterns.
Researched content sources for features that would be relevant in early 2016 (general politics, candidate research). Key requirements: high-quality, politically neutral data; a structure compatible with our content management platform; and delivery in XML or JSON, or data that could be extracted and converted to a usable form.
Joined product team meetings to give updates on content development and share feedback on whether design matched real content and user needs.
Created and launched experiences on web search and provided support for other platforms using our work.
By February 2016, we launched several features on web search:
Presidential candidate knowledge graph, incorporating party affiliation, donation data from OpenSecrets.org, polling data, and political office history.
Latest quotes with “truthfulness” ratings from PolitiFact for all presidential candidates.
Candidate stances on 20+ key political issues extracted from ProCon.org with manually curated browse element to help search users explore candidate opinions.
Political cartoon of the day.
Additionally, detailed requirements for election results and future election experience ideas were documented.