Saturday Night Live on Yahoo Screen – mapping keywords to vast back catalog of sketches and video collections
(No longer live due to deprecation of Yahoo Screen)
Yahoo Screen planned to release a massive catalog of Saturday Night Live clips spanning the series’ entire run. I was approached to create relevant search features to highlight the licensed content.
I was given a list of video collections (recurring characters, hosts, themes, etc.) and an enormous spreadsheet containing video metadata for every item in the catalog. My task: organize this data into discrete sets of keywords that would surface the precise, relevant video result or carousel of results, and investigate existing SNL search coverage to ensure those videos were covered.
In addition to creating basic query patterns for such a huge catalog, I had some fun creating aliases for the more popular videos, such as “needs more cowbell.” Specific clips and characters were likely to appear in search logs shortly after new episodes aired, so I planned to monitor dashboards and add new videos and keywords on Sunday mornings.
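A minimal sketch of how alias keywords like these might map to catalog video IDs; the platform’s actual mechanism isn’t described here, and all aliases and IDs below are invented for illustration.

```python
# Hypothetical alias table mapping normalized queries to catalog video IDs.
# These aliases and IDs are invented examples, not real catalog data.
ALIASES = {
    "needs more cowbell": "snl-more-cowbell",
    "more cowbell": "snl-more-cowbell",
    "i gotta have more cowbell": "snl-more-cowbell",
}

def resolve_alias(query):
    """Return a video ID if the normalized query matches a known alias."""
    return ALIASES.get(query.strip().lower())
```

New aliases spotted in Sunday-morning dashboards would simply be appended to the table.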
I turned around a huge list of keywords in a matter of days so the search experience could launch at the same time as the Screen content. Post-launch, I trained a teammate to look up video IDs and identify relevant query patterns so we could take advantage of the unique, timely spikes that only SNL had.
After Yahoo acquired Tumblr, Search leadership asked me to find a way to feature Tumblr content in web search results.
I started out with several things to consider:
Understanding the type of content on Tumblr
Determining what content, if any, could map to real web search user needs
Figuring out what metadata we could extract from Tumblr posts and whether it was enough to work well in our content management platform
Learning as much as we could from what little data the Tumblr team could share with us
Because I could find little evidence in our logs of existing Yahoo-search-to-Tumblr behavior, and because Tumblr’s content is freewheeling and relatively unstructured, we had to experiment.
The first test featured content from specific Tumblr users (celebrities, online personalities, organizations: entities with discrete matching queries) in a simple image carousel. This approach had limitations: only image-type posts could be displayed, so blogs with text posts, links, and the like would surface limited results or none at all, despite frequent updates; and we could only trigger on keywords that clearly matched a single blog (e.g., Beyonce, ZooBorns). As a result, coverage was low, and leadership tasked us with significantly expanding the experience.
To accomplish this, I needed to rely on automatic triggering methods that offered far less control over what content appeared in search results. Despite concerns about relevance and quality, we launched a test for a small percentage of search traffic. The initial test had to be taken offline within days because, although the backend team took steps to remove content flagged as “adult,” pornographic results (and worse) slipped through.
Search leadership was determined, however, and resources were provided to dramatically improve the indexing for quality and cleanliness. The backend team also added logic governing when to return content at all, based on timeliness and other factors. A designer was brought in to collaborate on a unique template for Tumblr that accounted for the variable types of content and included more Tumblr branding (color, logos). The UX and content improvements launched in a test bucket, and although metrics weren’t impressive, they didn’t cause major problems, and the feature launched for all desktop web traffic.
Seeking to experiment further in hopes of improving the feature and better understanding its performance, I took the initiative to categorize the queries that triggered the Tumblr module and identify categories that might be well served by Tumblr content. I used existing keyword lists roughly mapping to a dozen or so categories and set up a test-bucket version of the module limited to those categories, with logging for each. I also wanted to see whether other factors affected performance, including where the module appeared on the page (“slotting”) and how consistently it appeared (whether to ignore backend display logic). I tracked and compared my experiment’s performance against the primary module’s on a weekly basis, using that data to make small tweaks to each category along the way.
The great Tumblr in search experiment ended after about a year and a half, when leadership decided the investment was no longer justifiable. Despite the effort’s ultimate failure, I was recognized for my contribution and creativity.
Key categories in my final experiment did show some lift in performance: food, books, holidays, fictional characters, TV series, and movie series.
Tree knowledge graph with growing information extracted from Cal Poly’s website and images via Flickr Creative Commons APIs
(no longer live)
Search leadership asked our editorial team to identify and develop “low hanging fruit” reference-type content as part of a competitive parity initiative. In some cases, we took content that had already been curated for standard search features and turned it into expanded right-side “Knowledge Graph”-style elements.
For existing content, this was effectively a UX or template migration: the content was there, it just needed to be moved to a different format. Trees and food nutrition facts were two good examples. The only trick was that the images already curated weren’t large or high-quality enough to serve as a large “hero”-type banner image, so I took advantage of the Yahoo subsidiary Flickr, whose search-based API let us serve only Creative Commons-licensed user images.
Though these features did not target high-volume queries, the effort did not go unnoticed by leadership, and it also served as an opportunity for less experienced team members to build their skills and flex creative muscle.
The editorial team that worked with Yahoo’s search content management platform was broadly tasked with generating and developing ideas for new search features. One method of doing this was to browse government developer sites for APIs and structured data that might map to valuable search user needs. That’s how I learned about the United States Geological Survey’s “latest earthquakes” API.
First, I wanted to determine how often people search for earthquake information and how they formulate such queries. Once internal analytics tools validated that there was sufficient search volume to proceed, I used a tool that let me pull queries by URL: that is, I got a report containing search queries that resulted in a web result click anywhere on the earthquake.usgs.gov domain. I classified these queries as “earthquake data intent” or “not earthquake data intent,” and further by patterns like “6.2 earthquake,” “earthquake in Japan,” and “latest LA earthquakes.”
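As a rough illustration of that classification step, here is a sketch using regular expressions; the actual tooling and pattern definitions were internal, so the patterns below are invented approximations of the query classes described above.

```python
import re

# Invented patterns approximating the query classes described:
# magnitude ("6.2 earthquake"), recency ("latest LA earthquakes"),
# and location ("earthquake in Japan"). Order matters: first match wins.
PATTERNS = [
    ("magnitude", re.compile(r"^\d+(\.\d+)?\s+earthquake", re.I)),
    ("recency",   re.compile(r"\b(latest|recent)\b.*\bearthquakes?\b", re.I)),
    ("location",  re.compile(r"\bearthquakes?\s+in\s+\w+", re.I)),
]

def classify(query):
    """Label a query with its earthquake-data intent pattern, if any."""
    for label, pattern in PATTERNS:
        if pattern.search(query):
            return label
    return "not earthquake data intent"
```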
Once I identified the main earthquake-intent patterns, I needed to determine how best to handle location information. Because the USGS API supported location both by a radius around a single point (appropriate for queries containing a city name) and by a bounding box with minimum and maximum latitude and longitude (which worked for countries and regions), I created a list of popular locations for which a bounding box was more useful than a radius and let a global location list handle everything else.
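The two location modes can be sketched against the public USGS FDSN event web service (earthquake.usgs.gov/fdsnws/event/1/query); the parameter names below follow its public documentation, though the internal integration surely differed.

```python
from urllib.parse import urlencode

# Sketch of the two request shapes against the USGS FDSN event service.
# Coordinates in the examples are illustrative, not curated location data.
BASE = "https://earthquake.usgs.gov/fdsnws/event/1/query"

def radius_query(lat, lon, km=100):
    """Point-plus-radius lookup, suited to city-name queries."""
    params = {"format": "geojson", "orderby": "time",
              "latitude": lat, "longitude": lon, "maxradiuskm": km}
    return BASE + "?" + urlencode(params)

def bbox_query(min_lat, max_lat, min_lon, max_lon):
    """Bounding-box lookup, suited to countries and regions."""
    params = {"format": "geojson", "orderby": "time",
              "minlatitude": min_lat, "maxlatitude": max_lat,
              "minlongitude": min_lon, "maxlongitude": max_lon}
    return BASE + "?" + urlencode(params)
```

A city query might use `radius_query(34.05, -118.24)` while a country like Japan would map to a curated bounding box such as `bbox_query(24, 46, 123, 146)`.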
Finally, the UX proved trickier than hoped. Initially I wanted to include a map component, which was supported in our tool, but a front-end bug prevented it from rendering correctly, so I worked with a designer to massage a text-only table into an appealing result that presented all the relevant information. (Because the bug did not appear for any high-priority features, the engineering team could not devote time to fixing it.)
Earthquakes represented an interesting use case for location handling. The search feature routinely spiked any time the earth moved and appeared alongside news headlines.
Community Season 6 – featured Yahoo Screen video content + keyword list cultivation for TV series knowledge graph
(no longer active due to deprecation of Yahoo Screen)
When Yahoo Screen made its foray into full-length original TV series with Community and others, we were ready to go in search. I made sure that the latest-episodes carousel had excellent coverage and that the TV Series Knowledge Graph contained accurate, detailed profiles.
Vertical search experience embedded in web search results – set up in support of Yahoo’s Digital Magazines strategy (Tech, Style, Movies, etc.)
Yahoo launched several new media verticals called “Magazines” but did not migrate the corresponding vertical search experiences, which were based on an older platform. Instead, a search product manager was tasked with adding vertical content to web search, filtered according to the user’s site of origin, and they enlisted my support to create and launch the necessary features in our search content management platform.
Each query sent from a search box included more than just the user’s query text: it contained referral information, usually unique to the property or even the page. I used this information to determine when a search experience should appear, as well as to pass variables to a backend. The news backend, which indexed news from hundreds of sources worldwide, including Yahoo’s own sites, could return articles matching any query, filtered by property and sorted by freshness.
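A hedged sketch of that routing logic; the real referral parameter format and property codes were internal, so everything below is invented for illustration.

```python
# Invented referral codes mapping a search box's site of origin to a
# backend property filter; unknown referrals fall back to plain news search.
VERTICALS = {
    "yhs-tech": "tech",
    "yhs-style": "style",
    "yhs-movies": "movies",
}

def backend_filter(referral):
    """Translate a search box's referral code into backend query variables."""
    vertical = VERTICALS.get(referral)
    if vertical is None:
        return {"sort": "freshness"}  # no property filter: general news
    return {"property": vertical, "sort": "freshness"}
```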
Expanding on a template design already in use for news results in general web search, I created a “vertical search” feature that included up to 10 results with thumbnails for each and paginated results if there were more than 10 stories matching a given query. This large search feature appeared on top of the usual web algorithmic links and any other non-monetized search features.
Product owners also asked for Magazine stories to be highlighted in web search results (in a less aggressive form, of course). To satisfy their primary need, I created simple “navigational” features with a maximum of 3 stories, triggered by searches for Magazine names and featured writers. Because Magazine stories were indexed along with all other news content, these stories could appear in regular news search results without any extra effort.
No-maintenance, low-effort vertical search launched on all new Magazines. Sites and big-name authors were effectively promoted in web search.
Yahoo News original documentary – full video plays inline on top of search results
Smaller version among search results for alternative keywords
(no longer live)
Yahoo’s video site, Screen, hosted a wide variety of original content over the years, and they periodically asked for this content to appear in search results.
Over the years, I built relationships with contacts working on Yahoo’s video properties to support their efforts to drive views via relevant search traffic. While these videos at first appeared in search only as enhanced hyperlinks, eventually the search front-end team implemented inline player functionality so we could embed Yahoo original content and other videos directly on a search results page. I created an expandable search feature that allowed me to respond quickly to the video team’s requests and generate appropriate keyword lists.
Teams throughout Yahoo knew they could rely on Search to support their efforts to promote premium original content on-network without using significant resources per request.
Also created season details experience (no longer functional).
The Knowledge Graph team wanted to develop major feature improvements to entity profiles around People, Movies, and TV Series with competitor parity as a goal. As a TV enthusiast, I volunteered to support TV Series profile enhancements.
Because I was responsible for migrating the existing TV series experience into our content management platform, I was well acquainted with the data elements and quality we already had and could create a clear list of requirements for what to replace and significantly improve. I was also able to compare this with elements found in design mocks created for the effort. Initially, the planned design did not account for the quality of images available in our database, so I leveraged my experience to strike a balance between design requirements and content realities. I was also responsible for UX copy, including data labels and headers.
After the experience had been live for a time, I learned that our Knowledge Graph database now included content I believed to be valuable to search users: streaming links. While competitors already offered this information, they focused on a la carte options and not popular subscription services, where many series could already be found. I sought and received buy-in from product owners to pursue major changes to the TV series experience.
Streaming links were stored at the episode level, not the series or even season level, and our keyword lists were manually generated. Even so, I found a scalable way to serve episode-level links without manually curating trigger lists covering thousands of individual episodes. I also used this data to identify whether a series could be found on the streaming sites Netflix, Hulu, and Amazon, and if so, to construct a series-level link.
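One way the series-level derivation could work, sketched here with an invented data shape: collapse episode-level links by provider, and treat any provider carrying at least one episode as carrying the series.

```python
from collections import defaultdict

def series_availability(episode_links):
    """episode_links: iterable of (episode_id, provider, url) tuples.
    Returns {provider: episode_count}; any provider present at all can
    then be given a constructed series-level link."""
    counts = defaultdict(int)
    for _episode_id, provider, _url in episode_links:
        counts[provider] += 1
    return dict(counts)

# Invented example data, not real catalog records:
links = [
    ("s01e01", "netflix", "https://example.com/nf/1"),
    ("s01e02", "netflix", "https://example.com/nf/2"),
    ("s01e01", "hulu", "https://example.com/hu/1"),
]
```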
The Knowledge Graph team also wanted to perform a quality audit of their data for several key topics. I served as the lead for TV Series, creating a finite list of elements to review and guidelines for assessing quality, moderating discussion, and authoring a report on the results.
TV Series in web search results went from a very basic snapshot of title, synopsis, network, logo image, and cast to an interactive set of features including the same key base content as well as episodes, seasons, air dates, and streaming links.
While these solutions were not necessarily an ideal user experience, they allowed users to easily click and find information they might want without too much effort.
The results of the quality audit were used to develop systemic improvements to the way the team merged and managed data sources.
2016 US Presidential Election search features, including candidate Knowledge Graph with fundraising data from OpenSecrets.org and polling from Real Clear Politics.
Factchecked quotes from the Politifact Truth-O-Meter.
Where candidates stand on the issues, with researched positions from ProCon.org and “explore related” suggested queries manually curated.
I was approached by search product teams working on distinct experiences around the then-upcoming 2016 U.S. primary election to offer feedback on proposed designs and organize editorial efforts in content curation and quality validation.
Led team effort to develop detailed list of potential features along with timeline, content source(s), and priority. Features based on team knowledge and real user search data/query patterns.
Researched content sources for features that would be relevant in early 2016 (general politics, candidate research). Key requirements included high-quality, politically neutral data; structured in a way that was compatible with our content management platform; served in XML or JSON format or able to be extracted and converted to usable form.
Joined product team meetings to give updates on content development and share feedback on whether design matched real content and user needs.
Created and launched experiences on web search and provided support for other platforms using our work.
By February 2016, we launched several features on web search:
Presidential candidate knowledge graph, incorporating party affiliation, donation data from OpenSecrets.org, polling data, and political office history.
Latest quotes with “truthfulness” rating from Politifact for all presidential candidates.
Candidate stances on 20+ key political issues extracted from ProCon.org with manually curated browse element to help search users explore candidate opinions.
Political cartoon of the day.
Additionally, detailed requirements for election results and future election experience ideas were documented.
Search leadership wanted to take advantage of our then-new content management platform to release a complete suite of Olympics results features in each of 12 key markets, including the Arabic language site Maktoob. As an editorial leader and tool expert, I was tapped to organize the global team in this complex, ambitious effort.
While the search front-end engineering team developed templates designed specifically for the Olympics (the first time our platform was used for an important tentpole experience), our editorial team organized into content/query experts and technical builders capable of wrangling backend data and tricky template mapping. I oversaw these efforts and maintained detailed per-market tracking.
In lieu of engineering-heavy front-end localization, I created an editorially driven “localization” data source that was easy to use in the content management platform and made it simple for the global team to input and update specific text strings for UX copy. This made it easier to build centralized features and deploy them simultaneously in almost every market.
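The localization data source might be imagined as a simple string table with an English fallback; the keys, markets, and strings below are invented examples, not the actual Olympics copy.

```python
# Hypothetical editorial string table: one row per UX string, one entry
# per market, falling back to English when a market lacks a translation.
STRINGS = {
    "medal_table_header": {"en-US": "Medal Count",
                           "fr-FR": "Tableau des médailles"},
    "latest_results": {"en-US": "Latest Results",
                       "de-DE": "Neueste Ergebnisse"},
}

def localized(key, market):
    """Look up a UX string for a market, defaulting to en-US."""
    entry = STRINGS[key]
    return entry.get(market, entry["en-US"])
```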
Keyword creation was an immense undertaking: we built whitelists of thousands of athlete names and variations (including event and country), and we developed numerous patterns to address results by event/sport and country.
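Whitelist expansion along these lines can be sketched simply; the athlete name and modifier terms below are invented examples of the kinds of variations described.

```python
def expand(name, extras):
    """Expand one athlete name into query variations by appending
    event/country modifier terms (extras). Example data is invented."""
    return [name] + [f"{name} {extra}" for extra in extras]

keywords = expand("usain bolt", ["100m", "jamaica", "olympics"])
```

Run across thousands of names and per-sport modifier lists, this kind of expansion produces the whitelists described above.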
I was responsible for keeping stakeholders up to date, ensuring all delegated work was completed in time, supporting pre-launch QA, and understanding how it all worked well enough to address bugs and concerns as they arose.
Our successful global Olympics experience demonstrated the power of the content management platform and the non-technical editors who worked with it. It also highlighted ways to improve the process to reduce engineering overhead and make even more complexity and customization possible.