
Proving Display Can Perform Better than Search

What more can we say here? The numbers speak for themselves.


  • Yieldbot Conversion Rate: 35.59% (Goal 2), 26.04% (Goal 3)

  • Google CPC Conversion Rate: 29.23% (Goal 2), 20.65% (Goal 3)

  • Google Organic Conversion Rate: 7.81% (Goal 2), 6.10% (Goal 3)

That’s a 26% higher conversion rate vs. Google Paid Search for Goal 3 and a 327% higher conversion rate than Google Organic Search.

This is the first two weeks of data for the campaign. As hard as it is to believe, these numbers are from before any optimization has taken place.

These are IAB standard 300 x 250 and 728 x 90 units. But the realtime decisioning on the ad call is anything but standard. In fact, we believe these results bear out that Yieldbot has redefined relevance in display and created a new advertising channel in the process. A channel unlike any other. 

We’ve worked for over two years to build Yieldbot on the thesis that harvesting intent from first-party publisher data, and using realtime decisioning to match against it, would deliver performance that would rival Search. Even we didn’t think one of our early campaigns would outperform it. It’s great news for marketers and publishers. For us, the best news is that we’re just getting started.



Redefining Premium Content Towards CPM Zero

Say Goodbye to Hollywood

When Ari Emanuel, co-CEO of talent agency William Morris Endeavor, said that Northern California is just pipes and needs premium content, it became clear that he just doesn’t get it. There is no such thing as premium content. Only two things are premium on a mass scale anymore - distribution and devices.

Massive media fragmentation fueled by the Internet has forever redefined what ‘premium’ content is. The democratization of media – the ability for a critical mass of people (now virtually the entire world) to create, distribute and find content – killed the old model of premium. Modern Family is a good TV show, but when I can more easily stream a concert like this through my HDTV at any moment I want, I’m pretty sure “premium content” has been redefined.

Since the web is the root cause of death for premium content, it makes sense that the effect is nowhere better exemplified than in web publishing. Since the advent of display advertising, publishers have sought to categorize and value their content in ways that were familiar to traditional media buyers. No media channel has promoted the idea of, or the value of, premium content more than digital. Thus print media’s inside front and back covers became the homepages and category pages on portals. Like print, these were the areas where the most eyeballs could be reached.

But a funny thing happened in digital behavior. People skipped over the inside front cover and went right to content that was relevant to them. Search’s ability to fracture content hierarchies and deliver relevance not only became the most loved and valuable application of the web, it destroyed the idea of premium content altogether. In reality, premium never really existed in a user-controlled medium, because it was never based on anything that had to do with what the user wanted. It was based on the traditional ad metric of “reach,” when in this medium what counts as premium is determined by on-demand availability and relevance.

Sinking of the Britannica


The beauty of this medium is in the measurement of it. Validation for the drowning of premium, beyond the fact that Wikipedia destroyed Encyclopedia Britannica, rests in the performance of digital media. A funny thing happened as advertising performance became more measured. Advertisers discovered premium didn’t matter nearly as much as they thought. There were better ways to drive performance that yielded better and more measurable results. The ability to match messaging to people on-request and in a relevant way was more valuable in this medium than some content provider’s idea of what was “premium.” In this medium the public, not the publisher, determines what is premium.

As realtime, rules-based matching technology continues to improve, performance advertising and marketing continues to grow at the expense of premium advertising. Today, despite those trying to hold on to the past, premium is little more than an exercise in brand borrowing. Despite the best efforts of the IAB to bring brand advertising to digital, it has fallen as a percentage of ad spend for five straight years. In the world we live in today, Mr. Emanuel’s $9 billion upfront for network TV primetime advertising is $1.5 billion less than the ad revenue Google made last quarter.

What this all means for the future of digital media (and thus, eventually, all media) is that it’s headed to “CPM Zero.” Look around - all the digital advertising powers - Google, Facebook, Twitter, Amazon - are selling based on one thing. Performance. They are not selling on the premium sales mechanism of CPM. When ‘CPM Zero’ happens, and it will, these forces pushing the digital ad industry forward win. They own the customer funnel and they will own the future of marketing and advertising. It begs one big question: where does this leave content creators and publishers?

Don't Fear the Reaper

Publishers will never be able to put the CPM sales genie back in the bottle. CMOs and advertisers are already finding out that they are paying too much for premium. Go ask GM what they think. What publishers are finding out is that they are no longer selling their media; it’s being bought - purchased from a marketplace with infinite inventory in a wild west of data. Therein lie the publisher’s ace in the hole and the strategies and tactics digital publishers (and eventually broadcasters) can use to combat the death of premium.

Like Search, publishers need two crucial components in their marketplaces. First, they need the tension of scarcity. That drives up demand and forces advertisers to spend time working to improve their performance. This was the cherry on the sundae for Google: a $1 billion industry of conversion testing and content targeting grew out of nowhere to support spends in Search, and almost every dollar saved through optimization went to drive more volume – or back to Google. Second, they need a unique currency for the marketplace. Keywords were a completely new way to buy media; nothing has ever worked better. Facebook is selling Actions with OpenGraph. Ultimately advertisers are buying customers, not keywords or actions, but there is a unique window of opportunity for publishers at this moment in time to create something new and uniquely people-focused, not page-focused.

The tactics used to fuel these strategies all rely on one natural resource - data. Publishers have diamonds and gold beneath the surface of their properties. Mining these data nuggets and using them to improve the performance of their media is the sole hope publishers have of competing in the world of “CPM Zero.” Only publishers can wrap their data with their media and drive performance in a manner unique to the marketplace. That’s what Google does. That’s what Facebook does. That’s what Twitter does. The scarcity mentioned above is created because the realtime understanding of site-visitor interest and intent can only be derived by using first-party data as rules, with integration into the publisher’s ad server for delivery. So publishers are really left with one choice – take control of their data and use it for their own benefit, building an understanding of WHY people are buying their media and how it performs. Or let Google, Facebook, third parties et al. come in, grab their data, and know nothing about why it’s being bought and for how much it’s being sold.

The ability to match messaging to people on-request and in a relevant way is within the publisher’s domain. It is the most premium form of advertising currency ever created and will deliver an order of magnitude more value. It will fuel the 20% YoY growth of digital advertising and marketing for the next 15 years. Who captures the majority of that value, the advertiser or the publisher, is the only question remaining.


Measure Twice, Cut (over) Once


This past weekend we did a deploy at Yieldbot unlike any other we've done before.

At its completion we had:

  • upgraded from using Python 2.6 to 2.7.3;
  • reorganized how our realtime matching index is distributed across systems;
  • split off monitoring resources to separate servers;
  • moved out git repos that were submodules to be sibling repos;
  • changed servers to deploy code from Chef server instead of directly from github;
  • completely transitioned to a new set of servers;
  • suffered no service interruption to our customers.

The last two points deserve some emphasis. At the end of the deploy, every single instance in production was new - including the Chef server itself. Everything in production was replaced, across our multiple geographic regions.

Like many outfits, we do several deploys a week, sometimes several a day. Having no service disruption is always critical, but most deploys are also fairly straightforward. This one was big.

The procedures we had in place for carrying it out were robust enough, though, that we didn't even notify anyone on the business side of the company when the transition was happening. The only notification was getting sign-off from Jonathan (our CEO) on Friday morning that the cut-over would probably take place soon. In fact, we didn't notify anyone *after* the transition took place either, unless you count this tweet:


I suppose we cheated a little by doing it late on a Saturday night though. 🙂


We had a few kinds of data to consider: realtime streaming, analytics results, and configuration data.

Realtime Streaming and Archiving

For archiving of realtime stats, the challenge was going to be the window of time during which old systems were still receiving requests while new servers were starting to take their place. In addition to zero customer impact, we demanded zero data loss.

This was solved mostly by preparation. By having the archives include the name of the source doing the archiving, the old and new servers could both archive data to the same place without overwriting each other.
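As a minimal sketch of that idea (key layout and server names are hypothetical, not our actual scheme), embedding the archiving host in the key means two generations of servers can write the same time window to the same bucket without colliding:

```python
from datetime import datetime, timezone
import socket

def archive_key(stream, host=None, now=None):
    """Build an S3-style key that embeds the archiving host, so old
    and new servers archiving the same hour never overwrite each
    other's objects."""
    host = host or socket.gethostname()
    now = now or datetime.now(timezone.utc)
    return "archive/{stream}/{ts}/{host}.log.gz".format(
        stream=stream, ts=now.strftime("%Y/%m/%d/%H"), host=host)

# Old and new edge servers archiving the same hour produce distinct keys:
when = datetime(2012, 6, 2, 23, tzinfo=timezone.utc)
old = archive_key("events", host="edge-old-1", now=when)
new = archive_key("events", host="edge-new-1", now=when)
```

Both writes land under the same hourly prefix, but the distinct host component keeps them from clobbering each other.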

Analytics Results

We currently have a number of MongoDB servers that hold the results of our analytics processes, and store the massive amounts of data backing the UI and the calculation of our realtime matching index.

Transitioning this data mostly fell to MongoDB's master-slave capabilities. We brought up the new instances as slaves pointing to the old instances as their masters. When it was time to go live on the new servers, a re-sync with Chef reverted them to acting as masters.

There was a little bump here where an old collection ran into a replication problem and was growing much larger on the new instance than on the old one. Luckily it was an older collection that was no longer needed, and dropping it altogether on the old instance got us past that.

Configuration Data

Transitioning the config data was made easy by the fact that it uses a database technology we created here at Yieldbot called HeroDB (we'll have much more to say about it in the future).

The beneficial properties of this database in this case are that it is portable and can easily be reconciled against a secondary active version. So we copied these databases over and had peace of mind that we could reconcile later, as necessary, with ease.
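HeroDB's internals aren't public yet, but the reconciliation idea can be sketched generically: compare two key-value snapshots and report what the copy is missing or holds stale values for (keys and values below are made-up examples):

```python
def reconcile(primary, secondary):
    """Compare two key-value snapshots. Returns (missing, stale):
    keys the secondary lacks entirely, and keys where its value
    disagrees with the primary."""
    missing = [k for k in primary if k not in secondary]
    stale = [k for k in primary if k in secondary and secondary[k] != primary[k]]
    return missing, stale

# Snapshot taken at copy time vs. the live version afterwards:
copied = {"campaign:1": "active", "campaign:2": "paused"}
live = {"campaign:1": "active", "campaign:2": "active", "campaign:3": "active"}
missing, stale = reconcile(live, copied)
```

Running the reconciliation after the cut-over tells you exactly which config entries need to be re-copied or re-checked.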


We tested the transition in a couple different ways.

As we talked about in an earlier blog post, we use individual AWS accounts for developers with Chef config analogous to the production environment.  In this case we were able to bring up clusters in test environments along the way before even trying to bring up the replacement clusters in production.

We also have test mechanisms in place already to test proper functioning of data collection, ad serving, real time event processing, and data archiving. These test mechanisms can be used in individual developer environments, test environments, and production. These proved invaluable in validating proper functioning of the new production clusters as we made the transition.

The Big Switch - DNS

DNS was the big switch to flip to take the servers from "ready" to "live". To be conservative, we placed one of our new edge servers (which would serve a fraction of the real production traffic in a single geographic region) into the DNS pool and verified everything worked as expected.

Once verified, we put the rest of the new edge servers across all geographic regions into the DNS pools and removed all of the old edge servers from the DNS pools.
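The two-step cut-over logic itself is simple enough to sketch in a few lines (server names hypothetical; the actual record updates happen in your DNS provider, this is just the planning step):

```python
def add_canary(pool, canary):
    """Step 1: add a single new server to the live pool so it serves
    a fraction of real traffic alongside the old servers."""
    return sorted(set(pool) | {canary})

def full_cutover(pool, old_servers, new_servers):
    """Step 2: once the canary checks out, add the rest of the new
    servers and remove all of the old ones."""
    return sorted((set(pool) - set(old_servers)) | set(new_servers))

pool = ["edge-old-1", "edge-old-2"]
pool = add_canary(pool, "edge-new-1")   # canary takes some real traffic
pool = full_cutover(pool, ["edge-old-1", "edge-old-2"],
                    ["edge-new-1", "edge-new-2"])
```

After step 2 the pool contains only new servers; old servers keep draining in-flight requests until their DNS entries age out of caches.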

The switch had been flipped.

There Were Bumps (but no bruises)

There were bumps along the way. Just none that got in our way. Testing as we went, we were confident that functionality was working properly and could quickly debug anything unexpected. As any fine craftsman knows, you cut to spec as precisely as possible, and there's always finish work to get the fit perfect.

Chef, FTW!

The star of the show, other than the team at Yieldbot that planned, coded, and executed the transition, was Chef.

We continue to be extremely pleased with the capabilities of Chef and the way we are making use of it. No doubt there are places where it is tricky to get what we want, and of course there's a learning curve in understanding how the roles, cookbooks, and recipes all work together, but when it all snaps into place, it's like devops magic.


The Future of Marketing and Advertising Belongs to Software

Since day 1 I've described what we are building as "a web technology for marketing and advertising - not an advertising and marketing technology for the web.” Of course it's a play on words, but the purpose is to more clearly define our product. It is software. As we begin to open source some of the tools we have created, we are reminded every day that Yieldbot is a software company. That’s a good thing.

I ventured into display advertising because it had a weak technology stack supporting a pre-digital business model. The de facto intelligence in display advertising is a 1-10 ranking system – the ad server waterfall – with a single unit of measurement – the impression – that was in fact always different. From a software perspective, display advertising is a massive opportunity.

My web software experience was first in e-commerce, where I watched amazing software be created by the likes of ATG and Endeca. Then in Search, where I watched Google and Yahoo employ thousands-strong armies of engineers. Most recently at Offermatica/Adobe Test&Target, where the software serves billions of highly optimized and dynamic web experiences every week. Software. Software. Software.

Michael Walrath, Founder of RightMedia said recently:

“In order to build a truly disruptive and highly valuable company delivering enterprise software for digital advertising, the new solution has to be an order of magnitude better than the existing systems.  It is not enough to deliver an incrementally better version of the existing systems.  If there is to be a resurgent disruptor in the advertising technology space it has to change the game. It must attack the white space…”

What I love about this quote is that it frames the market opportunity as enterprise software and software that must do something where nothing has been done before. 

Yieldbot attacks this challenge everyday. Massive batch and realtime and predictive analytics. Machine-learning and automated intelligence. Differentiated and highly dynamic units of measurement.  The visualization of data and the ability to make it actionable. A white space where the focus is not on buying or selling media – but on how well media and people can be matched in realtime. 

Matching differently. That is our disruption.

The enterprise software I admired and mentioned above all looked to solve the matching problem. Display advertising’s main problem, as I wrote two years ago, is the only place where the “order of magnitude better than existing systems” can be achieved. This is because new, more intelligent methods of matching can fundamentally revalue the media around something besides impressions and cookies. We believe that something is realtime visit intent.

It’s an amazing time to build software. There is more technology to gain more understanding and create more intelligence at a lower cost than ever before. The advances in analytics, databases and languages create an order of magnitude more power. I couldn’t think of anything more exciting to be working on in this day and age than software, or a better group of people to be doing it with. The future of marketing and advertising belongs to software.


Introducing Pascalog


(Shared under Creative Commons Attribution-ShareAlike license: Flickr user Timitrius)

Today, the dev team at Yieldbot is excited to announce plans to open source one of our prized internally developed technologies: Pascalog.

Technology often evolves more in cycles than linearly, with past patterns showing through as more recent innovations are made.

For a while we were doing all of our analytics in Cascalog, and things were going great. As a Clojure DSL written on top of the Hadoop Cascading API, Cascalog is a brilliant technology for efficiently processing large data sets with very tersely written code.

In fact, we even wrote about those experiences here and here.

But we found ourselves writing things like this:

(<- [!pub !country !region !city !kw !ref !url ?s]
    (rv-sq !pub !country !region !city !kw !ref !url ?pv-id ?c)
    (c/sum ?c :> ?s))

We thought that there had to be a better way. When we realized that Clojure, being a Lisp, has its foundations in the 1960s, we immediately knew the next logical step would be an upgrade into the 1970s.

Wouldn't we want to write something more like:

program HelloWorld;
begin
   writeln('Hello, World!')
end.

And we immediately set upon bringing the best of 1970s software development, Pascal, into the Big Data world of the 2010s. Pascalog was born. (Who couldn't love a language that wants you to end your programs with a "."?)

This also fit well with internal discussions we were having at the time lamenting the complexity of managing a Hadoop cluster and the efficiencies that might be gained by combining all the functionality back into one processing environment on a mainframe. That dream is on hold until we find a suitable hardware vendor, but there was certainly no reason to hold Pascalog development back for that.

Data is a readln() Away

In Pascalog we've done the heavy lifting. By adapting readln() to be bound to a Cascading Tap, you read data in the way you've done since your Turbo Pascal days.

It didn't take us long to realize that you'd want to save the results of your calculations somewhere, so in a follow-on version we added the mapping of writeln() to an output Cascading Tap.

Configuring your input and output taps and mapping them to readln() and writeln() is as easy as configuring an INI file.

An upcoming version which should be available shortly will also allow the readln() of one Pascalog program to be mapped to the writeln() of an upstream Pascalog program, allowing you to daisychain your Pascalog programs.

Why Pascal?

We make it sound above like we jumped onto the Pascal bandwagon right away, but in truth we considered several alternatives from the 1970's.

Of particular interest was the ability to write nested procedures. We've grown accustomed to this from our Python development on other parts of the platform, and it allows us to migrate between the two worlds more seamlessly (compared to, say, Fortran).

The availability of a goto statement is also a great feature to bail you out if you start getting a little too lost in your control flow. This has become a lost art.

We did consider C, but couldn't get over the hump of having it named "Clog".

The Future

We're furiously looking for a Pascal Meetup group where we can make a live presentation. If you know of one, please let us know!

We have a long list of features in mind to build, but we also want to hear back from the community.

Visit to get started! We're looking forward to the pull requests. If you have live questions there's usually one of us hanging out on CompuServe under user ID [73217, 55].


Development as Ops Training


It's become fairly well understood that "Dev" and "Ops" are no longer separate skill sets; they've been combined into a role called "DevOps". This role has become one of the hottest and hardest to fill.

At Yieldbot we've taken a pretty hardcore approach to putting together Dev and Ops into DevOps that serves us well and should be a great repeatable pattern.

Chef + AWS Consolidated Billing

The underlying philosophy we have is that the development environment should match as closely as possible the production environment. When you're building an analytics and ad serving product with a worldwide distributed footprint that can be a challenge.

Our first building block is the use of Chef (and on top of that ClusterChef, which is now Ironfan). Using these tools we've fully defined each server role in a given region (by defining each region as a cluster) and all of the services they run. We coordinate deploys through our Chef server with knife commands, and Chef controls everything from the OS packages that get installed, to the configuration of application settings, to the configuration of DNS names, and so on.

The second building block is that every developer at Yieldbot gets their own AWS account as a sandbox. We use the AWS "Consolidated Billing" feature to bring the billing all under our production account. This lets us see a breakdown of everybody's charges and means we get one single bill to pay.

The last detail is that every developer uses a unique suffix that is used to make resource references unique when global uniqueness is necessary. This is mostly used for resolving S3 bucket names. For any S3 bucket we have in production such as "", the developer will have an equivalent bucket named "<developer>".
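The suffix mapping is trivial but worth making explicit. As a sketch (bucket names and the suffix below are hypothetical examples, since our real bucket names are omitted above):

```python
def dev_bucket(bucket, suffix=None):
    """Map a production S3 bucket name to a developer's sandbox
    equivalent by appending that developer's unique suffix. With no
    suffix, the production name is returned unchanged."""
    return "{}-{}".format(bucket, suffix) if suffix else bucket

prod = dev_bucket("yb-archive")             # production: no suffix
sandbox = dev_bucket("yb-archive", "jane")  # one developer's sandbox
```

Because S3 bucket names are globally unique across all AWS accounts, the suffix is what lets every developer's account hold a parallel copy of each production bucket.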

Doing Two Things at Once

With all of that as the status quo, developers are almost always doing two things: developing/testing (the Dev), and learning/practicing how the platform is managed in production (the Ops).

Everyone has their own Chef server, which is interacted with the same way that the production Chef server is. As they deploy the code they are working on into their own working environment, they're learning/doing exactly what they would do in production.

All of this was put in place over the last year while the development team was static, during which time we switched from Puppet to Chef.

But the power of this approach really hit home recently as we've started to add more people to the team. The first thing a new hire does is go through our process of getting their development environment set up. There are still bumps along the way, and they hit problems and take part in ironing them out. The great thing about this approach is that each bump is a lesson about how the production environment works and a lesson in problem solving in that environment.

The Differences

Having said all that, there are a couple of differences that we've consciously put in place between development and production, with the driving force being cost.

The instances are generally sized smaller, since the scale needed for production is much greater. Amazon's recent addition of 64-bit support on the m1.small was a great help.

We use several databases (a mix of MongoDB, Redis, and an internally developed DB tech) that are distributed on different machines in production that we collapse together onto a single instance with a special role called "devdb".


We'll have some future blog posts about how we import subsets of production data into development for testing, and the like.

We also use Chef with ClusterChef/Ironfan for managing the lifecycle of our dynamic Hadoop clusters. Yet another good topic for a post all its own.

Have experience with a similar approach or ideas about how to make it even better? We want to hear about it.



Realtime Kills Everything

Our first ad campaigns are live and the results are exciting. The campaign ran on a premium publisher in the women’s lifestyle vertical and beat the publisher’s control group on Click-Through Rate (CTR) by 77% on the 728 x 90 unit and 194% on the 300 x 250. There were over 1M impressions in the campaign served on this domain over a 2-week period. Yieldbot is now serving the entire campaign.

Most exciting to us are some of the individual results:

  • The best performing keyword has a CTR of 1.56%. 
  • The best creative unit (a 300 x 250) is getting a 1.01% CTR.

We are running IAB standard banner units. This is not text. This is not rich media.

According to MediaMind, the industry average CTR for the campaign vertical is 0.07%.

The most matched keyword intent has a CTR of 0.43%. It also has a CPC of $5.

That math works out to an eCPM of $21.44. That’s pretty exciting stuff. Even more so when you factor in that this campaign is running in what was unsold inventory.
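That math is just the standard eCPM identity: CTR × CPC × 1,000 impressions. A quick check (the quoted $21.44 suggests the underlying CTR was closer to 0.4288% before rounding to 0.43%):

```python
def ecpm(ctr, cpc):
    """Effective CPM: revenue per 1,000 impressions, given a
    click-through rate (as a fraction) and a cost per click in dollars."""
    return ctr * cpc * 1000

rounded = ecpm(0.0043, 5.00)    # at the rounded 0.43% CTR and $5 CPC
precise = ecpm(0.004288, 5.00)  # an unrounded CTR consistent with $21.44
```
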

When I shared the results with one of our Board Members, he asked me what at the time I thought was a simple question: “Why are the results so good?” But then I actually had to think hard about the answer. I had to boil down a year of beta testing and then another year of building a scalable platform into what deserved to be a simple answer.


Realtime was my one-word answer. Never before had every page view of intent for this publisher's visitors been captured in realtime - let alone used to make a call to an ad server at that very moment.

Realtime is different. Realtime kills everything before it. As such, Yieldbot is not building ad technology for the web. We are building web technology for ads. Since nothing is more important for advertising success than timing it makes sense that nothing is more valuable for results than realtime.

Realtime was a big buzzword for a while, but the hype has died down. That’s good. In the Hype Cycle we’re now somewhere between the “Trough of Disillusionment” and the “Slope of Enlightenment.” It is, however, this ability of the web to react in realtime that makes the future of the medium so exciting.

Twitter of course is the best representative example. Twitter changed everything about media that came before it. It used to be that breaking the story was the big deal. Now even online news seems stodgy compared to people giving realtime updates that planes have landed on rivers or that people have been killed, and opining on a live show right along with it.

As technology continues to get better at processing the trillions of inputs from millions of people going about their daily lives - doing everything from driving their car to work, to buying a pack of chips, to surfing the web - the web will respond in realtime. Because of that it will be relevant. The idea of an ad campaign will seem like owning a 32-volume set of Encyclopedia Britannica. Everything becomes response because the technology is responsive. Calculations need inputs. The web will be measuring just about everything you do and know the moment you are doing it. Nothing will be sold. Everything will be bought.

It’s that realtime pull that creates these new valuations of the media. That new value of the media is what we have been working to create at Yieldbot. That is why these results are so exciting. Best of all, we’re just getting started. We’ve got a bunch of new campaigns about to get underway and we’re only going to get smarter and more relevant. We’ll continue to keep you posted on how it’s going and if you're running Yieldbot you'll know yourself. In realtime.


Relevant News


“The enemies of advertising are the enemies of freedom.” - David Ogilvy

Exciting news for Yieldbot and lovers of relevance today as we’re announcing a new Series A round of funding led by New Atlantic Ventures (NAV) and RRE Ventures. Seed investors kbs+p Ventures, Common Angels and Neu Venture Capital also participated again in this round.

The funny thing about raising money in media technology is that very few VCs actually understand it and even fewer have vision for where it’s headed. We’re fortunate to bring together a team of investors who live and breathe this stuff and proudly represent New York’s media leadership and Boston’s technology leadership in a way that mirrors Yieldbot’s own corporate footprint.

The funds will be used to continue development and bring to market our Yieldbot for Publishers (YFP) realtime intent-graph™ technology (launched July 2011) and our Yieldbot for Advertisers (YFA) realtime intent marketplace, which launched in alpha this month. Together YFP and YFA create a valuable media channel of realtime consumer intent that delivers an order of magnitude more relevant ad matching and performance.

From day one, two years ago, we wanted to bridge the largest digital inventory source, web publishers, with the largest and best digital ad spends, Search advertisers, in a way that brings a more relevant web experience to people. We’ve progressed an extremely long way with a small team and relatively little funding so far. Today we’re putting dry powder in our muskets and continuing the battle. The enemies of freedom are only so because they know not relevance.


Working at Yieldbot

We're adding more developers to our team and pushing things to the next level. If you like seriously interesting and challenging work in the areas we're looking for help in, you should be talking to us. You'll have a single-digit employee number, so you'll be getting in early and powering us on our way to fulfilling the huge potential we're sitting on.

What can you expect if you decide to jump in and join us on our mission to make the web experience more relevant? For one thing, no shortage of interesting hard problems to solve, and the latest tools to do it with.

A Great Environment

For our development environment we each have an AWS sandbox that deploys the same code as production, so everyday work is production devops training too, with a Mac for your local dev environment. You'll have Campfire group chat up all day and be in the middle of all the important conversations about what we need to do and how we need to do it, from the CEO on down.

The language and tools you use most during the day will depend on what part of the platform you're focusing on.

A Distributed Realtime Platform

A large part of the core platform is in Python. All of the code around scheduling of tasks and management of the platform is found here, as well as the key ad-serving logic and realtime event processing. You'll be making use of MongoDB, Redis, and ElephantDB. You'll be solving problems of running the platform distributed across several data centers worldwide. You'll likely be doing some devops stuff here too, and loving the ease with which Chef lets you get that done.

Bleeding Edge Analytics

If you're working on our analytics then you are loving the use of Cascalog (a Clojure DSL that runs over the Cascading API on Hadoop). The power-to-lines-of-code ratio here is ridiculous. More than that, you'll be writing realtime analytics in Storm. That's not just cutting edge; it's definitely bleeding edge.

Focus on UX

To work on the UI, you'll be pushing the limits of the latest JavaScript UI tools like D3.js and Spine.js. Have you thought about how clean client-side MVC should be done? Spine is it. We're serious about quality of UX here. If you're serious about it too, this is where you should be.

An Awesome Team

The team you'll be joining has been there before. We've founded and built successful products, platforms, and companies. We know our industry and what it takes to be successful. And we're doing it.

The most important thing that keeps us developers here at Yieldbot energized is that we're building something people want; that's been clear from the beginning. Our mission to make the web experience more relevant resonates with users, publishers, and advertisers.

If you're up for the challenge, get in touch. We have some seriously challenging work you can get started on right away.



How Yieldbot Defines and Harvests Publisher Intent

The first two questions we usually get asked by publishers are:

1) What do you mean by “intent”?

2) How do you capture it?

So I thought it was time to blog in a little more detail about what we do on the publisher side. 

The following is what we include in our Yieldbot for Publishers User Guide.

Yieldbot for Publishers uses the word “intent” quite a bit in our User Interface. Webster’s dictionary defines intent as a “purpose” and a “state of mind with which an act is done.” Behavioral researchers have also called intent the answer to “why.” Much as search engines use a user’s query to understand intent before serving a results page, Yieldbot extracts words and phrases to represent the visitor intent behind every page view served on your site.

Since Yieldbot’s proxies for visit intent are keywords and phrases, the next logical question is how we derive them.

Is Yieldbot a contextual technology? No. Is Yieldbot a semantic technology? No. Does Yieldbot use third-party intender cookies? Absolutely not!

Yieldbot is built on the collection, analytics, mining and organization of massively parallel referrer data and massively serialized session clickstream data. Our technology parses the keywords out of referring URLs – and after a decade of SEO, almost every URL is keyword-rich – and then diagnoses intent by crunching the data around the three dimensions of every page view on the site:

1) What page a visitor came from

2) What page a visitor is about to view

3) What happens when that page is viewed
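To make the first step concrete, here is a minimal sketch of pulling candidate intent keywords out of a referring URL's path. This is an illustration only, not Yieldbot's actual parser: the function name, token rules and example URL are all hypothetical, and a production system would handle search-engine query strings, stopwords and much more.

```python
from urllib.parse import urlparse

def keywords_from_referrer(url):
    """Split a referring URL's path into candidate intent keywords.

    SEO-friendly URLs typically encode the page topic in hyphen- or
    underscore-separated path segments, so those separators are a
    reasonable first place to look for keywords.
    """
    path = urlparse(url).path
    words = []
    for segment in path.strip("/").split("/"):
        for token in segment.replace("_", "-").split("-"):
            # Drop file extensions and numeric noise like post IDs.
            token = token.strip().lower().split(".")[0]
            if token.isalpha() and len(token) > 2:
                words.append(token)
    return words

print(keywords_from_referrer(
    "http://example.com/recipes/slow-cooker-pulled-pork.html"))
# → ['recipes', 'slow', 'cooker', 'pulled', 'pork']
```

Even this toy version shows why keyword-rich URLs matter: the referrer alone already hints at what the visitor is looking for before the page even loads.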

Those first two dimensions are great pieces of data but it is coupling them with the third dimension that truly makes Yieldbot special. 

We give our keyword data values derived from on-page visitor actions and provide the data to Publishers as an entirely new set of analytics that allow them to see their audience and pages in a new way – the keyword level. Additionally, our Yieldbot for Advertisers platform (launching this quarter) makes these intent analytics actionable by using these values for realtime ad match decisioning and optimization.

For example: Does the same intent bounce from one page and not another? Does the intent drive two pages deeper? Does the intent change when it hits a certain page or session depth? How does it change? These are things Yieldbot works to understand because if relevance were only about words, contextual and semantic technology would be enough. Words are not enough. Actions always speak louder.

All of this is automated, and all of it is done on a publisher-by-publisher basis because each publisher has unique content and a unique audience. The result is what we call an Intent Graph™ for the site, with visitor intent segmented across multiple dimensions of data such as bounce rate, pages per visit, return visit rate, geography and time.
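The kind of keyword-level analytics described above can be sketched as a simple aggregation over session records. This is a toy illustration under assumed inputs (the `intent_metrics` function and the `(keyword, pages_viewed)` tuples are hypothetical stand-ins for real clickstream data), not Yieldbot's implementation.

```python
from collections import defaultdict

def intent_metrics(sessions):
    """Aggregate per-keyword bounce rate and pages per visit.

    `sessions` is a list of (keyword, pages_viewed) tuples, one
    tuple per visit attributed to that intent keyword.
    """
    visits = defaultdict(int)
    bounces = defaultdict(int)
    pages = defaultdict(int)
    for keyword, pages_viewed in sessions:
        visits[keyword] += 1
        pages[keyword] += pages_viewed
        if pages_viewed == 1:  # a one-page visit counts as a bounce
            bounces[keyword] += 1
    return {
        kw: {
            "bounce_rate": bounces[kw] / visits[kw],
            "pages_per_visit": pages[kw] / visits[kw],
        }
        for kw in visits
    }

stats = intent_metrics([
    ("pulled pork", 1),
    ("pulled pork", 3),
    ("slow cooker", 1),
])
print(stats["pulled pork"])
# → {'bounce_rate': 0.5, 'pages_per_visit': 2.0}
```

The point of the sketch is the shape of the data: each intent keyword accumulates its own behavioral profile, which is what lets the same word mean different things on different sites.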

Here’s an example of analytics on two different intent segments from two different publishers:



For every (and we mean every) visitor intent and URL, we provide data and analytics on the words we see co-occurring with the primary intent, as well as the pages that intent is arriving at (and the analytics of what happens once it gets there). We also provide performance data on those words and pages.

Yieldbot’s analytics for intent are predictive. This means that the longer Yieldbot is on the site, the smarter it becomes – both about the intent definitions and about how those definitions will manifest in media consumption. And soon all the predictive analytics for the intent definitions will be updated in realtime. This is important because web sites are dynamic, “living” entities – always publishing new content, getting new visitors and receiving traffic from new sources. Not to mention that people’s interests and intent are always changing.

I hope this post has served as a good primer on Yieldbot for Publishers and maybe even gotten you interested in seeing it in action on your site. One of the best parts of what we do is seeing people’s faces when they first see the product. If you are a publisher and would like a demonstration, please email info <at>