
Monday, March 15, 2010

The Economics of Aggregation

This is for Felix. If you run a website that depends upon advertising, what is the optimal number of aggregator sites (sites that run part of your original posts plus a link back to the original)? What is the optimal length of an excerpt? This won't produce any results that can't be obtained with a few minutes of thought; it simply formalizes the exercise so that the assumptions and critical parameters are evident.

Let's start by supposing that the profit for the original content site depends upon the number of visitors, i.e. that:

π = R(sN) - W

where R is the ad revenue function, sN is the number of visitors to the site, and W is costs. It is assumed that $R_{sN} > 0$, i.e. that ad revenue increases with the number of visitors. This function could be made more complicated, but so long as profit is increasing in the number of visitors, this will suffice since the profit function will reduce to something along these lines in any case.

The number of visitors has two parts, s and N, where N is the total available audience for the type of content you are offering, and s is your share of that audience. I will assume that aggregators, denoted by A, increase the total size of the audience. That is, they provide a service by grouping and filtering content, the grouping/filtering of content lowers transactions (search) costs for readers, and this increases their number. More formally:

$$N = N(A), \quad N_A > 0$$

What determines the share of the audience, s, that comes to the site? The share is assumed to depend negatively upon the number of aggregators (since aggregators divert traffic), and positively upon the number of "clickthroughs" from the aggregators:

$$s = s(A, C), \quad s_A < 0 \text{ and } s_C > 0.$$

What determines the number of clickthroughs, C? The value of C depends upon two things, the number of visitors to aggregators, rN, where r is the share of the total audience captured by the aggregators, and the average length of excerpts among the aggregators, L:

$$C = C(rN, L), \quad C_{rN} > 0 \text{ and } C_L < 0.$$

Clickthroughs are assumed to go up when aggregators get more visitors, and to go down when aggregators run longer excerpts, so longer excerpts are bad for the originator. But there's also a benefit to longer excerpts: the share of visitors to aggregators goes up when the length of excerpts goes up, and this increases clickthroughs:

$$r = r(L, A), \quad r_L > 0, \; r_A > 0.$$

It is also assumed that the share to aggregators is increasing in the number of aggregators, A.

Finally, I am holding costs, W, constant since they don't vary in any important way with the parameters of interest, the number of aggregators, A, and the average length of an excerpt, L.

There are two tradeoffs built into this setup. First, as the number of aggregators goes up, the size of the audience goes up (at a decreasing rate), and that increases the visitors to the original content site. However, the share of the audience goes down since more people will choose to do their reading at the aggregators rather than the original site. For the length of an excerpt, L, the tradeoff is that as L goes up, more people visit the aggregator sites since the quality of their posts is higher, and that increases aggregator traffic and the number of clickthroughs. But a longer excerpt also gives readers less reason to click through, and that reduces the originator's share of the audience.

More formally, the profit function is:

$$\pi = R\big[s\big(A,\; C(r(L,A)\,N(A),\; L)\big)\,N(A)\big] - W$$

Then the first order conditions for A and L are:

$$\pi_A = R_{sN}\big[\big(s_A + s_C C_{rN}(r N_A + r_A N)\big)N + s N_A\big] = 0$$
$$\pi_L = R_{sN}\big[s_C C_{rN}\, r_L N + s_C C_L\big]N = 0$$
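
For readers who want to check the derivatives, the chain-rule steps behind these two conditions can be written out as below. This is only a worked expansion of the profit function defined above; the total-derivative notation (dC/dA and so on) is introduced here for bookkeeping and is not in the original.

```latex
% Step 1: clickthroughs C(rN, L), with r = r(L, A) and N = N(A)
\frac{dC}{dA} = C_{rN}\,(r N_A + r_A N), \qquad
\frac{dC}{dL} = C_{rN}\, r_L N + C_L

% Step 2: the originator's share s(A, C)
\frac{ds}{dA} = s_A + s_C \frac{dC}{dA}, \qquad
\frac{ds}{dL} = s_C \frac{dC}{dL}

% Step 3: profit R(sN) - W, using the product rule on sN
\pi_A = R_{sN}\left[\frac{ds}{dA}\, N + s N_A\right], \qquad
\pi_L = R_{sN}\, \frac{ds}{dL}\, N
```

Substituting steps 1 and 2 into step 3 gives the two first order conditions.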

These are both intuitive (and obvious - hope I got the derivatives right). Let's take the second equation first. Dividing through by the positive common factor $R_{sN}\, s_C N$, this equation, which is the first order condition for L, reduces to:

$$C_{rN}\, r_L N = -C_L$$

The term on the left-hand side is the marginal benefit to the original content provider from an increase in L (with A constant). It says that when L goes up, aggregators get a higher share of traffic since the quality of their services goes up (some of the increase in traffic is brand new, i.e. it does not all come from the content provider), and the extra aggregator traffic generates more clickthroughs. The right-hand side is the marginal cost. It says that when L, the average excerpt length, goes up, the number of clickthroughs from a given amount of aggregator traffic goes down. The optimal L balances these two effects on the margin. Importantly, for some parameter values the solution is interior, i.e. the optimal excerpt length is greater than zero, and it's possible that the optimum is for full, uncut excerpts. (For example, if $C_L$ is relatively small in absolute value, i.e. if increasing the excerpt length has little detrimental effect on the number of clickthroughs, then the solution is likely to be interior and the excerpt length longer. If $C_{rN}$ or $r_L$ is large, i.e., if clickthroughs are quite responsive to increases in aggregator traffic, or if traffic to aggregators responds strongly to increases in excerpt length, then this also points toward an interior solution and longer excerpt lengths, perhaps even full excerpts.)
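
To make the interior solution concrete, here is a minimal Python sketch. The functional forms for r and C, the parameter values, and the treatment of L as a fraction of the full post are all assumptions chosen only to satisfy the sign restrictions above; none of them come from the model itself. The code solves the first order condition for L (marginal benefit minus marginal cost, with the positive common factors divided out) by bisection and compares the result with the closed form these particular forms happen to allow.

```python
import math

# Hypothetical functional forms and parameter values (not from the post),
# chosen only to satisfy r_L > 0, C_rN > 0 and C_L < 0.
# L is the excerpt length as a fraction of the full post, 0 <= L <= 1.
R_BAR = 0.2   # assumed ceiling on the aggregators' share at the given A
K = 0.5       # assumed saturation parameter for r in L
C0 = 0.6      # assumed clickthrough rate as L approaches zero
N = 1500.0    # assumed total audience at the given A


def r(L):
    """Aggregators' share of the audience, increasing in L."""
    return R_BAR * L / (L + K)


def r_L(L):
    """Derivative of r with respect to L (positive)."""
    return R_BAR * K / (L + K) ** 2


def C_rN(L):
    """Derivative of C = C0 * rN * (1 - L) with respect to rN."""
    return C0 * (1.0 - L)


def C_L(L):
    """Derivative of C = C0 * rN * (1 - L) with respect to L (negative for L > 0)."""
    return -C0 * r(L) * N


def foc_L(L):
    """Marginal benefit minus marginal cost of a longer excerpt."""
    return C_rN(L) * r_L(L) * N + C_L(L)


def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


L_star = bisect(foc_L, 0.0, 1.0)
# For these particular forms the condition reduces to L^2 + 2*K*L - K = 0.
L_closed = -K + math.sqrt(K * K + K)
print(f"optimal excerpt share: {L_star:.4f} (closed form {L_closed:.4f})")
```

With these assumed forms the optimum lies strictly between no excerpt and the full post. Other forms could just as easily push it to either corner, which is the point about parameter values made above.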

The first order condition for the number of aggregators, A, can be written as:

$$R_{sN}\big[s_C C_{rN}(r N_A + r_A N)N + s N_A\big] = -R_{sN}\, s_A N$$

As before, this says that marginal cost equals marginal benefit. When the number of aggregators, A, goes up, the marginal cost is that the originator's share, s, of the total available visitors goes down, which lowers its traffic, sN, and this in turn lowers ad revenue. This is the right-hand side of the equation, $-R_{sN}\, s_A N$. The marginal benefit of A going up is threefold. First, as A goes up, N goes up and this increases traffic to the originator for any given value of s. This is the last term, $R_{sN}\, s N_A$, on the left-hand side of the equation. Second, because the increase in A also increases N, and visitors to aggregator sites depend upon rN, visitors to aggregator sites also go up, clickthroughs then go up, and this increases the share and ad revenue for the original content provider. This is captured by the term $R_{sN}\, s_C C_{rN}\, r N_A N$. Third, as A goes up, the share, r, to aggregators goes up, and this also increases clickthroughs to the original site. This term is $R_{sN}\, s_C C_{rN}\, r_A N^2$.

More succinctly, on the benefit side, when A goes up, both r and N go up. Two of the marginal benefit terms capture the resulting increase in the aggregators' traffic, rN, and how that increase affects clickthroughs and ultimately ad revenue. The other term captures how the increase in N affects the originator's traffic, sN.

Again, importantly, the optimal number of aggregators (from the original content provider's point of view) is not necessarily zero. An optimum at zero requires a corner solution, so it depends upon the value of the parameters.
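
As a rough illustration of how the answer turns on parameters, the sketch below puts hypothetical functional forms on N, r, C, s and R and searches over whole numbers of aggregators. Every form and number here is an assumption made for illustration (the names `make_profit` and `best_A` are likewise invented); the only requirement imposed is that the signs match the model: $N_A > 0$, $r_A > 0$, $r_L > 0$, $C_{rN} > 0$, $C_L < 0$, $s_A < 0$, $s_C > 0$.

```python
import math


def make_profit(n1, g, N0=1000.0, s0=0.2, d=0.05, r_max=0.5,
                a=0.3, k=0.5, c0=0.6, p=1.0, W=0.0):
    """Build a profit function pi(A, L) from assumed, illustrative forms.

    n1 controls how strongly aggregators enlarge the audience (N_A),
    g controls how strongly clickthroughs raise the originator's share (s_C).
    """
    def profit(A, L):
        N = N0 + n1 * math.log(1.0 + A)                     # audience, concave in A
        r = r_max * (1.0 - math.exp(-a * A)) * L / (L + k)  # aggregators' share
        C = c0 * r * N * (1.0 - L)                          # clickthroughs
        s = s0 / (1.0 + d * A) + g * C                      # originator's share
        return p * s * N - W                                # linear ad revenue less costs
    return profit


def best_A(profit, L, max_A=100):
    """Grid search over integer numbers of aggregators, 0..max_A."""
    return max(range(max_A + 1), key=lambda A: profit(A, L))


L_FIXED = 0.366  # hold the excerpt length fixed while varying A

# Case 1: aggregators enlarge the audience and clickthroughs matter.
with_benefits = make_profit(n1=300.0, g=0.0005)
A_interior = best_A(with_benefits, L_FIXED)

# Case 2: both benefit channels shut off, so only the diversion cost remains.
no_benefits = make_profit(n1=0.0, g=0.0)
A_corner = best_A(no_benefits, L_FIXED)

print("optimal A with benefits:", A_interior)
print("optimal A without benefits:", A_corner)
```

When the benefit channels are switched off, the search lands on zero aggregators, the corner solution. With them switched on, the optimum is a positive number of aggregators, though the specific value depends entirely on the made-up parameters.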

Finally, let me emphasize that this is a very quick effort -- I literally threw it together while watching my Ph.D. students take their final this afternoon -- and there are many ways in which it could be improved. It is intended to illustrate some of the effects that are at work in this decision, and to make the argument that the optimal values of L and A are not necessarily zero, nothing more.

Next up -- when there's time -- what's the optimal length of an rss feed? If there are only two choices, all or (essentially) none, what are the conditions under which the full rss solution prevails?

Posted on Monday, March 15, 2010 at 07:02 PM in Economics, Technology, Weblogs


