This is for Felix. If
you run a website that depends upon advertising, what is the optimal number of
aggregator sites (sites that run part of your original posts plus a link back to
the original)? What is the optimal length of an excerpt? This won't produce any
results that can't be obtained with a few minutes of thought; it simply
formalizes the exercise so that the assumptions and critical parameters are
evident.

Let's start by supposing that the profit for the original content site
depends upon the number of visitors, i.e. that:

π = R(sN) - W

where R is the ad revenue function, sN is the number of visitors to the site,
and W is costs. It is assumed that R_{sN} > 0, i.e. that ad revenue
increases with the number of visitors. This function could be made more complicated, but so long as profit is increasing in the number of visitors, this will suffice since the profit function will reduce to something along these lines in any case.

The number of visitors has two parts, s and N, where N is the total available
audience for the type of content you are offering, and s is your share of that
audience. I will assume that aggregators, whose number is denoted by A, increase the total size of the
audience. That is, they provide a service by grouping and filtering content. This
lowers transactions (search) costs for readers, and that increases their
number. More formally:

N = N(A), N_{A} > 0

What determines the share of the audience, s, that comes to the site? The
share is assumed to depend negatively upon the number of aggregators (since
aggregators divert traffic), and positively upon the number of "clickthroughs"
from the aggregators:

s = s(A,C), s_{A} < 0 and s_{C} > 0.

What determines the number of clickthroughs, C? The value of C depends upon two things:
the number of visitors to aggregators, rN, where r is the share of the total
audience captured by the aggregators, and the average length of excerpts among
the aggregators, L:

C = C(rN,L), C_{rN} > 0 and C_{L} < 0.

Clickthroughs are assumed to go up when aggregators get more visitors, and to go down when
aggregators run longer excerpts, so longer excerpts are bad for the originator. But there's also a benefit to longer excerpts: the share of visitors going to aggregators rises with the length of excerpts,
and this increases clickthroughs:

r = r(L,A), r_{L} > 0, r_{A} > 0.

It is also assumed that the share to aggregators is increasing in the number of aggregators, A.

Finally, I am holding costs, W, constant since they don't vary in any
important way with the parameters of interest, the number of aggregators, A, and
the average length of an excerpt, L.

There are two tradeoffs built into this setup. First, as the number of
aggregators goes up, the size of the audience goes up (at a decreasing rate), and
that increases the visitors to the original content site. However, the share of
the audience goes down since more people will choose to do their reading at the
aggregators rather than the original site. For the length of an excerpt, L, the
tradeoff is that as L goes up, more people visit the aggregator sites since the quality of their posts is higher, and that
increases the aggregators' traffic, rN, and with it the number of clickthroughs. But a longer excerpt also leaves readers less reason to click through, which reduces the originator's share of the
audience.

More formally, the profit function is:

π = R[s(A,C(r(L,A)N(A),L))N(A)] - W
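To make the nesting concrete, here is a minimal Python sketch of the same profit function. The functional forms and every number in it (a log audience function, a 0.6 cap on the aggregators' share, linear ad revenue, and L measured as the fraction of a post that is excerpted, between 0 and 1) are illustrative assumptions chosen only to match the signs assumed above. They are not part of the model and are not estimated from anything.

```python
import math

# Hypothetical functional forms and parameter values; only the signs of the
# derivatives are taken from the text.
def N(A):                 # total audience, N_A > 0
    return 1.0 + 0.5 * math.log(1.0 + A)

def r(L, A):              # aggregators' share, r_L > 0 and r_A > 0
    return 0.6 * (1.0 - math.exp(-0.3 * A)) * L / (L + 0.5)

def C(rN, L):             # clickthroughs, C_rN > 0 and C_L < 0
    return rN * (1.0 - L)

def s(A, clicks):         # originator's share, s_A < 0 and s_C > 0
    return (0.5 + 0.5 * clicks) / (1.0 + 0.1 * A)

def R(visitors):          # ad revenue, R_sN > 0 (linear by assumption)
    return 1.0 * visitors

W = 0.1                   # costs, held constant

def profit(A, L):
    """pi = R[s(A, C(r(L,A) N(A), L)) N(A)] - W"""
    n = N(A)
    clicks = C(r(L, A) * n, L)
    return R(s(A, clicks) * n) - W

# Profit at a few aggregator counts, holding the excerpt length at L = 0.3.
for A in (0, 2, 5, 10, 50):
    print(A, round(profit(A, 0.3), 3))
```

Under these particular assumptions profit first rises and then falls as A grows, which is the first tradeoff described above; other functional forms could easily behave differently.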

Then the first order conditions for A and L are:

π_{A} = R_{sN}[[s_{A} + s_{C}C_{rN}(rN_{A} + r_{A}N)]N + sN_{A}] = 0

π_{L} = R_{sN}[s_{C}C_{rN}r_{L}N + s_{C}C_{L}]N = 0

These are both intuitive (and obvious - hope I got the derivatives right). Let's take the second equation first.
This equation, which is the first order condition for L, reduces to:

C_{rN}r_{L}N = -C_{L}

The term on the left-hand side is the marginal benefit to the original
content provider from an increase in L (with A constant). It says that when L goes up, aggregators
get a higher share of traffic since the quality of their services goes up (some of the
increase in traffic is brand new, i.e. it does not all come from the content
provider), and the extra aggregator traffic generates more clickthroughs. The right-hand side is the marginal cost. It says that when L, the average excerpt length, goes
up, the percentage of aggregator visitors who click through goes down. The optimal L balances these two
effects on the margin. Importantly, for some parameter values the solution is
interior, i.e. the optimal excerpt length is greater than zero, and it's possible that the optimum is for full, uncut excerpts. (For example, if C_{L}
is relatively small in magnitude, i.e. if increasing the excerpt length has little detrimental effect on
the number of clickthroughs, then the solution is likely to be interior and the excerpt length longer. If C_{rN}
or r_{L} is large, i.e., if clickthroughs are quite responsive to
increases in aggregator traffic, or if traffic to aggregators responds strongly
to increases in excerpt length, then this also points toward an interior
solution and longer excerpt lengths, perhaps even full excerpts.)
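As a rough numerical illustration of the condition C_{rN}r_{L}N = -C_{L}, the sketch below finds the excerpt length at which marginal benefit equals marginal cost. The functional forms are hypothetical (they are not from the model above, only consistent with its signs), L is measured as the fraction of a post that is excerpted, and the partial derivatives are approximated by central finite differences rather than derived by hand.

```python
import math

# Hypothetical functional forms, chosen only to satisfy
# N_A > 0, r_L > 0, r_A > 0, C_rN > 0 and C_L < 0.
def N(A):
    return 1.0 + 0.5 * math.log(1.0 + A)

def r(L, A):
    return 0.6 * (1.0 - math.exp(-0.3 * A)) * L / (L + 0.5)

def C(rN, L):
    return rN * (1.0 - L)

def foc_gap(L, A, h=1e-6):
    """Marginal benefit minus marginal cost of L: C_rN * r_L * N + C_L."""
    n = N(A)
    rn = r(L, A) * n
    C_rN = (C(rn + h, L) - C(rn - h, L)) / (2 * h)
    C_L = (C(rn, L + h) - C(rn, L - h)) / (2 * h)
    r_L = (r(L + h, A) - r(L - h, A)) / (2 * h)
    return C_rN * r_L * n + C_L

def optimal_L(A, lo=0.01, hi=0.99, tol=1e-10):
    """Bisection on the first order condition; assumes a single sign change."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc_gap(mid, A) > 0:   # benefit still exceeds cost, so lengthen
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

L_star = optimal_L(A=5)
print(round(L_star, 3))
```

With these assumed forms the optimum is interior, at a little over a third of the post. Because C falls to zero at L = 1 in this specification, full excerpts can never be optimal here; a specification in which some readers click through even after reading a full excerpt would be needed to get that outcome.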

The first order condition for the number of aggregators, A, can be written
as:

R_{sN}[s_{C}(C_{rN}(rN_{A} + r_{A}N))N + sN_{A}] = -R_{sN}s_{A}N

As before, this says that marginal cost equals marginal benefit. When the
number of aggregators, A, goes up, the marginal cost is that the original content provider's share, s, of the
total available audience goes down, so its traffic, sN, falls, and
this in turn lowers ad revenue. This is the right-hand side of the equation, -R_{sN}s_{A}N. The
marginal benefit of A going up is threefold. First, as A goes up, N goes up and
this increases traffic to the originator for any given value of s. This is the last term, R_{sN}sN_{A},
on the left-hand side of the equation. Second, because the increase in A also increases N, and
visitors to aggregator sites depend upon rN, visitors to aggregator sites also go up,
clickthroughs then go up, and this increases the share and ad revenue for the original
content provider. This is captured by the term R_{sN}s_{C}C_{rN}rN_{A}N.
Third, as A goes up, the share, r, to aggregators goes up, and this also increases
clickthroughs to the original site. This term is R_{sN}s_{C}C_{rN}r_{A}N^{2}.
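One way to check that the cost term and the three benefit terms really add up to the derivative of revenue with respect to A is to compute each piece numerically and compare the total with a direct finite-difference derivative. The sketch below does that under hypothetical functional forms (none of them from the model itself), with linear ad revenue so that R_{sN} = 1, and with all partial derivatives approximated by central differences.

```python
import math

# Hypothetical functional forms matching only the assumed signs.
def N(A):
    return 1.0 + 0.5 * math.log(1.0 + A)

def r(L, A):
    return 0.6 * (1.0 - math.exp(-0.3 * A)) * L / (L + 0.5)

def C(rN, L):
    return rN * (1.0 - L)

def s(A, clicks):
    return (0.5 + 0.5 * clicks) / (1.0 + 0.1 * A)

def visitors(A, L):
    """sN, which equals ad revenue here because R(v) = v by assumption."""
    n = N(A)
    return s(A, C(r(L, A) * n, L)) * n

def deriv(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def foc_terms(A, L):
    n, share_agg = N(A), r(L, A)
    clicks = C(share_agg * n, L)
    N_A = deriv(N, A)
    r_A = deriv(lambda a: r(L, a), A)
    s_A = deriv(lambda a: s(a, clicks), A)     # partial, clicks held fixed
    s_C = deriv(lambda c: s(A, c), clicks)
    C_rN = deriv(lambda x: C(x, L), share_agg * n)
    cost = -s_A * n                             # -R_sN s_A N
    audience = s(A, clicks) * N_A               # R_sN s N_A
    click_N = s_C * C_rN * share_agg * N_A * n  # R_sN s_C C_rN r N_A N
    click_r = s_C * C_rN * r_A * n * n          # R_sN s_C C_rN r_A N^2
    return cost, audience, click_N, click_r

A0, L0 = 3.0, 0.3
cost, audience, click_N, click_r = foc_terms(A0, L0)
total = deriv(lambda a: visitors(a, L0), A0)
print(abs((audience + click_N + click_r - cost) - total) < 1e-6)
```

At the point chosen here the three benefit terms together exceed the cost term, so under these assumptions one more aggregator would still raise the originator's revenue; the optimum is where the two sides are equal.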

More succinctly, on the benefit side, when A goes up, both r and N go up. Two of the marginal
benefit terms capture the resulting increase in the aggregators' traffic, rN,
and how the increase in rN affects clickthroughs and ultimately ad revenue. The
other term captures how the increase in N affects the originator's traffic, sN.

Again, importantly, the optimal number of aggregators (from the original
content provider's point of view) is not necessarily zero. An optimum at zero requires a
corner solution, so it depends upon the value of the parameters.
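To illustrate how the answer can flip between an interior optimum and a corner at zero, the sketch below searches over whole numbers of aggregators for two values of a hypothetical diversion parameter d, which scales how quickly the originator's share falls as A rises. Everything here (the functional forms, the parameter d, the cap of 50 aggregators, the fixed excerpt length of 0.3) is an assumption made for the example.

```python
import math

def profit(A, L, d, W=0.1):
    """pi = R(sN) - W under hypothetical forms, with R(v) = v.

    d is an illustrative parameter: larger d means aggregators divert
    readers from the original site more strongly (s_A more negative).
    """
    n = 1.0 + 0.5 * math.log(1.0 + A)                              # N(A)
    share_agg = 0.6 * (1.0 - math.exp(-0.3 * A)) * L / (L + 0.5)   # r(L, A)
    clicks = share_agg * n * (1.0 - L)                             # C(rN, L)
    share = (0.5 + 0.5 * clicks) / (1.0 + d * A)                   # s(A, C)
    return share * n - W

def best_A(L, d, max_A=50):
    """Profit-maximizing whole number of aggregators between 0 and max_A."""
    return max(range(max_A + 1), key=lambda A: profit(A, L, d))

print(best_A(L=0.3, d=0.1))   # weak diversion
print(best_A(L=0.3, d=2.0))   # strong diversion
```

With weak diversion the best number of aggregators is positive, while with strong diversion the search lands on zero, which is the corner solution described above.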

Finally, let me emphasize that this is a very quick effort -- I literally threw it
together while watching my Ph.D. students take their final this afternoon -- and
there are many ways in which it could be improved. It is intended to illustrate
some of the effects that are at work in this decision, and to make the argument
that the optimal values of L and A are not necessarily zero, nothing more.

Next up -- when there's time -- what's the optimal length of an RSS feed? If
there are only two choices, all or (essentially) none, what are the conditions under which the full RSS solution
prevails?