[iDC] Alternatives to black-box page-rank algorithm (was conference summary part 2: the internet as playground and factory)

Thu Nov 19 19:33:16 UTC 2009

Zbigniew Lukasiak raises important issues below.  I don't think there will
be many commercial alternatives developing, for reasons I give here:

http://madisonian.net/2009/03/18/seven-reasons-to-doubt-competition-in-the-general-search-engine-market/

So we have to respond to the dominant search engine (i.e., the
Googlement) we've got. Gaming is a serious problem.  My short answer is that a
trusted advisory committee within the Federal Trade Commission (the US’s
national privacy and consumer protection regulator) could help courts and
agencies adjudicate coming controversies over search engine practices,
without revealing rankings to the public.  Such a committee, like the FISA
court, would not practice “total transparency”—it would practice “qualified
transparency,” only releasing relevant methods “in camera” to entities that
have a bona fide complaint.  Such a committee would extend to the
administrative realm an old judicial practice called “protective orders,”
which allow trade secrets in litigation to be reviewed.

This institution might provide one method for developing what Christopher
Kelty calls a “recursive public”–one that is “vitally concerned with the
material and practical maintenance and modification of the technical, legal,
practical, and conceptual means of its own existence as a public.”  Questioning
the power of a dominant intermediary like Google is not just a prerogative
of the anxious.  Rather, monitoring is a prerequisite for assuring a level
playing field online.

However, even if we think that type of institutional solution is not
practical, it’s still valuable to consistently remind people of the
weaknesses of “algorithmic authority,” as I do here:

http://balkin.blogspot.com/2009/11/assessing-algorithmic-authority.html

I think that sort of consciousness-raising is important because, at the
conference, one participant at the closing session said that media studies
was in a primitive state, closer to “alchemy” than a real science like
physics. We need to bear in mind the power of internet intermediaries before
treating the web as a natural phenomenon to be studied and understood using
the models of natural science.

Search engines are referees in the millions of contests for attention that
take place on the web each day.  There are myriad entities that want to be
the top result in response to a query like “sneakers,” “best restaurant in
New York City,” or “best employer to work for.” The top and right-hand sides
of many search engine pages are open for paid placement; but even there the
highest bidder may not get a prime spot because a good search engine strives
to keep even these sections relevant to searchers.  The unpaid, organic
results are determined by search engines' proprietary algorithms, though
users often fail to distinguish between unpaid and paid placements.

Given the secrecy of search engines’ ranking algorithms and carriers’
network management practices, it is very difficult for an entity to
determine whether it has a “stealth marketing problem” online—i.e., a
competitor that is somehow leveraging payments or business partnerships with
intermediaries in order to gain greater relative exposure.  Recognizing this
problem, the FTC has taken some tentative steps toward acknowledging the
potential for consumer deception and cultural distortion here.  In 2002, the
agency sent a letter to various search engine firms recommending that they
clearly and conspicuously distinguish paid placements from other results.  But
neither the FTC nor other potential regulators have followed up such guidance
with systematic monitoring.

In order for the FTC to determine whether its guidance is actually being
followed, it will need to develop sophisticated methods of understanding how
organic results are determined.  Without such an understanding, it will be
impossible to distinguish between paid and organic content.  This monitoring
needs to happen in real time, rather than after a dispute arises, for many
reasons.  First, data retention may be spotty. Second, the history of
regulation of high technology industries indicates that government lag in
understanding how critical infrastructure functions can effectively neuter
even a strong regulatory regime. Just as Danny Weitzner has called for an
“independent panel of technical, legal and business experts to help [the
FTC] review, on an ongoing basis, the privacy practices of Google,” the
agency needs to develop the capacity for understanding the ranking practices
of Google and its competitors.  This capacity could, in turn, enable
litigants to submit focused queries to a nonbiased third party that could
quickly give critical information to courts mired in discovery disputes in
search-related lawsuits.

I hope this counts as a practical response that respects Google’s war
against spammers.  As Elizabeth Van Couvering has argued, search engines
often operate using a “war schema . . .  as they assume the role of guardian
or protector of something precious—in this case, access to the Web” (*Is
Relevance Relevant? Market, Science, and War: Discourses of Search Engine
Quality*, 12 J. Computer-Mediated Comm. 866, 880 (2007)).   The public
should have some idea how the internet is shaped by search engines.  And
where, as in the case of books, the problem of spamming should be less acute
than that on the web as a whole, more transparency may well be appropriate.

All best,

--Frank

PS: France's "Commission Nationale De L'Informatique et des Libertes"
(CNIL) appears to have taken some important steps regarding privacy, but I'd
love to hear from French list members whether it's actually an
institutional model for assuring that "the development of information
technology remains at the service of citizens and does not breach human
identity, human rights, privacy or personal or public liberties."

On Thu, Nov 19, 2009 at 3:08 AM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:

> Hi there,
>
> I have not been at the conference and I don't know if this point was
> raised; if it was, please forgive me.
>
> On Wed, Nov 18, 2009 at 6:28 AM, nathan jurgenson
> <nathanjurgenson at gmail.com> wrote:
> > Frank Pasquale forcefully called on Google to be more transparent. Given
> > what was discussed above, as well as Google’s central status in our
> > day-to-day knowledge-seeking life, Pasquale leaves us with questions to
> > ponder: should its page-rank algorithm be public? Should Google be allowed
> > to up-rank or down-rank links based on their relationship to the company?
> > Should Google be able to simply remove pages from its listings? Should
> > Google be forced to let us know when they do these things? ~nathan
>
> I am also more and more afraid of the Kafkaesque world of Google
> government of our information sources - but they do have a valid point
> for the secrecy of page-rank: this is about defending against those
> that try to game the system.  If the page-rank algorithm was public it
> would be analysed and effective ways to game it would be found and we
> would drown under the deluge of spam.  Now there are still people and
> companies that try to analyse the black-box - but at least their
> actions cannot be very effective.
>
> If we are to be constructive in our criticism of Google for the black-box
> algorithm we should also propose some alternative.   Most probably
> there is no alternative that Google could unilaterally deploy - most
> probably this would require a complex web of law, social norms and
> technical changes.  This would be an interesting project.
>
> Cheers,
> Zbigniew Lukasiak
> http://brudnopis.blogspot.com/
> http://perlalchemy.blogspot.com/