Making law in a “webby” world — what’s in it for libraries?

Jason Wilson has an interesting new series of blog posts (1, 2) grappling with the idea of defining, or re-defining, the paradigms in which legal practice locates, defines, and uses authority. Ultimately, he argues that it is (past) time to abandon a tired “library” metaphor for a more modular, LEGO-like approach to crafting responses to legal research needs.

It is an interesting, although incomplete, argument. I particularly find value in the first part, urging a reconsideration of both print and, more broadly, library metaphors for online, inter-linked, legal information. And I would agree that law libraries as institutions have often been too captive, if perhaps inevitably so, to incremental improvements in traditional services and resources, such that we are probably doomed to miss the really big disruptive, transformational, changes in the ways information (including legal information) is constructed and consumed.

Borrowing Jason’s disclaimer that I, too, am not a legal philosopher, I am nonetheless unconvinced by his analysis of law and legal change. He seems to be writing off the construction of legal research tools around silos of authority (statutory codes, case repositories, etc.) as a rusty and outdated shackle that we’ve retained only from our habitual over-reliance on library metaphors. Instead, I see the tendency of online tools to erode the visibility of hierarchy and structure in the law as one of the greatest dangers of online legal research (to its practitioners) and, perhaps, even a danger to the broader nature and quality of legal work and discourse.

I should say the tendency and potential of online tools to erode the visibility of lines of authority and context: authority and context don’t disappear (replaced by some incompletely-defined LEGO-block of a legal meme) just like that. Even if the collective cognitive structure of “law” is to break down over time, it will be a slow process, leaving residues and traces, and the practitioner who understands those residual patterns is advantaged over the one who doesn’t. Actually, I’d say that so long as law is law, our ability to make reference to these structures and contexts is essential. We’re all legal realists now, and we ‘play the game’ with self-awareness, but nonetheless judging and other law-making activities are bounded and legitimized only by the necessity to make legal argument within the bounds of established hierarchies of authority and precedent, and with shared reference to external authority (both sovereign and cognitive) as defined and accepted within communities of practice. It is that “cognitive” part where I worry about researchers who’ve lost touch with context-giving research tools. There is value in crafting an argument not just from a grab-bag of authorities, such as what court p has said about similar scenario x, but also from knowing how legal issue y has traditionally been understood as a close cousin of legal issue z, and how the law in subject area y has tended to show more favor to analogy a than to analogy b because of a’s relevance in the law of z.

The research component of this, and the law librarian’s stake, is that tools can either make those contexts and networks of authority transparent or not. I’d agree that we may reach a point where book-metaphor and library-metaphor representations aren’t the best way to clarify those connections — something “webbier” may be emergent. But context-providing research tools, from primary sources constructed as statutory codes to secondary sources like well-accepted narrative treatises that give voice to a practice community’s understanding of how its subject area is contextually organized, make it easier for practitioners to make good legal arguments and, presumably, for lawmakers to make good law. Tools that obscure context make it harder.

The existing generations of online research tools have obscured context: their focus (technically necessary at their inception) on literal, Boolean-based textual retrieval targeted to an individual “document” removed all visible context. This is critically limiting as soon as search turns to something like a statutory code, which does have internal order, and accesses it using tools and paradigms (e.g. text-based query with each section defined as a ‘document’) that actively hide that internal order and context. For cases, bypassing a Digest does remove one external giver of order/context, but at least judicial opinions, in some sense, really do “float out there by themselves,” each an atomized document, just as they appeared in early Lexis or Westlaw, connected to one another primarily by way of citation and cross-reference (i.e. links). So a web of “links” (cross-referencing documents over time) isn’t actually a bad way to represent judicial law. New tools, like WestlawNext, that (finally) replace both Boolean literalism and the crude relevance ranking of their old ‘Natural Language’ with the connection-seeking, link-parsing logic of post-Google search engines are important, and will doubtless change the way we search, read, and write law. But I still have trouble seeing how such tools would replace the need to communicate structured context to the researcher. And so I remain skeptical of the logical next step, which would be a collapsing of any remnant of the “library metaphor” database directories into one big wide-open Google-esque search box.

I have more to say about Jason’s second post, where he grapples toward a sort of object-oriented approach to legal argument (using a LEGO metaphor). So far as this seems to be a recognition of the inadequacy of some CALR equivalent of the basic Google search, while still looking for some alternative to the old database silos, I’m in sympathy. But that’s a whole other discussion…

Case Law in Google Scholar

Google Scholar has a new radio-button selection on its front page to search for “Legal opinions and journals.” This development is at least a useful new free way to quickly obtain the (cut-and-paste-able, HTML) text of known opinions, with cited opinions conveniently hyperlinked; it remains to be seen what, if any, deeper research value the tool will have.

Based on a few minutes of tinkering, the legal opinions that turn up in searches are full-text and hosted by Google, while journal article results tend to be hosted by third parties and/or appear only as a “citation” result in the Google Scholar search. The ‘Advanced Scholar Search’ interface also allows the user to limit the search to opinions either from federal courts or from individual states. The Google-hosted Scholar results do not seem to show up in regular web-search Google results.

Google Scholar searching is citation-based, in that the presentation and ranking of results is based on Google’s crawler’s indexing of the citing documents rather than primarily on the indexing of the document returned by the search (which, indeed, need not even be available in full text online, so long as it is cited often enough by crawl-able online content). While Google’s web search obviously looks (more broadly) at linking patterns (an innovation on the Google founders’ part that was informed by their familiarity with scholarly citation analysis), the focus on citations as search fodder in Google Scholar is narrower and much more explicit. As a result, using Google Scholar for cases will differ in important ways from searching in other online, word-searchable repositories of case law, where results are based primarily on the presence and placement of search words in the actual retrieved texts.
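To make the contrast concrete, here is a minimal toy sketch in Python. It is not Google's algorithm (which is not public) and not how Lexis or Westlaw rank results; the corpus, the case names, and both scoring functions are hypothetical, invented purely to illustrate how ranking by citing documents can surface a different "first result" than ranking by search-word frequency.

```python
from collections import Counter

# Hypothetical toy corpus: opinion name -> (text, names of opinions it cites).
CORPUS = {
    "Lead Case": ("business method patent eligible", []),
    "District Case A": (
        "business method patent business method patent claim",
        ["Lead Case"],
    ),
    "District Case B": (
        "business method patent business method infringement",
        ["Lead Case", "District Case A"],
    ),
    "District Case C": ("patent damages", ["Lead Case"]),
}


def keyword_score(text, query):
    """Count how often the query terms occur in the text itself."""
    words = Counter(text.split())
    return sum(words[term] for term in query.split())


def rank_by_keywords(corpus, query):
    """Rank opinions by search-word frequency in the retrieved text."""
    return sorted(
        corpus,
        key=lambda name: keyword_score(corpus[name][0], query),
        reverse=True,
    )


def rank_by_citations(corpus, query):
    """Rank opinions by how many query-matching opinions cite them."""
    cited_by = Counter()
    for text, cites in corpus.values():
        if keyword_score(text, query) > 0:
            for target in cites:
                cited_by[target] += 1
    return sorted(corpus, key=lambda name: cited_by[name], reverse=True)


query = "business method patent"
print(rank_by_keywords(CORPUS, query)[0])   # the most word-heavy opinion
print(rank_by_citations(CORPUS, query)[0])  # the most-cited opinion
```

In this toy corpus the keyword ranking puts a wordy district court opinion first, while the citation ranking puts the much-cited lead case first, which is the kind of difference described above.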

A test search for In re Bilski, for instance, turns up the Federal Circuit’s opinion, and also several of the major patent cases cited in Bilski. This didn’t work all that well as a way of finding the authorities behind Bilski (Diamond v. Diehr, for instance, turns up nowhere in the results), as Google’s bots obviously have no way of knowing which cited works are either at the top of the authority pyramid or figure most importantly into the analysis of the courts and parties. Even so, the citation-based ranking algorithm likely does do much better than Lexis’ or Westlaw’s ‘natural language’ searches in zeroing in on ‘lead cases’ with a fairly crude/simple keyword search: a test search for “business method patent” turned up (appropriately, I’d say) the State Street Bank case as the first result (with other major cases following), while Lexis’ and Westlaw’s ‘natural language’ searches predictably turned up mostly long lists of recent district court opinions.

I’m reluctant to extend an endorsement of Google Scholar as a case-finding tool too far, as any number of secondary sources would, much more reliably (and with valuable provision of context and explanation) direct the researcher to the same lead cases. On the other hand, Google’s citation-based ranking algorithm certainly adds something new to the mix, and provides results that strike me (on limited experimentation) as an interesting contrast to those from web-based, keyword-searched, repositories.

The citation-analysis basis of Google Scholar is also leveraged in the presentation of results. When a judicial opinion result is selected by the user, it is presented in a tabbed display, with the ability to toggle between the default “view this case” tab and a tab labeled “how cited.” “How cited” includes ‘blurbs’ from citing resources (hint to Google: offering a way to get at the, e.g., “83 similar citations” would be useful for this material) and also a full list of citing documents that Google has unearthed for the case in question. It is unclear to me how these citing documents are ranked as results, and there do not appear to be tools to limit this display to, e.g., cases only (much less to cases by jurisdiction). But at least the germ of a new, free citator exists here.

It is unclear precisely where Google is sourcing its judicial material from, or what the depth or scope of the database is (though Justia notes that opinions from all 50 states are included from 1950). Results do come, in some way, from official reporters and do include marginal notation of print pagination. I can speculate that the Public.Resource.Org compilation of federal case law, much of it originating with a donation from Fastcase, may play a role. State sourcing is even more mysterious. I hope colleagues can fill me in on what role Google’s bulk library book scanning plays, as my sense was that the collections of the law libraries involved in the program had not yet been scanned.

Other places with new news and tinkering/experiment results:
Justia’s Law, Technology, and Legal Marketing Blog (and Paul Stanley’s Twitter feed); Harvard Law School Library’s Et Seq. blog; ResourceShelf; Internet for Lawyers; and Rick Klau of Google, on Twitter.

Andrew Plumb-Larrick

