UELMA should be enacted by the states. But while its concept of verifiable authenticity is useful, compliance with that prong of the law should not lead to excessive reliance on PDF files, which share legal resources only in very limited, maximally “print-like,” and insufficiently machine-readable and bulk-processable formats. Open legal resources should also allow commercial, non-profit, community, and library entities to consume relatively “raw” government resources and add the value that makes them efficient and effective tools for an informed public.
UELMA is the Uniform Electronic Legal Materials Act. Promulgated by the Uniform Law Commission, UELMA covers online state legal materials that are deemed “official” (while requiring that covered legal publications produced only electronically be considered official) and requires that they be produced in a way that is capable of being authenticated, that is preserved, and that is made permanently available to the public.
The law library community has been pushing hard for UELMA. Here’s the resource page from the American Association of Law Libraries. Much of the librarians’ urgency around UELMA flows, I think, from alarm at the prospect that states will move to online-only production of important legal materials without adequate consideration of preservation, long-term access, or “authenticity.” AALL’s two major state-by-state studies, of preservation in 2003 and of authentication in 2007 (updated in 2009), remain the clearest statements of both these fears and the condition of their accommodation (or not) by the states. The subsequent effort to inventory state materials provided further evidence of an increasing number of official, but not authenticated, online materials.
That last concept, authentication, is conceptually a bit trickier than preservation and permanent public access. At its heart is the notion that a legal system requires both the reality and the public perception that writings with the official force of law, once published, can’t be changed, either accidentally or maliciously. We have to be able to know that writings purporting to be the law really are the law: that they are authentic to the statement originally emanating from the legislature, regulatory body, or court and that any changes or amendments have themselves been fully documented and described in an explicit, auditable way.
The importance of authenticity
One way to explain both the concept and the alarm the library community feels about it is to compare two competing paradigms: a government website that can change from one day to the next (either in a big, political way with the rise of a new administration or in small, day-to-day changes of focus or interest) and the traditional dissemination of official legal materials in sets of books, which are distributed to many libraries and, as a practical matter, can only be “amended” with the distribution of supplements or replacement copies. Libraries can and do maintain the old material, allowing lawyers and other researchers to reconstruct the law at any moment in the past with a high degree of confidence in the reliability of the materials used to do so. Librarians, for reasons that I think should seem obvious, are alarmed at any prospect of the law coming to resemble the casual, easily and centrally modifiable nature of a government agency website. Clearly, some combination of process and technology that ensures materials put forth as “the law” really are the law is important.
Print, and its discontents
Think about a major legal genre, statutory codes. We still live in a world where, in the vast majority of (all?) states, as well as in the federal system, only the bound and printed edition of statutory codes is considered “official.” (And this is apart from issues related to codification, enactment from session laws, etc.) What this means is that, most typically, only the printed (often expensive, privately published/printed) version of the code is formally considered to be proof, or at least definitive ‘prima facie’ evidence, of the actual law. It also means that the versions of codes that are shared freely online with the public are formally derogated as “unofficial” and often carry explicit statements that they are not to be relied upon. This state of affairs is often damaging to full public access to the law. It might also be regarded as carrying a whiff of protectionism, vis-a-vis commercial legal publishers who have state contracts and even, to some degree, vis-a-vis traditional public or public-university academic law libraries.
Note that this situation doesn’t prevent workaday lawyering from relying, overwhelmingly, on high-end commercial online legal research providers, whose electronic versions are also “unofficial.” That is a topic for another day, but I certainly think that the fact that leaders among those providers are the self-same publishers who disseminate official print codes contributes to this. More darkly, I’d also speculate that the very fact that these CALR systems are expensive contributes to our willingness to rely upon them.
Print does have some real inherent virtues here. As a technology, bound print with some kind of formal serialized paper supplementation does a really good job of capturing static snapshots of the law at the moment of publication that can also be advanced forward, with supplements, in a way that requires some technical understanding of the formal print tools but is ultimately very much controlled by and transparent to the end user (researcher).
As far as “authentication” goes, the virtue of print is distribution. Sure, you can counterfeit one volume of one book. You can’t readily alter all of the copies of all of the books on all of the shelves of all of the libraries without a lot of visible process and without drawing a lot of attention. Even fixing actual printing errors, once something is “in the wild,” requires a lot of visible effort. It is also hard for the authentic nature of distributed print to be damaged accidentally. The kind of technical error that might, conceivably, corrupt the audit trail of a centralized electronic repository (even one that supposedly keeps accurate track of change and amendment) couldn’t easily propagate through the print world.
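The distribution argument can be loosely sketched in code. The following Python is an illustration only, under the assumption (mine, not UELMA's) that many libraries each hold an independent electronic copy of the same text and that a researcher can compare checksums across them. The library names, the sample text, and the function find_outliers are all hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def find_outliers(copies: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Return the most common SHA-256 digest and the holders whose copy differs."""
    digests = {
        holder: hashlib.sha256(data).hexdigest()
        for holder, data in copies.items()
    }
    # The digest shared by the largest number of holders is treated as consensus.
    consensus, _count = Counter(digests.values()).most_common(1)[0]
    outliers = sorted(h for h, d in digests.items() if d != consensus)
    return consensus, outliers


# Hypothetical text held by four hypothetical libraries.
official = b"Sec. 1. This act takes effect July 1."
copies = {
    "library_a": official,
    "library_b": official,
    "library_c": official.replace(b"July", b"June"),  # one altered copy
    "library_d": official,
}

consensus, outliers = find_outliers(copies)
print(outliers)  # ['library_c']
```

A single altered copy stands out only because the other copies are held independently; a lone central repository offers no comparable cross-check.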
But, and this is surprisingly important, the authenticity of print has nothing to do with how it looks. It has little or nothing to do with the way that books are paginated (yes, this is relevant for citation stability, but not for authenticity) or with the way in which text is arranged on a page. An electronic file that is made to present itself to the reader as “print like” in a user-interface respect is inherently no more print-like for auditability/authenticity purposes than a raw text file, or HTML, XML, or JSON. This may seem to some like stating the obvious, but I’ve worked enough with law journal cite-checkers to know the magical hold that “can you get a PDF of this” has on some corners of the legal mind.
UELMA and the Magical PDF
Which takes us back to UELMA, and its potential unintended consequence for open legal data, machine-readable legal information, and the emergence of a legal information “ecosystem” in which basic and free release of legal information by states might allow a truly competitive market for enhanced legal research products and tools. A stream of verifiable, auditable, legal output from a state could be scooped up by commercial AND non-profit builders of legal research tools for editorial and/or structural enhancement. Libraries could of course be re-publishers themselves. They could also “plug in” to undertake their own projects for long-term archiving and preservation — including engaging in their own capture-and-print preservation of vital material for truly long-term century-scale preservation.
There are certainly a lot of other obstacles to the emergence of this legal-information ecosystem. Not least among them is the reliance of governments on privileged outside publishers for the production and distribution of at least some official legal materials (especially codes).
But well-meaning and important projects like UELMA should not, themselves, become unintended obstacles to the emergence of that kind of “open source” legal information framework.
In an important blog post, Waldo Jaquith of U.S. Open Data addresses both a problem and a potential solution, with a track-record in Minnesota. The problem that he identifies is that UELMA-adopting states, newly obliged to not only preserve permanent public access to covered materials but to share them in a verifiable format (capable of being authenticated), will turn solely to “signed” PDF for publication of legal material. PDFs are very much not the static, photographic, facsimiles of an inert page that the (legal) public often thinks they are. (That we use a format for this role that also has a lot of dynamic, scripted, interactive capacity is a whole other can of worms.) They are, at least, openly and well documented and accessible, in some ways, to software (e.g. extraction of text in PDFs for search-engine indexes). But collections of individual signed PDFs are far from an ideal format for large-scale machine-readability, data-processing, and re-use. As Jaquith notes:
To comply, many states are simply looking at publishing all laws as PDFs, using Adobe Acrobat to digitally sign each PDF. That would be a cheap, easy way to comply with the law, with the downside that it could be the end of legal data within states. That’s because an authentication system based on Acrobat cannot be applied to anything other than a PDF, like JSON, XML, YAML, etc. That would be catastrophic for open data. As states adopt UELMA, they have a year or two to figure out how to implement it, providing a brief window in which this disaster can be headed off.
I’d add the observation that the very “print-like” nature of the PDFs we most commonly use in law and the elegance of their “signature” metaphor make it easy to willingly trust, on a pretty surface-level assessment, the validity of their “signatures” while failing to equally trust other file verification systems that are described less plainly to a non-technologist public. Also, every politician and every lawyer knows PDF, so it is an easy response to concerns about implementation cost or techy “rabbit holes” to say “well, just use signed PDFs.” But comparable techniques of encryption, hashes, and checksums, and the same need for trusted (or not) certificate authorities, ultimately apply to a signed PDF just as they do to any other digital method of guaranteeing that the record is “unaltered from the official record published by the publisher.”
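To make that point concrete, here is a minimal Python sketch, not drawn from UELMA or from any state's actual implementation, showing that a checksum treats a JSON or XML rendering of a statute exactly as it would treat the bytes of a PDF. The sample section, the file names, and the helpers digest, build_manifest, and verify are all hypothetical. A real system would also need to sign the manifest itself with a key tied to a trusted certificate, which this sketch leaves out.

```python
import hashlib
import json
from typing import Dict


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of raw bytes, whatever format they encode."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each (hypothetical) file name to the digest of its published bytes."""
    return {name: digest(data) for name, data in files.items()}


def verify(name: str, data: bytes, manifest: Dict[str, str]) -> bool:
    """True only if the bytes in hand match the digest the publisher listed."""
    return manifest.get(name) == digest(data)


# Hypothetical official output: the same section published as JSON and as XML.
section = {
    "cite": "Example Code 1-101",
    "text": "This title may be cited as the Example Code.",
}
published = {
    "1-101.json": json.dumps(section, sort_keys=True).encode("utf-8"),
    "1-101.xml": (
        b"<section cite='Example Code 1-101'>"
        b"This title may be cited as the Example Code.</section>"
    ),
}
manifest = build_manifest(published)

# A republisher's untouched copy verifies; an altered copy does not.
good_copy = published["1-101.json"]
bad_copy = good_copy.replace(b"may", b"shall")
print(verify("1-101.json", good_copy, manifest))  # True
print(verify("1-101.json", bad_copy, manifest))   # False
```

Nothing in the check depends on the file being a PDF; the trust question shifts to who vouches for the manifest, which is the same certificate-authority question a signed PDF raises.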
Some years back in the discussions and “tour” of meetings under Public Resource’s “law.gov” banner, I noted a divide among participants between (most of) the librarians and (most of) the others who, for convenience, I’ll refer to as the “technologists.” The former tended to be thinking in terms of government entities providing “full stack” public access — sharing the law in usable, accessible, readable forms on robust public websites, for direct (searching or browsing) access by members of the public. The technologists seemed much more concerned with compelling government to just open the pipes and get the “data” out there, and confident that a second tier of providers would be there to scoop it up and build the appropriate tools and resources for their various slices of the public. The actual “Principles and Declaration” circulated above the names of the “Law.gov” conveners essentially splits this difference, sketching a vision in which governments do more than just open a fire hose, but where the emphasis is also more on bulk access and data than on item-by-item access through government portals. That conveners list, by the way, was dominated by legal academics, including law library directors, and also included computer-science academics and some entrepreneurs and other non-academics.
The U.S. Open Data blog post posits the difficulty governments have in procuring and implementing computer technology as an obstacle (for which they provide a solution) to authentication in a “data friendly” way. I’d posit that it may be an even bigger obstacle to getting full-scale economical implementations of the kinds of direct government provision of free, usable, legal research tools that many of us are waiting for.
Look back at the example of Codes for a moment. The model of public access has too often been based on privileged state relationships with legal publishers who produce unofficial and often very poor-quality, functionally-limited, public-facing websites as part of the deal to produce (expensive, commercial) official print Codes. UELMA itself doesn’t solve this problem, or even directly touch these materials so long as the states continue to produce (or authorize) a print version designated as the only official one. But wouldn’t a data-friendly implementation of UELMA be one paving stone on the road to robust, verifiable output of basic state legal information (freely, online) that would allow any number of fully competitive publishers and intermediaries to deliver equally authentic, cite-able, and usable versions of that material to the reading and researching legal public?
(I’ve frequently used the example of codes in this post. I should note that session laws are the more immediate example where states have begun to cease print publication, such that robust availability of online versions has been very much a live topic.)
8/3 — Adding a note to cross-reference this post of mine, in which I attempt to tease out what I think are the real meaningful distinctions between “print” and “online” resources (http://andrew.plumb-larrick.com/print-vs-online-really-difference/), and this post, with additional earlier ruminations on the PDF (http://andrew.plumb-larrick.com/why-is-so-much-law-in-pdf-files/).