15 June 2014

HathiTrust
V. Trees

 

B. Indexing As Fair Use


This is where things begin to get... interesting. Remember that this relates only to the works properly in front of the Second Circuit — the 78 specific works on which summary judgment was granted below. Remember, too, that this is about a very specific description of the indexing use:

It is not disputed that, in order to perform a full-text search of books, the Libraries must first create digital copies of the entire books. Importantly, as we have seen, the HDL does not allow users to view any portion of the books they are searching. Consequently, in providing this service, the HDL does not add into circulation any new, human-readable copies of any books. Instead, the HDL simply permits users to “word search”—that is, to locate where specific words or phrases appear in the digitized books. Applying the relevant factors, we conclude that this use is a fair use.

Slip op. at 18. With due respect to the Court, this conflates two distinct infringements: the copying necessary to make the index, and the display of the results from the index. That is simply not consistent with the Second Circuit's own case law, particularly the case law concerning extraction and use of text as fact (case law that was purportedly merely "codified" in the 1976 Act, see slip op. at 16, although anyone who actually reads the case law cited in the legislative history will question that characterization). Nonetheless, this conflation may be irrelevant... as the plaintiffs, repeating the mistake of the Guild's ill-implemented complaint against Google, did not properly plead the initial copying as an act of infringement.69 Thus, thanks to some bad lawyering, the Court is left considering only the output and whether it qualifies as fair use.

With that understanding, the Second Circuit's finding that this particular, extremely restricted output form qualifies as fair use is neither surprising nor particularly objectionable. But that's the limit. A full KWIC (keyword-in-context) result, showing significant parts of text on either side of the search result, is outside the scope of this opinion. And that is as it should be; by limiting its consideration to the contextless, data-only result, the court avoided giving real consideration to the second and third fair use factors, and folded its consideration of the fifth, controlling, nonstatutory fair use factor (administrative convenience) rather silently into the first and fourth fair use factors.70 If Hathi starts providing context around search results (and there is already substantial pressure to do so, which will only increase over time), this decision provides it no cover. Nor should it: That would enable a user, through a relatively simple recursive script, to recover the entire text of the underlying work in ready-to-read form, and the greater the context provided with search results, the easier that script is to validate.
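For readers who want to see why context matters so much, here is a minimal sketch of the kind of script described above, run against a toy sentence held in memory. Everything in it is an assumption made for illustration: `kwic_search` is a hypothetical stand-in for a keyword-in-context service (it is not HathiTrust's or anyone else's actual interface), `reconstruct`, `key_len`, and the sample sentence are invented, and the stitching is written as a loop rather than as literal recursion. It also assumes that every two-word phrase in the text is unique; a real script would need some way to disambiguate repeated phrases.

```python
# Toy "book" (an invented sentence; every word, and so every two-word phrase, is unique).
WORDS = ("the quick brown fox jumps over a lazy dog while "
         "seven silent owls watch from an old oak").split()


def kwic_search(words, phrase, width):
    """Hypothetical KWIC service: for each occurrence of `phrase` (a list of
    words), return the `width` words before it and the `width` words after it."""
    n = len(phrase)
    hits = []
    for i in range(len(words) - n + 1):
        if words[i:i + n] == phrase:
            left = words[max(0, i - width):i]
            right = words[i + n:i + n + width]
            hits.append((left, right))
    return hits


def reconstruct(search, seed, key_len=2):
    """Grow a passage outward from `seed` using nothing but search responses.
    `search` takes a phrase and returns a list of (left, right) context pairs."""
    text = list(seed)
    # Extend to the right: query the last few known words, append what follows.
    while True:
        hits = search(text[-key_len:])
        if len(hits) != 1 or not hits[0][1]:
            break  # ambiguous phrase, no hit, or no further context offered
        text.extend(hits[0][1])
    # Extend to the left: query the first few known words, prepend what precedes.
    while True:
        hits = search(text[:key_len])
        if len(hits) != 1 or not hits[0][0]:
            break
        text = hits[0][0] + text
    return text


if __name__ == "__main__":
    for width in (0, 3):
        recovered = reconstruct(
            lambda phrase: kwic_search(WORDS, phrase, width), ["lazy", "dog"])
        print(width, len(recovered), "of", len(WORDS), "words recovered")
```

With `width` set to 0 (the contextless result the opinion actually considered), the script never gets past its two-word seed. With three words of context on each side, it walks out from "lazy dog" and recovers all 18 words. The wider the window, the fewer queries the walk needs, which is the point about context in the paragraph above.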


  69. See part III.A, supra; see generally the discussion of the merits in the main GBS essay.
  70. Indeed, the most form-neutral interpretation of "transformative use" turns almost entirely upon the administrative convenience (or, more to the point, inconvenience) of getting permission for every "transformative use" that is not somewhere on the parody/satire spectrum. Compare, e.g., Suntrust Bank v. Houghton Mifflin Co., 268 F.3d 1257 (11th Cir. 2001) with Dr. Seuss Enters. v. Penguin Books USA, Inc., 109 F.3d 1394 (9th Cir. 1997) and Mattel, Inc. v. Walking Mountain Prods., 353 F.3d 792 (9th Cir. 2003).