Generative AI at work

On October 31st, the Library Innovation Lab at the Harvard Law School Library announced the launch of its Caselaw Access Project (CAP) API and bulk data service. The service makes almost 6.5 million cases, dating from the 1600s to the present, available online, so that anyone can access the full corpus of published U.S. case law for free.

According to Harvard Law Today, “Between 2013 and 2018, the Library digitized over 40 million pages of U.S. court decisions, transforming them into a dataset covering almost 6.5 million individual cases.” The Caselaw Access Project API and bulk data service put this dataset within easy reach of researchers, members of the legal community, and the general public.

Adam Ziegler, director of the Library Innovation Lab, in an article in Fortune Magazine, said, “the Caselaw Access Project will be a treasure trove for legal scholars, especially those who employ big data techniques to parse the corpus. It’s an opportunity to reconstruct the law as a data source, and write computer programs to peruse millions of cases.”

The CAP API and the bulk data service

The CAP API offers open access to descriptive metadata for the entire corpus. Its documentation is written to be easy for both experts and beginners to follow.
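As a rough illustration of what a metadata query against such an API might look like, here is a minimal Python sketch that only builds a search URL and makes no network call. The article does not give the endpoint, so the base URL below is a placeholder, and the parameter names (`search`, `page_size`, `jurisdiction`) and the function `build_case_query` are assumptions made for illustration, not the documented CAP interface.

```python
from urllib.parse import urlencode

# Placeholder base URL: the article does not state the real endpoint.
BASE_URL = "https://api.example.invalid/v1/cases/"


def build_case_query(search_term, jurisdiction=None, page_size=10):
    """Build a URL for a hypothetical case-metadata search.

    The parameter names used here are assumptions, not the real CAP API.
    """
    params = {"search": search_term, "page_size": page_size}
    if jurisdiction is not None:
        params["jurisdiction"] = jurisdiction
    # urlencode escapes spaces and special characters in the values.
    return BASE_URL + "?" + urlencode(params)


url = build_case_query("habeas corpus", jurisdiction="ill")
print(url)
```

Anyone trying this against the real service should take the endpoint and parameter names from the official API documentation.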

Jonathan Zittrain, the George Bemis Professor of International Law at Harvard Law School and Vice Dean for Library and Information Resources, said, “Libraries were founded as an engine for the democratization of knowledge, and the digitization of Harvard Law School’s collection of U.S. case law is a tremendous step forward in making legal information open and easily accessible to the public.”

A real-world use of the CAP API and the bulk data service

John Bowers, a research associate at the Harvard Library Innovation Lab, used the Caselaw Access Project API and bulk data service to uncover the story of Justice James H. Cartwright, the most prolific opinion writer on the Illinois Supreme Court, according to Bowers’s recent blog post.

Bowers said, “In the hands of an interested researcher with questions to ask, a few gigabytes of digitized caselaw can speak volumes to the progress of American legal history and its millions of little stories.”
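A question like "who wrote the most opinions?" comes down to counting records per author. The sketch below shows one way this could be done in Python. It is not Bowers's actual method, the record layout (an `author` field on each case) is an assumption, and the sample records are invented placeholders rather than real case data.

```python
from collections import Counter


def most_prolific_authors(cases, top_n=3):
    """Return the top_n (author, opinion_count) pairs.

    Assumes each case is a dict with an optional "author" key;
    records with no author are skipped.
    """
    counts = Counter(
        case["author"] for case in cases if case.get("author")
    )
    return counts.most_common(top_n)


# Invented placeholder records, not real CAP data.
sample_cases = [
    {"id": 1, "author": "Judge A"},
    {"id": 2, "author": "Judge B"},
    {"id": 3, "author": "Judge A"},
    {"id": 4, "author": None},
    {"id": 5, "author": "Judge A"},
]

print(most_prolific_authors(sample_cases, top_n=1))
```

With a real bulk download, the same counting step would run over millions of records, so the author field would likely need normalizing first (spelling variants, initials, titles).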

By digitizing these materials, the Harvard Law School Library aimed to provide open, wide-ranging access to American case law, making its collection broadly accessible to nonprofits, academics, practitioners, researchers, and law students. Anyone with a smartphone or an Internet connection can now access this data.

Read more about the project on the Caselaw Access Project site.
