awesome-legal-data
Collection of Datasets for Legal Text Processing
A curated list of resources dedicated to legal data. The collection contains datasets, tools, and other links related to the legal domain. Most resources are openly available.
United States
- Caselaw Access Project by Harvard Law School
- CourtListener - Search millions of opinions by case name, topic, or citation. 403 jurisdictions. Sponsored by the non-profit Free Law Project.
- H2O Open Case Book
UK
Canada
Australia
- Australasian Legal Information Institute
- Open Australian Legal Corpus: The First Multijurisdictional Open Corpus of Australian Legislative and Judicial Documents
Germany
- OpenJur
- Open Legal Data
- A Dataset of German Legal Documents for Named Entity Recognition (Lynx Project)
- GerDaLIR: A German Dataset for Legal Information Retrieval (Paper)
- gesp: Download all available German court decisions straight from the command line
- German Legal Sentences (GLS): Semantic sentence matching and citation recommendation
Switzerland
Netherlands
Norway
Poland
Czech Republic
Finland
France
EU
- EUR-Lex
- MultiEURLEX - A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
- Mining Legal Arguments in Court Decisions - Data and software for the European Court of Human Rights (ECHR)
Japan
Other datasets
Tools
- Blackstone - A spaCy pipeline and model for NLP on unstructured legal text.
- Pseudo-anonymization of French legal cases
- Scripts to crawl English legal corpora
- LEGAL-BERT: The Muppets straight out of Law School
- Law-OMNI-BERT-Project