The CEPS EurLex dataset: 142.036 EU laws from 1952-2019 with full text and 22 variables
Borret, Camille; Laurer, Moritz
Publication Date:
Dataset Description:
The dataset contains 142,036 EU laws, almost the entire corpus of the EU’s digitally available legal acts passed between 1952 and 2019. It covers the three types of legally binding acts passed by the EU institutions: 102,304 regulations, 4,070 directives and 35,798 decisions, all in English. The dataset was scraped from the official EU legal database and transformed into machine-readable CSV format with the programming languages R and Python. It was collected by the Centre for European Policy Studies (CEPS) for the TRIGGER project. We hope that it will facilitate future quantitative and computational research on the EU.
Name | Description |
Act_amends | CELEX number of the old act amended by the new act (see detail on CELEX below) |
Act_cites | CELEX number of other acts cited by the act |
Act_name | Full name of act |
act_raw_text | The full raw text of the act in one string. Mostly includes: title, recitals, legal articles and annex. Please note that the text of older laws is not always clean. |
Additional_info | Additional information |
Amends_links | Link to the previous act that is amended by the new act |
Authors | Names of the act’s authors |
CELEX | Unique CELEX identifier of the act |
Cites_links | Links to other acts cited by the act |
Date_document | Date of the document. The website does not provide an explanation of which exact date in the legislative process this represents. The dataset ranges from 1952 to August 2019. |
Date_publication | Date the document was published |
ELI_link | European Legislation Identifier (ELI) link to the act. |
EUROVOC | A group of EuroVoc keywords associated with the act. |
Eurlex_link | Link to act on website. |
First_entry_into_force | Date when act first entered into force |
Legal_basis_celex | The CELEX number of the act’s legal basis |
Oeil_link | Link to the European Parliament’s Legislative Observatory (Oeil). Provides procedural information. |
Procedure_number | Number of the legislative procedure leading to the act |
Proposal_link | Link to the Commission proposal preceding the act |
Status | Whether the act was in force at the time of scraping (August 2019): “In Force” or “Not in Force” |
Subject_matter | Group of keywords representing the subject matter of the act. Similar to EUROVOC, only less detailed, more abstract. |
Temporal_status | Date of end of validity of the act |
Treaty | Name of the Treaty the act is based on |
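The variables above can be read with any CSV parser. The following is a minimal sketch in Python, not the authors' own tooling: it uses only the standard library and the column names CELEX, Act_name, Status and act_raw_text from the table. The sample rows, the placeholder CELEX values, the column order and the helper names count_by_status and find_act are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# The full-text column (act_raw_text) can exceed the csv module's
# default field size limit, so raise it before reading.
csv.field_size_limit(2**31 - 1)

# Tiny in-memory stand-in for the real file. The CELEX values, names and
# texts are placeholders, not real records from the dataset.
SAMPLE = (
    "CELEX,Act_name,Status,act_raw_text\n"
    'EXAMPLE-1,"Placeholder regulation",In Force,"Article 1 ..."\n'
    'EXAMPLE-2,"Placeholder directive",Not in Force,"Article 1 ..."\n'
    'EXAMPLE-3,"Placeholder decision",In Force,"Article 1 ..."\n'
)


def count_by_status(csv_file):
    """Count acts per value of the Status column."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Status"] for row in reader)


def find_act(csv_file, celex):
    """Return the first row whose CELEX column equals `celex`, or None."""
    for row in csv.DictReader(csv_file):
        if row["CELEX"] == celex:
            return row
    return None


counts = count_by_status(io.StringIO(SAMPLE))
act = find_act(io.StringIO(SAMPLE), "EXAMPLE-2")

print(counts)
print(act["Act_name"])
```

To run the same functions on the downloaded file, replace io.StringIO(SAMPLE) with a file opened as open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved. The UTF-8 encoding is an assumption. Both helpers stream the file row by row, so the 1.5 GB file is never held in memory at once.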
Harvard Dataverse
1.5 GB
Creative Commons Zero v1.0 Universal