Access OffsetsDB Data

OffsetsDB provides a detailed view of carbon offset credits and projects. You can access the data in various formats or directly through Python using our data package.

Important

By downloading or accessing the OffsetsDB data archives, you agree to the Terms of Data Access.

CSV & Parquet Zipped Files

Download the latest version of OffsetsDB in CSV:

Download Credits & Projects

Download the latest version of OffsetsDB in Parquet:

Download Credits & Projects

Citation

Please cite OffsetsDB as:

CarbonPlan (2024) “OffsetsDB” https://carbonplan.org/research/offsets-db

Accessing The Full Data Archive Through Python

For programmatic access to OffsetsDB, you can use our Python data package, which lets you load and work with the data directly in your Python environment. The package provides the data in several formats, including CSV (for raw data) and Parquet (for processed data).

Installation

To get started, install the offsets-db-data package (imported in Python as offsets_db_data). Make sure Python is installed on your system, then run:

python -m pip install offsets-db-data

Using the Data Catalog

Once installed, you can access the data through an Intake catalog. This catalog provides a high-level interface to the OffsetsDB datasets.

Loading the Catalog

import pandas as pd
from offsets_db_data.data import catalog

# Limit how many columns pandas prints
pd.options.display.max_columns = 5

# Display the catalog
print(catalog)

<Intake catalog: offsets_db_data>

Available Data

The catalog includes several datasets, such as credits and projects.

Getting Descriptive Information About a Dataset

You can get information about a dataset using the describe() method. For example, to get information about the ‘credits’ dataset:

catalog['credits'].describe()

{'name': 'credits',
 'container': 'dataframe',
 'plugin': ['parquet'],
 'driver': ['parquet'],
 'description': 'OffsetsDB processed and transformed data',
 'direct_access': 'forbid',
 'user_parameters': [{'name': 'date',
   'description': 'date of the data to load',
   'type': 'str',
   'default': '2024-02-13'}],
 'metadata': {},
 'args': {'urlpath': 's3://carbonplan-offsets-db/final/{{ date }}/credits-augmented.parquet',
  'storage_options': {'anon': True},
  'engine': 'fastparquet'}}
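
The urlpath in this output is a template: the date user parameter takes the place of {{ date }} and selects one dated snapshot in the bucket. The snippet below is a minimal sketch of that substitution using plain string replacement. The catalog machinery does this for you; the helper name resolve_urlpath is hypothetical and is not part of the package.

```python
import re

# Template copied from the describe() output above.
URLPATH_TEMPLATE = "s3://carbonplan-offsets-db/final/{{ date }}/credits-augmented.parquet"


def resolve_urlpath(template: str, date: str) -> str:
    """Illustrative only: fill the `{{ date }}` placeholder with a date string."""
    return re.sub(r"\{\{\s*date\s*\}\}", date, template)


# Using the default date listed under user_parameters.
default_path = resolve_urlpath(URLPATH_TEMPLATE, "2024-02-13")
print(default_path)
```

This makes it easier to see why a different date value points at a different file.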

Accessing Specific Datasets

You can access individual datasets within the catalog. For example, to access the ‘credits’ dataset:

# Access the 'credits' dataset
credits = catalog['credits']

# Read the data into a pandas DataFrame
credits_df = credits.read()
credits_df.head()

	project_id	quantity	transaction_date	transaction_type	vintage
0	VCS1	12630	2009-03-26 00:00:00+00:00	issuance	2007
1	VCS1	9074	2014-01-21 00:00:00+00:00	issuance	2006
2	VCS10	153460	2009-04-22 00:00:00+00:00	issuance	2006
3	VCS10	368968	2009-04-22 00:00:00+00:00	issuance	2007
4	VCS10	505908	2009-04-22 00:00:00+00:00	issuance	2008
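
Once the credits are in a DataFrame, ordinary pandas operations apply. As a minimal sketch, the snippet below totals issued credits per project. To stay runnable without network access, it uses a small stand-in frame built from the five rows shown above (with transaction_date omitted) instead of the full credits_df. The variable names are illustrative.

```python
import pandas as pd

# Stand-in for credits_df, built from the five rows shown above.
sample = pd.DataFrame(
    {
        "project_id": ["VCS1", "VCS1", "VCS10", "VCS10", "VCS10"],
        "quantity": [12630, 9074, 153460, 368968, 505908],
        "transaction_type": ["issuance"] * 5,
        "vintage": [2007, 2006, 2006, 2007, 2008],
    }
)

# Keep issuance transactions, then sum the quantity for each project.
issued = sample[sample["transaction_type"] == "issuance"]
issued_by_project = issued.groupby("project_id")["quantity"].sum()
print(issued_by_project)
```

The same filter and groupby should work on credits_df itself, assuming it has the columns shown in the preview.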

Similarly, to access the ‘projects’ dataset:

# Access the 'projects' dataset
projects = catalog['projects']

# Read the data into a pandas DataFrame
projects_df = projects.read()
projects_df.head()

	category	country	...	status
0	[energy-efficiency]	Madagascar	...	unknown
1	[energy-efficiency]	Madagascar	...	unknown
2	[energy-efficiency]	Madagascar	...	unknown
3	[energy-efficiency]	Madagascar	...	unknown
4	[energy-efficiency]	Madagascar	...	unknown

5 rows × 15 columns

Calling projects.read() or credits.read() without specifying a date returns the data downloaded and processed on 2024-02-13, the catalog's default date.

To load data for a specific date, you can specify the date as a string in the format YYYY-MM-DD. For example:

projects_df = catalog['projects'](date='2024-02-07').read()
projects_df.head()

	category	country	...	status
0	[forest]	Peru	...	listed
1	[renewable-energy, ghg-management]	China	...	listed
2	[unknown]	Cameroon	...	listed
3	[forest]	Kenya	...	listed
4	[agriculture]	Brazil	...	listed

5 rows × 15 columns

Note

If you specify a date for which the data is not available, the package will raise a PermissionError: Access Denied.
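
Because an unavailable date surfaces as a PermissionError, one way to cope is to catch that error and try an earlier date. The helper below is a sketch of that idea, not part of the package: load_latest_available and its parameters are hypothetical, and stepping back one day at a time is an assumption about how snapshots are spaced.

```python
from datetime import date, timedelta


def load_latest_available(load, start, max_days_back=7):
    """Call load(date_str) for `start`, then earlier days, until one succeeds.

    `load` is any callable that takes a YYYY-MM-DD string and raises
    PermissionError when no data exists for that date.
    Returns a (date_str, data) tuple.
    """
    day = date.fromisoformat(start)
    for _ in range(max_days_back + 1):
        date_str = day.isoformat()
        try:
            return date_str, load(date_str)
        except PermissionError:
            # No data for this date; step back one day and retry.
            day -= timedelta(days=1)
    raise LookupError(f"no data found within {max_days_back} days before {start}")


# Possible usage with the catalog (requires the package and network access):
# from offsets_db_data.data import catalog
# found, projects_df = load_latest_available(
#     lambda d: catalog['projects'](date=d).read(), '2024-02-10'
# )
```

Passing the loader in as a callable keeps the retry logic independent of the catalog, so it can be exercised with a stub.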
