Image collection and supporting data for: An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning

dataset
posted on 16.12.2021, 19:05 authored by Peter Wilf, Scott L Wing, Herbert W. Meyer, Jacob A. Rose, Rohit Saha, Thomas Serre, N. Rubén Cúneo, Michael Donovan, Diane M. Erwin, Maria A. Gandolfo, Erika B. Gonzalez-Akre, Fabiany Herrera, Shusheng Hu, Ari Iglesias, Kirk R. Johnson, Talia S. Karim, Xiaoyu Zou
Here we provide the image dataset and supporting data files for the following primary article. Please refer to the primary article and the supporting data (provided here) for all details.

Wilf P, SL Wing, HW Meyer, J Rose, R Saha, T Serre, NR Cúneo, MP Donovan, DM Erwin, MA Gandolfo, E González-Akre, F Herrera, S Hu, A Iglesias, KR Johnson, TS Karim, X Zou. 2021. An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning. PhytoKeys 187: 93–128, doi:10.3897/phytokeys.187.72350 https://doi.org/10.3897/phytokeys.187.72350

Files are provided here as zip archives, as follows. v1.0 is the dataset that corresponds precisely to the published article and will be preserved here. Any future updates provided here will use new version numbers.

Extant_Leaves_A-E_v1.0.zip
Extant_Leaves_F-O_v1.0.zip
Extant_Leaves_P-Z_v1.0.zip
Families A–E, F–O, and P–Z, respectively, of cleared and x-rayed leaf images.

Florissant_Fossil_v1.0.zip
Fossil-leaf image collection from Florissant Fossil Beds National Monument.

General_Fossil_v1.0.zip
Fossil-leaf image collection from several other sites in North and South America.

General_Fossil_uncropped_v1.0.zip
Reference set of the uncropped image versions for the General Fossil collection, for access to scale bars and other archival information not otherwise available digitally (see main article). Filenames are suffixed with "_uncropped" and may have minor differences in format from the cropped set.
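As a rough illustration of how the "_uncropped" suffix might be used to pair the two General Fossil sets, here is a minimal Python sketch. It assumes the suffix sits at the end of the filename stem (before the extension) and that the "minor differences in format" are limited to letter case and file extension; neither assumption is confirmed by this description. The function names and the filenames in the usage note are hypothetical.

```python
from pathlib import PurePath

SUFFIX = "_uncropped"  # suffix named in the description above


def uncropped_key(filename):
    """Lowercased stem of an uncropped filename with the suffix removed.

    Assumption: the suffix is the last part of the stem, before the extension.
    """
    stem = PurePath(filename).stem
    if stem.lower().endswith(SUFFIX):
        stem = stem[: -len(SUFFIX)]
    return stem.lower()


def pair_cropped_with_uncropped(cropped_names, uncropped_names):
    """Map each cropped filename to its uncropped counterpart, or None.

    Matching ignores letter case and file extension (an assumption).
    """
    by_key = {uncropped_key(name): name for name in uncropped_names}
    return {
        name: by_key.get(PurePath(name).stem.lower())
        for name in cropped_names
    }
```

For example, with made-up names, pair_cropped_with_uncropped(["Fam_sp_001.jpg"], ["Fam_sp_001_uncropped.tif"]) pairs the two files. Any cropped file mapped to None would need a manual check, since the real naming differences may go beyond what this sketch handles.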

supplemental_data_v1.0.zip
Archive containing three files:

Master_inventory_leavesdb_v1.0
Master inventory file listing all extant and fossil specimens.

See details in the main article (esp. table 1) for how to look up additional specimen data, which are easily available on the Web for most of the collections using the catalog numbers listed in this inventory file (also see below). Please note that the catalog numbers listed here may be primary or secondary, as described in the main article (table 1). The "old_Family" field preserves legacy data that can assist in locating physical specimens in the collections, which usually retain their original taxonomic organization (see main text).

The other two files are catalogs of specimen data not otherwise available on the Web (see main article).

General_fossils_catalog_v1.0.csv
Specimen data for the "General fossil" image collection.

Wing_x-ray_catalog_v1.0.csv
Voucher data for the Wing X-Ray image collection.
Technical notes:
Catalog number field in the Master Inventory file = negative number + leaf number as listed in this file.
Example: "Wing_199-001" in the Master Inventory = negative 199, leaf 1 here = Alphonsea arborea (Annonaceae) = primary voucher US 904529.
Some typographical errors in this legacy catalog are left as-is, and identifications are not updated here. Vetted spellings and updated family and order assignments can be found by catalog number (= negative + leaf number) in the Master Inventory file.
This file includes some additional records that did not meet criteria for the image dataset.
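
The catalog-number rule in the technical notes above can be sketched in Python as follows. The pattern ("Wing_", negative number, hyphen, leaf number zero-padded to three digits) is inferred only from the single example "Wing_199-001", so it is an assumption, as is leaving the negative number unpadded. The function names are hypothetical and are not part of the dataset.

```python
import re

# Assumed pattern, inferred from the one example "Wing_199-001".
WING_ID = re.compile(r"^Wing_(\d+)-(\d+)$")


def split_wing_catalog_number(catalog_number):
    """Return (negative, leaf) as integers, or None if the pattern does not match."""
    match = WING_ID.match(catalog_number.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def make_wing_catalog_number(negative, leaf, leaf_width=3):
    """Build a Master Inventory style catalog number from negative and leaf numbers.

    leaf_width=3 is an assumption based on the example; the negative number
    is written without padding, which is also an assumption.
    """
    return f"Wing_{negative}-{leaf:0{leaf_width}d}"
```

Under these assumptions, split_wing_catalog_number("Wing_199-001") gives negative 199 and leaf 1, the pair used to locate the voucher row in this catalog. Entries that return None would need to be checked by hand against the Master Inventory file.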

Funding

Collaborative Research: Origins of Southeast Asian Rainforests from Paleobotany and Machine Learning

Directorate for Geosciences

Collaborative Research: Patagonian Fossil Floras, the Keys to the Origins, Biogeography, Biodiversity, and Survival of the Gondwanan Rainforest Biome

Directorate for Biological Sciences

National Park Service

Research Institution(s)

Pennsylvania State University, Smithsonian Institution, Florissant Fossil Beds National Monument, Brown University, and many others

Contact email

pwilf@psu.edu
