# RGB and VIS/NIR Hyperspectral Imaging Data for 90 Rice Seed Varieties

The dataset contains 90 rice seed species with 96 kernels per species, giving 8,640 rice seed kernels in total. The data were collected in 2017 using the following two imaging systems:

1. Visible/Near-Infrared (VIS/NIR) hyperspectral imaging (HSI) system (~385 nm to ~1000 nm), consisting of a Specim V10E imaging spectrograph and a Hamamatsu ORCA-05G CCD camera.
2. RGB camera: Fujifilm X-M1 with a 35 mm f/2.0 lens, ISO 400.

For each species, 96 kernels have been captured in two imaging bundles with 48 kernels in each bundle. For each imaging bundle, the 48 kernels were carefully positioned on a sheet of white paper and arranged in an `8x6` matrix. This rice seed matrix was then positioned on a translational stage and imaged using the HSI and RGB cameras described above.

The following three files result from a single acquisition:
- `.hdr`: The HSI ENVI header file. More information on the ENVI format can be found in the [Harris Geospatial Solutions](https://www.harrisgeospatial.com/docs/ENVIHeaderFiles.html) documentation.
- `.raw`: The HSI datacube data.
- `.jpg`: The RGB image.
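
Because the `.raw` datacube cannot be interpreted without its `.hdr` companion, a quick consistency check between the two can be useful. The following is a minimal sketch, not part of the dataset tooling: it parses the `key = value` pairs of an ENVI header and computes the size the `.raw` file should have. It assumes the headers carry the standard ENVI fields `samples`, `lines`, `bands` and `data type` (and optionally `header offset`); the function names are illustrative.

```python
import re

# Standard ENVI "data type" codes mapped to bytes per sample.
ENVI_BYTES_PER_SAMPLE = {1: 1, 2: 2, 3: 4, 4: 4, 5: 8, 12: 2, 13: 4, 14: 8, 15: 8}

# A value is either a {...} block (which may span lines) or the rest of the line.
_FIELD_RE = re.compile(
    r"^[ \t]*([^=\n]+?)[ \t]*=[ \t]*(\{.*?\}|[^\n]*)",
    re.S | re.M,
)


def parse_envi_header(text):
    """Return the header's 'key = value' pairs as a dict of strings."""
    return {key.strip().lower(): value.strip() for key, value in _FIELD_RE.findall(text)}


def expected_raw_size(fields):
    """Number of bytes the .raw file should contain according to the header."""
    n_samples = int(fields["samples"]) * int(fields["lines"]) * int(fields["bands"])
    bytes_per_sample = ENVI_BYTES_PER_SAMPLE[int(fields["data type"])]
    # "header offset" is assumed to default to 0 when absent.
    return int(fields.get("header offset", 0)) + n_samples * bytes_per_sample
```

Comparing `expected_raw_size(...)` with `os.path.getsize(...)` of the matching `.raw` file is one way to spot a truncated download before attempting to load the cube.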

The filename convention used is the (short) species name, followed by a dash, followed by the two-digit bundle number (i.e. `01` or `02`), followed by the filename suffix. For instance, the data for the `BC15` rice seed variety are contained in the following 6 files:
- `BC15-01.hdr`
- `BC15-01.raw`
- `BC15-01.jpg`
- `BC15-02.hdr`
- `BC15-02.raw`
- `BC15-02.jpg`
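
The naming convention above can be sketched in a few lines of Python. The helper names `acquisition_files` and `variety_files` are hypothetical and only illustrate how the six filenames for a variety are formed.

```python
SUFFIXES = (".hdr", ".raw", ".jpg")


def acquisition_files(short_name, bundle):
    """Filenames produced by one acquisition (one bundle of 48 kernels)."""
    if bundle not in (1, 2):
        raise ValueError("bundle must be 1 or 2")
    # The bundle number is zero-padded to two digits, e.g. BC15-01.
    stem = f"{short_name}-{bundle:02d}"
    return [stem + suffix for suffix in SUFFIXES]


def variety_files(short_name):
    """All six filenames for a variety (two bundles, three files each)."""
    return [name for bundle in (1, 2) for name in acquisition_files(short_name, bundle)]
```

For example, `variety_files("BC15")` yields the six names listed above, in the same order.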

The data were captured in 9 batches across multiple days. All the data from the same batch are contained in a dedicated folder. For instance, the folder name `Data-VIS-20170111-2-room-light-off` indicates that the data are in the VIS/NIR range and were captured on 11 January 2017, and that this was the second batch of that day, acquired with the room lights off. Two halogen bulbs were used for illumination, positioned to provide balanced lighting across the scene. To ensure stability, the halogen bulbs were switched on and allowed to reach a constant operating temperature before acquisition, and the data were acquired in a dark room to minimise any other sources of illumination variance.
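
A folder name of this form can be split into its parts programmatically. The sketch below assumes every batch folder follows the `Data-<range>-<YYYYMMDD>-<batch>-<lighting>` pattern seen in the folder names quoted in this description; the pattern is inferred from those examples only, and the dictionary keys are illustrative.

```python
import re
from datetime import datetime

# Pattern inferred from names such as Data-VIS-20170111-2-room-light-off.
FOLDER_RE = re.compile(
    r"^Data-(?P<spectral_range>[^-]+)-(?P<date>\d{8})-(?P<batch>\d+)-(?P<lighting>.+)$"
)


def parse_batch_folder(name):
    """Split a batch folder name into spectral range, date, batch number and lighting."""
    match = FOLDER_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected folder name: {name!r}")
    return {
        "spectral_range": match["spectral_range"],
        "date": datetime.strptime(match["date"], "%Y%m%d").date(),
        "batch": int(match["batch"]),
        "lighting": match["lighting"],
    }
```

Names that do not fit the assumed pattern, such as the `chessboard` folder, raise a `ValueError` rather than being silently misparsed.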

For the purposes of calibration, each HSI image contains in the scene a 100% reflective Spectralon tile, which is a highly reflective Lambertian scatterer. For the dark reference, each folder contains an HSI image taken with the lens cap covering the camera. The dark reference can be found in each folder under the filename `black.hdr`/`black.raw`.
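
With a white and a dark reference available, raw intensities are commonly converted to relative reflectance using the flat-field correction `(sample - dark) / (white - dark)`, applied per band. The sketch below shows that arithmetic for a single pixel spectrum in plain Python. It is an illustration of the standard correction, not necessarily the authors' exact procedure, and it assumes the white spectrum has already been extracted from the Spectralon tile region (whose position in the scene is not specified here) and the dark spectrum from the `black` image.

```python
def reflectance(sample, white, dark, eps=1e-9):
    """Per-band flat-field correction for one pixel spectrum.

    sample, white and dark are equal-length sequences of raw intensities,
    one value per band. Bands where white and dark coincide yield NaN.
    """
    if not (len(sample) == len(white) == len(dark)):
        raise ValueError("spectra must have the same number of bands")
    corrected = []
    for s, w, d in zip(sample, white, dark):
        denominator = w - d
        if abs(denominator) > eps:
            corrected.append((s - d) / denominator)
        else:
            corrected.append(float("nan"))
    return corrected
```

For a full datacube the same expression vectorises directly over array types, provided the references are broadcast along the band axis.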

A full index of the data for each species is provided in the `index.csv` file. The file contains the following columns:

- Species Full Name: The full species name (as used in filenames).
- Species Short Name: A shorthand of the species name.
- Bundle Number: The imaging bundle number. Each bundle contains 48 kernels, and every species has 2 bundles.
- Folder: The name of the folder containing the data (as described above where each folder contains a batch of images captured in a single imaging session).
- File Name: The stem of the filename. Note that there are 3 suffixes for each stem (`.hdr`, `.raw`, `.jpg`).
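
The index can be used to resolve the files for any species and bundle. The following is a minimal sketch that assumes `index.csv` is comma-delimited and has a header row whose column names match the list above exactly; the function name and the `root` parameter are hypothetical.

```python
import csv
from pathlib import PurePosixPath


def acquisition_paths(index_file, root="."):
    """Map (species short name, bundle number) to the three file paths.

    index_file is an open text file (or any iterable of lines) holding the
    contents of index.csv; root is the directory containing the batch folders.
    """
    paths = {}
    for row in csv.DictReader(index_file):
        stem = PurePosixPath(root) / row["Folder"].strip() / row["File Name"].strip()
        key = (row["Species Short Name"].strip(), int(row["Bundle Number"]))
        paths[key] = {ext: f"{stem}.{ext}" for ext in ("hdr", "raw", "jpg")}
    return paths
```

A typical call would be `acquisition_paths(open("index.csv", newline=""), root="path/to/dataset")`, after which `paths[("BC15", 1)]["raw"]` gives the location of the first `BC15` datacube.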

The HSI system was used to capture 256 wavelengths in this experiment and the exact wavelengths corresponding to the data provided are included in the file `wavelengths.csv`.
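
To work with a particular wavelength, the band index closest to it has to be looked up among the 256 values. The sketch below assumes the wavelengths have already been read from `wavelengths.csv` into a list of floats in nanometres, sorted in ascending order; the layout of that file is not described here, so loading is left out, and the function name is illustrative.

```python
from bisect import bisect_left


def nearest_band(wavelengths_nm, target_nm):
    """Index of the band whose wavelength is closest to target_nm.

    wavelengths_nm must be sorted in ascending order. Ties go to the
    lower index; targets outside the range clamp to the first or last band.
    """
    if not wavelengths_nm:
        raise ValueError("no wavelengths given")
    i = bisect_left(wavelengths_nm, target_nm)
    if i == 0:
        return 0
    if i == len(wavelengths_nm):
        return len(wavelengths_nm) - 1
    before, after = wavelengths_nm[i - 1], wavelengths_nm[i]
    return i if after - target_nm < target_nm - before else i - 1
```

The returned index can then be used to slice the band axis of a loaded datacube.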

Both camera systems were fixed on a rigid frame for the duration of the experiments. To permit registration between the two cameras, a chessboard pattern was also imaged, and the acquired files are contained in the folder `chessboard`.

**Note:** The bundle `01` for the species `NDC1` was originally acquired during the batch `Data-VIS-20170111-2-room-light-off`. However, the file was corrupted and hence, the acquisition was repeated during the batch `Data-VIS-20170203-1-room-light-off`. As a result, the `NDC1-01` files are in the `Data-VIS-20170203-1-room-light-off` folder.

External deposit with Zenodo
Date made available: 22 Jan 2020
Date of data production: 11 Jan 2017 - 3 Feb 2017
