Abstract
Crop yield prediction using Earth Observation data is challenging because of the diverse data modalities involved and the limited availability of relevant datasets, which are often proprietary or private. Decentralised federated learning has been proposed to address these privacy concerns, because no data labels need to be shared with a third party. However, the performance of federated learning is significantly influenced by the number of clients and the distribution of data among them. This study investigates the impact of aggregation levels on federated learning using a proxy model trained on crop type data derived from Copernicus Sentinel-2 images. The interaction of these aggregation levels with other parameters is simulated to help generalise the results to different situations. The analysis also examines the current and future distributions of crop yield datasets to determine the optimal aggregation levels for effective federated learning. The findings highlight that dataset size directly affects both the learning outcomes and the degree of privacy that can be maintained. Other scenarios, and the implications of these results for a future decentralised federated learning architecture for crop yield prediction, are also discussed.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 10454 |
| Journal | Scientific Reports |
| Volume | 15 |
| Issue number | 1 |
| Publication status | Published - 26 Mar 2025 |
Keywords
- Crop yield prediction
- Decentralised federated learning
- Satellite data
- Remote sensing