The Centre for Synthesis and Analysis of Biodiversity (CESAB) is the main programme of the Foundation for Research on Biodiversity (FRB) and a leading European research organization with an international reputation. It aims to support innovative work on the synthesis and analysis of existing biodiversity data. Its main objectives are to advance knowledge, to foster a culture of collaboration, and to build bridges between scientific disciplines and with stakeholders. Each year, CESAB welcomes a large number of researchers from all continents.
For more information about CESAB: https://www.fondationbiodiversite.fr/en/about-the-foundation/le-cesab/
Keywords: Satellite imagery, Deep-Learning, time series, multi-resolution, classification, missing data, image processing.
In remote sensing, a frequent goal is to produce a map divided into zones, where each zone is labeled with a value. This value can be, for example, the annual consumption level of households, the health of the assets, etc.
The standard approach to producing such a map is to extract features for each area of interest. These features are represented by a set of integer, real, or binary values. They can include the material of the buildings, the material of the roofs, the number of rooms in a house, the type of house, the distances between various points of infrastructure, the urban or rural classification, the annual temperature, the annual precipitation, etc. [Xie et al 2016 - Transfer]. Such information is difficult to obtain because it requires field surveys.
A less expensive solution for mapping is to use high-resolution satellite imagery (the principle of remote sensing is to measure from a distance) and to extract features that can then be used to predict the value of each zone. For example, in 2016, Xie et al. [Xie et al 2016 - Transfer] proposed predicting the level of poverty (i.e. the annual consumption level of households) for 1 km x 1 km areas (see figure above). The main difficulty in mapping from satellite images is that enough labeled images (images plus values for each zone) are needed to apply machine learning algorithms. The publication of Xie et al. is interesting because it requires only a few labeled images, since it relies on a two-step approach:
- First, a CNN (Convolutional Neural Network) is trained to predict the night-time light intensity of a zone from satellite images taken during the day,
- Then this CNN is reused (the notion of "Transfer Learning"), this time to learn to predict poverty (by regression) from daytime satellite images. The underlying assumption is that the level of poverty is inversely correlated with night-time illumination (the more electricity there is, the less poor the area).
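The two-step idea above can be illustrated with a deliberately tiny sketch. It is not the CNN pipeline of Xie et al.: a linear feature stands in for the network, and all data are synthetic. The names `pretrain_on_proxy`, `fit_regressor`, `true_w` and the generated values are assumptions made for illustration only. Step 1 learns a feature on abundant proxy labels (night-time light), and step 2 freezes that feature and fits a regressor on only ten labeled zones.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pretrain_on_proxy(day_feats, night_light, epochs=300, lr=0.1):
    """Step 1: learn a linear feature w.x on the proxy task
    (predict night-time light from daytime descriptors) by gradient descent."""
    w = [0.0] * len(day_feats[0])
    n = len(day_feats)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(day_feats, night_light):
            err = dot(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def fit_regressor(features, targets):
    """Step 2: closed-form 1-D least squares on the frozen feature."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    var = sum((f - mean_f) ** 2 for f in features)
    slope = cov / var
    return slope, mean_t - slope * mean_f


# Synthetic data (assumption): 200 zones with 3 daytime descriptors each.
random.seed(0)
true_w = [0.8, -0.3, 0.5]  # hypothetical link between descriptors and light
day = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
night = [dot(true_w, x) + random.gauss(0, 0.05) for x in day]

# Step 1: many proxy labels are available, so the feature is learned there.
w = pretrain_on_proxy(day, night)

# Step 2: only 10 zones carry a (synthetic) consumption label.
labelled_day = day[:10]
consumption = [2.0 * dot(true_w, x) + 5.0 for x in labelled_day]
slope, intercept = fit_regressor([dot(w, x) for x in labelled_day], consumption)


def predict_consumption(x):
    """Predict consumption for a new zone using the frozen feature."""
    return slope * dot(w, x) + intercept
```

In this toy setting the regressor recovers the synthetic relation from ten labels because the feature learned on the proxy task already carries the relevant signal, which is the intuition behind the transfer step.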
The approach of Xie et al. is particularly elegant since it no longer requires field interventions to obtain features. It also makes it possible to make predictions over entire countries in Africa (see the results in [Jean et al., 2016 - PredictPoverty]). On their test set, the approach of Xie et al. achieves 71% accuracy, which is 3% better than non-transfer approaches that also use Deep-Learning, and only 4% less than the approach based on field surveys.
In a more recent publication, Jean and his collaborators [Jean et al. 2019 - Tile2Vec], still in a weakly supervised spirit, propose learning with a "triplet loss". This consists of learning, from triplets of tiles (two neighboring zones in the satellite image and one distant zone), a feature vector that discriminates the neighboring zone from the distant one. This learning is totally unsupervised. In a second step, a regression is carried out to predict the level of poverty from the feature vector of the input image. The results of this new proposal give a regression whose correlation is better than that obtained by the transfer approach [Xie et al 2016 - Transfer]. However, there is still room for improvement since the correlation after regression is only 70%.
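As an illustration of the triplet idea, here is a minimal sketch of a generic margin-based triplet loss computed on feature vectors that are assumed to be already available. The function names, the Euclidean distance, and the margin value are assumptions for illustration; the exact formulation and training procedure are those described in [Jean et al. 2019 - Tile2Vec].

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, neighbor, distant, margin=1.0):
    """Margin-based triplet loss on three feature vectors.

    The loss is zero once the distant tile is at least `margin` farther
    from the anchor than the neighboring tile is; otherwise it is positive.
    """
    return max(0.0, euclidean(anchor, neighbor) - euclidean(anchor, distant) + margin)


# Well-separated triplet: the neighbor is close and the distant tile is far.
loss_ok = triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0])

# Violating triplet: the "neighbor" is farther away than the "distant" tile.
loss_bad = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.1, 0.0])
```

Minimizing this quantity over many triplets pushes the features of nearby tiles together and those of distant tiles apart, without using any label.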
The methodology is interesting but, as indicated by the authors [Jean et al. 2019 - Tile2Vec], it does not take the temporal aspect sufficiently into account. In addition, noise had been deliberately added by the government agencies to the data on which the experiments were carried out.
The student will therefore study and propose a solution for the case where a set of satellite images is available, taken over a decade, with a variable acquisition frequency, a sampling with "holes", a variable scale (resolution), and a small amount of annotated data.
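To make the notion of a sampling with "holes" concrete, the sketch below shows one common way to represent such a series: observations are placed on a regular yearly grid and a binary mask records which time steps were actually observed. This is only an illustrative assumption, not the solution expected from the internship; the function name `to_regular_grid`, the yearly step, and the example values are hypothetical.

```python
def to_regular_grid(observations, start_year, end_year):
    """Place (year, value) observations on a regular yearly grid.

    Returns the list of years, one value per year (the mean of the
    observations of that year, or 0.0 as a placeholder when there is none),
    and a mask with 1 for observed years and 0 for holes.
    """
    years = list(range(start_year, end_year + 1))
    by_year = {}
    for year, value in observations:
        by_year.setdefault(year, []).append(value)

    values, mask = [], []
    for year in years:
        if year in by_year:
            vals = by_year[year]
            values.append(sum(vals) / len(vals))
            mask.append(1)
        else:
            values.append(0.0)  # placeholder, to be ignored thanks to the mask
            mask.append(0)
    return years, values, mask


# Irregular acquisitions: two images in 2010, none in 2011-2012, one in 2013.
years, values, mask = to_regular_grid(
    [(2010, 1.0), (2010, 3.0), (2013, 5.0)], 2010, 2013
)
```

A model working on such a representation can use the mask to ignore the placeholder values instead of treating them as real measurements.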
First, the student will review the state of the art of segmentation (labeling) approaches for satellite images, paying particular attention to prediction approaches that use sequences of satellite images. In parallel, the student will reproduce the experiments carried out by the team of Jean et al., both to obtain a basis for comparison and to become familiar with the data we have. At the end of the internship, the student will be able to compare the approach of Jean et al. with their own approach based on image sequence prediction.
C/C++ and Python programming, classification, data mining, knowledge of image processing, basic knowledge of Deep Learning, scientific writing in English. No knowledge of remote sensing is required.
Duration: 5-6 months.
Funded by FRB-CESAB on the Belmont project.
The internship will take place at the LIRMM (Campus St Priest) in the ICAR team.
Your curriculum vitae
A letter of interest (1 page)
Your grades from the first year of your Master’s degree (if applicable)
Applications and questions regarding application must be sent no later than 31/08/2019 to Marc Chaumont, Marc.Chaumont@lirmm.fr.
Application deadline: 20th December 2019
Starting date: February 2020
[Jean et al. 2019 - Tile2Vec] Tile2Vec: Unsupervised representation learning for spatially distributed data. N. Jean, S. Wang, A. Samar, G. Azzari, D. Lobell, S. Ermon. AAAI Conference on Artificial Intelligence (AAAI), 2019.
[Jean et al. 2016 - PredictPoverty] Combining satellite imagery and machine learning to predict poverty. N. Jean, M. Burke, M. Xie, W. M. Davis, D. B. Lobell, S. Ermon. Science, 353(6301), 790-794, 2016.
[Xie et al 2016 - Transfer] Transfer learning from deep features for remote sensing and poverty mapping. M. Xie, N. Jean, M. Burke, D. B. Lobell, S. Ermon. AAAI Conference on Artificial Intelligence (AAAI), 2016.