1. Labelling process
Fun fact: According to EIT Digital (2020), data scientists spend 60% of their time cleaning and organizing data, 19% collecting datasets and only 9% on modelling and machine learning. We are experiencing this first-hand in our ongoing cloud masking project, where creating a labeled dataset is the key to success. So far, we have labeled 1940 images of 512×512 pixels, with every pixel assigned to one of four classes: cloud, cloud shadow, semi-transparent cloud or clear land! We agree with Taivo Pungas, who argues in his blog that the process of building a dataset is undeservedly left out of the scope of research activities. Instead of putting so much effort into tuning model parameters and designing creative architectures, smart AI decision-making needs conscious investment in collecting data and understanding the domain. Luckily, we have remote sensing experts at KappaZeta to help with that.
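To give a feel for the scale of per-pixel labelling, here is a minimal sketch of how such a mask could be represented and summarized. The integer class codes, the function name and the toy 4×4 mask are assumptions made for illustration; only the four class names, the tile size and the number of images come from the text above.

```python
from collections import Counter
from enum import IntEnum


class PixelClass(IntEnum):
    # Hypothetical integer codes; only the class names come from the text.
    CLEAR_LAND = 0
    CLOUD = 1
    CLOUD_SHADOW = 2
    SEMI_TRANSPARENT_CLOUD = 3


TILE_SIZE = 512   # each labeled image is 512x512 pixels
NUM_TILES = 1940  # images labeled so far


def class_fractions(mask):
    """Return the share of pixels per class for one mask given as a list of rows."""
    counts = Counter(value for row in mask for value in row)
    total = sum(counts.values())
    return {cls.name: counts[cls.value] / total for cls in PixelClass}


# Toy 4x4 mask standing in for a real 512x512 label image.
toy_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 3],
    [0, 2, 2, 3],
    [0, 0, 0, 0],
]

fractions = class_fractions(toy_mask)
total_labeled_pixels = NUM_TILES * TILE_SIZE * TILE_SIZE

print(fractions)
print(f"Pixels labeled so far: {total_labeled_pixels:,}")
```

With 1940 tiles of 512×512 pixels, this comes to roughly half a billion individually labeled pixels.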
Marharyta Domnich, Machine learning engineer
2. Estimating the best time for harvesting crop
Image: Madis Ajaots
Since August, we have been talking to different stakeholders and gathering user requirements for a new service prototype: Harvesting Time Recommendation for maximum crop Yield (HaTRY). This is another project carried out in collaboration with the European Space Agency (ESA) and local farmers. The project started this summer and will last until the end of next year.
What is the project about, and what should the prototype we will be testing next harvesting season do? From the farmer’s perspective, the value of the service can be phrased like this: “I want to know the time when crops are mature and ready to harvest. The earlier I get this information the better.”
Timing the harvest can be challenging if you need to consider all the variables in this equation: available machinery, workforce, grain protein content, grain moisture content, spatial distribution of the fields and weather, for example. Our prototype, which will be based on Sentinel-1 and Sentinel-2 imagery time series and weather data, will not solve this equation fully, but aims to provide another source of information for informed decisions.
We have already collected a reasonable amount of reference data for the neural network prediction model and will start testing the architecture and performance in the coming months. Next summer, we want to set up a near real-time prototype to test its usability, limits and value with the farmers. We focus on the Northern Europe region and the three most commonly grown crop types: winter wheat, spring barley and winter rapeseed.
We will keep you informed about the process and share the results in future newsletters and blog posts.
Mihkel Järveoja, HaTRY project manager
3. Vision of our Sentinel-1 API services
The Copernicus programme and the Sentinel satellites are a great European success story, a view widely shared by the Earth Observation (EO) community. Being inside the community, our job is to recognize the people around us who would potentially benefit from Copernicus data but who do not know such data exists or have never tried to use it.
Only a fraction of the potential Copernicus users take full advantage of the programme. For the large majority, it is still a data source mostly used by scientists and specialized companies to satisfy their own needs. We think it does not have to be this way. The key question here is how to provide easy access to pre-processed data. Take Sentinel-1, for example. The terabytes of carefully processed GRD and SLC files produced by ESA are great, but for the large majority of users they are too heavy and too complex to use. The learning curve to see the first benefits is too steep. Just to see the latest image of your house, field or forest, you should not need to download a 1-4 GB file, unzip it and open it with dedicated EO or GIS desktop software. Everyone should be able to browse the imagery with ease, with a single click in a browser.
During the coming months we are planning to develop and release a series of Sentinel-1 value-added services including:
- S1 parcel-based time series
- S1 coherence and backscatter raster-services
- S1 simple visual WMS/WCS
- Synthetic Sentinel-2-like image from S1 data
Image: Sentinel-1 IW mode dual-polarimetric backscatter image without (left) and with (right) enhanced, detail-preserving noise removal
For all the derivatives, the key words will be professional calibration, noise removal and maximum possible spatial resolution (see an example image above). The integration of the data layers and information products will be as simple as possible, keeping in mind both visual human interpretation and machine-readable API services on both web and desktop platforms.
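To illustrate what machine-readable access to a simple visual WMS layer could look like, here is a minimal sketch that builds a standard OGC WMS 1.3.0 GetMap request URL. The endpoint URL, the layer name and the bounding box are hypothetical placeholders, not the actual service; only the request parameters follow the WMS standard.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name, used purely for illustration.
BASE_URL = "https://example.com/wms"
LAYER = "s1_backscatter"


def build_getmap_url(bbox, width=512, height=512, base_url=BASE_URL, layer=LAYER):
    """Build a WMS 1.3.0 GetMap URL for a (min_lon, min_lat, max_lon, max_lat) box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",      # empty value requests the default style
        "CRS": "CRS:84",   # WGS 84 with longitude/latitude axis order
        "BBOX": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Example: an arbitrary small area given in longitude/latitude.
url = build_getmap_url((26.6, 58.3, 26.8, 58.4))
print(url)
```

Opening such a URL in a browser, or pointing a GIS client at the same endpoint, would return a ready-to-view PNG instead of a multi-gigabyte product file.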
Let’s make Sentinel-1 data conveniently accessible for everyone! More information will follow in our newsletter and blog over the coming months as we progress with the development.
Kaupo Voormansik, CEO