We would like to keep you informed about our progress. For this reason, we invite you to subscribe to our newsletter. We will not burden you with unnecessary information. Only the interesting stuff. The previous newsletters can be read below the subscription form.



Previous newsletters:

 KappaZeta Newsletter, July 2022
 KappaZeta Newsletter, June 2022
 KappaZeta Newsletter, May 2022
 KappaZeta Newsletter, April 2022
 KappaZeta Newsletter, March 2022
 KappaZeta Newsletter, February 2022
 KappaZeta Newsletter, January 2022
 KappaZeta Newsletter, December 2021
 KappaZeta Newsletter, November 2021
 KappaZeta Newsletter, October 2021  
 KappaZeta Newsletter, September 2021
 KappaZeta Newsletter, August 2021
 KappaZeta Newsletter, July 2021
 KappaZeta Newsletter, June 2021
 KappaZeta Newsletter, May 2021
 KappaZeta Newsletter, April 2021
 KappaZeta Newsletter, March 2021
 KappaZeta Newsletter, February 2021
KappaZeta Newsletter, December 2020

Newsletter, July 2022

1. ESA Latvia Workshop 2022

The 12th ESA Training Course on Earth Observation (EO) took place at Riga Technical University from June 27 to July 1. The course targeted EO researchers and young professionals from in and around the Baltic area. We participated (remotely and in person) in theory and practical sessions covering (interferometric) SAR and applications of satellite data in forestry, agriculture and marine observation.

Figure 1. ESA Workshop in Latvia.

It was heart-warming to see our Mowing and Grazing projects featured in different presentations, and exciting to learn more about the implementation (and challenges) of agricultural projects similar to our current portfolio, directly from ESA experts. The sessions were a blend of interactive theory discussions and hands-on processing and analysis of radar and optical satellite data with old and new GIS tools. The new tools (including ForestryTEP and OpenEO Cloud) mostly focus on aggregating satellite data sources and easing access to, and processing of, this data.

Special thanks to ESA, the Ministry of Education and Science (Latvia), the Institute of Environmental Solutions and Riga Technical University for putting this workshop together. Details on the course outline and the materials used are available here.
 Hudson Taylor Lekunze, data analyst
2. Space Festival in Tallinn
On the 12th of July, KappaZeta took part in the Space Festival "100 km from space" event in Tallinn. Our EO Analyst Jelizaveta Vabištševitš and Geospatial Data Quality Specialist Olga Wold, together with KappaZeta intern Abdullah Toqeer, talked about the importance of satellites and the Copernicus programme in the agricultural sector.

Figure 2. KappaZeta at the Space Festival.

We had many great meetings with bright young space enthusiasts, who learned a lot about the space sector in Estonia and how open-source satellite data can be used. Many Estonian companies working in the field of space presented their challenging work and impressive research.

Figure 3. KappaZeta at the Space Festival.

Also, the main guest of the festival was the European Space Agency astronaut Matthias Maurer! He spent 177 days at the International Space Station and returned from space in May this year. His inspiring presentation was about the beauty of our planet and the importance of taking care of it, as it is our only home (for now).

Olga Wold, geospatial data quality specialist

Newsletter, June 2022

1. Webinar: “KappaOne – Sentinel-1 data layers for the subsidy checks under CAP”

Sentinel-1 is a radar satellite that works both day and night and can see through clouds, which makes it an excellent source of data for monitoring changes in agricultural land. The KappaOne service is designed to make Sentinel-1 SAR data easily accessible and ready to be used and analysed. It is extremely helpful for the subsidy checks under the Common Agricultural Policy (CAP). For example, it is valuable for planning field visits to the most problematic areas and for getting an initial overview of a field before leaving the office. It is also useful for manually checking parcels where the machine learning model gives an uncertain or borderline result. Sentinel-1 images are also well suited for assessing mowing, harvesting, ploughing and other markers.
Join the KappaOne webinar to learn more about the service and its use cases for the subsidy check under CAP. The webinar will be held on the 30th of June from 15:00 until 16:00 EEST. You can register here: https://forms.gle/dzYjorhjrPNxRYym7 
 Jelizaveta Vabištševitš, EO analyst
2. KappaMask in cloud masks comparison
KappaZeta is taking part in the Cloud Mask Intercomparison Exercise (CMIX) meeting at ESA/ESRIN on 20-21 June. CMIX is a framework for comparing alternative cloud masks for Sentinel-2, Landsat-8/-9 and other widely used optical missions. Cloud mask processor developers and Earth Observation experts discuss and (hopefully) agree on a common way to label the reference data and measure the accuracy of cloud masks. Given the hard work our team has done, I’m sure KappaMask will stand out here as one of the most accurate cloud masks for Sentinel-2.

Kaupo Voormansik, CEO, SAR expert

3. NISAR is coming
There are several new and interesting upcoming SAR missions, including Copernicus ROSE-L, Sentinel-1 NG, ALOS-4 and NISAR. Perhaps the most interesting in the near future is NISAR, jointly developed by NASA and ISRO (Indian Space Research Organisation). The satellite is almost ready, with the testing and qualification procedure yet to be concluded before the launch, which is expected in early 2024. NISAR is a very interesting mission for three reasons. Firstly, it is a dual-band mission acquiring SAR data simultaneously in S- and L-band. Secondly, it will be a Sentinel-1-like data factory imaging the whole planet several times per week. Thirdly, its data policy is planned to be free and open.

NISAR, image by NASA.

The last two points are very important from an operational service development point of view. KappaZeta will keep an eye on NISAR developments and help the world make the most of this beautiful up-and-coming SAR data factory. If everything goes well, people will be able to access and use NISAR data in a simple way via our KappaOne service starting from 2024.
Kaupo Voormansik, CEO, SAR expert

Newsletter, May 2022

1. KappaZeta in the Living Planet Symposium
It is the 23rd of May 2022 and we are glad to share that we are present at the European Space Agency Living Planet Symposium in Bonn!

On Friday the 27th of May, our CEO Kaupo Voormansik will give an oral presentation on “High quality food for AI - Sentinel-1 analysis-ready data (ARD) with interferometric coherence”. The presentation will focus on the freshly launched KappaOne service. KappaZeta’s SAR expertise helps to take full advantage of the interferometric and polarimetric data content of Sentinel-1. For the end user, this means calibrated and noise-corrected imagery products with the highest possible spatial resolution, produced using advanced speckle suppression methods.

On Tuesday our EO analyst Jelizaveta Vabištševitš will give a session on “KappaOne fresh Sentinel-1 data layers’ use case examples for the subsidy checks in Common Agricultural Policy”. She will introduce KappaOne and its use cases for the CAP subsidy checks through hands-on experience!

On Friday there will also be a poster presentation on “AI-based Cloud Mask Processor for Sentinel-2” by Olga Wold, our Geospatial Data Quality Specialist. This presentation will describe the KappaMask model details and its progress on going global!
See you at the Living Planet Symposium 2022!
Olga Wold, geospatial data quality specialist
2. Crop Damage Detection
At KappaZeta we are, among other things, investigating the possibility of using remotely sensed time-series data to detect crop damage on agricultural parcels.

The initial stage of the research was carried out in cooperation with local farmers, who provided information about the general condition of agricultural parcels and granted access to the croplands for field surveying. During the in-person visits that took place early in July 2021, a visual assessment of various winter and spring vegetation types was performed and georeferenced images were taken across the fields. Extra attention was paid to the differences in crop condition between damaged and healthy areas.

Figure 1. Damaged and healthy areas of winter wheat.

The results of the preliminary analysis on a test subset of agricultural parcels vary. The number of parcels in the current stage of analysis was quite limited and it caused several issues. Nevertheless, using simple unsupervised classification techniques it is possible to obtain a spatial representation of the damaged areas through the analysis of specific features of winter rapeseed at particular stages of its development.

At the next stage of the analysis, the work will be continued in the scope of supervised classification techniques combined with individual approaches to various agricultural crop types.

Anton Kostiukhin, software developer

Newsletter, April 2022

1. Advanced speckle filtering
Our raster processing got a major upgrade that significantly improves the quality of our backscatter and coherence rasters. We analyzed, modified and combined multiple published methods when designing a new filter for KappaOne. Although the filter adds significant load to the computations, the output raster will become much sharper with it as the edge structures are preserved much better.

To demonstrate the impact of the new filter on the backscatter images, we chose an area that contains a comb-shaped parcel (see the bottom right corner of Figure 1). The narrow “teeth” of the comb are about 20 meters wide. Although the “teeth” were also visible previously, they are drastically sharper in the KappaOne output.

Figure 1: Backscattering in VV (red) and VH (green) polarization previously (left) and with KappaOne (right) speckle filter.

The newly created KappaOne filter improves the quality of coherence images as well. Figure 2 shows the same area as before. The comb-like structure is no longer visible, because coherence needs more averaging to produce meaningful output. The previous coherence image contains large high-coherence areas, which are caused by objects with extremely high coherence. One of those areas is marked by an ellipse in Figure 2. Our novel filter significantly shrinks the area where those coherence values dominate. As a result, we can produce an image with much finer detail.
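
To see why coherence needs averaging, here is a minimal sketch of the textbook sample coherence estimator applied to a small window of complex pixels from two co-registered SAR acquisitions. It is an illustration only, not the KappaOne filter; the function name and the toy values are assumptions.

```python
import math

def coherence(primary, secondary):
    """Sample coherence magnitude over a window of complex pixel values."""
    # Cross-correlation of the two acquisitions, summed over the window.
    cross = sum(a * b.conjugate() for a, b in zip(primary, secondary))
    power_a = sum(abs(a) ** 2 for a in primary)
    power_b = sum(abs(b) ** 2 for b in secondary)
    if power_a == 0 or power_b == 0:
        return 0.0
    return abs(cross) / math.sqrt(power_a * power_b)

# A one-pixel "window" always yields a coherence of 1, whatever the data.
single_pixel = coherence([1 + 2j], [3 - 1j])

# With several pixels, unrelated phases cancel out and coherence drops.
unrelated = coherence([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j])
```

The estimate only becomes informative once the window holds several pixels, which is why a coherence image is necessarily smoother than a backscatter image.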

Figure 2: Previously computed coherence image in VV polarization (left) and the one with KappaOne speckle filter (right).
Mihkel Veske, software developer
2. Towards production: Synthetic NDVI (SNDVI) rasters
The ultimate test of any machine learning model waits in production. Hence, in the last month, we have focused on operationalizing the NDVI model as a layer of a new Analysis-Ready Data (ARD) product – KappaOne. In the ARD Germany demo alone, we prepared 35 rasters (each spanning 2500 km²) with a newly implemented semi-automated pipeline that takes collocated Sentinel-1 and -2 products through preprocessing, inference and post-processing tasks to create NDVI or SNDVI rasters. 31 of these rasters in Northern Germany were entirely or partially created by our generative model due to cloudy Sentinel-2 images. For larger demo areas such as these, the production objective is to have a new NDVI or SNDVI image within 24 hours of newly acquired data from Sentinel-2.

Figure 3: Rasters created for Germany ARD demo (NDVI – green circles, SNDVI – white circles)

However, we are tackling new challenges at this stage, foremost of which is validating the SNDVI rasters. Validation involves evaluating the quality of images generated by our model, and the accuracy of parcel-level SNDVI compared with available NDVI. We have made significant progress on the former, as reported in previous issues. By the next issue on SNDVI, we plan to have a comparable SNDVI time series (in addition to the partial NDVI time series) in the demo map to validate newly created rasters. Most importantly, we want to learn more about the model as we scale and validate on new data and unseen areas, to improve the accuracy and reliability of predictions.
Hudson Taylor Lekunze, data analyst

Newsletter, March 2022

1. KappaMask for Sentinel-2
Phase II of the KappaMask development has been going at full power for a few months already! We have started introducing the model to new areas such as deserts in Africa and Australia, the tropics in South America and even Arctic regions. Previously KappaMask was trained only for Northern terrestrial summer conditions, but now we are also preparing ground truth data for all seasons. Thanks to the hard and precise work of our labellers, the first results of the new KappaMask model on previously unseen areas are out, and you can examine them in the figures below.

Figure 1. KappaMask predictions in water conditions. Yellow – clouds, blue – semi-transparent clouds.
Previously, KappaMask misclassified water as cloud shadows or semi-transparent clouds, or even as invalid pixels. Now the model predictions look reasonable, although errors still occur.

Figure 2 presents KappaMask predictions in desert areas. You can see that most of the clouds and shadows are predicted correctly. However, some terrain conditions are misclassified as cloud shadows, and some semi-transparent clouds are missed. Therefore, there is still a lot of work to be done!

Figure 2. KappaMask predictions in desert areas. Yellow – clouds, green – cloud shadows, blue – semi-transparent clouds.

Olga Wold, geospatial data quality specialist
2. Success at Kemira and Valtra Hackathons
KappaZeta participated in the Kemira and Valtra hackathons in Finland on 9-10 and 22-23 March. We teamed up with Lauri Karp and the partnership produced brilliant results. Our joint team won the Kemira Hackathon with a concept for monitoring the end-to-end supply chains of next-generation bio-based renewable materials with radar satellite data.

A few weeks later the same team received an honorable mention in the Valtra Hackathon with a concept for automatic log generation of farming events (ploughing, sowing, mowing, harvesting etc.) for non-smart tractors.

More info at:

Kaupo Voormansik, CEO, SAR expert

Newsletter, February 2022

1. Grazing detection based on Copernicus data
Grazing detection from Copernicus data for agricultural subsidy checks is one of our recently finished projects, carried out in collaboration with the European Space Agency (ESA) and with our partner, Czech EO company Gisat s.r.o. We developed a grazing detection methodology based on Sentinel-1 and Sentinel-2 imagery time series. This work is important for the reform of the Common Agricultural Policy, which will replace on-the-spot grazing checks with satellite monitoring. As about 20% of grasslands in Europe are being grazed, our detection methodology will help to complete the grassland maintenance checks alongside the mowing detection.
Fieldwork was a significant part of the ground reference data collection required for developing the grazing detection methodology. We collected data on grazing and mowing events and grass height weekly during summer 2021 from several grasslands in central Estonia. A summary of the data for two grasslands, together with S2 NDVI, NDWI and PSRI indices and the calculated grazing intensity, is visualized in the figure below.

Figure 1. Time series of two grasslands in central Estonia for the May - September period.

On these grasslands, cows moved from one field to another throughout the summer, staying on each grassland for up to 5 weeks. In Figure 2 it is visible that grassland V2 was separated by the farmer into two parts. The larger part was mowed on the 30th of June for silage, which is visible from the NDVI, NDWI, and grass height drop on the time series. Animals were located on the smaller part of the V2 before and after the mowing event. Due to the mowing event in between, it is quite hard to distinguish changes caused by grazing activity, even though the grazing intensity is high (4.2 LU/ha).

Figure 2. Sentinel-2 NDVI raster time-series. Base map: Estonian Land Board

Changes in biomass on the grasslands are mostly visible on Sentinel-2 NDVI rasters, as increases or decreases in NDVI values are more distinguishable by the human eye (Figure 2). In comparison, the visual analysis of Sentinel-1 raster parameters is more complex. The changes in VV or VH coherence values are difficult to detect if the exact location of the animals is unknown (Figure 3). However, a machine learning approach can be implemented in future work: although grazing is not distinguishable by the human eye on Sentinel-1 images, some of its signatures may be detectable by an AI model.

Figure 3. Sentinel-1 coherence VV raster time-series. Base map: Estonian Land Board

Jelizaveta Vabištševitš, EO analyst, project manager
2. KappaZeta is going to Kemira and Valtra hackathons in Finland
KappaZeta and CarbonEye Global will deliver a new breed of innovative solutions that fuse the carbon, space and agricultural markets at the Kemira and Valtra hackathons in Finland in March 2022. The hackathons are organized by Jyväskylä University (JAMK) in cooperation with global segment leader Kemira Oyj (chemicals company) and Valtra Inc. (smart tractors company). More here. For further enquiries please contact jurgen.lina@kappazeta.ee and lauri.karp@carboneye.ai.

Figure 4. Biomass View in KappaZeta ARD Demo.

Read more about the hackathons:

Kaupo Voormansik, CEO, SAR expert

Newsletter, January 2022

1. KappaOne
We are happy to announce KappaOne, a WMS service for selected Sentinel-1 based raster images and an API for parcel statistics. The WMS service lets you either look at your area of interest in a web browser, as in many current mapping solutions, or use the WMS directly in an analytics program such as QGIS.
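
For readers new to WMS, the sketch below shows how a standard WMS 1.3.0 GetMap request is assembled. The parameter names come from the OGC WMS standard. The base URL, layer name and bounding box are hypothetical placeholders, not the actual KappaOne endpoint or layers.

```python
from urllib.parse import urlencode

def build_getmap_url(base_url, layer, bbox, width, height, crs="EPSG:3857"):
    """Assemble a WMS 1.3.0 GetMap URL for one layer and one bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint, layer name and bounding box, for illustration only.
url = build_getmap_url(
    "https://example.com/wms",
    "hypothetical_vh_backscatter",
    (1110000, 6700000, 1120000, 6710000),
    512,
    512,
)
```

Desktop tools such as QGIS build these requests for you once the service URL is added as a WMS connection, so hand-written URLs are mainly useful for scripting and debugging.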

Figure 1. The logo of KappaOne.

Currently, one can see backscatter in VH, a VV coherence image at a 6/12-day interval, and an RGB composite of the previous two plus backscatter in VV. In addition, we provide a custom product – synthetic NDVI, in which an AI algorithm derives the current NDVI from a Sentinel-1 image, using the history of Sentinel-1 and Sentinel-2 data. This way, an NDVI can be given for dates when clouds prevent optical satellites from getting useful data.

Figure 2. KappaOne ARD demo.

For parcels, by which we mean continuous agricultural areas larger than 0.5 hectares, we provide statistical information, which can be viewed in the web map or accessed directly via an API. The statistics use the same information that is available in the rasters, but aggregated over the parcel area. This data, represented as time series, allows detection of biomass changes and farming events like haying, harvesting and ploughing, as well as natural events of significant influence, like flooding or hail damage.
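
The idea of aggregating raster values over a parcel can be sketched as follows. This is a toy illustration rather than the KappaOne implementation: rasters are small nested lists, the parcel is a boolean mask of the same shape, and the name `parcel_mean` is hypothetical.

```python
def parcel_mean(raster, mask):
    """Average the raster values of the pixels that fall inside the parcel."""
    values = [
        value
        for raster_row, mask_row in zip(raster, mask)
        for value, inside in zip(raster_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("parcel mask selects no pixels")
    return sum(values) / len(values)

# One toy raster per acquisition date (values could be NDVI or coherence).
rasters_by_date = {
    "2021-06-01": [[0.8, 0.7], [0.2, 0.6]],
    "2021-06-13": [[0.3, 0.2], [0.2, 0.1]],
}
parcel_mask = [[True, True], [False, True]]

# Parcel-level time series: one aggregated value per date.
time_series = {
    date: parcel_mean(raster, parcel_mask)
    for date, raster in rasters_by_date.items()
}
```

In this toy series the parcel mean falls from about 0.7 to about 0.2 between the two dates, the kind of abrupt change that a mowing or harvesting event would produce in a vegetation index.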

Figure 3. KappaOne ARD demo - parcel view.

We provide two examples for the service in our initial demo, one for Germany https://demodev2.kappazeta.ee/ard_demo/ and one for Estonia https://demodev2.kappazeta.ee/ard_ee/. We encourage you to test the services and let us know what you think.
The KappaOne service development is funded by the InCubed program of ESA ɸ-lab.
Andres Luhamaa, KappaOne product owner
2. sNDVI rasters now available on our Demo Map
Since the last issue, we have prepared NDVI images for a 50×50 km demo area in Northern Germany spanning 5 months (May to September). We replace cloudy image tiles with inferred synthetic NDVI (sNDVI) images, using historical NDVI and Synthetic Aperture Radar (SAR) data from the same area as inputs for the model. A sub-section of this demo area is pictured below (Figure 4), with the color scheme (ranging from brown through yellow to green) showing increasingly healthy vegetation, as estimated by NDVI.
Figure 4. Synthetic NDVI map of a subarea in the Germany demo. Brown through yellow to green shows increasingly healthy vegetation. Image source: KappaZeta Germany ARD Layers (September 29, 2021)
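
The tile replacement logic described above can be sketched in a few lines. The NDVI formula is the standard one. The tile ids, cloud flags and function names are assumptions made for illustration, and the synthetic tiles are simply given here rather than produced by a model.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def fill_cloudy_tiles(optical_ndvi, synthetic_ndvi, is_cloudy):
    """Keep cloud-free optical NDVI tiles and swap in sNDVI for cloudy ones."""
    merged = {}
    for tile_id, optical_tile in optical_ndvi.items():
        if is_cloudy[tile_id]:
            merged[tile_id] = {"source": "sNDVI", "data": synthetic_ndvi[tile_id]}
        else:
            merged[tile_id] = {"source": "NDVI", "data": optical_tile}
    return merged

# Toy inputs: tile_b is cloudy, so its optical values are not trustworthy.
optical = {"tile_a": [[ndvi(0.05, 0.45), 0.7]], "tile_b": [[0.1, 0.1]]}
synthetic = {"tile_a": [[0.75, 0.72]], "tile_b": [[0.6, 0.55]]}
cloud_flags = {"tile_a": False, "tile_b": True}

demo = fill_cloudy_tiles(optical, synthetic, cloud_flags)
```

Keeping the `source` label next to each tile makes it possible to tell measured NDVI from synthetic NDVI later, for example when validating the synthetic images.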

This demo is available in KappaZeta web map environment and includes other SAR data - backscatter and coherence - over the same area.

Interacting with the red-outlined parcels reveals rich temporal information about the selected fields, shown as averaged parcel statistics over the entire demo period (April – December 2021), which can be used to assess the accuracy of the synthetic NDVI images. For now, not all areas have NDVI images for the entire period, since historical inputs are required for image synthesis and some areas did not have sufficient cloud-free historical images for prediction. The months with dense NDVI coverage are May, June and September.

An evaluation of model accuracy over some test images (212 images sized at 512px) from this period shows that synthetic images have a mean structural similarity (or reconstruction accuracy) of 82% and an absolute pixel error of 6.86%. The next challenge is making all synthetic images available, even for areas with insufficient historical data.
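
For intuition about these two metrics, here is a simplified sketch. It computes the structural similarity once over the whole image (the usual mean SSIM averages it over local windows) and expresses the mean absolute pixel error as a percentage of the value range. Both function names, the value range of 1.0 and the toy pixel values are assumptions, not our evaluation code.

```python
def _mean(values):
    return sum(values) / len(values)

def global_ssim(x, y, data_range=1.0):
    """Structural similarity computed over the whole image as one window."""
    c1 = (0.01 * data_range) ** 2  # stabilising constants from the SSIM definition
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = _mean(x), _mean(y)
    var_x = _mean([(a - mu_x) ** 2 for a in x])
    var_y = _mean([(b - mu_y) ** 2 for b in y])
    cov = _mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def mean_abs_error_percent(x, y, data_range=1.0):
    """Mean absolute pixel difference as a percentage of the value range."""
    return 100.0 * _mean([abs(a - b) for a, b in zip(x, y)]) / data_range

# Flattened toy images: a reference NDVI tile and a synthetic one.
reference = [0.0, 0.5, 1.0]
synthetic = [0.1, 0.5, 0.9]

ssim_identical = global_ssim(reference, reference)
ssim_synthetic = global_ssim(reference, synthetic)
error_percent = mean_abs_error_percent(reference, synthetic)
```

Identical images give a similarity of 1, and any structural difference pushes the value below 1, while the error percentage measures how far individual pixel values are off on average.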
Hudson Taylor Lekunze, Data analyst

Newsletter, December 2021

1. Towards Generalizing NDVI image Synthesis
We promised to discuss how well our NDVI model works for other non-Estonian areas in this issue, and it is no trivial issue (pun intended). The images below illustrate the structural differences in agricultural areas for disparate regions considered. (Grayscale image intensities represent NDVI values, with dark to light corresponding with low to high values respectively).

Figure 1: Differences in field layout: (a) Northeastern Estonia (b) North Germany (c) Central Poland

For each sub-tile, the difficulty in modeling (NDVI and structural changes in fields) can be conceptualized by the number of contiguous fields and layout variations per 512px image tile. It also includes other key indicators such as differences in NDVI crop signatures, and the time difference between historical inputs and target outputs. In a recent demo for Northeastern Germany, we used an ensemble model trained over Baltic areas and Germany (due to field similarities) to synthesize the NDVI images illustrated in Figure 2. Evaluations over a test area (consisting of 360 512px images each representing 5242.88 km² land patches) leave our German model at a 68% averaged structural similarity to target images.

Figure 2: Recent Synthetic NDVI images from select fields in Northeastern Germany

We still have some way to go. Possible improvements include active learning, training with smaller image tiles (e.g. 256px), and other data and model architecture changes. We look forward to sharing progress in subsequent releases, including how the results look in a demo web map!
Hudson Taylor Lekunze, Data analyst
2. 2021 was a good development and growing year for KappaZeta
Perhaps the most visible and ready-to-use product coming from us this year is the KappaMask Sentinel-2 cloud mask. It is useful for a wide variety of users from the EO and non-EO sectors. Everyone who needs to use Sentinel-2 data in large volumes and wants to separate cloud-free areas from cloudy ones will benefit from KappaMask.

Though still in development, it already seems to be convincingly the most accurate free and open Sentinel-2 cloud mask in our first focus area – Northern Europe. Read more about it in our recently published research paper: https://www.mdpi.com/2072-4292/13/20/4100
Try it out: https://github.com/kappazeta/cm_predict
Or re-use our reference dataset: https://zenodo.org/record/5095024#.YP8HpY4zaUk

First trials by users confirm that it even works outside the focus area – in Southern Europe. However, with some land cover types, especially water bodies, it sometimes gives strange results. This is most likely due to the limitations of the reference set on which our model is fitted. We will address it in the 2nd phase of the project.

The 2nd phase of the project has already started. The goal is very simple – to make KappaMask the most accurate free and open Sentinel-2 cloud mask globally in all seasons, not just for Northern European terrestrial summer conditions. We are extending the reference set (manual labelling with a clever active learning approach, plus re-using free and open data sets by other teams) and improving the model. The next version of KappaMask will be released in summer 2022 – stay tuned.

In 2021 we also extended our Sentinel-1 data pre-processing services portfolio. Besides the parcel-level aggregated feature set time series, you can now also get raster output for our coherence and backscatter data layers. The biggest changes and major new product launches of our Sentinel-1 analysis-ready data (ARD) layers for business-to-business (B2B) and business-to-government (B2G) are still ahead, coming in 2022 and 2023. To make this happen, we were lucky to receive investment from the ESA InCubed program (https://incubed.phi.esa.int/), but we are even more proud of the pre-purchase deals we signed for close to 100 kEUR. This proves that users see the value of, and need, our easy-to-use, well-calibrated and noise-removed Sentinel-1 ARD layers.

2021 has also been a good year in terms of growing interest from investors. So far, we haven’t taken any investor money on board, but who knows, maybe in the coming years we will join someone with whom we have a great match, shared vision and ideals. The increased interest of investors reflects the fact that we are developing, growing and doing relevant work for the world. Here we must give credit to the mentorship programs we took part in during 2020-2021. The Design Masterclass of Enterprise Estonia gave us an invaluable toolbox for fast prototyping, product-market-fit searching, user interest mapping and validation. Let’s now put it to daily use in KappaZeta, whatever new product or service we are about to create. Kristjan Raude’s mentoring hours within the Põhjanael program were equally valuable for marketing and sales work. In my opinion it was pure gold; we just need to apply it in practice. Last but not least, this autumn marks the end of our incubation and mentorship support in the Superangel program. It is difficult to overestimate the value of the very practical advice and contacts we received from Kalev Kaarna & co. Superangel team – you really do work with a mission that goes beyond just making money.

This year also marks the end of two important ESA projects – grazing detection and HaTRY. In the grazing detection project, we worked on developing a grazing detection methodology using satellite monitoring. In HaTRY, the goal was to predict the optimal time for harvesting winter wheat, winter rapeseed and spring barley, considering ripeness and moisture content. Both projects served as good preparatory work for future developments towards tried and tested operational services.

Regarding subsidy check services for the Estonian Government (PRIA https://www.pria.ee/), 2021 was a challenging year. Crop classification model results on the new 2021 time series were much worse than on the RITA project test sets. This underlines a fundamental problem – our crop classification model does not generalize well and is too dependent on the specific behavior of the Sentinel-1 and -2 feature set time series from the training set seasons. This drawback does not render all the work we did in the RITA methodology development project (2019-2020) useless. The input features of the S1 and S2 parameter time series still carry the information about which crop corresponds to a given input. The input data as prepared by us is very rich and fine; we only need to work on our model and methodology to make them more robust and to generalize better when moving from one season to another.

With mowing detection, it seems that we once again underestimated the variety that our weather can bring. With experience and training data from 2017, 2018, 2019 and 2020 in our pocket, it seemed that we had largely solved the mowing detection task, at least for Estonian conditions. But the tropical weather and drought conditions in June-July this year brought a surprise. The vegetation dried out so fast that the pattern in the NDVI and coherence time series looked very similar to mowing events. Without knowing that there was a drought, we would have marked them as mowing events. Should we bring weather data back to let the model learn the patterns of a drought and distinguish them from mowing events? We will address this question in the seasons to come.

No matter the economic situation, inflation or deflation – the greatest value of KappaZeta is always our people. In that sense 2021 was a good year. I’m very glad to welcome young and talented new members – Olga, Hudson, Anton, Liza and Tanya – to our team. Early 2022 will bring some more positive news regarding great people joining KappaZeta. Follow us on LinkedIn and Facebook to get the news.

Have a peaceful Christmas time with your friends and families, and a productive and positive new year!
Kaupo Voormansik, CEO, SAR expert

Newsletter, November 2021

1. Sentinel-2 KappaMask
Over the last year we have been developing our own free and open AI-based cloud mask, which can detect pixels corrupted by cloud shadows and various types of clouds. This complex task was successfully completed, and KappaMask is the most accurate cloud mask for Sentinel-2 for Northern European terrestrial summer conditions!

Legend: Yellow – cloud, Green – cloud shadow
Figure 1.1. Visual comparison of KappaMask with Sen2Cor

Now we are going global. Many challenging tasks face the KappaZeta team as new areas and conditions are introduced. Currently, snowy areas are misclassified as clouds, and the ocean surface is often mistaken by our model for cloud shadows and semi-transparent clouds. Moreover, we are introducing areas outside Europe, which will include deserts, tropical areas, etc.
To achieve this challenging goal, we are going to further develop the model and extend our existing open-source labelled dataset “Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks”, which you are welcome to check out and use on the Zenodo platform!
Olga Wold, Geospatial data quality specialist
2. HaTRY project about to end
The HaTRY (Harvesting Time Recommendation for maximum crop Yield) project has reached a stage where we are analysing the data collected during the last harvesting season and drawing conclusions.
Our field surveys on 9 test sites were successful, and together with the Estonian Crop Research Institute (ETKI) we have gathered valuable information about crop development stages and gained deeper insight into the ripening stage, which was the most important for our project.
It might be naive to think that the ripening process and optimal harvesting time can be easily detected from Sentinel parameters alone, but our hypothesis was that to some extent it is possible. Field survey results show that there are definitely some patterns and correlations between Sentinel parameters and different growth stages, and a common signature for the ripening process can be described. Besides the decrease in NDVI and the VH/VV ratio during ripening, we could also detect an increase in VV coherence values before the harvesting dates for all three crops (Figure 1). Of course, the question will always remain whether the farmer harvested the crop at the optimal time. In many cases they didn’t, and there were different reasons for that (a queue at the dryer, lack of machinery). We will write a dedicated blog post about our findings, but until then enjoy some of our graphs, in which our geospatial data quality specialist Olga Wold compared winter rapeseed growth stages, Sentinel parameters and lab analysis of collected seed samples.

Figure 2.1. Winter rapeseed principal growth stages and calculated satellite parameters.

Figure 2.2. Satellite parameters in comparison with most important crop maturing parameters during ripening stage of winter rapeseed.
Mihkel Järveoja, Project manager / GIS developer

Newsletter, October 2021

1. KappaMask article has been published
The article introducing KappaMask to the scientific world has now been published: https://www.mdpi.com/2072-4292/13/20/4100. We developed Level-1C (L1C) and Level-2A (L2A) models that output a 10 m resolution mask with cloud, cloud shadow, semi-transparent cloud and clear classes. We compared KappaMask with rule-based methods (Sen2Cor, Fmask, MAJA) and machine learning methods (S2cloudless and DL_L8S2_UV, which uses a deep learning model), and KappaMask outperformed all of these algorithms by about 20% in dice coefficient. The comparison was done on an isolated test dataset that was spatially and temporally distributed over the Northern European region and labelled using an active learning methodology.
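
For readers unfamiliar with the metric, here is a minimal sketch of how a per-class dice coefficient can be computed from two label masks. The function name, the class ids and the toy masks are illustrative assumptions and are not taken from the KappaMask code.

```python
import numpy as np

def dice_per_class(pred, target, class_ids):
    """Return {class_id: dice} for two integer label masks of equal shape."""
    scores = {}
    for c in class_ids:
        p = pred == c
        t = target == c
        denom = p.sum() + t.sum()
        # If a class is absent from both masks, treat it as perfect agreement.
        scores[c] = 1.0 if denom == 0 else 2.0 * np.logical_and(p, t).sum() / denom
    return scores

# Toy 2x3 masks with hypothetical class ids: 0 = clear, 1 = cloud.
pred = np.array([[0, 1, 1], [0, 0, 1]])
target = np.array([[0, 1, 0], [0, 1, 1]])
scores = dice_per_class(pred, target, class_ids=[0, 1])
```

A single headline score could then be an average over the classes, although how the scores are aggregated in the article is not restated here.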

The training and test dataset is freely available at this link: https://zenodo.org/record/5095024#.YXVuzJtRXpQ. For both the L1C and L2A models we performed a feature importance analysis. Currently the KappaMask L2A model outperforms the L1C model by using the AOT and WVP features in addition to the bands. To our knowledge, this is the first model built with these features that outputs a 10 m resolution mask.

The article was also presented during the ESA EO Phi-Week 2021 in the AI4DQ (AI for Data Quality) section. Let us know if you are willing to share a Sentinel-2 dataset or model to enrich our comparison. You can always try our cloud mask models on GitHub: https://github.com/kappazeta/cm_predict.

At Phi-Week, a photo by Marharyta Domnich

We plan to extend the scope of the project to cover the whole globe. Additionally, we are looking into more reference datasets and deep learning cloud mask methods for a bigger and more systematic comparison.

Marharyta Domnich, machine learning engineer
2. Participation in the 7th Superangel Base Camp Hackathon
At the beginning of October the KappaZeta team participated in the 7th Base Camp Hackathon in Tallinn (https://www.superangel.io/basecamp). Thank you very much for the opportunity, Kalev Kaarna and the whole Superangel team. There were a lot of wise people around us and a lot of clever ideas. If we can implement 10% of them during the coming months, we will already have made a big step forward in our development! :-) Special thanks to Elar Killumets, who spent a very fruitful hour with us discussing company culture and goal clarity. We really needed the hackathon, and its effect on us was like a breath of fresh air. Thank you!

KappaZeta team members at Base Camp Hackathon, a photo by Tiit Tamme, from Superangel Facebook

As KappaZeta has never been a classical start-up whose only goal is to make a lot of money, here are some ideas on how to make such events even better in the future. Do we need to measure success only in money? What if investors, besides asking “how do you make money?”, always also asked “how do you make the world better?” And what if the latter question were equally important, not just somewhere in the background? We understand that there is nothing fundamentally wrong with making money. The greater the revenue and profit of your company, the more the clients value your work and are willing to pay for it. That is perfectly fine, and KappaZeta has also set itself very ambitious financial goals. To reach them we must make excellent products and services and regularly gather feedback from users: are our products and services useful for you?
Still, having only financial goals is not enough for us, and it seems that we are not alone. There seems to be a larger shift in values reflecting actual human development. Being “green” is not just a fashion. The younger generation of investors in particular is avoiding oil company stocks because they do not like those companies’ past and current actions. The number of “green” investors is growing.
We believe that KappaZeta’s mission to “help the world to understand our home planet better via accurate satellite data and derived information” is something big that the world needs. But how do we turn it into revenue?
A more general open question: how can we tie the goal of making the world better to making money, so that it would hardly be possible to make money without making the world better? Let us know if you have good ideas about it.

Kaupo Voormansik, SAR expert, CEO

3. KappaZeta at Design Thinking Conference
It is not always easy to strike a balance between developing technical capabilities and providing a well-suited service to your clients. But this equilibrium is vital, and you need to find it or at least make a compromise between user needs and technology. Many of us probably know the caricature-like situation where a passionate inventor presents a technologically brilliant solution to the public but gets a cold reception from potential users because there is no need for that specific solution. A solution might be ingenious from the perspective of science or software architecture, but if it does no job for anybody, it is useless. Admittedly, such a piece of intellectual property might prove valuable in the future as a component of some other system. Still, to keep the process efficient and not to overburden the warehouse of “currently not used” intellectual property, one should start the development process by identifying the existing problems of potential clients. Yes, leaving the office for some scouting, or at least having some video calls with people, is needed.
About a year ago our team entered the Design Master Class funded by Enterprise Estonia and led by BDA Consulting OÜ. This encouraged us to adopt the philosophy that you are never too close to your customer. In other words, this great design program equipped us with the will and the tools to really find out which problems of our clients should be addressed with solutions built on our base technology, radar satellite remote sensing. By the way, the Design Master Class is only one of many examples that show how supportive the Estonian public sector and start-up environment are towards enterprises that really want to learn and evolve.
This October we had a great chance to participate in the Design Thinking Conference by Designminds OÜ (https://dmkonverents.ee/en). One of KappaZeta’s founders, Tanel Tamm, also gave a presentation there about our journey in the Design Master Class. For Tanel it was a good way to look back from a new angle and to recall our main lessons from the Master Class:
  • Mentoring can speed up customer validation.
  • Bringing together different user groups in a co-creation workshop gives additional insights.
  • Keeping a close relationship with users must be a continuous effort.

Our founder Tanel Tamm giving a presentation at the Design Thinking Conference, a photo by Patrik Tamm, Design Minds
Jürgen Lina, business development manager

Newsletter, September 2021

1. Modeling NDVI for Estonian Fields
With painstaking work on extracting and processing data steadily progressing, our latest successes have come from modeling NDVI for agricultural areas in Estonia. Training data was prepared from 1088 land patches from select areas in Estonia and Latvia due to similarities of agricultural land. Each land patch represents a 5242.88 km2 area.
We were able to predict correct NDVI values with an accuracy of 93% and equally significant improvements in other structural image qualities and sharpness. Pictured below is a synthetic image compared with its target, visualized with contrast enhancement. Try guessing the synthetic image!

There are still many questions we are exploring, including the generalizability of models over other areas in Northern Europe. We are currently considering Poland, Germany, Denmark, Sweden and Finland, and the effects of crop types from the mapped fields.
We are excited to share some updates on these questions in the following issues.
Hudson Taylor Lekunze, Data analyst
2. Grazing Detection with Satellite Data
The importance of the grazing detection project lies in the reform of the Common Agricultural Policy, where the current method of on-the-spot checks will be replaced with satellite monitoring. As a significant part of grasslands (about 20%) in Europe is being grazed, complete grassland maintenance checks cannot be performed due to the complexity and high costs. Thus, satellite monitoring will take over the on-the-spot checks.

At the beginning of September, we conducted an international meeting with agricultural agencies from Denmark, Sweden, Estonia and Czechia. It is clear that this project is valuable for many countries, considering the lower costs of satellite monitoring and the impossibility of full coverage by the on-the-spot checks. Thus, to reach this challenging goal, a lot of development work is still to be done in future.

The following figure shows the time series of two different grasslands during the vegetative season. There is a clear NDVI drop and a coherence increase in the upper time series, which corresponds to a mowing event. The lower time series has no abrupt change; instead, there is a gradual NDVI decrease over the whole period, which illustrates grazing activity. Since these time series are so different, grazing cannot be detected with the same model as mowing. Therefore, a different approach is needed, and we plan to improve the existing mowing and grazing methodology.
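
To make the contrast between the two signatures concrete, here is a toy sketch that flags an abrupt, mowing-like event (a sharp NDVI drop together with a coherence rise between consecutive observations). It is only an illustration and not KappaZeta's detection model; the thresholds, the function name and the sample values are assumptions, and both series are assumed to be sampled on the same dates.

```python
import numpy as np

def abrupt_event_indices(ndvi, coherence, ndvi_drop=0.2, coh_rise=0.1):
    """Indices of observations where NDVI fell sharply while coherence rose."""
    d_ndvi = np.diff(np.asarray(ndvi, dtype=float))
    d_coh = np.diff(np.asarray(coherence, dtype=float))
    # +1 because diff[i] describes the step from observation i to i + 1.
    hits = np.where((d_ndvi <= -ndvi_drop) & (d_coh >= coh_rise))[0] + 1
    return hits.tolist()

# Hypothetical values: an abrupt drop (mowing-like) vs a gradual decline (grazing-like).
mown_ndvi = [0.80, 0.82, 0.45, 0.55, 0.65]
mown_coh = [0.20, 0.22, 0.60, 0.40, 0.30]
grazed_ndvi = [0.80, 0.76, 0.72, 0.69, 0.65]
grazed_coh = [0.20, 0.22, 0.21, 0.23, 0.22]

mown_events = abrupt_event_indices(mown_ndvi, mown_coh)
grazed_events = abrupt_event_indices(grazed_ndvi, grazed_coh)
```

The gradual series produces no hits, which is exactly why a step-change rule of this kind cannot detect grazing.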

Figure 1. S1 coherence and S2 NDVI time series of two Danish grasslands during the vegetative season in 2018. Red – grazing signature.

We are also pleased to report that our weekly fieldwork in Estonia is finished! We have collected continuous data about grazing activity and changes in the grassland condition, which is now being analyzed.

Figure 2. A member of KappaZeta team doing fieldwork

Figure 3. Cattle doing fieldwork

Olga Wold, Geospatial data quality specialist

Newsletter, August 2021

1. Why KZ's work is deep and cool and most of other companies’ work is not
At some point most of us have asked ‘why’? Why do we do the work we do? What is its deeper meaning? What is my mission? And if you dig deeper, the answers can be horrifyingly empty in most companies. Besides the joy of the process and nice colleagues, there is usually nothing left except earning money and making some little process more efficient for our little human problems. Earning more money for what? Making some process more efficient for what?

Optimizing people and food transport by a further 5%: is it really badly needed? A better understanding of our one and only home planet [for solving climate change] is really badly needed, but this is not.

Putting the best brains to trade stocks to make more money for an already very rich company?

Developing yet another online bank or online shop? These have already been invented; redoing them is boring.

The best data scientists studying customer behavior to improve the targeted ads algorithms. Is it the best use of our talent?

Making bank transfers and currency exchange a bit cheaper? Not bad, but also not the most burning question of our time.

Conquering climate change and deeply understanding the processes that drive it is potentially the most burning question of our time! It deserves much more investment, talent and focus than it currently gets!

The work of KappaZeta helps to solve this question. Making satellite data more accurate, easy to use, visible and accessible for all interested people contributes directly to a better understanding of our planet. It helps to make better decisions based on fresh and objective data.

It is directly related to our highest values and to a fundamental characteristic of mankind. The meaning of life according to the Dalai Lama is extending the limits of human knowledge. It is something bigger and longer-lasting than our human lives. New knowledge in the scientific sense is also something truly new; it is the only ‘real news’. One of the most fundamental characteristics of humanity is curiosity: the willingness to know more than we already know, to find out new things. It is a good reason to wake up in the morning, as there is so much new and exciting to be found out!
Kaupo Voormansik, CEO, SAR expert
2. Improving our Generative Model for Synthetic NDVI Estimations
Recently, we shared some results from our Synthetic NDVI MVP, showing it was feasible to model Optical NDVI images from SAR backscatter. We also mentioned this MVP was planned to be extended from June, to improve the fidelity of the images generated and their quality (i.e., sharpness, structural similarity, closeness to original NDVI values).
So far, the work on improving the model has involved more data covering more time periods and agricultural land areas, some new features from the Sentinel-1 satellite (coherence in both polarizations) and some model improvements.
Preparing this new data alone has required some careful work; Figure 1 below compares preprocessing techniques for scaling VV backscatter values for training. We do not yet understand the relation between coherence and NDVI and are not sure whether this new feature will improve the model. However, we are excited to learn more about the relation between Sentinel-1 data and NDVI (and, in the future, even more Sentinel-2 features, including RGB bands) and are hopeful for improved results on our evaluation metrics.

Figure 1: Raw VV-Backscatter, Standardized VV, 98% percentile clipped VV (for sub-tile in training set)
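
The two scaling variants named in the caption can be sketched as follows. This is a minimal illustration on synthetic data, not our actual preprocessing code: the synthetic tile, the function names and the choice to clip only the upper tail at the 98th percentile are assumptions.

```python
import numpy as np

def standardize(x):
    """Zero-mean, unit-variance scaling of a tile."""
    return (x - x.mean()) / x.std()

def clip_upper_percentile(x, upper=98.0):
    """Cap values above the given percentile to limit the effect of bright outliers."""
    return np.minimum(x, np.percentile(x, upper))

# Synthetic stand-in for a VV backscatter sub-tile with one very bright scatterer.
rng = np.random.default_rng(0)
vv = rng.gamma(shape=2.0, scale=0.05, size=(64, 64))
vv[0, 0] = 50.0

standardized = standardize(vv)
clipped = clip_upper_percentile(vv)
```

Standardization keeps the outlier and lets it stretch the value range, whereas percentile clipping removes its influence before any further scaling.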

Hudson Taylor Lekunze, data analyst

Newsletter, July 2021

1. Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks
For the last year we have been developing a labelled Sentinel-2 cloud mask dataset, which is available to the community at https://zenodo.org/record/5095024#.YP8HpY4zaUk. The dataset consists of 4403 labelled subscenes at 10 m resolution from 155 Sentinel-2 (S2) Level-1C (L1C) products distributed over the Northern European terrestrial area. Temporally there are around 30 S2 products per month from April to August and 3 S2 products per month for September and October. Each selected L1C S2 product represents different clouds, such as cumulus, stratus, or cirrus, which are spread over various geographical locations in Northern Europe. The classes in the mask are the following:
  • MISSING: missing or invalid pixels;
  • CLEAR: pixels without clouds or cloud shadows;
  • CLOUD SHADOW: pixels with cloud shadows;
  • SEMI TRANSPARENT CLOUD: pixels with thin clouds through which the land is visible; include cirrus clouds that are on the high cloud level (5-15 km);
  • CLOUD: pixels with cloud; include stratus and cumulus clouds that are on the low cloud level (from 0-0.2 km to 2 km);
  • UNDEFINED: pixels that the labeler is not sure which class they belong to.
The dataset was labelled with the help of the CVAT and Segments.ai labelling tools (thanks to the possibility of integrating an active learning process in Segments.ai, the labelling was performed semi-automatically and verified by a human labeler). The files are in NetCDF format and contain the bands B01, B02, B03, B04, B05, B06, B07, B08, B8A, B09, B10, B11 and B12, preprocessed and oversampled to 10 m resolution; some of the products additionally include Fmask, Sen2Cor, S2cloudless and MAJA masks for comparison. We believe that sharing such a dataset is a big step towards improving cloud mask performance for the whole community! Feel free to use and share!
Marharyta Domnich, machine learning engineer
2. Building the Harvesting Time Recommendation Service: weekly surveys
It is summer, and with complex weather conditions it can be difficult for farmers to choose the right time for harvesting crops. To find the right time and reach this challenging goal, we are carrying out fieldwork within the Harvesting Time Recommendation for maximum crop Yield (HaTRY) project, collecting samples of three typical crops grown in Northern Europe and analysing them with the help of the Estonian Crop Research Institute (ETKI) lab.
During dedicated fieldwork we collect grains from three winter wheat, three winter rapeseed and three spring barley fields (nine fields in total) and bring them to the ETKI lab for analysis. The main goal is to collect continuous data, up until the fields are harvested, on the key parameters describing crop ripening (protein content, grain moisture, chlorophyll content, oil content, etc.).

Figure 1. a) Mihkel Järveoja collecting winter wheat grain samples with hand harvester in Jõgeva county; b) collage of collected grain samples. Photos by Olga Wold

Once all the monitored crop fields have been harvested and the samples collected and analyzed, we are going to match and compare the laboratory results with the Sentinel-1 and Sentinel-2 feature set signatures and validate the model that will help us predict the best harvesting times in the future service.

Olga Wold, geospatial data quality specialist

Newsletter, June 2021

1. KappaMask for Sentinel-2
We are happy to announce that KappaMask now supports both Sentinel-2 Level-2A and Level-1C inputs. The KappaMask predictor can be used for both levels, and its results outperform all existing open-source methods. It generates a 10 m georeferenced output .tif mask with clear, cloud shadow, semi-transparent cloud, cloud and missing classes. For validation, we built an extremely challenging test set that covers the whole range of Northern European terrestrial conditions. KappaMask L1C reached a dice coefficient of 79%, while on the same data Sen2Cor reached 55%, Fmask 63%, S2cloudless 64% and MAJA 43%. The KappaMask L2A model performs even better than KappaMask L1C on the cloud shadow class. However, semi-transparent clouds are better recognized by the L1C model, since L1C has a dedicated cirrus band which is not available for L2A. To make the cloud mask this accurate, we labelled more than 4000 sub-tiles of 512 x 512 pixels. The predictor can be installed and run from this link: https://github.com/kappazeta/cm_predict. We share the open-source approach and appreciate community input, so try it out and give us your feedback!
Marharyta Domnich, machine learning engineer
2. HaTRY model development and first result
Our harvesting time recommendation model development has reached a point where we can share some insights into the model architecture and preliminary results.

The model consists of two parts: a regression model (M1) which predicts future time series and a binary classification model (M2) which picks up the harvesting signatures from the regression model output. Instead of a single black-box model approach, we opted for a modular design to make it easier to troubleshoot and evaluate the system as well as to re-use parts of the pipeline across projects.

To make the time series easier to predict, Sentinel-1 parameters are separated into different features by their relative orbit number (RON). With resampled and linearly interpolated time series as input, M1 smooths the signatures further. The following figure shows the predicted vs. actual VH coherence median for RON 153, the median backscatter VH/VV ratio for RON 58, the vegetation index (NDVI) median and the active temperature sum time series for a single winter rapeseed parcel.
M2 has been trained to function on time series input with weather forecast and M1 output stitched together. The following figure illustrates harvesting probabilities for another parcel (spring barley) from M2 output, in comparison to a labelled time series of Sentinel-1 coherences, backscatter VH/VV ratio and Sentinel-2 NDVI. Sentinel-1 parameters from all RONs are shown on the image.
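
The input preparation described above (one feature per relative orbit number, then resampling and linear interpolation) can be sketched with pandas. This is a minimal sketch, not the HaTRY pipeline itself: the column names, the daily time step and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical per-parcel observations: one row per Sentinel-1 acquisition.
obs = pd.DataFrame({
    "date": pd.to_datetime(["2021-06-01", "2021-06-07", "2021-06-13",
                            "2021-06-03", "2021-06-09"]),
    "ron": [153, 153, 153, 58, 58],
    "vh_coh": [0.30, 0.36, 0.48, 0.25, 0.31],
})

def per_ron_daily(df, value_col):
    """Split one parameter into a column per RON and interpolate to a daily grid."""
    wide = df.pivot(index="date", columns="ron", values=value_col)
    wide.columns = [f"{value_col}_ron{r}" for r in wide.columns]
    daily = wide.resample("D").mean()  # days without an acquisition become NaN
    # Fill only the gaps between real observations; do not extrapolate at the edges.
    return daily.interpolate(method="linear", limit_area="inside")

daily = per_ron_daily(obs, "vh_coh")
```

Keeping each orbit in its own column avoids mixing acquisitions taken from different viewing geometries into a single series.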

Due to each crop type having different growth signatures and harvesting times, dedicated M1 and M2 are used for each crop type. In the scope of the HaTRY project, we have three crop types (winter wheat, winter rapeseed and spring barley) with a pair of models for each. This is illustrated in the figure below.

We expect to have a live prototype of the harvesting time recommendation service integrated with the crop monitoring system in July and will validate our model predictions with field surveys and farmers’ feedback.
Mihkel Järveoja, HaTRY project leader
3. Weekly field research for the grazing detection
It’s 8 o’clock in the morning on Friday and we are already heading northwest to perform our new routine – meet with the farmers, fly a drone over the field and record all the changes in the grassland condition.
Fieldwork is a significant part of ground reference data collection required for the development of grazing detection methodology. Our main goal is to collect enough continuous data to calculate the grazing intensity (i.e., number of animals per one hectare) of the test parcels in Estonia. Continuity in the data collection is important and hence we are doing the fieldwork on a weekly basis throughout the whole active grazing period.
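
As a small worked example, grazing intensity in the sense used above (animals per hectare) can be derived from the weekly drone counts like this. The function name, the counts and the parcel area are made up for illustration.

```python
def grazing_intensity(weekly_counts, parcel_area_ha):
    """Mean number of animals per hectare over the observed weeks."""
    if parcel_area_ha <= 0:
        raise ValueError("parcel area must be positive")
    if not weekly_counts:
        raise ValueError("at least one weekly count is required")
    mean_animals = sum(weekly_counts) / len(weekly_counts)
    return mean_animals / parcel_area_ha

# Three hypothetical weekly counts on a 6 ha parcel.
intensity = grazing_intensity([12, 15, 9], parcel_area_ha=6.0)
```

Averaging over weeks is one possible convention; a per-week value is obtained by passing a single count.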

Figure 1. Drone picture of cattle in Laeva village, Tartu county.

Our partner for the fieldwork is the University of Tartu Geography department. Together we focus on three main activities: counting animals on the parcel with a drone, measuring grass height and monitoring soil moisture. In addition, we keep track of animal movements between the different parcels, grassland condition, mowing activities and supplementary fodder.
Figure 2. Fieldwork team (from left to right Risto Merdenson, Tõnu Oja, Ants Reitsak, Mihkel Järveoja) starting the drone for the first time in Siniküla village, Tartu county. Photo by Jelizaveta Vabištševitš.

Jelizaveta Vabištševitš, EO Analyst

Newsletter, May 2021

1. Sentinel-1 Next Generation – what we know so far
Sentinel-1 is arguably the most important SAR mission in the world. Thanks to its free and open data policy and fine temporal resolution it is the de facto standard of SAR for many. Therefore, the future of the mission is of great interest.

We know that the C and D units of the mission will be almost exact copies of the currently operational A and B units, but things get more interesting with the E and F satellites, which correspond to the next generation (Sentinel-1 NG). Not much is known about this potentially game-changing mission for the Earth Observation world, but some documents do exist. DLR has published a set of slides and a short document based on their Phase-0 study, available from here and here.

We can read that the spatial resolution is planned to improve from 5 by 22 m to 5 by 5 m, which area-wise means about 4.4 times denser data (110 m² versus 25 m² per resolution cell), and that the repeat cycle is planned to shorten from 6 to 4 days. Perhaps the most interesting and significant update concerns the fully polarimetric mode. The system could work either in dual pol. mode with a 400 km swath or in quad pol. mode with a 280 km swath. For the agricultural land applications that KappaZeta has worked on so far, the latter mode is definitely more interesting. Sentinel-1 temporal resolution is already great. With a 400 km swath and a 4-day revisit it would be even greater, but this is not where agricultural applications run into trouble. In some applications, for example, we use only the ascending orbits’ data, neglecting half of the Sentinel-1 temporal data density.

The bigger problem is the information richness of the data. Sentinel-1 dual pol. data with 4 variables per pixel supports just a small subset of the applications that are possible in the SAR world. With quad pol. data there are 8 variables per pixel; though some of them are redundant, you don’t need to be Einstein to understand that more linearly independent input variables are good. In the world of AI and data science, this translates to richer food for the AI models, which means more accurate information for the end users. Some applications that are currently impossible would be enabled by the emergence of quad pol. SAR data in large quantities.

An example of what a quad pol. SAR image would look like is shown in Figure 1. This is a RADARSAT-2 image with very similar spatial resolution. Unlike the tiny amounts of specifically ordered RADARSAT-2 acquisitions, Sentinel-1 NG would produce this data on a large scale with a free and open data policy. We really hope that the 280 km swath quad pol. mode will be the default one for Sentinel-1 NG. The temporal resolution of Sentinel-1 is already great, but the information acquired could be much richer.

Figure 1: What does quad pol. SAR data look like in the Pauli basis? A subset of a RADARSAT-2 quad pol. image of Estonia from summer 2013. RADARSAT-2 Data and Products © MacDonald, Dettwiler and Associates Ltd. 2013 All Rights Reserved. RADARSAT is an official trademark of Canadian Space Agency.

We hope to write about it in a longer blog post on the KappaZeta website in the near future. Though it will not happen very soon (launch of the mission is around 2028?), Sentinel-1 NG will be a great game changer in the Earth Observation world. And not only that: it will improve the understanding of our planet at the global, regional and local scale, and without it, it is difficult to imagine the transition to a green economy.
Kaupo Voormansik, SAR expert, CEO
2. Our crop monitoring service prototype is live!
At the beginning of May we launched our crop monitoring prototype service. Three farmers and the Estonian Crop Research Institute are the first to test our service in Estonia and give us valuable feedback to continuously improve the product. Crop monitoring is the first phase of the Harvesting Time Recommendation service prototype, which we are planning to launch in July with the same participants.

In the crop monitoring service the farmers can see the ongoing season’s time series of Sentinel-1 and Sentinel-2 field-based features and NDVI/RGB/NRG images, which are updated as soon as new images are available. We have also included the main weather parameters (daily temperature and precipitation), and every day we update accumulated agrometeorological parameters such as the active temperature sum and the effective temperature sum. Here are some screenshots from our live service:

We are ready to set up a similar service anywhere in the world, so if you are interested in a demo or want more information about the prototype, contact us.
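
For readers curious about the accumulated agrometeorological parameters mentioned above, here is a minimal sketch using one common definition of each. The 5 °C threshold, the function names and the sample temperatures are assumptions; the service may use different definitions or thresholds.

```python
def active_temperature_sum(daily_means, threshold=5.0):
    """Sum of daily mean temperatures on days warmer than the threshold (assumed 5 °C)."""
    return sum(t for t in daily_means if t > threshold)

def effective_temperature_sum(daily_means, base=5.0):
    """Sum of the daily excess over the base temperature (assumed 5 °C)."""
    return sum(t - base for t in daily_means if t > base)

# Hypothetical daily mean temperatures in °C.
temps = [3.0, 6.0, 10.0, 4.5, 12.0]
active = active_temperature_sum(temps)
effective = effective_temperature_sum(temps)
```

Both sums only ever grow during the season, which makes them convenient clocks for comparing crop development between years.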
Mihkel Järveoja, HaTRY project leader
3. Synthetic NDVI for proxy biomass calculations

Plant biomass is known to be a key indicator of the ecological state of an area covered by vegetation, providing insights into the energy stored in plants and the resulting biofuel available, the grazing capacity of fields, and more. However, measuring biomass, especially over large areas, is a costly process, and hence proxy measurements and models have been developed to compute these estimates, one of which uses the Normalized Difference Vegetation Index (NDVI).

ESA’s Sentinel-2 mission provides this NDVI data openly (computed from the red and near-infrared bands), with a 5-day revisit time. However, as with any optical satellite data, the view is often obstructed by clouds, leaving very little to be seen and analyzed. Sentinel-1 data, however, is a more dependable source, as it relies on backscattered radio waves, albeit at a lower resolution. Hence, we developed an MVP in a month to explore modeling NDVI from Sentinel-1 data using generative models.
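
As a reminder of what is being modelled, NDVI is the normalized difference of the near-infrared and red reflectances, (NIR - red) / (NIR + red); for Sentinel-2 these are bands B08 and B04. Below is a minimal sketch with made-up reflectance values; the function name is illustrative.

```python
import numpy as np

def ndvi(red, nir):
    """NDVI = (NIR - red) / (NIR + red); NaN where both bands are zero."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectances: dense vegetation, bare soil, and a no-data pixel.
values = ndvi(red=[0.1, 0.3, 0.0], nir=[0.5, 0.3, 0.0])
```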

Figure 2 (L-R): Sentinel-1 False Color RGB | Target NDVI Image | Synthetic NDVI Image

As pictured above, we found that this modeling is possible using multi-temporal CGANs (Conditional Generative Adversarial Networks), but there is still some work to be done on enhancing the quality of the generated images and enforcing temporal constraints on the synthetic images more strictly. Work beyond this MVP resumes in June, and we are excited about the results to come with more data from Sentinel-1, additional preprocessing and model improvements.
Hudson Taylor Lekunze, data analyst

Newsletter, April 2021

1. Next step in the grazing detection project with the ESA
In short, we are developing a grazing detection methodology based on Sentinel-1 and Sentinel-2 imagery time series. This work is important for the reform of the Common Agricultural Policy, which will replace on-the-spot grazing checks with satellite monitoring. As about 20% of grasslands in Europe are being grazed, our detection methodology will help to complete the grassland maintenance checks alongside the mowing detection.
So far, we have gathered a significant amount of ground reference data – a total of 7,445 animals (61% cattle, 32% sheep, 2% horses) were labelled all over Estonia.

Cattle (left) and sheep (right). Source: Estonian orthophoto, Land Board.
Next, we are planning to carry out a field survey in Estonia, which will focus on counting the number of animals on the test parcels throughout the active grazing period (from May to August). The main objective of the in situ data collection is to record the grazing intensity (LU/ha) and grassland conditions on a weekly basis.
Jelizaveta Vabištševitš, EO analyst, project manager

2. Our new cloud mask
On 23 April KappaZeta presented the Cloud Mask project results at the Very High-resolution Radar & Optical Data Assessment (VH-RODA) 2021 workshop. VH-RODA 2021 (20-23 April) featured presentations and discussions on the current status and future developments of the calibration and validation of spaceborne very high-resolution SAR and optical sensors and data products, with a focus on synergies between the optical and SAR communities and on standards and best practices for data quality.

We named our cloud classification mask KappaMask. We showed that KappaMask outperformed Sen2Cor, with a dice coefficient of 92% vs 57% on the validation set. This pixel-wise metric considers the performance on the cloud, semi-transparent cloud, cloud shadow and clear classes. Check out our cloud mask comparison slider pages with rule-based masks (Sen2Cor, Fmask) from here and with the machine learning-based Sinergise S2cloudless mask from here. The labelling process is still in progress (so far, we have ~3500 labelled subscenes of 512x512 pixels at 10 m resolution covering the summer season in Northern Europe). Therefore, we expect even better results with further labelling and fine-tuning!

Marharyta Domnich, machine learning engineer

Newsletter, March 2021

1. Advanced coherence rasters
It is commonly understood that to build a useful AI model, it is crucial to have validated ground reference data for training. It is just as important to have meaningful, calibrated, low-noise features that have a substantial physical relation to the phenomenon being modelled. At KappaZeta we pay extra attention to getting the most out of Sentinel-1 SAR data.

The figure above compares 6-day VH coherence from SNAP with the output from our processing chain. For reference, some borders of agricultural parcels and roads are visualised. As you can see, the spatial variability and the dynamic range from the KZ processor are much higher. You can also notice small areas with white NoData pixels. We don’t output pixel values that are below the noise floor, because they do not contain meaningful data and would only introduce noise into your modelling tasks. With analysis-ready Sentinel-1 data we can make space a valuable asset for everyone!
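
The NoData idea can be illustrated with a few lines of NumPy: values under a noise floor are replaced with NaN instead of being passed on. This is only a sketch of the principle and not the KZ processor; the function name, the dB values and the -22 dB noise floor are assumptions.

```python
import numpy as np

def mask_below_noise_floor(values_db, noise_floor_db):
    """Replace samples below the noise floor with NaN (NoData)."""
    values_db = np.asarray(values_db, dtype=float)
    return np.where(values_db < noise_floor_db, np.nan, values_db)

# Hypothetical backscatter samples in dB with an assumed noise floor of -22 dB.
masked = mask_below_noise_floor([-10.0, -25.0, -18.0], noise_floor_db=-22.0)
```

Downstream statistics can then skip the gaps explicitly, for example with `np.nanmedian`, rather than averaging noise into a parcel value.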
Tanel Tamm, GIS expert, board member

2. Ivory tower, analysis ready data and how to make the most from the European investment into Copernicus programme
Sentinel-1 is a beautiful data factory, but its data is still under-used, because it is too complex to use for large majority of potential beneficiaries. When Copernicus programme and Sentinel-1 mission was planned the investment was justified with various public services and benefits it would enable and provide to the European and global citizens. Now it is almost 7 years from the launch of the first satellite and with few exceptions the data is still mainly used by university research groups and specialized Earth Observation companies like KappaZeta.

The data is beautiful, it is systematic, it comes in large volumes and it is free and open! It enables numerous analyses and informed decisions based on actual data, starting from geology and ending with agriculture. Why is it under-utilised then? Why do most ICT and data science companies and governmental users overlook it? Because the entry barrier is too high. If you hit users with a 1-4 GB zipped file in a native space agency format, most of them are not determined enough to look for and download the dedicated software needed to browse, visualise and export the imagery products. In the era of convenient online services, they would like to just quickly see the image to determine whether it is useful for them and worth spending time on. The data layer should ideally be integrated with just one click and be usable out of the box.

The Committee on Earth Observation Satellites and the European Space Agency have recognized the problem and have come up with the concept of Analysis Ready Data. In practice it means working out a set of standards: if the data provider follows them, the clients should be happier and have less pain with pre-processing the data. Among others, there are standards for Synthetic Aperture Radar data, which are directly applicable to Sentinel-1.

Of course, the standards themselves have no value if nobody follows them. This is where KappaZeta comes into play. We have the vision, and we want to make the most of Sentinel-1 data and present it in a way that users understand and can access easily. A lot has already been done to make Sentinel-2 data easily accessible (credit to Sinergise for Sentinel Hub), but for Sentinel-1 the job still largely remains to be done. In practice it means pre-processing the data correctly and providing the raster images and value-added data layers as web services. Among other developments, our team is building the Sentinel-1 backscatter WMS.

This topic is close to our hearts. We came down from the ivory tower of science, because making the results of science work for “ordinary” people is at least as important as science itself. If everything goes well, we will be rolling out a series of value-added Sentinel-1 data layer web services over the course of the next two years. We will make Sentinel-1 data more easily accessible, and the investment made into the Copernicus programme more useful.

If you see that satellite data can be useful in your work, but you don’t know how to use it – please contact us!

Kaupo Voormansik, SAR expert, CEO

3. Meeting clients’ needs
Our mission is to make space a valuable asset for everyone. On the one hand, we do this by using our expertise in SAR data processing and providing input data for other companies. On the other hand, we increasingly approach end users directly. A valuable milestone on this road has been serving the Estonian paying agency with a countrywide automated mowing detection system. This experience has encouraged us to approach new client segments in the agricultural domain. And again, we are not alone on this path: we do this hand in hand with innovation-minded Estonian farmers and have solid support from mentorship programs. By the way, Estonia has a very favourable environment for business development in terms of support from the public and private sectors. During the previous 12 months our team has been advised by three organisations: North Star Consulting Group, Superangel and Enterprise Estonia (EAS). In addition, we participate in a joint counselling program, Põhjanael, which guides us in forming relevant value propositions for our client groups.
The EAS Design Masterclass has been very helpful in the process of creating a meaningful service. By the second workshop we had decided that in this program we would focus on a crop monitoring application for farmers and agricultural insurance companies. Before deciding, we had conducted more than 20 interviews with farmers to get their perspective. This gave us a quite clear understanding of what kind of remote sensing assistance would actually provide value for farmers. Having a clear and honest dialogue with your client is invaluable, as it prevents you from developing yet another state-of-the-art application that delivers little value to end users. It might be very tempting to build something just for its pure technological elegance, but if it ignores clients’ needs, it is unprofitable. The Masterclass has emphasized the importance of a lean approach to service development and the art of hypothesis. There are many efficient tools out there to facilitate the design process and fast prototyping, e.g., Miro for online visual collaboration and Figma for interface design.
Jürgen Lina, business development manager

Newsletter, February 2021

1. Insight to crop harvesting dates in 2020
During the HaTRY (Harvesting Time Recommendation for maximum crop Yield) project we have collected and analyzed exact sowing and harvesting times for more than 700 crop fields in Estonia.

Sowing and harvesting events of the 2020 season for those 3 crops were distributed over time like this (numbers and colors indicate how many fields were sown or harvested on that specific day):

Next, we are going to match those events with signatures from Sentinel-1, Sentinel-2 and weather parameters, and try to predict the best harvesting windows in a live prototype already during the 2021 season. It’s going to be a challenge, for sure.

Mihkel Järveoja, HaTRY project leader

2. Sentinel-2 labeled datasets
We were impressed to find out that there are at least 2 high-quality labeled Sentinel-2 datasets that could potentially be used in our free AI-based cloud mask processor for Sentinel-2.
One of the datasets, “Sentinel-2 reference cloud masks generated by an active learning method” by Louis Baetens and Olivier Hagolle, consists of 7 scenes labeled by humans and 31 scenes generated by active learning methods. The ground-truth mask has the following classes: no data, not used, low clouds, high clouds, cloud shadows, land, water and snow. However, the dataset is generated at 60 m resolution. Although we are building a detector for images with a ground resolution of 10 m, we still believe that we can incorporate this data into our processor.
The second dataset we were excited to find, “Sentinel-2 Cloud Mask Catalogue” by Francis, Alistair et al., was released recently, in November. The dataset comprises 513 labeled subscenes of 1022x1022 pixels at 20 m resolution. The labels represent 3 classes: clear, cloud and cloud shadow. Additionally, the subscenes have been categorized by surface type, cloud type, cloud height, thickness and extent. The data was annotated semi-automatically, using the IRIS toolkit, which makes use of a random forest model for pre-labeling.
We believe that with a greater variety of data and sources we can build the best cloud mask processor so far. So let us know if you know of any other labeled S2 dataset! :)

Marharyta Domnich, Machine learning engineer

3. Generating synthetic Sentinel-2-like images from Sentinel-1 images
The Sentinel-2 satellites are known for providing us with beautiful optical imagery at high spatial resolution. More than just pretty pictures, the Sentinel-2 data can be used to calculate a variety of vegetation indices, among others the normalized difference vegetation index (NDVI). There is just one problem: when the sky is covered with clouds, we don’t have access to any of this. This can be a frustrating problem, especially in the autumn season.
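As a quick illustration of the index mentioned above, NDVI is defined as (NIR - Red) / (NIR + Red). For Sentinel-2, the 10 m near-infrared and red bands are B8 and B4. The sketch below is a minimal NumPy version; the function name `ndvi` and the sample reflectance values are made up for illustration, and reading the bands from actual products is left out.

```python
import numpy as np


def ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red); NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Made-up surface reflectance values for three pixels (B8 = NIR, B4 = red)
b8 = np.array([0.5, 0.4, 0.0])
b4 = np.array([0.1, 0.4, 0.0])
values = ndvi(b8, b4)
print(values)
```

Dense green vegetation reflects strongly in the near-infrared and absorbs red light, so it gives values close to 1, while bare soil and water give values near or below 0.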
The images from the Sentinel-1 satellites are not so easy to interpret. Even so, they have a huge advantage on their side: as radar satellites they can see through clouds and offer an uninterrupted time series of images. And, after all, Sentinel-1 images depict the same places and objects as Sentinel-2 images.
At KappaZeta, we have started to work on modelling synthetic Sentinel-2-like images based on Sentinel-1 input. For that we use a specific generative adversarial network called “pix2pix”, an image-to-image translation architecture which learns a function to map from an input image to an output image. The model is trained on co-located pairs of Sentinel-1 and cloud-free Sentinel-2 images, and eventually we plan to use the history of Sentinel-2 images as well. This means we can produce cloud-free Sentinel-2-like images when the actual Sentinel-2 images are obscured by clouds. This way we can combine the strengths of both sensors.

In the image: an example of a preliminary training result. Left: Sentinel-1 image, middle: synthetic output from the model, right: the actual Sentinel-2 image.
Heido Trofimov, Software developer

Newsletter, January 2021

1. Faster labelling
We managed to greatly boost our labeling speed for the cloud masking project with the Segments.ai annotation tool. Compared to CVAT, which we used previously, it is superior in speed and functionality, since it allows machine learning to be applied from the start. It is possible to label a small subset of images first, train, e.g., a segmentation model and then predict pre-labels for the rest of the data. Afterwards, the predicted labels can be corrected simply with the brush. The better the model is, the faster the labeling, and it is possible to continue training with the corrected data to create the next pre-labeled version (more info).
Marharyta Domnich, Machine learning engineer

2. A great number of square kilometers processed
In many cases land is divided into generally homogeneous parcels for management purposes. For instance, this is common in forestry and agriculture in Europe. If this is the case, more refined products can be delivered to clients. We started to offer Sentinel-1 statistics time series for areas of interest (AOI) as a service in April 2020. Since then, we have delivered to our clients approximately 15 000 000 statistical measurements of interferometric coherence and backscatter in VH and VV polarisations for 335 000 km² of land.

Tanel Tamm, GIS expert and board member

3. Warm years from 2016 to 2020
Interviewing the active and innovation-friendly farmers of Estonia has brought us to an interesting topic: climate change. On the one hand, the farmers are happy, as a warmer climate gives greater yield. On the other hand, milder winters and warmer summers bring new pests and diseases further north, ones that did not like the cool climate that was here before.
To not let ourselves be fooled by the cold winter here now in January 2021, we dug into real data and checked how much the weather has actually changed in the past 5 years. We checked the monthly average temperatures with respect to the climate norm 1981-2010 according to Estonian Weather Service. The chart below summarizes our findings.

It is a pretty one-sided picture. In 45 months out of 60 the temperature has been above the norm, and on average a month has been 1.3 degrees warmer than the climate norm. For a countrywide monthly average, that is a lot! This data is about Estonia, but the conclusion is to a large extent valid for the whole of Northern Europe and the Baltic Sea Region. We do not want to argue here about how large a part of it is caused by humans or whether it is cyclic behaviour. Still, regularly warmer months during the past years are a fact.
KappaZeta tries to stay close to our users, helping them to better understand and analyse what is going on in nature and what the wise decisions are in this context. Analysing objective satellite and weather data is a good basis for unbiased and clever decisions.
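The kind of comparison described above can be sketched in a few lines: subtract the climate norm for each calendar month from the observed monthly mean, then count the warmer months and average the differences. The function name `summarize_anomalies` and all temperature values below are made up for illustration. They are not the Estonian Weather Service data behind the chart.

```python
import numpy as np


def summarize_anomalies(observed, norm):
    """Return (number of months warmer than the norm, mean anomaly).

    observed: array of shape (years, months) with monthly mean temperatures
    norm:     array of shape (months,) with the climate norm per calendar month
    """
    observed = np.asarray(observed, dtype=float)
    norm = np.asarray(norm, dtype=float)
    anomaly = observed - norm  # the norm is broadcast across the years
    return int((anomaly > 0).sum()), float(anomaly.mean())


# Made-up example: 2 years x 3 months of monthly means, in degrees Celsius
observed = [[-1.5, -4.5, 1.0],
            [-2.0, -3.0, -1.0]]
norm = [-3.0, -4.0, 0.0]

warmer_months, mean_anomaly = summarize_anomalies(observed, norm)
print(f"{warmer_months} of 6 months warmer, mean anomaly {mean_anomaly:+.1f} degrees")
```

With five years of real data the `observed` array would have shape (5, 12), and the same two numbers would correspond to the "months out of 60" and "degrees warmer on average" figures quoted in the text.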
Kaupo Voormansik, CEO

Newsletter, December 2020

1. Labelling process
Fun fact: According to EIT Digital 2020, data scientists spend 60% of their time on cleaning and organizing data, 19% on collecting data sets and only 9% on modelling/machine learning. We are fully experiencing this in our ongoing cloud masking project, where creating a labeled dataset is the key element for success. So far, we have labeled 1940 images with the size of 512x512 pixels, where every pixel is labeled as cloud/cloud shadow/semi-transparent cloud/clear land class! We agree with Taivo Pungas, who argues in his blog that the process of building a dataset is undeservedly left out of the scope of research activities. Instead of putting so much effort into tuning model parameters and designing creative architectures, smart AI decision-making needs conscious investment into collecting data and understanding the domain. Luckily, we have remote sensing experts at KappaZeta to help with that.
Marharyta Domnich, Machine learning engineer

2. Estimating the best time for harvesting crop

Image: Madis Ajaots

Since August we have talked to different stakeholders and gathered user requirements for a new service prototype – Harvesting Time Recommendation for maximum crop Yield (HaTRY). This is another project carried out in collaboration with the European Space Agency (ESA) and with the local farmers. The project started this summer and will last until the end of next year.

What is the project about, and what should the prototype that we will be testing next harvesting season do? From the farmer’s perspective the value of the service can be phrased like this: “I want to know the time when crops are mature and ready to harvest. The earlier I get this information the better.”

Timing the harvest can be challenging if you need to consider all the variables in this equation: for example, available machinery, workforce, grain protein content, grain moisture content, spatial distribution of the fields and weather. Our prototype, which will be based on Sentinel-1 and Sentinel-2 imagery time series and weather data, will not solve this equation fully, but aims to give another source for informed decisions.

We have already collected a reasonable amount of reference data for the neural network prediction model and will start testing the architecture and performance in the coming months. During the next summer, we want to set up a near real-time prototype to test the usability, limits and value with the farmers. We focus on the Northern Europe region and the three most commonly grown crop types: winter wheat, spring barley and winter rapeseed.

We will keep you informed about the process and share the results in future newsletters and blog posts.

Mihkel Järveoja, HaTRY project manager

3. Vision of our Sentinel-1 API services
The Copernicus programme and the Sentinel satellites are a great European success story, as we all know, and this view is shared by the Earth Observation (EO) community. Being inside the community, our job is to recognize the people around us who could potentially benefit from Copernicus data but who do not know such data exists or have never tried to use it.

Only a fraction of the potential Copernicus users take full advantage of the programme. For the large majority, it is still a data source mostly used by scientists and specialized companies to satisfy their own needs. We think it does not have to be this way. The key question here is how to provide easy access to pre-processed data. Take Sentinel-1 for example. The terabytes of carefully processed GRD and SLC files produced by ESA are great, but for the large majority of users they are too heavy and too complex to use. The learning curve to see the first benefits is too steep. Just to see the latest image of your house, field or forest, you should not need to download a 1-4 GB file, unzip it and open it with dedicated EO or GIS desktop software. Everyone should be able to browse the imagery with ease, with one click in a browser.

During the coming months we are planning to develop and release a series of Sentinel-1 value-added services including:
  1. S1 parcel-based time series
  2. S1 coherence and backscatter raster-services
  3. S1 simple visual WMS/WCS
  4. Synthetic Sentinel-2-like image from S1 data

Image: Sentinel-1 IW mode dual polarimetric backscatter image without (left) and with (right) enhanced, detail-preserving noise removal

For all the derivatives, the key words will be professional calibration, noise removal and maximum possible spatial resolution (see an example image above). The integration of the data layers and information products will be as simple as possible, keeping in mind both visual human interpretation and machine-readable API services on both web and desktop platforms.

Let’s make Sentinel-1 data conveniently accessible for everyone! More information will follow in our newsletter and blog during the coming months as we progress with the development.
Kaupo Voormansik, CEO

Newsletter, November 2020

1. Farmers info day in Tartu Observatory
On 11 November, we organised an information day about remote sensing applications in agriculture in cooperation with Tartu Observatory and as part of the RITA project. There were presentations by local farmers and companies who are developing services for farmers, researchers, and the Estonian Paying Agency. There were about 30 participants and, due to COVID-19, most of the audience attended remotely. Talking to the users is the key to developing useful services.

Presentations can be found here.

And the video of the entire event is available here.

We would like to thank all the participants, organizers and presenters, who provided the content and helped to make the info day a success.
Kaupo Voormansik, CEO

2. Grazing detection project with ESA
Grazing detection from Copernicus data for agricultural subsidy checks is one of our latest projects, carried out in collaboration with the European Space Agency (ESA) and with our partner, Czech EO company Gisat s.r.o. The project started in October 2020 and will finish in November 2021.

A grazing detection service is urgently needed for the reform of the monitoring approach of the Common Agricultural Policy (CAP). With the reform, the current method of on-the-spot checks (OTSC) will be gradually replaced with satellite monitoring. One of the most common requirements under CAP that needs to be checked is grassland maintenance, which can be performed by either mowing or grazing. To provide a complete satellite-based grassland monitoring service, grazing detection is needed alongside mowing detection. Without grazing detection, complete grassland maintenance checks cannot be performed with satellite monitoring, as a significant part (~20%) of European grasslands is grazed.

Source: Estonian orthophoto, Land Board

Our aim is to develop a grazing detection methodology based on Copernicus data (Sentinel-1 and Sentinel-2 imagery time series). We are planning to test the methodology with selected European national paying agencies (NPAs), and a further aim is to develop a prototype for public sector applications.
Planned outcomes:
  • Develop grazing detection methodology based on Copernicus data (Sentinel-1 and Sentinel-2 imagery time series).
  • Provide NPA operators with the means to carry out checks on grasslands using EO data, substituting on-the-spot checks of grassland grazing activity with the new EO-based grazing detection methodology.
  • Close the grasslands subsidy checks case for CAP satellite monitoring.
We are currently gathering ground reference data and are still actively looking for reliable sources of data on the number of animals on grassland parcels in Europe. This could be anything from field books or GPS-logged animals to sophisticated databases. So, if you have any information about possible data sources, do not hesitate to contact us. For more information, contact aire.olesk@kappazeta.ee.
Aire Olesk, coordinator and remote sensing specialist

3. Our new research paper
Fresh off the press: a new research paper has been published by KappaZeta and is freely available here. The paper summarizes a large part of our R&D work with the Estonian Paying Agency on satellite monitoring for agricultural subsidy checks during the past five years (2016-2020). The work is based on a large dataset and shows very convincingly the value of Sentinel-1 coherence for agricultural applications in the context of the Common Agricultural Policy and beyond.

We have already shown in several of our research papers that Sentinel-1 (S1) is well suited for a grassland monitoring service, and this has sparked interest in other companies, which have started to acknowledge the potential of radar data. Recently, an article by Sinergise pointed out that mowing detection from space is a good idea and tried it out using Sentinel-2 data. In the conclusions, they noted that, due to cloudy observations, it would be beneficial to also include the more sensitive data from Sentinel-1, kindly referring to our research paper from 2016.

We are glad to see that Sinergise has a very similar approach to KappaZeta: openly sharing knowledge and being open to discussion, feedback and lessons learned.

We truly believe in an open access approach, therefore sharing and teaching each other will ultimately get us all further. You are most welcome to read, share and reference.

Thanks to the whole team and all the partners who helped to make it happen.
Kaupo Voormansik, CEO

Newsletter, October 2020

1. Towards the operational service of crop classification
We are about to finish an R&D project in which we developed and tested a crop classification methodology specifically suited to Estonian agricultural, ecological and climatic conditions. We relied mostly on Sentinel-1 and -2 data and used a neural network approach to distinguish 28 different crop types. The results are promising and the methodology is ready for operational service to automate another part of agricultural monitoring.

Read more: https://kappazeta.ee/blog/towards-the-operational-service-of-crop-classification

Mihkel Järveoja, GIS developer

2. Data splitting challenge
In machine learning applications, labelled data is usually split into training, validation and test sets. The splitting is often performed by randomly sampling the input dataset according to a pre-defined ratio, such as 60% / 30% / 10%. In this example, 60% of the shuffled input samples would be used for training, 30% for validation and the remaining 10% for the test set.
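The plain random split described above can be sketched as follows. This is only the baseline approach, not the custom splitting logic used in our projects; the function name `split_dataset`, the fixed seed and the integer sample IDs are assumptions made for illustration.

```python
import random


def split_dataset(samples, ratios=(0.6, 0.3, 0.1), seed=0):
    """Shuffle samples and split them into training, validation and test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Hypothetical sample IDs standing in for labelled images
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

A split like this ignores class balance and labelling confidence, which is why multiclass datasets often need more than a shuffle and a ratio.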

However, different samples might have different labelling confidence. For multiclass datasets it can be quite a challenge to ensure properly balanced classes throughout the datasets. To tackle these issues, we use a custom splitting logic for each project. The splitting logic supports configurable splitting ratios as well as configurable filters per dataset. It is also possible to configure the upper threshold for under-sampling and lower threshold for an alternative splitting ratio. Regardless of the configurability there tend to be nuances which make it difficult to use a common splitting logic for all projects.

From our perspective, there is a need for a generic framework which would simplify data splitting and make it easy to apply it to projects with different requirements and different types of datasets.

Read more: https://kappazeta.ee/blog/data-splitting-challenge
Indrek Sünter, software developer
Marharyta Domnich, machine learning engineer

3. New cloud mask
In September 2020 we signed a contract with the European Space Agency to develop an open source AI-based cloud mask processor for Sentinel-2. We are using a CNN architecture and the goal is ambitious: to develop the most accurate open source cloud mask for Sentinel-2. To limit the scope, we are concentrating on terrestrial conditions in the Northern European summer season. To speed up the labelling, we are applying active learning and exploiting already labelled data sets. The resulting cloud mask will be free and open. If you wish to contribute, contact us.

A longer cloud mask article will soon be published in our blog: https://kappazeta.ee/blog

Kaupo Voormansik, CEO