Thales Sehn Körting

Information:

Synopsis

Data mining, pattern recognition, image processing, remote sensing. Enjoy!

Episodes

  • What is Data Science? (Part 1)

    11/12/2020 Duration: 25s

    In this podcast I provide a detailed discussion of what Data Science is. In Part 2 I will continue... Follow my podcast: http://anchor.fm/tkorting Subscribe to my YouTube channel: http://youtube.com/tkorting The intro and the final sounds were recorded at my home, using an old clock that belonged to my grandmother. Thanks for listening

  • Is Deep Learning FAIR?

    29/12/2019 Duration: 08min

    Deep Learning articles use benchmarks to measure the quality of their results. However, several benchmarks do not hold the copyright to all of the data they use. So how can we be sure that every paper uses the same benchmark? From https://www.go-fair.org/fair-principles/ we have the description of the FAIR acronym:

    Findable: The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers.
    Accessible: Once the user finds the required data, she/he needs to know how they can be accessed, possibly including authentication and authorisation.
    Interoperable: The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
    Reusable: The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.

    From the article Implementing FAIR Data Pri

  • Do you trust in pretrained Deep Learning models?

    19/11/2019 Duration: 06min

    Several authors rely on transfer learning from pretrained models, arguing that by using well-known datasets available on the internet (e.g. ImageNet), their model will be able to handle a specific problem with a reduced training step. In Remote Sensing, this perspective is also becoming a trend when Deep Learning techniques are used to classify Remote Sensing datasets. In my opinion, the datasets used for pretraining are very different from Remote Sensing targets, mainly in two aspects:

    spatial resolution: a sensor can have ultra high spatial resolution (50cm, for example) or very low resolution (2km for a single pixel), and the edges in all these images are different
    spectral resolution: the datasets found on the internet are composed of color pictures, obtained mainly by phone cameras, which have 3 channels (red, green and blue). In Remote Sensing we can have several spectral channels, such as yellow or red-edge bands (available in WorldView-2), or infra-red channels, available in most of the sat
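
    The two differences above can be put into numbers with a minimal sketch. It assumes square pixels and uses the 50cm and 2km pixel sizes mentioned in the description; the helper names and the count of 8 multispectral bands for WorldView-2 are assumptions added for illustration, not something stated in the episode.

```python
def pixel_footprint_m2(pixel_size_m: float) -> float:
    """Ground area covered by one square pixel, in square metres."""
    return pixel_size_m ** 2


def bands_match(model_input_channels: int, image_bands: int) -> bool:
    """True when an image can be fed to a model without adding or dropping bands."""
    return model_input_channels == image_bands


# Spatial resolution: one pixel at 50 cm versus one pixel at 2 km.
ultra_high_area = pixel_footprint_m2(0.5)
very_low_area = pixel_footprint_m2(2000.0)
area_ratio = very_low_area / ultra_high_area

# Spectral resolution: a model pretrained on RGB pictures expects 3 channels.
RGB_CHANNELS = 3
WORLDVIEW2_MULTISPECTRAL_BANDS = 8  # assumption, not taken from the episode text

print(f"One 2 km pixel covers {area_ratio:,.0f} times the area of one 50 cm pixel")
print("Bands compatible:", bands_match(RGB_CHANNELS, WORLDVIEW2_MULTISPECTRAL_BANDS))
```

    Under these assumptions, a single very low resolution pixel covers the same ground as millions of ultra high resolution pixels, and the band counts do not line up without some adaptation of the input layer or the image.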

  • Are you sure you apply only Data Mining to your database?

    05/11/2019 Duration: 05min

    In this podcast I discuss the (sometimes) wrong use of the term Data Mining, which, according to the paper From Data Mining to Knowledge Discovery in Databases, written in 1996 by Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth, is defined as: Data mining is a step in the KDD process that consists of applying data analysis and discovery algorithms that produce a particular enumeration of patterns (or models) over the data. KDD means Knowledge Discovery in Databases, and is composed of the following steps:

    Data -> (selection) -> Target Data -> (preprocessing) -> Preprocessed Data -> (transformation) -> Transformed Data -> (data mining) -> Patterns -> (interpretation/evaluation) -> Knowledge

    Several authors say Data Mining when they are performing the entire cycle (from Data to Knowledge) and not only the data mining step, which can also be represented by the use of classification/clustering algorithms. The reference paper is available at: https://www.aaai.org/ojs/index.php
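
    A minimal sketch of the KDD chain described above, with one function per step, may help to see that data mining is only one link in it. Everything here is hypothetical: the toy records, the "ndvi" field name, and the trivial mean-threshold rule that stands in for a real classification or clustering algorithm.

```python
from statistics import mean


def select(data, field):
    """Selection: keep only the attribute of interest (the target data)."""
    return [row.get(field) for row in data]


def preprocess(values):
    """Preprocessing: drop missing values."""
    return [v for v in values if v is not None]


def transform(values):
    """Transformation: min-max rescaling to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


def mine(values):
    """Data mining: the single step that runs a pattern-finding algorithm.

    A split around the mean is used here only as a placeholder.
    """
    threshold = mean(values)
    return ["high" if v > threshold else "low" for v in values]


def interpret(patterns):
    """Interpretation/evaluation: summarise the patterns as counts per label."""
    return {label: patterns.count(label) for label in sorted(set(patterns))}


# Hypothetical raw data with one missing value.
data = [{"ndvi": 0.1}, {"ndvi": 0.8}, {"ndvi": None}, {"ndvi": 0.2}, {"ndvi": 0.9}]

knowledge = interpret(mine(transform(preprocess(select(data, "ndvi")))))
print(knowledge)
```

    Only the call to mine() corresponds to data mining in the sense of the definition; the other four functions belong to the wider KDD process.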

  • When the high resolution is not so high...

    28/10/2019 Duration: 05min

    In this podcast I discuss the wrong use of the term Resolution in scientific articles or in the general media. Resolution in Remote Sensing can be used to describe several aspects of images, such as:

    temporal resolution: the time difference between two images of the same place
    spectral resolution: related to the number of bands and wavelengths, such as in Panchromatic, Multispectral, Hyperspectral, or Ultraspectral
    radiometric resolution: the number of bits needed to store a pixel value (e.g. 8 bits in Landsat 7 or 11 bits in WorldView-2)
    spatial resolution: the focus of this podcast, relating to the area represented by a single pixel in an image

    I provide an interesting reference with an easy-to-use table for understanding what can be considered High Spatial Resolution or Low Spatial Resolution:

    Taxonomy of Remote Sensing Systems - Spatial Ground Resolution
    Ultra High: < 1m
    Very High: [1m, 4m]
    High: [4m, 10m]
    Medium: [10m, 50m]
    Low: [50m, 250m]
    Very Low: > 250m

    The reference is: Ehlers
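
    The taxonomy above can be turned into a small lookup, sketched here under one assumption: the table lists closed intervals that share their endpoints (4m, 10m and 50m), so this sketch assigns each shared endpoint to the coarser class. The function name is hypothetical.

```python
def spatial_resolution_class(pixel_size_m: float) -> str:
    """Map a ground pixel size in metres to a class of the taxonomy.

    Assumption: shared interval endpoints (4, 10, 50) go to the coarser class.
    A value of exactly 250 stays in "Low" because "Very Low" is strictly > 250m.
    """
    if pixel_size_m <= 0:
        raise ValueError("pixel size must be positive")
    if pixel_size_m < 1:
        return "Ultra High"
    if pixel_size_m < 4:
        return "Very High"
    if pixel_size_m < 10:
        return "High"
    if pixel_size_m < 50:
        return "Medium"
    if pixel_size_m <= 250:
        return "Low"
    return "Very Low"


for size in (0.5, 2, 30, 2000):
    print(f"{size} m -> {spatial_resolution_class(size)}")
```

    Under this table, a 30m pixel falls in the Medium class rather than High.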

  • Is there an "Almost Perfect" agreement in a classification?

    21/10/2019 Duration: 07min

    I discuss the extensive use of the table Strength of Agreement, based on different Kappa values, provided by: Landis, J.R. and Koch, G.G., 1977. The measurement of observer agreement for categorical data. Biometrics, pp.159-174. According to Google Scholar, this paper has more than 53,000 citations (as of October 2019). In my opinion, this table has sometimes been used for a purpose different from that of the original paper, whose authors themselves refer to "an example involving only two observers" and state that "these divisions are clearly arbitrary". The original paper is available at https://www.jstor.org/stable/pdf/2529310.pdf
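
    For readers who have not computed Kappa by hand, here is a minimal sketch of Cohen's Kappa from a confusion matrix, followed by the labelling step the episode questions. The thresholds are the ones commonly attributed to Landis and Koch (1977); they are not listed in the episode description, so treat them as an assumption. The confusion matrix and the function names are hypothetical.

```python
def cohen_kappa(matrix):
    """Cohen's Kappa for a square confusion matrix given as a list of rows."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    if expected == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Strength-of-agreement label (thresholds assumed, and arbitrary by the authors' own words)."""
    if kappa < 0:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost Perfect"


# Hypothetical two-class confusion matrix (rows: reference, columns: classification).
confusion = [[45, 5], [10, 40]]
kappa = cohen_kappa(confusion)
print(round(kappa, 2), landis_koch_label(kappa))
```

    The label is a convenience for reporting; the number itself, together with the confusion matrix, says more about the classification than the label does.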

  • What is the origin of the term "Big Data"?

    17/10/2019 Duration: 05min

    I discuss the article by Doug Laney, published in 2001, entitled 3D Data Management: Controlling Data Volume, Velocity, and Variety. This paper is one of the bases for the definition of the term "Big Data". Curiously, the explicit term "Big Data" does not appear in the text, but the author explains a 3D interpretation of a database, which grows in Volume, Velocity and Variety, the well-known 3 V's.

  • Unsupervised classification exists?

    11/10/2019 Duration: 08min

    I explain what an unsupervised classification algorithm is and what a supervised algorithm is. I use examples from remote sensing image classification and give my opinion about the unsupervised algorithms, which are in fact similar to the supervised ones. Take as an example the well-known unsupervised K-Means algorithm. The analyst must provide a priori the most important parameter to run the algorithm, the K value. I have a video about the K-Means algorithm.
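
    The point about K can be seen in a minimal one-dimensional K-Means sketch: the function cannot run without k, and the number of classes in the result is whatever the analyst asked for. The pixel values, the function name and the deterministic initialisation (centers spread evenly over the sorted data) are assumptions made for illustration, not the method from the video.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D K-Means. Returns (centers, labels), with labels in input order."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]

    def nearest(value):
        # Index of the closest center; ties go to the lower index.
        return min(range(k), key=lambda c: abs(value - centers[c]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            clusters[nearest(value)].append(value)
        # Move each center to the mean of its members; keep it if it has none.
        centers = [
            sum(members) / len(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers, [nearest(value) for value in values]


# Hypothetical pixel values with two visually obvious groups.
pixels = [1, 2, 3, 10, 11, 12]

centers2, labels2 = kmeans_1d(pixels, k=2)
centers3, labels3 = kmeans_1d(pixels, k=3)
print(labels2)
print(labels3)
```

    The same data yields two classes or three classes depending only on the k chosen by the analyst, which is the sense in which the unsupervised algorithm still depends on prior human input.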

  • Waiting feedback - data mining, deep learning, remote sensing, image processing, pattern recognition

    02/10/2019 Duration: 03min

    This is a first message to check whether someone will find my podcast and be interested in it. Waiting for feedback on remote sensing, image processing, data mining, deep learning, data augmentation, sample selection, articles, papers, etc.