Built site for gh-pages
Quarto GHA Workflow Runner committed Aug 15, 2024
1 parent 3447848 commit e990d28
Showing 7 changed files with 199 additions and 194 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
d65191bf
b3af616c
34 changes: 17 additions & 17 deletions chapters/01_classification.html
@@ -300,7 +300,7 @@ <h2 id="toc-title">Table of contents</h2>
<section id="data-acquisition" class="level2" data-number="1.1">
<h2 data-number="1.1" class="anchored" data-anchor-id="data-acquisition"><span class="header-section-number">1.1</span> Data Acquisition</h2>
<p>In this chapter, we will employ machine learning techniques to classify a scene using satellite imagery. Specifically, we will utilize <code>scikit-learn</code> to implement two distinct classifiers and subsequently compare their results. To begin, we need to import the following modules.</p>
<div id="9608a55e" class="cell" data-execution_count="1">
<div id="3628cdbc" class="cell" data-execution_count="1">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> datetime <span class="im">import</span> datetime, timedelta</span>
@@ -337,7 +337,7 @@ <h2 data-number="1.1" class="anchored" data-anchor-id="data-acquisition"><span c
<section id="searching-in-the-catalog" class="level3" data-number="1.1.1">
<h3 data-number="1.1.1" class="anchored" data-anchor-id="searching-in-the-catalog"><span class="header-section-number">1.1.1</span> Searching in the Catalog</h3>
<p>The module <code>odc-stac</code> provides access to free, open-source satellite data. To retrieve the data, we must define several parameters that specify the location and time period for the satellite data. Additionally, we must specify the data collection we wish to access, as multiple collections are available. In this example, we will use multispectral imagery from the Sentinel-2 satellite.</p>
<div id="68524a6c" class="cell" data-execution_count="2">
<div id="878e1dec" class="cell" data-execution_count="2">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>dx <span class="op">=</span> <span class="fl">0.0006</span> <span class="co"># 60m resolution</span></span>
@@ -390,7 +390,7 @@ <h3 data-number="1.1.1" class="anchored" data-anchor-id="searching-in-the-catalo
<h3 data-number="1.1.2" class="anchored" data-anchor-id="loading-the-data"><span class="header-section-number">1.1.2</span> Loading the Data</h3>
<p>Now we will load the data directly into an <code>xarray</code> dataset, which we can use to perform computations on the data. <code>xarray</code> is a powerful library for working with multi-dimensional arrays, making it well-suited for handling satellite data.</p>
<p>Here’s how we can load the data using odc-stac and xarray:</p>
<div id="06e2982e" class="cell" data-execution_count="3">
<div id="dc75e776" class="cell" data-execution_count="3">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="co"># define a geobox for my region</span></span>
@@ -417,7 +417,7 @@ <h2 data-number="1.2" class="anchored" data-anchor-id="data-visualization"><span
<h3 data-number="1.2.1" class="anchored" data-anchor-id="rgb-image"><span class="header-section-number">1.2.1</span> RGB Image</h3>
<p>With the image data now in our possession, we can proceed with computations and visualizations.</p>
<p>First, we define a mask to exclude cloud cover and areas with missing data. Subsequently, we create a composite median image, where each pixel value represents the median value across all the scenes we have identified. This approach helps to eliminate clouds and outliers present in some of the images, thereby providing a clearer and more representative visualization of the scene.</p>
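<p>The masked median composite described above can be sketched in plain Python. This is only an illustration of the idea, not the chapter's actual code: the function name <code>median_composite</code>, the nested-list layout, and the sample reflectance values are all assumptions made for this example.</p>

```python
from statistics import median


def median_composite(scenes, valid):
    """Per-pixel median over time, ignoring masked observations.

    scenes: list of 2-D grids, one per acquisition date
    valid:  list of 2-D boolean grids, True where the pixel is usable
    """
    rows, cols = len(scenes[0]), len(scenes[0][0])
    composite = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect only the observations that passed the mask.
            values = [s[r][c] for s, v in zip(scenes, valid) if v[r][c]]
            # Leave the pixel empty if every observation was masked out.
            composite[r][c] = median(values) if values else None
    return composite


# Three hypothetical 1x2 scenes; the second scene has a bright cloud at pixel 0.
scenes = [[[0.10, 0.30]], [[0.90, 0.32]], [[0.12, 0.34]]]
valid = [[[True, True]], [[False, True]], [[True, True]]]
composite = median_composite(scenes, valid)
```

<p>Because the cloudy observation is masked before the median is taken, it does not influence the first pixel at all; even unmasked outliers would have limited effect, since a median ignores extreme values.</p>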
<div id="16fba8ce" class="cell" data-execution_count="4">
<div id="126b59e8" class="cell" data-execution_count="4">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="co"># define a mask for valid pixels (non-cloud)</span></span>
@@ -460,7 +460,7 @@ <h3 data-number="1.2.1" class="anchored" data-anchor-id="rgb-image"><span class=
<section id="false-color-image" class="level3" data-number="1.2.2">
<h3 data-number="1.2.2" class="anchored" data-anchor-id="false-color-image"><span class="header-section-number">1.2.2</span> False Color Image</h3>
<p>In addition to the regular RGB image, we can swap any of the bands from the visible spectrum with other bands. In this case, the red band has been replaced by the near-infrared band. This allows us to see vegetated areas more clearly, since they now appear bright red: plants absorb visible red light while reflecting near-infrared light <span class="citation" data-cites="nasa2020">(<a href="references.html#ref-nasa2020" role="doc-biblioref">NASA 2020</a>)</span>.</p>
<div id="108b167a" class="cell" data-execution_count="5">
<div id="22bb45e8" class="cell" data-execution_count="5">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="co"># compute the false color image</span></span>
@@ -502,7 +502,7 @@ <h3 data-number="1.2.3" class="anchored" data-anchor-id="ndvi-image"><span class
<li>0.33 to 0.66 are moderately healthy plants</li>
<li>0.66 to 1 are very healthy plants</li>
</ul>
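<p>As a small worked example, NDVI is computed per pixel as (NIR - Red) / (NIR + Red). The sketch below uses plain Python and made-up reflectance values; the helper names <code>ndvi</code> and <code>describe</code> are hypothetical, and only the two upper classes visible in the list above are encoded. The boundary value 0.66 is assigned to the upper class here, which is an assumption.</p>

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return (nir - red) / total if total != 0 else float("nan")


def describe(value):
    """Map an NDVI value to the two upper classes listed above."""
    if 0.66 <= value <= 1:
        return "very healthy plants"
    if 0.33 <= value < 0.66:
        return "moderately healthy plants"
    return "below 0.33"


# Hypothetical (nir, red) reflectance pairs.
dense_canopy = ndvi(0.5, 0.1)
sparse_canopy = ndvi(0.4, 0.2)
bare_surface = ndvi(0.3, 0.3)
```

<p>Values close to 1 arise when near-infrared reflectance is much higher than red reflectance, which matches the behaviour of healthy vegetation described for the false color image.</p>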
<div id="238cc383" class="cell" data-execution_count="6">
<div id="0236ace2" class="cell" data-execution_count="6">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Normalized Difference Vegetation Index (NDVI)</span></span>
@@ -529,7 +529,7 @@ <h2 data-number="1.3" class="anchored" data-anchor-id="classification"><span cla
<section id="regions-of-interest" class="level3" data-number="1.3.1">
<h3 data-number="1.3.1" class="anchored" data-anchor-id="regions-of-interest"><span class="header-section-number">1.3.1</span> Regions of Interest</h3>
<p>Since this is a supervised classification, we need training data. We therefore define regions that we are certain represent the feature we are classifying. In this case, we are interested in forested areas and in regions that are definitely not forested. These regions will be used to train our classifiers.</p>
<div id="a5b69a43" class="cell" data-execution_count="7">
<div id="496ee6e9" class="cell" data-execution_count="7">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Define Polygons</span></span>
@@ -581,7 +581,7 @@ <h3 data-number="1.3.1" class="anchored" data-anchor-id="regions-of-interest"><s
<section id="data-preparation" class="level3" data-number="1.3.2">
<h3 data-number="1.3.2" class="anchored" data-anchor-id="data-preparation"><span class="header-section-number">1.3.2</span> Data Preparation</h3>
<p>In addition to the Regions of Interest we will extract the specific bands from the loaded dataset that we intend to use for the classification, which are the <code>red, green, blue</code> and <code>near-infrared</code> bands, although other bands can also be utilized. Using these bands, we will create both a training and a testing dataset. The training dataset will be used to train the classifier, while the testing dataset will be employed to evaluate its performance.</p>
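<p>The training/testing split can be illustrated with a minimal, standard-library sketch. In practice a helper such as scikit-learn's <code>train_test_split</code> is the usual choice; the function <code>split_samples</code>, the 30% test fraction, the seed, and the sample pixel values below are assumptions for illustration and are not taken from the chapter's code.</p>

```python
import random


def split_samples(samples, labels, test_fraction=0.3, seed=42):
    """Shuffle labelled pixels and split them into training and testing sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


# Hypothetical pixels as (red, green, blue, nir) reflectances; 1 = forest, 0 = not forest.
samples = [(0.05 + 0.01 * i, 0.08, 0.04, 0.40 - 0.02 * i) for i in range(10)]
labels = [1] * 5 + [0] * 5
(X_train, y_train), (X_test, y_test) = split_samples(samples, labels)
```

<p>Keeping the testing pixels out of training is what makes the later evaluation meaningful: the classifier is scored on samples it has never seen.</p>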
<div id="2ee903e3" class="cell" data-execution_count="8">
<div id="08406d34" class="cell" data-execution_count="8">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Classification dataset (only necessary bands)</span></span>
@@ -628,7 +628,7 @@ <h3 data-number="1.3.2" class="anchored" data-anchor-id="data-preparation"><span
</details>
</div>
<p>Now that we have prepared the training and testing data, we will create an image array of the actual scene that we intend to classify. This array will serve as the input for our classification algorithms, allowing us to apply the trained classifiers to the entire scene and identify the forested and non-forested areas accurately.</p>
<div id="1beba729" class="cell" data-execution_count="9">
<div id="027bf00f" class="cell" data-execution_count="9">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>image_data <span class="op">=</span> ds_class[bands].to_array(dim<span class="op">=</span><span class="st">'band'</span>).transpose(<span class="st">'latitude'</span>, <span class="st">'longitude'</span>, <span class="st">'band'</span>)</span>
@@ -644,7 +644,7 @@ <h3 data-number="1.3.2" class="anchored" data-anchor-id="data-preparation"><span
<h3 data-number="1.3.3" class="anchored" data-anchor-id="classifiying-with-naive-bayes"><span class="header-section-number">1.3.3</span> Classifying with Naive Bayes</h3>
<p>Now that we have prepared all the needed data, we can begin the actual classification process.</p>
<p>We will start with a <em>Naive Bayes</em> classifier. First, we will train the classifier using our training dataset. Once trained, we will apply the classifier to the actual image to identify the forested and non-forested areas.</p>
<div id="2e7d7060" class="cell" data-execution_count="10">
<div id="6898729d" class="cell" data-execution_count="10">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Naive Bayes initialization and training</span></span>
@@ -661,7 +661,7 @@ <h3 data-number="1.3.3" class="anchored" data-anchor-id="classifiying-with-naive
</details>
</div>
<p>To evaluate the effectiveness of the classification, we will plot the image predicted by the classifier. Additionally, we will examine the <code>Classification Report</code> and the <code>Confusion Matrix</code> to gain further insights into the classifier’s performance.</p>
<div id="82059783" class="cell" data-execution_count="11">
<div id="fb0d9552" class="cell" data-execution_count="11">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Plot Naive Bayes</span></span>
@@ -735,7 +735,7 @@ <h3 data-number="1.3.3" class="anchored" data-anchor-id="classifiying-with-naive
<section id="classifiying-with-random-forest" class="level3" data-number="1.3.4">
<h3 data-number="1.3.4" class="anchored" data-anchor-id="classifiying-with-random-forest"><span class="header-section-number">1.3.4</span> Classifying with Random Forest</h3>
<p>To ensure our results are robust, we will explore an additional classifier. In this section, we will use the Random Forest classifier. The procedure for using this classifier is the same as before: we will train the classifier using our training dataset and then apply it to the actual image to classify the scene.</p>
<div id="12454c17" class="cell" data-execution_count="12">
<div id="75e261e8" class="cell" data-execution_count="12">
<details open="" class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Random Forest initialization and training</span></span>
@@ -804,8 +804,8 @@ <h3 data-number="1.3.4" class="anchored" data-anchor-id="classifiying-with-rando
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">Actual Positive</td>
<td>284</td>
<td>5203</td>
<td>274</td>
<td>5213</td>
</tr>
</tbody>
</table>
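<p>To put the changed confusion matrix row into perspective, the recall for the positive class can be computed from its two cells. The sketch below assumes the columns are ordered "Predicted Negative", "Predicted Positive" (the table header is not visible in this diff), so the first cell is treated as false negatives and the second as true positives.</p>

```python
def recall(false_negatives, true_positives):
    """Share of actual positives that the classifier labelled as positive."""
    return true_positives / (false_negatives + true_positives)


# "Actual Positive" row before and after this commit (column order assumed).
recall_before = recall(284, 5203)
recall_after = recall(274, 5213)
```

<p>Under that assumption, recall moves from roughly 0.948 to roughly 0.950. A shift of ten samples between otherwise identical runs would be consistent with the randomness inherent in training a Random Forest, although the diff itself does not state the cause.</p>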
@@ -818,7 +818,7 @@ <h3 data-number="1.3.4" class="anchored" data-anchor-id="classifiying-with-rando
<section id="comparison-of-the-classificators" class="level3" data-number="1.3.5">
<h3 data-number="1.3.5" class="anchored" data-anchor-id="comparison-of-the-classificators"><span class="header-section-number">1.3.5</span> Comparison of the Classifiers</h3>
<p>To gain a more in-depth understanding of the classifiers’ performance, we will compare their results. Specifically, we will identify the areas where both classifiers agree and the areas where they disagree. This comparison will provide valuable insights into the strengths and weaknesses of each classifier, allowing us to better assess their effectiveness in identifying forested and non-forested regions.</p>
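<p>One simple way to build such an agreement map is to combine the two binary predictions into a single code per pixel. The sketch below follows the encoding named in the chapter's plotting code (None = 0, Naive Bayes = 1, Random Forest = 2, Both = 3); the function <code>agreement_code</code> and the tiny prediction grids are hypothetical.</p>

```python
def agreement_code(nb_forest, rf_forest):
    """Combine two binary forest predictions into one code.

    0 = neither classifier, 1 = Naive Bayes only,
    2 = Random Forest only, 3 = both classifiers.
    """
    return int(nb_forest) + 2 * int(rf_forest)


# Hypothetical 2x3 prediction grids (1 = forest, 0 = not forest).
nb = [[1, 0, 1], [0, 0, 1]]
rf = [[1, 0, 0], [0, 1, 1]]

combined = [
    [agreement_code(a, b) for a, b in zip(row_nb, row_rf)]
    for row_nb, row_rf in zip(nb, rf)
]
```

<p>Weighting the second prediction by 2 keeps all four combinations distinct, so a four-color map can show agreement and disagreement in a single image.</p>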
<div id="37888f1b" class="cell" data-execution_count="13">
<div id="ebcf234c" class="cell" data-execution_count="13">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a>cmap_trio <span class="op">=</span> colors.ListedColormap([<span class="st">'whitesmoke'</span> ,<span class="st">'indianred'</span>, <span class="st">'goldenrod'</span>, <span class="st">'darkgreen'</span>])</span>
@@ -845,7 +845,7 @@ <h3 data-number="1.3.5" class="anchored" data-anchor-id="comparison-of-the-class
</div>
</div>
<p>The areas where both classifiers agree include the larger forested regions, such as the <em>Nationalpark Donau-Auen</em> and the <em>Leithagebirge</em>. Additionally, both classifiers accurately identified the urban areas of Vienna and correctly excluded them from being classified as forested.</p>
<div id="fc857846" class="cell" data-execution_count="14">
<div id="5e11e964" class="cell" data-execution_count="14">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Plot only one class, either None (0), Naive Bayes (1), Random Forest (2), or Both (3)</span></span>
@@ -872,7 +872,7 @@ <h3 data-number="1.3.5" class="anchored" data-anchor-id="comparison-of-the-class
<p>When plotting the classified areas individually, we observe that the Random Forest classifier mistakenly identified the Danube River as a forested area. Conversely, the Naive Bayes classifier erroneously classified a significant amount of cropland as forest.</p>
<p>Finally, by analyzing the proportion of forested areas within the scene, we find that approximately 18% of the area is classified as forest by both classifiers, while around 66% is classified as non-forest by both. The remaining roughly 16%, which includes water bodies and cropland, was labeled as forest by only one of the two classifiers.</p>
<p>The accompanying bar chart illustrates the distribution of these classifications, highlighting the percentage of forested areas, non-forested areas, and regions classified by only one of the two classifiers. This visual representation helps to quantify the areas of agreement and disagreement between the classifiers, providing a clearer picture of their performance.</p>
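<p>The percentages behind such a chart can be derived by counting how often each class code occurs. The following sketch assumes per-pixel codes of 0 (neither classifier), 1 (Naive Bayes only), 2 (Random Forest only), and 3 (both); the ten example codes are invented and do not reproduce the figures reported above.</p>

```python
from collections import Counter

# Hypothetical flattened agreement codes for ten pixels.
codes = [0, 0, 0, 3, 1, 2, 0, 3, 0, 0]

counts = Counter(codes)
# Percentage of the scene falling into each of the four classes.
shares = {code: 100 * counts[code] / len(codes) for code in range(4)}
```

<p>Iterating over <code>range(4)</code> rather than over the observed codes ensures that a class with no pixels still appears in the result with a share of zero.</p>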
<div id="9b88b147" class="cell" data-execution_count="15">
<div id="bc0468cd" class="cell" data-execution_count="15">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a>counts <span class="op">=</span> {}</span>
Binary file modified chapters/01_classification_files/figure-html/cell-13-output-1.png
Binary file modified chapters/01_classification_files/figure-html/cell-14-output-1.png
Binary file modified chapters/01_classification_files/figure-html/cell-15-output-1.png