\[ \mbox{Machine Learning with Python} \]
%=======================================================================%
\textbf{Machine Learning}
* Machine Learning is a discipline concerned with algorithms that find patterns in data and make predictions from it.
* It is nearly ubiquitous today, used in everything from web searches to financial forecasts to studies of the nature of the Universe.
* This workshop introduces scikit-learn, a Python machine learning package, and the central concepts of Machine Learning.
%=======================================================================%
\textbf{Machine Learning}\\
We will introduce the basic categories of learning problems and how to implement them using scikit-learn.
% From this foundation, we will explore practical examples of machine learning using real-world data, from handwriting analysis to automated classification of astronomical images.
* Regression: predicting numeric values.
* Classification: predicting categories.
* Clustering: assigning instances to groups.
%=======================================================================%
\textbf{Getting ready}
The datasets bundled with scikit-learn are contained in the \texttt{datasets} module. Use the following
commands to import that module, together with NumPy:
\begin{verbatim}
>>> from sklearn import datasets
>>> import numpy as np
\end{verbatim}
\end{framed}
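As a minimal sketch of what the import gives access to, the snippet below loads one of the bundled datasets. The choice of the iris dataset and the variable names \texttt{X} and \texttt{y} are assumptions made for illustration; they do not come from the original notes.

```python
from sklearn import datasets
import numpy as np

# Load the bundled iris dataset (an illustrative choice, not from the notes).
iris = datasets.load_iris()

X = iris.data    # feature matrix, shape (n_samples, n_features)
y = iris.target  # integer class labels, shape (n_samples,)

print(X.shape)       # number of samples and features
print(np.unique(y))  # distinct class labels
```

Every loader in the \texttt{datasets} module follows the same pattern, so a different dataset can be substituted by changing only the loader call.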
%======================================================================== %
\begin{figure}
\centering
\includegraphics[width=1.2\linewidth]{images/SKLsite}
\end{figure}
%=======================================================================%
\textbf{Classification}
* \textbf{Description:} Identifying the category to which an object belongs.
* \textbf{Applications:} Spam detection, image recognition.
* \textbf{Algorithms:} SVM, nearest neighbors, random forest, ...
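The classification workflow can be sketched as follows, assuming the bundled iris dataset and a support vector machine with default settings. The dataset, the 75/25 split, and the random seed are illustrative assumptions, not choices made in the original notes.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Illustrative data: iris flowers, three classes.
X, y = datasets.load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a support vector classifier with default hyperparameters.
clf = SVC()
clf.fit(X_train, y_train)

# Predict a category for each held-out sample and measure accuracy.
y_pred = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)
print(accuracy)
```

Other classifiers, such as nearest neighbors or a random forest, expose the same \texttt{fit}, \texttt{predict} and \texttt{score} methods, so they can be swapped in without changing the rest of the code.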
%=======================================================================%
\textbf{Regression}
* \textbf{Description:} Predicting a continuous-valued attribute associated with an object.
* \textbf{Applications:} Drug response, Stock prices.
* \textbf{Algorithms:} SVR, ridge regression, Lasso, ...
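A regression model is fitted in the same way as a classifier, but predicts a continuous value. The sketch below assumes the bundled diabetes dataset and ridge regression with \texttt{alpha=1.0}; both are illustrative assumptions rather than settings taken from the notes.

```python
from sklearn import datasets
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Illustrative data: ten numeric features, one continuous target.
X, y = datasets.load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Ridge regression: least squares with an L2 penalty on the coefficients.
reg = Ridge(alpha=1.0)
reg.fit(X_train, y_train)

# Predict continuous values and report R^2 on the held-out data.
y_pred = reg.predict(X_test)
r2 = reg.score(X_test, y_test)
print(r2)
```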
%=======================================================================%
\textbf{Clustering}\\
Automatic grouping of similar objects into sets.
\begin{description}
\item[Applications:] Customer segmentation, grouping experiment outcomes.
\item[Algorithms:] k-Means, spectral clustering, mean-shift, ...
\end{description}
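As a minimal sketch of clustering, the example below runs k-Means on synthetic data. The generated blobs, the choice of three clusters, and the random seed are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data drawn around three centres (illustrative only).
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Group the points into three clusters; no labels are used during fitting.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)

labels = kmeans.labels_            # cluster index assigned to each sample
centers = kmeans.cluster_centers_  # coordinates of each cluster centre
print(centers)
```

Unlike classification, the number of groups has to be chosen by the user here, since the data carries no category labels.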
%=======================================================================%
\textbf{Dimensionality Reduction}\\
* \textbf{Description:} Reducing the number of random variables to consider.
* \textbf{Applications:} Visualization, increased efficiency.
* \textbf{Algorithms:} PCA, feature selection, non-negative matrix factorization.
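The sketch below uses PCA to project data onto two components, which is a common first step for visualization. The iris dataset and the choice of two components are illustrative assumptions.

```python
from sklearn import datasets
from sklearn.decomposition import PCA

# Illustrative data: four features per sample.
X, y = datasets.load_iris(return_X_y=True)

# Project the four original features onto the two leading principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Fraction of the total variance retained by each component.
explained = pca.explained_variance_ratio_
print(X_2d.shape)
print(explained)
```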
%=======================================================================%
\textbf{Model selection}\\
* \textbf{Description:} Comparing, validating and choosing parameters and models.
* \textbf{Goal:} Improved accuracy via parameter tuning.
* \textbf{Modules:} grid search, cross-validation, metrics.
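Grid search and cross-validation can be combined in a single object. The sketch below tunes the \texttt{C} parameter of a support vector classifier with 5-fold cross-validation; the dataset, the candidate values and the number of folds are all illustrative assumptions.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# Candidate values for the regularisation parameter C (illustrative).
param_grid = {"C": [0.1, 1, 10]}

# Evaluate every candidate with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_  # parameter setting with the best mean score
best_score = search.best_score_    # mean cross-validated accuracy of that setting
print(best_params, best_score)
```

After fitting, \texttt{search} behaves like the best estimator refitted on all the data, so \texttt{search.predict} can be called directly.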
%=======================================================================%
\textbf{Preprocessing}\\
* \textbf{Description:} Feature extraction and normalization.
* \textbf{Application:} Transforming input data such as text for use with machine learning algorithms.
* \textbf{Modules:} preprocessing, feature extraction.
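The two modules can be sketched with one example each: \texttt{StandardScaler} for normalization and \texttt{CountVectorizer} for turning text into numeric features. The toy numbers and the two short documents are made up for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import StandardScaler

# Normalization: rescale each column to zero mean and unit variance.
raw = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = StandardScaler().fit_transform(raw)

# Feature extraction: convert text documents into word-count vectors.
docs = ["spam spam offer", "meeting at noon"]  # toy documents
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)  # sparse matrix, one row per document

vocabulary = vectorizer.vocabulary_  # maps each word to its column index
print(scaled)
print(counts.toarray())
```

The resulting arrays can be passed to any scikit-learn estimator in place of the raw input.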
\end{document}