package mlpack

/*
#cgo CFLAGS: -I./capi -Wall
#cgo LDFLAGS: -L. -lmlpack_go_cf
#include <capi/cf.h>
#include <stdlib.h>
*/
import "C"

import "gonum.org/v1/gonum/mat"

type CfOptionalParam struct {
	Algorithm                string
	AllUserRecommendations   bool
	InputModel               *cfModel
	Interpolation            string
	IterationOnlyTermination bool
	MaxIterations            int
	MinResidue               float64
	NeighborSearch           string
	Neighborhood             int
	Normalization            string
	Query                    *mat.Dense
	Rank                     int
	Recommendations          int
	Seed                     int
	Test                     *mat.Dense
	Training                 *mat.Dense
	Verbose                  bool
}

func CfOptions() *CfOptionalParam {
	return &CfOptionalParam{
		Algorithm:                "NMF",
		AllUserRecommendations:   false,
		InputModel:               nil,
		Interpolation:            "average",
		IterationOnlyTermination: false,
		MaxIterations:            1000,
		MinResidue:               1e-05,
		NeighborSearch:           "euclidean",
		Neighborhood:             5,
		Normalization:            "none",
		Query:                    nil,
		Rank:                     0,
		Recommendations:          5,
		Seed:                     0,
		Test:                     nil,
		Training:                 nil,
		Verbose:                  false,
	}
}

/*
This program performs collaborative filtering (CF) on the given dataset. Given
a list of (user, item, preference) triples (the "Training" parameter), the
program performs a matrix decomposition and can then perform a series of
actions related to collaborative filtering. Alternatively, the program can
load an existing saved CF model with the "InputModel" parameter and then use
that model to provide recommendations or predict values.

The input matrix should hold one rating per data point, with three dimensions
per point: the first dimension is the user, the second dimension is the item,
and the third dimension is that user's rating of that item. Both the users and
items should be numeric indices, not names. The indices are assumed to start
from 0.

A set of query users for which recommendations can be generated may be
specified with the "Query" parameter; alternatively, recommendations may be
generated for every user in the dataset by specifying the
"AllUserRecommendations" parameter. In addition, the number of
recommendations to generate per user can be specified with the
"Recommendations" parameter, and the number of similar users (the size of the
neighborhood) to consider when generating recommendations can be specified
with the "Neighborhood" parameter.

For performing the matrix decomposition, the following optimization algorithms
can be specified via the "Algorithm" parameter:

  - 'RegSVD' -- Regularized SVD using a SGD optimizer
  - 'NMF' -- Non-negative matrix factorization with alternating least squares
    update rules
  - 'BatchSVD' -- SVD batch learning
  - 'SVDIncompleteIncremental' -- SVD incomplete incremental learning
  - 'SVDCompleteIncremental' -- SVD complete incremental learning
  - 'BiasSVD' -- Bias SVD using a SGD optimizer
  - 'SVDPP' -- SVD++ using a SGD optimizer
  - 'RandSVD' -- Randomized SVD learning
  - 'QSVD' -- QUIC-SVD learning
  - 'BKSVD' -- Block Krylov SVD learning

The following neighbor search algorithms can be specified via the
"NeighborSearch" parameter:

  - 'cosine' -- Cosine Search Algorithm
  - 'euclidean' -- Euclidean Search Algorithm
  - 'pearson' -- Pearson Search Algorithm

The following weight interpolation algorithms can be specified via the
"Interpolation" parameter:

  - 'average' -- Average Interpolation Algorithm
  - 'regression' -- Regression Interpolation Algorithm
  - 'similarity' -- Similarity Interpolation Algorithm

The following rating normalization algorithms can be specified via the
"Normalization" parameter:

  - 'none' -- No Normalization
  - 'item_mean' -- Item Mean Normalization
  - 'overall_mean' -- Overall Mean Normalization
  - 'user_mean' -- User Mean Normalization
  - 'z_score' -- Z-Score Normalization

A trained model may be saved with the "OutputModel" output parameter.

To train a CF model on a dataset training_set using NMF for decomposition and
save the trained model to model, one could call:

	// Initialize optional parameters for Cf().
	param := mlpack.CfOptions()
	param.Training = training_set
	param.Algorithm = "NMF"

	_, model := mlpack.Cf(param)

Then, to use this model to generate recommendations for the list of users in
the query set users, storing 5 recommendations per user in recommendations,
one could call:

	// Initialize optional parameters for Cf().
	param := mlpack.CfOptions()
	param.InputModel = &model
	param.Query = users
	param.Recommendations = 5

	recommendations, _ := mlpack.Cf(param)

Input parameters:

  - Algorithm (string): Algorithm used for matrix factorization. Default
    value 'NMF'.
  - AllUserRecommendations (bool): Generate recommendations for all
    users.
  - InputModel (*cfModel): Trained CF model to load.
  - Interpolation (string): Algorithm used for weight interpolation.
    Default value 'average'.
  - IterationOnlyTermination (bool): Terminate only when the maximum
    number of iterations is reached.
  - MaxIterations (int): Maximum number of iterations. If set to zero,
    there is no limit on the number of iterations. Default value 1000.
  - MinResidue (float64): Residue required to terminate the factorization
    (lower values generally mean better fits). Default value 1e-05.
  - NeighborSearch (string): Algorithm used for neighbor search. Default
    value 'euclidean'.
  - Neighborhood (int): Size of the neighborhood of similar users to
    consider for each query user. Default value 5.
  - Normalization (string): Normalization performed on the ratings.
    Default value 'none'.
  - Query (*mat.Dense): List of query users for which recommendations
    should be generated.
  - Rank (int): Rank of decomposed matrices (if 0, a heuristic is used to
    estimate the rank). Default value 0.
  - Recommendations (int): Number of recommendations to generate for each
    query user. Default value 5.
  - Seed (int): Random seed (if 0, std::time(NULL) is used). Default
    value 0.
  - Test (*mat.Dense): Test set to calculate RMSE on.
  - Training (*mat.Dense): Input dataset to perform CF on.
  - Verbose (bool): Display informational messages and the full list of
    parameters and timers at the end of execution.

Output parameters:

  - output (*mat.Dense): Matrix that will store output recommendations.
  - outputModel (cfModel): Output for trained CF model.
*/
func Cf(param *CfOptionalParam) (*mat.Dense, cfModel) {
	params := getParams("cf")
	timers := getTimers()

	disableBacktrace()
	disableVerbose()

	// Detect if the parameter was passed; set if so.
	if param.Algorithm != "NMF" {
		setParamString(params, "algorithm", param.Algorithm)
		setPassed(params, "algorithm")
	}

	// Detect if the parameter was passed; set if so.
	if param.AllUserRecommendations {
		setParamBool(params, "all_user_recommendations", param.AllUserRecommendations)
		setPassed(params, "all_user_recommendations")
	}

	// Detect if the parameter was passed; set if so.
	if param.InputModel != nil {
		setCFModel(params, "input_model", param.InputModel)
		setPassed(params, "input_model")
	}

	// Detect if the parameter was passed; set if so.
	if param.Interpolation != "average" {
		setParamString(params, "interpolation", param.Interpolation)
		setPassed(params, "interpolation")
	}

	// Detect if the parameter was passed; set if so.
	if param.IterationOnlyTermination {
		setParamBool(params, "iteration_only_termination", param.IterationOnlyTermination)
		setPassed(params, "iteration_only_termination")
	}

	// Detect if the parameter was passed; set if so.
	if param.MaxIterations != 1000 {
		setParamInt(params, "max_iterations", param.MaxIterations)
		setPassed(params, "max_iterations")
	}

	// Detect if the parameter was passed; set if so.
	if param.MinResidue != 1e-05 {
		setParamDouble(params, "min_residue", param.MinResidue)
		setPassed(params, "min_residue")
	}

	// Detect if the parameter was passed; set if so.
	if param.NeighborSearch != "euclidean" {
		setParamString(params, "neighbor_search", param.NeighborSearch)
		setPassed(params, "neighbor_search")
	}

	// Detect if the parameter was passed; set if so.
	if param.Neighborhood != 5 {
		setParamInt(params, "neighborhood", param.Neighborhood)
		setPassed(params, "neighborhood")
	}

	// Detect if the parameter was passed; set if so.
	if param.Normalization != "none" {
		setParamString(params, "normalization", param.Normalization)
		setPassed(params, "normalization")
	}

	// Detect if the parameter was passed; set if so.
	if param.Query != nil {
		gonumToArmaUmat(params, "query", param.Query)
		setPassed(params, "query")
	}

	// Detect if the parameter was passed; set if so.
	if param.Rank != 0 {
		setParamInt(params, "rank", param.Rank)
		setPassed(params, "rank")
	}

	// Detect if the parameter was passed; set if so.
	if param.Recommendations != 5 {
		setParamInt(params, "recommendations", param.Recommendations)
		setPassed(params, "recommendations")
	}

	// Detect if the parameter was passed; set if so.
	if param.Seed != 0 {
		setParamInt(params, "seed", param.Seed)
		setPassed(params, "seed")
	}

	// Detect if the parameter was passed; set if so.
	if param.Test != nil {
		gonumToArmaMat(params, "test", param.Test, false)
		setPassed(params, "test")
	}

	// Detect if the parameter was passed; set if so.
	if param.Training != nil {
		gonumToArmaMat(params, "training", param.Training, false)
		setPassed(params, "training")
	}

	// Detect if the parameter was passed; set if so.
	if param.Verbose {
		setParamBool(params, "verbose", param.Verbose)
		setPassed(params, "verbose")
		enableVerbose()
	}

	// Mark all output options as passed.
	setPassed(params, "output")
	setPassed(params, "output_model")

	// Call the mlpack program.
	C.mlpackCf(params.mem, timers.mem)

	// Initialize result variable and get output.
	var outputPtr mlpackArma
	output := outputPtr.armaToGonumUmat(params, "output")
	var outputModel cfModel
	outputModel.getCFModel(params, "output_model")

	// Clean memory.
	cleanParams(params)
	cleanTimers(timers)

	// Return output(s).
	return output, outputModel
}
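
Cf() decides whether a parameter was "passed" by comparing each field against the default that CfOptions() assigns. The standalone sketch below illustrates that pattern without linking against mlpack. The names Options, DefaultOptions and PassedParams are hypothetical stand-ins invented for this illustration and are not part of the mlpack bindings; only four of the fields are mirrored.

```go
package main

import "fmt"

// Options is a hypothetical, trimmed-down stand-in for CfOptionalParam.
type Options struct {
	Algorithm       string
	Rank            int
	Recommendations int
	Verbose         bool
}

// DefaultOptions plays the role of CfOptions(): every field starts at the
// default value documented for the corresponding Cf() parameter.
func DefaultOptions() *Options {
	return &Options{
		Algorithm:       "NMF",
		Rank:            0,
		Recommendations: 5,
		Verbose:         false,
	}
}

// PassedParams returns, in a fixed order, the names of the parameters whose
// values differ from their defaults. This mirrors the
// "Detect if the parameter was passed" checks in Cf().
func PassedParams(o *Options) []string {
	def := DefaultOptions()
	var passed []string
	if o.Algorithm != def.Algorithm {
		passed = append(passed, "algorithm")
	}
	if o.Rank != def.Rank {
		passed = append(passed, "rank")
	}
	if o.Recommendations != def.Recommendations {
		passed = append(passed, "recommendations")
	}
	if o.Verbose != def.Verbose {
		passed = append(passed, "verbose")
	}
	return passed
}

func main() {
	opts := DefaultOptions()
	opts.Algorithm = "RegSVD"
	opts.Rank = 10
	opts.Recommendations = 5 // same as the default, so not reported as passed

	fmt.Println(PassedParams(opts)) // prints [algorithm rank]
}
```

One consequence of this design, visible in the sketch, is that explicitly assigning a value equal to the default cannot be distinguished from leaving the field untouched, so such a parameter is never marked as passed.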