render submit
math4mad committed Oct 12, 2023
1 parent e6a6539 commit c94d2b5
Showing 4 changed files with 11 additions and 11 deletions.
@@ -1,9 +1,9 @@
{
"hash": "42d8b35b15fbf684e15ca58720f596b9",
"hash": "baa4c9b5a5e908478b839f2c46a58cbc",
"result": {
"markdown": "---\ntitle: \"2-ecommerce-linear-reg\"\ncode-fold: true\n---\n\n:::{.callout-note,title=\"简介\"}\n > 通过上网浏览时间预测年花费\n\n\n 1. dataset: [`kaggle ecommerce dataset`](https://www.kaggle.com/datasets/kolawale/focusing-on-mobile-app-or-website)\n \n 2. [model](https://www.kaggle.com/code/mohammedibrahim784/e-commerce-dataset-linear-regression-model)\n \n 3. using `MLJLinearModels.jl` [🔗](https://github.com/alan-turing-institute/MLJLinearModels.jl)\n:::\n\n## 1. load package\n\n::: {.cell execution_count=1}\n``` {.julia .cell-code}\nimport MLJ:predict\nusing GLMakie, MLJ,CSV,DataFrames,StatsBase\n```\n:::\n\n\n## 2. process data\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code code-fold=\"show\"}\nstr=\"Ecommerce-Customers\" \ndf=CSV.File(\"./data/Ecommerce-Customers.csv\") |> DataFrame |> dropmissing;\nselect!(df,4:8)\nX=df[:,1:4]|>Matrix|>MLJ.table\ny=Vector(df[:,5])\nfirst(df,5)\n```\n\n::: {.cell-output .cell-output-display execution_count=84}\n```{=html}\n<div><div style = \"float: left;\"><span>5×5 DataFrame</span></div><div style = \"clear: both;\"></div></div><div class = \"data-frame\" style = \"overflow-x: scroll;\"><table class = \"data-frame\" style = \"margin-bottom: 6px;\"><thead><tr class = \"header\"><th class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">Row</th><th style = \"text-align: left;\">Avg. 
Session Length</th><th style = \"text-align: left;\">Time on App</th><th style = \"text-align: left;\">Time on Website</th><th style = \"text-align: left;\">Length of Membership</th><th style = \"text-align: left;\">Yearly Amount Spent</th></tr><tr class = \"subheader headerLastRow\"><th class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\"></th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th></tr></thead><tbody><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">1</td><td style = \"text-align: right;\">34.4973</td><td style = \"text-align: right;\">12.6557</td><td style = \"text-align: right;\">39.5777</td><td style = \"text-align: right;\">4.08262</td><td style = \"text-align: right;\">587.951</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">2</td><td style = \"text-align: right;\">31.9263</td><td style = \"text-align: right;\">11.1095</td><td style = \"text-align: right;\">37.269</td><td style = \"text-align: right;\">2.66403</td><td style = \"text-align: right;\">392.205</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">3</td><td style = \"text-align: right;\">33.0009</td><td style = \"text-align: right;\">11.3303</td><td style = \"text-align: right;\">37.1106</td><td style = \"text-align: right;\">4.10454</td><td style = \"text-align: right;\">487.548</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">4</td><td style = \"text-align: right;\">34.3056</td><td style = \"text-align: right;\">13.7175</td><td style = \"text-align: right;\">36.7213</td><td style = \"text-align: right;\">3.12018</td><td style = \"text-align: 
581.852</td>">
right;\">581.852</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">5</td><td style = \"text-align: right;\">33.3307</td><td style = \"text-align: right;\">12.7952</td><td style = \"text-align: right;\">37.5367</td><td style = \"text-align: right;\">4.44631</td><td style = \"text-align: right;\">599.406</td></tr></tbody></table></div>\n```\n:::\n:::\n\n\n## 3. plot correlation of variables\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\naxs = []\nlabel=names(df)|>Array\ncolors = [:orange, :lightgreen, :purple,:lightblue,:red,:green]\n\nfig = Figure(resolution=(1400, 1400))\nax=Axis(fig[1,1])\n\nfunction plot_diag(i)\n\n ax = Axis(fig[i, i])\n push!(axs, ax)\n density!(ax, df[:, i]; color=(colors[i], 0.5),\n strokewidth=1.25, strokecolor=colors[i])\nend\n\n\nfunction plot_cor(i, j)\n ax = Axis(fig[i, j])\n scatter!(ax, df[:, i], df[:, j]; color=colors[j])\nend\n\n\nfunction plot_pair()\n [(i == j ? plot_diag(i) : plot_cor(i, j)) for i in 1:5, j in 1:5]\nend\n\nfunction add_xy_label()\n for i in 1:5\n Axis(fig[5, i], xlabel=label[i],)\n Axis(fig[i, 1], ylabel=label[i],)\n end\nend\n\nfunction main()\n\n plot_pair()\n add_xy_label()\n return fig\nend\n\nmain()\n```\n\n::: {.cell-output .cell-output-display execution_count=85}\n![](2-ecommerce-linear-reg_files/figure-html/cell-4-output-1.png){}\n:::\n:::\n\n\n## 4. 
plot the variables' cov and cor matrices\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\ndf_cov = df|>Matrix|>cov.|> d -> round(d, digits=3)\ndf_cor = df|>Matrix|>cor.|> d -> round(d, digits=3)\n\nfunction plot_cov_cor()\n fig = Figure(resolution=(2200, 800))\n ax1 = Axis(fig[1, 1]; xticks=(1:5, label), yticks=(1:5, label), title=\"ecommerce cov matrix\",yreversed=true)\n ax3 = Axis(fig[1, 3], xticks=(1:5, label), yticks=(1:5, label), title=\"ecommerce cor matrix\",yreversed=true)\n\n hm = heatmap!(ax1, df_cov)\n Colorbar(fig[1, 2], hm)\n [text!(ax1, x, y; text=string(df_cov[x, y]), color=:white, fontsize=18, align=(:center, :center)) for x in 1:5, y in 1:5]\n\n hm2 = heatmap!(ax3, df_cor)\n Colorbar(fig[1, 4], hm2)\n [text!(ax3, x, y; text=string(df_cor[x, y]), color=:white, fontsize=18, align=(:center, :center)) for x in 1:5, y in 1:5]\n\n fig\nend\n\nplot_cov_cor()\n```\n\n::: {.cell-output .cell-output-display execution_count=86}\n![](2-ecommerce-linear-reg_files/figure-html/cell-5-output-1.png){}\n:::\n:::\n\n\n## 5. MLJ workflow\n### 5.1 load model\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\n LinearRegressor = @load LinearRegressor pkg=MLJLinearModels\n model=LinearRegressor()\n mach = MLJ.fit!(machine(model,X,y))\n fitted_params(mach)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nimport MLJLinearModels ✔\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\n[ Info: For silent loading, specify `verbosity=0`. 
\n[ Info: Training machine(LinearRegressor(fit_intercept = true, …), …).\n┌ Info: Solver: MLJLinearModels.Analytical\n│ iterative: Bool false\n└ max_inner: Int64 200\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=87}\n```\n(coefs = [:x1 => 25.734271084705085, :x2 => 38.709153810834366, :x3 => 0.43673883559434407, :x4 => 61.57732375487839],\n intercept = -1051.5942553006273,)\n```\n:::\n:::\n\n\n### 5.2 predict \n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n y_hat =predict(mach, X)\n \"rmsd\"=>rmsd(y,y_hat)\n\n```\n\n::: {.cell-output .cell-output-display execution_count=88}\n```\n\"rmsd\" => 9.923256785022247\n```\n:::\n:::\n\n\n### 5.3 plot residuals\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nresid=y .- y_hat\nstem(resid)\n```\n\n::: {.cell-output .cell-output-display execution_count=89}\n![](2-ecommerce-linear-reg_files/figure-html/cell-8-output-1.png){}\n:::\n:::\n\n\n",
"markdown": "---\ntitle: \"2-ecommerce-linear-reg\"\ncode-fold: true\n---\n\n:::{.callout-note title=\"简介\"}\n > 通过上网浏览时间预测年花费\n\n\n 1. dataset: [`kaggle ecommerce dataset`](https://www.kaggle.com/datasets/kolawale/focusing-on-mobile-app-or-website)\n \n 2. [model](https://www.kaggle.com/code/mohammedibrahim784/e-commerce-dataset-linear-regression-model)\n \n 3. using `MLJLinearModels.jl` [🔗](https://github.com/alan-turing-institute/MLJLinearModels.jl)\n:::\n\n## 1. load package\n\n::: {.cell execution_count=1}\n``` {.julia .cell-code}\nimport MLJ:predict\nusing GLMakie, MLJ,CSV,DataFrames,StatsBase\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWARNING: using StatsBase.predict in module Main conflicts with an existing identifier.\n```\n:::\n:::\n\n\n## 2. process data\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code code-fold=\"show\"}\nstr=\"Ecommerce-Customers\" \ndf=CSV.File(\"./data/Ecommerce-Customers.csv\") |> DataFrame |> dropmissing;\nselect!(df,4:8)\nX=df[:,1:4]|>Matrix|>MLJ.table\ny=Vector(df[:,5])\nfirst(df,5)\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```{=html}\n<div><div style = \"float: left;\"><span>5×5 DataFrame</span></div><div style = \"clear: both;\"></div></div><div class = \"data-frame\" style = \"overflow-x: scroll;\"><table class = \"data-frame\" style = \"margin-bottom: 6px;\"><thead><tr class = \"header\"><th class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">Row</th><th style = \"text-align: left;\">Avg. 
Session Length</th><th style = \"text-align: left;\">Time on App</th><th style = \"text-align: left;\">Time on Website</th><th style = \"text-align: left;\">Length of Membership</th><th style = \"text-align: left;\">Yearly Amount Spent</th></tr><tr class = \"subheader headerLastRow\"><th class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\"></th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th><th title = \"Float64\" style = \"text-align: left;\">Float64</th></tr></thead><tbody><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">1</td><td style = \"text-align: right;\">34.4973</td><td style = \"text-align: right;\">12.6557</td><td style = \"text-align: right;\">39.5777</td><td style = \"text-align: right;\">4.08262</td><td style = \"text-align: right;\">587.951</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">2</td><td style = \"text-align: right;\">31.9263</td><td style = \"text-align: right;\">11.1095</td><td style = \"text-align: right;\">37.269</td><td style = \"text-align: right;\">2.66403</td><td style = \"text-align: right;\">392.205</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">3</td><td style = \"text-align: right;\">33.0009</td><td style = \"text-align: right;\">11.3303</td><td style = \"text-align: right;\">37.1106</td><td style = \"text-align: right;\">4.10454</td><td style = \"text-align: right;\">487.548</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">4</td><td style = \"text-align: right;\">34.3056</td><td style = \"text-align: right;\">13.7175</td><td style = \"text-align: right;\">36.7213</td><td style = \"text-align: right;\">3.12018</td><td style = \"text-align: 
581.852</td>">
right;\">581.852</td></tr><tr><td class = \"rowNumber\" style = \"font-weight: bold; text-align: right;\">5</td><td style = \"text-align: right;\">33.3307</td><td style = \"text-align: right;\">12.7952</td><td style = \"text-align: right;\">37.5367</td><td style = \"text-align: right;\">4.44631</td><td style = \"text-align: right;\">599.406</td></tr></tbody></table></div>\n```\n:::\n:::\n\n\n## 3. plot correlation of variables\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\naxs = []\nlabel=names(df)|>Array\ncolors = [:orange, :lightgreen, :purple,:lightblue,:red,:green]\n\nfig = Figure(resolution=(1400, 1400))\nax=Axis(fig[1,1])\n\nfunction plot_diag(i)\n\n ax = Axis(fig[i, i])\n push!(axs, ax)\n density!(ax, df[:, i]; color=(colors[i], 0.5),\n strokewidth=1.25, strokecolor=colors[i])\nend\n\n\nfunction plot_cor(i, j)\n ax = Axis(fig[i, j])\n scatter!(ax, df[:, i], df[:, j]; color=colors[j])\nend\n\n\nfunction plot_pair()\n [(i == j ? plot_diag(i) : plot_cor(i, j)) for i in 1:5, j in 1:5]\nend\n\nfunction add_xy_label()\n for i in 1:5\n Axis(fig[5, i], xlabel=label[i],)\n Axis(fig[i, 1], ylabel=label[i],)\n end\nend\n\nfunction main()\n\n plot_pair()\n add_xy_label()\n return fig\nend\n\nmain()\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n![](2-ecommerce-linear-reg_files/figure-html/cell-4-output-1.png){}\n:::\n:::\n\n\n## 4. 
plot the variables' cov and cor matrices\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\ndf_cov = df|>Matrix|>cov.|> d -> round(d, digits=3)\ndf_cor = df|>Matrix|>cor.|> d -> round(d, digits=3)\n\nfunction plot_cov_cor()\n fig = Figure(resolution=(2200, 800))\n ax1 = Axis(fig[1, 1]; xticks=(1:5, label), yticks=(1:5, label), title=\"ecommerce cov matrix\",yreversed=true)\n ax3 = Axis(fig[1, 3], xticks=(1:5, label), yticks=(1:5, label), title=\"ecommerce cor matrix\",yreversed=true)\n\n hm = heatmap!(ax1, df_cov)\n Colorbar(fig[1, 2], hm)\n [text!(ax1, x, y; text=string(df_cov[x, y]), color=:white, fontsize=18, align=(:center, :center)) for x in 1:5, y in 1:5]\n\n hm2 = heatmap!(ax3, df_cor)\n Colorbar(fig[1, 4], hm2)\n [text!(ax3, x, y; text=string(df_cor[x, y]), color=:white, fontsize=18, align=(:center, :center)) for x in 1:5, y in 1:5]\n\n fig\nend\n\nplot_cov_cor()\n```\n\n::: {.cell-output .cell-output-display execution_count=5}\n![](2-ecommerce-linear-reg_files/figure-html/cell-5-output-1.png){}\n:::\n:::\n\n\n## 5. MLJ workflow\n### 5.1 load model\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\n LinearRegressor = @load LinearRegressor pkg=MLJLinearModels\n model=LinearRegressor()\n mach = MLJ.fit!(machine(model,X,y))\n fitted_params(mach)\n```\n\n::: {.cell-output .cell-output-stderr}\n```\n[ Info: For silent loading, specify `verbosity=0`. 
\n[ Info: Training machine(LinearRegressor(fit_intercept = true, …), …).\n┌ Info: Solver: MLJLinearModels.Analytical\n│ iterative: Bool false\n└ max_inner: Int64 200\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nimport MLJLinearModels ✔\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n(coefs = [:x1 => 25.734271084705085, :x2 => 38.709153810834366, :x3 => 0.43673883559434407, :x4 => 61.57732375487839],\n intercept = -1051.5942553006273,)\n```\n:::\n:::\n\n\n### 5.2 predict \n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\n y_hat =predict(mach, X)\n \"rmsd\"=>rmsd(y,y_hat)\n\n```\n\n::: {.cell-output .cell-output-display execution_count=7}\n```\n\"rmsd\" => 9.923256785022247\n```\n:::\n:::\n\n\n### 5.3 plot residuals\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nresid=y .- y_hat\nstem(resid)\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n![](2-ecommerce-linear-reg_files/figure-html/cell-8-output-1.png){}\n:::\n:::\n\n\n",
"supporting": [
"2-ecommerce-linear-reg_files"
"2-ecommerce-linear-reg_files/figure-html"
],
"filters": [],
"includes": {