SDC Workshop 2018 winter - Introduction to Machine Learning

daniel-hain/SDC_ML_intro_2018

SDC Workshop Winter 2018 - Introduction to Applied Machine Learning (ML) and Natural Language Processing (NLP)

12/12 - 2018, Beijing

Dr. Daniel S. Hain, Dr. Roman Jurowetzki

Aalborg University, Denmark


In this repository, you will find all notebooks, presentations and materials from the workshop. We will also use it to link to some Kaggle kernels that you can explore for interactive exercises.

Due to time limitations, the workshop will feature interactive tutorials but no "self-run coding exercises".

Please register on kaggle.com to run the exercises.

In this workshop, we will not teach you one particular trending method or approach but rather introduce you to Data Science as a field and its approach to working with data.

Of course, we can only do so much in 3 days, so we tried to find a good balance between a broad overview and specific applications.

Hopefully, this will give you a good foundation, or at least a starting point, to learn more. Today, it is easy to find excellent resources and become skilled at sophisticated analytical techniques. But you need to know what to look for and how all the different things out there relate to each other.

While for several reasons – mostly path dependency – the innovation studies (and general social science) community relies on expensive proprietary packages (e.g. SPSS, Stata, SAS or EViews), people who work with Big Data analytics use R and/or Python. We decided not to focus on just one language but to present both, so you can decide which one you find most approachable.

Below you will find links to the different things presented during the workshop. We will update this repository during and after the workshop.


Notebooks

L1: Unsupervised ML

L1.5: Natural Language Processing

L2: Supervised ML

Bonus: A wine classification case study (if time permits)


Useful resources

Bibliometrics

VOSviewer: Easy software for bibliometrics.

CiteSpace: More complex bibliometrics software, including geospatial features and mapping.

Courses

DataCamp: Online courses. Intro to R, Python, GitHub, Excel and Sheets are free. Recommended courses:

  • R basics: "Introduction to R" (free course)
  • R unsupervised ML: "Unsupervised Learning in R" (chapter 1 free)
  • R supervised ML: "Unsupervised Learning in R" (chapter 1 free)
  • R Data visualization: "Data Visualization with ggplot2 (Part 1)" (chapter 1 free)

Dataquest: Similar to DataCamp, but Python-focused. Also offers more advanced courses on data engineering.

Open Data Science Masters Curriculum: Collection of free online resources on all kinds of Data Science topics.

Data and scripts from the ML A-Z course from Udemy: R and Python scripts from the course, including the course data. The course can be found on Udemy and is usually available for around 12 USD.

Software

Help

Others
