Machine learning

Machine learning group continues 2018 Yelp challenge

 

When: Saturday May 26th 2018 from 12:00pm – 5:00pm

Where: CoMotion on King at 115 King Street East, Hamilton, ON

Organizer: Hamilton Machine Learning Group

Register: eventbrite.ca/e/2018-yelp-challenge-continued-tickets-44784796532

Details:

We’ll continue working on the 2018 Yelp Challenge as a team!

We started working on the challenge in March with a group of intermediate/advanced machine learners, but newcomers to the group are still very welcome as we could use more help! This is a larger project that we will likely be working on for several weeks after this event, which may include more hackathon-esque sessions. For those interested in catching up, we have collected some great CNN and RNN intro videos which we’ll send out before the event.

Food and drink will be provided!

This is an advanced workshop for machine learners who already have experience with the basics:

  • Python 3.x installed
  • Familiarity with Pandas and Numpy
  • Familiarity with neural networks

What you can expect to learn:

  • Natural Language Processing techniques
  • Convolutional/Recurrent neural network models
  • Keras/Pytorch/Tensorflow

Machine learning Yelp challenge workshop

 

When: Saturday March 31st 2018 from 12:00pm – 5:00pm

Where: CoMotion on King at 115 King Street East, Hamilton, Ontario

Organizer: Hamilton Machine Learning and Computing Research

Register: eventbrite.ca/e/machine-learning-with-yelp-challenge-dataset-tickets-44158017820

Cost: $5

Details:

We will be working on the 2018 Yelp challenge as a team!

https://www.yelp.co.uk/dataset/challenge

This is an advanced workshop for machine learners who already have experience with the basics:

  • Python 3.x installed
  • Familiarity with Pandas and Numpy
  • Familiarity with neural networks

What you can expect to learn:

  • Natural Language Processing techniques
  • Convolutional/Recurrent neural network models
  • Keras/Pytorch/Tensorflow

The goal of this workshop is to gather intermediate/advanced machine learners to work on a larger project, which may extend for several weeks and, depending on interest, may include more hackathon-esque sessions. We’ll be supplying all attendees with further information about what you need to bring as we near the date. For those interested in catching up, we have collected some great CNN and RNN intro videos which we’ll send out to everyone before the event.

Food and drink will be provided.

 

New machine learning group forms

 

As an offshoot of the first ever Hamilton Machine Learning conference last December, a new machine learning and applied computing group has formed in Hamilton!

The group held its first ever meeting this past Saturday at CoMotion on King. About 40 attendees worked through a Kaggle machine learning problem together, using a dataset from the Titanic to predict which passengers would survive the tragedy. They also consumed copious amounts of pizza, thanks to the event sponsor, Hamilton machine learning startup Preteckt!

The group is planning its next meetup event and has started a Facebook group and Slack channel where you can join in the discussion and find out about the next meetings! The group is being led by Masha Rahimi and Nick Miladinovic.

 

 

Machine Learning Workshop with Kaggle

 

When: Saturday February 17th at 1:00pm

Where: CoMotion on King at 115 King Street East (3rd floor), Hamilton, ON

Organizer: Hamilton Machine Learning and Computing Research

Register: eventbrite.ca/e/machine-learning-workshop-with-kaggle-tickets-41883700275

Details:

 

 

At this workshop, we will work through a Kaggle problem as a group to learn about machine learning and data science!

The workshop leaders will introduce the problem to the group, then work with the group to solve it on the projector screen. You can ask questions, participate and help, or just follow along, whatever your comfort level.

You’re also welcome to solve the problem on your own or as part of a group; this is a casual event to share and learn.

We recommend the following background knowledge for attending the workshop, you may find the links helpful to prepare in advance:

We recommend bringing your laptop!

Mahsa Rahimi and Nick Miladinovic will lead the workshop.

 

Machine learning using scikit-learn

Originally posted on kamillus.github.io

 

Scikit-learn is a fantastic library to solve problems using machine learning and other, more traditional statistical methods in the area of Data Science. In this post I’m outlining why machine learning is important, demonstrating a simple machine learning problem and how to solve it.

Why should you care? Data science is becoming more and more relevant with the growth of big data and of more autonomous systems (e.g. recommender systems, pattern recognition). Machine learning, specifically, is applicable to many fields, including finance (e.g. detecting credit card fraud), medicine (e.g. classifying patient cancer), and entertainment (e.g. a chess-playing bot). The number of careers involving machine learning will steadily increase (there is evidence it’s already happening) since the supporting technologies are becoming more widespread (Hadoop, scikit-learn, Mahout etc.).

One of the problems I was working on not too long ago was classifying which user is sitting at the computer. I developed a small user classification game that uses an SVM. The game asks a user to type a series of words to create a profile of the user – the machine is “learning”. In the next part of the game, the user types more words and the computer tries to recognize who is typing at the keyboard by using what it learned.

How does the computer learn? Feature generation happens when the user is asked for their name, then presented with a series of words from a dictionary and asked to type them as they appear. The features recorded are the typing speed, the number of errors, and the corrections made to typed words.
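As a rough illustration of feature generation, here is a minimal sketch of turning one typed word into a feature vector. It is not the game's actual code: `DataPoint`, `make_data_point` and `edit_distance` are hypothetical names, and the sketch assumes that `error_count` counts mismatched character positions and that `distance` is the Levenshtein edit distance between the typed word and the target word.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    time: float       # seconds taken to type the word
    error_count: int  # positions where the typed word differs from the target (assumption)
    distance: int     # Levenshtein edit distance to the target word (assumption)


def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def make_data_point(target, typed, seconds):
    """Build one feature record from a target word and what the user typed."""
    mismatches = sum(1 for x, y in zip(target, typed) if x != y)
    mismatches += abs(len(target) - len(typed))
    return DataPoint(seconds, mismatches, edit_distance(target, typed))


point = make_data_point("keyboard", "keybaord", 1.8)
feature_row = [point.time, point.error_count, point.distance]
print(feature_row)
```

Each `feature_row` is one sample; collecting many of them per user gives the training data for the classifier.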

The next step is to run the data through the classifier (in our case an SVM). The tricky part is choosing a good value for gamma. You can experiment by scoring candidate values on a separate held-out data set; do not score them on your training set. Once you have this data, the actual classification is trivial with scikit-learn:

from sklearn import svm

# Create the classifier
classifier = svm.SVC(gamma=1)
# Get the existing features and their expected results (the user labels)
(features, targets) = profiles.get_classifier_data()
classifier.fit(features, targets)

# Predict which user produced a newly recorded data point
predicted = classifier.predict([[data_point.time, data_point.error_count, data_point.distance]])
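
The snippet above fixes gamma at 1. One possible way to pick it instead is to score a few candidate values on a held-out split, as in this sketch. The data is synthetic and stands in for real typing profiles; the user names, the candidate gamma values and the split size are all assumptions made for illustration.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import train_test_split

# Synthetic stand-in for typing features: [time, error_count, distance]
rng = np.random.default_rng(0)
user_a = rng.normal(loc=[1.0, 1.0, 1.0], scale=0.3, size=(40, 3))
user_b = rng.normal(loc=[2.0, 3.0, 2.0], scale=0.3, size=(40, 3))
features = np.vstack([user_a, user_b])
targets = np.array(["alice"] * 40 + ["bob"] * 40)

# Hold out a quarter of the samples; gamma is never scored on the training part
X_train, X_val, y_train, y_val = train_test_split(
    features, targets, test_size=0.25, random_state=0, stratify=targets)

best_gamma, best_score = None, -1.0
for gamma in [0.01, 0.1, 1, 10]:
    candidate = svm.SVC(gamma=gamma).fit(X_train, y_train)
    score = candidate.score(X_val, y_val)  # accuracy on the held-out split
    if score > best_score:
        best_gamma, best_score = gamma, score

print(best_gamma, best_score)
```

scikit-learn's `GridSearchCV` automates the same idea with cross-validation, which is usually a better fit when there are only a few samples per user.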

How could this be improved? I think the first opportunity for improvement is to recognize data clusters automatically using k-means and possibly utilize principal component analysis. That way, every cluster of data will be automatically assigned without first creating user profiles.
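
A minimal sketch of that idea, using scikit-learn's `PCA` and `KMeans` on synthetic data, is shown here. The two well-separated groups stand in for two typists, and the number of clusters is assumed to be known in advance, which a real version would have to estimate.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, unlabeled typing features: [time, error_count, distance]
rng = np.random.default_rng(0)
typist_1 = rng.normal(loc=[1.0, 1.0, 1.0], scale=0.2, size=(30, 3))
typist_2 = rng.normal(loc=[3.0, 4.0, 3.0], scale=0.2, size=(30, 3))
features = np.vstack([typist_1, typist_2])

# Put the features on a common scale, then project onto two principal components
scaled = StandardScaler().fit_transform(features)
reduced = PCA(n_components=2).fit_transform(scaled)

# Group the samples without any user profiles; n_clusters=2 is an assumption
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(reduced)

print(labels)
```

The cluster labels are arbitrary integers, so they identify "the same typist as before" rather than a named user.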

I hope this post elucidates the high level machine learning process for anyone that is interested. The technologies and ideas used here are just some tools that can be added to your toolbelt. If you’d like to find out more about machine learning, I recommend Andrew Ng’s set of lectures.

Full Listing