Machine learning (ML) is the unsung hero that powers many applications, systems, sensors, devices, and products. Machine learning is so pervasive that we can often assume its presence in most of the applications and systems without having to specifically call it out.
In simple terms, machine learning is a computer’s ability to learn from data, and it is one of the most useful tools we have to develop intelligent systems and applications. Machine learning is used widely today for all kinds of tasks, from churn prediction in large companies, to web search, to medical diagnostics, to robotics. It’s hard to find a field that cannot benefit from machine learning in one way or another.
Machine learning’s intuitive, versatile and robust approach to finding patterns in the available data makes it a priceless asset for anyone who wants to turn data into insights and predictions. What’s more, today it is more accessible than ever before, thanks to the variety of open source tools and programming languages.
What developers actually need to know about Machine Learning
Something is wrong in the way ML is being taught to developers.
Most ML teachers like to explain how different learning algorithms work and spend tons of time on that. For a beginner who wants to start using ML, being able to choose an algorithm and set parameters looks like the #1 barrier to entry, and knowing how the different techniques work seems to be a key requirement to remove that barrier. Many practitioners argue, however, that you only need one technique to get started: random forests. Other techniques may sometimes outperform them, but in general random forests are the most likely to perform best on a variety of problems (see Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?), which makes them more than enough for a developer just getting started with ML.
We would further argue that you don’t need to know all the inner workings of (random forest) learning algorithms (and the simpler decision tree learning algorithms that they use). A high-level understanding of the algorithms, the intuitions behind them, their main parameters, their possibilities and limitations is enough. You’ll know enough to start practicing and experimenting with ML, as there are great open source ML libraries (such as scikit-learn in Python) and cloud platforms that make it super easy to create predictive models from data.
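To give a sense of how little code this takes, here is a minimal sketch using scikit-learn's RandomForestClassifier. The Iris dataset, the 70/30 split and the parameter values are assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Small built-in dataset, used here only for illustration.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the examples so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# n_estimators (number of trees) and max_depth (tree depth limit) are two of
# the main parameters; the values below are illustrative assumptions.
model = RandomForestClassifier(n_estimators=100, max_depth=None, random_state=0)
model.fit(X_train, y_train)

# score() returns the fraction of held-out examples classified correctly.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Changing n_estimators or max_depth and re-running is a quick way to build intuition for what the main parameters do without studying the algorithm's internals.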
So, if we just give an overview of only one technique, what else can we teach?
Deploying ML models into production
It turns out that, when using ML in real-world applications, most of the work takes place before and after the learning. ML instructors rarely provide an end-to-end view of what it takes to use ML in a predictive application that’s deployed in production. They just explain one part of the problem, then assume you’ll figure out the rest and connect the dots on your own: for instance, between the ML libraries you were taught to use in Python, R, or Matlab and your production application, which is developed in Ruby, Swift, C++, etc.
Fortunately, today there are new and accessible solutions to this “last-mile problem”. They revolve around the use of REST (HTTP) APIs. Models need to be exposed as APIs, and if scaling the number of predictions performed by a given model can be an issue, these APIs would be served on multiple endpoints with load balancers in front. Platforms-as-a-service can help with that: here is some info about Microsoft Azure ML’s scaling capabilities, Amazon ML’s, and Yhat’s Analytics Load Balancer (which you can also run on your own private infrastructure/cloud). Some of these platforms allow you to use whatever ML library you want; others restrict you to their own proprietary ones. In our upcoming workshop, we’ve chosen to use Azure to deploy models created with scikit-learn into APIs, and also to demonstrate how Amazon and BigML provide an even higher level of abstraction (while still providing accurate models) that can make them easier to work with in many cases.
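To make the idea of exposing a model as an HTTP API concrete, here is a minimal sketch that uses only Python's standard library for the server and scikit-learn for the model. It is not how Azure ML, Amazon ML, BigML or Yhat implement their services. The /predict route, the JSON field names and the helper predict_over_http are hypothetical names chosen for this example, and the model is trained at startup purely to keep the sketch self-contained.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a toy model at startup; a real service would load a saved model instead.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # hypothetical route name
            self.send_error(404)
            return
        # Read the JSON body: {"features": [f1, f2, f3, f4]}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        label = int(model.predict([payload["features"]])[0])
        body = json.dumps({"prediction": label}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def predict_over_http(features):
    """Start the server on a free local port, send one request, shut down."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/predict"
        request = Request(
            url,
            data=json.dumps({"features": features}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        opener = build_opener(ProxyHandler({}))  # bypass any configured proxy
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


result = predict_over_http([5.1, 3.5, 1.4, 0.2])
print(result)
```

Because the client only speaks HTTP and JSON, the same request could come from an application written in Ruby, Swift or C++, which is the point of putting the model behind an API.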
Deployment is not the only post-learning challenge there is in real-world ML. You should also find appropriate ways to evaluate and monitor your models’ performance/impact, before and after deployment.
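As one possible approach to monitoring after deployment, the sketch below tracks accuracy over the most recent predictions whose true outcome has become known. The class name AccuracyMonitor, the window size and the alert threshold are all hypothetical choices made for this example, and accuracy is only one of many metrics you might track.

```python
from collections import deque


class AccuracyMonitor:
    """Rolling accuracy over the most recent predictions with known outcomes."""

    def __init__(self, window=100, alert_below=0.8):
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self.outcomes = deque(maxlen=window)
        self.alert_below = alert_below

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing observed yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        acc = self.accuracy()
        return acc is not None and acc < self.alert_below


# Illustrative (predicted, actual) pairs, as they might arrive in production.
monitor = AccuracyMonitor(window=5, alert_below=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    monitor.record(predicted, actual)

print(monitor.accuracy(), monitor.needs_attention())
```

A falling rolling accuracy is one signal that the data has drifted and that it may be time to learn an updated model.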
The ML workflow diagram above also presents some of the steps to take before learning a model, which are about preparing the right dataset for the algorithms to run on. Before actually running any algorithm you need to…
– Define the right ML problem to tackle for your organization
– Engineer features, i.e. find ways to represent the objects on which you’ll be making predictions with ML
– Figure out when/how often you’ll need to make predictions, and how much time you’ll have for that (is there a way to do predictions in batches or do you absolutely need all your predictions to be real-time?)
– Collect data
– Prepare the actual dataset to run learning algorithms on, i.e. extract features from the “raw” collected data and clean it
– Figure out when/how often you’ll need to learn new/updated models, and how much time you’ll have for that.
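The data preparation steps in this list can be sketched in a few lines of plain Python. Everything in the example is made up for illustration: the churn-style records, the field names, the reference date, and the cleaning choices (dropping rows with no signup date, treating a missing login count as zero). Real projects will need their own choices here.

```python
from datetime import date

# "Raw" collected data: strings, missing values, one row per customer.
raw_customers = [
    {"id": "c1", "signup": "2015-01-10", "plan": "pro", "logins_30d": "12", "churned": "no"},
    {"id": "c2", "signup": "2014-06-01", "plan": "free", "logins_30d": "", "churned": "yes"},
    {"id": "c3", "signup": "2015-03-22", "plan": "free", "logins_30d": "3", "churned": "no"},
    {"id": "c4", "signup": "", "plan": "pro", "logins_30d": "7", "churned": "yes"},
]

REFERENCE_DATE = date(2015, 5, 1)  # hypothetical "today" for computing account age


def extract_features(record):
    """Return (features, label), or None if the record cannot be used."""
    if not record["signup"]:
        return None  # cleaning: drop rows with no signup date
    signup = date.fromisoformat(record["signup"])
    # Assumption for this sketch: a missing login count is treated as zero.
    logins = int(record["logins_30d"]) if record["logins_30d"] else 0
    features = [
        (REFERENCE_DATE - signup).days,       # account age in days
        1 if record["plan"] == "pro" else 0,  # plan encoded as 0/1
        logins,                               # activity in the last 30 days
    ]
    label = 1 if record["churned"] == "yes" else 0
    return features, label


prepared = [row for row in map(extract_features, raw_customers) if row is not None]
X = [features for features, _ in prepared]
y = [label for _, label in prepared]
print(X, y)
```

The resulting X and y are in the list-of-rows shape that libraries such as scikit-learn accept as training input.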
Operational Machine Learning Workshop for Developers
We are excited to officially announce today that, in collaboration with PAPIs, we are launching a new learning track called PAPIs Workshops. In partnership with leading education and industry organizations, we will offer practical and industry-focused learning programs in various locations around the world (starting with Madrid, London and Boston).
Most Machine Learning courses are given from the perspective of a Data Scientist and focus on the techniques and algorithms that make it possible to learn from data. This workshop takes the perspective of an application developer and instead provides an end-to-end view of ML integration into your applications. We’ll go all the way from data preparation to the integration of predictive models in your domain and their deployment in production.
Our first workshop is aimed at developers and is an agnostic introduction to operational Machine Learning with open source and cloud platforms. It’s a 2-day hands-on workshop given in a classroom setting. Day 1 covers an intro to ML and the creation, operationalization, and evaluation of predictive models. Day 2 features model selection, ensembles, data preparation, a practical overview of advanced topics such as unsupervised learning and deep learning, and methodology for developing your own ML use case.
We’re using Python with libraries such as Pandas, scikit-learn and SKLL, and cloud platforms such as Microsoft Azure ML, Amazon ML, BigML and Indico. I think these platforms are great for many organizations and real-world use cases, but even if you find that they are not the perfect fit for you, I’d still recommend using them for learning and practicing ML. ML-as-a-Service makes it much quicker to set up work environments (e.g. Azure ML has most popular libraries preinstalled and can run interactive Jupyter notebooks, which you can access from your browser), and also to experiment with ML using the higher levels of abstraction these platforms provide (e.g. combining one-click clustering, anomaly detection, and classification models with BigML, or quickly featurizing text and images with Indico’s Deep Learning API).
For more information on how to attend, participate or become a sponsor, please visit http://www.papis.io/workshops/operational-machine-learning
Special Offer – 30% Discount!
Please take a moment to register now and take advantage of the special 30% discount. Visit the event page and register before 22nd May to get 30% off. If you have any questions about the workshop or registration, please feel free to contact us at firstname.lastname@example.org.
Happy Machine Learning!
Dr. Louis Dorard and Ali Syed
Guest post by Dr. Mike Ashcroft
If you know all there is to know about Data Science and Machine Learning, you may want to stop reading now.
Oh good, we’re alone.
Despite the hype and confusion surrounding Data Science, the need for people who can interpret data and use it to find patterns and predictions to help organizations make informed business decisions is very real. Data Science is fueling the digital economy, and we need to move it to the very center of our business, research and social change endeavours. Data Science is bringing new levels of speed, relevance, and precision to the way we design and manage businesses and operating models. Machine Learning is without a doubt the core aspect of Data Science and predictive analytics in general. In health care, Machine Learning is changing the way doctors identify people at risk of developing certain diseases; in retail, Machine Learning is used to analyze purchasing data to anticipate trends; CRM and marketing experts use it to tailor campaigns and offers.
Machine Learning is simple: We have the algorithms, we have the experience and, these days, we have the data. The ‘complicated’ mathematics behind the data revolution is the process of the cumulative application of basic techniques that can be understood in terms of mathematics we learnt in high school and early college. Providing this understanding is the most important facet of Data Science education: Only those who understand the tools they use are able to choose the appropriate technique for the tasks they face. I have never met an organization prepared to trust their data analysis to analysts who cannot explain why they use the techniques they do.
The Fundamentals of Machine Learning bootcamp is designed to give you this understanding. It will provide you with the ability to apply the most powerful techniques in Machine Learning, to select appropriate techniques for particular problems, and to say exactly what these techniques do and why they work in a way that is understandable to data analysis stakeholders.
The Fundamentals of Machine Learning bootcamp will take you through the conceptual and applied foundations of the subject. Topics covered will include Machine Learning theory, types of learning, techniques, models and methods. The labs give you practical experience using the R programming language and its packages to apply the main concepts and techniques of Machine Learning. In this bootcamp, our goal is to give you the basic skills that you need to understand Machine Learning algorithms and models, and interpret their output, which is important for solving a range of data science problems. This is an applied Machine Learning course, and we focus on the intuitions and practical know-how needed to get Machine Learning algorithms to work in practice, rather than the mathematical equations and derivations.
Using actual data, the bootcamp begins by reviewing important basic statistical methods. You will learn to use the popular statistical programming language R to build these simple models from the ground up. You will then see how these simple techniques can be improved, combined, augmented and adjusted to produce powerful statistical tools for different tasks in data analysis. In this way, you will learn to see advanced Machine Learning techniques not as black boxes, but as principled techniques used to unlock patterns from data.
Over the course of five days, over two dozen techniques will be examined, implemented through supervised exercises and tutorials, and compared. You will learn the relative advantages and disadvantages of different types of techniques in different contexts. You will see how some models are entirely data driven, while others can be used to encode defeasible expert knowledge. You will learn methods for validating selected models and techniques and for choosing among alternative methods.
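As an illustration of choosing among alternative methods, the sketch below compares two candidate models with 5-fold cross-validation. It is written in Python with scikit-learn rather than the R used in the bootcamp labs, and the dataset, the two candidates and the number of folds are assumptions made for this example, not material from the course.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Two candidate techniques to compare; any scikit-learn estimator would do.
candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

mean_scores = {}
for name, estimator in candidates.items():
    # Each model is trained on 4/5 of the data and scored on the remaining 1/5,
    # five times over, so every example is used for validation exactly once.
    scores = cross_val_score(estimator, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {mean_scores[name]:.3f}")

# Pick the candidate with the highest mean validation accuracy.
best = max(mean_scores, key=mean_scores.get)
print("selected:", best)
```

On a dataset this small the difference between candidates may be within noise, so the per-fold spread is worth looking at as well as the mean.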
As we proceed we discuss with examples the sorts of data that suit these different approaches, and you will continue to apply these techniques ‘live’ in R. All topic areas have practical exercises, where you implement the algorithms we are looking at, as well as analyze their outputs and their suitability to particular problems. It is an essential part of the course aims that you get real ‘hands on’ experience working with the techniques we cover, in the comfortable environment of a classroom where you can discuss and work through problems you encounter with the instructor (me). The purpose is to arm you with a set of tools that you know how to apply, how to explain and when to use, as well as their theoretical background.
Fundamentals of Machine Learning bootcamp is for students, researchers and professionals from industry, services, social and public sectors who wish to develop the ability to turn data into meaningful and actionable insights. The greatest care is taken to provide bootcamp participants with high quality instruction that makes the journey of understanding and using advanced data analysis tools as easy and enjoyable as possible. So join us, master the science behind ‘data science’ and equip yourself for a role in the data revolution.
Read and download the bootcamp brochure.
Special Offer – 25% Discount!
Please take a moment to register now and take advantage of the special 25% discount. Visit the event page and use the promo code FMLB100 to get 25% off. We are encouraging university students (postgraduate and PhD) and researchers to learn machine learning by offering them a special 50% discount. A 40% discount is also available. Seats are limited, so I encourage you to register as soon as possible.
About the Author
The author is Dr. Mike Ashcroft, Lecturer in Machine Learning and Artificial Intelligence at Uppsala University in Sweden, and founder of data analytics company Inatas AB. He has worked in the Machine Learning field for over five years, developing cutting edge software, providing professional and university courses and performing specialist consulting work. He has extensive experience teaching and working both in Europe and Asia.
Machine Learning is without a doubt the core aspect of data science and predictive analytics in general. Its intuitive, versatile and robust approach to finding patterns in the available data makes it a priceless asset for anyone who wants to turn data into insights. What’s more, today it is more accessible than ever before, thanks to the variety of libraries in the programming languages used for Machine Learning and predictive modelling. This is particularly true for R, the open-source programming platform that specializes in this kind of task.
Until relatively recently, R was thought of as a tool for statisticians, mainly because it handles statistical models proficiently through its various statistical tools. However, it has since been extended with a variety of libraries that contain efficient implementations of several Machine Learning algorithms. In addition, the introduction of parallelization libraries has enabled R to run on computer clusters, where big data dwells. What’s more, thanks to its open-source license, R has attracted many practitioners who have developed workshops, tutorials, etc., making it easier to learn than anything else in the Data Science field.
Machine Learning is also quite accessible today, due to the variety of books on it. However, most of them make underlying assumptions about what you know, and they put a lot of emphasis either on the programming aspect or on the mathematical dimension of the methods covered. It’s very difficult to find a resource that explains the ideas behind the algorithms and walks you through their implementation and the interpretation of their results, without getting overly technical.
What many people tend to forget is that Machine Learning can be quite enjoyable too. This is because it allows for a great deal of creativity in both the development of new algorithms and the implementation (and tweaking) of existing ones. Plus, all that results in getting a computer to do something intelligent that can provide value for you and your organization, bringing about a sense of accomplishment. What’s more, Machine Learning can hone your problem-solving skills and turn difficult problems into intriguing challenges that can be very educational too.
Naturally, learning Machine Learning is also about employability. Today, as more and more organizations become aware of the value of data analytics (esp. in a Data Science setting), the demand for Machine Learning practitioners has exceeded the supply. This is why there are so many books on the topic as well as a variety of university courses. However, unless you are very methodical and have lots of time, reading books won’t cut it, especially if you are looking for a job in the field sometime soon. Besides, you can’t put books on a resume. As for the university courses, these too take time, and they are often quite pricey. For this and all the other aforementioned reasons, the School of Data Science has put forward a 2-day Machine Learning workshop, an efficient way to learn the essential aspects of the field, using the R platform, at a very reasonable price.
The idea of this and all other similar learning programs developed by School of Data Science is to make Machine Learning accessible to everyone who wants to get into it, without spending months on it (you can go into more depth afterwards on your own if you want). Familiarizing yourself with the basic concepts and getting some experience of how they are applied will enable you to get into the field faster, and it will spur your enthusiasm for this fascinating subject.
This workshop aims to develop a basic understanding of Machine Learning based on supervised learning methods, through the use of the R programming platform. It describes the different types of learning and the two main categories of their applications: Classification and Regression. With a focus on the former, it takes a close look at typical Machine Learning techniques and how they apply to datasets akin to those encountered in the real world.
Our goal is to give you the basic skills that you need to understand supervised machine learning algorithms and models, and interpret their output, which is important for solving a range of data science problems. This is an applied Machine Learning course, and we focus on the intuitions and practical know-how needed to get Machine Learning algorithms to work in practice, rather than the mathematical equations and derivations.
This is a great opportunity for programmers, business analysts, technology consultants and all mortals interested in Machine Learning to learn several methods for building Machine Learning applications that solve different real-world tasks. There are lots of hands-on labs that step through real-world applications of Machine Learning.
Read and download the workshop brochure.
Special Offer – 30% Discount!
Please take a moment to register now and take advantage of the special 30% discount. Visit the event page and use the promo code MLBR30 to get 30% off.
So, join us for a packed, holistic, and enjoyable workshop this August and embark on an educational adventure in the world of Machine Learning. If you have any questions about the workshop or registration, please feel free to contact me or email me at email@example.com.
Happy Machine Learning!
Dr. Zacharias Voulgaris
You’ll probably have heard the term many times before. Nowadays it’s hard to come by an article on trending technologies (particularly information-related ones) without some reference to Machine Learning in it. It seems that suddenly the world has become aware of the immense practical value of this evergreen field of Artificial Intelligence that constitutes the heart of data science. Machine learning is a computer’s way of learning from data and examples. It’s a type of machine intelligence, and will be one of the technological disruptions of the coming years.
Many people see Machine Learning as a high-tech field that only a select few can understand and practice, while others see it as merely glorified programming. As in many other cases, the truth lies somewhere in between. Machine Learning is no longer the esoteric discipline it once was in its earlier stages of development. It has grown very popular and therefore accessible, with a variety of open-source libraries in R, Python, and other programming languages. Also, its theory has become more structured and easier to understand, while most of the methods it entails have been tested over many years on a variety of datasets. Still, Machine Learning is not trivial and involves more than just writing code. It requires a lot of work to learn, though doing a specialized degree in it is unnecessary unless you are really into research. Yet, despite the variety of literature out there, it is very hard to learn it properly on your own.
Machine Learning is used widely today for all kinds of tasks, from churn prediction in large companies, to web search, to medical diagnostics, to robotics (this in particular would have been next to impossible without Machine Learning). It’s hard to find a field that cannot benefit from Machine Learning in one way or another. The reason is simple: data abundance. With all kinds of data floating around, it is natural to gather meaningful combinations of it (creating what is known in Machine Learning as “features”), and use them to make useful predictions about the world, particularly aspects of it that hold some value for us. You can think of it as cooking skills in an environment where there is easy access to a large variety of cooking ingredients and everyone there has quite an appetite. What’s more, Machine Learning is getting better all the time, so it is quite unlikely that it will run out of methods that can turn data into valuable information more efficiently and more effectively.
But don’t take anyone’s word for all this. Look around at Machine Learning practitioners and their lives. Few of them are sitting idle. Most of them, particularly the more adept ones, earn a decent living and often win prizes at Machine Learning competitions. What’s even more important is that they usually have a good time doing what they do, because this is a line of work which is both manageable and challenging at the same time. If you are into programming, it makes it so much more interesting as it allows for the development of better quality applications, some of which can be marketed as intelligent or predictive applications.
As mentioned earlier, Machine Learning takes some effort to learn, but the whole process becomes much easier when it is done in a systematic and engaging way, with an experienced professional as your guide. This is why School of Data Science has created a series of courses, the Machine Learning Smackdown, that provide you all the help you need to learn Machine Learning properly, gaining some hands-on experience in the process. Completing the Machine Learning Smackdown will turn you into a competent Machine Learning practitioner, able to tackle real-world challenges, turning big data into big insights and big opportunities. Are you ready?
Register now for the 2nd Round, a five-day bootcamp starting on the 21st of July, to learn the basic building blocks of practical Machine Learning using Python and Scikit-Learn.
Machine Learning is the best way to exploit the opportunity presented by Big Data.
As a species we are creatures of habit: everything we do has a pattern, and our unique signature is imprinted on it. Phenomena that are beyond our conscious control also exhibit some micro and macro order. The sequence of normal heartbeats has unique characteristics that help in distinguishing it from pathology. Examples are not limited to our physiology or biological makeup; they extend well into our collective behavior. Our preferences for various products and services in the market have a repetitive structure as well. For example, there are groups of consumers who always prefer Coke over Pepsi, and another group that buys Surf over any other brand of detergent. It would be extremely useful for brands to profile these customers and channel their marketing towards individuals who fit this profile.
Data generated through our activities captures a plethora of information about our identity, likes, dislikes and so on. This information has tremendous value in every aspect of human life. Programming computers to unravel this hidden information is what Machine Learning is all about. It is the art and science of deriving insights, patterns and predictions from data. What’s more, it is the core of data science and often the difference between a superficial understanding of the data and a deep insight into it.
Tom Mitchell, in his book Machine Learning, provides a short and simple definition of what Machine Learning is:
“The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.”
In general, there are two aspects of Machine Learning: the algorithms (theory) and their implementation (application). Both of these aspects are equally important as they cover different facets of the process of learning from data and examples, unbound by assumptions of the structure of the data involved. The field of Machine Learning provides tools to automatically make decisions from data in order to achieve some goal or solve a problem. In order to perform Machine Learning, one needs to know both of these aspects well. It is for this purpose that the School of Data Science has put together a structured learning program for business and technology professionals to learn the Machine Learning theory and practical skills required to make predictions from data. This exclusive program, called Machine Learning Smackdown, starts on the 12th of June in London. Only a limited number of seats are available, so I encourage you to apply as soon as you can.
Some Machine Learning experts may try to convince you that it is a discipline that requires years of study to master. This, however, applies to every discipline, and it has never stopped anyone from learning one in sufficient depth in a shorter time. What most people lack, and what makes learning a daunting task, is proper instruction, something that School of Data Science has considered while meticulously working on the Machine Learning Smackdown courses. The courses are designed for professionals from industry, services and public sectors interested in developing the capability to turn data into meaningful insights and intelligent predictions. These courses (Smackdown rounds) are:
The first round is all about building a solid foundation upon which more can be built. The course will cover the structure of the field, the fundamental skills required to successfully perform it, and the current “hot” topics of Machine Learning. Full of examples and applications, this course will trigger thinking about how Machine Learning can be useful to you. It is a great course for business and technology professionals who want to learn the basics of Machine Learning and its use cases.
A foundation level course which aims to help you learn the basic principles around the models and methods of Machine Learning, through a series of hands-on examples that step through real-world applications of Machine Learning. Attending this course will enable you to understand the basic concepts, become confident in applying the tools and techniques, and provide a firm foundation from which to explore more advanced methods. All of its lab sessions are done using IPython and Scikit-learn.
A thorough introduction to Machine Learning, with emphasis on classification methods, through the use of the R programming platform. This course has several examples of datasets similar to those in the real world. Along with the hands-on learning of various methods for classification and regression, this course offers an understanding of validation and a selection strategy for the various techniques presented.
An advanced course offering a detailed view of a wide selection of Machine Learning methods and paradigms. It also covers Machine Learning theory and several applications of the field. The hands-on part of the course is in the R programming language.
The bottom line is that through these courses you will gain a solid understanding of Machine Learning and lay the foundations of the mindset of a practitioner. You will be able to comprehend a new method’s function and assess its performance, making further learning in the field easier and more interesting. Communication with stakeholders of all sorts will be easier for you and you will have a better appreciation of the signals that lurk in the data available. Most importantly, you will be able to do something useful with it and participate in the big data revolution that is taking place these days. Don’t miss the great opportunity to learn practical Machine Learning to solve real business and social problems.