Machine learning (ML) is the unsung hero behind many applications, systems, sensors, devices, and products. It is so pervasive that we can often assume its presence in most applications and systems without specifically calling it out.
In simple terms, machine learning is a computer’s ability to learn from data, and it is one of the most useful tools we have to develop intelligent systems and applications. Machine learning is used widely today for all kinds of tasks, from churn prediction in large companies, to web search, to medical diagnostics, to robotics. It’s hard to find a field that cannot benefit from machine learning in one way or another.
Machine learning’s intuitive, versatile and robust approach to finding patterns in data makes it an invaluable asset for anyone who wants to turn data into insights and predictions. What’s more, it is more accessible today than ever before, thanks to the variety of open source tools and programming languages available.
What developers actually need to know about Machine Learning
Something is wrong in the way ML is being taught to developers.
Most ML teachers like to explain how different learning algorithms work and spend a great deal of time on that. For a beginner who wants to start using ML, choosing an algorithm and setting its parameters looks like the #1 barrier to entry, and knowing how the different techniques work seems to be a key requirement for removing that barrier. However, many practitioners argue that you only need one technique to get started: random forests. Other techniques may sometimes outperform them, but in general, random forests are the most likely to perform best on a variety of problems (see “Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?”), which makes them more than enough for a developer just getting started with ML.
We would further argue that you don’t need to know all the inner workings of (random forest) learning algorithms (and the simpler decision tree learning algorithms that they use). A high-level understanding of the algorithms, the intuitions behind them, their main parameters, their possibilities and limitations is enough. You’ll know enough to start practicing and experimenting with ML, as there are great open source ML libraries (such as scikit-learn in Python) and cloud platforms that make it super easy to create predictive models from data.
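To illustrate just how little code is needed to get started, here is a minimal sketch using scikit-learn (the dataset, the held-out split, and the parameter choices below are illustrative, not recommendations from the workshop itself):

```python
# Minimal sketch: training a random forest with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small built-in dataset and hold out a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# n_estimators (the number of trees) is one of the few parameters
# worth knowing about; defaults already work well in many cases.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```

That's the whole "learning" step: a handful of lines, no knowledge of the tree-building internals required.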
So, if we give an overview of only one technique, what else can we teach?
Deploying ML models into production
It turns out that, when using ML in real-world applications, most of the work takes place before and after the learning. ML instructors rarely provide an end-to-end view of what it takes to use ML in a predictive application deployed in production. They explain one part of the problem, then assume you’ll figure out the rest and connect the dots on your own: for instance, connecting the dots between the ML libraries you were taught to use in Python, R, or Matlab, and your production application developed in Ruby, Swift, C++, etc.
Fortunately, there are now accessible solutions to this “last-mile problem”. They revolve around REST (HTTP) APIs: models are exposed as APIs, and if scaling the number of predictions served by a given model becomes an issue, these APIs can be served from multiple endpoints with load balancers in front. Platforms-as-a-Service can help with that; here is some info about Microsoft Azure ML’s scaling capabilities, Amazon ML’s, and Yhat’s Analytics Load Balancer (which you can also run on your own private infrastructure/cloud). Some of these platforms let you use whatever ML library you want; others restrict you to their own proprietary ones. In our upcoming workshop, we’ve chosen to use Azure to deploy models created with scikit-learn as APIs, and also to demonstrate how Amazon and BigML provide an even higher level of abstraction (while still producing accurate models) that can make them easier to work with in many cases.
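The "model as a REST API" pattern is simple enough to sketch with nothing but the Python standard library. Everything below is illustrative: the `/predict` route is a hypothetical endpoint, and the toy scoring function stands in for a real trained model (which you would typically load from disk, e.g. with joblib, and put behind a load balancer):

```python
# Sketch of a prediction API: POST a JSON feature vector, get a prediction back.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in for model.predict(): a real trained model would go here.
    return "positive" if sum(features) > 0 else "negative"

class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read the JSON request body and score it.
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        result = {"prediction": predict(payload["features"])}
        # Send the prediction back as JSON.
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), PredictionHandler).serve_forever()
```

Your production app, whatever language it's written in, only needs to make an HTTP call; the cloud platforms mentioned above host this layer for you and handle the scaling.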
Deployment is not the only post-learning challenge in real-world ML. You also need to find appropriate ways to evaluate and monitor your models’ performance and impact, before and after deployment.
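For the pre-deployment part of evaluation, a common sanity check is k-fold cross-validation, which gives a more reliable performance estimate than a single train/test split. A minimal sketch with scikit-learn (the dataset and parameters are illustrative only):

```python
# Estimate a model's accuracy with 5-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(scores.mean(), scores.std())  # average accuracy and its variability
```

Post-deployment monitoring is different in nature: you track live prediction quality over time, since the data your model sees in production can drift away from what it was trained on.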
The ML workflow diagram above also presents some of the steps to take before learning a model, which are about preparing the right dataset for the algorithms to run on. Before actually running any algorithm you need to…
– Define the right ML problem to tackle for your organization
– Engineer features, i.e. find ways to represent the objects on which you’ll be making predictions with ML
– Figure out when/how often you’ll need to make predictions, and how much time you’ll have for that (is there a way to do predictions in batches or do you absolutely need all your predictions to be real-time?)
– Collect data
– Prepare the actual dataset to run learning algorithms on, i.e. extract features from the “raw” collected data and clean it
– Figure out when/how often you’ll need to learn new/updated models, and how much time you’ll have for that.
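The feature-extraction step above can be sketched in a few lines of plain Python. This is a hypothetical churn-prediction example: the field names and the choice of features are purely illustrative.

```python
# Turn "raw" collected records into fixed-length numeric feature vectors.
raw_records = [
    {"signup_date": "2015-01-10", "logins_last_30d": 2,  "plan": "free"},
    {"signup_date": "2014-06-03", "logins_last_30d": 25, "plan": "pro"},
]

def extract_features(record):
    # One row per object to predict on: usage intensity,
    # plus a one-hot encoding of the categorical "plan" field.
    return [
        record["logins_last_30d"],
        1 if record["plan"] == "pro" else 0,
        1 if record["plan"] == "free" else 0,
    ]

dataset = [extract_features(r) for r in raw_records]
print(dataset)  # [[2, 0, 1], [25, 1, 0]]
```

In practice this step (together with cleaning) often takes more time than the learning itself, which is why it deserves a place in the workflow rather than being assumed away.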
Operational Machine Learning Workshop for Developers
We are excited to officially announce today that, in collaboration with PAPIs, we are launching a new learning track called PAPIs Workshops. In partnership with leading education and industry organizations, we will offer practical, industry-focused learning programs in various locations around the world (starting with Madrid, London and Boston).
Most Machine Learning courses are taught from the perspective of a Data Scientist and focus on the techniques and algorithms used to learn from data. This workshop takes the perspective of an application developer and instead provides an end-to-end view of integrating ML into your applications. We’ll go all the way from data preparation to the integration of predictive models in your domain and their deployment in production.
Our first workshop is aimed at developers and is a platform-agnostic introduction to operational Machine Learning with open source and cloud platforms. It’s a 2-day hands-on workshop given in a classroom setting. Day 1 covers an intro to ML and the creation, operationalization, and evaluation of predictive models. Day 2 features model selection, ensembles, data preparation, a practical overview of advanced topics such as unsupervised learning and deep learning, and a methodology for developing your own ML use case.
We’re using Python with libraries such as Pandas, scikit-learn and SKLL, and cloud platforms such as Microsoft Azure ML, Amazon ML, BigML and Indico. We think these platforms are great for many organizations and real-world use cases, but even if you eventually find they’re not the perfect fit for you, we’d still recommend using them for learning and practicing ML. ML-as-a-Service makes it much quicker to set up work environments (e.g. Azure ML has the most popular libraries preinstalled and can run interactive Jupyter notebooks, which you can access from your browser), but also to experiment with ML thanks to the higher levels of abstraction it provides (e.g. combining one-click clustering, anomaly detection, and classification models with BigML, or quickly featurizing text and images with Indico’s Deep Learning API).
For more information on how to attend, participate or become a sponsor, please visit http://www.papis.io/workshops/operational-machine-learning
Special Offer – 30% Discount!
Please take a moment to register now and take advantage of the special 30% discount. Visit the event page and register before 22nd May to get 30% off. If you have any questions about the workshop or registration, please feel free to contact us at email@example.com.
Happy Machine Learning!
Dr. Louis Dorard and Ali Syed
In the last two years, the world has produced more data than in all of prior human history. All this data, along with digital technologies powered by machine intelligence, is changing the way everyone operates. For businesses and government services to compete in a connected economy, they need to be able to make sense of all the valuable information being generated to unleash new waves of productivity, growth and innovation.
We’ve started Data Science Middle East (DSME) Foundation with the vision to create a regional collaboration on digital skills and data talent development. DSME brings together business and technology professionals, researchers, experts, practitioners, and industry leaders to promote digital data literacy, research and innovation through open projects, capacity building, and community engagements.
This initiative marks a big step forward in uniting the worlds of business leadership, digital transformation, data science and new talent development. With support from local governments, industry organizations, communities and academic institutions, DSME will offer a variety of data education opportunities and digital workforce skills development projects.
Data science isn’t just for data scientists. In a massively connected, data-driven world, it is imperative that the workforce of today and tomorrow is able to understand what data is available and use scientific methods to analyze and interpret it. DSME is here to help you learn and apply the art and science of turning data into meaningful insights and intelligent predictions.
This launch is an invitation to industry and academia: DSME is open for business. We look forward to working with all of you to put data at the core of economic development and innovation, with the aim of building a sustainable Middle East. If you are interested in collaborating with DSME or have exciting ideas for developing digital data talent, please contact us at firstname.lastname@example.org
We (the DSME and Everati teams) are excited to confirm that the Middle East’s first professional workshop on the most in-demand skills of 2016 will be held in Dubai (UAE) on 25-27 April, 2016. It’s a great opportunity to learn data science and machine learning hands-on and advance your knowledge and career. Attend this workshop to learn the fundamentals of data science and machine learning, and leave armed with practical skills to extract value from data.
Everati, our regional partner for the data science workshops, is a UAE-based organization dedicated to providing global business information through premier, high-profile, specialized events. For more information on how to attend, participate or become a sponsor, please visit www.datasciencetraining.me
As data becomes more influential in shaping the work and strategies of so many industries, we want to be prepared to raise a generation with the skills and knowledge to work with data. For collaborations, partnership and joint industry programs, get in touch at email@example.com or drop me a line directly. We would love to work with you to enable digital and data-driven Middle East.
Ali Syed
Guest post by Louis Dorard, author of Bootstrapping Machine Learning
Prediction APIs are a growing trend, and they are changing the way people approach Data Science. Recently, Persontyle partnered with BigML, a company that provides one such API. Services like BigML abstract away the complexities of learning models from data and making predictions against those models. Thanks to Prediction APIs, anyone is now in a position to do Machine Learning.
However, apart from a few blog posts here and there, there was no long-form resource to introduce you to Machine Learning through Prediction APIs. The books on the market will teach you how to implement Machine Learning algorithms, but most people who could benefit from ML are not willing to invest the time and effort required to understand how those algorithms work. As Bret Victor wrote: “Until machine learning is as accessible and effortless as typing the word ‘learn,’ it will never become widespread.”
I was really excited when I first learnt about Prediction APIs in 2011. I kept an eye on them and eventually decided to write the first guide to using them. Although they do make machine learning quite effortless, people still need to be educated about its possibilities and limitations, how to prepare the data to learn from, and what to do once a machine learning model has been created. As you can imagine, my core audience is not people who want to become experts in the field but people looking to leverage these technologies for their apps or businesses. They may be hackers, startuppers, CTOs, lead devs, analysts, … They are not going to become Data Scientists, but rather what you could call Data Artisans, and they can now do things that only Data Scientists could in the past.
Instead of writing a traditional book, I went for a self-published ebook, inspired by successful self-published authors such as Nathan Barry, Sacha Greif and even Guy Kawasaki. The ebook is complemented by extra material such as videos, screencasts, tutorials, IPython notebooks, code, datasets, a Virtual Machine, and free subscriptions to BigML. The objective is to save time for anyone who wants to get started with BigML or the Google Prediction API.
For those who need more hands-on training or who want to be able to ask me questions in person, The School of Data Science and I will soon run a workshop on Prediction APIs: stay tuned! In the meantime, you can check out the book and start using Machine Learning within a day!
My goal is to help you create better apps by using Machine Learning and Prediction APIs. If you like you can read more about me and you can follow me on Twitter (@louisdorard) to see what I’m up to.
Download a free sample of the book with a detailed table of contents.
Persontyle launches “The School of Data Science”, offering data science and big data education for professionals
London, United Kingdom, 25 April 2014 — Persontyle, the provider of Data Science education and services, is proud to announce the launch of “The School of Data Science”, offering the most comprehensive portfolio of Data Science and Machine Learning courses for professionals.
With data scientist declared by Harvard Business Review the ‘sexiest job of the 21st century’, Data Science skills are becoming a key asset in any organization confronted with the daunting challenge of making sense of data that comes in varieties and volumes never encountered before. The rapid growth of data has been such that few sectors can ignore it. The problem for many organizations is finding the Data Science expertise to understand what this data means, as demand for data scientists outstrips supply.
The most commonly cited barrier to Data Science adoption is a lack of professionals with the right skills or training, and many studies and surveys confirm that demand for data scientists will continue to outpace the supply of talent. To address this, Persontyle is announcing the most comprehensive portfolio of Data Science and Machine Learning courses, covering key subject areas, methods and technologies to help cultivate professionals who can extract value, insights and predictions from data.
“The School of Data Science aims to bring accessible, affordable, engaging, and highly interactive Data Science and Machine Learning education to the world”, says Ali Syed, founder and CEO of Persontyle. “Our training programs and courses are carefully tailored to meet the needs of both established industry professionals and those that are new to the field of Data Science. We are providing great opportunity for all business and technology professionals to develop the skills required to think and work like data scientists.”
The School of Data Science training programs are designed by renowned academics, authors and professionals in the fields of Data Science, Machine Learning, big data, applied statistics and industry domains, and will be offered in the format of instructor-led short courses and bootcamps. Learning experiences are designed to cover the theory, tools and applied practices required throughout the entire Data Science lifecycle, from asking the relevant questions to making predictions and visualizing results.
A brief overview of the School of Data Science education offerings is provided below.

Data Science and Machine Learning Courses
A full range of short courses is available to help industry professionals learn the skills they need to become practicing data scientists. To make the learning experience easier, the courses are curated in sequential order for Beginner, Foundation and Advanced level audiences. Please visit the Courses page to learn more about the courses available for open enrolment. Spaces are limited; we encourage you to register as soon as you can. http://www.persontyle.com/courses/

Big Data Academy
With Big Data becoming ever more relevant, The School of Data Science is offering Hadoop-based training courses focused on how organizations can take advantage of Hadoop as an enterprise data management platform to deliver value and profitability by analyzing massive-scale data sets. Courses are designed for professionals, technologists and data scientists to learn how to store, process, and analyze growing and evolving datasets.

Corporate Data Science Training Solutions
Data Science has become increasingly important to how corporations function, and ensuring that data creates value calls for a reskilling effort. Corporate Data Science training lets organizations offer their workforce the opportunity to develop skills and expertise in statistics, model development, the data science lifecycle, Machine Learning, data engineering, data visualization, and other decision-making skills and concepts. Organizations can engage the Persontyle team of data scientists to conduct a free half-day workshop to develop a tailored training solution focused on their requirements, challenges and opportunities.

The Data Science Incubator (PhD to Data Scientist)
A 6-week bootcamp fellowship in London that prepares talented STEM PhDs to work as data scientists. The program is for quantitatively oriented research professionals (PhD students and postdocs) who have a strong desire to round off their Data Science skills in preparation for assuming senior data scientist roles in an industry setting.

Open Data Science Fellows Program (ODSFP)
ODSFP is a collaborative platform for Data Science enthusiasts to get together to learn Data Science, solve meaningful data problems, ask questions, and share ideas in a social setting. ODSFP is a not-for-profit program dedicated to the dissemination of Data Science and to applying it to Open Data for social good.
The connected world is a world of data. All actions, interactions and behaviours, of both humans and machines, are stored and transmitted across networks to keep the engines of the digital age running. We now live in an increasingly data-driven world, but it is harder than ever to detect the meaningful signals amid the noise of data. Learning Data Science concepts and practices empowers us to deal with this data, equips us with the understanding to ignore the noise and heed the signals, and most importantly helps us focus on delivering meaningful value.
The School of Data Science presents unprecedented opportunities to develop the necessary skills and expertise required to use data to unveil patterns, insights and predictions.

About Persontyle
Persontyle is a people-focused social enterprise obsessed with helping individuals, organizations and communities learn and apply Data Science to deliver real, meaningful and enduring value. Persontyle was born from the idea of creating a platform for the people, by the people, to share the passion, knowledge, theory and practices of analyzing data scientifically.
Persontyle offers a comprehensive portfolio of Data Science learning opportunities and services for organizations (profit, non-profit, and government) to deliver breakthrough impact and meaningful value using data.
Persontyle Media Relations
Office: +44 203 239 3141
Mobile: +44 773 785 1449
Email: firstname.lastname@example.org