In machine learning, classification refers to a predictive modeling problem in which a class label is predicted for a given example of input data. Examples of classification problems include: given an example (such as an email), classify it as spam or not spam; given a handwritten character, classify it as one of the known characters.
There are three types of machine learning (ML for short):
· 1. Supervised Machine Learning:
Supervised learning is an approach that involves “mentoring” a computer. The machine receives sample data, consisting of known inputs paired with known outputs, and is left to infer a function that maps the inputs to the outputs.
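As a minimal sketch of this idea, the snippet below "learns" from a handful of labeled samples by simply remembering them and answering with the label of the closest known input (a 1-nearest-neighbor rule, chosen here only for brevity). The measurements, the label names and the `predict` helper are all made up for illustration.

```python
import math

# Hypothetical labeled samples: (petal length, petal width) -> label.
# The values are invented for illustration, not taken from a real dataset.
training_inputs = [(1.4, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5)]
training_labels = ["short-petal", "short-petal", "long-petal", "long-petal"]


def predict(sample):
    """Return the label of the training input closest to `sample`."""
    distances = [math.dist(sample, known) for known in training_inputs]
    nearest_index = distances.index(min(distances))
    return training_labels[nearest_index]


# A new, unlabeled measurement is classified from the known examples.
print(predict((1.6, 0.25)))
```

The known outputs are what make this supervised: without `training_labels` the function would have nothing to return.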
· 2. Unsupervised Machine Learning:
Unsupervised learning requires no labeled sample data to teach the machine. The program is fed raw data about which little is known beyond its attributes and volume. An algorithm then processes it to find structures and patterns within the dataset and to group items based on similarity measures. Some methods rely on statistical or algebraic similarity, others on distance metrics.
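One common way to group items by a distance measure is k-means clustering. The sketch below is a simplified one-dimensional version, assuming the number of groups and the starting centroids are supplied by hand; the data points and the `k_means` helper are illustrative only.

```python
def k_means(points, centroids, iterations=10):
    """Group 1-D points around centroids by repeated assign-and-average."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# No labels are given: only raw values and a guess for the starting centroids.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, final_clusters = k_means(points, centroids=[0.0, 5.0])
print(final_centroids)
print(final_clusters)
```

The algorithm never sees a label; the two groups emerge purely from how close the values are to each other.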
· 3. Semi-supervised Machine Learning:
Semi-supervised learning aims to combine the best of both worlds by using a mixture of labeled and unlabeled data. The labeled data provides some supervision in the form of a few labels, while the larger part of training the model is done with unlabeled data. In that way it is a combination of supervised and unsupervised machine learning, which makes it applicable to a number of different problems.
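One simple way to mix the two kinds of data is self-training: fit a model on the few labeled samples, let it assign provisional "pseudo-labels" to the unlabeled samples, then refit on everything. The sketch below assumes a nearest-center classifier on one-dimensional data; the values, label names and `self_train` helper are hypothetical.

```python
def self_train(labeled, unlabeled, rounds=3):
    """labeled maps label -> list of values; unlabeled is a list of values."""
    # Start from the class centers given by the labeled data alone.
    centers = {label: sum(v) / len(v) for label, v in labeled.items()}
    pseudo_labels = []
    for _ in range(rounds):
        # Pseudo-label every unlabeled value with its nearest class center.
        pseudo_labels = [
            min(centers, key=lambda lab: abs(x - centers[lab]))
            for x in unlabeled
        ]
        # Re-estimate each center from true labels plus pseudo-labels.
        for label, values in labeled.items():
            extra = [x for x, lab in zip(unlabeled, pseudo_labels) if lab == label]
            combined = values + extra
            centers[label] = sum(combined) / len(combined)
    return centers, pseudo_labels


# Two labeled values provide the supervision; six unlabeled values do the rest.
labeled = {"small": [1.0], "large": [9.0]}
unlabeled = [2.0, 3.0, 4.0, 6.0, 7.0, 8.0]
centers, pseudo_labels = self_train(labeled, unlabeled)
print(centers)
print(pseudo_labels)
```

Here the unlabeled values pull the class centers from 1.0 and 9.0 toward the middle of each group, which is the benefit the labeled data alone could not provide.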
The first project most people build when gaining data science experience is the Iris flower dataset project. It is often called the “Hello World” project because it helps people understand what this kind of project requires and entails.
The project itself consists of a number of steps that should be completed in order to get good results. The steps are the following:
1. Matching the problem with the solution
2. Dataset preparation and pre-processing (Data collection, Data visualization, Data selection, Data transformation)
3. Dataset splitting (Training set, Test set, Validation set)
4. Modeling (Model training, Model evaluation, Improving predictions)
5. Model deployment
Each step has numerous sub-steps or variations, which may be very time-consuming.
The following images show a basic project of flower petal analysis and final prediction:
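The steps above can be sketched end to end in a few lines. This is a simplified illustration using only the Python standard library, not the code of the Iris project: the dataset is a tiny made-up table, the "model" is a nearest-class-average rule, no validation set is used, and the names `train` and `predict` are hypothetical.

```python
import random

# Step 2: a tiny, made-up dataset of (petal length, petal width, label).
samples = [
    (1.4, 0.2, "short-petal"), (1.3, 0.2, "short-petal"), (1.5, 0.3, "short-petal"),
    (1.6, 0.2, "short-petal"), (1.4, 0.3, "short-petal"), (1.7, 0.4, "short-petal"),
    (4.7, 1.4, "long-petal"), (4.5, 1.5, "long-petal"), (4.9, 1.5, "long-petal"),
    (4.0, 1.3, "long-petal"), (4.6, 1.5, "long-petal"), (4.4, 1.2, "long-petal"),
]

# Step 3: shuffle with a fixed seed, then hold out 25% of the rows for testing.
rng = random.Random(42)
shuffled = samples[:]
rng.shuffle(shuffled)
split_at = int(len(shuffled) * 0.75)
train_set, test_set = shuffled[:split_at], shuffled[split_at:]


# Step 4 (training): the model is the average feature vector of each class.
def train(rows):
    sums = {}
    for length, width, label in rows:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += length
        total[1] += width
        total[2] += 1
    return {label: (t[0] / t[2], t[1] / t[2]) for label, t in sums.items()}


def predict(model, length, width):
    """Return the label whose class average is closest to the sample."""
    return min(
        model,
        key=lambda label: (length - model[label][0]) ** 2
        + (width - model[label][1]) ** 2,
    )


model = train(train_set)

# Step 4 (evaluation): accuracy on rows the model has not seen.
correct = sum(predict(model, l, w) == label for l, w, label in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on {len(test_set)} held-out samples: {accuracy:.2f}")

# Step 5 (stand-in for deployment): classify a new measurement.
print(predict(model, 1.5, 0.25))
```

Keeping the test rows out of `train` is the point of the split: accuracy measured on the training rows would overstate how well the model handles new data.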
Figure 1: Title of project (Markdown cell)
Figure 2: The first step is importing needed libraries and dataset
Figure 3: Exploring the data
Figure 4: Plotting the data to visualize known information
Figure 5: Results should be understandable to everyone, even people with no background in Data Visualization
Figure 6: After the initial analysis is complete, the data is divided into training and testing sets
Figure 7: The model's accuracy is evaluated to determine how trustworthy its predictions are.
Figure 8: The final part is to add new data and see how the model evaluates the new specimen.
Machine Learning Applications
Machine learning is applied in many areas where data can be collected and analyzed.
· It is used for traffic prediction and in online transportation networks (for example, Uber’s pricing).
· It is used in social media services, for example to personalize news feeds and target ads.
· It is used in email filtering to distinguish legitimate messages from spam, so that spam can be filtered out.
· It is used to refine search engine results and to recommend products.
· It is used in fraud detection, to protect online assets, and in video surveillance.
An example of the use of machine learning is determining which tweets were written by a particular person. It is a good example for appreciating the complexity involved in completing ML projects. The project is showcased below.
A very important part of such projects is the explanations (Markdown cells or text cells) that describe what is happening or what is about to happen, so that anyone looking at the project can fully understand it.
Figure 9: Explanation of the project. Different font sizes and lists are used, which makes it easier for the reader to follow.
As in the previous project, the first step is to import the required libraries.
Figure 10: Importing libraries. Because the project works with tweets (posts on Twitter), a connection to Twitter has to be established.
Figure 11: Creating a class to be able to work with the data from Twitter faster
The data is then converted to pandas DataFrame(s).
Figure 12: The class is instantiated and examples are called
Figure 13: Dataframe creation from the tweets
Figure 14: Creating target (y)
Figure 15: Model choice and estimation
Figure 16: Estimating the probability that each of the two people would say the given words (source_test).
The project can be extended by continuing to analyze tweets from the same two people, or by working with different people (different examples), with results depending on their writing patterns.
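The figures do not say which model the tweet project uses, so the following is only a sketch of one plausible approach: a small multinomial Naive Bayes classifier with add-one smoothing, written with the standard library. The example tweets, the author names `person_a` and `person_b`, and the `author_probabilities` helper are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical training tweets, each paired with its (made-up) author.
tweets = [
    ("we will build great jobs", "person_a"),
    ("great rally great crowd", "person_a"),
    ("science and space launch today", "person_b"),
    ("rocket launch was a success", "person_b"),
]


def tokenize(text):
    return text.lower().split()


# "Training": count how often each author uses each word.
word_counts = {}
tweet_counts = Counter()
for text, author in tweets:
    word_counts.setdefault(author, Counter()).update(tokenize(text))
    tweet_counts[author] += 1
vocabulary = {word for counts in word_counts.values() for word in counts}


def author_probabilities(text):
    """Estimate, per author, the probability that they wrote `text`."""
    log_scores = {}
    for author, counts in word_counts.items():
        total_words = sum(counts.values())
        # Prior: the share of training tweets written by this author.
        score = math.log(tweet_counts[author] / len(tweets))
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen author/word pairs above zero.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        log_scores[author] = score
    # Convert log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {a: math.exp(s - top) for a, s in log_scores.items()}
    norm = sum(weights.values())
    return {a: w / norm for a, w in weights.items()}


probabilities = author_probabilities("rocket launch today")
print(max(probabilities, key=probabilities.get))
```

With real data, the training tweets would come from the DataFrame built earlier in the project, and far more tweets per author would be needed before the probabilities mean much.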
Machine learning is the study of algorithms that improve, or give better results, through experience, that is, as they are trained. It can be used in a number of ways and for various purposes depending on the task at hand, most notably in financial technology (FinTech) and security (fraud detection, spam detection, video surveillance).
Key Points
- What is meant by machine learning?
- What are the types of machine learning?
- What is the importance of machine learning?