Category Archives: Artificial Intelligence

Artificial Intelligence
Machine Learning
Deep Learning

ChatGPT can be used to hack you – Experts warn

Chatbots have become increasingly popular in recent years, with many businesses and organizations turning to them as a way to improve customer service and streamline communication. One of the most advanced and powerful chatbots available is ChatGPT, developed by OpenAI. However, just like any technology, ChatGPT can be used for malicious purposes. In this article, we will explore how ChatGPT can be used for hacking and the potential risks it poses.

What is ChatGPT?

ChatGPT is a large language model trained on an ocean of data. It is used to generate human-like text and can be useful for a variety of natural language processing tasks, such as generating text, translating languages, summarizing long documents, and answering questions on almost any topic.

This means that it can have conversations with humans that are almost indistinguishable from those with another person. In other words, hackers could use ChatGPT to impersonate individuals or organizations and do all sorts of nasty things to trick people into revealing sensitive information.

Ways You Can Be Hacked

One way that ChatGPT can be used for hacking is through social engineering. Hackers can use the chatbot to impersonate a trusted individual or organization and trick people into providing personal information or login credentials. They can also use ChatGPT to create phishing scams, sending messages that contain malicious links or attachments.

Hackers can use the chatbot to gather information about a target, such as their interests and habits, which can be used to tailor future attacks. They can also use ChatGPT to map out a target’s network and identify potential vulnerabilities.

It’s also important to note that ChatGPT can be used to automate hacking techniques. For example, a hacker can use ChatGPT to create a script that automates the process of guessing passwords, increasing the chances of success.

Experts Warn

OpenAI has implemented certain safety measures to prevent ChatGPT from being used for malicious purposes. For instance, it refuses requests that involve harmful or unethical activities, such as generating malicious code, hate speech, or false information. However, some users have discovered workarounds that bypass these safety measures, enabling them to use ChatGPT for malicious activities.

These workarounds include modifying the input to the model, or using it in a different context or with a different objective. For example, a user might use ChatGPT to generate seemingly harmless code, but then use it to launch a malicious attack.

Just recently, the well-known cybersecurity company Check Point Software Technologies reported instances of ChatGPT being manipulated to create malicious code capable of stealing computer files, executing malware, phishing for credentials, and even encrypting an entire system in a ransomware attack.

“We’re finding that there are a number of less-skilled hackers or wannabe hackers who are utilizing this tool to develop basic low-level code that is actually accurate enough and capable enough to be used in very basic-level attacks,” Rob Falzon, head of engineering at Check Point, told CBC News.

Other experts in the field said ChatGPT could significantly speed up and simplify cybercrime for unethical hackers, who just need to find a clever way to ask the bot the right questions.

According to security researcher Shmuel Gihon, ChatGPT is a great tool for software developers to write better code. However, he also pinpointed the advantages bad actors might gain from this tool.

“As a threat actor, if I can improve my hacking tools, my ransomware, my malware every three to four months, my developing time might be cut by half or more. So the cat-and-mouse game that defense vendors play with threat actors could become way harder for them.”

TechCrunch reported that it tried to create a realistic phishing email using ChatGPT. The chatbot initially refused to create malicious content, but with a slight change in wording it complied. TechCrunch also interviewed a number of experts in the security industry, many of whom believe in the tool's potential to enable bad activities by hackers.

Chester Wisniewski, principal research scientist at Sophos, said people could carry out all sorts of social engineering attacks using ChatGPT.

“At a basic level, I have been able to write some great phishing lures with it, and I expect it could be utilized to have more realistic interactive conversations for business email compromise and even attacks over Facebook Messenger, WhatsApp, or other chat apps,” Wisniewski told TechCrunch.

According to Infosecurity Magazine, Russian cyber-criminals have been observed on dark web forums trying to bypass OpenAI's API restrictions to access ChatGPT for nefarious purposes. They have discussed using stolen payment cards to pay for upgraded OpenAI accounts and shared blog posts on how to bypass OpenAI's geo controls. Some have even created tutorials explaining how to use semi-legal online SMS services to register for ChatGPT.

“Right now, we are seeing Russian hackers already discussing and checking how to get past the geofencing to use ChatGPT for their malicious purposes,” said Sergey Shykevich, threat intelligence group manager at Check Point Software Technologies.

According to a recent report from WithSecure, a Helsinki-based cybersecurity company, malicious actors may soon be able to exploit ChatGPT by figuring out how to ask harmful prompts, potentially leading to phishing attempts, harassment, and the dissemination of false information.

Measures Taken by OpenAI

It is important to note that while OpenAI has implemented safety measures to prevent the abuse of ChatGPT, it is still possible for malicious actors to bypass these measures. This highlights the need for individuals and organizations to be aware of the potential risks associated with advanced technologies such as ChatGPT and to take appropriate security measures to protect against hacking attempts.

OpenAI is actively working to improve the safety of its product and to respond to potential threats and workarounds identified by cybersecurity experts. According to Hadis Karimipour, an associate professor, OpenAI has refined its safety measures over the past few weeks to prevent ChatGPT from being used for malicious purposes.

“At the beginning, it might have been a lot easier for you to not be an expert or have no knowledge [of coding], to be able to develop a code that can be used for malicious purposes. But now, it’s a lot more difficult,” Karimipour said.

Every new innovation has its pros and cons. Applications like this undergo ongoing improvements and follow specific strategies to maximize safety, and that is the case with ChatGPT too.

Conclusion

Overall, ChatGPT is a powerful and versatile tool that has the capability to change the way businesses communicate with their customers and clients. However, it’s crucial to be aware of the possible risks and take necessary security precautions to defend against hacking attempts. By being cautious and proactive, organizations and individuals can continue to benefit from ChatGPT while minimizing the potential dangers.

Everyone is talking about ChatGPT: Here is what I learned.

ChatGPT – What is it?

ChatGPT is a large language model trained by OpenAI for generating human-like text. It can be useful for a variety of natural language processing tasks, such as generating text, translating languages, summarizing long documents, and answering questions. Because it is trained on a massive amount of text data, it has a wide range of knowledge and can generate text that is difficult for other models to produce. However, like all language models, ChatGPT has limitations and may not always produce accurate or appropriate text, so it should be used with caution.

It is not capable of making decisions or taking actions on its own. It is up to users to decide how to use ChatGPT and other AI technologies, and it is ultimately the responsibility of human beings to determine how they will be used and how they will impact society.

Facts about ChatGPT:

  • Created by OpenAI.
  • Founded in 2015 by high-profile entrepreneurs including Elon Musk and Sam Altman.
  • Valued at around $20 billion.
  • Other OpenAI products include DALL·E 2 and Whisper.
  • ChatGPT is powered by the GPT-3.5 series of models.
  • Crossed 1 million users in just 5 days.

Why is it important and how can we use it?

For chat – Simple chat

As the name suggests, you can use ChatGPT simply to chat. Ask it almost anything and it will usually give you a sensible answer. Beyond casual conversation, it can also help generate content for digital marketing campaigns and for engaging with an audience on a website.

In short, just ask something and you will get a response, usually a sensible one, though it may occasionally generate incorrect information.

Write, debug and code explaining

If you are a programmer, this is huge news for you. You can now use ChatGPT to write and debug code. The app not only writes code but also fixes bugs and generates explanations for the code it writes.

The development process could become significantly faster and cheaper if we are able to use AI-powered apps to write code. It seems this is happening, and it is just the beginning.

ChatGPT explains complex programming topics and concepts at an almost human level, and I wonder what is stopping it from becoming an alternative to human coders.

For creative writing

Large language models are really good at generating coherent, structured text, and ChatGPT is no exception: with a little guidance and observation, structuring creative writing is easier than ever. It can handle complex instructions and produce longer-form content such as poems, fiction, non-fiction, and even long-form essays.

ChatGPT is able to keep track of what has been said previously and use that information to generate appropriate responses. It generates formal or informal text, short and long form, depending on the context and the tone of the conversation. This tool can be beneficial for creating content for social media or other online platforms.

One user asked the chatbot to explain a regular expression and to write a short essay on "the effects of westward expansion on the Civil War". In both cases it was incredibly creative, delivering pretty good results.

Deploy a virtual machine (VM)

Researcher Jonas Degrave showed how he turned ChatGPT into what appears to be a full-fledged Linux terminal, letting you interact with the "virtual machine" it simulates right from your web browser. A virtual machine running inside ChatGPT feels like magic. See the written article by Jonas here.

Security

We are not surprised at all that people are using it for all sorts of purposes. Some users are using ChatGPT to reverse engineer shellcode and rewrite it in C, while others are playing with it to generate nmap scans.

Limitations

Like any other machine learning model, it is only as good as the data it has been trained on. This means it may not be able to provide accurate answers to questions, or generate sensible responses, outside the scope of its training data. Additionally, ChatGPT is a text-based model, so it is not capable of providing visual or audio responses. Finally, ChatGPT is not able to browse the internet or access external information, so it can only provide information it has been trained to generate based on the input it receives.

Wrap up

We are currently experiencing a huge development in this space, thanks to ChatGPT, which is taking the world by storm. It can be used in many areas, including social media content generation, voice assistants, chatbots and virtual assistants, customer care applications, meetings, code generation, and security research. This opens the door for a new generation of chatbot innovation, possibly the kind that many anticipated but didn't see come to pass. At least up till this point.

The Best Machine Learning Books That All Data Scientists Must Read

Machine learning is an exciting field that has been growing rapidly in recent years, and it's only expected to keep growing. There are many different topics that data scientists can explore while studying machine learning, but there are some core principles and key texts that you should definitely be familiar with if you want to be taken seriously in this industry.

Today, I'll take a look at the top machine learning textbooks that I'm currently reading and explain why you should read them as well.

1. Artificial Intelligence: A Modern Approach

Artificial Intelligence is a massive and multi-disciplinary field, so it's no surprise that there are plenty of resources for those looking to jump in. The most highly rated AI textbook on Amazon is Peter Norvig and Stuart Russell's Artificial Intelligence: A Modern Approach. First published in 1995, it has been updated multiple times since. This is a heavy book, with 27 chapters covering problem solving and search, logic and inference, planning, probabilistic reasoning and decision making, learning, communication, perception, and robotics: basically everything from common algorithms to neural networks and natural language processing.

Topics Covered:

  • Logical Agents
  • Learning, communication, perception and robotics
  • Supervised, Unsupervised learning and Reinforcement Learning, Machine Learning models and Algorithms
  • Probabilistic Reasoning
  • Natural Language Processing

This book is not only for students but is also used by many experts in the field. Here are a few reviews from academics and professionals.

Expert Opinions

I like this book very much. When in doubt I look there, and usually find what I am looking for, or I find references on where to go to study the problem more in depth. I like that it tries to show how various topics are interrelated, and to give general architectures for general problems … It is a jump in quality with respect to the AI books that were previously available. — Prof. Giorgio Ingargiola (Temple).

Really excellent on the whole and it makes teaching AI a lot easier. — Prof. Ram Nevatia (USC).

It is an impressive book, which begins just the way I want to teach, with a discussion of agents, and ties all the topics together in a beautiful way. — Prof. George Bekey (USC).

2. Deep Learning (Adaptive Computation and Machine Learning series)

Ian Goodfellow, Yoshua Bengio, and Aaron Courville are three researchers who stand at the forefront of deep learning. The book provides general context and comprehensive coverage of the mathematical foundations of deep learning. It is highly recommended if you want to start your journey with deep learning.

Topics Covered:

The first few chapters cover the mathematical concepts behind deep learning. You will be able to grasp these without difficulty if you have a working knowledge of linear algebra, probability, and statistics. Part 3 covers deep learning research, which includes different techniques and methods and is quite challenging.

  • Numerical Computation
  • Deep Feedforward Networks
  • Optimization for Training Deep Models
  • Deep Learning Research

Expert Opinions

“Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.” —Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX.

“If you want to know where deep learning came from, what it is good for, and where it is going, read this book.” —Geoffrey Hinton FRS, Professor, University of Toronto, Research Scientist at Google.

3. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 2nd Edition

This is a must-read book for everyone who seriously wants to enter this field. It is the perfect book for machine learning practitioners, covering the most important aspects of machine learning, such as classification, regression, clustering, and dimensionality reduction. It simplifies highly complex concepts through concrete, real-world examples, and it provides a detailed introduction to popular frameworks such as Scikit-Learn, Keras, and TensorFlow. Author Aurélien Géron presents all the concepts in a beautiful manner so you can gain an intuitive understanding of the concepts and tools for building intelligent systems.

You need programming experience to get started, so learning the Python programming language will greatly help you complete this book.

Topics Covered:

  • Introduction to machine learning and history
  • Use Scikit-Learn to track an example machine-learning project end-to-end
  • Explore several training models such as Support Vector Machines, Decision Trees, Random Forests, and Ensemble methods
  • Use the TensorFlow library to build and train neural nets
  • Dive into neural net architectures, including convolutional nets, recurrent nets, and deep reinforcement learning
  • Techniques for training and scaling deep neural nets.

Expert Opinions

“An exceptional resource to study Machine Learning. You will find clear-minded, intuitive explanations, and a wealth of practical tips.” —François Chollet, Author of Keras, author of Deep Learning with Python.

“This book is a great introduction to the theory and practice of solving problems with neural networks; I recommend it to anyone interested in learning about practical ML.” — Peter Warden, Mobile Lead for TensorFlow.

4. Python Machine Learning – Second Edition: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2nd Edition

You do not want to miss this book if you really want to learn machine learning. It is a perfect choice, as its primary focus is the implementation of various machine learning algorithms. The book places a special emphasis on using scikit-learn to implement these algorithms, and is a must for anyone looking to develop mastery of algorithm development.

Sebastian Raschka and Vahid Mirjalili have since updated it to a third edition in 2020, covering TensorFlow 2, scikit-learn, reinforcement learning, and GANs in the recent release.

Topics Covered:

  • Explore and understand the key frameworks for data science, machine learning and deep learning
  • Master deep neural network implementation using the TensorFlow library
  • Embed machine learning models in web applications

These books are well worth reading if you want to advance your machine learning knowledge and skills. I have printed copies of each book mentioned above. In addition, I have also started reading several other ML books (PDF copies), watching YouTube videos, and reading research papers to improve my ML skills and knowledge.

How Weight and Bias Impact the Output Value of a Neuron

I had the opportunity to stay at one of the finest water villa resorts in the Maldives, Gili Lankanfushi (No News No Shoes), for free. Yes, for free!

So why not learn something while here? Thanks to the Gili team and management.

Let’s begin….


An artificial neuron is a digital construct that seeks to simulate the behavior of a biological neuron in the human brain. Large numbers of artificial neurons are digitally connected to each other to make up an artificial neural network. The artificial neuron is therefore the fundamental building block of any neural network.


Artificial Neuron


An artificial neuron is a mathematical model that mimics a biological neuron. Each neuron receives one or more inputs, combines them into a weighted sum, and passes the result through an activation function to produce an output.
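
As a minimal sketch in Python (the inputs, weights, and the sigmoid activation here are arbitrary choices for illustration):

import math

def neuron(inputs, weights, bias):
    # weighted sum of the inputs plus the bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # sigmoid activation squashes the result into (0, 1)
    return 1 / (1 + math.exp(-z))

print(neuron([0.5, 0.3], [0.8, -0.2], 0.1))  # z = 0.44, output ≈ 0.608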

Weights and Biases


Weights and biases are the learnable parameters of a machine learning model. As inputs are transmitted between neurons, the weights are applied to the inputs along with the bias.

Weights control the strength of the connection between two neurons. A weight decides how much influence the input will have on the output.

Biases are constant values. Bias units are not influenced by the previous layer, but they do have outgoing connections with their own weights.


How Neural Networks Work


At a very high level, a simple neural network consists of an input layer, an output layer, and hidden layers in between. These layers are connected via a series of nodes, forming a giant, complex network.

Within each node there are weight and bias values. As an input enters the node, it is multiplied by the weight, the bias is added, and the result is passed to the next layer in the neural network. In this way, signals are transmitted from one layer to another until they reach the final output layer.

This complex underlying structure gives computers the power to "think like humans" and produce sophisticated cognitive results.


So let's begin with a single-input neuron's output, with a weight of 1, a bias of 0, and input x. Here the output is simply weight × x + bias.

In the second example we will adjust the weight, keeping the bias unchanged, and see how the slope of the function changes.

As you can see, if we increase the value of the weight, the slope gets steeper. However, if we reduce the weight, the slope decreases.

Now, what if we negate the weight? Obviously, the slope turns negative.

As mentioned earlier, these graphs visualize how the weight affects the output value of a single neuron. Now let's change things a little: this time we will keep the weight at 1.0 and try different bias values, starting with a weight of 1.0 and a bias of 2.0.

As we increase the bias, the function output shifts upward. If we decrease the bias, the overall function output moves downward, as the sketch below illustrates.
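
The graphs are easy to reproduce. Here is a minimal matplotlib sketch (the weight/bias pairs are the ones discussed above) that plots a single neuron's raw output, output = weight × x + bias:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 100)
# (weight, bias) pairs discussed above: steeper, negated, shifted up and down
for w, b in [(1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (1.0, 2.0), (1.0, -2.0)]:
    plt.plot(x, w * x + b, label=f"weight={w}, bias={b}")
plt.xlabel("input x")
plt.ylabel("neuron output")
plt.legend()
plt.show()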

Now we have learned something about artificial neurons. Artificial neurons mimic how the human brain works, and they require weight and bias values to produce an output.

Important Points of Supervised Learning


For the first time ever, I had the opportunity to go on a multi-day fishing trip with a group of friends on a local fishing boat. The trip was 6 days long; we spent roughly 100 hours in the middle of the ocean, 20-50 nautical miles offshore. It was a totally different experience for me, and during the trip I tried to learn something about supervised learning.

So let’s go…



  • Supervised learning models learn from labeled data, known as training data.
  • Training data contains different patterns.
  • The algorithm learns the underlying patterns during the training process.
  • In the testing phase, what the model learned from the training data helps it predict the desired outcome for unforeseen data.

Supervised Learning Algorithms

  • k-Nearest Neighbors
  • Linear Regression
    • formula for linear regression: y = ax + b (see the sketch after this list)
  • Logistic Regression
    • formula for logistic regression: y = ln(P/(1-P))
  • Support Vector Machines (SVM)
  • Decision Trees and Random Forests
  • Neural Networks
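
As a minimal sketch of the two formulas above (scikit-learn assumed; the toy data is made up for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1], [2], [3], [4], [5]])

# linear regression learns y = ax + b
lin = LinearRegression().fit(X, np.array([2.1, 4.2, 5.9, 8.1, 9.8]))
print(lin.coef_, lin.intercept_)   # slope a and intercept b

# logistic regression models ln(P/(1-P)) as a linear function of x
log = LogisticRegression().fit(X, np.array([0, 0, 0, 1, 1]))
print(log.predict_proba([[2.5]]))  # probability of each class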

Advantages of Supervised Learning

  • Supervised learning is easy to understand.
  • The number of classes or parameters is known before the model is deployed.

Challenges of Supervised Learning

  • It requires some amount of expertise to structure accurately.
  • Training a proper model can be very time-intensive.
  • Human errors in the dataset can result in poorly performing models.
  • It cannot cluster or classify data on its own.

Supervised Learning Models Can Be Used in:

  • Image and object recognition: Supervised learning algorithms can be used to identify objects in videos or images.
  • Predictive analytics: Provides deep insights into various business data points, helping companies make decisions more easily and accurately.
  • Customer sentiment analysis: Makes it easy to extract and classify important pieces of information, such as emotion, intent, and context, from large volumes of data.
  • Spam detection: Classification algorithms are used to recognize patterns or anomalies in a dataset.

A Gentle Introduction to Batch Learning Process

Introduction

Machine learning training strategies fall into two main categories: batch learning and online learning. In batch learning, models learn offline, while in online learning data flows into the learning algorithm as a continuous stream. In this article, you will learn:

  • A gentle introduction to batch learning.
  • Problems in batch learning.
  • How the online learning method solves batch learning's problems.

So let’s begin…


What is Batch Learning?

Data preprocessing is an important step in machine learning projects. It includes activities such as data cleaning, data reduction, splitting the dataset (into training and testing sets), and data normalization. To train an accurate model, a large dataset is required. In the batch learning process we use all the data we have in a single training run, so training takes time and requires huge computational power.


What is happening under the hood?

After the model is fully trained during development, it is deployed to production. Once deployed, it uses only the data we gave it during training; we cannot feed it new data directly and let it learn on the fly.

If we want to use new data, we need to start from scratch: bring down the machine learning model, combine the new dataset with the old data, and train it again. When the model is completely trained on the new dataset, we deploy it to production again.

This is not a complex process, and in most cases it works without any major issues.

However, if we need to retrain the machine learning model every 24 hours or every week, training it from scratch each time is very time-consuming and expensive. Training a model on the combined new and old dataset not only requires time and computational power, but also large disk space and disk management, which again costs money.

This is fine for small projects, but it gets tough in real-world settings where data keeps arriving from various endpoints such as IoT devices, computers, and servers, as the table below illustrates.


Training # | Dataset (records) | Disk space (TB)
1 | 1,000,000 | 100
2 | 2,000,000 | 200
3 | 3,000,000 | 300

Disadvantages of batch learning

The main disadvantages of batch learning are:

  • The model does not learn in production; we must retrain it every time new data arrives.
  • Disk management is costly: as the dataset grows, it requires more disk space.
  • Training a model on a large dataset costs time and computational resources.

Online learning

To solve the issues we face with batch learning, we use a method called online learning. In online learning, the model keeps learning from new data while it is in production. Small batches of data, known as mini-batches, are used to train the model, as sketched below. We will look more into online learning in another article.
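
Here is a minimal sketch of the idea (scikit-learn assumed; a stream of mini-batches is simulated with random data):

import numpy as np
from sklearn.linear_model import SGDClassifier

# logistic-regression-style loss; older scikit-learn versions use loss="log"
model = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # every class must be declared for partial_fit

rng = np.random.default_rng(42)
for _ in range(10):  # each iteration simulates a new mini-batch arriving
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch.sum(axis=1) > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)  # keeps learning

print(model.predict(rng.normal(size=(3, 4))))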


Conclusion

In this article we looked into the batch learning strategy and how it works. We've highlighted the disadvantages of batch learning and how online learning is used to overcome them. I hope this article helped you understand batch learning.

Important Checklist for Any Machine Learning Project

There are 8 main steps to consider in any machine learning project.

  1. Frame the problem and understand the big picture.
  2. Get relevant data.
  3. Explore the data set and get insights.
  4. Prepare and clean the data set, expose the underlying data patterns to the algorithms.
  5. Explore different models and identify the best ones.
  6. Fine-tune the models and combine them into a great solution.
  7. Present the solution.
  8. Launch, monitor, and maintain the system.

Let’s understand more…

Frame the Problem and Understand the Big Picture

  1. Define the business objective.
  2. How will the model be used?
  3. Identify existing solutions to the problem we want to solve.
  4. Which machine learning method should we choose (supervised, unsupervised, reinforcement learning, online or offline, etc.)?
  5. How will we measure model performance? Is the model able to achieve our objectives?
  6. Identify the minimum performance needed to reach the business objective.
  7. What are similar problems and use cases? Can we reuse experience or tools?
  8. Would human expertise do better than a computer algorithm?
  9. List all the possible assumptions and verify them.

Note: automate as much as possible at every step in the process.

Get Relevant Data

Note: automate as much as possible so we can easily get fresh data.

  1. List the data you need and how much you need.
  2. Identify the data sources: where can you get the data?
  3. Check how much storage is required and create a workspace.
  4. Check for legal obligations before accessing any data stores. Get authorization if necessary.
  5. Convert the data to a friendly format that you can manipulate easily.
  6. Ensure sensitive information is handled appropriately.
  7. Check the data type (time series, sample, geographical, etc.) and its size.
  8. Sample a test set, put it aside, and never look at it (see the sketch below).
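
A minimal sketch of that last step (scikit-learn and pandas assumed; data.csv is a hypothetical file):

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")  # hypothetical dataset
train_set, test_set = train_test_split(df, test_size=0.2, random_state=42)
# work only with train_set from here on; don't look at test_set until the end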

Explore the Dataset and Get Insights

Note: Having an industry expert's opinions and insights is always beneficial.

  1. Create a copy of the dataset. Sampling it down to a manageable size is greatly helpful for the data exploration process.
  2. Keep a record of the data exploration. We can use Jupyter or any other notebook for machine learning projects.
  3. Study each attribute and its characteristics (see the list below).
  4. Identify the target attribute(s) if the model is supervised.
  5. Visualize the data.
  6. Study the correlations between attributes.
  7. Identify promising transformations that may be useful.
  8. Identify and collect extra data that would be useful.
  9. Document what we have learned.
For each attribute, note its:
  • Name
  • Type (categorical, int/float, bounded/unbounded, text, structured, etc.)
  • % of missing values
  • Noisiness and type of noise (stochastic, outliers, rounding errors, etc.)
  • Possibly useful for the task?
  • Type of distribution (Gaussian, uniform, logarithmic, etc.)
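
A minimal pandas sketch for collecting those attribute characteristics (data.csv is again a hypothetical file):

import pandas as pd

df = pd.read_csv("data.csv")       # hypothetical dataset
df.info()                          # types and missing values per attribute
print(df.describe())               # distribution summary of numeric attributes
print(df.corr(numeric_only=True))  # pairwise correlations between attributes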

Prepare and Clean the Dataset

Notes: Keep the original dataset intact and work with copies. That way the original data stays safe.

Write functions for all data transformations, so we can:

  • Easily prepare the dataset for fresh data.
  • Apply these transformations in future projects.
  • Clean and prepare the test set.
  • Clean and prepare new data instances when our solution is live in production.
  • Make it easy to treat preparation choices as hyperparameters.
  1. Data cleaning: Removing outliers is often important, even though it is optional. Fill in missing values (e.g., with zero, the mean, the median…) or drop the affected rows and columns.
  2. Feature selection is again optional but highly recommended: Drop the attributes (features) that are not useful for the task.
  3. Feature engineering, where appropriate: Discretize continuous features. Decompose features (e.g., categorical, date/time, etc.). Add promising transformations of features (e.g., log(x), sqrt(x), x^2, etc.). Aggregate features into promising new features.
  4. Feature scaling: standardize or normalize features (see the pipeline sketch below).
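
A minimal sketch of such a transformation function (scikit-learn assumed; the toy array stands in for real training data):

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# toy training data with a missing value
X_train = np.array([[1.0, 20.0], [np.nan, 30.0], [4.0, 60.0]])

num_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # feature scaling
])

X_prepared = num_pipeline.fit_transform(X_train)
# later, num_pipeline.transform(fresh_data) reuses the same transformations
print(X_prepared)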

Explore Different Models

Notes: If we have a huge dataset, it is a good idea to sample smaller training sets so we can train many different models in a reasonable time (however, this can penalize complex models such as large neural nets or random forests).

  1. Train many quick models from different categories (e.g., linear, naive Bayes, SVM, Random Forests, neural net, etc.) using standard parameters.
  2. Measure and compare their performance: using N-fold cross-validation, compute the mean and standard deviation of the performance measure across the N folds (a sketch follows this list).
  3. Analyze the types of errors that the models make. What data would a human have used to avoid these errors?
  4. Have a quick round of feature selection and engineering.
  5. Identify the most promising models.
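
A minimal sketch of steps 1 and 2 (scikit-learn assumed, with the built-in iris dataset standing in for real data):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
models = {
    "logistic": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(random_state=42),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")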

Fine-Tune the System

Notes: Use as much data as possible as you move toward the end of fine-tuning.

Don’t tweak the model after measuring the generalization error: It will start overfitting the test set.

  1. Fine-tune the hyperparameters using cross-validation (a sketch follows this list).
  2. Try Ensemble methods. Combining your best models will often perform better than running them individually.
  3. Once you are confident about your final model, measure its performance on the test set to estimate the generalization error.
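
A minimal sketch of steps 1 and 3 (scikit-learn assumed, again with the iris dataset as a stand-in):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

param_grid = {"n_estimators": [50, 100], "max_depth": [None, 4]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.score(X_test, y_test))  # estimate of the generalization error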

Present the Solution

  1. Document everything we have done.
  2. Create a presentation. Highlighting the big picture is important.
  3. Explain how the solution achieves the business objective. Mention model performance and also show other models' results.
  4. Present key learning points with beautiful visualizations. Describe what worked and what did not. List the assumptions and limitations of the model.

Launch the Model

  1. Do proper testing and launch the model in production with production data inputs.
  2. Monitor system performance at regular intervals and trigger alerts when it drops.
    • As data evolves, model performance will be affected; beware of slow degradation too.
    • Measuring performance may require a human pipeline (e.g., via a crowdsourcing service).
    • Also monitor the quality of the inputs.
  3. Retrain models on a regular basis on fresh data.


Learning resources:

  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron

A Brief Introduction to Supervised Learning

What is Supervised Learning?


Supervised learning is one of the most common paradigms for machine learning problems. The algorithms used in supervised learning are easy to understand, which contributes to their popularity.

When we teach kids, we often show them flash cards, objects, and examples. They try to learn new things during that process. We then show them similar things and ask different kinds of questions to gauge their learning progress.

The same procedure applies in supervised learning. We train algorithms using large datasets in which data points are labeled with the correct answers, known as labels or targets; the input attributes are known as features. Therefore we know the right answer before we train any model.

Supervised learning mostly consists of classification and regression, though there are other variants as well.

Let’s understand more…

Regression

A typical regression task is predicting the price of a car from a given feature set (mileage, color, brand, etc.). Some regression algorithms can be used for classification as well, and vice versa. For example, Logistic Regression can be used for classification: it outputs a value that corresponds to the probability of belonging to a given class (e.g., a 20% chance of being spam).


Classification

Classification is the process of predicting the class of given data points. Some examples of classification include spam detection, churn prediction, sentiment analysis, classifying handwritten characters, and so on.


Sequence Generation

Given a picture, predict a caption describing it. Sequence generation can sometimes be reformulated as a series of classification problems (such as repeatedly predicting a word or token in a sequence).

Object Detection

Given a picture, draw a bounding box around certain objects. This can also be expressed as a classification problem (given many candidate bounding boxes, classify the contents of each one).

Image Segmentation

Given a picture, draw a pixel-level mask on a specific object.


The most common supervised learning algorithms are:

  • Logistic Regression
  • Linear Regression
  • K-nearest neighbors
  • Decision Tree & Random Forest
  • Neural Networks
  • Support Vector Machines

Training Process

The training dataset consists of both inputs and outputs. The model is trained until it detects the underlying patterns and relationships between the input data and the output labels. Accuracy is measured through a loss function, and the model is adjusted until the error has been sufficiently minimized, ideally to the point where the loss reaches its global minimum.

Over time, as the model learns, accuracy normally improves. When the training process is complete, the model can be used to make predictions on unseen data.

The predicted labels can be either numbers or categories. For instance, if we are predicting house prices, the output is a number, so we call it a regression model. When we are predicting spam in an email filtering system, there are two choices: the email is spam or it is not. The output is categorical, and this type of model is known as a classification model.


Training Process With a Real Example

Let us understand the training process with an example. Suppose we have a fruit basket filled with different types of fruit, and we want to sort all the fruits by type.


Our fruit basket is filled with apples, mangoes, and strawberries. For the model, we will label each fruit with the unique characteristics that distinguish its type.

No | Size | Color | Shape | Name
1 | Big | Red | Circular shape with a depression at the top | Apple
2 | Big | Yellow | Rounded top with a curved, convergent shape at the bottom | Mango
3 | Small | Red and Green | Oval shape with a rough surface | Strawberry

Now the dataset is ready. It consists of different parameters called features, together with the labels. The algorithm learns the underlying pattern and outputs results. Initially the output will not be very accurate, but as training progresses the model usually gets better and better. Once the model reaches its best accuracy level, we feed it a new dataset, called the test dataset, to verify its learning progress and accuracy. A toy version of this workflow is sketched below.
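
Here is a toy sketch of that workflow (scikit-learn assumed; the features are hand-encoded versions of the table above):

from sklearn.tree import DecisionTreeClassifier

# features: [size (0 = small, 1 = big), color (0 = red, 1 = yellow, 2 = red and green)]
X = [[1, 0], [1, 1], [0, 2]]
y = ["Apple", "Mango", "Strawberry"]

model = DecisionTreeClassifier().fit(X, y)
print(model.predict([[1, 1]]))  # a big, yellow fruit -> ['Mango']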


Conclusion

In supervised learning, we train a machine learning algorithm using a large set of data points labeled with target outputs. Our aim is to learn a model from the labeled training data that allows us to make predictions about unseen or future data.


Learn Everything About Feature Scaling


What is Feature Scaling?


Feature scaling is a technique used when we create a machine learning model. It lets you normalize the range of the independent variables, or features, of a dataset, and is also known as data normalization. It is an important part of the data preprocessing phase, because many machine learning algorithms will not perform well if the data attributes have very different scales.

Let's scratch the surface…

Why Feature Scaling is Important?


The importance of feature scaling can be illustrated by the following simple example.

Suppose we have a dataset with several features, where each feature has its own magnitude and unit:


Features | f1 | f2 | f3 | f4 | f5
Magnitude | 300 | 400 | 15 | 20 | 550
Unit | Kg | Kg | cm | cm | g

Remember, every feature has two components:

  • Magnitude (example: 300)
  • Unit (example: Kg)

Always keep in mind: many ML algorithms are based on distance calculations, such as Euclidean or Manhattan distance (used by k-Nearest Neighbors, among others).


Features | f1 | f2 | f3 | f4 | f5 | (f2 - f1) | (f4 - f3)
Magnitude | 300 | 400 | 15 | 20 | 550 | 400 - 300 = 100 | 20 - 15 = 5
Unit | Kg | Kg | cm | cm | g | Kg | cm

Coming back to our example: when we try to find the distance between different features, the gap between them varies. Some attributes have a large gap between them while others are very close to each other, as the table above shows.

You may also have noticed that the unit of f5 is grams (g) while f1 and f2 are in kilograms (Kg). In this case, the model may treat the value of f5 as greater than f1 and f2, but that is not really the case. For these reasons, the model may give wrong predictions.

Therefore we need to bring all the attributes (f1, f2, f3…) to the same scale with respect to their units. In short, we need to convert all the data into the same range (usually between 0 and 1), so that no particular feature dominates, or is dominated by, another. (By doing so, convergence also becomes much faster and more efficient.)

There are two common methods used to bring all attributes onto the same scale.


Min-max Scaling


In min-max scaling, values are rescaled to a range between 0 and 1. To find the new value, we subtract the minimum value and then divide by the maximum minus the minimum. Scikit-Learn provides MinMaxScaler for this calculation.



    \[X_{new} = \frac{X_i - \min(X)}{\max(X) - \min(X)}\]

Standardization

Standardization is much less affected by outliers. First we subtract the mean value, then divide by the standard deviation, so that the resulting distribution has unit variance. Scikit-Learn provides a transformer called StandardScaler for this calculation.



    \[X_{new} = \frac{X_i - X_{mean}}{\sigma}\]


Here I show an example of feature scaling using min-max scaling and standardization. I'm using Google Colab, but you can use any notebook/IDE such as Jupyter Notebook or PyCharm.


Go to the link and download Data_for_Feature_Scaling.csv


Upload the CSV to your Google Drive.

Mount the drive in the working notebook.

For that you may need an authorization code from Google. Then run the code.


# feature scaling sample code
# import recommended libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import preprocessing

# mount Google Drive (Colab only)
from google.colab import drive
drive.mount('/content/drive')

# import the dataset; adjust this path to wherever the CSV lives in your Drive
data_set = pd.read_csv('/content/drive/MyDrive/feature_scaling/Data_for_Feature_Scaling.csv')

# check the data
data_set.head()

Output
   Country  Age  Salary  Purchased
0   France   44   72000          0
1    Spain   27   48000          1
2  Germany   30   23000          0
3    Spain   38   51000          0
4  Germany   40    1000          1

x = data_set.iloc[:, 1:3].values
print('Original data values: \n', x)

Output
Original data values: 
 [[  44   72000]
 [   27   48000]
 [   30   23000]
 [   38   51000]
 [   40    1000]
 [   35   49000]
 [   78   23000]
 [   48   89400]
 [   50   78000]
 [   37   9000]]

from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
# Scaled feature 
x_after_min_max_scaler = min_max_scaler.fit_transform(x)
print('\n After min max scaling\n', x_after_min_max_scaler)

Output
After min max scaling
 [[0.33333333  0.80316742]
 [0.           0.53167421]
 [0.05882353   0.24886878]
 [0.21568627   0.56561086]
 [0.25490196   0.        ]
 [0.15686275   0.54298643]
 [1.           0.24886878]
 [0.41176471   1.        ]
 [0.45098039   0.87104072]
 [0.19607843   0.09049774]]
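
We can verify the first value by hand. For the first Age entry, 44, the minimum age in the data is 27 and the maximum is 78, so:

    \[X_{new} = \frac{44 - 27}{78 - 27} = \frac{17}{51} \approx 0.3333\]

which matches the first value in the scaled output above.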

# Now use Standardisation method
Standardisation = preprocessing.StandardScaler()
x_after_Standardisation = Standardisation.fit_transform(x)
print('\n After Standardisation: \n', x_after_Standardisation)

Output
After Standardisation: 
 [[ 0.09536935  0.97512896]
 [-1.15176827   0.12903008]
 [-0.93168516  -0.75232292]
 [-0.34479687   0.23479244]
 [-0.1980748   -1.52791356]
 [-0.56487998   0.1642842 ]
 [ 2.58964459  -0.75232292]
 [ 0.38881349   1.58855065]
 [ 0.53553557   1.18665368]
 [-0.41815791  -1.2458806 ]]
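
Again, checking the first Age entry by hand: the mean age is 42.7 and the (population) standard deviation is roughly 13.63, so:

    \[X_{new} = \frac{44 - 42.7}{13.63} \approx 0.0954\]

which matches the first standardized value above.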
