# Neural Networks and Deep Learning: A Beginner's Guide [2020]

Artificial Neural networks can be compared to a human nervous system.

It is made up of artificial nodes, or neurons, and is used to solve problems in artificial intelligence.

The machine learns to perform a task when it is exposed to various statistics and examples.

It can be compared to the cognitive memory of a person. It is the skill acquired by a machine when it is fed with similar examples.

It is based on threshold logic, which depends on a combination of mathematics and algorithms.

This has led to many automation technologies.

**Neural Networks were designed to recognize patterns. The patterns can be images, texts, numbers, vectors, etc.**

It can cluster and classify the information given. This can be used to solve even simple problems based on basic information.

For example, a desktop shop has x customers daily. It can be used to determine the number of customers y who would buy the product.

The prediction can be based on two basic details, like the age of the person and his salary. Suppose people above 30 years of age with a salary of 15000 buy the product regularly.

It can then be predicted that the next person who comes in with a similar age and salary is likely to buy the product.
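The shop example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the purchase records are made up, and the "learning" is a simple threshold rule taken from past buyers, not a real neural network.

```python
# Hypothetical purchase records: (age, salary, bought_product)
history = [
    (35, 18000, True),
    (42, 22000, True),
    (31, 15000, True),
    (24, 9000, False),
    (28, 30000, False),
    (45, 8000, False),
]

def learn_thresholds(records):
    """Find the lowest age and the lowest salary among past buyers."""
    buyers = [(age, salary) for age, salary, bought in records if bought]
    min_age = min(age for age, _ in buyers)
    min_salary = min(salary for _, salary in buyers)
    return min_age, min_salary

def will_buy(age, salary, thresholds):
    """Predict a purchase when both details reach the learned thresholds."""
    min_age, min_salary = thresholds
    return age >= min_age and salary >= min_salary

thresholds = learn_thresholds(history)
print(thresholds)                       # (31, 15000)
print(will_buy(38, 20000, thresholds))  # True
print(will_buy(22, 20000, thresholds))  # False
```

A neural network would learn a softer and more flexible rule than this, but the idea of learning from earlier examples is the same.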

**Paradigms of Machine Learning**

### 1. Unsupervised Learning:

The main aim of unsupervised learning is to find the patterns in data instead of predicting the output.

This is also known as learning from unlabeled data. The types of unsupervised learning are:

**Clustering:** The data is divided into cohesive groups or categories based on their characteristics.

**Association:** This determines which items occur together frequently in the given datasets.


### 2. Supervised Learning:

Supervised learning learns a map from input to output. The output is predicted based on the given set of input data.

It is like giving an output, for example, true or false, for a given descriptive question as input.

This is also known as learning from labeled data. The types of supervised learning are:

**Classification:** The output is a category, learned from input data that is labeled.

**Regression:** This is the prediction of a continuous output value based on the input.

### 3. Reinforcement Learning:

Reinforcement learning is neither supervised nor unsupervised learning. It is like learning to control the behavior of a machine.

It is like making decisions sequentially. The output depends on the previous input, and the next input depends on the output generated.

It is like playing a game of chess. Each move of a piece depends on the earlier moves.

As each decision depends on the earlier ones, it is mostly the sequence of decisions, rather than a single decision, that is labeled.

**Functions of Neural Networks**

A neural network system can perform various functions. It processes the input to produce an output based on earlier examples.

A deep learning system can perform various functions like:

### 1. Classification:

Classification comes under supervised learning. The tasks depend upon labeling the datasets.

The person must transfer their knowledge to the dataset for neural networks to learn the correlation between data and labels.

The examples of classification are:

- Differentiating between facial expressions in an image of a person, for example, whether the person is happy or sad.

- Detection of voices and sentiments in audio. Transcription of speech to text in videos or audio.

- Identifying the presence of objects in an image, like lane markers, animals on the road, traffic signals, etc.

- Classification of emails into different categories like spam, fraudulent, etc.

- Recognizing the sentiment in customer feedback or any other text message.

- Identifying the gestures in a video.

We can train on data with any labels that we can generate or think of, so that the machine can classify a given set of inputs based on the data and labels.
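Learning the correlation between data and labels can be sketched with a perceptron, one of the simplest classifiers. The dataset, the two features, and the learning rate below are all assumptions chosen for illustration.

```python
# Hypothetical labeled dataset: (features, label), label 1 = spam, 0 = not spam.
# Features: [number of links, number of exclamation marks]
training_data = [
    ([5.0, 7.0], 1),
    ([6.0, 4.0], 1),
    ([4.0, 6.0], 1),
    ([0.0, 1.0], 0),
    ([1.0, 0.0], 0),
    ([1.0, 1.0], 0),
]

def predict(weights, bias, features):
    """Return 1 when the weighted sum of the features is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train_perceptron(data, epochs=20, learning_rate=0.1):
    """Nudge the weights towards the label every time a guess is wrong."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train_perceptron(training_data)
print(predict(weights, bias, [7.0, 8.0]))  # 1 (classified as spam)
print(predict(weights, bias, [0.0, 0.0]))  # 0 (classified as not spam)
```

The labels carry the human knowledge here: the machine only adjusts its weights until its guesses agree with them.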

### 2. Clustering:

Clustering is grouping based on the detection of similarities. No labels are required to detect similarities in the input data.

Unsupervised learning is the method in which we do not use labels. So clustering comes under unsupervised learning.

Most of the data in the world is unlabeled. In general, the output becomes more accurate as the algorithm is fed more data.

This is a main rule of thumb in machine learning. So unsupervised learning, which can draw on this large pool of unlabeled data, has the potential to produce highly accurate results.

Clustering is done based on two methods:

**Searching:** First, the algorithm searches for similar items by comparing each item with the others. For example, it may separate images and sounds based on similar items. If there is a list of animals, it may separate them into wild and domestic animals based on their characteristics.

**Detection of abnormality:** The converse of detecting similarity is detecting abnormality, or unusual behavior. For example, fraud on a debit card can be detected by comparing a transaction with the person's usual usage.

Clustering is used in the following applications:

- Discovering the class of the customer based on customer data.

- Discovering the part of the image based on the pixels

- Identification of synonyms of the word

- Classification of documents based on the type
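Grouping by similarity without labels can be sketched with k-means, a common clustering method. This is a minimal one-dimensional version. The customer data and the two starting centers are assumptions for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its group's mean."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Keep the old center if a group ends up empty
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spend of eight customers, with no labels attached
spend = [12, 15, 14, 10, 95, 102, 98, 110]
centers, groups = k_means_1d(spend, centers=[0.0, 50.0])
print(groups)   # [[12, 15, 14, 10], [95, 102, 98, 110]]
print(centers)  # [12.75, 101.25]
```

The algorithm is never told which customers are low or high spenders. The two classes emerge from the similarities in the data.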

### 3. Predictive Analytics: Regression

A Neural Network can differentiate between an image and a sentence with the help of classification.

It can understand that an image is made of small units called pixels, and a sentence is made up of a set of letters.

This is known as a static prediction. A deep learning algorithm can also find the difference and correlation between present and future events based on the data in the datasets.

Regression methods can be used between the present and future data to find the correlation between them.

Regression is also known as predictive analytics, as it can predict the outcome based on similar or earlier data.

It also comes under supervised learning. This can sense or analyze the given set of present data and give a result of what will happen in the future.

It can be used for identification of the next number or character in the series based on the previously given data.

In the case of regression, the output is not a discrete value. It may be the value of an output at a particular time. For example, it can be used to do a trend analysis based on past data.
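A trend analysis of this kind can be sketched by fitting a straight line to past data with ordinary least squares and extending it one step into the future. The sales figures below are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past data: sales in months 1 to 5
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted sales for month 6
print(slope, intercept, forecast)  # 2.0 8.0 20.0
```

The forecast is a continuous number, not a category, which is what separates regression from classification.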

A few examples in which regression can be used are:

- Prediction of hardware breakdowns in manufacturing, transportation, etc.

- Prediction of health breakdowns like heart attacks or strokes based on the information given by wearable sensors.

- Prediction of customer churn, that is, the probability that a person will leave, based on their metadata or web activity.

- The suggestion of friends on social networking sites based on the personal information provided by the person.

- Time-series prediction like rainfall prediction

- Risk factor analysis

- Data reduction

The better we predict, the more we can prevent. For example, if we predict the presence of a disease in our body based on the early symptoms, we could prevent our suffering from it.

So with predictions, we can reduce the surprises we face on a daily basis.

It is something like the prediction of a storm earlier so that we can stay safe.

This neural network can be used in Machine Learning, Deep Learning, and Artificial Intelligence to develop a smart world.

**Types of Neural Networks**

**1. Multi-layer Perceptron:** This is the first type of neural network. It consists of three or more layers. A nonlinear activation function is mostly used in this type.

**2. Convolutional Neural Networks:** This is the second type of neural network and uses a variation of the multi-layer perceptron.

**3. Recursive Neural Network:** This is the third type. In this, structured predictions are made using weights.

**4. Recurrent Neural Networks:** The fourth type is recurrent neural networks. Here the connections between neurons form directed cycles. Long short-term memory (LSTM) networks are a well-known architecture of this type.

**5. Shallow Neural Networks:** Shallow neural networks have only a few layers. They can produce a vector space from some amount of text. They can be used for basic demonstrations.

A neural network depends on three vectors: attributes, classes, and weights.

The machine runs through a number of iterations, for example 100, to make the attributes fit the classes. The predictions are then generated and weighted.

The output is given by iterating through the weights. Neural networks can also be trained with back propagation.

**Characteristics of Artificial Neural Network**

A few characteristics of Artificial Neural Networks are:

- It is a mathematical model based on statistics and probability.

- It has a large number of neurons to perform operations.

- Information is stored in the weighted linkages between neurons.

- It is a process of learning from similar data.

- It can also be described as the ability of a machine to learn from, and recall, a given set of data and assignments with suitable weights.

- No single neuron carries specific information.

- The computational power of the neurons is determined by their collective behavior.

**Elements of Neural Network**

When neural networks are stacked and composed into many layers, it is known as Deep Learning. Each layer is made up of nodes.

A node is a site for computation to take place. It is like the neurons in the human body, which react or respond to the external stimuli given on it. This can respond to the given input.

A node combines input from the data with coefficients, or weights. The weights amplify or dampen the inputs for the required output.

This gives significance to the inputs, identifying which of them are most helpful for processing the data without error.

There are three main elements of neural networks:

**1. Input Layer:** This layer accepts the input data. It brings information from the outside environment into the network. No computation is performed in this layer. Its nodes just pass the information on to the hidden layer.

**2. Hidden Layer:** The nodes present in this layer are not exposed to the outer world. They provide the abstraction in the neural network. This layer performs all kinds of computation and processes the input data given by the input layer.

**3. Output Layer:** The output layer gives out the processed data to the outside world.

There are two parameters: inputs and weights. These inputs and weights are those which determine the state of the output.

The input-weight products are summed together by the transfer function of the node.

The sum is then passed through an activation function. This determines the output of the node for the given input.

It decides whether, and to what extent, the signal progresses further through the network. If the signal passes through, the neuron is said to be activated.

The transfer function and the activation function operate inside the nodes of the hidden layers.
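The work of a single node can be sketched as follows. The sigmoid activation and the optional bias term are assumptions made for this example. Other activation functions can be swapped in.

```python
import math

def node_output(inputs, weights, bias=0.0):
    """Sum the input-weight products, then pass the sum through a sigmoid activation."""
    weighted_sum = bias + sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# The products 1.0 * 0.5 and 2.0 * -0.25 cancel out, so the sum is 0
print(node_output([1.0, 2.0], [0.5, -0.25]))  # 0.5
```

An output close to 1 means the node is strongly activated, and an output close to 0 means the signal is mostly blocked.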

A node can be compared to a neuron-like switch. It can turn on or turn off as input is fed through it.

Each layer's output acts as the input for the next layer. This process begins at the first layer, which receives the input data.

The pairing of suitable weights with suitable inputs determines how the neural network will classify and cluster the given data.

This is a basic step in which a neural network works.

It is something like a box which is open on both sides. One side receives the input, and the other side gives out the output.

The input comes into a box where it combines with the weights and biases, and then it is sent out as output.

It is something like an equation of a straight line. This output then enters into another box as an input.

This process continues until the algorithm is completed, and the desired output is attained.
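The chain of boxes can be sketched as a forward pass, in which each layer's output becomes the next layer's input. The network shape, the weights, the biases, and the ReLU activation are all assumptions chosen so that the arithmetic is easy to follow.

```python
def relu(value):
    return max(0.0, value)

def layer(inputs, weights, biases):
    """One box: each node computes relu(weighted sum of inputs + bias)."""
    return [
        relu(sum(x * w for x, w in zip(inputs, node_weights)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Send the output of each box into the next one."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node
network = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),  # hidden layer
    ([[2.0, 1.0]], [-1.0]),                   # output layer
]
print(forward([3.0, 1.0], network))  # [6.0]
```

Here the hidden layer turns the input `[3.0, 1.0]` into `[2.0, 3.0]`, and the output layer turns that into `6.0`.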

**Key Concepts of Deep Neural Networks**

Deep learning is a family of machine learning methods based on artificial neural networks.

It can be supervised, unsupervised, or semi-supervised. It can be differentiated from other neural networks by its depth.

In deep neural networks, data must pass through multiple layers of nodes. It is a process used in pattern recognition.

Earlier versions of neural networks, like perceptrons, were shallow. They were composed of just one input layer, one output layer, and at most one hidden layer in between.

A deep learning network has more than three layers, counting the input and output layers. In deep learning, the algorithms are deep and complex.

It can be defined as a neural network with more than one hidden layer.

In DNN, each layer is made of nodes.

Each layer trains on a unique set of features. The feature of the node is based on the output of the previous layer.

As we advance further into the neural network, the features recognized by the nodes become more complex, as they are aggregations and combinations of features from the previous layers. This is called a feature hierarchy.

It is a hierarchy in which there is an increase in complexity and abstraction.

This is the reason why deep-learning networks are capable of handling very large data.

Based on this hierarchy, a DNN can handle high-dimensional data with billions of parameters that pass through nonlinear functions.

Deep Neural Networks are capable of finding structure in data that is unlabeled and unstructured, that is, raw data like pictures, videos, texts, audio, etc.

This is the majority of data collected in the world. The best thing about deep learning is that it can classify and cluster a large amount of varied data.

For example, it can classify texts based on language.

Deep learning can take in billions of images, texts, and videos and cluster them according to the similarities between them.

This technique can be used to build a smart album, in which pictures are classified automatically.

For example, the images of dogs go in one column, cats in another column, and the images of different people in different columns.

The same can be done with mail. Mail can be classified as spam, updates, promotions, etc., based on the organizations which send it.

All of this is done by machines on their own. Unlike normal machine learning algorithms, deep learning can extract features on its own without human intervention.

When training on unlabeled data, each node layer in a deep network learns features automatically.

It does this by repeatedly trying to reconstruct the input from which it draws its samples.

In this way the layers behave as reconstructors. Through this process, neural networks learn the relation between features and results.


**Example of Neural Networks**

**1. Feed-forward Networks**

In a feed-forward neural network, the connections between the nodes do not form a cycle.

As we aim to end at a point with the least error, the feed-forward network helps to track a suitable path while avoiding repeated loops.

Training involves many steps, and each step includes a guess, a measurement of the error, and a slight update to the weights and coefficients.

A model is a collection of weights that changes from its start state to its end state. It normally starts out bad and ends up less bad.

It changes over time as the network updates its weights.

There are five types of neural network connectivity:

**Single-layer Feed-forward Network:** Here, we have only two layers, the input layer and the output layer. The input layer is not counted, as no computation takes place in it. The output layer is formed when the weights are applied to the input layer.

**Multi-layer Feed-forward Network:** In this, there is a hidden layer between the input and output layers.

**Single node with its own feedback:** The output is fed back into the input layer. These are recurrent networks.

**Single-layer recurrent network**

**Multi-layer recurrent network**

**2. Multiple Linear Regression**

Though artificial neural networks take their name from biology, they consist only of mathematics and computational code, like the rest of machine learning and artificial intelligence.

Linear regression is used to predict the outcome of a response variable. In its simplest form, it uses just one explanatory variable.

If we are familiar with linear regression, we can easily work with neural networks. In simple linear regression, the output is a linear function of the input.

$$\hat{Y} = bX + a$$

For example, X is the input, the calories taken in by a person, and Y is the total weight gained by that person.

So here X is the input and Y the predicted output. If X increases, Y also increases along with it (for a positive slope b).

Multiple linear regression takes place at each and every node of a neural network.

The formula for multiple linear regression is:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \epsilon$$

Here,

$y_i$ = dependent variable

$n$ = number of observations ($i = 1, \dots, n$)

$x_{i1}, \dots, x_{ip}$ = explanatory variables

$\beta_0$ = y-intercept (constant term)

$\beta_1, \dots, \beta_p$ = slope coefficients for each explanatory variable

$\epsilon$ = the model's error term (also known as the residuals)
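Once the coefficients are known, the formula can be evaluated directly. The sketch below uses made-up coefficients for two explanatory variables and leaves out the error term, which is unknown when predicting.

```python
def predict_mlr(intercept, slopes, features):
    """y_hat = beta_0 + beta_1 * x_1 + ... + beta_p * x_p"""
    return intercept + sum(b * x for b, x in zip(slopes, features))

# Hypothetical coefficients: beta_0 = 1.0, beta_1 = 0.5, beta_2 = 2.0
beta_0 = 1.0
betas = [0.5, 2.0]
print(predict_mlr(beta_0, betas, [4.0, 3.0]))  # 9.0
```

This is the same weighted sum that a node in a neural network computes, with the intercept playing the role of the bias.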

The multiple regression model is based on the following assumptions:

- The dependent variable and the independent variables have a linear relationship.

- The independent variables are not highly correlated with each other.

- The observations are selected randomly and independently from the given datasets.

- Residuals are normally distributed with a mean of zero and a constant variance.

**3. Gradient Descent**

Gradient descent is an optimization method that adjusts the weights based on the error they cause.

The gradient is also known as the slope, as typically seen on a straight-line graph.

It is the ratio of the difference between two points on the y-axis to the difference between the same points on the x-axis. Here, the x and y axes of the graph represent a weight and the error.

This is how the weights are adjusted depending on the slope. Gradient descent improves deep learning and neural networks by minimizing the cost function.

The various types of Gradient Descent are:

**Batch Gradient Descent:** In this process, all the training examples are processed at each step of gradient descent. If there are many examples, each step is very expensive. It follows a fairly straight path towards the minimum.

**Stochastic Gradient Descent:** In this, only one example is processed per iteration. Each iteration is faster than in batch gradient descent, but more iterations are required. The path is much noisier, as each update is based on only one example.

**Mini Batch Gradient Descent:** This is a very fast iterative method. A small batch of examples is processed per iteration. It works well even when the number of training examples is high.
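The three variants differ only in how many examples go into each update. A minimal sketch, assuming a one-weight model `y_hat = weight * x`, a mean squared error cost, and made-up noise-free data generated from `y = 3x`:

```python
import random

def gradient(weight, batch):
    """Gradient of the mean squared error for the model y_hat = weight * x."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)

def gradient_descent(data, batch_size, epochs=200, learning_rate=0.01, seed=0):
    rng = random.Random(seed)
    weight = 0.0
    for _ in range(epochs):
        examples = data[:]
        rng.shuffle(examples)
        # Step through the shuffled examples one batch at a time
        for start in range(0, len(examples), batch_size):
            batch = examples[start:start + batch_size]
            weight -= learning_rate * gradient(weight, batch)
    return weight

# Hypothetical data generated from y = 3x
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]

batch_w = gradient_descent(data, batch_size=len(data))  # batch: all examples per step
sgd_w = gradient_descent(data, batch_size=1)            # stochastic: one example per step
mini_w = gradient_descent(data, batch_size=2)           # mini-batch: two examples per step
print(round(batch_w, 3), round(sgd_w, 3), round(mini_w, 3))  # 3.0 3.0 3.0
```

On this tiny clean dataset all three reach the same weight. On large or noisy data they trade off the cost of each step against the noise in each update.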

**4. Optimization Algorithms**

An optimization algorithm is a procedure that iterates until an optimum output is obtained.

Examples of optimization algorithms include:

- Stochastic gradient descent

- Conjugate Gradient

- Hessian Algorithm

- ADADELTA

- ADAM Optimization

- Nonlinear Optimization

- Linear Gradient Descent

- Broyden–Fletcher–Goldfarb–Shanno (BFGS) Algorithm

- RMSprop optimization algorithm

- Nesterov's algorithm

- Adaptive Gradient

**5. Activation Functions**

The activation function of the node is the one that determines the output of the node, which will be generated depending on the input.

The purpose of this is to introduce non-linearity into the output.

Activation functions are used together with the back-propagation method during training. Some examples of activation functions include:

- Cube activation function

- Tanh, or hyperbolic tangent, function

- Hard tanh

- Rational tanh

- Sigmoid function, which has an S-shaped graph

- Rectified linear unit, or ReLU, function. It is commonly used in the hidden layers of a neural network.

- Softmax function, which generalizes the sigmoid function to more than two classes. It is used to handle classification problems.

- Softsign function

- Leaky ReLU

- Exponential Linear Unit

- Identity

- Hard Sigmoid

- Softplus
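Several of the functions in this list are short enough to write out. The sketch below uses their standard textbook definitions. The leaky ReLU slope of 0.01 is a common default, chosen here as an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def hard_tanh(x):
    # Clip the input to the range [-1, 1]
    return max(-1.0, min(1.0, x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

def softplus(x):
    return math.log(1.0 + math.exp(x))

def softsign(x):
    return x / (1.0 + abs(x))

def softmax(values):
    # Subtract the maximum first for numerical stability
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))            # 0.5
print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(leaky_relu(-2.0))        # -0.02
print(softmax([1.0, 1.0]))     # [0.5, 0.5]
```

All of these except the identity function are non-linear, which is what lets stacked layers represent more than a single straight line.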

**6. Logistic Regression**

A deep neural network has many layers, and its last layer has a particularly significant role. In supervised learning, each node of this output layer corresponds to a label.

The node turns on or off based on the strength of the signal passing through it.

This depends on the previous layer. An output node gives one of two outputs, zero or one, depending on whether the input matches its label or not.

Logistic regression is a method of converting a continuous signal into a binary output.

Despite its name, it is mostly used for classification and not for regression.

Logistic regression calculates the probability that a set of inputs matches the label.

The standard formula used to calculate this probability is the logistic function, where $z$ is the weighted sum of the inputs:

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-z}}$$
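A minimal sketch of this calculation, assuming the standard logistic (sigmoid) function and a cut-off of 0.5 for the binary output. The weights and bias are made-up values standing in for learned parameters.

```python
import math

def predict_probability(weights, bias, features):
    """Squash the weighted sum of the inputs into a probability between 0 and 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(weights, bias, features, threshold=0.5):
    """Convert the continuous probability into a binary output."""
    return 1 if predict_probability(weights, bias, features) >= threshold else 0

# Hypothetical learned parameters
weights, bias = [2.0, -1.0], 0.0
print(predict_probability(weights, bias, [1.0, 2.0]))  # 0.5
print(predict_label(weights, bias, [3.0, 1.0]))        # 1
print(predict_label(weights, bias, [0.0, 4.0]))        # 0
```

The probability is continuous, and the threshold is what turns it into the zero-or-one output described above.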

It is a good method to practice with. It is also used for sensitivity analysis.

**Neural Networks & Artificial Intelligence**

Neural Networks and Artificial Intelligence are two different emerging fields in computer science.

Though they are two different fields, they are intertwined with each other.

The Neural Network is the stepping stone of Artificial Intelligence.

Artificial Intelligence aims to increase the chance of success, whereas Neural Networks aim to increase accuracy.

Artificial intelligence simulates natural intelligence to solve problems, and neural networks are a part of it, helping to process the given input data.

**Applications of Artificial Neural Networks:**

Neural networks are presently used on a very large scale. A few applications of neural networks include:

- In the field of solar energy, artificial neural network applications are being used to manufacture and design a solar steam generating plant.

- ANN finds most of its application in robotics and automation.

- It can be used to estimate the amount of heat energy that must be added to a space to maintain the optimal temperature of the space.

- ANN is used in control systems, medicines, signal processing, and forecasting.

- ANN is used for modeling of the system. It can be used for the identification of the system and to implement complex mapping.

- ANN is used to determine the concentration ratio of something which may be present in the environment.

- ANN is used in pattern recognition for security purposes.

- ANN is used in manufacturing by machines without human intervention.

- ANN is used to predict the airflow in a ventilated room.

- ANN is used to predict the amount of energy consumed by solar buildings.

- ANN can be used to solve non-linear problems.

- ANN is used to classify text based on language, font, etc.

- ANN can handle data that is incomplete or noisy.

- ANN can be used in refrigerators to handle the temperature of the system according to the things present in it.

Artificial neural networks can be used in ventilators, air-conditioning systems, etc. to control the amount of airflow into the house.

**Limitations of Neural Networks:**

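- A neural network is most often trained as a supervised model and needs a large amount of labeled data.

- A network trained in this way cannot handle unlabeled data on its own. That requires unsupervised training methods.

- Such a network does not cluster or find associations in the given input dataset by itself.

- It can lack accuracy when the training data is limited or not representative.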

Artificial Neural Networks with Artificial Intelligence and Machine learning will have a very great scope in the future.

It is believed that this will help to predict severe illnesses too. It can also find its scope in the prediction of calamities.

As this can be used to predict future outcomes based on raw data, it can help people to plan their lives accordingly and protect themselves from problems.