Saturday, April 10, 2021

Linear Programming in Artificial Intelligence (Instructor Name: Nuruzzaman Faruqui)

 

Linear Programming

 

Introduction:

Linear Programming is the technique of portraying complicated relationships between elements by using linear functions to find optimum points. Linear Programming begins by taking real-world data and translating it into a series of mathematical formulas. To find the optimum result, real-life problems are translated into mathematical models that make the linear inequalities and their constraints easier to conceptualize. Applications of linear programming are everywhere around you; you use it on both personal and professional fronts. You are using linear programming when you drive from home to work and want to take the shortest route, or when you have a project delivery and plan how your team can work efficiently to deliver on time.

This is an important part of the Artificial Intelligence course, and this lab report was done by me under the supervision of Nuruzzaman Faruqui, lecturer at City University, Bangladesh. From this course we get to explore real, applicable approaches through AI and also acquire a better understanding of how AI works and how it is making our daily lives easier. That's why this is the best Artificial Intelligence course in Bangladesh.

Problem Statement:

Applying linear programming, we can minimize a cost function of the form c₁x₁ + c₂x₂ + … + cₙxₙ, where each xᵢ is a variable and each cᵢ is its coefficient. In linear programming, a constraint on a sum of variables can be written as follows:

a₁x₁ + a₂x₂ + … + aₙxₙ ≤ b  or  a₁x₁ + a₂x₂ + … + aₙxₙ = b. Here each xᵢ is a variable with coefficient aᵢ, and b is the available resource.

From the above form, we can illustrate the problem as follows:

  1. Two machines, X₁ and X₂.
  2. X₁ costs $50/hour to run, X₂ costs $80/hour to run. The goal is to minimize cost.
  3. This can be formalized as a cost function: 50x₁ + 80x₂.
  4. X₁ requires 5 units of labor per hour.
  5. X₂ requires 2 units of labor per hour.
  6. Total of 20 units of labor to spend.
  7. This can be formalized as a constraint: 5x₁ + 2x₂ ≤ 20.
  8. X₁ produces 10 units of output per hour.
  9. X₂ produces 12 units of output per hour.
  10. The company needs 90 units of output.
  11. This is another constraint.
  12. Literally, it can be rewritten as 10x₁ + 12x₂ ≥ 90.
  13. However, constraints need to be of the form a₁x₁ + a₂x₂ + … + aₙxₙ ≤ b or a₁x₁ + a₂x₂ + … + aₙxₙ = b.

Therefore, we multiply by (−1) to get an equivalent constraint of the desired form: (−10x₁) + (−12x₂) ≤ −90. Putting it all together, we minimize 50x₁ + 80x₂ subject to 5x₁ + 2x₂ ≤ 20 and (−10x₁) + (−12x₂) ≤ −90.

Code Commentary:
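The code listing itself is not visible in this text version of the post. The following is a minimal sketch of how the formulation above could be solved in Python, assuming the scipy library is available; the solver takes the cost coefficients and the two upper-bound constraints exactly as derived above:

from scipy.optimize import linprog

# Minimize 50x1 + 80x2
# subject to 5x1 + 2x2 <= 20 and (-10)x1 + (-12)x2 <= -90
result = linprog(
    c=[50, 80],                 # coefficients of the cost function
    A_ub=[[5, 2], [-10, -12]],  # left-hand sides of the <= constraints
    b_ub=[20, -90]              # right-hand sides of the <= constraints
)

if result.success:
    print(f"X1: {round(result.x[0], 2)} hours")
    print(f"X2: {round(result.x[1], 2)} hours")
else:
    print("No solution")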

 


Result: 

After implementing the code in Python, we get the result below.

 


Conclusion:

As Linear Programming is a valuable way of expressing real-world data mathematically, it is commonly used in manufacturing and the service industry. For example, many large distribution companies use linear programming in the analysis of their supply chain operations, similar to the toy example above. Additionally, linear programming can be used outside the warehouse to optimize delivery routes: companies like Amazon and FedEx use it to find the shortest and most efficient delivery routes. Linear programming also appears in machine learning applications, where a neural network is trained to fit a function in order to label input data and predict unknown future values, although there is a lot more to machine learning than linear programming or regression modeling. In short, linear programming is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships, and it is a special case of mathematical programming (mathematical optimization).

 

 

Friday, April 9, 2021

Hill Climbing Algorithm Learning (Instructor Name: Nuruzzaman Faruqui)

 Hill Climbing


Introduction:

Hill climbing is an optimization technique for solving computationally hard problems. It is a local search algorithm which continuously moves in the direction of increasing elevation/value to find the peak of the mountain or best solution to the problem. It terminates when it reaches a peak value where no neighbor has a higher value.

One way of addressing the local-maximum problem, which involves repeated exploration of the problem space, is random restart. Random-restart hill climbing conducts a series of hill-climbing searches from randomly generated initial states, running each until it halts or makes no discernible progress. This enables comparison of many optimization trials, so finding the most optimal solution becomes a question of running a sufficient number of iterations on the data.

This is an important part of the Artificial Intelligence course, and this lab report was done by me under the supervision of Nuruzzaman Faruqui, lecturer at City University, Bangladesh. From this course we get to explore real, applicable approaches through AI and also acquire a better understanding of how AI works and how it is making our daily lives easier. That's why this is the best Artificial Intelligence course in Bangladesh.

 

Problem Statement:

Hill climbing is a heuristic search used for mathematical optimization problems in the field of Artificial Intelligence. Given a large set of inputs and a good heuristic function, it tries to find a sufficiently good solution to the problem; this solution may not be the global optimum. It is also called greedy local search, as it only looks at its immediate neighboring states and not beyond them. A node of the hill climbing algorithm has two components: state and value. Hill climbing is mostly used when a good heuristic is available.

A hill climbing algorithm looks the following way in pseudocode:

Function Hill-Climb (problem):

  • current = initial state of problem
  • repeat:
    • neighbor = best valued neighbor of current
    • if neighbor not better than current:
      • return current
    • current = neighbor
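
A minimal Python sketch of this pseudocode, written for minimization since the hospitals-houses problem below minimizes cost, might look like the following. The problem object with its initial_state, neighbors() and cost() members is a hypothetical interface used only for illustration, not the actual lab code:

def hill_climb(problem):
    # Start from the problem's initial state (hypothetical interface)
    current = problem.initial_state
    while True:
        # Pick the best (lowest-cost) neighbor of the current state
        best_neighbor = min(problem.neighbors(current), key=problem.cost)
        # If no neighbor is better than the current state, stop
        if problem.cost(best_neighbor) >= problem.cost(current):
            return current
        current = best_neighbor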


 

Here we discuss an example of the hill climbing algorithm, the hospitals-houses problem, in which we need to find a placement of hospitals that minimizes the total distance (path cost) from the houses to the hospitals.

Here we use the hill climbing algorithm to find a better neighbor. The initial state cost was 70; after applying the algorithm we found a better neighboring state whose cost is 45, and we generate images for every state visited.

 


 

Result:


Initial Cost


 After Applying Hill Climbing Algorithm




Using the random-restart variant of hill climbing:

In the above program we will not be able to use the random-restart variant directly. We need to make some changes before running the program.

(1)        First, we need to remove all the images from the folder.

(2)        Then we set a value that determines how many times the algorithm will restart randomly.


Here we restart randomly 15 times. That means we do not run the hill climbing algorithm only once but repeat it 15 times, and we keep the best state found at the minimum cost. A sketch of this wrapper is shown below.
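
A hedged sketch of this random-restart idea, reusing the hypothetical hill_climb() and problem interface from the earlier sketch; the random_state() method is likewise an assumed helper, not the actual lab code:

def random_restart(problem, maximum=15):
    best = None
    for i in range(maximum):
        # Start each run from a freshly generated random state (assumed helper)
        problem.initial_state = problem.random_state()
        state = hill_climb(problem)
        # Keep the lowest-cost state seen across all restarts
        if best is None or problem.cost(state) < problem.cost(best):
            best = state
    return best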

 


 

Conclusion:

Due to the limitations of Hill Climbing, multiple variants have been thought of to overcome the problem of being stuck in local minima and maxima. Random-restart: conduct hill climbing multiple times. Each time, start from a random state. Compare the maxima from every trial, and choose the highest amongst those.

We have learned about the hill climbing algorithm and how a hill-climbing algorithm can be used to minimize a cost, and we saw a full implementation of a hill-climbing algorithm in Python. The problem with the hill climbing algorithm is that at some point the agent falls into a local minimum or maximum, which leads to failure in finding the optimized solution. That's why multiple variants exist to address this issue. Although each one still has the potential of ending up in a local minimum or maximum with no means to continue optimizing, the variants perform better than the regular hill climbing algorithm.

Markov Chain (Instructor Name: Nuruzzaman Faruqui)

 

Markov Chain 

 

Introduction

A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain. Markov chains are a fairly common, and relatively simple, way to statistically model random processes. They have been used in many different domains, ranging from text generation to financial modeling, and they model probabilities using only information that can be encoded in the current state. Something transitions from one state to another semi-randomly, or stochastically. Each state has a certain probability of transitioning to each other state, so each time you are in a state and want to transition, a Markov chain can predict outcomes based on pre-existing probability data.

This lab-work report was done by me under the supervision of Nuruzzaman Faruqui, lecturer at City University, Bangladesh. From this course we acquire a better understanding of how AI works and learn how AI is making our daily lives easier. This is the best Artificial Intelligence course in Bangladesh.

 

Problem Statement 

To construct a Markov chain, we need a transition model that specifies the probability distribution of the next event based on the possible values of the current event.

Imagine that there were two possible states for weather: sunny or rainy. We can always directly observe the current weather state, and it is guaranteed to always be one of the two aforementioned states.

Now that we have a transition model, let's build a Markov chain from it.



 
In the following figure, the probability of tomorrow being sunny given that today is sunny is 0.8. This is reasonable, because it is more likely than not that a sunny day will follow a sunny day. However, if it is rainy today, the probability of rain tomorrow is 0.7, since rainy days are more likely to follow each other. Using this transition model, it is possible to sample a Markov chain, and we implement the model in Python code.

 

 

 Code Commentary:
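
The code listing itself is not visible in this text version of the post. A minimal sketch of the model described above, assuming the pomegranate library (the same package used in the Bayesian network post below) and an arbitrary 50/50 starting distribution, could look like this:

from pomegranate import *

# Starting probability distribution over the two weather states
# (the 0.5/0.5 split is an assumption for illustration)
start = DiscreteDistribution({
    "sun": 0.5,
    "rain": 0.5
})

# Transition model: probability of tomorrow's weather given today's
transitions = ConditionalProbabilityTable([
    ["sun", "sun", 0.8],
    ["sun", "rain", 0.2],
    ["rain", "sun", 0.3],
    ["rain", "rain", 0.7]
], [start])

# Build the Markov chain and sample 50 states from it
model = MarkovChain([start, transitions])
print(model.sample(50))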



 

Result

When executing the code, we got the following output/result.

 
 
Conclusion:

From the discussion above, knowing how widely Markov chains are used, we should now be able to implement them easily in any language of our choice. There are also many more advanced properties of Markov chains and Markov processes to dive into. We can apply this model to problems such as predicting traffic flows, communication networks, genetic issues, and many more.

 

Thursday, April 8, 2021

Uncertainty – Bayesian Network (Instructor Name: Nuruzzaman Faruqui)

 

Introduction

"A Bayesian network is a probabilistic graphical model which represents a set of variables and their conditional dependencies using a directed acyclic graph." We know knowledge representation are using first-order logic and propositional logic with certainty, which means we were sure about the predicates. With this knowledge representation, we might write A→B, which means if A is true then B is true, but consider a situation where we are not sure about whether A is true or not then we cannot express this statement, this situation is called uncertainty. So to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning. Uncertainty appears when an agent is not completely sure about the outcome of the decisions it has made. It’s a data structure that represents probabilistic relations and dependencies among random variables. 

The properties of Bayesian networks are:

  • Networks are directed graphs.

  • Each node on the graph represents a random variable.

  • An arrow from X to Y represents that X is a parent of Y. That is, the probability distribution of Y depends on the value of X.

  • Each node X has probability distribution P(X | Parents(X)).

This is an important part of the Artificial Intelligence course, and this lab report was done by me under the supervision of Nuruzzaman Faruqui, lecturer at City University, Bangladesh. From this course we get to explore real, applicable approaches through AI and also acquire a better understanding of how AI works and how it is making our daily lives easier. That's why this is the best Artificial Intelligence course in Bangladesh.

Problem Statement:

Here we have a scenario to understand Bayesian networks under uncertainty.

Suppose Mr. X has an appointment to attend. To reach the destination, Mr. X may travel by train. His chances of not being able to attend the appointment are uncertain, and they depend on several random variables, for instance Rain, Maintenance (of the train track), and Train (arrival of the train). So we will use these variables to build a model and reduce the uncertainty as much as possible.


Figure: The Bayesian network model

 

Solution

In the above model, Rain is a root node, which means its probability distribution does not depend on any other node. The numbers are given arbitrarily for the sake of understanding and implementing in Python. These are probability distributions, so each one sums to 1.

none     light    heavy
0.7      0.2      0.1

Maintenance in our model depends on the root node Rain; its values {yes, no} define whether there will be maintenance on the train track.

R        yes      no
none     0.4      0.6
light    0.2      0.8
heavy    0.1      0.9

Train is a variable that indicates whether the train is on time or delayed. It depends on both Maintenance and the root node Rain, so their values affect the probability distribution of Train.

R        M        on time    delayed
none     yes      0.8        0.2
none     no       0.9        0.1
light    yes      0.6        0.4
light    no       0.7        0.3
heavy    yes      0.4        0.6
heavy    no       0.5        0.5

The final node in our model is Appointment, which depends solely on its parent node Train. Although Train has dependencies on other variables, the Appointment variable only looks at its parent node.

T          attend    miss
on time    0.9       0.1
delayed    0.6       0.4

 

Code Commentary:

# pomegranate is a Python package which implements fast and extremely flexible probabilistic models,
# ranging from probability distributions to Bayesian networks
from pomegranate import *

# Rain node has no parent
rain = Node(DiscreteDistribution({
    "none": 0.7,
    "light": 0.2,
    "heavy": 0.1
}), name="rain")

# Track maintenance node is conditional on rain
maintenance = Node(ConditionalProbabilityTable([
    ["none", "yes", 0.4],
    ["none", "no", 0.6],
    ["light", "yes", 0.2],
    ["light", "no", 0.8],
    ["heavy", "yes", 0.1],
    ["heavy", "no", 0.9]
], [rain.distribution]), name="maintenance")

# Train node is conditional on rain and maintenance
train = Node(ConditionalProbabilityTable([
    ["none", "yes", "on time", 0.8],
    ["none", "yes", "delayed", 0.2],
    ["none", "no", "on time", 0.9],
    ["none", "no", "delayed", 0.1],
    ["light", "yes", "on time", 0.6],
    ["light", "yes", "delayed", 0.4],
    ["light", "no", "on time", 0.7],
    ["light", "no", "delayed", 0.3],
    ["heavy", "yes", "on time", 0.4],
    ["heavy", "yes", "delayed", 0.6],
    ["heavy", "no", "on time", 0.5],
    ["heavy", "no", "delayed", 0.5],
], [rain.distribution, maintenance.distribution]), name="train")

# Appointment node is conditional on train
appointment = Node(ConditionalProbabilityTable([
    ["on time", "attend", 0.9],
    ["on time", "miss", 0.1],
    ["delayed", "attend", 0.6],
    ["delayed", "miss", 0.4]
], [train.distribution]), name="appointment")

# Create a Bayesian network and add states
model1 = BayesianNetwork()
model1.add_states(rain, maintenance, train, appointment)

# Add edges connecting nodes
model1.add_edge(rain, maintenance)
model1.add_edge(rain, train)
model1.add_edge(maintenance, train)
model1.add_edge(train, appointment)

# Finalize model
model1.bake()

By using this model, we can predict whether Mr. X can attend his appointment under some given conditions. Let's try that in Python.

# Calculate the probability for a given observation
reds_probability = model1.probability([["heavy", "yes", "delayed", "attend"]])

print(reds_probability)
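
For reference, by the chain rule for this network the observation factorizes as P(heavy) · P(yes | heavy) · P(delayed | heavy, yes) · P(attend | delayed) = 0.1 × 0.1 × 0.6 × 0.6 = 0.0036, which is the value the call above should evaluate to, using the probability tables given earlier.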

 

Result:

After implementing the code, we got the result below.


 

Conclusion

A Bayesian network is also called a Bayes network, belief network, decision network, or Bayesian model. Bayesian networks are probabilistic because they are built from a probability distribution and also use probability theory for prediction and anomaly detection. Real-world applications are probabilistic in nature, and to represent the relationships between multiple events we need a Bayesian network. It can be used in various tasks including prediction, anomaly detection, diagnostics, automated insight, reasoning, time series prediction, and decision making under uncertainty.
