

An Image for Computer Vision: Everything You Need to Know

 Image

An image consists of a set of pixels, which are the building blocks of any image. Every pixel defines a color or an intensity of light.

  

Suppose an image has a resolution of 1000 x 750, which means that it is 1000 pixels wide and 750 pixels tall. The total number of pixels in the image is then 1000 * 750 = 750,000.
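The arithmetic above can be checked in a few lines of Python. This is only a small sketch, and the variable names are made up for illustration.

```python
# Hypothetical variable names, chosen only for this illustration
width, height = 1000, 750  # resolution from the example above

total_pixels = width * height
print(total_pixels)  # 750000
```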

An image can be one of two types:

  • Grayscale
  • Color

In a grayscale image (assuming the usual 8 bits per pixel), each pixel has a value between 0 and 255, where 0 means the pixel is black and 255 means the pixel is white. The values in between represent various shades of gray. The matrix obtained from a grayscale image is 2-dimensional, i.e. it has only a width and a height.
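As a rough illustration, a grayscale image can be built by hand as a 2-D NumPy array. The pixel values below are made up, and the sketch assumes 8-bit pixels. Note that NumPy reports the shape as (height, width), so rows come first.

```python
import numpy as np

# A tiny made-up grayscale "image": 2 rows tall, 3 columns wide
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

print(gray.ndim)    # 2
print(gray.shape)   # (2, 3)
print(gray[0, 0])   # 0, i.e. black
print(gray[0, 2])   # 255, i.e. white
```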

A color image is represented in the RGB color space. The matrix obtained from a color image is a 3D matrix with width, height and depth. Pixels in the RGB color space are no longer a scalar value as in a grayscale (single-channel) image. Instead, each pixel is represented by a list of three values: one for the Red component, one for Green, and one for Blue.
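The same idea can be sketched for a color image: a 3-D NumPy array in which every pixel holds three values. The array and its pixel values are invented for illustration and assume 8 bits per channel.

```python
import numpy as np

# A made-up 2 x 2 color image in RGB order, 8 bits per channel
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]      # pure red
rgb[0, 1] = [0, 255, 0]      # pure green
rgb[1, 0] = [0, 0, 255]      # pure blue
rgb[1, 1] = [255, 255, 255]  # white

print(rgb.shape)  # (2, 2, 3)

# A single pixel is a list of three values rather than a scalar
r, g, b = rgb[0, 0]
print(r, g, b)  # 255 0 0
```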

Image processing libraries such as OpenCV and scikit-image represent color images as multi-dimensional NumPy arrays with shape (height, width, depth), where the depth is fixed at 3. Note that OpenCV stores the channels in reverse order, i.e. BGR, whereas scikit-image keeps them in RGB order.
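Because OpenCV hands back the channels in BGR order, it is often necessary to flip them before passing the array to code that expects RGB. The sketch below uses a hand-made one-pixel array as a stand-in for the output of cv2.imread, so it runs with NumPy alone.

```python
import numpy as np

# Stand-in for an array returned by cv2.imread: one pixel stored as B, G, R
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # pure blue in BGR order

# Reversing the last axis turns B, G, R into R, G, B
rgb = bgr[..., ::-1]

print(bgr.shape)           # (1, 1, 3), i.e. (height, width, depth)
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

With OpenCV installed, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) performs the same conversion.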

Before feeding images to a neural network, they need to be preprocessed so that every image in the set has the same size and aspect ratio. Common choices for the width and height of images fed to Convolutional Neural Networks include 32×32, 64×64, 224×224, 227×227, 256×256, and 299×299.
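One way to bring every image to a common size is sketched below. A plain NumPy nearest-neighbour resize stands in for a library call such as cv2.resize, the helper name resize_nearest is hypothetical, and the input is random data rather than a real photo. Dividing by 255 to get values in [0, 1] is an extra assumption about what the network expects.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an (H, W) or (H, W, C) array by nearest-neighbour sampling."""
    in_h, in_w = image.shape[:2]
    # For each output row and column, pick the index of the source row and column
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]

# Made-up input: a 750 x 1000 color image filled with random values
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(750, 1000, 3), dtype=np.uint8)

resized = resize_nearest(image, 224, 224)
scaled = resized.astype(np.float32) / 255.0  # pixel values in [0, 1]

print(resized.shape)  # (224, 224, 3)
```

This version simply stretches the image to the target size, so the original aspect ratio is not preserved. Cropping to a square first is a common alternative.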

Loading an Image Using the OpenCV Library

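import cv2

# Load the image from disk as a NumPy array (channels in BGR order)
image = cv2.imread("example.png")
if image is None:  # imread returns None when the file cannot be read
    raise FileNotFoundError("could not read example.png")
print(image.shape)  # (height, width, depth)
cv2.imshow("Image", image)
cv2.waitKey(0)  # wait for a key press before continuing
cv2.destroyAllWindows()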

*Run pip install opencv-python if the cv2 module is not found.
