
Intro to Generative Adversarial Networks | GANs 001

   The name GAN is made up of three terms: Generative Adversarial Network. Let's understand these three terms first. Generative: a generative model takes training samples drawn from some distribution and learns to represent that distribution. Adversarial: it basically means conflicting or opposing. Networks: these are neural networks. So, Generative Adversarial Networks are a deep neural network architecture comprising two neural networks that compete with each other to produce a generative model. A GAN consists of two models. Discriminative model: it discriminates between two different classes of data; it tries to tell real data from the fakes created by the generator. Generative model: the generator turns noise into an imitation of the data to try to trick the discriminator. Mathematically, a generative model 'G' to be trained on training data 'X' sampled from some true distribution 'D' is the one which, given some standard random distrib...

An Image For Computer Vision, Everything You Need To Know

 Image

An image consists of a set of pixels, which are the building blocks of any image. Every pixel defines a color or an intensity of light.

  

 Suppose an image has a resolution of 1000 x 750, which means that it is 1000 pixels wide and 750 pixels tall. The total number of pixels in the image is then 1000 * 750 = 750,000 pixels.
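
The arithmetic above can be checked in a couple of lines. This is only a minimal sketch, and the variable names are illustrative.

```python
# Resolution from the example above: 1000 pixels wide, 750 pixels tall
width, height = 1000, 750

# The total pixel count is the width multiplied by the height
total_pixels = width * height
print(f"{total_pixels:,}")  # prints 750,000
```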

An image can be of two types:

  • Grayscale
  • Color

A grayscale image (at the usual 8 bits per pixel) has pixel values between 0 and 255, where 0 means the pixel is black and 255 means the pixel is white. The values in between represent various shades of gray. The matrix obtained from a grayscale image is 2-dimensional, i.e. it has a width and a height.
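
To make the 2-dimensional layout concrete, here is a small sketch that builds a made-up grayscale image by hand with NumPy. The pixel values are hypothetical and chosen only to show black, white and a few grays.

```python
import numpy as np

# A hypothetical grayscale image with 2 rows (height) and 3 columns (width)
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

print(gray.ndim)   # 2: one axis for height, one for width
print(gray.shape)  # (2, 3)
print(gray[0, 0])  # 0, a black pixel
print(gray[0, 2])  # 255, a white pixel
```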

A color image is represented in the RGB color space. The matrix obtained from a color image is a 3D matrix with width, height and depth. Pixels in the RGB color space are no longer scalar values as in a grayscale (single-channel) image; instead, each pixel is represented by a list of three values: one for the Red component, one for Green, and one for Blue.
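
The same idea for a color image, again as a hand-built sketch with NumPy. The 2 x 2 image and its colors are made up for illustration, and each pixel holds three values in R, G, B order.

```python
import numpy as np

# A hypothetical 2 x 2 color image in RGB order, shape (height, width, depth)
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]      # pure red
rgb[0, 1] = [0, 255, 0]      # pure green
rgb[1, 0] = [0, 0, 255]      # pure blue
rgb[1, 1] = [255, 255, 255]  # white

print(rgb.shape)  # (2, 2, 3)

# A single pixel is a list of three values rather than one scalar
r, g, b = rgb[0, 0]
print(int(r), int(g), int(b))  # 255 0 0
```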

 Image processing libraries such as OpenCV and scikit-image represent RGB images as multi-dimensional NumPy arrays with shape (height, width, depth). Note that OpenCV stores the channels in reverse order, i.e. BGR, whereas scikit-image keeps them in RGB order. For a color image the depth is fixed at depth=3.
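
Because OpenCV hands back channels in BGR order, an image usually has to be converted before it is passed to code that expects RGB. The sketch below uses a hand-built one-pixel array as a stand-in for what cv2.imread would return, and reverses the last axis with NumPy slicing. With OpenCV installed, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) performs the same conversion.

```python
import numpy as np

# Stand-in for an image loaded by OpenCV: one pixel stored as (B, G, R)
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # pure blue in BGR order

# Reverse the channel axis to get the pixel in (R, G, B) order
rgb = bgr[..., ::-1]

print(bgr.shape)           # (1, 1, 3)
print(rgb[0, 0].tolist())  # [0, 0, 255]
```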

Before images are fed to a neural network, they need to be preprocessed by rescaling so that every image in the set has the same size and aspect ratio. Common choices of width and height for images fed to Convolutional Neural Networks include 32×32, 64×64, 224×224, 227×227, 256×256, and 299×299.
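
As a rough illustration of this rescaling step, the sketch below brings images of different sizes to a common 32×32 shape using nearest-neighbour sampling written in plain NumPy. The helper resize_nearest and the blank input images are hypothetical. In practice a library routine such as cv2.resize would normally be used, and this simple version squashes the image rather than preserving its aspect ratio.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Resize an image array with nearest-neighbour sampling (hypothetical helper)."""
    height, width = image.shape[:2]
    # Map every output row and column back to a source row and column
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    return image[rows[:, None], cols]

# Hypothetical inputs of different sizes, all brought to 32 x 32
images = [np.zeros((750, 1000, 3), dtype=np.uint8),
          np.zeros((480, 640, 3), dtype=np.uint8)]
batch = np.stack([resize_nearest(img, 32, 32) for img in images])
print(batch.shape)  # (2, 32, 32, 3)
```

Once every image has the same shape, the images can be stacked into a single array, which is the form a network typically takes as input.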

 Loading an Image using OpenCV Library  

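import cv2

# Load the image from disk; imread returns None if the file cannot be read
image = cv2.imread("example.png")
if image is None:
    raise FileNotFoundError("example.png could not be read")
print(image.shape)  # (height, width, depth)
cv2.imshow("Image", image)
cv2.waitKey(0)  # wait for a key press before closing the window
cv2.destroyAllWindows()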

*Run pip install opencv-python if the cv2 module is not found
