Image recognition is a classic application of neural networks. Let's recall how a network is trained, what difficulties arise along the way, and why it helps to borrow ideas from biology.

Dmitry Soshnikov will help us tell the story. He is a Microsoft technical evangelist, a member of the Russian Association of Artificial Intelligence, and a lecturer in functional and logic programming for AI at the Moscow Aviation Institute, the Moscow Institute of Physics and Technology and the Higher School of Economics, as well as an instructor in our courses.

Imagine that we have a lot of pictures that need to be sorted into two piles using a neural network. How can this be done? Of course, everything depends on the objects themselves, but we can always highlight some features.

We need to learn as much as possible about the input data and build that knowledge in by hand, even before training the network. For example, if the task is to detect cats of various colors in a picture, what matters is the shape of the object, not its color. If we discard color by converting the image to grayscale, the network learns much faster and more successfully: it has to process several times less information.
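Dropping color before training can be sketched as follows. This is a minimal illustration, not the author's preprocessing: the function name `to_grayscale`, the 28x28 image size and the BT.601 luma weights are all assumptions chosen for the example.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB image into an (H, W) luminance image."""
    # Assumed weighting (ITU-R BT.601 luma); any reasonable weighting would do.
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

# Hypothetical input: a random 28x28 color image.
rgb = np.random.default_rng(0).random((28, 28, 3))
gray = to_grayscale(rgb)
print(rgb.size, "->", gray.size)  # 2352 -> 784
```

The network now receives one value per pixel instead of three.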

For recognizing arbitrary objects, such as cats and frogs, color obviously matters: frogs are green and cats are not. But if we keep the color channels, the network has to learn to recognize objects again for each channel, because each color channel is fed to a different set of neurons.

But what if we want to destroy the famous meme about cats and bread by teaching a neural network to detect an animal in any picture? It would seem that the colors and shape are approximately the same. What to do then?

Filter banks and biological vision

With different filters, you can pick out different fragments of an image and then detect and examine them as separate features, for example as input to traditional machine learning or to a neural network. If the neural network has additional information about the structure of the objects it works with, the quality of its results improves.

In the field of machine vision, filter banks have been developed - sets of filters to highlight the main features of objects.

A similar "architecture" is found in biology. Scientists believe that human vision does not process the image as a whole, but picks out characteristic, distinctive features by which the brain identifies an object. Accordingly, to recognize an object quickly and correctly, it is enough to identify its most distinctive features. For example, cats have whiskers, which appear in the image as fan-shaped, roughly horizontal lines.

Weight Sharing

So that the network does not have to learn separately to recognize cats in different parts of a picture, we "share" the weights responsible for recognition between different fragments of the input signal.

This requires a specialized network architecture:

  • convolutional networks for working with images
  • recurrent networks for working with text / sequences

The neural networks that work well for image recognition use special convolutional layers (Convolution Layers).

The main idea is this:

  • Weight sharing creates a "filter window" that slides over the image
  • The filter applied to the image highlights the fragments that are important for recognition
  • Whereas in traditional machine vision filters were designed by hand, neural networks let us find optimal filters through training
  • Image filtering combines naturally with neural network computation
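To get a feel for what weight sharing saves, here is a rough count of trainable weights. The sizes (a 28x28 image, 16 units or filters, a 5x5 window) are assumptions picked for illustration, and biases are ignored.

```python
def dense_weights(height, width, units):
    """Weights in a fully connected layer: every unit sees every pixel."""
    return height * width * units

def conv_weights(kernel_size, filters):
    """Weights in a convolutional layer: each filter reuses one small window everywhere."""
    return kernel_size * kernel_size * filters

dense = dense_weights(28, 28, 16)  # 12544
conv = conv_weights(5, 16)         # 400
print(dense, conv)
```

Under these assumptions the convolutional layer has about 30 times fewer weights, and that number does not grow with the image size.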


For image processing, convolution is used, as in signal processing.

Let's describe the convolution function with the following parameters:

  • kernel - convolution kernel, weight matrix
  • pad - how many pixels to add to the image around the edges
  • stride - how often the filter is applied. For example, with stride=2 we take every second pixel of the image vertically and horizontally, halving the resolution along each axis
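
How these three parameters determine the size of the result can be sketched with the usual output-size formula. The helper name `conv_output_size` and the 28-pixel image side are assumptions for illustration; the formula counts the positions where the kernel fits entirely inside the padded image.

```python
def conv_output_size(size, kernel_size, pad=0, stride=1):
    """Number of kernel positions along one axis of a padded image."""
    return (size + 2 * pad - kernel_size) // stride + 1

print(conv_output_size(28, 5))                   # 24
print(conv_output_size(28, 5, pad=2))            # 28
print(conv_output_size(28, 5, pad=2, stride=2))  # 14
```

With a pad of half the kernel size the output keeps the input resolution, and stride=2 then halves it.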
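
    import builtins

    import numpy as np
    from matplotlib import gridspec, pylab


    def convolve(image, kernel, pad=0, stride=1):
        """Slides the kernel over a 2-D image and returns the filtered image."""
        rows, columns = image.shape
        # Ceiling division keeps the last window in bounds for odd sizes.
        output_rows = (rows + stride - 1) // stride
        output_columns = (columns + stride - 1) // stride
        result = np.zeros((output_rows, output_columns))
        if pad > 0:
            image = np.pad(image, pad, "constant")
        kernel_size = kernel.size
        kernel_length = kernel.shape[0]
        half_kernel = kernel_length // 2
        kernel_flat = kernel.reshape(kernel_size, 1)
        offset = builtins.abs(half_kernel - pad)
        for r in range(offset, rows - offset, stride):
            for c in range(offset, columns - offset, stride):
                # Top-left corner of the window in the (padded) image
                rr = r - half_kernel + pad
                cc = c - half_kernel + pad
                patch = image[rr:rr + kernel_length, cc:cc + kernel_length]
                # Weighted sum of the window, written as a dot product
                result[r // stride, c // stride] = np.dot(
                    patch.reshape(1, kernel_size), kernel_flat)[0, 0]
        return result


    def show_convolution(kernel, stride=1):
        """Displays the effect of convolving with the given kernel."""
        fig = pylab.figure(figsize=(9, 9))
        # Assumed row heights: images, kernel, results
        gs = gridspec.GridSpec(3, 3, height_ratios=[3, 1, 3])
        start = 1
        for i in range(3):
            # images_train is assumed to be loaded elsewhere
            # as an array of 2-D grayscale images
            image = images_train[start + i]
            conv = convolve(image, kernel, kernel.shape[0] // 2, stride)
            # Top row: source image
            ax = fig.add_subplot(gs[i])
            pylab.imshow(image, interpolation="nearest")
            ax.set_xticks([])
            ax.set_yticks([])
            # Middle row: the kernel itself
            ax = fig.add_subplot(gs[i + 3])
            pylab.imshow(kernel, cmap="gray", interpolation="nearest")
            ax.set_xticks([])
            ax.set_yticks([])
            # Bottom row: filtered image
            ax = fig.add_subplot(gs[i + 6])
            pylab.imshow(conv, interpolation="nearest")
            ax.set_xticks([])
            ax.set_yticks([])
        pylab.show()


    # 5x5 Gaussian approximation; the entries sum to 273
    blur_kernel = np.array([[1, 4, 7, 4, 1],
                            [4, 16, 26, 16, 4],
                            [7, 26, 41, 26, 7],
                            [4, 16, 26, 16, 4],
                            [1, 4, 7, 4, 1]], dtype="float32")
    blur_kernel /= 273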

Filters

Blur

The blur filter lets you smooth out bumps and emphasize the overall shape of objects.


    show_convolution(blur_kernel)

Vertical edges

You can come up with a filter that highlights the vertical transitions of brightness in the image. Here, blue indicates the transition from black to white, yellow - vice versa.


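    # Gaussian-weighted columns with opposite signs on either side of the center;
    # the absolute values sum to 166
    vertical_edge_kernel = np.array([[1, 4, 0, -4, -1],
                                     [4, 16, 0, -16, -4],
                                     [7, 26, 0, -26, -7],
                                     [4, 16, 0, -16, -4],
                                     [1, 4, 0, -4, -1]], dtype="float32")
    vertical_edge_kernel /= 166

    show_convolution(vertical_edge_kernel)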

Horizontal edges

A similar filter can be built to highlight horizontal strokes in an image.


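    # A bright horizontal bar flanked by darker rows;
    # the absolute values sum to 132
    horizontal_bar_kernel = np.array([[0, 0, 0, 0, 0],
                                      [-2, -8, -13, -8, -2],
                                      [4, 16, 26, 16, 4],
                                      [-2, -8, -13, -8, -2],
                                      [0, 0, 0, 0, 0]], dtype="float32")
    horizontal_bar_kernel /= 132

    show_convolution(horizontal_bar_kernel)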

Contour filter

You can also build a 9x9 filter that will highlight the contours of the image.


    blob_kernel = np.array([, , , , , , , , ], dtype="float32")
    blob_kernel /= np.sum(np.abs(blob_kernel))

    show_convolution(blob_kernel)

This is how the classic example of digit recognition works: each digit has its own characteristic geometric features (two circles for an eight, a slash across half the image for a one, and so on), from which the neural network can determine which digit it is looking at. We create filters that characterize each digit, run each filter over the image and reduce the error to a minimum.


If we apply a similar approach to finding cats in a picture, it quickly becomes clear that a cat has a great many features the network would need to learn, and they all vary: tails, ears, whiskers, noses, fur and color. One cat may have nothing in common with another. A neural network with little information about the structure of the object will not be able to understand that one cat is lying down and another is standing on its hind legs.

Basic idea of a convolutional network

  • We create a convolutional layer in the neural network, which ensures that the filter is applied to the image.
  • We train filter weights using the backpropagation algorithm

For example, suppose we have an image i and 2 convolutional filters w with outputs o. Each element of an output image is the sum of the image pixels under the filter window, each multiplied by the corresponding filter weight.
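
The calculation for an image i, two filters w and outputs o can be sketched like this. The 4x4 image, the 2x2 filter values and the function name `apply_filters` are assumptions for illustration; no padding is used and stride is 1.

```python
import numpy as np

def apply_filters(i, w):
    """Apply a stack of filters w of shape (n, k, k) to a 2-D image i."""
    n, k, _ = w.shape
    rows, cols = i.shape[0] - k + 1, i.shape[1] - k + 1
    o = np.zeros((n, rows, cols))
    for f in range(n):
        for r in range(rows):
            for c in range(cols):
                # o[f, r, c] = sum over a, b of w[f, a, b] * i[r + a, c + b]
                o[f, r, c] = np.sum(w[f] * i[r:r + k, c:c + k])
    return o

# Hypothetical data: a 4x4 image and two 2x2 filters.
i = np.arange(16, dtype=float).reshape(4, 4)
w = np.array([[[1, 0], [0, 1]],   # sums the main diagonal of each window
              [[0, 1], [1, 0]]],  # sums the anti-diagonal of each window
             dtype=float)
o = apply_filters(i, w)
print(o.shape)  # (2, 3, 3)
```

Each filter produces its own output image, so two filters give two output maps.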

Weight training

The algorithm is:
  • A filter with the same weights is applied at every pixel position of the image.
  • In effect, the filter "runs" over the entire image.
  • We want to train these weights (shared by all pixels) using the backpropagation algorithm.
  • To do this, we need to reduce the application of the filter to a single matrix multiplication.
  • Compared with a fully connected layer, there are fewer weights to train and more examples to train them on.
  • The trick is im2col.
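
A minimal sketch of training shared weights, assuming plain gradient descent on a mean squared error instead of a full backpropagation framework (for a single linear filter the gradient is the same). The random image, the target kernel, the learning rate and the helper `patches_as_rows` are all hypothetical.

```python
import numpy as np

def patches_as_rows(image, k):
    """Return every k x k window of a 2-D image as one flattened row."""
    rows, cols = image.shape[0] - k + 1, image.shape[1] - k + 1
    return np.array([image[r:r + k, c:c + k].ravel()
                     for r in range(rows) for c in range(cols)])

rng = np.random.default_rng(0)
image = rng.standard_normal((20, 20))
true_kernel = np.array([[0.0, 1.0, 0.0],
                        [1.0, -4.0, 1.0],
                        [0.0, 1.0, 0.0]])

X = patches_as_rows(image, 3)      # one training example per window position
target = X @ true_kernel.ravel()   # output the shared filter should reproduce

weights = np.zeros(9)              # the 9 shared weights being trained
learning_rate = 0.1
for _ in range(500):
    error = X @ weights - target
    gradient = 2 * X.T @ error / len(X)  # gradient of the mean squared error
    weights -= learning_rate * gradient

learned_kernel = weights.reshape(3, 3)
print(np.round(learned_kernel, 3))
```

A single 20x20 image yields 324 training examples for only 9 weights, which is the "fewer weights, more examples" effect from the list above.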

im2col

Let's start with the image x, where each pixel corresponds to a letter:
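
The idea can be sketched on a 3x3 image whose pixels are labelled with letters. The helper below is an illustrative assumption, not the author's implementation: it stores each 2x2 window as a row, whereas im2col implementations often store them as columns (the transpose).

```python
import numpy as np

def im2col(x, k):
    """Stack every k x k window of x as one row of a matrix."""
    rows, cols = x.shape[0] - k + 1, x.shape[1] - k + 1
    windows = [x[r:r + k, c:c + k].ravel()
               for r in range(rows) for c in range(cols)]
    return np.array(windows), (rows, cols)

# Pixels labelled a..i to show which ones land in each window.
x = np.array(list("abcdefghi")).reshape(3, 3)
letter_windows, _ = im2col(x, 2)
print(letter_windows)
# [['a' 'b' 'd' 'e']
#  ['b' 'c' 'e' 'f']
#  ['d' 'e' 'g' 'h']
#  ['e' 'f' 'h' 'i']]

# With numbers, applying the filter becomes one matrix product.
image = np.arange(9, dtype=float).reshape(3, 3)
w = np.array([[1.0, 2.0], [3.0, 4.0]])
columns, out_shape = im2col(image, 2)
out = (columns @ w.ravel()).reshape(out_shape)
print(out)
# [[27. 37.]
#  [57. 67.]]
```

The same window matrix can be multiplied by several flattened filters at once, which is why this layout suits training.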

The program does not always work correctly. The output can be monsters with four eyes or no ears, stretched into a star shape and smeared across the canvas. You can create a monster yourself by drawing something absurd in the first window.


Using the program is simple. On the left is the drawing window. Below it are three buttons: undo, clear and random drawing. Between the two squares is the "process" button. It turns the drawing into a cat.

The program is based on a self-learning "neural network". According to the developer, the machine processed 20,000 photos of cats. It picked out elements in them, such as ears, fur, nose, eyes and mouth, and learned to recognize them and tell them apart by their outlines.


The eyes are scary.

It does not work perfectly. The eyes come out especially badly. The boundaries in the drawing are not always detected clearly, so extra eyes appear, or none appear at all.

The results are funny. The service is not limited to cats: on the site you can also build a house from blocks, put together shoes and design a bag for the next season.

Fashion bag for summer. Exclusive design!

More recently, developer Christopher Hesse unveiled his brainchild, the edges2cats project. With the help of a neural network, drawn cats turn into “real” ones. The idea is built on TensorFlow, a machine learning system from Google. Edges2cats is divided into two "fields". In the first, the user draws a cat (or something similar to it), and in the second, the neural network tries to make the drawing look like a real animal.

The simple pastime appealed to Internet users, who started posting their new pets on Twitter en masse. In some cases, the “image” created by the neural network looked very realistic, like a photograph of a real living creature. Some users tried to make their cats cute (and sometimes even succeeded), but in many cases real monsters were born.

Note that the eye recognition system does not always work correctly: in some pictures the animal's eyes are missing entirely, while in others a pupil may end up where the nose should be.


