
Ministry of Education and Science of the Russian Federation

State educational institution of the Moscow region

International University of Nature, Society and Man "Dubna"

Master's thesis

Topic: Quantum neural networks in learning and control processes

Student: Afanasyeva Olga Alexandrovna

Abstract

This work is devoted to the analysis of quantum neural networks (QNNs) and their practical application.

The solution of these problems is closely related to the development of quantum programming methods and is of theoretical and practical interest for the design of robust intelligent control under risk and unforeseen control situations, taking into account quantum effects in the formation of the information process of self-organization of knowledge bases.

To achieve these goals, the literature of foreign authors was studied and examples of the use of QNNs in control processes were considered.

The result of the work is a comparative analysis of classical and quantum neurons. New quantum operators such as superposition, quantum correlation and interference are proposed. The QNN makes it possible to survey the search and learning space and to improve the accuracy and robustness of the process of approximating the learning signal.

The work was carried out under the scientific supervision of Doctor of Physical and Mathematical Sciences, Professor S.V. Ulyanov at the Institute of System Analysis and Management of the International University of Nature, Society and Man "Dubna".

Introduction

1. Statement of the problem

1.1 Purpose

1.2 Background

1.3 Research component

2. Scientific component

2.1 Architecture of Quantum Neural Networks

2.2 Why Quantum Neural Networks are interesting

2.3 Quantum neuron

2.4 Building a Quantum Neural Network

2.5 Quantum computing

2.6 QNN models

2.7 Quantum state and its representation

3. QNN training

3.1 Application of Quantum Neural Networks. The meaning of the supervised learning algorithm

3.2 Single-layer and multilayer perceptrons

3.2.1 Single-layer perceptron. Training

3.2.2 Multilayer perceptron. Multilayer perceptron training

3.3 Back Propagation Algorithm

3.4 Genetic algorithm. The classic traveling salesman problem

4. Automatic object management

4.1 Control object

4.2 Robotics as a direction of Artificial Intelligence

4.2.1 General block diagram of the robot

4.2.2 Conceptual model

4.3 Efficient control of the quantum spin register. Cryptography and quantum teleportation

5. Practical part. Examples of Quantum Neural Networks

5.1 Inverted pendulum

5.2 Image compression

5.3 Alphabet coding

6. MATLAB Neural Network Toolbox. Kohonen network

Conclusion

Bibliography

Introduction

Today, as well as a hundred years ago, there is no doubt that the brain works more efficiently and in a fundamentally different way than any human-made computing machine. It is this fact that for so many years has been motivating and guiding the work of scientists around the world to create and study artificial neural networks.

Artificial Neural Networks (ANNs) have some attractive features, such as parallel distributed processing, error tolerance, and the ability to learn and to generalize acquired knowledge. The generalization property is understood as the ability of an ANN to generate correct outputs for input signals that were not used during training. These properties make the ANN an information-processing system that can solve complex multidimensional problems beyond the power of other techniques. However, ANNs also face many challenges, including the lack of deterministic rules for choosing an optimal architecture, limited memory capacity, long training times, and so on.

Artificial Neural Networks have come into practice wherever it is necessary to solve problems of forecasting, classification or control. This impressive success is due to several reasons:

· Rich capabilities. Neural networks are an exceptionally powerful modeling technique that can reproduce extremely complex dependencies. For many years, linear modeling has been the dominant technique in most fields because of its well-established optimization procedures, but in problems where the linear approximation is unsatisfactory (and there are quite a few of them) linear models work poorly. In addition, neural networks cope with the "curse of dimensionality", which makes it difficult to model dependencies on a large number of variables.

· Easy to use. Neural networks learn from examples. The user selects representative data and then runs a learning algorithm that automatically extracts the structure of the data. The user does, of course, need some heuristic knowledge of how to select and prepare data, how to choose a suitable network architecture and how to interpret the results; however, the level of knowledge required for successful application of neural networks is much more modest than, for example, for traditional statistical methods.

In the field of ANNs, some pioneers introduced quantum computation into this discussion, proposing such concepts as quantum neural computation, absorbed quantum neural networks, quantum associative memory and parallel learning. They constructed a foundation for the further study of quantum computing in ANNs. In the course of this work, a field of artificial neural networks based on quantum-theoretical concepts and methods has appeared. Such networks are called Quantum Neural Networks (QNNs).

1. Statement of the problem

1.1 Purpose

Research and analysis of Quantum Neural Networks, their practical application.

Directions of research work

· Reveal the advantages of quantum neural networks over classical networks.

· Consider examples of the use of quantum neural networks in intelligent control processes.

· Conduct a simulation of the operation of a quantum neuron on a classical computer.

· Modeling a Data Clustering Network in MATLAB.

· Consider a specific example from robotics (robot manipulator).

1.2 Background

· Nine volumes on quantum computing and quantum programming, edition of the University of Milan, author Ulyanov S.V.

· Nilsson monograph, 2006.

· Website www.qcoptimizer.com.

1.3 Research component

The study relates to an innovative technology: the development of quantum neural networks in the field of intelligent control systems with learning. The solution of these problems is closely related to the development of quantum programming methods and is of theoretical and practical interest for the design of robust intelligent control under risk and unforeseen control situations, taking into account quantum effects in the formation of the information process of self-organization of knowledge bases.

2. Scientific component

2.1 Architecture of Quantum Neural Networks

A system can be called neural if at least one neuron can be identified in it. A neural system is a quantum neural system if it is capable of quantum computing.

There are several different approaches to what can be called quantum neural networks. Different researchers use their own analogies to establish a connection between quantum mechanics and artificial neural networks. Some basic concepts of these two areas are given in Table 1:

Table 1. Basic concepts of quantum mechanics and the theory of neural networks

Classical neural networks              Quantum neural networks
Neuron                                 State of qubits
Connections                            Entanglement
Learning rule                          Superposition of entangled states
Search for a winner                    Interference as a unitary transformation
Output result                          Decoherence (measurement)

Pairs of concepts that appear in the same row of the table should not be regarded as established analogies; in fact, establishing such analogies is one of the main tasks of the theory of quantum neural networks. So far, quantum representations have mainly been used to implement classical computing. The concept of quantum computing was introduced in 1982 by Richard Feynman, who was investigating the role of quantum effects in future processors whose elements could be atomic in size. In 1985, David Deutsch formulated the concept of a universal quantum computer. It is important to note that the efficiency of neural networks is associated with massively parallel distributed processing of information and with the nonlinearity of the transformation of input vectors by neurons. On the other hand, quantum systems possess a much more powerful quantum parallelism, expressed by the principle of superposition.

When developing the concepts of quantum, classical and neural computing, an important role is played by the chosen interpretation of quantum mechanics, among which are:

the Copenhagen interpretation;

Feynman's path-integral formalism;

Everett's many-worlds interpretation, etc.

The choice of interpretation is important when establishing analogies between quantum mechanics and neurocomputing. In particular, it matters for the problem of reconciling such an essentially linear theory as quantum mechanics with the essentially nonlinear data processing that determines the power of neurotechnology.

2.2 Why Quantum Neural Networks are interesting

There are two main reasons for the interest in quantum neural networks. One has to do with arguments that quantum processes may play an important role in how the brain works. For example, Roger Penrose has argued that only new physics, which should unify quantum mechanics with the general theory of relativity, could describe such phenomena as understanding and consciousness. However, his approach is addressed not to neural networks themselves but to intracellular structures such as microtubules. The other reason is related to the rapid growth of quantum computing, whose main ideas could well be transferred to neurocomputing, opening up new opportunities.

Quantum neural systems may bypass some of the difficult issues essential for quantum computing thanks to their ability, like that of classical networks, to learn from a limited number of examples.

What can be expected from quantum neural networks? Currently, quantum neural networks have the following advantages:

exponential memory capacity;

better performance with fewer hidden neurons;

fast learning;

elimination of catastrophic forgetting due to the absence of image interference;

solving linearly inseparable problems with a single-layer network;

lack of connections;

high data processing speed (10^10 bits/s);

miniaturization (10^11 neurons/mm³);

higher stability and reliability.

These potential advantages of quantum neural networks are the main motivation for their development.

2.3 Quantum neuron

Synapses communicate between neurons and multiply the input signal by a number characterizing the strength of the connection - the weight of the synapse. The adder performs the addition of signals coming through synaptic connections from other neurons and external input signals. The converter implements the function of one argument, the output of the adder, into some output value of the neuron. This function is called the activation function of the neuron.

Thus, the neuron is completely described by its weights and activation function F. Having received a set of numbers (vector) as inputs, the neuron produces some number at the output.

The activation function can be of various types. The most widely used options are listed in the table (Table 2).

Table 2: List of neuronal activation functions

Name                          Formula                                              Value range
Threshold                     f(s) = 1 for s ≥ 0, 0 otherwise                      (0, 1)
Sign                          f(s) = sign(s)                                       (-1, 1)
Sigmoid                       f(s) = 1 / (1 + e^(-s))                              (0, 1)
Semi-linear                   f(s) = s for s > 0, 0 otherwise                      (0, ∞)
Linear                        f(s) = s                                             (-∞, ∞)
Radial basis                  f(s) = exp(-s²)                                      (0, 1)
Semi-linear with saturation   f(s) = 0 for s ≤ 0, s for 0 < s < 1, 1 for s ≥ 1     (0, 1)
Linear with saturation        f(s) = -1 for s ≤ -1, s for |s| < 1, 1 for s ≥ 1     (-1, 1)
Hyperbolic tangent            f(s) = tanh(s)                                       (-1, 1)
Triangular                    f(s) = 1 - |s| for |s| ≤ 1, 0 otherwise              (0, 1)

The definition of a Quantum Neuron is given as follows:

It receives input signals (input data or output signals from other neurons of the QNN) through several input channels. Each input signal passes through a connection having a certain intensity (or weight); this weight corresponds to the synaptic activity of the neuron. A specific threshold value is associated with each neuron. The weighted sum of the inputs is calculated, the threshold value is subtracted from it, and the result is the activation value of the neuron (also called the post-synaptic potential, PSP).

The activation signal is transformed using an activation function (or transfer function) and as a result, the output signal of the neuron is obtained (Fig. 1).

Fig 1: Mathematical model of a neuron

A mathematical model of a quantum neuron can be written, by analogy with the classical one, as |y> = F(Σ_j w_j |x_j>), where the weights w_j are matrices (operators) acting on the basis states and F is an operator; such a neuron can be implemented by a network of quantum cells.

For example, in the simplest setting of the learning process of a quantum neuron, F = I is the identity operator, so that |y(t)> = Σ_j w_j(t) |x_j>.

The quantum learning rule is given in analogy with the classical case as w_j(t+1) = w_j(t) + η(|d> − |y(t)>)<x_j|, where |d> is the desired output. This learning rule brings the quantum neuron into the desired state used for learning. Taking the squared modulus of the difference between the actual and the desired output, one can check that this error decreases in the course of training.

An entire network can be assembled from primitive elements using the standard rules of ANN architectures.
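As a simple illustration of these formulas, the following Python sketch simulates the operation and training of a single quantum neuron on a classical computer. It is a minimal model under the assumptions above (2x2 complex weight operators, the identity taken as the activation operator F, the learning rule written out above); all names and the example inputs are illustrative.

import numpy as np

def normalize(state):
    # Normalize a quantum state vector to unit length.
    return state / np.linalg.norm(state)

# Basis states |0> and |1> as complex column vectors.
ket0 = np.array([1.0 + 0j, 0.0 + 0j])
ket1 = np.array([0.0 + 0j, 1.0 + 0j])

# Input qubits |x_j> and the desired output state |d>.
inputs = [normalize(ket0 + ket1), ket1]          # e.g. |+> and |1>
desired = ket1

# Weight operators w_j: 2x2 complex matrices, initialized near the identity.
rng = np.random.default_rng(0)
weights = [np.eye(2, dtype=complex) + 0.01 * rng.standard_normal((2, 2))
           for _ in inputs]

eta = 0.1  # learning step

for epoch in range(100):
    # Output of the quantum neuron: |y> = sum_j w_j |x_j>, then normalized
    # (the activation operator F is taken as the identity here).
    y = normalize(sum(w @ x for w, x in zip(weights, inputs)))
    # Learning rule in analogy with the classical perceptron:
    # w_j(t+1) = w_j(t) + eta * (|d> - |y>) <x_j|
    for j, x in enumerate(inputs):
        weights[j] = weights[j] + eta * np.outer(desired - y, x.conj())

print("overlap |<d|y>|^2 =", abs(np.vdot(desired, y)) ** 2)

The printed overlap approaches 1 as the neuron is driven towards the desired state.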

2.4 Building a Quantum Neural Network

This issue is solved in two stages: the choice of the type (architecture) of the QNN and the selection of weights (training) of the QNN.

The first step is to choose the following: which neurons we want to use (number of inputs, transfer functions); how they should be connected to each other; and what to take as the inputs and outputs of the QNN.

At first glance, this task seems boundless, but, fortunately, we do not have to invent a QNN from scratch: there are several dozen different neural network architectures, and the effectiveness of many of them has been proven mathematically. The most popular and well-studied architectures are the multilayer perceptron, the general regression neural network, Kohonen neural networks and others.

At the second stage, we should "train" the selected network, that is, select such values of its weights that it works as needed. An untrained QNN is like a child: it can be taught anything. In neural networks used in practice, the number of weights can reach several tens of thousands, so training is a genuinely complex process. For many architectures, special learning algorithms have been developed that allow the weights of the QNN to be adjusted in a certain way. The most popular of these algorithms is the error backpropagation method, used, for example, to train the perceptron.

2.5 Quantum computing

Quantum computing makes it possible to solve problems that cannot be solved efficiently on classical computers. For example, Shor's algorithm gives a polynomial-time quantum solution to the problem of factoring an integer into prime factors, which is considered intractable on a classical computer. In addition, Grover's algorithm gives a significant speedup when searching for data in an unordered database.

So far, we have not seen a qualitative difference between ordinary bits and qubits, but something strange happens when an atom is excited with an amount of light that is only enough for the electron to travel halfway between the excitation levels. Since the electron cannot actually exist in the space between these levels, it exists on both levels at the same time. This is known as superposition.

This superposition allows one theoretically to calculate several possibilities at the same time, since a group of qubits can represent several numbers at the same time. For calculations using the superposition property, you can create a set of qubits, put them into superposition states, and then perform an action on them.

When the algorithm is completed, the superposition can be collapsed and a certain result will be obtained - i.e. all qubits will go to states 0 or 1. We can assume that the algorithm acts in parallel on all possible combinations of certain states of qubits (i.e. 0 or 1) - a trick known as quantum parallelism (Table 3).

Table 3: Main concepts of quantum computing

N    Main concept of quantum computing
1    Wave function and quantum evolution (development)
2    Superposition of classical states
3    Entanglement
4    Interference
5    Measurement
6    Unitary transformations

The construction of models of quantum neural systems (as well as the creation of models of quantum computing) is faced with the need to find out which calculations can be characterized as truly quantum and what are the sources of efficiency of these calculations.

An important place is also occupied by the elucidation of the most effective areas of application of quantum computing systems.

The fundamental resource and basic unit of quantum information is the quantum bit (qubit). From a physical point of view, a qubit is an ideal two-state quantum system. Examples of such systems include photons (vertical and horizontal polarization), electrons, and systems defined by two energy levels of atoms or ions. From the very beginning, two-state systems have played a central role in the study of quantum mechanics. They are the simplest quantum systems, and in principle any other quantum system can be modeled in the state space of a collection of qubits.

The state of a quantum bit is given by a vector in a two-dimensional complex vector space; the vector has two components, and its projections onto the basis vectors are complex numbers. In Dirac notation a quantum bit is written as a ket-vector |ψ>, or in row form as a bra-vector <ψ|. For the purposes of quantum computing, the basis states |0> and |1> encode the classical bit values 0 and 1, respectively. However, unlike classical bits, a qubit can be in a superposition of |0> and |1>, such as |ψ> = a|0> + b|1>, where a and b are complex numbers satisfying the condition |a|² + |b|² = 1. If a or b is zero, |ψ> is a classical, pure basis state; otherwise the qubit is said to be in a superposition of the two classical basis states. Geometrically, a quantum bit is in a continuous state between |0> and |1> until its state is measured. When a system consists of two quantum bits, it is described by a tensor product; for example, in Dirac notation a two-qubit system is given as |ψ> = a00|00> + a01|01> + a10|10> + a11|11>. The number of possible states of the combined system grows exponentially as quantum bits are added.
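For illustration, here is a short numpy sketch (all variable names are illustrative) that builds such a superposition, checks the normalization condition, forms a two-qubit state as a tensor product and simulates a measurement in the computational basis:

import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Superposition a|0> + b|1> with |a|^2 + |b|^2 = 1.
a, b = 1 / np.sqrt(2), 1j / np.sqrt(2)
psi = a * ket0 + b * ket1
print("norm:", np.vdot(psi, psi).real)            # -> 1.0

# Two-qubit state as a tensor (Kronecker) product: 4 amplitudes.
phi = np.kron(psi, ket0)                          # |psi> (x) |0>
print("dimension:", phi.size)                     # -> 4, grows as 2^n with n qubits

# Measurement in the computational basis: probabilities are |amplitude|^2.
probs = np.abs(phi) ** 2
outcome = np.random.default_rng().choice(len(probs), p=probs)
print("measured basis state:", format(int(outcome), "02b"))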

This leads to the problem of estimating the quantum correlation that is present between quantum bits in a composite system.

The exponential growth in the number of states, together with the ability to apply transformations to the entire space (either a unitary dynamical evolution of the system or a projective measurement onto an eigenvector subspace), provides the foundation for quantum computing. Since unitary transformations are reversible, quantum computations (except measurements) are all reversible, being limited to unitary quantum transformations. This means that each quantum cell (gate, acting on one or many qubits) performs a reversible computation: given the output of the cell, it must be possible to determine uniquely what the input was. Fortunately, the classical theory of reversible computation tells us that every classical algorithm can be made reversible with an acceptable overhead, so this limitation on quantum computation does not pose a serious problem. It is, however, something that should be kept in mind when proposing a specification for quantum gates.

2.6 QNN models

There are several research institutions in the world working on the concept of a quantum neural network, for example the Georgia Institute of Technology and Oxford University. Most, however, refrain from publishing their work. This is probably because the potential implementation of a quantum neural network is much simpler than that of a conventional quantum computer, and every institution wants to win the quantum race. Theoretically, it is easier to build a quantum neural network than a quantum computer for one reason: coherence. The superposition of many qubits reduces the resistance to noise in a quantum computer, and noise can cause the superposition to collapse, or decohere, before a useful computation is completed. However, since quantum neural networks would not require very long coherence times or very many superpositions per neuron, they would be less affected by noise, continuing to perform calculations similar to those of a conventional neural network but many times (in fact, exponentially) faster.

Quantum neural networks could realize their exponential speed advantage by using a superposition of the input and output values of a neuron. Another advantage that could be gained is that, since such neurons can handle a superposition of signals, the neural network could actually have fewer neurons in the hidden layer when learning to approximate a given function. This would make it possible to build simpler networks with fewer neurons and, consequently, improve the stability and reliability of their operation (i.e., the number of opportunities for the network to lose coherence would be reduced). If all this is taken into account, could a quantum neural network be computationally more powerful than a conventional network? At present, the answer seems to be no, since all quantum models use a finite number of qubits to carry out their calculations, and this is a limitation.

2.7 Quantum state and its representation


The quantum state and quantum computation operator are both important for understanding parallelism and plasticity in information processing systems.

In a quantum logic circuit, the fundamental gates are the one-bit rotation gate U_θ, shown in Fig. 2, and the two-bit controlled-NOT gate, shown in Fig. 3. The former rotates the input quantum state by the angle θ. The latter performs the XOR operation.

Fig. 2. One-bit rotation gate

Fig. 3. Two-bit controlled-NOT gate

We single out the following complex-valued representation, Eq. (3), corresponding to the qubit state in Eq. (1).


Equation (3) makes it possible to express the following operations: the rotation gate and the two-bit controlled-NOT gate.

a) Rotation gate operation

The rotation gate is a phase-shift gate that transforms the phase of a qubit state. Since the qubit state is represented by Eq. (3), the gate is understood as the following relationship:

b) Two-bit controlled-NOT operation

This operation is defined by the input parameter γ in the following way:

where γ = 1 corresponds to the reversal rotation and γ = 0 to non-rotation. When γ = 0, the phase of the probability amplitude of the quantum state |1> is reversed; however, the observed probability is unchanged, so this case is regarded as a non-rotation.
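A minimal numpy sketch of the two operations discussed above may help. The matrices used here are the standard textbook single-qubit rotation and two-qubit controlled-NOT; they are given as an assumption about what the original equations contained, not as a reconstruction of them.

import numpy as np

def rotation(theta):
    # Single-qubit rotation gate: rotates the state vector by the angle theta.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Two-qubit controlled-NOT gate: flips the target qubit when the control is |1>,
# i.e. it implements XOR of control and target in the computational basis.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# Rotate |0> by pi/4: gives cos(pi/4)|0> + sin(pi/4)|1>.
print(rotation(np.pi / 4) @ ket0)

# Apply CNOT to |1>|0>: the target qubit flips, giving |1>|1>.
print(CNOT @ np.kron(ket1, ket0))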

3. QNN training

Quantum neural networks are effective at performing complex functions in many areas. These include pattern recognition, classification, vision, control systems, and prediction.

The ability to learn, that is, to adapt to the conditions of a changing environment, is such an important feature of neural networks that it is now attached as a separate item to the so-called "Turing test", which serves as an operational definition of the concept of intelligence.

An empirical test of intelligence was proposed by Alan Turing. In the context of neural networks, the learning process implies the following sequence of events:

· The neural network is stimulated by the environment.

· The neural network undergoes changes in its free parameters as a result of excitation.

· The network responds in a new way to the environment due to the changes that have taken place in its internal structure.

There are numerous algorithms available, and one might expect that there is some unique algorithm for designing a QNN model. The algorithms differ in the rule used to change the weights of the neurons and in the way the neurons relate to their environment.

All teaching methods can be classified into two main categories: supervised and unsupervised.

Table 4 presents various learning algorithms and their associated network architectures (the list is not exhaustive). The last column lists the tasks for which each algorithm can be applied. Each learning algorithm is focused on a network of a certain architecture and is intended for a limited class of tasks. In addition to those considered, some other algorithms should be mentioned: Adaline and Madaline, linear discriminant analysis, Sammon projections, principal component analysis.

Table 4: Known learning algorithms:

Paradigm | Learning rule | Architecture | Learning algorithm | Task
Supervised | Error correction | Single-layer and multilayer perceptron | Perceptron learning algorithms, backpropagation, Adaline and Madaline | Pattern classification, function approximation, prediction, control
Supervised | Boltzmann | Recurrent | Boltzmann learning algorithm | Pattern classification
Supervised | Hebb | Multilayer feedforward | Linear discriminant analysis | Data analysis, pattern classification
Supervised | Competition | Competition | Vector quantization | Within-class categorization, data compression
Supervised | Competition | ART network | ARTMap | Pattern classification
Unsupervised | Error correction | Multilayer feedforward | Sammon projection | Within-class categorization, data analysis
Unsupervised | Hebb | Feedforward or competition | Principal component analysis | Data analysis, data compression
Unsupervised | Hebb | Hopfield network | Associative memory learning | Associative memory
Unsupervised | Competition | Competition | Vector quantization | Categorization, data compression
Unsupervised | Competition | Kohonen SOM | Kohonen SOM | Categorization, data analysis
Unsupervised | Competition | ART networks | ART1, ART2 | Categorization
Mixed | Error correction and competition | RBF network | RBF learning algorithm | Pattern classification, function approximation, prediction, control

To train the network means to tell it what we want from it. This process is very similar to teaching a child the alphabet. Showing the child a picture of the letter "A", we ask him: "What letter is this?" If the answer is wrong, we tell the child the answer that we would like to receive from him: "This is the letter A." The child remembers this example along with the correct answer, that is, some changes occur in his memory in the right direction. We will repeat the letter presentation process again and again until all 33 letters are firmly remembered. This process is called "supervised learning" (Fig. 4.).

Fig. 4. The process of "supervised learning"

When training a network, we act in exactly the same way. We have a database containing examples (a set of handwritten images of letters). Presenting an image of the letter "A" to the input of the QNN, we get from it some answer, not necessarily the correct one. We also know the correct (desired) answer: in this case, we would like the signal level to be maximal at the output of the QNN labelled "A". Usually, the set (1, 0, 0, ...) is taken as the desired output in a classification problem, with 1 at the output labelled "A" and 0 at all other outputs. Computing the difference between the desired response and the actual response of the network, we obtain 33 numbers, the error vector. The error backpropagation algorithm is a set of formulas that allows the required corrections to the neural network weights to be calculated from the error vector. We can present the same letter (as well as different images of the same letter) to the neural network many times. In this sense, training resembles repeating exercises in sport.

After multiple presentations of the examples, the QNN weights stabilize and the QNN gives correct answers to all (or almost all) examples from the database. In this case we say that "the network has learned all the examples" or that "the network has been trained". In software implementations, one can see that during the learning process the error value (the sum of squared errors over all outputs) gradually decreases. When the error reaches zero or an acceptably low level, training is stopped, and the resulting network is considered trained and ready for use on new data.

Network training is broken down into the following steps:

  1. Network initialization: the network weights and biases are assigned small random values from appropriate ranges.

  2. Selection of a training sample element (<current input>, <desired output>). The current inputs (x0, x1, ..., xN-1) must be different for all elements of the training sample. When a multilayer perceptron is used as a classifier, the desired output signal (d0, d1, ..., dN-1) consists of zeros except for the one element corresponding to the class to which the current input signal belongs.
  3. Calculation of the current output signal: the current output signal is determined in accordance with the traditional scheme of the functioning of a multilayer neural network.
  4. Synaptic weight tuning: to tune the weights, a recursive algorithm is used that is first applied to the output neurons of the network and then traverses the network backwards to the first layer. The synaptic weights are adjusted according to the formula

w_ij(t+1) = w_ij(t) + r · g_j · x_i',

where w_ij is the weight from neuron i (or from element i of the input signal) to neuron j at time t, x_i' is the output of neuron i (or the i-th element of the input signal), r is the learning step, and g_j is the error value for neuron j. If neuron j belongs to the last layer, then

g_j = y_j (1 - y_j)(d_j - y_j),

where d_j is the desired output of neuron j and y_j is its current output. If neuron j belongs to one of the layers from the first to the penultimate one, then

g_j = x_j' (1 - x_j') Σ_k g_k w_jk(t),

where k runs over all neurons of the layer whose number is one greater than that of the layer containing neuron j. The external biases of the neurons b are adjusted in a similar way.
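A compact Python sketch of steps 1-4 for a two-layer perceptron with sigmoid neurons is given below. It is a schematic illustration of the formulas above on the XOR problem, not the actual program used in this work; the layer sizes, learning step and seed are illustrative.

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(1)

# Step 1: initialization with small random weights and biases.
n_in, n_hidden, n_out = 2, 4, 1
W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in));  b1 = np.zeros(n_hidden)
W2 = rng.uniform(-0.5, 0.5, (n_out, n_hidden)); b2 = np.zeros(n_out)

# Step 2: training sample (<current input>, <desired output>), here XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

r = 0.5  # learning step
for epoch in range(5000):
    for x, d in zip(X, D):
        # Step 3: current output signal (forward pass through the layers).
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        # Step 4: error values g_j, output layer first, then the hidden layer.
        g_out = y * (1 - y) * (d - y)
        g_hid = h * (1 - h) * (W2.T @ g_out)
        # Weight corrections w_ij(t+1) = w_ij(t) + r * g_j * x_i'.
        W2 += r * np.outer(g_out, h);  b2 += r * g_out
        W1 += r * np.outer(g_hid, x);  b1 += r * g_hid

# Trained network outputs for the four input patterns.
print(np.round(sigmoid(W2 @ sigmoid(W1 @ X.T + b1[:, None]) + b2[:, None]).T, 2))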

The considered model can be used for pattern recognition, classification, and forecasting. There have been attempts to build expert systems based on multilayer perceptrons trained by the backpropagation method. It is important to note that all the information the QNN has about the task is contained in the set of examples. Therefore, the quality of QNN training directly depends on the number of examples in the training sample, as well as on how fully these examples describe the task. Training neural networks is a complex and knowledge-intensive process: QNN learning algorithms have various parameters and settings, and controlling them requires an understanding of their influence.

3.1 Application of Quantum Neural Networks. The meaning of the supervised learning algorithm

The class of problems that can be solved using a QNN is determined by how the network works and how it is trained. During operation, the QNN takes the values of input variables and outputs the values of output variables. Thus, the network can be used in a situation where you have certain known information and want to derive from it some information that is not yet known (Patterson, 1996; Fausett, 1994). Here are some examples of such tasks:

· Pattern recognition and classification

Objects of different nature can act as patterns: text symbols, images, sound samples, etc. When training the network, various samples of patterns are presented with an indication of the class to which they belong. A sample is usually represented as a vector of feature values; the totality of features must uniquely determine the class to which the sample belongs. If there are not enough features, the network may assign the same sample to several classes, which is incorrect. After training, the network can be presented with previously unseen patterns and will answer which class they belong to.

· Decision making and management

This problem is close to the classification problem. The situations to be classified have their characteristics fed to the input of the QNN, and the network output should indicate the decision that has been made. In this case, various criteria describing the state of the controlled system are used as input signals.

· Clustering

Clustering refers to the division of a set of input signals into classes, even though neither the number of classes nor their characteristics are known in advance. After training, such a network is able to determine which class an input signal belongs to.

· Forecasting

After training, the network is able to predict the future value of a certain sequence based on several previous values and/or some currently existing factors. It should be noted that forecasting is possible only when the previous changes really do predetermine the future ones to some extent.

· Approximation

A generalized approximation theorem has been proved: using linear operations and cascade connections of an arbitrary nonlinear element, one can obtain a device that computes any continuous function with any predefined accuracy.

· Data Compression and Associative Memory

The ability of neural networks to identify relationships between various parameters makes it possible to express high-dimensional data more compactly if the data is closely interconnected with each other. The reverse process - restoring the original data set from a piece of information - is called auto-associative memory. Associative memory also allows you to restore the original signal/image from noisy/damaged input data. Solving the problem of heteroassociative memory makes it possible to implement content-addressable memory.

Stages of problem solving:

data collection for training;

data preparation and normalization;

choice of network topology;

experimental selection of network characteristics;

actual training;

checking the adequacy of training;

parameter adjustment, final training;

network verbalization for further use.

So, let's move on to the second important condition for applying Quantum Neural Networks: we must know that there is a relationship between known inputs and unknown outputs. This connection can be distorted by noise.

As a rule, a QNN is used when the exact form of the relation between inputs and outputs is unknown; if it were known, the relation could be modeled directly. Another essential feature of the QNN is that the dependence between input and output is found in the process of training the network. Two types of algorithms are used for QNN training (different types of networks use different types of learning): supervised and unsupervised. Supervised learning is used most often.

For supervised network training, the user must prepare a set of training data. These data are examples of inputs and the corresponding outputs, and the network learns to establish a connection between the two. Typically, training data are taken from historical data: the values of stock prices and the FTSE index, information about past borrowers (their personal data and whether they successfully fulfilled their obligations), or examples of robot positions and the correct reactions to them.

The QNN is then trained using some supervised learning algorithm (the best known of which is the backpropagation method proposed by Rumelhart et al., 1986), in which the available data are used to adjust the weights and thresholds of the network so as to minimize the prediction error on the training set. If the network is trained well, it acquires the ability to model the (unknown) function relating the input and output variables, and subsequently such a network can be used for prediction in situations where the output values are unknown.

3.2 Single-layer and multilayer perceptrons

3.2.1 Single-layer perceptron. Training

Historically, the first artificial neural network capable of perception and of forming a response to a perceived stimulus was Rosenblatt's perceptron (F. Rosenblatt, 1957). The term "perceptron" comes from the Latin perceptio, which means perception, cognition. Its author considered the perceptron not as a specific technical computing device but as a model of the brain. Modern works on artificial neural networks rarely pursue such a goal.

The simplest classical perceptron contains elements of three types (Fig. 5.).

Fig. 5. Rosenblatt's elementary perceptron

A single-layer perceptron is characterized by a matrix of synaptic connections ||W|| from the S-elements to the A-elements. The matrix element w_ij corresponds to the connection leading from the i-th S-element (row) to the j-th A-element (column). This matrix is very similar to the matrices of absolute frequencies and informativeness formed in the semantic information model based on system information theory.

From the point of view of modern neuroinformatics, a single-layer perceptron is mainly of purely historical interest; however, it can be used to study the basic concepts and simple algorithms for training neural networks.

Training a classical neural network consists in adjusting the weight coefficients of each neuron.

Step 1: The initial weights of all neurons are assumed to be random.

Step 2: The network is presented with an input image x_a, as a result of which an output image is formed.

Step 3: The vector of the error made by the network at the output is calculated. The weight vectors are adjusted so that the amount of adjustment is proportional to the output error and is zero if the error is zero:

· only those components of the weight matrix that correspond to non-zero input values are modified;

· the sign of the weight increment corresponds to the sign of the error, i.e. a positive error (the output value is less than required) leads to strengthening of the connection;

· the learning of each neuron proceeds independently of the learning of the other neurons, which corresponds to the principle of locality of learning, important from a biological point of view.

Step 4: Steps 1-3 are repeated for all training vectors. One cycle of sequential presentation of the entire sample is called an epoch. Training ends after a few epochs if at least one of the following conditions is met:

· the iterations converge, i.e. the weight vector stops changing;

· the total absolute error summed over all vectors becomes less than some small value ε.
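A minimal Python sketch of steps 1-4 of this procedure for a single threshold neuron follows; the toy data (a logical AND, which is linearly separable) and all names are purely illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Training vectors (with a constant 1 appended for the bias) and target classes.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
d = np.array([0, 0, 0, 1])

w = rng.uniform(-0.5, 0.5, 3)           # Step 1: random initial weights
eta = 0.1                                # learning step

for epoch in range(100):                 # Step 4: repeat over epochs
    total_error = 0.0
    for x, target in zip(X, d):
        y = 1.0 if w @ x >= 0 else 0.0   # Step 2: output of the threshold neuron
        err = target - y                 # Step 3: output error
        w += eta * err * x               # correction only where inputs are non-zero
        total_error += abs(err)
    if total_error == 0:                 # stop when the weight vector stops changing
        break

print("weights:", w, "epochs:", epoch + 1)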

3.2.2 Multilayer perceptron. Multilayer perceptron training

This network architecture is probably the most commonly used today. It was proposed by Rumelhart and McClelland (1986) and is discussed in detail in almost all textbooks on neural networks (see, for example, Bishop, 1995). Each element of the network forms a weighted sum of its inputs, corrected by a bias term, and then passes this activation value through a transfer function, thus obtaining the output value of the element. The elements are organized in a layered topology with feedforward signal transmission. Such a network can easily be interpreted as an input-output model in which the weights and thresholds (biases) are free parameters. Such a network can model a function of almost any degree of complexity, and the number of layers and the number of elements in each layer determine the complexity of the function. Determining the number of intermediate layers and the number of elements in them is an important issue in the design of a multilayer perceptron (Haykin, 1994; Bishop, 1995).

The number of input and output elements is determined by the conditions of the problem. Doubts may arise as to which input variables to use and which not; we will assume that the input variables are chosen intuitively and that they are all significant. The question of how many intermediate layers and elements to use is still completely open. As an initial approximation one can take a single intermediate layer and set the number of elements in it equal to half the sum of the numbers of input and output elements. We will return to this issue in more detail later.

A multilayer perceptron is a trainable recognition system that implements a linear decision rule, corrected during the learning process, in the space of secondary features, which are usually fixed, randomly selected linear threshold functions of the primary features.

During training, signals from the training sample are fed in turn to the input of the perceptron, together with an indication of the class to which each signal should be assigned. Training the perceptron consists in correcting the weights after each recognition error, i.e. after each case where the decision produced by the perceptron does not match the true class. If the perceptron erroneously attributes a signal to some class, then the weights of the functions of the true class are increased and the erroneous ones are decreased. In the case of a correct decision, all weights remain unchanged (Fig. 6).

Fig. 6. Two-layer perceptron

Example: consider a perceptron, i.e. a system with n input channels x_1, ..., x_n and an output channel y. The output of a classical perceptron is y = F(Σ_j w_j x_j), where F is the activation function of the perceptron and w_j are the weights tuned during the training process. The perceptron learning algorithm works as follows.

The weights w_j are initialized with small values.

The input vector x = (x_1, ..., x_n) is presented to the perceptron, and the output y is obtained according to the rule y = F(Σ_j w_j x_j).

The weights are updated according to the rule w_j(t+1) = w_j(t) + η(d − y)x_j, where t is discrete time, d is the desired output provided for training, and η is the learning step.

Comment: it is hardly possible to build, for the quantum case, an exact analogue of the nonlinear activation function F, such as the sigmoid and other functions in common use in neural networks.

3.3 Back Propagation Algorithm

In the mid-1980s, several researchers independently proposed an efficient learning algorithm for multilayer perceptrons based on calculating the gradient of the error function. The algorithm has been called "error backpropagation".

The backpropagation algorithm is an iterative gradient learning algorithm that is used to minimize the standard deviation of the current output and the desired output of multilayer neural networks.

In the "back propagation" neuroparadigm, sigmoidal transfer functions are most often used, for example

Sigmoid functions are monotonically increasing and have non-zero derivatives over the entire domain of definition. These characteristics ensure the proper functioning and learning of the network.

The functioning of a multilayer network is described by the formulas

s_i^(m) = Σ_j w_ij^(m) y_j^(m-1) + b_i^(m),    y_i^(m) = f(s_i^(m)),

where s is the adder output, w is the connection weight, y is the neuron output, b is the bias, i is the neuron number, N is the number of neurons in a layer, m is the layer number, L is the number of layers, and f is the activation function.

The backpropagation method is a way to quickly calculate the gradient of the error function.

The calculation is performed from the output layer to the input one using recurrent formulas and does not require recalculation of the output values ​​of neurons.

Backpropagation of the error reduces the computational cost of obtaining the gradient many times over compared to computing it directly from the definition (by perturbing each weight separately). Knowing the gradient, one can apply the many methods of optimization theory that use the first derivative.

The backpropagation algorithm calculates the error surface gradient vector. This vector indicates the direction of the shortest descent on the surface from a given point, so if we move "a little" along it, the error will decrease. A sequence of such steps (slowing down as you approach the bottom) will eventually lead to a minimum of one type or another. A certain difficulty here is the question of what step length should be taken.

Of course, with such training of a neural network there is no certainty that it has learned in the best possible way, since there is always the possibility of the algorithm getting stuck in a local minimum (Fig. 7). Special techniques are therefore used to "knock" the found solution out of a local extremum. If, after several such actions, the neural network converges to the same solution, we can conclude that the solution found is most likely optimal.

Fig. 7. The gradient descent method for minimizing network error
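As a small illustration of the choice of step length and of restarting from several initial points to escape local minima, the following sketch minimizes a one-dimensional function with two minima; the function and all parameters are invented purely for illustration.

import numpy as np

def f(x):
    # Two minima: a deeper (global) one near x = -1 and a shallower (local) one near x = +1.
    return x**4 - 2 * x**2 + 0.3 * x

def df(x):
    return 4 * x**3 - 4 * x + 0.3

def gradient_descent(x0, step=0.05, iters=200):
    x = x0
    for _ in range(iters):
        x -= step * df(x)               # move "a little" against the gradient
    return x

rng = np.random.default_rng(2)
starts = rng.uniform(-2, 2, 5)           # several random initializations ("restarts")
solutions = [gradient_descent(x0) for x0 in starts]
best = min(solutions, key=f)             # keep the deepest minimum found
print("candidate minima:", np.round(solutions, 3), "best:", round(best, 3))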

3.4 Genetic algorithm. The classic traveling salesman problem

The genetic algorithm (GA) is able to perform optimal tuning of a QNN for search-space dimensions sufficient for most practical problems. At the same time, the range of its applications far exceeds the capabilities of the error backpropagation algorithm.

Information processing by a genetic algorithm uses two main mechanisms for selecting useful features, borrowed from modern ideas about natural selection: mutations within a single chain and crossover between two chains. Let us consider these mechanisms in more detail (Table 5).

Table 5: Mutations and crosses

A) Initial genetic chains:
00111101011100001001
0001101000011101001

B) Random formation of the region for the subsequent crossover:
00111101 ....... 01100001001
0001101 ....... 00001101001

C) Exchange of the code fragments between the chains

D) Chains after the crossover:
001110100001101001
000110101100001001

The figure shows the successive stages of information exchange between two chains when crossing. The resulting new chains (or one of them) can be further included in the population if the set of features they specify gives the best value of the objective function. Otherwise, they will be weeded out, and their ancestors will remain in the population. Mutation in the genetic chain has a point character: at some random point in the chain, one of the codes is replaced by another (zero - one, and one - zero).

From the point of view of artificial information processing systems, genetic search is a specific method for finding the solution of an optimization problem. Such an iterative search adapts to the features of the objective function: the chains born in the process of crossover explore ever broader areas of the feature space and are predominantly located in the region of the optimum. Relatively rare mutations prevent degeneration of the gene pool, which amounts to a rare but never-ending search for the optimum in all other areas of the feature space.

In the last ten years, many methods have been developed for supervised training of QNNs using GAs. The results obtained demonstrate the great possibilities of such a symbiosis. The joint use of QNN and GA algorithms also has an ideological advantage, because both belong to the methods of evolutionary modeling and are developed within a single paradigm of technology borrowing natural methods and mechanisms as the most optimal ones.

To model the evolutionary process, let's first generate a random population - several individuals with a random set of chromosomes (numerical vectors). The genetic algorithm imitates the evolution of this population as a cyclic process of crossing individuals and changing generations (Fig. 8.).

Fig. 8. Calculation algorithm

Consider the advantages and disadvantages of standard and genetic methods using the classic traveling salesman problem (TSP) as an example. The essence of the problem is to find the shortest closed path around several cities, given by their coordinates. It turns out that already for 30 cities, finding the optimal path is a difficult task that has prompted the development of various new methods (including neural networks and genetic algorithms).

Each solution variant (for 30 cities) is a numeric string in which the j-th position holds the number of the city visited j-th in the tour. Thus, there are 30 parameters in this problem, and not all combinations of values are allowed. Naturally, the first idea is a complete enumeration of all tour variants.

The enumeration method is the simplest in nature and trivial to program. To find the optimal solution (the maximum point of the objective function), one must sequentially compute the values of the objective function at all possible points, remembering the maximum of them. The disadvantage of this method is its high computational cost: in the traveling salesman problem it would be necessary to calculate the lengths of more than 10^30 paths, which is completely unrealistic. However, if it is possible to enumerate all variants in a reasonable time, one can be absolutely sure that the solution found is indeed optimal (Fig. 9).

Fig. 9. Search for the optimal solution

The genetic algorithm is just such a combined method. The crossover and mutation mechanisms in a sense implement the enumeration part of the method, while the selection of the best solutions implements gradient descent. Figure 10 shows that such a combination provides consistently good genetic search efficiency for all types of tasks.

Fig. 10. Gradient descent method

So, if a complex function of several variables is given on a certain set, then a genetic algorithm is a program that, in a reasonable time, finds a point where the value of the function is close enough to the maximum possible. Choosing an acceptable calculation time, we will get one of the best solutions that is generally possible to obtain in this time.
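The following Python sketch shows the basic cycle of the genetic algorithm for the traveling salesman problem described above. It is a toy implementation under simplifying assumptions (a small random set of cities, order crossover, swap mutation, simple truncation selection), not the optimizer actually used in the thesis.

import numpy as np

rng = np.random.default_rng(3)
cities = rng.uniform(0, 100, (30, 2))             # 30 cities with random coordinates

def tour_length(tour):
    # Length of the closed path visiting the cities in the given order.
    return sum(np.linalg.norm(cities[tour[i]] - cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def crossover(p1, p2):
    # Order crossover: keep a slice of p1, fill the rest in the order of p2.
    a, b = sorted(rng.choice(len(p1), 2, replace=False))
    child = [-1] * len(p1)
    child[a:b] = p1[a:b]
    rest = [c for c in p2 if c not in child]
    for i in range(len(p1)):
        if child[i] == -1:
            child[i] = rest.pop(0)
    return child

def mutate(tour, rate=0.1):
    # Point mutation: swap two randomly chosen cities.
    if rng.random() < rate:
        i, j = rng.choice(len(tour), 2, replace=False)
        tour[i], tour[j] = tour[j], tour[i]
    return tour

population = [list(rng.permutation(30)) for _ in range(100)]
for generation in range(300):
    population.sort(key=tour_length)              # selection: the best tours survive
    parents = population[:50]
    children = [mutate(crossover(parents[rng.integers(50)], parents[rng.integers(50)]))
                for _ in range(50)]
    population = parents + children

population.sort(key=tour_length)
print("best tour length:", round(tour_length(population[0]), 1))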

Positive qualities of genetic algorithms

1. Finding the global minimum: non-susceptibility to "getting stuck" in the local minima of the objective function.

2. Massive parallelism in processing: individuals in the population function independently; the calculation of the objective function values, death and mutations are carried out independently for each individual. With several processor elements available, the performance can be very high.

3. Biosimilarity: genetic algorithms are built on the same principles that led to the emergence of humans and the entire diversity of species, and therefore can be very productive and useful.

4. Automatic object management

Control of an object is the process of influencing it in order to ensure the required course of processes in the object or the required change of its state. The basis of control is the receipt and processing of information about the state of the object and the external conditions of its operation in order to determine the actions that must be applied to the object to achieve the goal of control.

Control, carried out without human intervention, is called automatic control. The device with which control is carried out is called the control device. The combination of the control object and the control device forms an automatic control system (ACS) (Fig. 11.).

Fig. 11. Block diagram of the automatic control system

The state of the object is characterized by the output value X. The control device applies the control action U to the object, which is also subject to the disturbance F that obstructs control. The input of the control device receives the reference action (setpoint) G, which contains information about the required value of X, i.e. about the goal of control. In the most general case, the control device also receives information about the current state of the object in the form of the output value X and about the disturbance F acting on the object. The variables U, G, F and X are, in general, vectors.

As in any dynamic system, processes in an ACS are divided into steady-state and transient ones.

When considering ACS, the following concepts are important: system stability, quality of the control process and control accuracy.

The quality of the management process is characterized by how close the management process is to the desired one. Quantitatively, they are expressed by quality criteria:

Transient time - the time interval from the beginning of the transient until the moment when the deviation of the output value from its new steady state value becomes less than a certain value - usually 5%.

Maximum deviation in the transition period (overshoot) - the deviation is determined from the new steady state value and is expressed as a percentage.

Oscillation of the transient process is determined by the number of oscillations, equal to the number of minima of the transient curve during the transient time. Often the oscillation is expressed as a percentage, as the ratio of adjacent maxima of the transient curve.

The control accuracy is characterized by the system error in the steady state (the difference between the desired signal and the actual one), the static error.

4.1 Control object


Initial data: the control object is a car, and the parameter to be controlled is its speed. The speed can be controlled through another parameter, for example the force of pressing the accelerator pedal. Many external factors can affect the speed of the car: the slope of the road, the quality of road grip, and the wind. Information about the vehicle speed comes from a speed sensor.

The dynamics of the control object is described by the following system of differential equations:

The parameters T1, T2, K1, K2 are determined experimentally and have the following values, respectively: K1 = 5, K2 = 7.156, T1 = 1.735, T2 = 16.85.

It is required to build such a controller in the class of neural network structures that would provide control of the object subject to the following requirements for the synthesized automatic control system:

· Physical feasibility of the controller.

· Work stability.

· Minimum complexity.

Building a neurocontroller

An ANN can be trained on a given set by selecting its tuning parameters (Fig. 12). The error backpropagation method, based on the gradient method, is used with a learning-rate constant h. To ensure convergence, h is varied from 1 to 0.00001 over 1,000,000 iterations. The size of the training set is 400 training pairs, and the number of neurons in the hidden layer is 50.

Fig. 12. Block diagram of an ACS with a neurocontroller

For comparison, the transient response of the ACS with the PID controller was taken, where k1=0.2, k2=0.007, k3=0.2.

The ACS with the neurocontroller and with the PID controller is supplied with a reference action G = 10, 20, 30, ..., 110. On the interval 50-100 s a disturbance is applied to the system. The results of the ACS operation are shown in Fig. 13.

Fig. 13. Comparison of the neurocontroller and the PID controller

The comparison shows that the PID controller loses to the neurocontroller in response speed at the start, when entering the disturbance zone, and when leaving it.
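The system of differential equations of the plant was not preserved in this text, so the following Python sketch is only an assumption: purely for illustration, the plant is taken as a chain of two first-order links with the gains and time constants given above, controlled by a discrete PID law with the gains k1, k2, k3 from the comparison. It shows how such a simulation could be organized, not the thesis's actual model.

import numpy as np

K1, K2, T1, T2 = 5.0, 7.156, 1.735, 16.85         # plant parameters from the thesis
k1, k2, k3 = 0.2, 0.007, 0.2                      # PID gains used in the comparison

dt, t_end = 0.01, 150.0
G = 10.0                                          # reference action (setpoint)

x1 = x2 = 0.0                                     # states of the two assumed links
integral = prev_err = 0.0
log = []                                          # controlled output, could be plotted

for step in range(int(t_end / dt)):
    t = step * dt
    err = G - x2
    integral += err * dt
    u = k1 * err + k2 * integral + k3 * (err - prev_err) / dt   # PID control law
    prev_err = err

    f = 2.0 if 50.0 <= t <= 100.0 else 0.0        # disturbance on the 50-100 s interval

    # Assumed plant: T1*dx1/dt + x1 = K1*u,  T2*dx2/dt + x2 = K2*x1 + f.
    x1 += dt * (K1 * u - x1) / T1
    x2 += dt * (K2 * x1 + f - x2) / T2
    log.append(x2)

print("output at t =", t_end, "s:", round(log[-1], 2))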

4.2 Robotics as a direction of Artificial Intelligence

All human intellectual activity is ultimately aimed at active interaction with the outside world through movements. In the same way, the elements of a robot's intellect serve, first of all, to organize its purposeful movements. At the same time, the main purpose of purely computer systems of Artificial Intelligence (AI) is to solve intellectual problems that are abstract or auxiliary in nature and are usually not associated with perception of the environment with the help of artificial sense organs or with the organization of the movements of actuators.

Within the framework of the first approach, first of all, the structure and mechanisms of the human brain are studied, and the ultimate goal is to reveal the secrets of thinking. The necessary stages of research in this direction are building models based on psychophysiological data, conducting experiments with them, putting forward new hypotheses about the mechanisms of intellectual activity, improving models, etc.

The second approach considers AI as an object of study in itself. Here we are talking about modeling intellectual activity using computers, and the purpose of the work is to create algorithms and software for computers that allow intellectual problems to be solved no worse than by a human.

Finally, the third approach is focused on the creation of mixed human-machine (interactive) intelligent systems, on the symbiosis of the capabilities of natural and artificial intelligence. The most important problems in these studies are the optimal distribution of functions between natural and artificial intelligence and organization of dialogue between man and machine.

Attempts by scientists around the world to create robots met with at least two serious problems that did not allow any noticeable progress in this direction: pattern recognition and common sense. Robots see much better than us, but they do not understand what they see. Robots hear much better than us, but they do not understand what they hear.

4.2.1 General block diagram of the robot

A scheme indicating the most important nodes of the future robot (control system, sensors, signalling devices) and their interconnections. With such a scheme it is easy to see what else the robot needs and what has to be made or obtained (Fig. 14).

Fig. 14. General block diagram of the robot

4.2.2 Conceptual model

The conceptual model is a scheme of the robot drawn at two levels of detail. It displays the main functional components of the system and the relationships between them and is very useful for organizing the design (Fig. 15-22).

Fig. 15. Conceptual model. Self-propelled cart

Fig. 16. Behaviour scheme

Fig. 17. Robot environment

Fig. 18. Control panel

Fig. 19. Sensor system

Fig. 20. Control system

Fig. 21. Propulsion system

Fig. 22. Warning system

4.2 Efficient control of the quantum spin register. Cryptography and quantum teleportation

The idea of quantum computing was first expressed by the Soviet mathematician Yu.I. Manin in 1980 and became actively discussed after the publication of an article by Richard Feynman in 1982. Indeed, the states 0 and 1, which are represented in modern computers as voltage levels of certain electrical circuits (triggers), can also be interpreted as states of elementary particles if, for example, we use a characteristic such as spin. A spin-1/2 particle can have spin projection +1/2 or -1/2: why not a logical "one" and "zero"? And the quantum nature of such trigger particles, called "quantum bits" or "qubits", gives computers built on this basis truly unique properties.

The development of quantum information processing devices is a new and rapidly developing area of nanotechnology. On the way of its development there are problems that can be conventionally divided into physical-technological, mathematical, and informational-computational. The latter include the question of how a quantum computation can be controlled with maximum efficiency.

The operation of a quantum computing device requires so-called entangled states, which are also important for quantum teleportation and cryptography, so the study of entanglement is one of the main goals of quantum informatics. In the general case, reliable information processing in a quantum register should be based on controlling the qubits (quantum bits) that make up the register.

Let us assume that there is one qubit. In this case, after a measurement the result in classical form will be 0 or 1. In reality a qubit is a quantum object, and therefore the result of the measurement can be either 0 or 1, each with a certain probability. If a qubit is equal to 0 (or 1) with 100% probability, its state is denoted in Dirac's notation by the symbol |0⟩ (or |1⟩); these are the basis states. In the general case, the quantum state of a qubit lies "between" the basis states and is written in the form |ψ⟩ = a|0⟩ + b|1⟩, where |a|² and |b|² are the probabilities of measuring 0 or 1, respectively, and |a|² + |b|² = 1. Moreover, immediately after the measurement the qubit goes into the basis state corresponding to the classical result.
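A minimal numerical illustration of this measurement rule follows; the particular state vector and the number of trials are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Qubit state |psi> = a|0> + b|1> with |a|^2 + |b|^2 = 1
a, b = 1 / np.sqrt(3), np.sqrt(2 / 3) * 1j
p0, p1 = abs(a) ** 2, abs(b) ** 2
assert np.isclose(p0 + p1, 1.0)

# Measuring many identically prepared qubits: outcome 0 with probability |a|^2, 1 with |b|^2
outcomes = rng.choice([0, 1], size=10_000, p=[p0, p1])
print(outcomes.mean())  # fraction of 1s, close to |b|^2 = 2/3
# After a single measurement the qubit collapses to the basis state |0> or |1>
```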

There is such a thing as quantum teleportation. Its essence is to transfer the state of an object over a distance while the object itself does not move; the teleportation of matter that science fiction writers have described so often remains nothing more than fantasy. The underlying effect, quantum entanglement, was described by Einstein (together with Podolsky and Rosen). True, Einstein himself did not believe in it, although it did not contradict any laws of physics: in his view the effect, whose experimental confirmation our contemporaries have achieved, should have led to complete absurdity. However, as we are now told, it will lead to the creation of an entirely new generation of computers.

The teleportation algorithm implements the exact transfer of the state of one qubit (or system) to another. In the simplest scheme 4 qubits are used: a source, a receiver and two auxiliary ones. Note that as a result of the algorithm the initial state of the source is destroyed; this is an example of the general no-cloning principle: it is impossible to create an exact copy of a quantum state without destroying the original. In fact, it is quite easy to create identical basis states on qubits: for example, having measured 3 qubits, we transfer each of them to a basis state (0 or 1), and at least two of them will coincide. An arbitrary state, however, cannot be copied, and teleportation is a replacement for this operation.

Teleportation allows the quantum state of a system to be transferred using ordinary classical communication channels. Thus one can, in particular, obtain an entangled state of a system consisting of subsystems separated by a great distance.

The promising concept of a quantum register for quantum information processing is based on an ensemble of spins in an entangled state, which are considered as qubits. The use of statistical mixtures of pure states, such as spin ensembles, has led to the development of ensemble quantum computing, which has been experimentally demonstrated on systems of up to 12 qubits. Even larger quantum registers have been experimentally investigated to assess their stability with respect to decoherence, which is a significant problem. An efficient estimate of the evolution of the ensemble is therefore required.

The conditions under which entanglement can occur can be achieved, with reliable thermal insulation, by a method called adiabatic demagnetization in a rotating frame of reference (ADRF). The ability to easily control the amount of entanglement of the qubits in a quantum register by an external influence should help to determine the mode of its effective operation. The control can be modeled in the simplest and, at the same time, realistic way, assuming that 1) the system is near equilibrium, 2) the external influence changes the temperature of the qubits, 3) the ADRF-type magnetic field has a fairly simple form for modeling, and 4) this influence can be reused in various combinations as the main structural element of the control system.

Quantum parallelism lies in the fact that the data processed during the calculation are quantum information, which at the end of the process is converted into classical information by measuring the final state of the quantum register. The gain in quantum algorithms is achieved because a single quantum operation simultaneously transforms a large number of superposition coefficients of the quantum state, which in virtual form contain classical information.
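This idea can be illustrated numerically: a single unitary operation acts on all 2^n amplitudes of an n-qubit register at once. The sketch below applies a Hadamard gate to every qubit of a small register; the register size is chosen arbitrarily for the illustration.

```python
import numpy as np

n = 3                                   # number of qubits (arbitrary for the sketch)
state = np.zeros(2 ** n); state[0] = 1  # register initialized to |00...0>

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

# One "operation": H applied to each qubit, i.e. the 2^n x 2^n operator H (x) H (x) ... (x) H
U = np.array([[1.0]])
for _ in range(n):
    U = np.kron(U, H)

state = U @ state
print(state)  # all 2^n amplitudes transformed simultaneously: uniform superposition 1/sqrt(8)
```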

The application of the ideas of quantum mechanics has already opened a new era in the field of cryptography, since the methods of quantum cryptography open up new possibilities in the field of message passing.

Quantum cryptography states the following: the interception of a sent message immediately becomes known, so the fact of espionage cannot be overlooked. An intercepted message protected by quantum methods loses its structure and becomes incomprehensible to the addressee. Since quantum cryptography exploits the laws of nature rather than human ingenuity, it becomes impossible to hide the fact of espionage. The appearance of encryption of this kind will put an end to the cryptographers' struggle for ever more reliable ways of encrypting messages.

Note that the longer entanglement lasts, the better for a quantum computer, since "long-lasting" qubits can solve more complex problems.

In one reported experiment, a quantum processor ran the Grover and Deutsch-Jozsa quantum algorithms to perform two different tasks. The processor gave the correct answer in 80% of cases with the first algorithm and in 90% of cases with the second.

The result is also read out with the help of microwaves: if the oscillation frequency corresponds to the one present in the cavity, the signal passes through it.


5. Practical part. Examples of Quantum Neural Networks

5.1 Inverted pendulum

The task is to bring the pendulum into a stable state by moving the carriage to the position X = 0 using quantum neural networks (Fig. 23).

Fig. 23. Stable state of the pendulum

The controller sets the pendulum in motion, balances it and brings the carriage to the position X = 0, provided that the carriage avoids hitting the ends of the track.

Fig. 24. Controller system

The controller system consists of a feedforward neural network (FFNN), a target generator, a comparator, and the inverted pendulum as the controlled object (Fig. 24).

Fig. 25. Comparison of the operating principles of quantum and artificial neural networks

A quantum neural network has superior capabilities compared with a classical artificial neural network (Fig. 25).

5.2 Image compression

Here a qubit neuron model is proposed as a new, non-standard computing scheme that bridges quantum computing and neural computing.

When an image is fed into a feedforward network with a narrow hidden layer, the compressed image data can be taken from the output of that hidden layer. The network is trained to perform an identity mapping; once it does so, the compressed representation is read from the narrow hidden layer and the original image is reconstructed at the output layer (Fig. 26, 27).

Fig. 26. Image compression on layered neural networks

Fig. 27. Input circuit, section of the original image
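A classical sketch of this identity-mapping compression scheme is given below (the qubit-neuron specifics are omitted). The 8x8 patch size follows Table 6; the hidden-layer width, the random stand-in patches and the training loop are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
patches = rng.random((500, 64))            # stand-in 8x8 image patches, flattened

n_hidden = 16                              # narrow hidden layer = size of the compressed code (assumed)
W1 = rng.normal(scale=0.1, size=(64, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, 64))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(2000):
    code = sigmoid(patches @ W1)           # hidden-layer output = compressed data
    recon = sigmoid(code @ W2)             # output layer tries to reproduce the input
    err = recon - patches
    # backpropagation for the identity mapping (mean squared error)
    d2 = err * recon * (1 - recon)
    d1 = (d2 @ W2.T) * code * (1 - code)
    W2 -= lr * code.T @ d2 / len(patches)
    W1 -= lr * patches.T @ d1 / len(patches)
```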

In the simulation we use the parameter values given in Table 6.

Table 6: Parameter values used in the simulation

Parameter: Value
BI, BH: 8 bit
BWO: 16 bit
Patch size: 8x8 pixels
Initial value of the parameter: -π to π
Initial value of the parameter: -1 to 1
Target quantization range (weight, threshold): -5 to 5

The number of hidden-layer neurons, which strongly affects the value of R, depends on the experimental situation. Quantization to BH and BWO bits is performed by a quantization function of the kind shown in Fig. 28.

Fig. 28. An example of a quantization function
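A uniform quantizer of this kind can be written, for example, as follows. The bit widths and the target range follow Table 6; the exact shape of the function used in the experiments is not reproduced here, so this is only an assumed, minimal variant.

```python
import numpy as np

def quantize(x, bits, lo=-5.0, hi=5.0):
    """Uniformly quantize x to 2**bits levels over the target range [lo, hi]."""
    levels = 2 ** bits
    x = np.clip(x, lo, hi)
    step = (hi - lo) / (levels - 1)
    return lo + np.round((x - lo) / step) * step

w = np.array([-4.7, -0.013, 0.4, 3.99])
print(quantize(w, bits=8))    # BH = 8 bit
print(quantize(w, bits=16))   # BWO = 16 bit
```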

5.3 Alphabet coding

Fig. 29. Examples of graphic symbols of the alphabet

6. MATLAB NEURAL NETWORK TOOL. Kohonen network

The success of neural networks is explained by the fact that the necessary element base for implementing neural networks has been created, and powerful tools have been developed for modeling them in the form of application packages. These packages include the Neural Networks Toolbox (NNT) of the MATLAB 6 mathematical modeling system from MathWorks. MATLAB is used by more than 1,000,000 engineers and scientists and runs on most modern operating systems, including Linux, Mac OS, Solaris, and Microsoft Windows.

The NNT application package contains tools for building neural networks based on the behavior of the mathematical analogue of a neuron. The package provides efficient support for the design, training, analysis and simulation of many known types of networks - from basic perceptron models to the most advanced associative and self-organizing networks.

The Kohonen model (Fig. 30.) solves the problem of finding clusters in the space of input images.

This network is trained without a teacher, on the basis of self-organization. As learning progresses, the weight vectors of the neurons tend to the centers of the clusters, i.e. groups of vectors of the training sample. At the stage of solving information problems, the network assigns a newly presented image to one of the formed clusters, thereby indicating the category to which it belongs.

Fig. 30. Kohonen network

Let us consider the Kohonen NN architecture and learning rules in more detail. The Kohonen network, like the Lippmann-Hamming network, consists of a single layer of neurons. The number of inputs of each neuron equals the dimension of the input image. The number of neurons is determined by the degree of detail with which the set of library images has to be clustered. With a sufficient number of neurons and well-chosen learning parameters, the Kohonen NN can not only identify the main groups of images but also establish the "fine structure" of the resulting clusters. In this case, close input images correspond to similar maps of neural activity (Fig. 31).

Fig. 31. An example of a Kohonen map. The size of each square corresponds to the degree of excitation of the corresponding neuron.

Training begins with assigning random values to the matrix of connections. Subsequently, a process of self-organization takes place, which consists in modifying the weights as the vectors of the training sample are presented to the input. For each neuron m one can determine the distance from its weight vector w_m to the input vector x: d_m = ||x - w_m||.

Next, the neuron m = m* for which this distance is minimal is selected. At the current training step t, only the weights of the neurons from the neighbourhood of the neuron m* are modified: w_m(t+1) = w_m(t) + h(t)·(x - w_m(t)).

Fig. 34. Training the Kohonen network

Initially, the neighbourhood of any neuron contains all the neurons of the network; subsequently this neighbourhood narrows, and at the end of the training phase only the weights of the closest neuron are adjusted. The learning rate h(t) < 1 also decreases over time. The images of the training sample are presented sequentially, and the weights are adjusted after each presentation.
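A minimal sketch of this learning procedure (winner selection by distance, shrinking neighbourhood, decaying learning rate) is given below. The two-dimensional input data, the Gaussian neighbourhood function and the particular decay schedules are illustrative assumptions; only the overall rule follows the description above.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.random((1000, 2))                 # training sample (arbitrary 2-D inputs)

n_neurons = 100                              # e.g. a 10x10 Kohonen map, as in Fig. 31
grid = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
W = rng.random((n_neurons, 2))               # random initial weight matrix

T = 10_000
for t in range(T):
    x = data[rng.integers(len(data))]
    d = np.linalg.norm(W - x, axis=1)        # distance of every weight vector to the input
    m_star = np.argmin(d)                    # winning neuron m*
    # learning rate h(t) < 1 and neighbourhood radius both shrink with time
    h = 0.5 * (1 - t / T) + 0.01
    radius = 5.0 * (1 - t / T) + 0.5
    grid_dist = np.linalg.norm(grid - grid[m_star], axis=1)
    neighbourhood = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
    # only neurons near m* are modified appreciably
    W += h * neighbourhood[:, None] * (x - W)
```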

It is convenient to present the resulting map as a two-dimensional image in which the different degrees of excitation of all neurons are displayed as squares of different areas. An example of a map built with 100 Kohonen neurons is shown in Fig. 31.

Each neuron carries information about a cluster, a dense region in the space of input images, and forms a collective image of this group. Thus the Kohonen NN is capable of generalization. Several neurons with close weight vectors can correspond to a particular cluster, so the failure of one neuron is not critical for the functioning of the Kohonen NN.

Conclusion

Some comparative studies have been optimistic, others pessimistic. For many problems, such as pattern recognition, there are no dominant approaches yet. The choice of the best technology should be dictated by the nature of the problem. It is necessary to understand the possibilities, prerequisites and scope of the various approaches and to make the most of their complementary advantages for the further development of intelligent systems. Such efforts can lead to a synergistic approach that combines QNNs with other technologies to make a significant breakthrough in solving pressing problems. It is clear that interaction and joint work between researchers in the field of QNNs and in other disciplines will not only avoid duplication but also, more importantly, stimulate and enrich the development of the individual areas.

Currently there is active research into alternative methods of computing, such as computing with quantum computers and with neurocomputers. Both directions offer great opportunities for parallelism, but they approach it from different angles. Quantum computers make it possible to apply an operation to a very large number of qubit amplitudes at the same time, which can greatly increase the speed of calculations. A neurocomputer, on the other hand, performs many different simple tasks simultaneously on a large number of primitive processors and then combines the results of their work. The main task of neurocomputers is image processing; with a parallel architecture this task is performed much faster than on a classical sequential one. At the same time, neural computers give us universal and at the same time "survivable" systems, owing to their homogeneous structure.

In this paper I have tried to provide a systematic introduction to the theory of Quantum Neural Networks and to come closer to answering an important question: are Quantum Neural Networks the long-awaited mainstream along which the development of artificial intelligence methods will continue, or will they turn out to be merely a fashionable trend, as has previously happened with expert systems and some other research tools (such as Feynman diagrams) from which revolutionary breakthroughs were initially expected? Gradually such methods found their limitations and took their proper place in the general structure of science.

There are two main reasons for the interest in quantum neural networks. One has to do with arguments that quantum processes can play an important role in how the brain works. For example, Roger Penrose has made various arguments that only a new physics that should unify quantum mechanics with general relativity could describe phenomena such as understanding and consciousness. However, his approach is not addressed to neural networks themselves, but to intracellular structures such as microtubules. Another reason is related to the rapid growth of quantum computing, the main ideas of which could well be transferred to neurocomputing, which would open up new opportunities for them.

Quantum neural systems can bypass some of the difficult issues that are essential for quantum computing owing to their analog nature and their ability to learn from a limited number of examples.

What can be expected from quantum neural networks? Currently, quantum neural networks have the following advantages:

Exponential memory capacity;

Better performance with fewer hidden neurons;

Fast learning;

Elimination of catastrophic forgetting due to the absence of interference between patterns;

Solving linearly inseparable problems with a single-layer network;

No connections;

High data processing speed (10^10 bits/s);

Miniaturization (10^11 neurons/mm^3);

Higher stability and reliability;

These potential advantages of quantum neural networks are the main motivation for their development.

Quantum neuron

Synapses provide communication between neurons and multiply the input signal by a number characterizing the strength of the connection, the weight of the synapse. The adder sums the signals arriving through synaptic connections from other neurons and the external input signals. The converter applies a function of one argument, the output of the adder, and produces the output value of the neuron. This function is called the activation function of the neuron.

Thus, the neuron is completely described by its weights and activation function F. Having received a set of numbers (vector) as inputs, the neuron produces some number at the output.
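In code, the neuron described above reduces to a weighted sum minus a threshold, passed through an activation function. The sigmoid is used here only as one possible choice (see Table 2 below); the numbers are arbitrary.

```python
import numpy as np

def neuron(x, w, threshold, activation=lambda s: 1.0 / (1.0 + np.exp(-s))):
    """Weighted sum of the inputs minus the threshold, passed through the activation F."""
    s = np.dot(w, x) - threshold
    return activation(s)

y = neuron(x=np.array([0.2, -1.0, 0.7]), w=np.array([0.5, 0.1, -0.3]), threshold=0.1)
```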

The activation function can be of various types. The most widely used options are listed in the table (Table 2).

Table 2: List of neuronal activation functions

Name: Range of values

Threshold: {0, 1}

Sign: {-1, 1}

Sigmoid: (0, 1)

Semi-linear: [0, +∞)

Linear: (-∞, +∞)

Radial basis: (0, 1]

Semi-linear with saturation: [0, 1]

Linear with saturation: [-1, 1]

Hyperbolic tangent: (-1, 1)

Triangular: [0, 1]

The definition of a Quantum Neuron is given as follows:

It receives input signals (input data or output signals from other QNN neurons) through several input channels. Each input signal passes through a connection of a certain intensity (weight), which corresponds to the synaptic strength of the neuron. Each neuron has a specific threshold value associated with it. The weighted sum of the inputs is computed, the threshold value is subtracted from it, and the result is the activation value of the neuron (also called its post-synaptic potential, PSP).

The activation signal is transformed using an activation function (or transfer function) and as a result, the output signal of the neuron is obtained (Fig. 1).


Fig. 1

A mathematical model of a quantum neuron is |y⟩ = F̂ (Σ_{j=1..n} ŵ_j |x_j⟩), where the weights ŵ_j are matrices (operators) acting on the basis states and F̂ is an operator that can be implemented by a network of quantum cells (gates).

For example, consider the learning process of a quantum neuron with F̂ = 1̂, the identity operator: |y⟩ = Σ_j ŵ_j |x_j⟩.

The quantum learning rule is given by analogy with the classical case as ŵ_j(t+1) = ŵ_j(t) + η (|d⟩ - |y(t)⟩) ⟨x_j|, where |d⟩ is the desired output. This learning rule drives the quantum neuron towards the desired state used for learning. Taking the squared modulus of the difference between the actual and the desired output, we see that

|| |d⟩ - |y(t+1)⟩ ||² = (1 - nη)² || |d⟩ - |y(t)⟩ ||²,

so for 0 < nη < 2 (n is the number of inputs) the error decreases at every step.
An entire network can be assembled from primitive elements using the standard rules of ANN architectures.
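Before assembling a network, the single-neuron learning rule above can be checked numerically. The following is a minimal sketch with one input and F̂ equal to the identity; the values of η and of the input and target states are arbitrary, and no attempt is made to keep the weight operator unitary.

```python
import numpy as np

eta = 0.4
x = np.array([1.0, 0.0], dtype=complex)      # input qubit |x> = |0>
d = np.array([0.0, 1.0], dtype=complex)      # desired output |d> = |1>
w = np.eye(2, dtype=complex) * 0.1           # initial weight operator (a 2x2 matrix)

for t in range(20):
    y = w @ x                                 # actual output |y(t)> = w|x>  (F = identity)
    w = w + eta * np.outer(d - y, x.conj())   # w(t+1) = w(t) + eta * (|d> - |y>)<x|
    # with a single normalized input the squared error shrinks by (1 - eta)^2 per step
    print(np.linalg.norm(d - w @ x) ** 2)
```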

Fundamentals of quantum computing: qubits. The unit of quantum information is the qubit. A qubit can be thought of as a two-state system, e.g. a spin-1/2 particle or a two-level system. The state of a qubit is described by a vector with two components, |ψ⟩ = a|0⟩ + b|1⟩.


Fundamentals of quantum computing: quantum gates. Quantum gates are analogues of the Boolean operations AND, OR, NOT, etc. A quantum gate acting on n qubits is a unitary operator. Example: the NOT gate, which swaps the amplitudes of |0⟩ and |1⟩.
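For illustration, the NOT gate is the 2x2 unitary matrix X shown below; applying it swaps the amplitudes of |0⟩ and |1⟩. The example state is arbitrary.

```python
import numpy as np

X = np.array([[0, 1],
              [1, 0]])                  # NOT (Pauli-X) gate: a unitary 2x2 operator

psi = np.array([0.6, 0.8])              # a|0> + b|1> with a = 0.6, b = 0.8
print(X @ psi)                          # -> [0.8, 0.6]: amplitudes of |0> and |1> are swapped
print(np.allclose(X.conj().T @ X, np.eye(2)))  # unitarity check: True
```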


Quantum algorithms: Simon's algorithm for finding the period of a function; Shor's algorithm for factoring integers into primes; Grover's search algorithm; the Deutsch-Jozsa algorithm.

Shor's algorithm: basic steps.
1. Choose a random residue a modulo N.
2. Check that gcd(a, N) = 1.
3. Find the order r of the residue a modulo N.
4. If r is even, compute gcd(a^(r/2) - 1, N).
Definition: the minimal r such that a^r ≡ 1 (mod N) is called the order of a modulo N. The order is the period of the function f(x) = a^x (mod N).
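The control flow of these steps can be sketched classically as follows. The quantum order-finding subroutine is replaced here by a brute-force search, so this illustrates only the logic of the steps, not the quantum speed-up; N = 15 is an arbitrary small example.

```python
from math import gcd
from random import randrange

def shor_classical_sketch(N):
    """Classical emulation of the basic steps of Shor's algorithm for a small N.
    The order-finding step is done by brute force here; in the real algorithm
    it is the only part performed on a quantum computer."""
    while True:
        a = randrange(2, N)                  # 1. random residue a modulo N
        if gcd(a, N) != 1:                   # 2. lucky case: a already shares a factor with N
            return gcd(a, N)
        r = 1
        while pow(a, r, N) != 1:             # 3. order r: minimal r with a^r = 1 (mod N)
            r += 1
        if r % 2 == 0:
            f = gcd(pow(a, r // 2) - 1, N)   # 4. gcd(a^(r/2) - 1, N)
            if 1 < f < N:
                return f

print(shor_classical_sketch(15))             # a nontrivial factor of 15, i.e. 3 or 5
```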

Quantum associative memory: the Perush quantum associative network (2000). It is based on the Hopfield model and a continuous generalization of the Hopfield Hamiltonian; wave-function collapse is interpreted as convergence to an attractor.


Quantum neural network (Berman et al., 2002). Designed to calculate the degree of quantum entanglement; it is a feedforward network consisting of two-level quantum objects and linear oscillators.

Quantum associative memory: the Ventura quantum associative memory (1998, 2000, 2003). It is based on Grover's algorithm, stores m n-dimensional binary vectors, uses a specialized quantum learning algorithm that yields the operator P, and has an exponential capacity of ~2^n.