# Harrow-Hassidim-Lloyd (HHL) Algorithm

Written by Bikash's Quantum for the Strangeworks Platform

Multiple linear regression, one of the most fundamental supervised learning techniques, plays an important role in the field of machine learning. In 2009, Harrow et al. [Phys. Rev. Lett. 103, 150502 (2009)] showed that their algorithm can sample the solution of a linear system $Ax=b$ exponentially faster than any existing classical algorithm. Remarkably, any multiple linear regression problem can be reduced to a linear system of equations. However, finding a practical and efficient quantum circuit for the algorithm in terms of elementary gate operations is still an open problem. Here we put forward a 7-qubit quantum circuit design, based on an earlier work by Cao et al. [Mol. Phys. 110, 1675 (2012)], that solves a 3-variable regression problem using only basic quantum gates. Furthermore, we demonstrate an implementation of the algorithm after designing the circuits on the Qiskit and Cirq platforms.

## Introduction

Quantum algorithms running on quantum computers aim at quickly and efficiently solving several important computational problems faster than classical algorithms running on classical computers. One key way in which quantum algorithms differ from classical algorithms is that they utilize quantum mechanical phenomena such as superposition and entanglement, which allow us to work in exponentially large Hilbert spaces with only polynomial overheads. This in turn, in some cases, allows for exponential speed-ups in terms of algorithmic complexity.

In today's world, machine learning is primarily concerned with the development of low-error models in order to make accurate predictions by learning and inferring from training data. It borrows heavily from the field of statistics, in which linear regression is one of the flagship tools. The theory of multiple linear regression, or more generally multivariate linear regression, was largely developed in the field of statistics in the pre-computer era. It is one of the most well understood, versatile and straightforward techniques in any statistician's toolbox, and is also an important and practical supervised learning algorithm. Supervised learning is where one has some labelled input data samples $\{x_i, y_i\}_{i=1}^{N}$ (where the $x_i$'s are feature vectors and the $y_i$'s are the corresponding labels) and then, based on some criterion (which might depend on the context), chooses a mapping from the input set $X$ to the output set $Y$. That mapping can then be used to predict the probable output corresponding to an input lying outside of the training data. Multiple linear regression is similar in the sense that, given some training samples, one identifies a closely-fitting hyperplane depending on the specific choice of a loss function (the most common one being the quadratic loss function of the least squares method). Interestingly, any multiple regression problem can be converted into an equivalent system of linear equations or, more specifically, a Quantum Linear Systems Problem (QLSP).

Suppose that we are given a system of $N$ linear equations with $N$ unknowns, which can be expressed as $Ax=b$. What we are interested in is: given a matrix $A \in \mathbb{C}^{N \times N}$ and a vector $b \in \mathbb{C}^{N}$, find the solution $x \in \mathbb{C}^{N}$ satisfying $Ax=b$ (which is $A^{-1}b$ if $A$ is invertible), or else return a flag if no solution exists. This is known as the Linear Systems Problem (LSP). However, we will consider only a special case of this general problem, in the form of the Quantum Linear Systems Problem (QLSP).
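As a point of reference for what the quantum version changes, a classical LSP solver returns every coordinate of $x$ explicitly. A minimal sketch with NumPy, using illustrative numbers only:

```python
import numpy as np

# A small instance of the Linear Systems Problem (LSP); the numbers are illustrative.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# A classical solver returns every coordinate of x = A^{-1} b explicitly.
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)
```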

The quantum version of the LSP problem, that is, the QLSP can be expressed as:

Let $A$ be an $N \times N$ Hermitian matrix with a spectral norm bounded by unity and a known condition number $\kappa$. The quantum state $|b\rangle$ on $\lceil \log_2 N \rceil$ qubits is given by

$|b\rangle := \dfrac{\sum_i b_i |i\rangle}{\left\lVert \sum_i b_i |i\rangle \right\rVert_2}$

and $∣x⟩$ by

$|x\rangle := \dfrac{\sum_i x_i |i\rangle}{\left\lVert \sum_i x_i |i\rangle \right\rVert_2}$

where $b_i, x_i$ are respectively the $i^{\text{th}}$ components of the vectors $b$ and $x$. Given the matrix $A$ (whose elements are accessed by an oracle) and the state $|b\rangle$, output a state $|\tilde{x}\rangle$ such that $\left\lVert |\tilde{x}\rangle - |x\rangle \right\rVert_2 \leq \epsilon$ with some probability $\Omega(1)$ (practically at least $\frac{1}{2}$), along with a binary flag indicating success or failure.
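The state preparation in the definition above is just amplitude encoding of a classical vector. A minimal sketch, where `amplitude_encode` is an illustrative helper name:

```python
import numpy as np

def amplitude_encode(v):
    """Map a classical vector onto the amplitudes of a normalized quantum state."""
    v = np.asarray(v, dtype=complex)
    return v / np.linalg.norm(v)

# |b> := (sum_i b_i |i>) / || sum_i b_i |i> ||
b = np.array([3.0, 4.0])
ket_b = amplitude_encode(b)

# measurement probabilities |<i|b>|^2 sum to one
assert np.isclose(np.sum(np.abs(ket_b) ** 2), 1.0)
```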

In 2009, Harrow, Hassidim and Lloyd put forward a quantum algorithm (popularly known as the HHL algorithm) to obtain information about the solution $x$ of certain classes of linear systems $Ax=b$. As we know, algorithms for finding the solutions to linear systems of equations play an important role in engineering, physics, chemistry, computer science, and economics, among other areas. However, experimentally implementing the HHL algorithm for solving an arbitrary system of linear equations to a satisfactory degree of accuracy remains an infeasible task even today, due to several physical and theoretical restrictions imposed by the algorithm and the currently accessible hardware. It is therefore imperative to extract as much practical benefit as we can from the algorithm. We present an application (in the context of multiple regression) of a modified version of an earlier circuit meant for implementing the HHL algorithm for a $4\times4$ linear system on real quantum computers. This circuit requires only 7 qubits, and it should be simple enough to verify experimentally once one gets access to quantum computers with quantum logic gates of sufficiently low error rates.

Although the HHL algorithm solves the QLSP for any such matrix $A$ (or $\begin{bmatrix} 0 & A \\ A^{\dagger} & 0 \end{bmatrix}$ in the non-Hermitian case), it can be efficiently implemented only when the matrix is sparse and well-conditioned (the sparsity condition may be slightly relaxed). In this context, efficient means at most poly-logarithmic in the system size. An $N \times N$ matrix is called $s$-sparse if it has at most $s$ non-zero entries in any row or column, and simply sparse if it has at most $\mathrm{poly}(\log N)$ entries per row. We generally call a matrix well-conditioned when its singular values lie between the reciprocal of its condition number ($1/\kappa$) and 1. The condition number $\kappa$ of a matrix is the ratio of its largest to smallest singular value, and is undefined when the smallest singular value of $A$ is $0$. For Hermitian matrices the magnitudes of the eigenvalues equal the corresponding singular values.
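These two conditions are easy to check numerically. A small sketch, where `condition_number` and `is_s_sparse` are illustrative helper names:

```python
import numpy as np

def condition_number(A):
    """kappa: ratio of the largest to the smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
    return s[0] / s[-1]

def is_s_sparse(A, s):
    """True if A has at most s non-zero entries in every row and every column."""
    nz = A != 0
    return bool(nz.sum(axis=0).max() <= s and nz.sum(axis=1).max() <= s)

A = np.diag([1.0, 0.5, 0.25])
print(condition_number(A))   # 4.0: singular values run from 1 down to 1/kappa
print(is_s_sparse(A, 1))     # True: a diagonal matrix is 1-sparse
```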

At this point it is important to reiterate that, unlike the output $A^{-1}b$ of a classical linear system solver, the output copy of $|\tilde{x}\rangle$ does not provide access to the coordinates of $A^{-1}b$. Nevertheless, it allows for sampling from the solution vector, e.g., computing expectation values $\langle \tilde{x}|M|\tilde{x}\rangle$, where $M$ is a quantum-mechanical operator. This is one main difference between solutions of the LSP and solutions of the QLSP. We should also keep in mind that reading out the elements of $|\tilde{x}\rangle$ in itself takes $O(N)$ time. Thus, a solution to the QLSP might be useful only in applications where just samples from the vector $|\tilde{x}\rangle$ are needed.

## The Harrow-Hassidim-Lloyd Algorithm

The HHL algorithm consists of three major steps, which we will briefly discuss one by one. Initially, we begin with a Hermitian matrix $A$ and an input state $|b\rangle$ corresponding to our specific system of linear equations [Fig. 1]. The assumption that $A$ is Hermitian may be dropped without loss of generality, since we can instead solve the linear system $\begin{bmatrix} 0 & A \\ A^{\dagger} & 0 \end{bmatrix} y = \begin{bmatrix} b \\ 0 \end{bmatrix}$, which has the unique solution $y = \begin{bmatrix} 0 \\ x \end{bmatrix}$ when $A$ is invertible. This transformation does not alter the condition number (the ratio of the magnitudes of the largest and smallest eigenvalues) of $A$. However, in case our original matrix $A$ is not Hermitian, the transformed system with the new matrix $\begin{bmatrix} 0 & A \\ A^{\dagger} & 0 \end{bmatrix}$ needs oracle access to the non-zero entries of both the rows and columns of $A$.

Here $iAt$ and $-iAt$ commute, and hence $e^{iAt}e^{-iAt} = e^{iAt-iAt} = e^{0} = I$. Moreover, $e^{iAt}$ shares all its eigenvectors with $A$, while its eigenvalues are $e^{i\lambda_j t}$ if the eigenvalues of $A$ are taken to be $\lambda_j$. Suppose that $|u_j\rangle$ are the eigenvectors of $A$ and $\lambda_j$ are the corresponding eigenvalues. We recall that we assumed all the eigenvalues to be of magnitude less than 1 (the spectral norm is bounded by unity). As the eigenvalues $\lambda_j$ are of the form $0.a_1a_2a_3\cdots$ in binary, we use $|\lambda_j\rangle$ to refer to $|a_1a_2a_3\cdots\rangle$. We know from the spectral theorem that every Hermitian matrix has an orthonormal basis of eigenvectors. So, in this context, $A$ can be re-written as $\sum_j \lambda_j |u_j\rangle\langle u_j|$ (the eigendecomposition of $A$) and $|b\rangle$ as $\sum_j \beta_j |u_j\rangle$.
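The spectral relationship between $A$ and $e^{iAt}$ can be verified numerically. A small sketch on an illustrative $2\times2$ Hermitian matrix, where `mat_exp` is a hand-rolled Taylor-series exponential (fine for the small norms used here):

```python
import numpy as np

def mat_exp(M, terms=60):
    """Matrix exponential via its Taylor series (adequate for small ||M||)."""
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

A = np.array([[0.5, 0.25],
              [0.25, 0.5]])        # small Hermitian example, spectral norm < 1
t = 2 * np.pi / 4
U = mat_exp(1j * A * t)

lam, V = np.linalg.eigh(A)         # A = sum_j lam_j |u_j><u_j|
for j, l in enumerate(lam):
    u = V[:, j]
    # e^{iAt} keeps the eigenvectors of A and exponentiates the eigenvalues
    assert np.allclose(U @ u, np.exp(1j * l * t) * u)

assert np.allclose(U @ mat_exp(-1j * A * t), np.eye(2))   # e^{iAt} e^{-iAt} = I
```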

## Phase Estimation

The quantum phase estimation algorithm performs the mapping $(|0\rangle^{\otimes n})_C |u\rangle_I |0\rangle_S \mapsto |\tilde{\varphi}\rangle_C |u\rangle_I |0\rangle_S$, where $|u\rangle$ is an eigenvector of a unitary operator $U$ with an unknown eigenvalue $e^{2\pi i \varphi}$, and $\tilde{\varphi}$ is a $t$-bit approximation of $\varphi$, $t$ being the number of qubits in the clock register. The subscripts on the kets indicate the names of the registers which store the corresponding states. In the HHL algorithm the input register instead begins with a superposition of eigenvectors, i.e., $|b\rangle = \sum_j \beta_j |u_j\rangle$ rather than a specific eigenvector $|u\rangle$, and for us the unitary operator is $e^{iAt}$. So the phase estimation circuit performs the mapping

$(|0\rangle^{\otimes n})_C |b\rangle_I \mapsto \sum_{j=1}^{N} \beta_j |u_j\rangle_I \left| \frac{\tilde{\lambda}_j t_0}{2\pi} \right\rangle_C$

where the $\tilde{\lambda}_j$'s are binary representations of the eigenvalues of $A$ to a tolerated precision. To be more explicit, $\tilde{\lambda}_j$ is represented as $b_1b_2b_3\cdots b_t$ ($t$ is the number of qubits in the clock register) if the actual binary expansion of $\lambda_j$ is of the form $\lambda_j = 0.b_1b_2b_3\cdots$. To avoid the factor of $2\pi$ in the denominator, the evolution time $t_0$ is generally chosen to be $2\pi$. However, $t_0$ may also be used to normalize $A$ (by re-scaling $t_0$) in case the spectral norm of $A$ exceeds 1. Additionally, an important factor in the performance of the algorithm is the condition number $\kappa$. As $\kappa$ grows, $A$ tends more and more towards a non-invertible matrix, and the solutions become less and less stable. Matrices with large condition numbers are said to be ill-conditioned. The HHL algorithm generally assumes that the singular values of $A$ lie between $1/\kappa$ and 1, which ensures that the matrices we have to deal with are well-conditioned. Nonetheless, there are methods to tackle ill-conditioned matrices, and those have been thoroughly discussed in the paper by Harrow et al. It is worth mentioning that in this step the clock-register-controlled Hamiltonian simulation gate can be expressed as $\sum_{\tau=0}^{T-1} |\tau\rangle\langle\tau|_C \otimes e^{iA\tau t_0/T}$, where $T = 2^t$ ($t$ is the number of qubits in the clock register) and the evolution time is $t_0 = O(\kappa/\epsilon)$. Interestingly, choosing $t_0 = O(\kappa/\epsilon)$ causes an error of magnitude at most $\epsilon$ in the final state.
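The clock register can only hold a $t$-bit binary fraction, which is exactly what "$\tilde{\lambda}_j$ to a tolerated precision" means. A minimal sketch of this rounding, where `t_bit_estimate` is an illustrative helper name:

```python
def t_bit_estimate(lam, t):
    """Clock-register reading for an eigenvalue lam in [0, 1): the best t-bit fraction."""
    k = round(lam * 2 ** t) % (2 ** t)
    return k / 2 ** t, format(k, "0{}b".format(t))

# lam = 0.625 = 0.101 in binary: with t = 3 clock qubits the estimate is exact
approx, bits = t_bit_estimate(0.625, 3)
print(bits, approx)    # 101 0.625

# an eigenvalue with no finite binary expansion in t bits is only approximated
approx2, _ = t_bit_estimate(0.6, 3)
print(approx2)         # 0.625, i.e. an error of 0.025
```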

## Controlled Rotation ($R(\tilde{\lambda}^{-1})$)

A clock register controlled $σ_{y}$-rotation of the ancilla qubit produces a normalized state of the form

$\sum_{j=1}^{N} \beta_j |u_j\rangle_I |\tilde{\lambda}_j\rangle_C \left( \sqrt{1 - \frac{C^2}{\tilde{\lambda}_j^2}}\,|0\rangle + \frac{C}{\tilde{\lambda}_j}|1\rangle \right)_S$

These rotations, conditioned on respective $λ~_{j}$, can be achieved by the application of the

$\exp(-i\theta\sigma_y) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

operators, where $\theta = \sin^{-1}\!\left(\frac{C}{\tilde{\lambda}_j}\right)$ so that the $|1\rangle$ amplitude becomes $C/\tilde{\lambda}_j$. $C$ is a scaling factor that keeps the controlled rotation physical. Practically, $C < \lambda_{\min}$ (the minimum eigenvalue of $A$) is a safe choice, which may be stated more formally as $C = O(1/\kappa)$.
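Using the convention $\exp(-i\theta\sigma_y)|0\rangle = \cos\theta\,|0\rangle + \sin\theta\,|1\rangle$, the angle that places amplitude $C/\tilde{\lambda}_j$ on $|1\rangle$ is $\arcsin(C/\tilde{\lambda}_j)$. A numerical sketch under that convention, using the eigenvalues from the worked example later in this document:

```python
import numpy as np

def rotation(theta):
    """exp(-i*theta*sigma_y) as a real 2x2 rotation matrix."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

lams = np.array([1.0, 2.0, 4.0, 8.0])   # eigenvalues of A in the worked example
C = 1.0                                  # C <= lambda_min keeps every rotation physical

for lam in lams:
    theta = np.arcsin(C / lam)           # places amplitude C/lam on |1>
    s = rotation(theta) @ np.array([1.0, 0.0])   # apply to |0>
    assert np.isclose(s[1], C / lam)
    assert np.isclose(s[0], np.sqrt(1 - C**2 / lam**2))
```

Note that the smallest eigenvalue receives the largest rotation angle, which matches the circuit discussion below.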

## Uncomputation

In the final step, the inverse quantum phase estimation algorithm sets the clock register back to $(|0\rangle^{\otimes n})_C$ and leaves the remaining state as

$\sum_{j=1}^{N} \beta_j |u_j\rangle_I \left( \sqrt{1 - \frac{C^2}{\tilde{\lambda}_j^2}}\,|0\rangle + \frac{C}{\tilde{\lambda}_j}|1\rangle \right)_S$

Postselecting on the ancilla state $|1\rangle_S$ gives (up to normalization) the final state $C\sum_{j=1}^{N} \frac{\beta_j}{\tilde{\lambda}_j} |u_j\rangle_I$. The inverse of the Hermitian matrix $A$ can be written as $\sum_j \frac{1}{\lambda_j} |u_j\rangle\langle u_j|$, and hence $A^{-1}|b\rangle$ matches $\sum_{j=1}^{N} \frac{\beta_j}{\lambda_j} |u_j\rangle_I$. This output state, in the standard basis, is component-wise proportional to the exact solution $x$ of the system $Ax=b$.

## Linear Regression Utilizing the HHL

Linear regression models a linear relationship between a scalar response variable and one or more feature variables. Given an $n$-unit data set $\{y_i, x_{i1}, \cdots, x_{ip}\}_{i=1}^{n}$, a linear regression model assumes that the relationship between the dependent variable $y$ and the set of $p$ attributes $x = \{x_1, \cdots, x_p\}$ is linear. Essentially, the model takes the form

$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \epsilon_i = x_i^{T}\boldsymbol{\beta} + \epsilon_i$

where $\epsilon_i$ is the noise or error term and $i$ ranges from 1 to $n$. $x_i^T$ denotes the transpose of the column vector $x_i$, and $x_i^T\boldsymbol{\beta}$ is the inner product between the vectors $x_i$ and $\boldsymbol{\beta}$. These $n$ equations may be written more compactly in matrix notation as $y = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$. Now, we will consider a simple example with 3 feature variables and a bias $\beta_0$. Say our data sets $\{y_i, x_{i1}, x_{i2}, x_{i3}\}$ are $\{-\frac{1}{8}+\frac{1}{8\sqrt{2}}, -\sqrt{2}, \frac{1}{\sqrt{2}}, -\frac{1}{2}\}$, $\{\frac{3}{8}-\frac{3}{8\sqrt{2}}, -\sqrt{2}, -\frac{1}{\sqrt{2}}, \frac{1}{2}\}$, $\{-\frac{1}{8}-\frac{1}{8\sqrt{2}}, \sqrt{2}, -\frac{1}{\sqrt{2}}, -\frac{1}{2}\}$ and $\{\frac{3}{8}+\frac{3}{8\sqrt{2}}, \sqrt{2}, \frac{1}{\sqrt{2}}, \frac{1}{2}\}$.

Plugging in these data sets we get the linear system:

$\beta_0 - \sqrt{2}\,\beta_1 + \frac{1}{\sqrt{2}}\,\beta_2 - \frac{1}{2}\,\beta_3 = -\frac{1}{8} + \frac{1}{8\sqrt{2}}$

$\beta_0 - \sqrt{2}\,\beta_1 - \frac{1}{\sqrt{2}}\,\beta_2 + \frac{1}{2}\,\beta_3 = \frac{3}{8} - \frac{3}{8\sqrt{2}}$

$\beta_0 + \sqrt{2}\,\beta_1 - \frac{1}{\sqrt{2}}\,\beta_2 - \frac{1}{2}\,\beta_3 = -\frac{1}{8} - \frac{1}{8\sqrt{2}}$

$\beta_0 + \sqrt{2}\,\beta_1 + \frac{1}{\sqrt{2}}\,\beta_2 + \frac{1}{2}\,\beta_3 = \frac{3}{8} + \frac{3}{8\sqrt{2}}$

To estimate $\boldsymbol{\beta}$ we will use the popular least squares method, which minimizes the residual sum of squares $\sum_{i=1}^{n}(y_i - x_i^T\boldsymbol{\beta})^2$. If $X^TX$ is positive definite (and $X$ in turn has full column rank) we obtain a unique solution for the best fit, $\hat{\boldsymbol{\beta}} = (X^TX)^{-1}X^Ty$. It is possible that the columns of $X$ are not all linearly independent, so that $X$ is not of full rank. This kind of situation might occur if two or more of the feature variables are perfectly correlated. Then $X^TX$ would be singular and $\hat{\boldsymbol{\beta}}$ would not be uniquely defined. Nevertheless, there exist techniques, like filtering, that resolve the non-unique representation by removing the redundant features. Rank deficiencies may also occur if the number of features $p$ exceeds the number of data sets $n$. If we estimate such models using regularization, then redundant columns should not be left out: the regularization takes care of the singularities and, more importantly, the final prediction might depend on which columns are left out.
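The normal-equations solution can be checked against a library least-squares solver. A minimal sketch on synthetic, noiseless toy data (all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(8, 3))        # 8 samples, 3 features; full column rank (a.s.)
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true                  # noiseless toy data, so recovery is exact

# best fit from the normal equations: beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# the same fit from a generic least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

assert np.allclose(beta_hat, beta_true)
assert np.allclose(beta_hat, beta_lstsq)
```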

Equations (1)-(4) may be expressed in the matrix notation as:

$\begin{bmatrix} -\sqrt{2} & 1 & \frac{1}{\sqrt{2}} & -\frac{1}{2} \\ -\sqrt{2} & 1 & -\frac{1}{\sqrt{2}} & \frac{1}{2} \\ -\sqrt{2} & -1 & \frac{1}{\sqrt{2}} & \frac{1}{2} \\ \sqrt{2} & 1 & \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_0 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \begin{bmatrix} -\frac{1}{8}+\frac{1}{8\sqrt{2}} \\ \frac{3}{8}-\frac{3}{8\sqrt{2}} \\ \frac{1}{8}+\frac{1}{8\sqrt{2}} \\ \frac{3}{8}+\frac{3}{8\sqrt{2}} \end{bmatrix}$

(Here the variables are reordered as $(\beta_1, \beta_0, \beta_2, \beta_3)$ and equation (3) has been multiplied by $-1$.)

Note that, unlike the common convention, our representation of $X$ does not contain a column of 1's corresponding to the bias term. This representation is used simply because of the convenient form which we obtain for $X^TX$. The final result remains unaffected as long as $y = X\boldsymbol{\beta}$ represents the same linear system.

Now

$X^{T}X = \frac{1}{4}\begin{bmatrix} 15 & 9 & 5 & -3 \\ 9 & 15 & 3 & -5 \\ 5 & 3 & 15 & -9 \\ -3 & -5 & -9 & 15 \end{bmatrix}$

and

$X^{T}y = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}$.

Thus we need to solve for $\hat{\boldsymbol{\beta}}$ from $X^TX\hat{\boldsymbol{\beta}} = X^Ty$.
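As a classical sanity check (not part of the quantum circuit), this $4\times4$ system can be solved directly:

```python
import numpy as np

XtX = np.array([[15,  9,  5, -3],
                [ 9, 15,  3, -5],
                [ 5,  3, 15, -9],
                [-3, -5, -9, 15]]) / 4.0
Xty = np.full(4, 0.5)

beta_hat = np.linalg.solve(XtX, Xty)

# the solution is proportional to (-1, 7, 11, 13)
assert np.allclose(beta_hat, np.array([-1, 7, 11, 13]) / 32)
```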

**Fig. 2:** Quantum circuit for solving a $4\times4$ system of linear equations $Ax=b$. The top qubit ($|s\rangle$) is the ancilla qubit. The four qubits in the middle ($|j_0\rangle, |j_1\rangle, |j_2\rangle$ and $|j_3\rangle$) form the clock register $C$. The two qubits at the bottom ($|q_0\rangle$ and $|q_1\rangle$) form the input register $I$, and two Hadamard gates are applied on them to initialize the state $|b\rangle$.

## Quantum Circuit

Having discussed the general idea behind the HHL algorithm and its possible application in drastically speeding up multiple regression, we now move on to the quantum circuit design meant to solve the $4\times4$ linear system which we have already encountered, i.e., $X^TX\hat{\boldsymbol{\beta}} = X^Ty$. For the sake of convenience we will now denote $X^TX$ by $A$, $\hat{\boldsymbol{\beta}}$ by $x$ and $X^Ty$ by $b$. The circuit requires only 7 qubits, with 4 qubits in the clock register, 2 qubits in the input register and the remaining one as an ancilla qubit. At this point it is imperative to mention that we specifically chose the form of the regression data points in the previous Section such that $A$ turns out to be Hermitian, has four distinct eigenvalues of the form $\lambda_i = 2^{i-1}$, and $b$ has a convenient form which can be efficiently prepared by simply using two Hadamard gates.

$A = \frac{1}{4}\begin{bmatrix} 15 & 9 & 5 & -3 \\ 9 & 15 & 3 & -5 \\ 5 & 3 & 15 & -9 \\ -3 & -5 & -9 & 15 \end{bmatrix}$

$A$ is a Hermitian matrix with eigenvalues $\lambda_1 = 1, \lambda_2 = 2, \lambda_3 = 4$ and $\lambda_4 = 8$. The corresponding normalized eigenvectors, encoded in the quantum states $|u_j\rangle$, may be expressed as

$|u_1\rangle = \frac{1}{2}\left(-|00\rangle + |01\rangle + |10\rangle + |11\rangle\right)$

$|u_2\rangle = \frac{1}{2}\left(|00\rangle - |01\rangle + |10\rangle + |11\rangle\right)$

$|u_3\rangle = \frac{1}{2}\left(|00\rangle + |01\rangle - |10\rangle + |11\rangle\right)$

$|u_4\rangle = \frac{1}{2}\left(|00\rangle + |01\rangle + |10\rangle - |11\rangle\right)$

Also, $b = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \end{bmatrix}^T$ can be written as $\sum_{j=1}^{4}\beta_j|u_j\rangle$ where $\beta_j = \frac{1}{2}$ for each $j$.
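Both the eigenvalue spectrum of $A$ and the uniform overlaps $\beta_j$ can be confirmed numerically (eigenvector signs are fixed only up to convention, so we compare magnitudes):

```python
import numpy as np

A = np.array([[15,  9,  5, -3],
              [ 9, 15,  3, -5],
              [ 5,  3, 15, -9],
              [-3, -5, -9, 15]]) / 4.0
b = np.full(4, 0.5)

lam, V = np.linalg.eigh(A)                  # eigenvalues in ascending order
assert np.allclose(lam, [1, 2, 4, 8])

# each eigenvector overlaps b with magnitude beta_j = 1/2
beta = V.conj().T @ b
assert np.allclose(np.abs(beta), 0.5)
```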

We will now trace through the quantum circuit in Fig. (2). $∣q_{0}⟩$ and $∣q_{1}⟩$ are the input register qubits which are initialized to a combined quantum state

$|b\rangle = \frac{1}{2}|00\rangle + \frac{1}{2}|01\rangle + \frac{1}{2}|10\rangle + \frac{1}{2}|11\rangle$

which is basically the state-encoded format of $b$. This is followed by the quantum phase estimation step, which involves a Walsh-Hadamard transform on the clock register qubits $|j_0\rangle, |j_1\rangle, |j_2\rangle, |j_3\rangle$, the clock-register-controlled unitary gates $U^{2^0}, U^{2^1}, U^{2^2}$ and $U^{2^3}$ where $U = \exp(iAt/16)$, and an inverse quantum Fourier transform on the clock register. As discussed in the Section above, this step produces the state $\frac{1}{2}|0001\rangle_C|u_1\rangle_I + \frac{1}{2}|0010\rangle_C|u_2\rangle_I + \frac{1}{2}|0100\rangle_C|u_3\rangle_I + \frac{1}{2}|1000\rangle_C|u_4\rangle_I$, which is essentially the same as $\sum_{j=1}^{4}\beta_j|u_j\rangle_I\left|\frac{\tilde{\lambda}_j t_0}{2\pi}\right\rangle_C$, assuming $t_0 = 2\pi$. Also, in this specific example $|\tilde{\lambda}_j\rangle = |\lambda_j\rangle$, since the 4 qubits in the clock register are sufficient to represent the 4 eigenvalues exactly in binary. As far as the endianness of the combined quantum states is concerned, we must keep in mind that in our circuit $|j_0\rangle$ is the most significant qubit (MSQ) and $|j_3\rangle$ is the least significant qubit (LSQ) of the clock register.

Next is the $R(\tilde{\lambda}^{-1})$ rotation step. We make use of an ancilla qubit $|s\rangle$ (initialized in the state $|0\rangle$), which gets rotated depending upon the clock register's state. Let us take an example clock register state $|0100\rangle_C = |0\rangle_{j_0}\otimes|1\rangle_{j_1}\otimes|0\rangle_{j_2}\otimes|0\rangle_{j_3}$ (the binary representation of the eigenvalue corresponding to $|u_3\rangle$, that is, 4). In this combined state, $|j_1\rangle$ is in the state $|1\rangle$ while $|j_0\rangle, |j_2\rangle$ and $|j_3\rangle$ are all in the state $|0\rangle$. This state will only trigger the $R_y\!\left(\frac{8\pi}{4\cdot 2^{r}}\right)$ rotation gate, and none of the other rotation gates. Thus, we may say that the smallest eigenvalue states in $C$ cause the largest ancilla rotations. Using linearity arguments, it is clear that if the clock register had instead held the superposition generated from $|b\rangle$, as in our original example, the final state after this rotation step would be $\sum_{j=1}^{4}\beta_j|u_j\rangle_I|\tilde{\lambda}_j\rangle_C\left(\sqrt{1-\frac{C^2}{\tilde{\lambda}_j^2}}\,|0\rangle + \frac{C}{\tilde{\lambda}_j}|1\rangle\right)_S$ where $C = 8\pi/2^{r}$. For this step, a priori knowledge of the eigenvalues of $A$ was necessary to design the gates.

Then, as elaborated above, the inverse phase estimation step essentially reverses the quantum phase estimation step. The state produced by this step, conditioned on obtaining $|1\rangle$ in the ancilla, is $\frac{8\pi}{2^{r}}\sum_{j=1}^{4}\frac{1/2}{2^{j-1}}|u_j\rangle$. Upon writing it in the standard basis and normalizing, it becomes $\frac{1}{\sqrt{340}}\left(-|00\rangle + 7|01\rangle + 11|10\rangle + 13|11\rangle\right)$. This is proportional to the exact solution of the system, $x = \frac{1}{32}\begin{bmatrix} -1 & 7 & 11 & 13 \end{bmatrix}^T$.
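The normalization constant and the ideal measurement statistics of this final state are easy to verify:

```python
import numpy as np

x = np.array([-1.0, 7.0, 11.0, 13.0]) / 32.0     # exact solution from the text
ket_x = x / np.linalg.norm(x)

# normalization is 1/sqrt(340), since 1 + 49 + 121 + 169 = 340
assert np.allclose(ket_x, np.array([-1, 7, 11, 13]) / np.sqrt(340))

# ideal probabilities of measuring 00, 01, 10, 11 in the input register:
# 1/340, 49/340, 121/340, 169/340
print(np.round(ket_x ** 2, 3))
```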

The Group Leaders Optimization Algorithm (GLOA) is employed to approximately decompose the $\exp\!\left(\frac{iAt}{2^{k}}\right)$ gates in the Hamiltonian simulation step into elementary quantum gates.

## Quantum Circuit Simulation

```python
from qiskit import QuantumCircuit, execute
from strangeworks.qiskit import get_backend
from math import pi

qc = QuantumCircuit(7, 2)

# Initially applying Hadamard gates
qc.h(1)
qc.h(2)
qc.h(3)
qc.h(4)
qc.h(5)
qc.h(6)

# Application of exp(iAt/16) operator
qc.h(6)
qc.ccx(1, 5, 6)
qc.h(6)
qc.cu3(0.196, -pi/2, pi/2, 1, 6)
qc.cu3(pi/2, pi/2, -pi/2, 1, 6)
qc.u1(0.379, 1)
qc.cu3(0.981, -pi/2, pi/2, 1, 5)
qc.u1(0.589, 1)
qc.ccx(1, 5, 6)
qc.cu3(0.196, -pi/2, pi/2, 1, 5)
qc.ccx(1, 5, 6)
qc.h(6)
qc.ccx(1, 5, 6)
qc.h(6)

# Application of exp(iAt/8) operator
qc.h(6)
qc.ccx(2, 5, 6)
qc.h(6)
qc.cu3(1.963, -pi/2, pi/2, 2, 6)
qc.cu3(pi/2, pi/2, -pi/2, 2, 6)
qc.u1(1.115, 2)
qc.cu3(1.963, -pi/2, pi/2, 2, 5)
qc.u1(2.615, 2)
qc.ccx(2, 5, 6)
qc.cu3(0.178, -pi/2, pi/2, 2, 5)
qc.ccx(2, 5, 6)
qc.h(6)
qc.ccx(2, 5, 6)
qc.h(6)

# Application of exp(iAt/4) operator
qc.h(6)
qc.ccx(3, 5, 6)
qc.h(6)
qc.cu3(-0.785, -pi/2, pi/2, 3, 6)
qc.cu3(pi/2, pi/2, -pi/2, 3, 6)
qc.u1(1.017, 3)
qc.cu3(3.927, -pi/2, pi/2, 3, 5)
qc.u1(2.517, 3)
qc.ccx(3, 5, 6)
qc.cu3(2.356, -pi/2, pi/2, 3, 5)
qc.ccx(3, 5, 6)
qc.h(6)
qc.ccx(3, 5, 6)
qc.h(6)

# Application of exp(iAt/2) operator
qc.h(6)
qc.ccx(4, 5, 6)
qc.h(6)
qc.cu3(-9.014e-9, -pi/2, pi/2, 4, 6)
qc.cu3(pi/2, pi/2, -pi/2, 4, 6)
qc.u1(-0.750, 4)
qc.cu3(1.571, -pi/2, pi/2, 4, 5)
qc.u1(0.750, 4)
qc.ccx(4, 5, 6)
qc.cu3(-1.571, -pi/2, pi/2, 4, 5)
qc.ccx(4, 5, 6)
qc.h(6)
qc.ccx(4, 5, 6)
qc.h(6)

# Applying the inverse quantum Fourier transform on the clock register
qc.h(1)
qc.cu1(pi/2, 2, 1)
qc.cu1(pi/4, 3, 1)
qc.cu1(pi/8, 4, 1)
qc.h(2)
qc.cu1(pi/2, 3, 2)
qc.cu1(pi/4, 4, 2)
qc.h(3)
qc.cu1(pi/2, 4, 3)
qc.h(4)

# Applying the controlled-Ry rotations on the ancilla qubit
qc.cu3(8*pi/2, 0, 0, 1, 0)
qc.cu3(4*pi/2, 0, 0, 2, 0)
qc.cu3(2*pi/2, 0, 0, 3, 0)
qc.cu3(pi/2, 0, 0, 4, 0)

# Applying the quantum Fourier transform on the clock register
qc.h(4)
qc.cu1(-pi/2, 4, 3)
qc.h(3)
qc.cu1(-pi/4, 4, 2)
qc.cu1(-pi/2, 3, 2)
qc.h(2)
qc.cu1(-pi/8, 4, 1)
qc.cu1(-pi/4, 3, 1)
qc.cu1(-pi/2, 2, 1)
qc.h(1)

# Application of exp(-iAt/2) operator
qc.h(6)
qc.ccx(4, 5, 6)
qc.h(6)
qc.ccx(4, 5, 6)
qc.cu3(-1.571, -pi/2, pi/2, 4, 5)
qc.ccx(4, 5, 6)
qc.u1(0.750, 4)
qc.cu3(1.571, -pi/2, pi/2, 4, 5)
qc.u1(-0.750, 4)
qc.cu3(pi/2, pi/2, -pi/2, 4, 6)
qc.cu3(-9.014e-9, -pi/2, pi/2, 4, 6)
qc.h(6)
qc.ccx(4, 5, 6)
qc.h(6)

# Application of exp(-iAt/4) operator
qc.h(6)
qc.ccx(3, 5, 6)
qc.h(6)
qc.ccx(3, 5, 6)
qc.cu3(2.356, pi/2, -pi/2, 3, 5)
qc.ccx(3, 5, 6)
qc.u1(-2.517, 3)
qc.cu3(3.927, pi/2, -pi/2, 3, 5)
qc.u1(-1.017, 3)
qc.cu3(pi/2, 3*pi/2, -3*pi/2, 3, 6)
qc.cu3(-0.785, pi/2, -pi/2, 3, 6)
qc.h(6)
qc.ccx(3, 5, 6)
qc.h(6)

# Application of exp(-iAt/8) operator
qc.h(6)
qc.ccx(2, 5, 6)
qc.h(6)
qc.ccx(2, 5, 6)
qc.cu3(0.178, pi/2, -pi/2, 2, 5)
qc.ccx(2, 5, 6)
qc.u1(-2.615, 2)
qc.cu3(1.963, pi/2, -pi/2, 2, 5)
qc.u1(-1.115, 2)
qc.cu3(pi/2, 3*pi/2, -3*pi/2, 2, 6)
qc.cu3(1.963, pi/2, -pi/2, 2, 6)
qc.h(6)
qc.ccx(2, 5, 6)
qc.h(6)

# Application of exp(-iAt/16) operator
qc.h(6)
qc.ccx(1, 5, 6)
qc.h(6)
qc.ccx(1, 5, 6)
qc.cu3(0.196, pi/2, -pi/2, 1, 5)
qc.ccx(1, 5, 6)
qc.u1(0.589, 1)
qc.cu3(0.981, pi/2, -pi/2, 1, 5)
qc.u1(0.379, 1)
qc.cu3(pi/2, 3*pi/2, -3*pi/2, 1, 6)
qc.cu3(0.196, pi/2, -pi/2, 1, 6)
qc.h(6)
qc.ccx(1, 5, 6)
qc.h(6)

# Measure the input register
qc.measure([5, 6], [0, 1])

# Get a Strangeworks backend and execute the circuit
backend = get_backend("qasm_simulator")
execute(qc, backend, shots=8192)
```

## Execution results

After executing the circuit on the Strangeworks backend, the outcomes 00, 01, 10, and 11 are obtained with probabilities 22%, 25%, 27%, and 26% respectively, which shows the required solution state.

## Conclusion

To conclude, we have seen that any multiple regression problem can be reduced to an equivalent linear system of equations. This allows us to employ general quantum speed-up techniques like the HHL algorithm. However, for the HHL we need low-error, low-cost circuits for the Hamiltonian simulation and controlled rotation steps to get accurate results. Although methods like the Trotter-Suzuki decomposition and the Solovay-Kitaev algorithm can help to decompose the Hamiltonian simulation, except in certain cases they provide neither minimum-cost nor efficient gate sequences. In most practical scenarios we would prefer a low-cost gate sequence which approximates the unitary operator well, rather than a high-cost exact decomposition. This is where stochastic genetic algorithms like the GLOA come into play. However, as we noted earlier, the GLOA turns out to be useful only in those cases where the matrix exponentials can be efficiently calculated; otherwise the speedup provided by the HHL would be nullified. One might also argue that the large time taken to find the correct set of gates for the HHL, given a particular $A$, renders the quantum speed-up achieved useless. That is indeed true for smaller datasets, but one should recall that the discussed technique reveals its true speed-up only for large datasets. This is because we restrict the number of gates used by the GLOA to a maximum of 20. So while the number and size of the data sets can be arbitrarily large, only a limited number of gates will ever be used for approximating the Hamiltonian simulation step of the HHL.

## References

Sanchayan Dutta, Adrien Suau, Sagnik Dutta, Suvadeep Roy, Bikash K. Behera and Prasanta K. Panigrahi, Demonstration of a Quantum Circuit Design Methodology for Multiple Regression, arXiv:1811.01726