Quantum Evolution Kernel (QEK) application
How it works
The Quantum Evolution Kernel (QEK) application's components are summarized in the figure below. You will find a more comprehensive description of them in the appendix and throughout the notebook tutorial. The color scheme is the following:
- orange components represent input data steps
- yellow components are needed to configure the quantum computations
- green components perform the computations themselves.
Finally, the arrows represent how the components are connected among themselves and the order in which they need to be executed.
IMPORTANT
Components are thus steps with order constraints. For example, "dataset" must run before "data_subset".
These ordering constraints are enforced automatically. For instance:
- if you skip "dataset" and start directly with "data_subset", "dataset" will first run automatically using its default configuration
- if you rerun "dataset" with a different configuration and then run "seq_register", "data_subset" will first rerun using the new configuration of "dataset"
To run a step, call it as a method on the application and pass the relevant parameters described in the appendix. For example, if qek is the application, you can run the data_subset step as follows:
qek.data_subset(size=10)
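The automatic ordering described above can be illustrated with a small, self-contained sketch. This is not the QEK application's code: the `Pipeline` class, its attributes, and the simplified linear chain of three steps are hypothetical, and the choice to rerun a stale step with its last configuration is an assumption made for the example.

```python
# Hypothetical sketch of automatic step ordering; not the QEK application's code.
ORDER = ["dataset", "data_subset", "seq_register"]  # simplified linear chain


class Pipeline:
    def __init__(self):
        self.configs = {}   # step name -> configuration used for its last run
        self.fresh = set()  # steps whose results are up to date
        self.log = []       # executed steps, in execution order

    def run(self, step, **config):
        idx = ORDER.index(step)
        # Run any missing or stale upstream step first, reusing its last
        # configuration (or an empty "default" one if it never ran).
        for upstream in ORDER[:idx]:
            if upstream not in self.fresh:
                self._execute(upstream, self.configs.get(upstream, {}))
        self._execute(step, config)
        # Results of downstream steps are now outdated.
        for downstream in ORDER[idx + 1:]:
            self.fresh.discard(downstream)

    def _execute(self, step, config):
        self.configs[step] = config
        self.fresh.add(step)
        self.log.append((step, dict(config)))


pipeline = Pipeline()
pipeline.run("data_subset", size=10)       # "dataset" runs first with defaults
pipeline.run("dataset", dataset="PTC_FM")  # invalidates "data_subset"
pipeline.run("seq_register", index=0)      # "data_subset" reruns first
print([name for name, _ in pipeline.log])
```

The printed list shows each step that actually ran, including the ones triggered automatically.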
Details of components
This appendix details, for each component, all the configuration parameters that may be set when running it.
dataset
Load a dataset that will be used as input to the data_subset component.
- dataset (type=str, default=alkane, optional): Possible values:
  - 'alkane': This dataset contains a set of molecules belonging to the alkane family. An alkane molecule is composed of carbon and hydrogen atoms where all the carbon-carbon bonds are single bonds. For simplicity, in this tutorial the hydrogen atoms have been stripped out and only carbon atoms are kept. Alkane molecules are extremely important for several industrial processes such as the production of industrial-grade oil and grease. The boiling point is a relevant physical property which can strongly affect the result of these processes.
    The goal of the machine learning model developed in this notebook is to predict the boiling point of each alkane molecule given a training dataset where the boiling points are known. This is therefore a REGRESSION problem.
  - 'PTC_FM': This dataset contains a set of molecules which can be either toxic or not toxic for humans.
    The goal of the machine learning model developed in this notebook is to predict the toxicity of each molecule given a training dataset where the toxicity is known. This is therefore a CLASSIFICATION problem.
- folder (type=str, default=/data/gml, optional): The folder containing the data. Not needed if the dataset name is specified.
data_subset
Choose a subset on which all subsequent operations will be performed. Typically, one should begin with a small subset and perform fast simulations with different configuration parameters. When satisfied, one can select a larger subset or the full dataset to check how the given configuration generalizes.
After running the component, you may additionally call the plot_graphs() method or plot_graphs(index) to plot all graphs or a single graph in the subset.
- size (type=int, default=None, optional): If None, use the full dataset; otherwise, limit its size to the specified number of graphs.
- plot (type=bool, default=False, optional): Whether or not to plot the graphs with plot_graphs().
device
The quantum device with realistic constraints which will be used in the subsequent simulations.
- device (type=str, default=QPUGen1, optional): The quantum device to use in the simulations. Its constraints must be met by both registers and pulse sequences. Currently, only 'QPUGen1' can be selected.
- noise (type=bool, default=False, optional): Whether or not to enable the simulation of noise. Only available on the EMU_FREE backend (see below).
seq_register
Choose a graph in the dataset and showcase how it is converted into a register of neutral atoms in the PASQAL quantum machine.
After running the component, you may additionally call plot_sample() to plot the graph and the register and compare them.
- index (type=int, default=0, optional): Sample index in the data subset.
- plot (type=bool, default=False, optional): Whether or not to plot the embedded register created from the graph with plot_sample().
backend
The backend on which the simulations will be executed. This can either be an emulator of the PASQAL device or a real quantum processing unit.
- device_type (type=str, default=local, optional): The backend where the pulse sequences are executed. Possible values:
  - 'local': Pulser simulation on the local machine.
  - 'EMU_FREE': Basic statevector emulator for a small number of qubits. It can also add realistic noise to the simulations.
  - 'EMU_TN': Proprietary, high-performance PASQAL emulator using an approximation based on tensor networks. This emulator can reach a larger number of qubits in the simulation but does not support adding realistic noise.
seq_build
Builds a sequence of pulses on the atomic register previously constructed. The quantum state resulting from the execution of the pulse sequence is used to build the kernel matrix (see below).
After running the component, you may additionally call plot_sequence() to plot the pulse sequence.
- n_layers (type=int, default=2, optional): Number of layers in the pulse sequence. Each layer includes a graph embedding pulse, which encodes the graph on the atomic register, and a mixing pulse for tuning the quantum kernel accuracy.
- layer_duration (type=int, default=100, optional): The duration in nanoseconds of the mixing and graph embedding pulses.
- n_samples (type=int, default=10000, optional): Number of samples to use for evaluating the probability distribution. A larger number of samples gives a more accurate measurement of the quantum state but results in a longer simulation, both for emulators and for real quantum processing units.
- plot (type=bool, default=True, optional): Whether or not to draw the sequence using plot_sequence().
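To give a feel for the n_samples trade-off, the sketch below draws samples from a made-up two-qubit distribution and compares the observed frequencies with the true probabilities. It uses only the Python standard library and does not call the QEK application; the distribution, the function names, and the seed are illustrative assumptions.

```python
import random
from collections import Counter


def estimate_distribution(true_probs, n_samples, seed=0):
    """Draw n_samples outcomes and return the observed frequencies."""
    rng = random.Random(seed)
    outcomes = list(true_probs)
    weights = [true_probs[o] for o in outcomes]
    draws = rng.choices(outcomes, weights=weights, k=n_samples)
    counts = Counter(draws)
    return {o: counts[o] / n_samples for o in outcomes}


def max_error(true_probs, estimate):
    """Largest absolute deviation between true and estimated probabilities."""
    return max(abs(true_probs[o] - estimate[o]) for o in true_probs)


# Made-up two-qubit distribution, for illustration only.
true_probs = {"00": 0.5, "01": 0.2, "10": 0.2, "11": 0.1}
for n in (100, 10_000):
    estimate = estimate_distribution(true_probs, n)
    print(n, round(max_error(true_probs, estimate), 4))
```

The statistical error of each estimated probability typically shrinks as one over the square root of the number of samples, which is why more samples cost more runtime for a slowly improving accuracy.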
seq_sample
Extracts the resulting quantum state, in the form of bitstrings, from the previously created pulse sequence and plots the bitstring distribution. The main goal of this component is to showcase the output of the quantum processing unit (or emulator).
After running the component, you may additionally call plot_distribution() to plot the distribution that results from running the feature function.
- plot (type=bool, default=True, optional): Whether or not to plot the distribution with plot_distribution().
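As a rough illustration of what a bitstring distribution looks like, the sketch below turns a list of measured bitstrings into normalized frequencies. It also computes the distribution of the number of excited qubits (the count of '1' characters), which is one possible feature; the source does not state which feature function the application uses, so treat both helpers and the sample data as assumptions.

```python
from collections import Counter


def bitstring_frequencies(bitstrings):
    """Map each observed bitstring to its relative frequency."""
    counts = Counter(bitstrings)
    total = sum(counts.values())
    return {b: c / total for b, c in sorted(counts.items())}


def excitation_distribution(bitstrings):
    """Relative frequency of each possible number of '1's per bitstring."""
    n_qubits = len(bitstrings[0])
    counts = Counter(b.count("1") for b in bitstrings)
    total = len(bitstrings)
    return [counts.get(k, 0) / total for k in range(n_qubits + 1)]


# Made-up measurement results on three qubits, for illustration only.
samples = ["010", "010", "101", "000", "010", "101", "000", "010"]
print(bitstring_frequencies(samples))
print(excitation_distribution(samples))
```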
kernel_matrix
Compute the kernel matrix using the chosen register and pulse sequence, plot it and return it.
After running the component, you may additionally call plot_matrix() to visualize it. Notice that the kernel matrix is a positive definite matrix where each element (i, j) represents the degree of similarity of graph i with graph j in the given dataset.
- plot (type=bool, default=True, optional): Whether or not to plot the kernel matrix with plot_matrix().
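To make the idea of a similarity matrix concrete, here is a minimal sketch that builds one from per-graph probability distributions. It assumes a kernel of the form exp(-mu * JS(P_i, P_j)), where JS is the Jensen-Shannon divergence; the source does not specify the formula used by the application, so the kernel form, the `mu` parameter, the function names, and the input distributions are all assumptions. The sketch also assumes that all distributions have the same length.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def kernel_matrix(distributions, mu=1.0):
    """Similarity matrix with entries exp(-mu * JS(P_i, P_j))."""
    n = len(distributions)
    return [
        [math.exp(-mu * js_divergence(distributions[i], distributions[j]))
         for j in range(n)]
        for i in range(n)
    ]


# Made-up distributions for three graphs, for illustration only.
dists = [
    [0.25, 0.5, 0.25, 0.0],
    [0.1, 0.4, 0.4, 0.1],
    [0.0, 0.0, 0.5, 0.5],
]
K = kernel_matrix(dists)
for row in K:
    print([round(v, 3) for v in row])
```

With this choice, each graph has similarity 1 with itself, the matrix is symmetric, and graphs whose distributions are closer get a larger entry.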
predictor
Perform machine learning predictions using the chosen pulse sequence as a basis for building the quantum kernel, and evaluate their quality against a test set.
After running the component, you may additionally call plot_results() or summary() to plot the results or print a text-based summary.
- test_fraction (type=float, default=0.25, optional): Fraction of the given dataset to take as a test set. Must be a floating-point value between 0 and 1.
- shuffle (type=bool, default=True, optional): Whether or not to shuffle the data before splitting.
- random_state (type=int, default=42, optional): Integer value that acts as a seed to make the training/test split reproducible. Only used if shuffle=True.
- print_summary (type=bool, default=True, optional): Whether or not to directly print insights about the prediction results with the summary() method.
- plot (type=bool, default=False, optional): Whether or not to call the plot_results() method.
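The roles of test_fraction, shuffle, and random_state can be sketched with a small standard-library example. This is not the application's implementation: the `split_indices` helper, its rounding rule, and the convention of taking the test set from the end are assumptions chosen for illustration.

```python
import math
import random


def split_indices(n_items, test_fraction=0.25, shuffle=True, random_state=42):
    """Return (train_indices, test_indices) for a dataset of n_items."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_items))
    if shuffle:
        # A dedicated generator seeded with random_state makes the
        # split reproducible across runs.
        random.Random(random_state).shuffle(indices)
    n_test = math.ceil(n_items * test_fraction)
    n_train = n_items - n_test
    return indices[:n_train], indices[n_train:]


train, test = split_indices(8)
print(train, test)
```

Calling the helper twice with the same random_state returns the same split, while shuffle=False keeps the original order and ignores the seed.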