# Running a calculation

## Contents |

## Introduction to O(N) BigDFT

The purpose of this lesson is to introduce linear-scaling
BigDFT and explain how to do to basic calculations. Make sure you have read the relevant papers in articles describing BigDFT before beginning. As discussed on the "Input variables" page, we require a coordinate file `posinp.xyz` and an `input.yaml` file.

### Methanol

We begin with the example of methanol, using the coordinates in `methanol.xyz` :

6 angstroem free C -0.047131 0.664389 0.000000 O -0.047131 -0.758551 0.000000 H -1.092995 0.969785 0.000000 H 0.878534 -1.048458 0.000000 H 0.437145 1.080376 0.891772 H 0.437145 1.080376 -0.891772

We will first run a cubic-scaling calculation as a point of reference, using the file `methanol_cubic.yaml`:

posinp: methanol dft: hgrids: 0.4 rmult: [5.0, 7.0] ixc: LDA (ABINIT) disablesym: Yes

The variable `posinp` is used to distinguish between the name of the `yaml` input and the position file, so that several different calculations can be run with different `yaml` inputs but the same atomic coordinates. Run BigDFT using

./bigdft methanol_cubic

This will generate the logfile `log-methanol_cubic.yaml`, which we will use as a reference for the energy, forces and KS eigenvalues.

We now proceed to running the first O(N) calculations. We begin by running a calculation using the diagonalization approach. There are many parameters which can be varied, however it is possible to achieve robust and reliable results using a pre-defined set of values for the majority of these variables. We can exploit this by importing a `profile`, which already defines a set of common values. This is achieved by simply adding a line like the following to the input file:

import: linear

This imports all the information from the profile, and we can then add or overwrite those values if we choose to. Currently available profiles can be found in the file `bigdft/src/input_variables_definition.yaml` where you can also find information about the input parameters.

In this case, we use a profile, and also specify some parameters as defined in the file `methanol_linear.yaml`:

import: linear posinp: methanol dft: hgrids: 0.4 rmult: [5.0, 7.0] ixc: LDA (ABINIT) disablesym: Yes lin_general: subspace_diag: No lin_kernel: nstep: 2 alphamix: 0.3 linear_method: DIAG lin_basis_params: rloc_kernel: 9.0 rloc_kernel_foe: 10.0 C: nbasis: 4 rloc: 5.5 O: nbasis: 4 rloc: 5.5 H: nbasis: 1 rloc: 5.0

We can then re-run BigDFT using this input file. A reasonable value for the number of MPI tasks cores is about one fifth of the total number of support functions. This can be increased (with a negative impact on the efficiency) up to a maximum of one support function/MPI.

We can easily rerun with FOE by modifying the following variable:

linear_method: FOE

or with direct minimization by modifying:

linear_method: DIRMIN alphamix: 1.0

For large systems the diagonalization test will be much more expensive, but for a system of this size all three methods take a similar amount of time. Run with all three and compare the outputs. You should notice that all three converge in roughly 13 iterations of the outer loop to a final energy of -24.021 Ha (grep for `delta out` if you want to see an overview of the outer loop convergence - see below for more information on the output file), compared to -24.028 Ha for the cubic-scaling code. There are some small differences in the energy given by the different methods for calculating the density kernel, but these are within the expected error for the given basis and convergence parameters.

You may also have noticed that for FOE there are no KS eigenvalues printed, while for direct minimization they are only written at the beginning where the input guess orbitals are printed, compared to diagonalization where they are printed at each kernel iteration. Why is this? Well for FOE we aren't calculating the KS states, while for direct minimization we are calculating an arbitrary linear combination of the eigenstates, so in each case we don't have access to the eigenvalues. If you do want the eigenvalues, we can still use FOE or direct minimization, we simply have to perform a diagonalization at the end of the calculation. Change the value of `subspace_diag` to `Yes` and rerun. You will see that there is an additional kernel update at the end of the calculation and the eigenvalues are now printed (note that the occupancies given here are incorrect). Comparing to the diagonalization calculation there are some small differences, these are expected to be bigger than the difference in the total energy, but in both cases this difference will decrease if the convergence thresholds are modified. Note that only the occupied (and in some cases low energy unoccupied states) are expected to be accurate, as the support functions are optimized to minimize the occupied states. If the unoccupied states are required, a small number can be explicitly included in the minimization procedure using the direct minimization approach.

**Exercise**: Run the above inputs with increasing localization radii. What's the closest you can get to the cubic-scaling energy? Compare also the forces and KS eigenvalues.

### The output file

Much of the BigDFT output is identical to the cubic-scaling version (e.g. the forces, the noise, the stress tensor etc.). The following are some notable exceptions:

`gnrm`: The gradient of the support functions, which should reach the convergence threshold specified by `gnrm_cv`. If this value is chosen too small, the code has some tools to detect it and might already exit sooner (if `min_gnrm_for_dynamic` is reached).

`delta`: The mean change of the density between two iterations of the kernel optimization. This value should converge to the specified threshold `rpnrm_cv`.

`delta out`: The mean change of the density between two outer loop iterations (support function and kernel optimization). This value should converge to the specified threshold `rpnrm_cv` and is the final stop criterion.

`max dev from unity`: Gives the maximal deviation of the overlap matrix from the identity. This value should never become larger than about 0.1-0.2. If it does, this indicates that the chosen cutoff radii are too small.

`max/mean error`: Gives the error of the various matrix powers (inversion, square root etc.). This value should not increase too much and stay of the order of 1.0e−8 . If the error becomes too large, depending on the selected method for calculating the matrices, the code will try to increase the precision of the matrix inversions. If this still does not help, you will see the warning “Taylor order reached maximum”. In such a case you should try to increase the cutoff radii or change the method used.

`check boundary values`: Gives the weight of the support functions which is located on the boundary of the localization regions. This value should not become much larger than 1.0e-2 and is also an indicator that the localization radii might need to be increased.

### Spin polarized calculations

In this section we briefly want to have a look at spin polarized calculations. In this case, the number of up and down spins is different, and we thus need two sets of support functions and kernels (one for the up and one for the down part). Obviously this makes the calculations heavier.

As example we use O_2. Use the following `posinp.xyz`

2 angstroem free O 0.0 0.0 0.0 2 O 0.0 0.0 1.208 0

and the following `input.yaml`:

import: linear dft: nspin: 2 mpol: 2 lin_general: charge_multipoles: 0 lin_basis_params: O: nbasis: 9 ig_occupation: O: 2s: [1.0, 1.0] 2p: - 1.0 - 1.0 - 1.0 - 1/3 - 1/3 - 1/3 3d: 0.0

We start the run with a polarization of 2. Can you see in the output that this is as well the polarization that we get in the end?

Now let's see what happens if we run a closed shell (i.e. not spin polarized) calculation. Can you modify the input.yaml in such a way as to run a closed shell calculation? If you have troubles you can find the solution here: `O2.yaml`

Run this closed shell calculation and compare the final result for the energy with the one from the spin polarized case. Which one is lower?

### Metallic systems

This tutorial needs some improvement, proceed with caution!

The linear version performs best for system with a finite HOMO-LUMO gap. Nevertheless we can also treat metallic systems.

As simple example we take one single copper atom. Create a xyz file with a Cu atom at the position (0.0,0.0,0.0). Then copy the following to your input.yaml and launch the calculation:

import: linear lin_general: subspace_diag: yes perf: foe_gap: yes lin_kernel: nit: 6 alphamix: 0.1 lin_basis_params: ao_confinement: -1.0 confinement: -1.0 rloc_kernel: 10.0 rloc_kernel_foe: 12.0 Cu: {nbasis: 9, rloc: 7.0} ig_occupation: Cu: 3d: 10.0 4s: 1.0 4p: 0.0 chess: foe: fscale: 5.e-3

You can already see in the first few iterations that the code has convergence problems. For metallic systems this can be an indication that the mixing parameters is chosen too large ("charge sloshing"). Can you try to change the mixing parameter to a smaller value in order to avoid this problem? Hint 1: The corresponding input variable is already contained in the input.yaml file. Hint 2: If you reduce the mixing factor it is also recommendable to increase the number of iterations in the mixing loop.

If you don't find parameters which work, you can find here a solution:`Cu.yaml`

### Troubleshooting

The default values used by BigDFT are usually quite stable. Nevertheless, this run shall illustrate some typical problems that could arise.

For this test we will use a small silicon cluster. Copy the following to a file posinp.xyz

4 angstroem free Si 0.000 1.280 0.000 Si 0.000 0.000 -2.047 Si 0.000 0.000 2.047 Si 0.000 -1.280 0.000

In the file input.yaml we simply take the default values:

import: linear

Run this calculation. You will note that it takes more iteration to converge than for the other systems ran so far.

It turns out that for Silicon you often need to include support functions with a d character. That is, instead of the standard setup with 4 (1 s-like, 3 p-like) you now need 9 support functions. Can you modify the input.yaml file in such a way as to use 9 basis support functions per atom, where the additional 5 have d character?

Hint: you need to modify the fields `lin_basis_params` and `ig_occupation`.

You can find the solution here:
`Si4.yaml`

## Post-processing

Some tips for post-processing. Note you must have specified the printing of the support function matrices in the sparse format!

For the post-processing go into the data directory that has been created.

**multipole analysis** :

The following performs an atomic multipole analysis, based on the matrices calculated during the run.

bigdft-tool \ -a multipole-analysis \ --mpirun='' \ --overlap_file=overlap_sparse.txt \ --kernel_file=density_kernel_sparse.txt \ --kernel_matmul_file=density_kernel_sparse_matmul.txt \ --hamiltonian_file=hamiltonian_sparse.txt \ --metadata_file=sparsematrix_metadata.dat \ --multipole_matrix_0_0=mpmat_0_0.txt \ --multipole_matrix_1_0=mpmat_1_-1.txt \ --multipole_matrix_1_1=mpmat_1_0.txt \ --multipole_matrix_1_2=mpmat_1_1.txt \ --multipole_matrix_2_0=mpmat_2_-2.txt \ --multipole_matrix_2_1=mpmat_2_-1.txt \ --multipole_matrix_2_2=mpmat_2_0.txt \ --multipole_matrix_2_3=mpmat_2_1.txt \ --multipole_matrix_2_4=mpmat_2_2.txt

**pdos** :

This generates a gnuplot file PDoS.gp ready to be plotted (you only have to adjust the colors and label according to your needs). To do so you first have to diagonalize the Hamiltonian:

bigdft-tool \ -a solve-eigensystem \ --mpirun='' \ --matrix_format='serial_text' \ --metadata_file=sparsematrix_metadata.dat \ --overlap_file=overlap_sparse.txt \ --hamiltonian_file=hamiltonian_sparse.txt \ --coeff_file=coeffs.txt

This will write the eigenvectors in the file coeffs.txt.

Now you can calculate the partial density of states:

bigdft-tool \ -a pdos \ --mpirun='' \ --matrix_format='serial_text' \ --metadata_file=sparsematrix_metadata.dat \ --overlap_file=overlap_sparse.txt \ --hamiltonian_file=hamiltonian_sparse.txt \ --coeff_file=coeffs.txt \ --pdos_file=pdos.dat

The file pdos.dat indicates which pdos shall be calculated. For the example of methanol you can use the following one:

4 # number of pdos # C C 1 # for the first pdos, take support function number 1 of the carbon atoms C 2 # ... plus also number 2 C 3 # ... plus also number 3 C 4 # ... plus also number 4 # O O 1 # for the second pdos, take support function number 1 of the oxygen atoms O 2 # ... plus also number 2 O 3 # ... plus also number 3 O 4 # ... plus also number 4 # H H 1 # for the third pdos, take support function number 1 of the hydrogen atoms # total C 1 # For the total dos, take all support functions C 2 # C 3 # C 4 # O 1 # O 2 # O 3 # O 4 # H 1 #