# Fragment approach

## Contents |

## Fragment calculation for a water dimer

The aim of this lesson is to present how to do basic fragment calculations with BigDFT and explain how to do to basic calculations. It is recommended that you have read the relevant papers in articles describing BigDFT and go through the tutorials on linear-scaling BigDFT before beginning.

### Template calculation

We consider the example of a water dimer with varying separation *R* between the molecules. Since we have two identical water molecules, they can both be represented by the same template fragment. The first step is therefore to generate the support functions for an isolated water molecule. We will use the following `h2o.xyz` and `h2o.yaml` input files:

3 angstroemd0 free O -0.931477999671 0.004547999723 -0.117331000086 H -1.355753693973 -0.032108553467 0.725181911626 H -0.003078395366 0.011003111969 0.053703839371

import: linear dft: hgrids: 0.25 rmult: [5.0, 7.0] inputpsiid: linear ixc: LDA (ABINIT) lin_general: output_wf: 1 output_mat: 11 lin_kernel: alphamix: 0.3 linear_method: FOE lin_basis_params: O: nbasis: 4 ao_confinement: 1.0E-002 confinement: [5.0E-003, 0.0] rloc: 9.0 rloc_kernel: 12.0 rloc_kernel_foe: 14.0 H: nbasis: 1 ao_confinement: 1.0E-002 confinement: [5.0E-003, 0.0] rloc: 9.0 rloc_kernel: 12.0 rloc_kernel_foe: 14.0

Note the non-zero values for `output_wf` and `output_mat`, which specify that the support functions and associated matrices be written to file. You should also observe that we have chosen a rather small value for `hgrid`, which will ensure a more accurate interpolation when the support functions are roto-translated.

Fragment calculations require the template information to be in a sub-directory, so you should proceed to run BigDFT in the following manner:

mkdir data-dimer cd data-dimer ./bigdft h2o | tee h2o.out

At the end of the calculation, you will notice that another directory `data-h2o` been created by BigDFT, which contains a number of files for the support functions and matrices. With these outputs, we can proceed to do a fragment calculation.

### Fragment calculations

We will consider three different scenarios for the dimer: with the molecules at a short separation, close to the equilibrium and far apart. Note that in each case, we keep the molecules rigid, with neither the isolated or dimer system in a fully relaxed state, as this is only intended to serve as a model system. We therefore have three position inputs: `dimer_1.5.xyz`, `dimer_2.0.xyz` and `dimer_5.0.xyz`:

6 angstroemd0 free O -0.931477999671 0.004547999723 -0.117331000086 H -1.355753693973 -0.032108553467 0.725181911626 H -0.003078395366 0.011003111969 0.053703839371 O 1.500000000000 -0.003706000000 0.113495000000 H 1.796831000000 0.762122000000 -0.351887000000 H 1.773536000000 -0.747744000000 -0.399151000000

6 angstroemd0 free O -0.931477999671 0.004547999723 -0.117331000086 H -1.355753693973 -0.032108553467 0.725181911626 H -0.003078395366 0.011003111969 0.053703839371 O 2.000000000000 -0.003706000000 0.113495000000 H 2.296831000000 0.762122000000 -0.351887000000 H 2.273536000000 -0.747744000000 -0.399151000000

6 angstroemd0 free O -0.931477999671 0.004547999723 -0.117331000086 H -1.355753693973 -0.032108553467 0.725181911626 H -0.003078395366 0.011003111969 0.053703839371 O 5.000000000000 -0.003706000000 0.113495000000 H 5.296831000000 0.762122000000 -0.351887000000 H 5.273536000000 -0.747744000000 -0.399151000000

We of course also need a `dimer.yaml` input file:

dft: hgrids: 0.35 rmult: [5.0, 7.0] inputpsiid: linear_restart ixc: LDA (ABINIT) lin_general: nit: 1 kernel_restart_mode: kernel lin_basis: nit: 1 lin_kernel: nit: [20,100] alphamix: 0.3 rpnrm_cv: 1.0E-010 linear_method: FOE lin_basis_params: O: nbasis: 4 ao_confinement: 1.0E-002 confinement: [5.0E-003, 0.0] rloc: 9.0 rloc_kernel: 12.0 rloc_kernel_foe: 14.0 H: nbasis: 1 ao_confinement: 1.0E-002 confinement: [5.0E-003, 0.0] rloc: 9.0 rloc_kernel: 12.0 rloc_kernel_foe: 14.0 frag: h2o: [1, 2]

Compared to a standard linear-scaling calculation, there are not many big changes to the input. You will of course notice that we have specified that this is a restart calculation via the `inputpsiid` variable, and we have allowed for a large number of kernel iterations while eliminating basis optimization by setting `nit` to 1 in the `lin_basis` block. Most importantly, we have also added a `frag` block, which is where we we specify the number of fragments and their types. In this example it is straightforward, as we simply have two fragments of the same name. More complicated systems are of course also possible, as will be seen in later examples.

**Exercise**: Using the above inputs, run fragment calculations for the three different coordinates. Make sure that you rename the files appropriately for each calculation, so that the `yaml`, `xyz` and `data` have the same label, either by copying files or using symbolic links. **Copy for each calculation in the directory data-label the file h2o.xyz and the data data-h2o.**
Also run all three systems without fragments; you can reuse the file for generating the template support functions if you change

`hgrids`to 0.35 and switch off the output of the support functions and other associated quantities.

Compare the linear and fragment energies for the three different distances. You will notice that they are very similar for the large separation where the molecules are weakly interacting, with a moderate error for the equilibrium calculation and a very large error for the short distance. This is unsurprising, as the fragment approach is intended for well separated molecules and is expected to break down at small distances. In other terms, the **basis set superposition error** (BSSE) is large. When using the fragment approach it is therefore important to consider whether the approximation that it represents is valid in your case.

### Increasing the basis

Nonetheless, let us assume that in this case we insist on using the fragment approach. What can we do to improve the accuracy? One option is to increase the number of degrees of freedom by increasing the number of support functions per atom. Let's try adding another s-type support function to each atom. Returning to the template calculation, change `nbasis` to 2 for hydrogen and 5 for oxygen. Whenever we vary the number of support functions from the default values, we have to specify the electronic configuration. To do this, add the following block to the end of the input file:

ig_occupation: H: 1s: 1.0 2s: 0.0 O: 2s: 2.0 2p: [4/3, 4/3, 4/3] 3s: 0.0

Rerun the template calculation in the `data-dimer` directory, copying the input files to `h2o_25.yaml` and `h2o_25.xyz` to avoid overwriting the previous template support functions. This will produce a directory called `data-h2o_25` containing the relevant outputs. You will notice that the energy is virtually the same as for a smaller basis, as we already had enough degrees of freedom to represent the isolated molecule.

We can now rerun the fragment calculations by modifying the fragment block:

frag: h2o_25: [1, 2]

Rerun the calculation for any of the three distances. You'll notice that BigDFT stops quickly, giving an error message, both in the main output file and in an additional output file called `bigdft-err-0-0.yaml`. Why? Well we changed the number of support functions in the template calculation, so we also have to change `nbasis` in the fragment input file. Once we update these values, BigDFT should run successfully. Note that for `rloc`, we do not have to make sure the radii is the same as the template calculation, BigDFT will simply override the input value and detect the correct template radius.

**Exercise**: Rerun the fragment calculations with 2 and 5 support functions per hydrogen and oxygen. Try increasing the number of support functions further. What happens to the energies? If you want some further practice, try more distances and see if you can reproduce a full binding curve - you should expect to see significant improvements in the energy of the fragment approach as you increase the number of support functions.