A fast and precise DFT wavelet code

Miscellaneous input parameters

This file, named input.perf, specifies values that tune the performance of BigDFT. If the file does not exist, default values are used. Unlike the other BigDFT input files, it consists of optional key / value entries. The keywords are not case sensitive.

Example

debug F                       Debug option
fftcache 8192                 Cache size for the FFT
accel NO                      Acceleration (NO, CUDAGPU, OCLGPU, OCLCPU, OCLACC)
OCL_platform                  Chosen OCL platform
OCL_devices                   Chosen OCL devices
blas F                        CUBLAS acceleration
projrad  1.50E+01             Radius of the projector as a function of the maxrad
exctxpar OP2P                 Exact exchange parallelisation scheme
ig_diag T                     Input guess: (T:Direct, F:Iterative) diag. of Ham.
ig_norbp 5                    Input guess: Orbitals per process for iterative diag.
ig_blocks 300 800             Input guess: Block sizes for orthonormalisation
ig_tol  1.00E-04              Input guess: Tolerance criterion
methortho 0                   Orthogonalisation (0=Cholesky,1=GS/Chol,2=Loewdin)
rho_commun DEF                Density communication scheme (DBL, RSC, MIX)
psolver_groupsize 0           Size of Poisson Solver taskgroups (0=nproc)
psolver_accel 0               Acceleration of the Poisson Solver (0=none, 1=CUDA)
unblock_comms OFF             Overlap Communications of fields (OFF,DEN,POT)
linear OFF                    Linear Input Guess approach (OFF, LIG, FUL, TMO)
tolsym  1.00E-08              Tolerance for symmetry detection
signaling F                   Expose calculation results on Network
signalTimeout 0               Time out on startup for signal connection
domain                        Domain to add to the hostname to find the IP
inguess_geopt 0               0= wavelet input guess, 1= real space input guess
store_index T                 linear scaling: store indices or recalculate them
verbosity 2                   verbosity of the output 0=low, 3=high
outdir .                      Writing directory
psp_onfly T                   Calculate pseudopotential projectors on the fly
pdsyev_blocksize -8           SCALAPACK linear scaling blocksize
pdgemm_blocksize -8           SCALAPACK linear scaling blocksize
maxproc_pdsyev 4              SCALAPACK linear scaling max num procs
maxproc_pdgemm 4              SCALAPACK linear scaling max num procs
ef_interpol_det  1.00E-20     FOE: max determinant of cubic interpolation matrix
ef_interpol_chargediff 1.0E1  FOE: max charge difference for interpolation
mixing_after_inputguess T     mixing step after linear input guess (T/F)
iterative_orthogonalization F iterative_orthogonalization for input guess orbitals

Entry descriptions

This is the only input file with no mandatory lines. Only provided values are taken into account.
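
The behaviour described above (optional entries, case-insensitive keywords, defaults for anything not provided) can be sketched in a few lines of Python. This is an illustration only, not the parser BigDFT itself uses: the function name parse_perf and the DEFAULTS table are hypothetical, and the table covers only a handful of keys with values copied from the example above. The sketch assumes the number of value tokens for a key equals the number of values shown in the example, and treats everything after them as a description.

```python
# Hypothetical table: a few keys with the values shown in the example above.
# The number of values per key is taken from the length of each list.
DEFAULTS = {
    "debug": ["F"],
    "fftcache": ["8192"],
    "ig_blocks": ["300", "800"],
    "verbosity": ["2"],
    "psp_onfly": ["T"],
}


def parse_perf(text, defaults=DEFAULTS):
    """Return the defaults, overridden by the entries found in text."""
    settings = {key: list(values) for key, values in defaults.items()}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank line
        key = tokens[0].lower()  # keywords are not case sensitive
        if key not in defaults:
            continue  # this sketch ignores keys it has no default for
        count = len(defaults[key])  # value tokens expected for this key
        values = tokens[1:1 + count]
        if len(values) == count:
            settings[key] = values  # the rest of the line is the description
    return settings


sample = """FFTcache 4096      Cache size for the FFT
ig_blocks 200 600  Input guess: Block sizes for orthonormalisation
"""
settings = parse_perf(sample)
print(settings["fftcache"], settings["ig_blocks"], settings["verbosity"])
```

Keys missing from the sample (here verbosity) keep their default value, and an empty input returns the defaults unchanged.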

  • ’debug’ T/F: Enables the debug mode, which is mainly used for memory profiling.
  • ’fftcache’ ncache_size: Specify the cache size for FFT in kBytes.
  • ’accel’ NO/CUDAGPU/OCLGPU: Use the CUDA or OpenCL (OCL) versions of various subroutines. For CUDA, a ’GPU.config’ file is needed.
  • ’blas’ T/F: Enable or disable the CUBLAS acceleration.
  • ’projrad’ real: Radius of the projector region as a function of the maxrad. Reducing this value speeds up the treatment of the separable part of the pseudopotentials. Take care that the nonlocal energy component remains sufficiently precise. (Search the output for Enl.)
  • ’exctxpar’ key: Exact exchange parallelisation scheme.
  • ’ig_diag’ T/F: Input guess: (T:Direct, F:Iterative) diagonalisation of Hamiltonian.
  • ’ig_norbp’ int: Input guess: Orbitals per process for iterative diagonalisation.
  • ’ig_blocks’ int int: Input guess: Block sizes for orthonormalisation.
  • ’ig_tol’ real: Input guess: Tolerance criterion.
  • ’methortho’ key: Orthogonalisation (0=Cholesky,1=GS/Chol,2=Loewdin).
  • ’rho_commun’ key: Density communication scheme. The option MIX can lead to considerable speedups, but it is supported only for free boundary conditions and implies small deviations. (Search the output for Electronic charge changed by rho compression.)
  • ’inguess_geopt’ int: Selects the input guess for the wavefunctions during a geometry optimization. The default (0) uses the wavelet coefficients of the previous geometry step. When set to 1, a real space grid is transformed along with the atomic coordinates, which gives a more efficient input guess but works only for free boundary conditions.
  • ’verbosity’ int: Determines the amount of output, from little (0) to detailed (3). The default is 2.
  • ’psp_onfly’ T/F: Selects the strategy for calculating the PSP projectors. The default (T) is the on-the-fly strategy. Setting F switches on the once-and-for-all strategy, which is faster but more memory demanding (considered by memguess).
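
Because only provided values are taken into account, a file that changes a few of the entries described above can be very short. The Python sketch below builds such a file in the column layout of the example; the helper name format_perf and the column width are assumptions made for illustration, and the descriptions are copied from the example. The chosen values (rho_commun MIX, inguess_geopt 1) are, according to the descriptions above, only supported for free boundary conditions.

```python
def format_perf(entries, width=30):
    """Format (key, value, description) tuples as input.perf-style lines."""
    lines = []
    for key, value, description in entries:
        left = f"{key} {value}"
        # Pad so the description starts at column `width`; always keep
        # at least one space before it, even for long key/value pairs.
        lines.append(f"{left.ljust(width - 1)} {description}".rstrip())
    return "\n".join(lines) + "\n"


entries = [
    ("rho_commun", "MIX", "Density communication scheme (DBL, RSC, MIX)"),
    ("inguess_geopt", 1, "0= wavelet input guess, 1= real space input guess"),
    ("psp_onfly", "F", "Calculate pseudopotential projectors on the fly"),
]
text = format_perf(entries)
print(text, end="")
```

Writing the returned text to a file named input.perf would leave every entry not listed here at its default value.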