Miscellaneous input parameters
The file input.perf specifies values that tune the performance of BigDFT. If this file does not exist, default values are used. Unlike other BigDFT input files, it consists of optional key / value entries. The keywords are not case sensitive.
Example
debug                        F          Debug option
fftcache                     8192       Cache size for the FFT
accel                        NO         Acceleration (NO, CUDAGPU, OCLGPU, OCLCPU, OCLACC)
OCL_platform                            Chosen OCL platform
OCL_devices                             Chosen OCL devices
blas                         F          CUBLAS acceleration
projrad                      1.50E+01   Radius of the projector as a function of the maxrad
exctxpar                     OP2P       Exact exchange parallelisation scheme
ig_diag                      T          Input guess: (T:Direct, F:Iterative) diag. of Ham.
ig_norbp                     5          Input guess: Orbitals per process for iterative diag.
ig_blocks                    300 800    Input guess: Block sizes for orthonormalisation
ig_tol                       1.00E-04   Input guess: Tolerance criterion
methortho                    0          Orthogonalisation (0=Cholesky,1=GS/Chol,2=Loewdin)
rho_commun                   DEF        Density communication scheme (DBL, RSC, MIX)
psolver_groupsize            0          Size of Poisson Solver taskgroups (0=nproc)
psolver_accel                0          Acceleration of the Poisson Solver (0=none, 1=CUDA)
unblock_comms                OFF        Overlap Communications of fields (OFF,DEN,POT)
linear                       OFF        Linear Input Guess approach (OFF, LIG, FUL, TMO)
tolsym                       1.00E-08   Tolerance for symmetry detection
signaling                    F          Expose calculation results on Network
signalTimeout                0          Time out on startup for signal connection
domain                                  Domain to add to the hostname to find the IP
inguess_geopt                0          0= wavlet input guess, 1= real space input guess
store_index                  T          linear scaling: store indices or recalculate them
verbosity                    2          verbosity of the output 0=low, 3=high
outdir                       .          Writing directory
psp_onfly                    T          Calculate pseudopotential projectors on the fly
pdsyev_blocksize             -8         SCALAPACK linear scaling blocksize
pdgemm_blocksize             -8         SCALAPACK linear scaling blocksize
maxproc_pdsyev               4          SCALAPACK linear scaling max num procs
maxproc_pdgemm               4          SCALAPACK linear scaling max num procs
ef_interpol_det              1.00E-20   FOE: max determinant of cubic interpolation matrix
ef_interpol_chargediff       1.0E1      FOE: max charge difference for interpolation
mixing_after_inputguess      T          mixing step after linear input guess (T/F)
iterative_orthogonalization  F          iterative_orthogonalization for input guess orbitals
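The layout of the example can be illustrated with a small Python sketch that looks up one entry by keyword. It is not BigDFT's own parser: the helper read_entry is hypothetical, and it assumes that the values are the whitespace-separated tokens directly after the keyword, with the rest of the line being a free-text description. Because that boundary is not marked in the file, the caller has to say how many values to expect. The upper-case VERBOSITY line is a deliberate variation on the example to show case-insensitive matching.

```python
def read_entry(text, key, nvalues=1):
    """Return the first `nvalues` tokens after `key`, or None if the key is absent.

    Assumption: values directly follow the keyword and are separated by
    whitespace; anything after them is treated as a description.
    """
    for line in text.splitlines():
        tokens = line.split()
        # Keywords are not case sensitive, so compare in lower case.
        if tokens and tokens[0].lower() == key.lower():
            return tokens[1:1 + nvalues]
    return None


example = """\
fftcache 8192 Cache size for the FFT
ig_blocks 300 800 Input guess: Block sizes for orthonormalisation
VERBOSITY 2 verbosity of the output 0=low, 3=high
"""

print(read_entry(example, "ig_blocks", nvalues=2))  # two values for this keyword
print(read_entry(example, "verbosity"))             # matches the upper-case line
print(read_entry(example, "accel"))                 # absent entry: None
```

An absent entry returns None, which mirrors the rule that a missing key simply falls back to the default value.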
Entry descriptions
This is the only input file with no mandatory lines. Only provided values are taken into account.
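Since only the provided values are taken into account, an input.perf file can be limited to the entries that should differ from the defaults. The following Python sketch writes such a file; the helper write_perf is hypothetical and not part of BigDFT, and the chosen overrides are only illustrative.

```python
from pathlib import Path


def write_perf(path, overrides):
    """Write one 'keyword value(s)' line per overridden entry."""
    lines = []
    for key, value in overrides.items():
        # Allow several values per keyword, as for ig_blocks.
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        lines.append(f"{key} {value}")
    Path(path).write_text("\n".join(lines) + "\n")


# Only two entries are given; all other keywords keep their default values.
write_perf("input.perf", {"verbosity": 3, "ig_blocks": [300, 800]})
```

Logical entries are passed as the strings "T" or "F", matching the notation used in the example file.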
- ’debug’ T/F: Enable the debug mode, which is mainly used for memory profiling.
- ’fftcache’ ncache_size: Specify the cache size for FFT in kBytes.
- ’accel’ NO/CUDAGPU/OCLGPU: Use the CUDA (CUDAGPU) or OpenCL (OCLGPU) versions of various subroutines. For CUDA, a ’GPU.config’ file is needed.
- ’blas’ T/F: Whether to use the CUBLAS acceleration.
- ’projrad’ real: Radius of the projector region as a function of the maxrad. Reducing this value speeds up the treatment of the separable part of the pseudopotentials. Care must be taken that the nonlocal energy component remains sufficiently precise. (Search the output for Enl.)
- ’exctxpar’ key: Exact exchange parallelisation scheme.
- ’ig_diag’ T/F: Input guess: (T:Direct, F:Iterative) diagonalisation of Hamiltonian.
- ’ig_norbp’ int: Input guess: Orbitals per process for iterative diagonalisation.
- ’ig_blocks’ int int: Input guess: Block sizes for orthonormalisation.
- ’ig_tol’ real: Input guess: Tolerance criterion.
- ’methortho’ key: Orthogonalisation (0=Cholesky,1=GS/Chol,2=Loewdin).
- ’rho_commun’ key: Density communication scheme. The option MIX can lead to considerable speedups, but it is supported only for free boundary conditions and implies small deviations. (Search the output for Electronic charge changed by rho compression.)
- ’inguess_geopt’ int: When this is set to 1, a real space grid is transformed along with the atomic coordinates, which allows a more efficient input guess for the wavefunctions during a geometry optimization. This is supported only for free boundary conditions. The default (0) uses the wavelet coefficients of the previous geometry step.
- ’verbosity’ int: Determines the amount of output, from little (0) to detailed (3). The default is 2.
- ’psp_onfly’ T/F: Select the strategy for calculating the PSP projectors. The default (T) is the on-the-fly strategy. Setting it to F switches on the once-and-for-all strategy, which is faster but more memory demanding (this is taken into account by memguess).
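The ’projrad’ and ’rho_commun’ entries both advise searching the output for a specific string. A minimal Python sketch of that check is given here, assuming the BigDFT output has been saved to a text file. The function names are hypothetical, and the match is a plain substring search, so the short string Enl may also match unrelated lines.

```python
# Search strings taken from the 'projrad' and 'rho_commun' descriptions.
NEEDLES = ("Enl", "Electronic charge changed by rho compression")


def matching_lines(lines, needles=NEEDLES):
    """Yield (needle, line) for every line containing one of the search strings."""
    for line in lines:
        for needle in needles:
            if needle in line:
                yield needle, line.rstrip("\n")


def scan_output(path, needles=NEEDLES):
    """Collect the matching lines of a saved output file (the path is up to the user)."""
    with open(path) as handle:
        return list(matching_lines(handle, needles))
```

Calling scan_output on the saved output of a run with a reduced ’projrad’ or with ’rho_commun’ set to MIX returns the lines to inspect when judging whether the precision is still acceptable.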