Workflow Modes
DeepPseudopot's utility can be expanded along three directions:
(1) selecting the pseudopotential model family,
(2) choosing an optimization workflow, and
(3) deciding how to initialize that model.
This page summarizes each knob so you can assemble the combination that matches your study. Please see our paper for theoretical details.
1. Pseudopotential Model Selection
Neural-network local potentials
DeepPseudopot's default use case is to learn a screened local pseudopotential modeled by a deep neural network, combined with a nonlocal (NL), angular-momentum-dependent correction term and a spin-orbit coupling (SOC) term supplied in the bundle. The local part for atom type \(\alpha\) is
$$ v_{\mathrm{loc}}^{\alpha}\left(G\right) = \Bigl[\left(h^{H} \circ \cdots \circ h^{2} \circ h^{1} \right)\left(G\right)\Bigr]_{\alpha}, $$ where \(h^{i}(x) = \sigma\bigl(W^{i} x + b^{i}\bigr)\) is the output of the \(i\)-th hidden layer, \(\sigma\) is an activation function, and \(W^{i}\) and \(b^{i}\) are the weights and bias tensors.
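The composition above can be sketched in plain Python. This is an illustrative sketch only, not the code in `utils/nn_models.py`: the helper names (`init_layers`, `forward`, `count_parameters`) are hypothetical, CELU and a Xavier-uniform bound are picked as one example from the catalog, and leaving the final affine map linear is an assumption.

```python
import math
import random


def celu(x, alpha=1.0):
    """CELU activation: identity for x > 0, smooth exponential branch otherwise."""
    return x if x > 0 else alpha * (math.exp(x / alpha) - 1.0)


def init_layers(sizes, seed=0):
    """Build (W, b) pairs for consecutive layer sizes using a Xavier-uniform bound."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        bound = math.sqrt(6.0 / (n_in + n_out))
        W = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers


def forward(layers, G):
    """Map a scalar |G| to one output per atom type."""
    x = [G]
    for i, (W, b) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        is_last = i == len(layers) - 1
        # Assumption: the output layer stays linear so potentials can change sign.
        x = z if is_last else [celu(v) for v in z]
    return x


def count_parameters(sizes):
    """Total number of weights and biases for the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


sizes = [1, 64, 64, 3]  # one input (G), two hidden layers, three atom types
v_loc = forward(init_layers(sizes), 1.5)
```

With layer sizes `[1, 64, 64, 3]`, `forward` returns three values, one per atom type, and `count_parameters(sizes)` gives the number of trainable scalars in such a network.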
The following `NN_config.par` keys govern the NN flavor, most importantly `PPmodel`. The `PPmodel` catalog spans sigmoid, tanh, ReLU, and CELU activations with Xavier or He initialization, optional batch normalization, logistic decays, and Gaussian gating (`*_decayGaussian`). Gaussian-gated variants guarantee smooth decay at large \(G\) in reciprocal space; logistic decays (`*_decay`) offer a softer cutoff.
| Key | Type | Applies to | Required? | Notes |
|---|---|---|---|---|
| `PPmodel` | string | All NN runs | Yes | Class name from `utils/nn_models.py` (e.g., `Net_relu_xavier`, `Net_celu_HeInit_decayGaussian`). Determines activation, normalization, dropout, and gating. |
| `hiddenLayers` | list[int] | All NN runs | Yes | Number of neurons in each hidden layer, not including the input and output layers which are set automatically by the code. E.g., `64 64` for a fully connected MLP with [1, 64, 64, 3] for three elements. |
| `PPmodel_decay_rate` | float | `Net_*_decay` variants | Only for decay models | Logistic decay rate in the reciprocal space used to attenuate the NN output at large \(G\). |
| `PPmodel_decay_center` | float | `Net_*_decay` variants | Only for decay models | Logistic inflection point (in units of Bohr\(^{-1}\)) specifying where attenuation begins. |
| `PPmodel_gaussian_std` | float | `Net_*_decayGaussian` variants | Only for decayGaussian models | Standard deviation \(\sigma\) in the Gaussian envelope \(\exp[-q^2/(2\sigma^2)]\). |
| `PPmodel_scale` | list[float] | `Net_celu_HeInit_scale_decayGaussian` | Only for scale variant | Per-species scaling factors applied after the Gaussian envelope; length must equal the number of atom types. |
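To make the two attenuation options concrete, the sketch below evaluates a Gaussian envelope of the form quoted in the table and a logistic cutoff. The Gaussian follows \(\exp[-q^2/(2\sigma^2)]\) as stated above; the exact logistic expression used by the `Net_*_decay` classes is not given on this page, so the form \(1/(1+e^{r(G-G_c)})\) and the function names are assumptions.

```python
import math


def gaussian_envelope(G, std):
    """Gaussian gate exp(-G^2 / (2 std^2)); equals 1 at G = 0."""
    return math.exp(-(G ** 2) / (2.0 * std ** 2))


def logistic_decay(G, rate, center):
    """Assumed logistic gate 1 / (1 + exp(rate * (G - center))).

    Written in a numerically stable way so large G does not overflow.
    """
    x = rate * (G - center)
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def attenuate(nn_output, G, std=None, rate=None, center=None):
    """Multiply a raw NN output by whichever gates are configured."""
    value = nn_output
    if rate is not None and center is not None:
        value *= logistic_decay(G, rate, center)
    if std is not None:
        value *= gaussian_envelope(G, std)
    return value
```

In this sketch the logistic gate equals 0.5 at `center`, which matches the description of `PPmodel_decay_center` as the inflection point, while the Gaussian gate falls off faster at large \(G\).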
Analytic local pseudopotential form ('Zunger form')
DeepPseudopot also retains the functionality to fit semi-empirical local potential forms first proposed by Wang and Zunger for III-V and II-VI semiconductors (Phys. Rev. B 51, 17398):
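For orientation, the functional form usually quoted for this family of potentials is reproduced here in this page's notation. That `pot_func` evaluates exactly this expression, with \(G\) the magnitude of the reciprocal-space vector, is an assumption to verify against the code:

$$ V_{\text{Zunger}}(G) = \frac{a_0\left(G^{2} - a_1\right)}{a_2\, e^{a_3 G^{2}} - 1} $$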
Parameters \(a_0 \dots a_3\) map to `ppParams[0:4]` within each `init_<atom>Params.par` input file. You can:
- Use the analytic form as an initialization target for the NN local pseudopotential, which trains the NN to reproduce \(V_{\text{Zunger}}(G)\) before band-structure fitting, or
- Optimize the analytic potential directly by setting `PPmodel = ZeroFunction` (or a similar pass-through) so that only the Zunger coefficients are varied via Monte Carlo/gradient updates (more in Workflow Selection).
Relevant keys:
| Key | Type | Applies to | Required? | Notes |
|---|---|---|---|---|
| `init_<atom>Params.par` | file | Each species | Required when using analytic form | Provides the nine-parameter vectors (local, long-range, SO, NL, strain) consumed by `init_ZungerPP` and `pot_func`. |
| `init_Zunger_num_epochs` | int | Initialization | Optional | When >0, trains the NN to match the analytic potential before optimization. |
| `ZeroFunction` | `PPmodel` choice | Analytic-only fits | Optional | Choose `PPmodel = ZeroFunction` (or another no-op architecture) if you want to optimize just the analytic coefficients. |
Long-range (LR), spin-orbit coupling (SOC), nonlocal (NL), and strain terms
Regardless of whether the local piece is NN-based or analytic-form-based, the remaining channels share the same parameter slots described in the README ("Order of Pseudopotential Parameters"). These channels can be toggled independently, letting you build hybrids in any combination with the local pseudopotential.
| Key / Parameter | Type | Applies to | Required? | Notes |
|---|---|---|---|---|
| `ppParams[4]` (long-range) | float per species (N-1 independent) | Polar materials | Optional | Enables the Gaussian-screened Fröhlich tail with attenuation `LRgamma`. Leave it as zero for non-polar systems. |
| `ppParams[5]` (spin–orbit) | float per species | When `SObool = 1` | Optional | Set `SObool = 1` to include SO; you may still zero `ppParams[5]` to keep only nonlocal terms active. |
| `ppParams[6-7]` (nonlocal) | float per species and nonlocal channel | When `SObool = 1` | Optional | Set the strength of the projectors used for the nonlocal part. |
| `ppParams[8]` (strain) | float per species | Strain workflows | Optional | Controls deformation-potential responses when fitting strain-dependent observables. |
| `SObool` | int (0/1) | Any run | Optional | Turns SO projectors on/off. With `SObool = 1` and `ppParams[5] = 0`, you effectively reuse the NL blocks while keeping SO inactive. |
| `cacheSO` | int (0/1) | Runs with heavy SO/NL reuse | Optional | Caches SO/NL matrices in shared memory. This is highly recommended when training pseudopotentials with SOC terms but it also increases RAM usage. |
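The slot layout described above can be summarized with a small helper that splits one nine-element parameter vector into named channels. This is a sketch for orientation only: `PPChannels` and `split_pp_params` are hypothetical names, and reading `ppParams[6-7]` as the two indices 6 and 7 is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PPChannels:
    local: List[float]      # ppParams[0:4], the analytic coefficients a_0..a_3
    long_range: float       # ppParams[4]
    spin_orbit: float       # ppParams[5]
    nonlocal_: List[float]  # ppParams[6] and ppParams[7] (assumed two NL channels)
    strain: float           # ppParams[8]


def split_pp_params(pp_params: Sequence[float]) -> PPChannels:
    """Split a nine-parameter vector into its channels."""
    if len(pp_params) != 9:
        raise ValueError(f"expected 9 parameters, got {len(pp_params)}")
    return PPChannels(
        local=list(pp_params[0:4]),
        long_range=pp_params[4],
        spin_orbit=pp_params[5],
        nonlocal_=list(pp_params[6:8]),
        strain=pp_params[8],
    )


# Example: a non-polar species with SO switched off but NL terms active.
channels = split_pp_params([1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.1, 0.2, 0.0])
```

Zeroing `long_range` and `spin_orbit` while keeping `nonlocal_` non-zero corresponds to the "NL only" hybrid described in the table.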
2. Workflow Selection
DeepPseudopot supports four core execution modes:
Gradient-based training
Highly recommended when using a neural network local potential.
- Set `max_num_epochs > 0` and `mc_bool = 0` to use this.
- Provide an `optimizer` (`adam`, `sgd`, `adamw`, etc.), `optimizer_lr`, `scheduler_gamma`, and `schedulerStep`.
- Optional `perturbEvery` injects random kicks to escape shallow local minima.
- Expect periodic `epoch_<N>_*` checkpoints plus `plotBS`/`plotPP` plots.
- The total training objective can combine the band-structure loss with optional reciprocal-space penalties (`penalize_lambda`, `penalize_mag_lambda`) and, when enabled, deformation-potential or coupling terms.
- In gradient runs, deformation potentials are treated as global transition observables. Same-k and different-k rows from `expDefPot_X.par` are therefore handled consistently in serial, `separateKptGrad`, and multiprocessing modes.
| Key | Type | Required? | Notes |
|---|---|---|---|
| `optimizer` | string | Optional | Supported: `adam`, `sgd`, `lbfgs`, `rmsprop`, `adamw`; see function `init_optimizer` in `utils/init_NN_train.py` for the complete list. Default is `adam`. |
| `optimizer_lr` | float | Yes | Base learning rate for the chosen optimizer. |
| `scheduler_gamma` | float | Optional | Multiplicative decay factor (e.g., 0.95). |
| `schedulerStep` | int | Optional | Number of epochs between scheduler updates. |
| `perturbEvery` | int | Optional | Apply random parameter perturbation every N epochs; -1 disables. |
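As a rough guide to how `optimizer_lr`, `scheduler_gamma`, and `schedulerStep` interact, the sketch below computes the learning rate at a given epoch, assuming a step-wise schedule in which the rate is multiplied by `scheduler_gamma` once every `schedulerStep` epochs. Both that schedule and the perturbation timing in `should_perturb` are assumptions about the implementation, and the function names are hypothetical.

```python
def scheduled_lr(base_lr, gamma, step, epoch):
    """Learning rate after `epoch` epochs under an assumed step-wise decay."""
    if step <= 0:
        return base_lr  # no scheduling configured
    return base_lr * gamma ** (epoch // step)


def should_perturb(epoch, perturb_every):
    """True when a random kick would be applied; non-positive values disable it."""
    if perturb_every <= 0:
        return False
    return epoch > 0 and epoch % perturb_every == 0


# Example: optimizer_lr = 1e-3, scheduler_gamma = 0.95, schedulerStep = 10.
lrs = [scheduled_lr(1e-3, 0.95, 10, epoch) for epoch in (0, 9, 10, 25)]
```

Under these assumptions the rate stays at its base value for the first ten epochs and has been reduced twice by epoch 25.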
Monte Carlo fitting (parallel tempering)
Preferred for traditional Zunger-form fits, as well as for NL/SO parameters, where the loss landscape is highly rugged and non-convex.
- Set `mc_bool = 1` and `max_num_epochs = 0`.
- Tune `mc_iter`, `mc_percentage`, `mc_beta`, and helper files (`mcOpts*.par`, `<atom>ParamSteps.par`, `mc_beta_schedule`) to balance exploration and acceptance.
- Checkpoints are written to `mc_checkpoint.pth`/`best_pot.*` and can be reused in future runs.
- The Monte Carlo code still uses its legacy defPot path. If you need row-based same-k / different-k transition targets from `expDefPot_X.par`, use the gradient workflow.
| Key | Type | Required? | Notes |
|---|---|---|---|
| `mc_iter` | int | Yes | Number of trial moves per MC block. |
| `mc_percentage` | float | Yes | Fraction of parameters perturbed each move (0–1). |
| `mc_beta` | float | Yes | Inverse temperature controlling acceptance; higher values favor downhill moves. |
| `mcOpts*.par` | files | Optional | Provide the schedule, step sizes, and temperature settings for parallel tempering. |
| `<atom>ParamSteps.par` | files | Optional | Provide per-atom per-parameter step size modifications. |
| `mc_beta_schedule` | file | Optional | Enables simulated annealing/tempering across betas. |
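The role of `mc_beta` can be illustrated with the textbook Metropolis rule and the standard parallel-tempering swap criterion. This is a generic sketch of those two rules, not the logic in `runMC_NN`; whether DeepPseudopot uses exactly these expressions is an assumption, and all function names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_loss, beta):
    """Metropolis rule: always accept downhill moves, uphill with exp(-beta * delta)."""
    if delta_loss <= 0:
        return 1.0
    return math.exp(-beta * delta_loss)


def metropolis_accept(delta_loss, beta, rng):
    """Draw one accept/reject decision using the supplied random generator."""
    return rng.random() < acceptance_probability(delta_loss, beta)


def swap_probability(beta_i, beta_j, loss_i, loss_j):
    """Standard replica-exchange rule: min(1, exp((beta_i - beta_j) * (loss_i - loss_j)))."""
    exponent = (beta_i - beta_j) * (loss_i - loss_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


rng = random.Random(0)
downhill = metropolis_accept(-0.05, beta=10.0, rng=rng)  # downhill moves are always accepted
p_uphill_cold = acceptance_probability(0.1, beta=10.0)
p_uphill_hot = acceptance_probability(0.1, beta=1.0)
```

A larger `beta` makes the same uphill move less likely to be accepted, which is the sense in which higher values favor downhill moves.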
Initialization-only / validation sweeps
- Leave `max_num_epochs = 0` and `mc_bool = 0`.
- The driver loads `init_PPmodel.pth` (or analytic parameters in `init_<atom>Params.par`), evaluates band structures/deformation potentials, and exits.
- Use this mode to sanity-check bundles or export analytic potentials without training.
Band-structure evaluation
- Run `python eval_fullBand.py /path/to/inputs/ /path/to/results/` with any initialized pseudopotential to score new systems or dense \(\mathbf{k}\)-paths.
- The script `eval_fullBand.py` is optimized for low memory use and parallel efficiency.
Hybrid workflows
- Users can connect a chain of Zunger initialization, Monte Carlo fitting, and gradient-based training of NN local potentials by reusing the `final_pot*`, `epoch_<N>_*`, or `mc_checkpoint.pth` files as the initialization files for the next stage.
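The way `max_num_epochs` and `mc_bool` pick between the gradient, Monte Carlo, and initialization-only modes can be condensed into a small decision function. This is a sketch of the rules stated on this page; the function name and returned labels are hypothetical, and raising an error when both toggles are active is an assumption about how the driver treats the conflict.

```python
def select_mode(max_num_epochs, mc_bool):
    """Return which main.py workflow a pair of config values would select."""
    if mc_bool not in (0, 1):
        raise ValueError("mc_bool must be 0 or 1")
    if mc_bool == 1:
        if max_num_epochs > 0:
            # Assumption: Monte Carlo and gradient training are mutually exclusive.
            raise ValueError("set max_num_epochs = 0 when mc_bool = 1")
        return "monte_carlo"
    if max_num_epochs > 0:
        return "gradient"
    return "init_only"


modes = [select_mode(200, 0), select_mode(0, 1), select_mode(0, 0)]
```

A hybrid workflow then amounts to running these modes one after another, with each stage starting from the files written by the previous one.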
After picking a workflow, consult the Output Data Description for expected outputs and the Troubleshooting Guide for restart procedures.
3. Initialization Strategies
Choosing good initial parameters for the neural network pseudopotential is very important and can save a lot of training effort. By default, the DeepPseudopot code goes through the following options in order to provide a good initialization. You can also pick whichever option matches your needs.
- Checkpoint reuse: If `init_PPmodel.pth` (and optionally `init_AdamState.pth`) exists, the weights of the model (and optionally optimizer state) will be loaded directly. This is the fastest restart path and preserves optimizer momentum when desired.
- Fit to Zunger potentials: When `init_Zunger_num_epochs > 0`, the NN initializes by training against `init_<atom>Params.par` targets before training against band structures and deformation potentials. This initialization training is controlled by `init_Zunger_optimizer`, `init_Zunger_optimizer_lr`, `init_Zunger_scheduler_gamma`, and `init_Zunger_plotEvery`.

  | Key | Type | Required? | Notes |
  |---|---|---|---|
  | `init_Zunger_optimizer` | string | Optional | Supported: `adam` (default) or `sgd`. |
  | `init_Zunger_optimizer_lr` | float | Required when init training enabled | Learning rate for the initialization phase. |
  | `init_Zunger_scheduler_gamma` | float | Optional | Multiplicative LR decay during initialization. |
  | `init_Zunger_plotEvery` | int | Optional | Interval (epochs) for saving `initZunger_plotPP.*` diagnostics. |

- Fit to tabulated \(v(G)\): Supplying `init_qSpace_pot.par` bypasses the analytic form; the NN will learn the provided reciprocal-space table during initialization. This option is well suited for initializing the semi-empirical pseudopotential from ab initio pseudopotential references or a known function.
- He/Kaiming or Xavier initialization: If `PPmodel` is one of the `Net_*HeInit*` or `Net_*xavier*` classes and `init_Zunger_num_epochs = 0`, the network relies solely on the underlying torch initializer. This is useful for cases when no analytic prior exists.
- Randomized CELU variants: Setting `PPmodel = Net_celu_RandInit*` samples weights from a custom normal distribution, allowing truly random weights for uncertainty estimation.
The selected initialization strategy is printed in the log files for traceability.
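One way to picture the fallback chain is as a function that checks each option in the order listed above. This is a hypothetical sketch: the function name and returned labels are invented for illustration, and the relative priority of the Zunger fit versus the tabulated \(v(G)\) fit is an assumption taken from the list order rather than from the code.

```python
def pick_initialization(available_files, init_zunger_num_epochs, pp_model):
    """Return a label for the initialization path, checking options in list order."""
    if "init_PPmodel.pth" in available_files:
        return "checkpoint"
    if init_zunger_num_epochs > 0:
        # Assumption: the analytic Zunger fit is tried before the tabulated fit.
        return "zunger_fit"
    if "init_qSpace_pot.par" in available_files:
        return "tabulated_fit"
    if "RandInit" in pp_model:
        return "random_normal"
    return "torch_initializer"


choice = pick_initialization(
    available_files={"init_qSpace_pot.par"},
    init_zunger_num_epochs=0,
    pp_model="Net_celu_HeInit_decayGaussian",
)
```

Comparing the returned label with the strategy reported in the log is a quick way to confirm that a bundle contains the files you intended.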
Overall Execution Flow of main.py
DeepPseudopot's main driver (main.py) advances through the following stages. Please use them as checkpoints when customizing runs or debugging new observables:
- Configuration Ingest – Parse `NN_config.par`, validate mutually exclusive toggles (Monte Carlo vs. gradient, SO flags, etc.), and record global settings. Load the structural and spectral inputs for each system (`system_X.par`, `input_X.par`, `kpoints_X.par`, `expBandStruct_X.par`, optional weights) and construct `BulkSystem` objects while preserving atom ordering. Refer to Input Data Description for the companion files expected alongside the config.
- Model Instantiation and Initialization – Build the neural network architecture specified by `PPmodel`/`hiddenLayers` and prepare it for either analytic or NN-based optimization. Initialize the model from `init_PPmodel.pth`, `init_qSpace_pot.par`, or by fitting analytic Zunger parameters (`init_<atom>Params.par`). Output artefacts such as `initZunger_plotPP.*` are generated here.
- Baseline Evaluation – Compute the band structure of the initialized potential and write `oldFunc_plotBS.pdf`/`initZunger_plotBS.pdf` for reference.
- Optimization Loop – Depending on `NN_config.par`, run gradient-based training mode (`bandStruct_train_GPU`, `max_num_epochs > 0`, `mc_bool = 0`) or Monte Carlo mode (`runMC_NN`, `mc_bool = 1`), producing `epoch_<N>_*`, `training_cost.dat`, or `mc_checkpoint.pth` outputs.
- Fourier Transform & Export – Outputs real- and reciprocal-space potentials via `FT_converge_and_write_pp` and `write_PP_qSpace` (`*_pot.dat`, `*_qSpace_pot.dat`).
- Post-processing & cleanup – Generate optional animations with `genMovie`, remove temporary MC plots, and release shared-memory caches before exit.
Extended Toolkit
| Script | Purpose |
|---|---|
| `charge_density_from_wfns.py` | Builds real-space charge densities from plane-wave eigenvectors calculated from DeepPseudopot. |
| `convert_bgwBS.py` | Translates BerkeleyGW or Quantum ESPRESSO outputs into the DeepPseudopot input bundle format. |
| `convert_convCell_to_primCell.py` | Converts conventional cells into primitive cells during input preparation. |
| `utils/cluster_pp.py` | Performs PCA/K-means analyses to cluster neural network pseudopotentials and assess coverage. |
| `plot_BS_from_file.py` | Plotting scripts for band structures. |
| `plot_SOC_NL_T_Vloc.py` | Plotting scripts for decomposed Hamiltonian components. |