In the table below we have a segmented basis set where the primitive gaussian type orbitals (PGTO) are combined into contracted gaussian type orbitals (CGTO). In this case, we have a basis set with three degrees of freedom (three CGTOs) down from five degrees of freedom (five PGTOs). Dark fields in the table shows coefficients whose value are larger than zero. Bright fields are zero.
CGTO1 | CGTO2 | CGTO3 | |
---|---|---|---|
PGTO1 | |||
PGTO2 | |||
PGTO3 | |||
PGTO4 | |||
PGTO5 |
So the PGTOs are obtained from the .mol file (I'm using Dalton for this) using the read_mol.py script found in the gist since it is used anyways to generate the new basis set. The CGTOs are obtained from an RHF calculation (see the hf.dal file in the gist) using the read_log.py script.
The problem with the design of a new basis set is to choose how many of the PGTOs to use in a CGTO, for instance, why choose three PGTOs in CGTO1? Since the number of PGTOs is not too overwhelming, I've gone ahead and made a general solution so I have the possibility to test all cases. The main problem then boiled down to: given an N x N matrix with zeros (notice that there are many zeros in this matrix), how do I overwrite subarrays (3 x 1 in this case) in Numpy so I can insert a specific number of coefficients to make CGTO1 from PGTO1 to PGTO3? and how to I make CGTO2 and CGTO3 couple with PGTO4 and PGTO5 using identity matrix inserted at the correct spot?
The trick is to make use of Numpys excellent (and very difficult) features known as boolean masking where a matrix filled with True or False will extract the corresponding values and one is able to assign to them. In the end, we need to remove excessive columns and I've found my head can do that easily by doing a transpose of the matrix, remove the unneeded rows (using a simpler Numpy/python logic which I know by hand) and transpose it back.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from numpy import * | |
from numpy.random import random | |
def gen_mos_mask(n_dim,n,m): | |
# this assumes we are in the upper left corner of the matrix | |
i = array([True]*n + [False]*(n_dim-n)) | |
j = array([True]*m + [False]*(n_dim-m)) | |
return outer(i,j) | |
def gen_iden_mask(n,m,ns,ndim): | |
# generate smaller lower right corner of the matrix | |
i = array([False]*n + [True]*ns) | |
j = array([False]*m + [True]*ns + [False]*(ndim-m-ns)) | |
return outer(i,j) | |
def generate_mos(n_dim,mos): | |
C = zeros((n_dim,n_dim)) | |
(n,m) = shape(mos) # rows, cols | |
i = n_dim - n # get minor dimension | |
mos_mask = gen_mos_mask(n_dim,n,m) # generate mask for molecular orbitals | |
iden_mask = gen_iden_mask(n,m,i,n_dim) # generate mask for identity matrix | |
C[mos_mask] = mos.ravel() | |
C[iden_mask] = identity(i).ravel() | |
# make the transpose | |
Ct = C.transpose() | |
# delete what we do not need | |
Dt = Ct[:i+m] | |
# transpose back | |
C = Dt.transpose() | |
# get dimensions | |
return C | |
if __name__ == '__main__': | |
mos = random((3,2)) | |
print generate_mos(6,mos) |