In the table below we have a segmented basis set where the primitive gaussian type orbitals (PGTO) are combined into contracted gaussian type orbitals (CGTO). In this case, we have a basis set with three degrees of freedom (three CGTOs) down from five degrees of freedom (five PGTOs). Dark fields in the table shows coefficients whose value are larger than zero. Bright fields are zero.
CGTO1 | CGTO2 | CGTO3 | |
---|---|---|---|
PGTO1 | |||
PGTO2 | |||
PGTO3 | |||
PGTO4 | |||
PGTO5 |
So the PGTOs are obtained from the .mol file (I'm using Dalton for this) using the read_mol.py script found in the gist since it is used anyways to generate the new basis set. The CGTOs are obtained from an RHF calculation (see the hf.dal file in the gist) using the read_log.py script.
The problem with the design of a new basis set is to choose how many of the PGTOs to use in a CGTO, for instance, why choose three PGTOs in CGTO1? Since the number of PGTOs is not too overwhelming, I've gone ahead and made a general solution so I have the possibility to test all cases. The main problem then boiled down to: given an N x N matrix with zeros (notice that there are many zeros in this matrix), how do I overwrite subarrays (3 x 1 in this case) in Numpy so I can insert a specific number of coefficients to make CGTO1 from PGTO1 to PGTO3? and how to I make CGTO2 and CGTO3 couple with PGTO4 and PGTO5 using identity matrix inserted at the correct spot?
The trick is to make use of Numpys excellent (and very difficult) features known as boolean masking where a matrix filled with True or False will extract the corresponding values and one is able to assign to them. In the end, we need to remove excessive columns and I've found my head can do that easily by doing a transpose of the matrix, remove the unneeded rows (using a simpler Numpy/python logic which I know by hand) and transpose it back.
This was one of the more difficult issues, but at least I can now generate a multitude of basis set calculations and examining my property of interest in a matter of seconds. Numpy rocks, but I think I could grow old trying to learn all the awesome features it has.