[DFTB-Plus-User] On parallel version of DFTB+
ZHAOHUI HUANG
zuh101 at psu.edu
Mon Sep 19 23:59:57 CEST 2016
OK, thanks for the reply. Actually, I only use DFTB+ to relax large structures with the smallest basis. For a better description of the conduction band, I wrote a TB code for static structures using the sp3d5s* basis. I can calculate the band structure for up to 12,000 atoms with CHEEV. Thanks.
ZhaoHui Huang,
----- Original Message -----
From: "Jacek Jakowski" <jjakowski at gmail.com>
To: "User list for DFTB+ related questions" <dftb-plus-user at mailman.zfn.uni-bremen.de>
Sent: Monday, September 19, 2016 4:13:18 PM
Subject: Re: [DFTB-Plus-User] On parallel version of DFTB+
Yes, ScaLAPACK is a direct attempt to get eigenvalues for dense
matrices. CHEEV and CHEEVD are complex diagonalizers for dense
matrices. As such they scale cubically. Different algorithms for
diagonalization are implemented, and one of them (the one behind
CHEEVD) is called divide-and-conquer. DC-DFTB-K treats molecular
systems as a collection of fragments which are solved independently
(hence divide-and-conquer) and then assembles the approximate solution.
This may be confused with CHEEVD, since both carry the name
"divide-and-conquer". But besides the name they have nothing in
common.
I don't think that DFTB+ could handle a million atoms without
switching from dense to sparse matrices/solvers. To estimate
how many resources you need, just take the largest calculation you
were able to do so far, see how much larger the system you have
in mind is, and take the cube of that number. For example, increase
your size 10x and your cost grows 1000 times.
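
As a rough illustration of the rule of thumb above, here is a minimal Python sketch. The function names `scaled_cost` and `dense_matrix_bytes` are made up for this example and are not part of DFTB+ or ScaLAPACK. It assumes pure cubic scaling of the time and 8 bytes per (real, double precision) matrix element, and it ignores parallel efficiency and communication, so treat the numbers as order-of-magnitude estimates only.

```python
def scaled_cost(reference_cost, reference_size, target_size, exponent=3):
    """Extrapolate the cost of a dense diagonalization from a reference run.

    reference_cost: measured cost of the largest run done so far (any unit)
    reference_size: its size (atoms or basis functions)
    target_size:    size of the system you have in mind (same unit)
    exponent:       3 for dense diagonalization (assumed ideal cubic scaling)
    """
    ratio = target_size / reference_size
    return reference_cost * ratio ** exponent


def dense_matrix_bytes(n_basis, bytes_per_element=8):
    """Memory for one dense n_basis x n_basis matrix.

    bytes_per_element=8 assumes real double precision; use 16 for
    double-precision complex matrices.
    """
    return n_basis * n_basis * bytes_per_element


if __name__ == "__main__":
    # 10x larger system -> 1000x the cost
    print(scaled_cost(1.0, 1, 10))
    # One real 100k x 100k matrix, in GB (1 GB = 1e9 bytes)
    print(dense_matrix_bytes(100_000) / 1e9)
```

The second number reproduces the 80 GB per matrix quoted further down in this thread for about 100k basis functions. Memory grows only quadratically, so it is usually the cubic time that becomes the limit first.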
And yes, there are iterative solvers such as ARPACK/Lanczos.
They are good if you only need a small fraction of the eigenvectors
(as in DFT with large basis sets). You cannot really use them with
DFTB because you need all eigenvectors. Your basis set is already
reduced to the minimum and you cannot reduce it any further.
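
To make the contrast concrete, here is a small pure-Python sketch of power iteration, the simplest relative of the Lanczos-type methods mentioned above. It is not ARPACK and not anything DFTB+ does. It only illustrates that such solvers need nothing but matrix-vector products and deliver one (or a few) extreme eigenpairs at a time. The 3x3 test matrix is an arbitrary example chosen for illustration.

```python
import math


def matvec(matrix, vector):
    """Dense matrix-vector product; the only operation the solver needs."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=200):
    """Return an estimate of the eigenpair with the largest |eigenvalue|.

    Assumes a real symmetric matrix whose dominant eigenvalue is
    non-degenerate and a start vector that is not orthogonal to the
    dominant eigenvector.
    """
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = matvec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient of the normalized vector gives the eigenvalue
    eigenvalue = sum(v * w for v, w in zip(vector, matvec(matrix, vector)))
    return eigenvalue, vector


if __name__ == "__main__":
    # Example matrix with eigenvalues 2 - sqrt(2), 2 and 2 + sqrt(2)
    h = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
    value, _ = power_iteration(h)
    print(value, 2.0 + math.sqrt(2.0))
```

Each eigenpair costs many matrix-vector products, which pays off when only a handful of states are needed out of a very large space. When most of the spectrum is required, repeating this for every state loses its advantage over a dense solver.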
On Mon, Sep 19, 2016 at 3:45 PM, ZHAOHUI HUANG <zuh101 at psu.edu> wrote:
> Hi,
>
> Thanks for the quick reply. Yes, I know you mean the first step of BerkeleyGW. But when you solve for the exciton eigenvalues in the space of quasiparticle wave functions, don't you diagonalize the exciton Hamiltonian containing the two-body effect? Last year I had a 114-atom structure whose exciton Hamiltonian had a dimension of around 170,000. They used an iterative method to diagonalize H. Can I say that the ScaLAPACK solution is a direct attempt to get eigenvalues, for example with CHEEV or CHEEVD (divide-and-conquer algorithm)? So is it possible to use an iterative method to handle a large TB Hamiltonian? I expect DFTB+ could be extended to handle structures with one million atoms or so. Comments?
>
> ZhaoHui Huang,
>
>
> ----- Original Message -----
> From: "Jacek Jakowski" <jjakowski at gmail.com>
> To: "User list for DFTB+ related questions" <dftb-plus-user at mailman.zfn.uni-bremen.de>
> Sent: Monday, September 19, 2016 3:28:14 PM
> Subject: Re: [DFTB-Plus-User] On parallel version of DFTB+
>
> My estimates are based on diagonalization of dense matrices, which
> is what ScaLAPACK does. The specific numbers are based on
> diagonalization on a Cray XC30 (Intel Xeons, 16 cores per node).
> Diagonalization as well as other matrix-matrix operations scale
> cubically with the system size, which means that if you decrease
> your system 2 times then the computational cost is reduced 8
> times (=2^3). Actually I need to correct my previous message: 10
> hours on 4000 cores for a single diagonalization is not for 100k but
> for 400k basis functions. Yes, tight-binding is much faster than
> conventional DFT, but dense linear algebra still scales cubically
> and dominates the computation. The speedup with respect to DFT comes
> from two factors: (1) for a given number of atoms the matrices
> are about 5-10 times smaller than in conventional DFT with a localized
> basis set, (2) the cost of forming the DFTB matrices is very small
> compared to that of DFT matrices of the same size.
>
> According to the official information, BerkeleyGW does not do the
> diagonalization itself but takes the results of diagonalization from
> other codes as input (and computes higher-order corrections). Also, it
> is intended for up to a few hundred atoms.
>
> Besides DFTB+, you can try the divide-and-conquer implementation
> called DC-DFTB-K (Japan) or the cp2k implementation (they use ELPA,
> if I remember correctly).
>
>
> Jacek
>
>
> On Mon, Sep 19, 2016 at 2:01 PM, ZHAOHUI HUANG <zuh101 at psu.edu> wrote:
>> Can you describe some details of the algorithms used in DFTB+, especially for the Hamiltonian diagonalization? Tight-binding calculations are supposed to run very fast, but your reply gave me a totally different picture. It takes me some time to think over your words. I had not realized that DFTB+ might require a few thousand processors. Simply put, could you tell me where most of the CPU time is spent in a DFTB+ calculation? Diagonalization?
>> Thanks a lot.
>>
>> If you use an iterative method to solve for the Hamiltonian eigenvalues, as implemented in BerkeleyGW, what calculation speed would you expect?
>>
>> ZhaoHui Huang,
>>
>>
>> ----- Original Message -----
>> From: "Jacek Jakowski" <jjakowski at gmail.com>
>> To: "User list for DFTB+ related questions" <dftb-plus-user at mailman.zfn.uni-bremen.de>
>> Sent: Saturday, September 17, 2016 8:26:15 PM
>> Subject: Re: [DFTB-Plus-User] On parallel version of DFTB+
>>
>> Most likely you don't have enough memory to fit the 26,000 atoms on
>> your computer, even if DFTB+ can handle it. Assuming that your
>> 26k atoms are carbons (or similar), you need 80 GB to fit a single
>> matrix (100k basis functions) in memory and much more (like 10 times)
>> for a real calculation.
>> But then, if this fits into your memory, 100k matrices on 4000
>> cores take about 10 hours for a single diagonalization (real case).
>> It would probably take something like a month to do the SCF, and
>> about half a year for a few MD steps.
>>
>> I suggest that you decrease the size of the cell so that your matrix
>> dimensions are below 32,000.
>>
>> Jacek
>>
>> On Fri, Sep 9, 2016 at 1:36 PM, ZHAOHUI HUANG <zuh101 at psu.edu> wrote:
>>> Hello,
>>>
>>> Sorry to bother you if not interested.
>>>
>>> I have an issue running parallel DFTB+. My unit cell contains 26,000 atoms and I just want to relax the structure for a few steps. When running the code, I first got an output overflow error message, so I increased the MAXRECL parameter defined in the HSDParser package. After that it does run, but it fails with a ScaLAPACK error:
>>>
>>> MAXNEIGHBORS: 8847
>>> iSCC Total electronic Diff electronic SCC error
>>> Operation failed!
>>> ppotrf in scalafx_ppotrf_dreal
>>> Info: 23233
>>>
>>>
>>> Is there any code developer who is familiar with this part of the code? Thanks.
>>>
>>>
>>> ZhaoHui Huang,
>>> _______________________________________________
>>> DFTB-Plus-User mailing list
>>> DFTB-Plus-User at mailman.zfn.uni-bremen.de
>>> https://mailman.zfn.uni-bremen.de/cgi-bin/mailman/listinfo/dftb-plus-user