I managed to get this to work. 

I defined a larger matrix with the dense blocks appended to the end of the 
matrix on the last processor. Currently, I am only running with one extra 
unknown, so this should not be a significant penalty for load balancing. 

Since the larger matrix has the same I-J locations for the FE non-zeros, I use 
it directly in the FE assembly. 
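
For concreteness, a minimal sketch of how that larger matrix can be set up is 
below (the helper name and the counts n_local, n_extra, d_nnz, o_nnz are 
placeholders that the application would supply; the preallocation must already 
include the dense bordering rows and columns, and error checking is omitted): 

#include <petscmat.h>

/* Sketch only: create the bordered matrix with the extra unknowns appended
   on the last rank.  n_local = this rank's FE unknowns, n_extra = number of
   bordering unknowns. */
PetscErrorCode CreateBorderedMatrix(MPI_Comm comm, PetscInt n_local,
                                    PetscInt n_extra, const PetscInt *d_nnz,
                                    const PetscInt *o_nnz, Mat *J)
{
  PetscMPIInt rank, size;
  PetscInt    my_rows;

  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);
  /* only the last rank owns the extra rows/columns */
  my_rows = n_local + ((rank == size - 1) ? n_extra : 0);

  MatCreate(comm, J);
  MatSetSizes(*J, my_rows, my_rows, PETSC_DETERMINE, PETSC_DETERMINE);
  MatSetType(*J, MATMPIAIJ);
  MatMPIAIJSetPreallocation(*J, 0, d_nnz, 0, o_nnz);
  return 0;
}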

I have tested with parallel MUMPS solves and it is working smoothly. Also, the 
monolithic system removes the issue with the singularity of J_fe at/near the 
bifurcation point. 
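
For reference, the MUMPS solve on the monolithic matrix can be selected at 
runtime with the standard PETSc options (assuming PETSc is configured with 
MUMPS): 

    -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps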

Next, I would like to figure out whether there are ways to bring in iterative 
solvers to solve this more efficiently. My J_fe comes from a nonlinear shell 
deformation problem with a snap-through response. 

I am not sure whether it would make more sense to use an AMG solver on this 
monolithic matrix, as opposed to using AMG as a preconditioner for J_fe in the 
Schur-factorization approach. The LOCA solver in Trilinos had some success 
with the latter approach: 
https://www.worldscientific.com/doi/abs/10.1142/S0218127405012508 
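
As one possible starting point for experiments (a sketch only, not something I 
have tried on this problem), AMG on the monolithic matrix can be requested 
purely from the command line, e.g. 

    -ksp_type fgmres -pc_type gamg

or, if PETSc is built with hypre, 

    -ksp_type fgmres -pc_type hypre -pc_hypre_type boomeramg

The dense bordering row/column may not coarsen well, so the alternative of AMG 
on J_fe inside a Schur factorization could also be expressed natively in PETSc 
with PCFIELDSPLIT (marking the FE and bordering unknowns via PCFieldSplitSetIS 
and using -pc_fieldsplit_type schur), while still operating on the monolithic 
matrix. 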

I would appreciate any general thoughts concerning this. 

Regards,
Manav


> On May 29, 2019, at 9:11 PM, Manav Bhatia <bhatiama...@gmail.com> wrote:
> 
> Barry, 
> 
>   Thanks for the detailed message. 
> 
>    I checked libMesh’s continuation solver and it appears to be using the 
> same system solver without creating a larger matrix: 
> https://github.com/libMesh/libmesh/blob/master/src/systems/continuation_system.C
>  
> 
>    I need to implement this in my code, MAST, for various reasons (mainly, it 
> fits inside a bigger workflow). The current implementation 
> follows the Schur factorization approach: 
> https://mastmultiphysics.github.io/class_m_a_s_t_1_1_continuation_solver_base.html#details
>  
> 
>    I will look into solutions that use MatGetLocalSubMatrix, or leverage some 
> existing functionality in libMesh. 
> 
> Thanks,
> Manav
> 
> 
>> On May 29, 2019, at 7:04 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>> 
>> 
>>  Understood. Where are you putting the "few extra unknowns" in the vector 
>> and matrix? On the first process, on the last process, or somewhere in the 
>> middle of the matrix?
>> 
>>   We don't have any trivial code for copying a big matrix into an even larger 
>> matrix directly because we frown on doing that. It is very wasteful in time 
>> and memory. 
>> 
>>   The simplest way to do it is to call MatGetRow() twice for each row: once to 
>> get the nonzero locations for each row, to determine the numbers needed for 
>> preallocation, and then a second time, after the big matrix has been 
>> preallocated, to get the nonzero locations and numerical values for the row 
>> and call MatSetValues() to set that row into the bigger matrix. Note, of 
>> course, that when you call MatSetValues() you will need to shift the row and 
>> column locations to take into account the new rows and columns in the bigger 
>> matrix. If you put the "extra unknowns" at the very end of the rows/columns 
>> on the last process, you won't have to shift.
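>> 
>>   In rough outline, the two passes could look like the sketch below (the 
>> helper name and J_big, the larger matrix created with the extra rows/columns 
>> at the end, are made up for illustration; error checking is omitted):
>> 
>> #include <petscmat.h>
>> 
>> /* Sketch: copy an assembled matrix J_fe into the larger matrix J_big.
>>    Assumes the extra unknowns are at the very end on the last process,
>>    so global row/column indices carry over without shifting. */
>> PetscErrorCode CopyIntoBordered(Mat J_fe, Mat J_big)
>> {
>>   PetscInt           rstart, rend, cstart, cend, i, ncols;
>>   PetscInt          *d_nnz, *o_nnz;
>>   const PetscInt    *cols;
>>   const PetscScalar *vals;
>> 
>>   MatGetOwnershipRange(J_fe, &rstart, &rend);
>>   MatGetOwnershipRangeColumn(J_fe, &cstart, &cend);
>>   PetscMalloc2(rend - rstart, &d_nnz, rend - rstart, &o_nnz);
>> 
>>   /* Pass 1: count nonzeros per owned row, split into diagonal-block and
>>      off-diagonal-block parts, to build the preallocation for J_big */
>>   for (i = rstart; i < rend; i++) {
>>     PetscInt j;
>>     d_nnz[i - rstart] = o_nnz[i - rstart] = 0;
>>     MatGetRow(J_fe, i, &ncols, &cols, NULL);
>>     for (j = 0; j < ncols; j++) {
>>       if (cols[j] >= cstart && cols[j] < cend) d_nnz[i - rstart]++;
>>       else                                     o_nnz[i - rstart]++;
>>     }
>>     MatRestoreRow(J_fe, i, &ncols, &cols, NULL);
>>   }
>>   /* ... add the counts for the dense bordering columns (and the extra rows
>>      on the last process), then MatMPIAIJSetPreallocation(J_big, ...) ... */
>> 
>>   /* Pass 2: copy the rows; no index shift needed with this layout */
>>   for (i = rstart; i < rend; i++) {
>>     MatGetRow(J_fe, i, &ncols, &cols, &vals);
>>     MatSetValues(J_big, 1, &i, ncols, cols, vals, INSERT_VALUES);
>>     MatRestoreRow(J_fe, i, &ncols, &cols, &vals);
>>   }
>>   MatAssemblyBegin(J_big, MAT_FINAL_ASSEMBLY);
>>   MatAssemblyEnd(J_big, MAT_FINAL_ASSEMBLY);
>>   PetscFree2(d_nnz, o_nnz);
>>   return 0;
>> }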
>> 
>>   Note that B being dense really messes up chances for load balancing since 
>> its rows are dense and take a great deal of space so whatever process gets 
>> those rows needs to have much less of the mesh. 
>> 
>>  The correct long term approach is to have libmesh provide the needed 
>> functionality (for continuation) for the slightly larger matrix directly so 
>> huge matrices do not need to be copied.
>> 
>>  I noticed that libmesh has some functionality related to continuation. I do 
>> not know if they handle it by creating the larger matrix and vector and 
>> filling that up directly for finite elements. If they do then you should 
>> definitely take a look at that and see if it can be extended for your case 
>> (ignore the continuation algorithm they may be using, that is not relevant, 
>> the question is if they generate the larger matrices and if you can leverage 
>> this).
>> 
>> 
>>  The ultimate hack would be to (for example) assign the extra variables to 
>> the end of the last process and hack libMesh a little bit so the matrix it 
>> creates (before it puts in the numerical values) has the extra rows and 
>> columns, which libMesh will not put values into but you will. Thus you 
>> get libMesh to fill up the true final matrix for its finite element problem 
>> (not realizing the matrix is a little bigger than it needs to be) directly, 
>> with no copies of the data needed. But this is a bit tricky; you'll need to 
>> combine libMesh's preallocation information with yours for the final columns 
>> and rows before you have libMesh put the numerical values in. Double check 
>> whether they have any support for this first. 
>> 
>>   Barry
>> 
>> 
>>> On May 29, 2019, at 6:29 PM, Manav Bhatia <bhatiama...@gmail.com> wrote:
>>> 
>>> Thanks, Barry. 
>>> 
>>> I am working on a FE application (involving bifurcation behavior) with 
>>> libMesh where I need to solve the system of equations along with a few 
>>> extra unknowns that are not directly related to the FE mesh. I am able to 
>>> assemble the n x 1 residual (R_fe) and  n x n  Jacobian (J_fe ) from my 
>>> code and libMesh provides me with the sparsity pattern for this. 
>>> 
>>> Next, the system of equations that I need to solve is: 
>>> 
>>> [ J_fe   A ]  { dX }     { R_fe  }
>>> [ B      C ]  { dV }  =  { R_ext }
>>> 
>>> where C is m x m ( m << n ), A is n x m, B is m x n, and R_ext is m x 1. 
>>> A, B, and C are dense matrices. This comes from the bordered system for my 
>>> path continuation solver. 
>>> 
>>> I have implemented a solver using Schur factorization ( this is outside of 
>>> PETSc and does not use the FieldSplit construct ). This works well for most 
>>> cases, except when J_fe is close to singular. 
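>>> 
>>> (For reference, eliminating dX from the system above gives the two steps this 
>>> Schur-factorization solver performs: 
>>> 
>>>    ( C - B inv(J_fe) A ) dV = R_ext - B inv(J_fe) R_fe 
>>>    J_fe dX = R_fe - A dV 
>>> 
>>> so every step requires solves with J_fe, which is what degrades as J_fe 
>>> approaches singularity.) 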
>>> 
>>> I am now attempting to create a monolithic matrix that solves the complete 
>>> system. 
>>> 
>>> Currently, the approach I am considering is to compute J_fe using my 
>>> libMesh application, so that I don’t have to change that. I am defining a 
>>> new matrix with the extra non-zero locations for A, B, C. 
>>> 
>>> With J_fe computed, I am looking to copy its non-zero entries to this new 
>>> matrix. This is where I am stumbling since I don’t know how best to get the 
>>> non-zero locations in J_fe. Maybe there is a better approach to copy from 
>>> J_fe to the new matrix? 
>>> 
>>> I have looked through the nested matrix construct, but have not given it 
>>> serious consideration. Maybe I should? Note that I don’t want to solve 
>>> J_fe and C as separate systems, so the field-split 
>>> approach will not be suitable here. 
>>> 
>>> Also, I am currently using MUMPS for all my parallel solves. 
>>> 
>>>  I would appreciate any advice. 
>>> 
>>> Regards,
>>> Manav
>>> 
>>> 
>>>> On May 29, 2019, at 6:07 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>>>> 
>>>> 
>>>> Manav,
>>>> 
>>>> For parallel sparse matrices using the standard PETSc formats, the matrix 
>>>> is stored in two parts on each process (see the details in MatCreateAIJ()), 
>>>> so there is no inexpensive way to directly access the I-J locations as a 
>>>> single local matrix. What are you hoping to use the information for? 
>>>> Perhaps we have other suggestions on how to achieve the goal.
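>>>> 
>>>> If it helps to see what those two parts look like, a small sketch using 
>>>> MatMPIAIJGetSeqAIJ() is below; note that the off-diagonal part stores its 
>>>> column indices in a compressed local numbering that colmap translates back 
>>>> to global indices, which is exactly why there is no single local I-J view. 
>>>> 
>>>> #include <petscmat.h>
>>>> 
>>>> /* Sketch: inspect the two local pieces of an MPIAIJ matrix A.  Ad holds the
>>>>    columns owned by this process, Ao the rest; Ao's column indices are local
>>>>    and map to global columns through colmap. */
>>>> PetscErrorCode InspectLocalParts(Mat A)
>>>> {
>>>>   Mat             Ad, Ao;
>>>>   const PetscInt *colmap;
>>>>   PetscInt        m, n;
>>>> 
>>>>   MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &colmap);
>>>>   MatGetSize(Ad, &m, &n);
>>>>   PetscPrintf(PETSC_COMM_SELF, "diagonal block: %D x %D\n", m, n);
>>>>   MatGetSize(Ao, &m, &n);
>>>>   PetscPrintf(PETSC_COMM_SELF, "off-diagonal block: %D x %D\n", m, n);
>>>>   return 0;
>>>> }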
>>>> 
>>>> Barry
>>>> 
>>>> 
>>>>> On May 29, 2019, at 2:27 PM, Manav Bhatia via petsc-users 
>>>>> <petsc-users@mcs.anl.gov> wrote:
>>>>> 
>>>>> Hi, 
>>>>> 
>>>>> Once an MPI-AIJ matrix has been assembled, is there a method to get the 
>>>>> nonzero I-J locations? I see one for sequential matrices here: 
>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetRowIJ.html 
>>>>> but not for parallel matrices. 
>>>>> 
>>>>> Regards,
>>>>> Manav
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
