Re: [petsc-users] issue of MatCreateDense in the CUDA codes

Jose E. Roman via petsc-users Thu, 03 Oct 2024 00:58:46 -0700

You have to wait until the merge request has the label "Merged" instead of 
"Open".



> El 3 oct 2024, a las 9:55, 刘浪天 via petsc-users <petsc-users@mcs.anl.gov> 
> escribió:
> 
> Hello Junchao,
> 
> Okay. Thank you for helping find this bug. I pull the newest version of petsc 
> today. It seems this error has not been fixed in the present release version. 
> Maybe I should wait for some days.
> 
> Best wishes,
> Langtian
> 
> 
>> On Oct 3, 2024, at 12:12 AM, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>> 
>> 
>> Hi, Langtian,
>>    Thanks for the configure.log and I now see what's wrong. Since you 
>> compiled your code with nvcc,  we mistakenly thought petsc was configured 
>> with cuda.
>>    It is fixed in 
>> https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7909__;!!G_uCfscf7eWS!aeorPzG-t4hjjYuvNrXAN30S9m17SudfU_aVI1QhsSjB6SJ1ezv8tqvb5tC5u3kC-MAPhaI_HsiKr5Pt_XnJjIrC$
>>  ,  which will be in petsc/release and main.
>>  
>>    Thanks.
>> --Junchao Zhang
>> 
>> 
>> On Wed, Oct 2, 2024 at 3:11 PM 刘浪天 <langtian....@icloud.com> wrote:
>> Hi Junchao,
>> 
>> I check it, I haven't use cuda when install pure cpu version of petsc.
>> The configure.log has been attached. Thank you for your reply.
>> 
>> Best wishes,
>> -------------------- Langtian Liu Institute for Theorectical Physics, 
>> Justus-Liebig-University Giessen Heinrich-Buff-Ring 16, 35392 Giessen 
>> Germany email: langtian....@icloud.com Tel: (+49)641 99 33342
>> 
>>> On Oct 2, 2024, at 5:05 PM, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Oct 2, 2024 at 3:57 AM 刘浪天 via petsc-users 
>>> <petsc-users@mcs.anl.gov> wrote:
>>> Hi all,
>>> 
>>> I am using the PETSc and SLEPc to solve the Faddeev equation of baryons. I 
>>> encounter a problem of function MatCreateDense when changing from CPU to 
>>> CPU-GPU computations.
>>> At first, I write the codes in purely CPU computation in the following way 
>>> and it works.
>>> ```
>>> Eigen::MatrixXcd H_KER;
>>> Eigen::MatrixXcd G0;
>>> printf("\nCompute the propagator matrix.\n");
>>> prop_matrix_nucleon_sc_av(Mn, pp_nodes, cos1_nodes);
>>> printf("\nCompute the propagator matrix done.\n");
>>> printf("\nCompute the kernel matrix.\n");
>>> bse_kernel_nucleon_sc_av(Mn, pp_nodes, pp_weights, cos1_nodes, 
>>> cos1_weights);
>>> printf("\nCompute the kernel matrix done.\n");
>>> printf("\nCompute the full kernel matrix by multiplying kernel and 
>>> propagator matrix.\n");
>>> MatrixXcd kernel_temp = H_KER * G0;
>>> printf("\nCompute the full kernel matrix done.\n");
>>> 
>>> // Solve the eigen system with SLEPc
>>> printf("\nSolve the eigen system in the rest frame.\n");
>>> // Get the size of the Eigen matrix
>>> int nRows = (int) kernel_temp.rows();
>>> int nCols = (int) kernel_temp.cols();
>>> // Create PETSc matrix and share the data of kernel_temp
>>> Mat kernel;
>>> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 
>>> nRows, nCols, kernel_temp.data(), &kernel));
>>> PetscCall(MatAssemblyBegin(kernel, MAT_FINAL_ASSEMBLY));
>>> PetscCall(MatAssemblyEnd(kernel, MAT_FINAL_ASSEMBLY));
>>> ```
>>> Now I change to compute the propagator and kernel matrices in GPU and then 
>>> compute the largest eigenvalues in CPU using SLEPc in the ways below.
>>> ```
>>> cuDoubleComplex *h_propmat;
>>> cuDoubleComplex *h_kernelmat;
>>> int dim = EIGHT * NP * NZ;
>>> printf("\nCompute the propagator matrix.\n");
>>> prop_matrix_nucleon_sc_av_cuda(Mn, pp_nodes.data(), cos1_nodes.data());
>>> printf("\nCompute the propagator matrix done.\n");
>>> printf("\nCompute the kernel matrix.\n");
>>> kernel_matrix_nucleon_sc_av_cuda(Mn, pp_nodes.data(), pp_weights.data(), 
>>> cos1_nodes.data(), cos1_weights.data());
>>> printf("\nCompute the kernel matrix done.\n");
>>> printf("\nCompute the full kernel matrix by multiplying kernel and 
>>> propagator matrix.\n");
>>> // Map the raw arrays to Eigen matrices (column-major order)
>>> auto *h_kernel_temp = new cuDoubleComplex [dim*dim];
>>> matmul_cublas_cuDoubleComplex(h_kernelmat,h_propmat,h_kernel_temp,dim,dim,dim);
>>> printf("\nCompute the full kernel matrix done.\n");
>>> 
>>> // Solve the eigen system with SLEPc
>>> printf("\nSolve the eigen system in the rest frame.\n");
>>> int nRows = dim;
>>> int nCols = dim;
>>> // Create PETSc matrix and share the data of kernel_temp
>>> Mat kernel;
>>> auto* h_kernel = (std::complex<double>*)(h_kernel_temp);
>>> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 
>>> nRows, nCols, h_kernel_temp, &kernel));
>>> PetscCall(MatAssemblyBegin(kernel, MAT_FINAL_ASSEMBLY));
>>> PetscCall(MatAssemblyEnd(kernel, MAT_FINAL_ASSEMBLY));
>>> But in this case, the compiler told me that the MatCreateDense function 
>>> uses the data pointer as type of "thrust::complex<double>" instead of 
>>> "std::complex<double>".
>>> I am sure I only configured and install PETSc in purely CPU without GPU and 
>>> this codes are written in the host function.
>>> Please double check that your PETSc was purely CPU configured.  You can 
>>> find it at the end of your configure.log to see if petsc is configured with 
>>> CUDA.
>>> Since thrust::complex<double> is a result of a petsc/cuda configuration,  I 
>>> have this doubt. 
>>> 
>>>  
>>> 
>>> Why the function changes its behavior? Did you also meet this problem when 
>>> writing the cuda codes and how to solve this problem.
>>> I tried to copy the data to a new thrust::complex<double> matrix but this 
>>> is very time consuming since my matrix is very big. Is there a way to 
>>> create the Mat from the original data without changing the data type to 
>>> thrust::complex<double> in the cuda applications? Any response will be 
>>> appreciated. Thank you!
>>> 
>>> Best wishes,
>>> Langtian Liu
>>> 
>>> ------
>>> Institute for Theorectical Physics, Justus-Liebig-University Giessen
>>> Heinrich-Buff-Ring 16, 35392 Giessen Germany
>>> 
>> 
>

Re: [petsc-users] issue of MatCreateDense in the CUDA codes

Reply via email to