On Wed, Oct 2, 2024 at 6:11 AM 刘浪天 via petsc-users <petsc-users@mcs.anl.gov> wrote:
> I cannot declare everything as PetscScalar. My strategy is to compute the
> elements of the matrix on the GPU block by block and copy them back to the
> CPU, and finally to compute the eigenvalues using SLEPc on the CPU.

Then you have to either a) have a temporary array into which you copy the
GPU results in the CPU type, or b) if you know the types are the same size,
you can cast the pointer.
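For example, a minimal sketch of both options for a host-side array like
yours, assuming a complex-scalar PETSc build (PetscScalar ==
std::complex<double>), a single MPI rank so local and global sizes coincide,
and hypothetical helper names:

```
#include <petscmat.h>
#include <cuComplex.h>

// Option (a): copy the GPU result into a temporary array of the CPU type.
// MatCreateDense() uses the supplied array directly (it does not copy), so
// 'tmp' must stay alive until the Mat is destroyed and be freed by the caller.
PetscErrorCode WrapWithCopy(const cuDoubleComplex *h_data, PetscInt dim, Mat *kernel)
{
  PetscScalar *tmp;

  PetscFunctionBeginUser;
  PetscCall(PetscMalloc1(dim * dim, &tmp));
  for (PetscInt i = 0; i < dim * dim; i++) tmp[i] = PetscScalar(h_data[i].x, h_data[i].y); // (re, im)
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, dim, dim, tmp, kernel));
  PetscFunctionReturn(PETSC_SUCCESS);
}

// Option (b): cuDoubleComplex and std::complex<double> are both stored as
// two contiguous doubles (re, im), so when the sizes match the pointer can
// be reinterpreted in place with no copy at all.
PetscErrorCode WrapWithCast(cuDoubleComplex *h_data, PetscInt dim, Mat *kernel)
{
  PetscFunctionBeginUser;
  static_assert(sizeof(PetscScalar) == sizeof(cuDoubleComplex), "scalar sizes must match");
  PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, dim, dim,
                           reinterpret_cast<PetscScalar *>(h_data), kernel));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

Either way the Mat references the array you pass in rather than copying it,
so the buffer must not be freed before MatDestroy(); option (b) avoids the
extra dim*dim copy entirely.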
  Thanks,

     Matt

> --------------------
> Langtian Liu
> Institute for Theoretical Physics, Justus-Liebig-University Giessen
> Heinrich-Buff-Ring 16, 35392 Giessen, Germany
> email: langtian....@icloud.com  Tel: (+49)641 99 33342
>
> On Oct 2, 2024, at 11:31 AM, Jose E. Roman <jro...@dsic.upv.es> wrote:
>
> Does it work if you declare everything as PetscScalar instead of
> cuDoubleComplex?
>
> On Oct 2, 2024, at 11:23 AM, 刘浪天 <langtian....@icloud.com> wrote:
>
> Hi Jose,
>
> Since my matrix is too large, I cannot create the Mat on the GPU. So I
> still want to create this matrix and compute its eigenvalues on the CPU
> using SLEPc.
>
> Best,
> --------------------
> Langtian Liu
> Institute for Theoretical Physics, Justus-Liebig-University Giessen
> Heinrich-Buff-Ring 16, 35392 Giessen, Germany
> email: langtian....@icloud.com  Tel: (+49)641 99 33342
>
> On Oct 2, 2024, at 11:18 AM, Jose E. Roman <jro...@dsic.upv.es> wrote:
>
> For the CUDA case you should use MatCreateDenseCUDA() instead of
> MatCreateDense(). With this you pass a pointer to the data in GPU memory.
> But I guess "new cuDoubleComplex[dim*dim]" is allocating on the CPU; you
> should use cudaMalloc() instead.
>
> Jose
>
> On Oct 2, 2024, at 10:56 AM, 刘浪天 via petsc-users
> <petsc-users@mcs.anl.gov> wrote:
>
> Hi all,
>
> I am using PETSc and SLEPc to solve the Faddeev equation for baryons. I
> encountered a problem with the function MatCreateDense when changing from
> CPU to CPU-GPU computation.
> At first, I wrote the code for purely CPU computation in the following
> way, and it works.
> ```
> Eigen::MatrixXcd H_KER;
> Eigen::MatrixXcd G0;
> printf("\nCompute the propagator matrix.\n");
> prop_matrix_nucleon_sc_av(Mn, pp_nodes, cos1_nodes);
> printf("\nCompute the propagator matrix done.\n");
> printf("\nCompute the kernel matrix.\n");
> bse_kernel_nucleon_sc_av(Mn, pp_nodes, pp_weights, cos1_nodes, cos1_weights);
> printf("\nCompute the kernel matrix done.\n");
> printf("\nCompute the full kernel matrix by multiplying kernel and propagator matrix.\n");
> MatrixXcd kernel_temp = H_KER * G0;
> printf("\nCompute the full kernel matrix done.\n");
>
> // Solve the eigensystem with SLEPc
> printf("\nSolve the eigen system in the rest frame.\n");
> // Get the size of the Eigen matrix
> int nRows = (int) kernel_temp.rows();
> int nCols = (int) kernel_temp.cols();
> // Create a PETSc matrix that shares the data of kernel_temp
> Mat kernel;
> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nRows, nCols, kernel_temp.data(), &kernel));
> PetscCall(MatAssemblyBegin(kernel, MAT_FINAL_ASSEMBLY));
> PetscCall(MatAssemblyEnd(kernel, MAT_FINAL_ASSEMBLY));
> ```
> Now I have changed to computing the propagator and kernel matrices on the
> GPU and then computing the largest eigenvalues on the CPU using SLEPc, in
> the way below.
> ```
> cuDoubleComplex *h_propmat;
> cuDoubleComplex *h_kernelmat;
> int dim = EIGHT * NP * NZ;
> printf("\nCompute the propagator matrix.\n");
> prop_matrix_nucleon_sc_av_cuda(Mn, pp_nodes.data(), cos1_nodes.data());
> printf("\nCompute the propagator matrix done.\n");
> printf("\nCompute the kernel matrix.\n");
> kernel_matrix_nucleon_sc_av_cuda(Mn, pp_nodes.data(), pp_weights.data(), cos1_nodes.data(), cos1_weights.data());
> printf("\nCompute the kernel matrix done.\n");
> printf("\nCompute the full kernel matrix by multiplying kernel and propagator matrix.\n");
> // Allocate the host buffer that receives the kernel * propagator product
> auto *h_kernel_temp = new cuDoubleComplex[dim*dim];
> matmul_cublas_cuDoubleComplex(h_kernelmat, h_propmat, h_kernel_temp, dim, dim, dim);
> printf("\nCompute the full kernel matrix done.\n");
>
> // Solve the eigensystem with SLEPc
> printf("\nSolve the eigen system in the rest frame.\n");
> int nRows = dim;
> int nCols = dim;
> // Create a PETSc matrix that shares the data of h_kernel_temp
> Mat kernel;
> auto *h_kernel = (std::complex<double>*)(h_kernel_temp);
> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nRows, nCols, h_kernel, &kernel));
> PetscCall(MatAssemblyBegin(kernel, MAT_FINAL_ASSEMBLY));
> PetscCall(MatAssemblyEnd(kernel, MAT_FINAL_ASSEMBLY));
> ```
> But in this case, the compiler told me that the MatCreateDense function
> expects the data pointer to be of type "thrust::complex<double>" instead
> of "std::complex<double>".
> I am sure I configured and installed PETSc for the CPU only, without GPU
> support, and this code is written in a host function.
> Why does the function change its behavior? Did you also meet this problem
> when writing CUDA code, and how did you solve it?
> I tried copying the data into a new thrust::complex<double> matrix, but
> this is very time consuming since my matrix is very big. Is there a way to
> create the Mat from the original data without converting it to
> thrust::complex<double> in CUDA applications? Any response will be
> appreciated. Thank you!
>
> Best wishes,
> Langtian Liu
>
> ------
> Institute for Theoretical Physics, Justus-Liebig-University Giessen
> Heinrich-Buff-Ring 16, 35392 Giessen, Germany

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/