Hi, Langtian, Thanks for the configure.log and I now see what's wrong. Since you compiled your code with nvcc, we mistakenly thought petsc was configured with cuda. It is fixed in https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7909__;!!G_uCfscf7eWS!c2Xrs8q_dNGecGM-HpdhjKUScE6v8qSOSZa0ApqRhZJb2JUEhBq23ydXnFivVh45LPetYtrx-9xUkF_Z7VKbetWpWzW-$ , which will be in petsc/release and main.
Thanks. --Junchao Zhang On Wed, Oct 2, 2024 at 3:11 PM 刘浪天 <langtian....@icloud.com> wrote: > Hi Junchao, > > I check it, I haven't use cuda when install pure cpu version of petsc. > The configure.log has been attached. Thank you for your reply. > > Best wishes, > -------------------- Langtian Liu Institute for Theorectical Physics, > Justus-Liebig-University Giessen Heinrich-Buff-Ring 16, 35392 Giessen > Germany email: langtian....@icloud.com Tel: (+49)641 99 33342 > > On Oct 2, 2024, at 5:05 PM, Junchao Zhang <junchao.zh...@gmail.com> wrote: > > > > > On Wed, Oct 2, 2024 at 3:57 AM 刘浪天 via petsc-users < > petsc-users@mcs.anl.gov> wrote: > >> Hi all, >> >> I am using the PETSc and SLEPc to solve the Faddeev equation of baryons. >> I encounter a problem of function MatCreateDense when changing from CPU to >> CPU-GPU computations. >> At first, I write the codes in purely CPU computation in the following >> way and it works. >> ``` >> >> Eigen::MatrixXcd H_KER >> *;*Eigen::MatrixXcd G0*;* >> >> printf("\nCompute the propagator matrix.\n") >> *;*prop_matrix_nucleon_sc_av(Mn*, *pp_nodes*, *cos1_nodes) >> *;*printf("\nCompute the propagator matrix done.\n") >> *;*printf("\nCompute the kernel matrix.\n") >> *;*bse_kernel_nucleon_sc_av(Mn*, *pp_nodes*, *pp_weights*, *cos1_nodes*, >> *cos1_weights) >> *;*printf("\nCompute the kernel matrix done.\n") >> *;*printf("\nCompute the full kernel matrix by multiplying kernel and >> propagator matrix.\n") >> *;*MatrixXcd kernel_temp = H_KER * G0 >> *;*printf("\nCompute the full kernel matrix done.\n") >> >> *;*// Solve the eigen system with SLEPc >> printf("\nSolve the eigen system in the rest frame.\n") >> *;*// Get the size of the Eigen matrix >> int nRows = (int) kernel_temp.rows() >> *;*int nCols = (int) kernel_temp.cols()*;* >> >> // Create PETSc matrix and share the data of kernel_temp >> Mat kernel >> *;*PetscCall(MatCreateDense(PETSC_COMM_WORLD*, *PETSC_DECIDE*, >> *PETSC_DECIDE*, *nRows*, *nCols*, *kernel_temp.data()*, *&kernel)) >> *;*PetscCall(MatAssemblyBegin(kernel*, *MAT_FINAL_ASSEMBLY)) >> *;*PetscCall(MatAssemblyEnd(kernel*, *MAT_FINAL_ASSEMBLY))*;* >> >> ``` >> Now I change to compute the propagator and kernel matrices in GPU and >> then compute the largest eigenvalues in CPU using SLEPc in the ways below. >> ``` >> >> cuDoubleComplex *h_propmat >> *;*cuDoubleComplex *h_kernelmat*;* >> >> int dim = EIGHT * NP * NZ >> *;*printf("\nCompute the propagator matrix.\n") >> *;*prop_matrix_nucleon_sc_av_cuda(Mn*, *pp_nodes.data()*, *cos1_nodes.data()) >> *;*printf("\nCompute the propagator matrix done.\n") >> *;*printf("\nCompute the kernel matrix.\n") >> *;*kernel_matrix_nucleon_sc_av_cuda(Mn*, *pp_nodes.data()*, >> *pp_weights.data()*, *cos1_nodes.data()*, *cos1_weights.data()) >> *;*printf("\nCompute the kernel matrix done.\n") >> *;*printf("\nCompute the full kernel matrix by multiplying kernel and >> propagator matrix.\n") >> *;*// Map the raw arrays to Eigen matrices (column-major order) >> auto *h_kernel_temp = new cuDoubleComplex [dim*dim] >> *;*matmul_cublas_cuDoubleComplex(h_kernelmat*,*h_propmat*,*h_kernel_temp*,*dim*,*dim*,*dim) >> *;*printf("\nCompute the full kernel matrix done.\n") >> >> *;*// Solve the eigen system with SLEPc >> printf("\nSolve the eigen system in the rest frame.\n") >> *;*int nRows = dim >> *;*int nCols = dim*;* >> >> // Create PETSc matrix and share the data of kernel_temp >> Mat kernel >> *;*auto* h_kernel = (std::complex<double>*)(h_kernel_temp) >> *;*PetscCall(MatCreateDense(PETSC_COMM_WORLD*, *PETSC_DECIDE*, >> *PETSC_DECIDE*, *nRows*, *nCols*, *h_kernel_temp*, *&kernel)) >> *;*PetscCall(MatAssemblyBegin(kernel*, **MAT_FINAL_ASSEMBLY*)) >> *;*PetscCall(MatAssemblyEnd(kernel*, **MAT_FINAL_ASSEMBLY*))*;* >> >> But in this case, the compiler told me that the MatCreateDense function >> uses the data pointer as type of "thrust::complex<double>" instead of >> "std::complex<double>". >> I am sure I only configured and install PETSc in purely CPU without GPU >> and this codes are written in the host function. >> > Please double check that your PETSc was purely CPU configured. You can > find it at the end of your configure.log to see if petsc is configured with > CUDA. > Since *thrust::complex<double> *is a result of a petsc/cuda > configuration, I have this doubt. > > > >> >> Why the function changes its behavior? Did you also meet this problem >> when writing the cuda codes and how to solve this problem. >> I tried to copy the data to a new thrust::complex<double> matrix but this >> is very time consuming since my matrix is very big. Is there a way to >> create the Mat from the original data without changing the data type to >> thrust::complex<double> in the cuda applications? Any response will be >> appreciated. Thank you! >> >> Best wishes, >> Langtian Liu >> >> ------ >> Institute for Theorectical Physics, Justus-Liebig-University Giessen >> Heinrich-Buff-Ring 16, 35392 Giessen Germany >> >> >