Karl Lin <karl.lin...@gmail.com> writes:

> Thanks for the reply. However, if the matrix is huge, like 13.5TB in our
> case, it will take a significant amount of time to loop over the insertion
> twice. Are there any other options that save time and resources? Thank you
> very much.

Where do the matrix entries come from?

Counting nonzeros should run at near STREAM bandwidth, which is about
200-300 GB/s for modern 2-socket compute nodes.  How many nodes do you
need to have the memory capacity?  On 100 nodes, that preallocation
counting pass should take less than a second.
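
As a rough check of that estimate, here is a back-of-the-envelope sketch in
Python. It assumes decimal units (1 TB = 1e12 bytes), that the 13.5TB is
spread evenly over the nodes, and that the counting pass streams roughly the
matrix's own data volume once at the per-node bandwidth quoted above. The
function name is made up for illustration.

```python
def counting_pass_seconds(matrix_bytes, nodes, node_bandwidth_bytes_per_s):
    """Time to stream the data once, assuming an even split across nodes
    and that every node sustains the given memory bandwidth."""
    aggregate_bandwidth = nodes * node_bandwidth_bytes_per_s
    return matrix_bytes / aggregate_bandwidth


TB = 1e12  # decimal terabyte (assumption)
GB = 1e9   # decimal gigabyte (assumption)

# Evaluate both ends of the 200-300 GB/s range on 100 nodes.
for bandwidth in (200 * GB, 300 * GB):
    seconds = counting_pass_seconds(13.5 * TB, 100, bandwidth)
    print(f"{bandwidth / GB:.0f} GB/s per node: {seconds:.3f} s")
```

Under these assumptions the pass takes about 0.68 s at 200 GB/s and 0.45 s at
300 GB/s, so below one second either way. If computing the entries costs more
than reading memory, the real time will be set by that computation instead.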
