Re: [boinc_dev] Server chooses wrong plan class for work allocation

Raistmer Thu, 08 Dec 2011 23:21:43 -0800

Some of the CUDA support DLLs (CUFFT, for example) are quite big, ~10 MB each.
Copying files of that size several times (running several tasks per GPU is
common practice on fast GPUs) every few minutes (a VHAR SETI task completes
in about 2 minutes even on a relatively old CUDA GPU) will put heavy load on
the HDD subsystem and will definitely hurt host performance.
Copying files of that size on a per-WU basis should not be considered a
solution, IMHO.

----- Original Message ----- 
From: Rom Walton 
To: David Anderson (BOINC) ; [email protected] ; Eric Korpela 
Sent: Friday, December 09, 2011 5:02 AM
Subject: Re: [boinc_dev] Server chooses wrong plan class for work allocation

I thought both the executable and supporting DLLs needed to have the
<copy_file/> attribute on Windows.

In that case the process would be launched from the slot directory and
the dll loaded from the slot directory.

----- Rom

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of David Anderson
Sent: Thursday, December 08, 2011 4:00 PM
To: [email protected]; Eric Korpela
Subject: Re: [boinc_dev] Server chooses wrong plan class for work
allocation

Interesting!
According to MS docs
(http://msdn.microsoft.com/en-us/library/7d83bc18%28v=vs.80%29.aspx)
the search order for DLLs is:
1) The directory where the executable module for the current process is
located.
2) The current directory.
3) ... some other directories

In our case, 1) is the project directory and 2) is the slot directory.
So the cuda23 app is being linked with the pre-2.3 versions of both
cudart.dll and cufft.dll.
There's a collision between the physical names of the 'cuda' DLLs and
the logical names of the 'cuda23' DLLs.
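
To illustrate the collision, here is a minimal sketch (not BOINC or
Windows loader code) that models the search order as a plain list of
directories. The helper names and the file contents "pre-2.3" and "2.3"
are assumptions made up for the illustration; only the directory order
and the DLL names come from the discussion above.

```python
import os
import tempfile


def resolve_dll(name, search_dirs):
    """Return the first existing path for `name`, walking search_dirs in order."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def write_file(path, text):
    with open(path, "w") as f:
        f.write(text)


def read_file(path):
    with open(path) as f:
        return f.read()


def demo():
    """Model the collision, then the rename fix; return what gets loaded each time."""
    with tempfile.TemporaryDirectory() as root:
        project_dir = os.path.join(root, "project")
        slot_dir = os.path.join(root, "slot")
        os.makedirs(project_dir)
        os.makedirs(slot_dir)

        # 'cuda' app version: physical name cufft.dll, stored in the project dir.
        write_file(os.path.join(project_dir, "cufft.dll"), "pre-2.3")
        # 'cuda23' app version: logical name cufft.dll, placed in the slot dir.
        write_file(os.path.join(slot_dir, "cufft.dll"), "2.3")

        # 1) directory of the executable (project dir), 2) current dir (slot dir).
        search_order = [project_dir, slot_dir]
        before = read_file(resolve_dll("cufft.dll", search_order))

        # Proposed fix: give the old DLL a distinct physical name.
        os.rename(
            os.path.join(project_dir, "cufft.dll"),
            os.path.join(project_dir, "cufft_10.dll"),
        )
        after = read_file(resolve_dll("cufft.dll", search_order))
        return before, after
```

In this model the lookup returns the project-directory copy until that
file is renamed, after which the slot-directory copy is found.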

The solution is to change the physical names of the 'cuda' version DLLs.
SETI@home needs to deploy a new app version for plan class 'cuda'
where the FFT lib has:
physical name: cufft_10.dll
logical name: cufft.dll
copy_file: true

... and similarly for cudart.dll
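
As a rough sketch of what such a file description could look like, the
snippet below builds an XML fragment with one entry per DLL. The element
names (version, file, physical_name, logical_name, copy_file) are an
assumption modelled on the three fields listed above, and cudart_10.dll
is a hypothetical name chosen by analogy with cufft_10.dll; check the
project documentation for the exact format before relying on it.

```python
import xml.etree.ElementTree as ET


def file_entry(physical_name, logical_name, copy_file=True):
    """Build one <file> element (element names are assumed, see lead-in)."""
    entry = ET.Element("file")
    ET.SubElement(entry, "physical_name").text = physical_name
    ET.SubElement(entry, "logical_name").text = logical_name
    if copy_file:
        # Empty flag element: the file is copied rather than linked.
        ET.SubElement(entry, "copy_file")
    return entry


def version_fragment():
    """Return an XML string describing the renamed 'cuda' DLLs."""
    version = ET.Element("version")
    version.append(file_entry("cufft_10.dll", "cufft.dll"))
    # Hypothetical physical name, by analogy with cufft_10.dll.
    version.append(file_entry("cudart_10.dll", "cudart.dll"))
    return ET.tostring(version, encoding="unicode")
```

Calling version_fragment() yields a single-line XML string with two file
entries, each mapping a distinct physical name to the logical name the
application expects.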

I'll update the docs to reflect this.

-----------

Linux: here we have control over the search order.
For compatibility with Windows, we use the same order (project dir, slot
dir, other).
So the Linux 'cuda' versions need to be changed as above also.

-- David

On 08-Dec-2011 12:07 PM, Richard Haselgrove wrote:
> Further investigation on this. I may have been getting work allocated 
> with plan_class 'cuda', instead of 'cuda23', because the processing 
> speeds were not as differentiated as they should have been.
>
> This is because DLLs from the wrong plan_class were being used.
>
> This may be specific to the SETI CUDA implementation, as originally 
> supplied by NVIDIA to kick-start CUDA on BOINC, but the cautionary 
> lesson may be helpful to other projects too.
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.