I've investigated this further. I may have been getting work allocated with
plan_class 'cuda' instead of 'cuda23' because the measured processing speeds
of the two were not as different as they should have been. The reason is that
DLLs from the wrong plan_class were being loaded.
This may be specific to the SETI CUDA implementation, as originally supplied by
NVIDIA to kick-start CUDA on BOINC, but the cautionary lesson may be helpful to
other projects too.
Both SETI application versions 608 and 609 rely heavily on NVIDIA's CUDA FFT
library. In those early days, the library file name wasn't versioned, so both
applications depend on a file called cufft.dll - in fact, the .dll files are
interchangeable between the two applications, and the increased speed of the
609 application is almost entirely due to the cuda23 version of cufft.dll
distributed with it.
Using BOINC v6.12.34 on Windows XP, here is how the applications and their
support files are referenced in client_state.xml:
<app_version>
    <version_num>608</version_num>
    <file_ref>
        <file_name>setiathome_6.08_windows_intelx86__cuda.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart.dll</file_name>
        <open_name>cudart.dll</open_name>
    </file_ref>
    <file_ref>
        <file_name>cufft.dll</file_name>
        <open_name>cufft.dll</open_name>
    </file_ref>
</app_version>
<app_version>
    <version_num>609</version_num>
    <file_ref>
        <file_name>setiathome_6.09_windows_intelx86__cuda23.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart_23_win32.dll</file_name>
        <open_name>cudart.dll</open_name>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>cufft_23_win32.dll</file_name>
        <open_name>cufft.dll</open_name>
        <copy_file/>
    </file_ref>
</app_version>
So, there is a file called cufft.dll in the project directory, and a second
(different and larger) file called cufft.dll in the slot directory while
version 609 is running.
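
A quick way to spot this kind of name clash is to look for any file_ref whose
open_name is also the real file_name of a different file. The following is
only a rough sketch: the <client_state> wrapper element, the trimmed sample
data and the function name find_open_name_collisions are illustrative
assumptions, not anything taken from BOINC itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two <app_version> entries shown above, wrapped in a
# hypothetical <client_state> root so that it parses as a single document.
SAMPLE = """<client_state>
<app_version>
    <version_num>608</version_num>
    <file_ref>
        <file_name>setiathome_6.08_windows_intelx86__cuda.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart.dll</file_name>
        <open_name>cudart.dll</open_name>
    </file_ref>
    <file_ref>
        <file_name>cufft.dll</file_name>
        <open_name>cufft.dll</open_name>
    </file_ref>
</app_version>
<app_version>
    <version_num>609</version_num>
    <file_ref>
        <file_name>setiathome_6.09_windows_intelx86__cuda23.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>cudart_23_win32.dll</file_name>
        <open_name>cudart.dll</open_name>
        <copy_file/>
    </file_ref>
    <file_ref>
        <file_name>cufft_23_win32.dll</file_name>
        <open_name>cufft.dll</open_name>
        <copy_file/>
    </file_ref>
</app_version>
</client_state>"""


def find_open_name_collisions(xml_text):
    """Return (version_num, file_name, open_name) for every file_ref that is
    opened under a name which is also the real file_name of another file."""
    root = ET.fromstring(xml_text)
    refs = []
    for app in root.iter("app_version"):
        version = app.findtext("version_num")
        for ref in app.iter("file_ref"):
            file_name = ref.findtext("file_name")
            # A file_ref without <open_name> is opened under its own name.
            open_name = ref.findtext("open_name") or file_name
            refs.append((version, file_name, open_name))
    # Names under which files physically exist in the project directory.
    physical = {file_name for _, file_name, _ in refs}
    return [r for r in refs if r[2] != r[1] and r[2] in physical]


collisions = find_open_name_collisions(SAMPLE)
for version, file_name, open_name in collisions:
    print(f"version {version}: {file_name} is opened as {open_name}, "
          f"which also exists as a separate file")
```

On the sample above this flags both cudart.dll and cufft.dll for version 609.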
However, Process Explorer reveals that the version in the project directory is
being used when application 609 is running - the wrong version: see screenshot
http://img850.imageshack.us/img850/9923/processexplorerwrongdll.png
I don't know how a BOINC application main program is supposed to traverse the
directory search path in search of support DLLs, but in this case it appears to
be looking in the project folder first, and because of the ambiguous file
names, stopping there.
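
For what it's worth, the default Windows DLL search order tries the directory
the executable was loaded from before it tries the current working directory.
Assuming the 609 main program is started from the project directory with the
slot directory as its working directory (an assumption, not checked against
the client source), the behaviour above is what one would expect. The sketch
below only simulates that ordering with temporary directories; resolve_dll
and the directory names are hypothetical, and the system directories that
Windows also searches are left out.

```python
import os
import tempfile


def resolve_dll(name, search_dirs):
    """Return the first path in search_dirs that holds a file called name,
    or None if no directory does (first match wins)."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    project_dir = os.path.join(root, "project")  # where the .exe lives
    slot_dir = os.path.join(root, "slot")        # working directory
    os.makedirs(project_dir)
    os.makedirs(slot_dir)

    # Stand-ins for the two different files that share one name.
    with open(os.path.join(project_dir, "cufft.dll"), "w") as f:
        f.write("old cuda build")
    with open(os.path.join(slot_dir, "cufft.dll"), "w") as f:
        f.write("cuda23 build, copied in under the open_name")

    # Application directory first, working directory later.
    search_order = [project_dir, slot_dir]
    found = resolve_dll("cufft.dll", search_order)
    loaded_from = os.path.basename(os.path.dirname(found))
    print(loaded_from)  # prints "project"
```

Under these assumptions the copy in the slot directory is never reached,
because the search stops at the first cufft.dll it finds.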
The idea was that when multiple compatible plan_classes existed, the server
would randomly allocate test work to the presumed 'lower' plan class, to
ensure that future work was actually assigned to the application measured to
be fastest on the host in question. In this case, the test seems to have the
opposite effect: once the files for the 'cuda' application are in the project
directory, the faster application picks up the older cufft.dll and is slowed
down to the speed of the slower one.
----- Original Message -----
> For the last few weeks, it looks as if the SETI@home server has been making
> strange plan_class choices.
>
> Here's a screen-shot taken this afternoon.
>
> http://img46.imageshack.us/img46/1426/cudaorcuda23.png
>
> There seem to be almost as many tasks assigned to plan class 'cuda' as there
> are to plan class 'cuda23': on this project, both are compatible with the
> card in use, but cuda23 is usually significantly faster, so should be chosen
> in preference when available.
>
> The two tasks I've picked out in red have different runtime estimates, but
> exactly the same deadline, so they must have been allocated in the same
> scheduler RPC.
>
> Looking at the datestamps on the executable files in the project directory,
> it appears as if this started happening around 9 November (I use European
> date ordering).
>
> Directory of D:\BOINCdata\projects\setiathome.berkeley.edu
>
> 09/11/2011 21:46 1,445,888 setiathome_6.08_windows_intelx86__cuda.exe
> 07/10/2011 06:13 2,859,008 setiathome_6.09_windows_intelx86__cuda23.exe
>
> I normally run optimised applications on this host, but de-optimised in late
> September or early October. So I'm pretty certain that all the 519 completed
> tasks for 'SETI@home Enhanced 6.08 windows_intelx86 (cuda)' have been run in
> the last four weeks.
>
> http://setiathome.berkeley.edu/host_app_versions.php?hostid=3751792
>
> That's far more than I would expect from the planned random test, which is
> only meant to check that the relative speeds are as expected.
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
>