Re: [boinc_dev] [boinc_projects] [Bulk] Fwd: abnormally long result deadlines

Richard Haselgrove Tue, 17 Dec 2013 14:50:12 -0800

Now that SETI is back up, today's example of this behaviour is in 
http://setiathome.berkeley.edu/forum_thread.php?id=73556


Although one of the two complainants mentions cloning a hard disk, and hence 
might possibly have got himself into the situation this code is designed to 
detect (two machines with the same HostID / CPID, to cheat credit scores), it 
doesn't sound as if he could have allowed the RPC sequence numbers to become 
scrambled.
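
For anyone not familiar with the idea, here is a minimal sketch of the kind of sequence-number check I have in mind. It is not the code in handle_request.cpp; the names HostRecord and classify_request are made up for illustration, and I am assuming the server simply remembers the last RPC sequence number it saw for each host and treats a lower number as a sign of a copied state file.

```python
from dataclasses import dataclass


@dataclass
class HostRecord:
    host_id: int
    rpc_seqno: int  # last sequence number the server saw for this host


def classify_request(record: HostRecord, request_seqno: int) -> str:
    """Classify a scheduler request by its RPC sequence number.

    Returns "suspect_clone" when the client's counter is behind the
    server's record (as a restored or cloned disk image would be),
    otherwise stores the new number and returns "ok".
    """
    if request_seqno < record.rpc_seqno:
        return "suspect_clone"
    record.rpc_seqno = request_seqno
    return "ok"


if __name__ == "__main__":
    host = HostRecord(host_id=1, rpc_seqno=10)
    print(classify_request(host, 11))  # original machine, next request
    print(classify_request(host, 7))   # clone made from an older image
```

A host that has only ever run on one machine should keep counting upwards, which is why a cloned disk alone does not obviously explain the reports.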

I agree with Claggy, that for the observed case, some variant of 'resend lost 
results' would be ideal - though in these cases, the results are not lost, and 
a different form of resynchronisation between client and server records would 
be needed.

But the aspect of the situation which annoys volunteers the most is that they 
spend time and energy (i.e. money) computing tasks which are still cached on 
their local computer, only to find that the results are not accepted 
scientifically, and no credits are awarded. Even synchronisation via "aborted 
by server" (preventing the waste of time/resources) would be better than that.
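
To make the two options concrete, here is a rough sketch of what such a resynchronisation might decide, given the server's records and the list of tasks the client reports holding. Everything in it is an assumption for illustration: the function name reconcile, the state labels "in_progress" and "abandoned", and the action names are mine, not anything in the BOINC scheduler.

```python
from typing import Dict, Set


def reconcile(server_state: Dict[str, str], client_tasks: Set[str],
              reinstate: bool = True) -> Dict[str, str]:
    """Return one action per task that needs attention.

    server_state maps task name -> "in_progress" or "abandoned"
    (hypothetical labels). client_tasks is the set of task names the
    client says it still holds.
    """
    actions: Dict[str, str] = {}
    for name in sorted(client_tasks | set(server_state)):
        state = server_state.get(name)
        on_client = name in client_tasks
        if state == "in_progress":
            # A missing in-progress task is the 'resend lost results' case.
            actions[name] = "keep" if on_client else "resend"
        elif state == "abandoned" and on_client:
            # The case under discussion: accept the work, or at least
            # tell the client to stop computing it.
            actions[name] = "reinstate" if reinstate else "abort"
        elif on_client:
            # The client holds a task the server has no record of.
            actions[name] = "abort"
    return actions


if __name__ == "__main__":
    server = {"a": "in_progress", "b": "abandoned", "c": "in_progress"}
    print(reconcile(server, {"a", "b"}))
    print(reconcile(server, {"a", "b"}, reinstate=False))
```

Reinstating would need care where a replacement task has already been issued, but even the reinstate=False variant stops the volunteer computing work that will never be credited.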



>________________________________
> From: Stephen Maclagan <[email protected]>
>To: Richard Haselgrove <[email protected]>; Eric Driver 
><[email protected]>; 'Boinc Projects' <[email protected]>; 
>"[email protected]" <[email protected]> 
>Sent: Tuesday, 17 December 2013, 21:15
>Subject: Re: [boinc_dev] [boinc_projects] [Bulk] Fwd: abnormally long result 
>deadlines
> 
>
>My proposal is to allow the server to use results that have erroneously been 
>marked as detached. The hosts at SETI where this has happened still have the 
>tasks, and still crunch and report them (unless the owner notices and resets 
>the project). Not considering them for validation is a waste.
>
>Claggy
>
>> Date: Tue, 17 Dec 2013 17:07:59 +0000
>> From: [email protected]
>> To: [email protected]; [email protected]; 
>> [email protected]
>> Subject: Re: [boinc_dev] [boinc_projects] [Bulk] Fwd: abnormally long result 
>>    deadlines
>> 
>> I think we ought to look into this question of 'abandoned' results more 
>> carefully.
>> 
>> I can only find 'mark_results_over(host)' in two places in 
>> handle_request.cpp, and they're both to do with hosts where there's 
>> "evidence that the host has detached". But we get a steady dribble of 
>> reports - mostly at SETI, because that's the busiest message board - saying 
>> that results were marked as abandoned when the user had no intention of 
>> detaching and didn't consciously do so. There were two more today, not now 
>> accessible because of maintenance (look in Number Crunching when the 
>> database is back up).
>> 
>> I did attempt to look into it some months ago, but couldn't come up with a 
>> definitive smoking gun, even though I collected several reports from 
>> normally reliable witnesses, known on the boards. The nearest I came was 
>> some suspicion of correlation with timing issues - more than one user 
>> reported an unexpected 'last request too recent' in a scheduler reply, when 
>> the BOINC client itself was choosing when to send the scheduler request.
>> 
>> In the context of which, I've been noticing that the NumberFields @ Home 
>> server doesn't appear to be locked to an NTP server - it drifts slowly 
>> forward against UTC, and when I checked a couple of days ago, it was about 5 
>> minutes 30 seconds fast (task deadlines were 7:00:05:30 ahead of time of 
>> receipt). Sorry, I kept intending to mention that on the boards, but it 
>> didn't seem critical.
>> 
>> IIRC, client backoff is reported in the client event log as "Project 
>> requested delay of 91 seconds" (that's NF) - but in at least some cases, the 
>> client calculates an absolute time from the local clock and uses that to 
>> time the next RPC - I've seen confusion over timing when using a manager on 
>> one machine to observe/control a client on a different machine, if the 
>> clocks aren't properly synchronised. So maybe if somebody has time (sorry) 
>> to look through the timing of interactions between client and server, these 
>> mysterious 'abandonments' could be solved too.
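
The arithmetic behind those two observations can be sketched as follows. This is only an illustration under my own assumptions: the function names are invented, and I am assuming the deadline is simply the server's clock plus delay_bound, and that the client turns a requested delay into an absolute time on its own clock.

```python
from datetime import datetime, timedelta


def deadline_offset_seen_by_client(server_skew: timedelta,
                                   delay_bound: timedelta) -> timedelta:
    """How far ahead of local receipt time the deadline appears when the
    server's clock runs ahead of the client's by server_skew."""
    return delay_bound + server_skew


def next_rpc_time(local_now: datetime, requested_delay: timedelta) -> datetime:
    """Absolute time of the next scheduler RPC, from the local clock."""
    return local_now + requested_delay


def remaining_as_shown(next_rpc: datetime, observer_now: datetime) -> timedelta:
    """Time to the next RPC as it would look to an observer (for example
    a manager on another machine) reading its own clock."""
    return next_rpc - observer_now


if __name__ == "__main__":
    skew = timedelta(minutes=5, seconds=30)
    print(deadline_offset_seen_by_client(skew, timedelta(days=7)))

    client_now = datetime(2013, 12, 15, 12, 0, 0)
    rpc_at = next_rpc_time(client_now, timedelta(seconds=91))
    # An observer whose clock is 60 seconds behind the client's.
    print(remaining_as_shown(rpc_at, client_now - timedelta(seconds=60)))
```

With a server 5 minutes 30 seconds fast and a 7 day delay_bound, the first function gives the 7 days 0:05:30 I observed. The second part shows only how a 91 second delay can look like 151 seconds from a machine with a different clock; it does not by itself explain 'last request too recent'.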
>> 
>> 
>> 
>> >________________________________
>> > From: Eric Driver <[email protected]>
>> >To: 'Boinc Projects' <[email protected]>; 
>> >[email protected] 
>> >Sent: Tuesday, 17 December 2013, 15:56
>> >Subject: Re: [boinc_projects] [Bulk] Fwd: [boinc_dev] abnormally long 
>> >result deadlines
>> > 
>> >
>> >Not sure if this is just a coincidence, but in every case where this
>> >happened, it was with a reissued result after the first result had an
>> >outcome of "abandoned".
>> >
>> >Eric
>> >
>> >
>> >-----Original Message-----
>> >From: boinc_projects [mailto:[email protected]] On
>> >Behalf Of Ian Hay
>> >Sent: Tuesday, December 17, 2013 7:06 AM
>> >To: David Anderson; Boinc Projects
>> >Subject: Re: [boinc_projects] [Bulk] Fwd: [boinc_dev] abnormally long result
>> >deadlines
>> >
>> >I spotted massively extended deadlines in a couple of VolPEx@UH 
>> >workunits almost 3 months ago 
>> >(http://volpex.cs.uh.edu/VCP/forum_thread.php?id=128&postid=713).
>> >
>> >The project's tasks are from jobs made up of a pair of main+worker 
>> >workunits, sometimes with hundreds of tasks in each of them.  All tasks 
>> >should have a 5 minute deadline, but a small number from a couple of the 
>> >main workunits had their deadline set to a few days over 19 years (the 
>> >rest of the main tasks and all of the worker ones had correct 
>> >deadlines).  The affected tasks were issued at the end of September and 
>> >had a deadline in early October 2032.
>> >
>> >I haven't seen this with any more of their jobs, but that doesn't mean 
>> >it hasn't happened (the workunits for a job are deleted as soon as it's 
>> >been completed).
>> >
>> >Ian
>> >
>> >David Anderson wrote on 16/12/2013 19:24:
>> >> Anyone else see this behavior?
>> >> I can't think of why it would happen.
>> >> -- David
>> >>
>> >>
>> >> -------- Original Message --------
>> >> Subject: [boinc_dev] abnormally long result deadlines
>> >> Date: Sun, 15 Dec 2013 23:37:39 -0700
>> >> From: Eric Driver <[email protected]>
>> >> To: <[email protected]>
>> >>
>> >> Hello,
>> >>
>> >> Does anyone know what would cause a small number of results to be created
>> >> with a very long deadline?  The delay_bound is set in the wu template to 7
>> >> days, but a handful of results were given a deadline next September.
>> >>
>> >> Eric Driver
>> >>
>> >> NumberFields@home
>> >_______________________________________________
>> >boinc_projects mailing list
>> >[email protected]
>> >http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_projects
>> >To unsubscribe, visit the above URL and
>> >(near bottom of page) enter your email address.
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
