BOINC currently has no mechanism for deleting files from versions of deprecated 
apps.
I added an item for this on GitHub.
It's not hard, but it involves both client and scheduler changes.

-- David

On 31-Aug-2015 10:34 AM, Richard Haselgrove wrote:
I've been following along with the discussion, and checking out my own CPDN 
directory. We need to understand what is going on, and consider what is BOINC's 
responsibility, what is CPDN's responsibility, and what is the user's 
responsibility.
The rig I'm looking at has crunched for CPDN, off and on, since 1 January 2007. 
It's my dual Xeon workstation, which started life under Windows XP, and now 
runs Windows 7. It's had two GPU transplants: it started life with probably 
some variant of BOINC v5, and now runs v7.6.9.
At the moment, I'm not running any CPDN tasks, so the CPDN project folder is a 
fair representation of the minimal state. It contains 278 files, totalling just 
under a gigabyte (those are real files on disk, not BOINC's enumeration or 
graphical display of them).
171 of those files are listed in client_state.xml, in connection with the 10 
different app_versions listed there. There are 18 EXE files, 119 GZ files, and 
34 ZIP files. These files, I would suggest, are BOINC's responsibility. When an 
application version number is bumped, in my experience, BOINC deletes the files 
for the previous version and replaces them. That seems fine. Where we might 
have a basis for discussion is when a whole application is deprecated without 
replacement. I don't think BOINC has a mechanism for removing the app_version 
section, and hence its associated files, from the user machine state (David 
will correct me if I'm wrong). I haven't got as far as a metric for the disk 
space occupied by files specified by any deprecated CPDN application, but it 
may be non-trivial.
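
The split described above, between files that client_state.xml knows about and files that merely sit in the project folder, can be sketched roughly as follows. This is only an illustration, not BOINC code. It assumes that file records appear as <file> or <file_info> elements with a <name> child, and that app versions refer to them through <file_ref><file_name>. The sample file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a real client_state.xml.
SAMPLE_STATE = """<client_state>
  <file><name>model_1.00.exe</name></file>
  <file_info><name>data_1.00.zip</name></file_info>
  <app_version>
    <app_name>model</app_name>
    <file_ref><file_name>model_1.00.exe</file_name></file_ref>
  </app_version>
</client_state>"""


def referenced_names(client_state_xml):
    """Collect every file name mentioned in a client_state.xml document."""
    root = ET.fromstring(client_state_xml)
    names = set()
    # Assumption: file records are <file> or <file_info> with a <name> child.
    for tag in ("file", "file_info"):
        for elem in root.iter(tag):
            name = elem.findtext("name")
            if name:
                names.add(name.strip())
    # Assumption: app versions point at files through <file_name> elements.
    for elem in root.iter("file_name"):
        if elem.text:
            names.add(elem.text.strip())
    return names


def unlisted_files(on_disk, client_state_xml):
    """Return the names in on_disk that client_state.xml does not mention."""
    known = referenced_names(client_state_xml)
    return sorted(name for name in on_disk if name not in known)


if __name__ == "__main__":
    on_disk = ["model_1.00.exe", "data_1.00.zip", "old_model.gz", "ReadMe.txt"]
    print(unlisted_files(on_disk, SAMPLE_STATE))
```

For a real check, the on-disk list could come from os.listdir() on the project folder, with the XML read from client_state.xml while the client is stopped. The sketch only reports names and deletes nothing.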
The next group of files I noticed were program-related files extracted from the 
archives specified in client_state - 16 DLLs and 36 EXE files. CPDN extracts 
those, not BOINC: should they be deleted after each task, and re-extracted for 
the next one? Hardly seems worth it.
And I seem to have some 27 archive files (ZIP and GZ) containing data for 
failed models - maybe 100 MB in total, but not big by modern standards.
It should be noted that CPDN - as a project, nothing to do with BOINC - creates 
a separate sub-directory within the BOINC Data project directory for each model 
run. These task directories are - under Windows at least, there's some doubt 
about Linux - deleted by CPDN on successful conclusion of the run: it is these 
project-managed sub-directories which are the subject of the warnings about 
disk space consumption in the project FAQs, because they can get orphaned when 
tasks exit abnormally. Because the orphaned files are in separate, 
self-contained, subdirectories, they are easy to identify and delete manually, 
and I've been doing it as I've noticed them over the years.
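
Spotting such leftover task directories by hand can be helped with a small report like the one sketched below. It is a sketch under stated assumptions, not anything CPDN or BOINC ships. The function names are made up for illustration, and it assumes the caller already knows which sub-directory names belong to tasks still in progress. It only lists candidates and never deletes anything.

```python
import os
import tempfile


def subdir_sizes(project_dir):
    """Map each immediate sub-directory of project_dir to its size in bytes."""
    sizes = {}
    for entry in os.scandir(project_dir):
        if not entry.is_dir(follow_symlinks=False):
            continue  # plain files in the project folder are ignored here
        total = 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                path = os.path.join(root, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


def leftover_candidates(project_dir, active_names):
    """List (name, bytes) for sub-directories not in active_names, largest first.

    Nothing is deleted: the result is a report for a human to review.
    """
    active = set(active_names)
    rows = [(n, s) for n, s in subdir_sizes(project_dir).items() if n not in active]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    # Throwaway demo tree with hypothetical task directory names.
    with tempfile.TemporaryDirectory() as demo:
        for task, payload in (("run_a", b"x" * 10), ("run_b", b"x" * 30)):
            os.mkdir(os.path.join(demo, task))
            with open(os.path.join(demo, task, "out.dat"), "wb") as fh:
                fh.write(payload)
        print(leftover_candidates(demo, active_names=["run_a"]))
```

Keeping this as a read-only report matches the point made here: the decision to delete stays with the user or the project.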
I would urge that "failed task subdirectories" remain the responsibility of the 
project and their documentation/public relations team: I would ask that BOINC should 
*NOT* take it upon itself to delete directories (or indeed files) not listed in 
client_state.xml. For example, the SETI optimised application installer which I maintain 
places ReadMe files in a documentation sub-directory: should BOINC declare that folder as 
against the rules too, and remove it?
Finally, we had a discussion about <sticky/> files a few years ago. The 
conclusion, after discussion with Einstein developers and administrators (they are 
heavy users of the sticky mechanism with their locality scheduler), was that they 
should be removed on project reset. That is certainly the case with FiND's sticky 
files, which accumulate without limit, and can be cleansed with a project reset 
between batches. That seems like the right decision to me, and I don't feel we need 
to revisit it.


On Monday, 31 August 2015, 17:42, Jord van der Elst <[email protected]> wrote:
On Mon, Aug 31, 2015 at 6:15 PM, Jacob Klein <[email protected]> wrote:

Jord, I appreciate your reply, and I'd like to clarify some things:

- I'm not requesting to allow the user to adjust the deadline. I'm
requesting that the user be able to cherry pick a task and say "get this
one done now." It'd be a nice feature to have, for the situation I was in,
but I "had to" resort to other means, namely hijacking the task's local
deadline.

In my opinion, there is no difference between them. Add such an option and
users will be using it to tell BOINC to run tasks in the order they want,
instead of what the scheduler thinks they should run in.

CPDN has one additional piece of advice and that is to run their models as
much as possible alone on BOINC, precisely because of those long deadlines.
But not only are the CPDN deadlines long, the time that it takes to run a
model can be extremely long as well (several months, even when run 24/7).
It still depends on the application you choose to run, of course.


- The problem was not "freeing up space for other non-BOINC applications."
The problem was "how to shrink the data directory, so it doesn't take me so
damn long to back it up, on these 2 old laptops with old HDDs."

As I said, if you do not want to use that much disk space, you can adjust
it in the preferences, allow less work to be cached etc.

That said,
Yes, BOINC should clean up after itself but in my opinion only on things it
knows about.
It'll know that an application exits cleanly, after which BOINC uploads
everything and reports that.
The question is, does it know about crashing applications and that they can
leave stuff behind?

Do know that this stuff is left behind in the projects directory, not in a
slot directory. As far as I know, BOINC doesn't keep track of everything
that goes on in these project directories. It may know about the latest
applications, the not yet started tasks, the in-progress tasks, the simple
view slideshow images.
Should it keep track of everything happening in those directories?
Should it delete everything it hasn't got a record of?
When should it do this deleting, every so many hours while running, or upon
the start of the client?
What then of things you add manually, for example for the anonymous platform?

With the possibility of it going wrong, isn't it then easier to have the
user manually delete the affected project directory with everything in it?
Should we automate everything?



-- Jord van der Elst.



------------------------------
From: [email protected]
Date: Mon, 31 Aug 2015 18:02:04 +0200
Subject: Re: [boinc_alpha] Large files and Reset project - A couple quirks
To: [email protected]
CC: [email protected]; [email protected]


A couple of observations.

- That's when I noticed that a project (climateprediction.net) was
using 18.5 GB. That's a LOT, for a single project.
That's what you get with a weather emulation modeling project that has
several tens of applications. According to their own FAQ (
http://www.climateprediction.net/support/technical-faq/#How_much_disk_space_do_the_models_take_up)
their models take up 2GB of disk space per model.
Also, from
http://www.climateprediction.net/support/technical-faq/#Why_is_there_data_left_on_my_disk
"5.1 Why is there data left on my disk? When models crash and the model
restarts, this can leave data on your computer. In this case, you might
need to manually delete data."
- Problem #1: Should BOINC allow a completed task to keep storing
potentially-large files? Is something being missed?
Keep in mind that for some projects 'completed tasks' leave behind data
because the data files are used at a later stage for more tasks. Einstein
comes to mind as a project that does this.

- Problem #2: I had to edit client_state.xml, to set <report_deadline>
equal to <received_time>, to force earliest-deadline-first (high-priority
mode) on the tasks. Is there no better way to do this? Might we consider
providing this option, on some to-do list?
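
For readers unsure what that manual edit amounts to, here is a rough sketch of the same change applied to an in-memory copy of the XML. It is an illustration of the workaround described above, not a recommended or supported procedure. It assumes each task is a <result> element with <name>, <received_time> and <report_deadline> children. The function name and sample task names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a real client_state.xml.
SAMPLE_STATE = """<client_state>
  <result>
    <name>task_one</name>
    <report_deadline>1500000000.000000</report_deadline>
    <received_time>1440000000.000000</received_time>
  </result>
  <result>
    <name>task_two</name>
    <report_deadline>1600000000.000000</report_deadline>
    <received_time>1441000000.000000</received_time>
  </result>
</client_state>"""


def pull_deadline_forward(client_state_xml, task_name):
    """Set <report_deadline> equal to <received_time> for one named task.

    Returns the modified XML text and the number of tasks changed.
    """
    root = ET.fromstring(client_state_xml)
    changed = 0
    for result in root.iter("result"):
        if result.findtext("name") != task_name:
            continue
        received = result.find("received_time")
        deadline = result.find("report_deadline")
        if received is not None and deadline is not None:
            deadline.text = received.text
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, count = pull_deadline_forward(SAMPLE_STATE, "task_one")
    print(count)
    print(new_xml)
```

The sketch works on a string, so it touches no live files. Editing the real client_state.xml would need the client stopped first, and it overrides the deadline the project set.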

We might as well do away with the deadline then, because as soon as you give
users an easy option to edit the deadline to something they like instead of
what the project wants, they'll abuse it. Also, what's with the 'I had to'?
You did all this out of your own free will, no one forced you. So instead
of saying that there appeared no other option than for you to manually edit
a file, you can also say that 'This resulted in me editing file X, doing
things, and isn't there an easier method to do so the next time'?

But still, is there something wrong/incomplete, with the Reset mechanism
- does it leave big files behind?
Reset will not clean up sticky files.
Reset will not clean up app_info.xml or the executables and libraries
listed in that file.
I *think* it also leaves app_config.xml alone, but I could be mistaken
there.

All of these are minor, especially #2 (pro-user here), but still...
Together they add up to an experience that could confuse a normal user who
is looking to regain some disk space or shrink their data directory.

On the other hand, disk space is cheap these days, even on an SSD. The
first 6 TB SSD hit the market just this past July.
So unless you have an ancient system with a small hard drive, you'll
probably have a system with one or more HDDs with several terabytes of
space.
Is 20 GB for BOINC really so much, then?

Of course, if you just do not want BOINC or a project to take up that much
disk space, you can set your preferences to say so with '*Use no more than N GB*'.


- Jord van der Elst




_______________________________________________
boinc_alpha mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_alpha
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.


_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
