Returning back to this, since block based storage, which can act as a
shared storage/transport layer, is ready with 5'th release of the DST.
My couple of notes on proposed data distribution algorithm in FS.
On Sun, Sep 16, 2007 at 03:07:11AM -0400, Kyle Moffett ([EMAIL PROTECTED])
wrote:
> >I ac
Hi Pavel.
On Mon, Sep 17, 2007 at 06:22:30PM +, Pavel Machek ([EMAIL PROTECTED])
wrote:
> > I'm pleased to announce third release of the distributed storage
> > subsystem, which allows to form a storage on top of remote and local
> > nodes, which in turn can be exported to another storage as
Hi!
> I'm pleased to announce third release of the distributed storage
> subsystem, which allows to form a storage on top of remote and local
> nodes, which in turn can be exported to another storage as a node to
> form tree-like storages.
How is this different from raid0/1 over nbd? Or raid0/1 o
On Sat, Sep 15, 2007 at 11:24:46AM -0600, Andreas Dilger ([EMAIL PROTECTED])
wrote:
> > When Chris Mason announced btrfs, I found that quite a few new ideas
> > are already implemented there, so I postponed project (although
> > direction of the developement of the btrfs seems to move to the zfs
On Sep 15, 2007, at 13:24:46, Andreas Dilger wrote:
On Sep 15, 2007 16:29 +0400, Evgeniy Polyakov wrote:
Yes, block device itself is not able to scale well, but it is the
place for redundancy, since filesystem will just fail if
underlying device does not work correctly and FS actually does n
On Sep 15, 2007 12:20 -0400, Robin Humble wrote:
> On Sat, Sep 15, 2007 at 10:35:16AM -0400, Jeff Garzik wrote:
> >Lustre is tilted far too much towards high-priced storage,
>
> many (most?) Lustre deployments are with SATA and md raid5 and GigE -
> can't get much cheaper than that.
I have to ag
On Sep 15, 2007 16:29 +0400, Evgeniy Polyakov wrote:
> Yes, block device itself is not able to scale well, but it is the place
> for redundancy, since filesystem will just fail if underlying device
> does not work correctly and FS actually does not know about where it
> should place redundancy bit
On Sat, Sep 15, 2007 at 10:35:16AM -0400, Jeff Garzik wrote:
>Robin Humble wrote:
>>On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
>>>I've been waiting for years for a smart person to come along and write a
>>>POSIX-only distributed filesystem.
>>it's called Lustre.
>>works well, sca
Robin Humble wrote:
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
It is my hope that you will put your skills towards a distributed
filesystem :) Of the current solutions, GFS (currently in kernel)
scales poorly, and NFS v4.1 is amazingly bloated and overly complex.
I've been
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
>It is my hope that you will put your skills towards a distributed
>filesystem :) Of the current solutions, GFS (currently in kernel)
>scales poorly, and NFS v4.1 is amazingly bloated and overly complex.
>
>I've been waiting for years
Hi Mike.
On Fri, Sep 14, 2007 at 10:54:56PM -0400, Mike Snitzer ([EMAIL PROTECTED])
wrote:
> This distributed storage is very much needed; even if it were to act
> as a more capable/performant replacement for NBD (or MD+NBD) in the
> near term. Many high availability applications don't _need_ al
Hi Jeff.
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik ([EMAIL PROTECTED]) wrote:
> >Further TODO list includes:
> >* implement optional saving of mirroring/linear information on the remote
> > nodes (simple)
> >* new redundancy algorithm (complex)
> >* some thoughts about distributed
On Sat, Sep 15, 2007 at 12:08:42AM -0400, Jeff Garzik wrote:
> J. Bruce Fields wrote:
>> No, servers are required to support ordinary nfs operations to the
>> metadata server.
>> At least, that's the way it was last I heard, which was a while ago. I
>> agree that it'd stink (for any number of reas
J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 06:32:11PM -0400, Jeff Garzik wrote:
J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 05:14:53PM -0400, Jeff Garzik wrote:
NFSv4.1 adds to the fun, by throwing interoperability completely out the
window.
What parts are you worried about in partic
On 9/14/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Evgeniy Polyakov wrote:
> > Hi.
> >
> > I'm pleased to announce fourth release of the distributed storage
> > subsystem, which allows to form a storage on top of remote and local
> > nodes, which in turn can be exported to another storage as a no
On Fri, Sep 14, 2007 at 06:32:11PM -0400, Jeff Garzik wrote:
> J. Bruce Fields wrote:
>> On Fri, Sep 14, 2007 at 05:14:53PM -0400, Jeff Garzik wrote:
>>> J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
> I've been waiting for years for a smart person to
J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 05:14:53PM -0400, Jeff Garzik wrote:
J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
I've been waiting for years for a smart person to come along and write a
POSIX-only distributed filesystem.
What exactly do y
On Fri, Sep 14, 2007 at 05:14:53PM -0400, Jeff Garzik wrote:
> J. Bruce Fields wrote:
>> On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
>>> I've been waiting for years for a smart person to come along and write a
>>> POSIX-only distributed filesystem.
>
>> What exactly do you mean by
J. Bruce Fields wrote:
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
I've been waiting for years for a smart person to come along and write a
POSIX-only distributed filesystem.
What exactly do you mean by "POSIX-only"?
Don't bother supporting attributes, file modes, and other
On Fri, Sep 14, 2007 at 03:07:46PM -0400, Jeff Garzik wrote:
> My thoughts. But first a disclaimer: Perhaps you will recall me as one
> of the people who really reads all your patches, and examines your code and
> proposals closely. So, with that in mind...
>
> I question the value of distrib
Jeff Garzik wrote:
> Evgeniy Polyakov wrote:
> > Hi.
> >
> > I'm pleased to announce fourth release of the distributed storage
> > subsystem, which allows to form a storage on top of remote and local
> > nodes, which in turn can be exported to another storage as a node to
> > form tree-like storage
Evgeniy Polyakov wrote:
Hi.
I'm pleased to announce fourth release of the distributed storage
subsystem, which allows to form a storage on top of remote and local
nodes, which in turn can be exported to another storage as a node to
form tree-like storages.
This release includes new configuratio
On Thu, Sep 13, 2007 at 04:22:59PM +0400, Evgeniy Polyakov wrote:
> Hi Paul.
>
> On Mon, Sep 10, 2007 at 03:14:45PM -0700, Paul E. McKenney ([EMAIL
> PROTECTED]) wrote:
> > > Further TODO list includes:
> > > * implement optional saving of mirroring/linear information on the remote
> > > nodes
Hi Paul.
On Mon, Sep 10, 2007 at 03:14:45PM -0700, Paul E. McKenney ([EMAIL PROTECTED])
wrote:
> > Further TODO list includes:
> > * implement optional saving of mirroring/linear information on the remote
> > nodes (simple)
> > * implement netlink based setup (simple)
> > * new redundancy alg
On Friday 31 August 2007 14:41, Alasdair G Kergon wrote:
> On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips wrote:
> > Resubmitting a bio or submitting a dependent bio from
> > inside a block driver does not need to be throttled because all
> > resources required to guarantee completion mu
On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips wrote:
> Resubmitting a bio or submitting a dependent bio from
> inside a block driver does not need to be throttled because all
> resources required to guarantee completion must have been obtained
> _before_ the bio was allowed to procee
Hi Daniel.
On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote:
> > Then, if of course you will want, which I doubt, you can reread
> > previous mails and find that it was pointed to that race and
> > pos
On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote:
> Then, if of course you will want, which I doubt, you can reread
> previous mails and find that it was pointed to that race and
> possibilities to solve it way too long ago.
What still bothers me about your response is that, while you kno
On Tue, Aug 28, 2007 at 02:08:04PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote:
> > On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL
> > PROTECTED]) wrote:
> > > > We do not care about one cpu being able to increase
On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote:
> On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL PROTECTED])
> wrote:
> > > We do not care about one cpu being able to increase its counter
> > > higher than the limit, such inaccuracy (maximum bios in flight
> > > thus ca
On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > We do not care about one cpu being able to increase its counter
> > higher than the limit, such inaccuracy (maximum bios in flight thus
> > can be more than limit, difference is equal to the number of CPUs -
>
On Tuesday 28 August 2007 02:35, Evgeniy Polyakov wrote:
> On Mon, Aug 27, 2007 at 02:57:37PM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > Say Evgeniy, something I was curious about but forgot to ask you
> > earlier...
> >
> > On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> >
On Fri, Aug 03, 2007 at 09:04:51AM +0400, Manu Abraham ([EMAIL PROTECTED])
wrote:
> On 7/31/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
>
> > TODO list currently includes following main items:
> > * redundancy algorithm (drop me a request of your own, but it is highly
> > unlikley
On Mon, Aug 27, 2007 at 02:57:37PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> Say Evgeniy, something I was curious about but forgot to ask you
> earlier...
>
> On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> > ...All oerations are not atomic, since we do not care about prec
Say Evgeniy, something I was curious about but forgot to ask you
earlier...
On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> ...All oerations are not atomic, since we do not care about precise
> number of bios, but a fact, that we are close or close enough to the
> limit.
> ... in bi
On Tue, Aug 14, 2007 at 07:20:49PM +0200, Jan Engelhardt ([EMAIL PROTECTED])
wrote:
> >I'm pleased to announce second release of the distributed storage
> >subsystem, which allows to form a storage on top of remote and local
> >nodes, which in turn can be exported to another storage as a node to
>
On Aug 14 2007 20:29, Evgeniy Polyakov wrote:
>
>I'm pleased to announce second release of the distributed storage
>subsystem, which allows to form a storage on top of remote and local
>nodes, which in turn can be exported to another storage as a node to
>form tree-like storages.
I'll be quick: w
On Tuesday 14 August 2007 05:46, Evgeniy Polyakov wrote:
> > The throttling of the virtual device must begin in
> > generic_make_request and last to ->endio. You release the throttle
> > of the virtual device at the point you remap the bio to an
> > underlying device, which you have convinced your
On Tue, Aug 14, 2007 at 05:32:29AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Tuesday 14 August 2007 04:50, Evgeniy Polyakov wrote:
> > On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips
> ([EMAIL PROTECTED]) wrote:
> > > On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
On Tuesday 14 August 2007 04:50, Evgeniy Polyakov wrote:
> On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
> > > > And it will not solve the deadlock problem in general. (Maybe
> > > > it works for y
On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
> > > And it will not solve the deadlock problem in general. (Maybe it
> > > works for your virtual device, but I wonder...) If the virtual
> > > device
On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
> > And it will not solve the deadlock problem in general. (Maybe it
> > works for your virtual device, but I wonder...) If the virtual
> > device allocates memory during generic_make_request then the memory
> > needs to be throttled.
>
> D
On Tue, Aug 14, 2007 at 04:13:10AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Tuesday 14 August 2007 01:46, Evgeniy Polyakov wrote:
> > On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips
> ([EMAIL PROTECTED]) wrote:
> > > Perhaps you never worried about the resources that the d
On Tuesday 14 August 2007 01:46, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > Perhaps you never worried about the resources that the device
> > mapper mapping function allocates to handle each bio and so did not
> > consider thi
On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> Perhaps you never worried about the resources that the device mapper
> mapping function allocates to handle each bio and so did not consider
> this hole significant. These resources can be significant, as is
On Monday 13 August 2007 02:12, Jens Axboe wrote:
> > It is a system wide problem. Every block device needs throttling,
> > otherwise queues expand without limit. Currently, block devices
> > that use the standard request library get a slipshod form of
> > throttling for free in the form of limit
On Monday 13 August 2007 05:18, Evgeniy Polyakov wrote:
> > Say you have a device mapper device with some physical device
> > sitting underneath, the classic use case for this throttle code.
> > Say 8,000 threads each submit an IO in parallel. The device mapper
> > mapping function will be called
On Mon, Aug 13, 2007 at 05:18:14AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > If limit is for
> > 1gb of pending block io, and system has for example 2gbs of ram (or
> > any other resonable parameters), then there is no way we can deadlock
> > in allocation, since it will not force pag
On Mon, Aug 13, 2007 at 04:18:03AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > No. Since all requests for virtual device end up in physical devices,
> > which have limits, this mechanism works. Virtual device will
> > essentially call either generic_make_request() for new physical
> > de
On Monday 13 August 2007 05:04, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 04:04:26AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote:
> > > > Oops, and there is also:
> > > >
> > > > 3) The bio throttle, which is supposed to prev
On Mon, Aug 13, 2007 at 04:04:26AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote:
> > > Oops, and there is also:
> > >
> > > 3) The bio throttle, which is supposed to prevent deadlock, can
> > > itself deadlock. Let me see if I can reme
On Monday 13 August 2007 04:03, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 03:12:33AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > > This is not a very good solution, since it requires all users of
> > > the bios to know how to free it.
> >
> > No, only the specific ->endio needs t
On Monday 13 August 2007 01:23, Evgeniy Polyakov wrote:
> On Sun, Aug 12, 2007 at 10:36:23PM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > (previous incomplete message sent accidentally)
> >
> > On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> > > On Tue, Aug 07, 2007 at 10:55:
On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote:
> > Oops, and there is also:
> >
> > 3) The bio throttle, which is supposed to prevent deadlock, can
> > itself deadlock. Let me see if I can remember how it goes.
> >
> > * generic_make_request puts a bio in flight
> > * the bio gets pas
On Mon, Aug 13, 2007 at 03:12:33AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > This is not a very good solution, since it requires all users of the
> > bios to know how to free it.
>
> No, only the specific ->endio needs to know that, which is set by the
> bio owner, so this knowledge
On Monday 13 August 2007 03:22, Jens Axboe wrote:
> I never compared the bio to struct page, I'd obviously agree that
> shrinking struct page was a worthy goal and that it'd be ok to uglify
> some code to do that. The same isn't true for struct bio.
I thought I just said that.
Regards,
Daniel
-
On Mon, Aug 13 2007, Daniel Phillips wrote:
> On Monday 13 August 2007 03:06, Jens Axboe wrote:
> > On Mon, Aug 13 2007, Daniel Phillips wrote:
> > > Of course not. Nothing I said stops endio from being called in the
> > > usual way as well. For this to work, endio just needs to know that
> > > o
On Monday 13 August 2007 03:06, Jens Axboe wrote:
> On Mon, Aug 13 2007, Daniel Phillips wrote:
> > Of course not. Nothing I said stops endio from being called in the
> > usual way as well. For this to work, endio just needs to know that
> > one call means "end" and the other means "destroy", thi
On Monday 13 August 2007 02:18, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 02:08:57AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > > But that idea fails as well, since reference counts and IO
> > > completion are two completely seperate entities. So unless end IO
> > > just happens
On Mon, Aug 13 2007, Daniel Phillips wrote:
> On Monday 13 August 2007 02:13, Jens Axboe wrote:
> > On Mon, Aug 13 2007, Daniel Phillips wrote:
> > > On Monday 13 August 2007 00:45, Jens Axboe wrote:
> > > > On Mon, Aug 13 2007, Jens Axboe wrote:
> > > > > > You did not comment on the one about put
On Monday 13 August 2007 02:13, Jens Axboe wrote:
> On Mon, Aug 13 2007, Daniel Phillips wrote:
> > On Monday 13 August 2007 00:45, Jens Axboe wrote:
> > > On Mon, Aug 13 2007, Jens Axboe wrote:
> > > > > You did not comment on the one about putting the bio
> > > > > destructor in the ->endio handl
On Mon, Aug 13, 2007 at 02:08:57AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > But that idea fails as well, since reference counts and IO completion
> > are two completely seperate entities. So unless end IO just happens
> > to be the last user holding a reference to the bio, you cannot
On Mon, Aug 13 2007, Daniel Phillips wrote:
> On Monday 13 August 2007 00:45, Jens Axboe wrote:
> > On Mon, Aug 13 2007, Jens Axboe wrote:
> > > > You did not comment on the one about putting the bio destructor
> > > > in the ->endio handler, which looks dead simple. The majority of
> > > > cases
On Mon, Aug 13 2007, Daniel Phillips wrote:
> On Monday 13 August 2007 00:28, Jens Axboe wrote:
> > On Sun, Aug 12 2007, Daniel Phillips wrote:
> > > Right, that is done by bi_vcnt. I meant bi_max_vecs, which you can
> > > derive efficiently from BIO_POOL_IDX() provided the bio was
> > > allocated
On Monday 13 August 2007 00:45, Jens Axboe wrote:
> On Mon, Aug 13 2007, Jens Axboe wrote:
> > > You did not comment on the one about putting the bio destructor
> > > in the ->endio handler, which looks dead simple. The majority of
> > > cases just use the default endio handler and the default
> >
On Monday 13 August 2007 00:28, Jens Axboe wrote:
> On Sun, Aug 12 2007, Daniel Phillips wrote:
> > Right, that is done by bi_vcnt. I meant bi_max_vecs, which you can
> > derive efficiently from BIO_POOL_IDX() provided the bio was
> > allocated in the standard way.
>
> That would only be feasible,
On Sun, Aug 12, 2007 at 10:36:23PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> (previous incomplete message sent accidentally)
>
> On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> > On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe wrote:
> >
> > So, what did we decide? To
Hi Daniel.
On Sun, Aug 12, 2007 at 04:16:10PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> Your patch is close to the truth, but it needs to throttle at the top
> (virtual) end of each block device stack instead of the bottom
> (physical) end. It does head in the direction of eliminatin
On Sun, Aug 12, 2007 at 11:44:00PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Sunday 12 August 2007 22:36, I wrote:
> > Note! There are two more issues I forgot to mention earlier.
>
> Oops, and there is also:
>
> 3) The bio throttle, which is supposed to prevent deadlock, can itsel
On Mon, Aug 13 2007, Jens Axboe wrote:
> > You did not comment on the one about putting the bio destructor in
> > the ->endio handler, which looks dead simple. The majority of cases
> > just use the default endio handler and the default destructor. Of the
> > remaining cases, where a specializ
On Sun, Aug 12 2007, Daniel Phillips wrote:
> On Tuesday 07 August 2007 13:55, Jens Axboe wrote:
> > I don't like structure bloat, but I do like nice design. Overloading
> > is a necessary evil sometimes, though. Even today, there isn't enough
> > room to hold bi_rw and bi_flags in the same variabl
On Sunday 12 August 2007 22:36, I wrote:
> Note! There are two more issues I forgot to mention earlier.
Oops, and there is also:
3) The bio throttle, which is supposed to prevent deadlock, can itself
deadlock. Let me see if I can remember how it goes.
* generic_make_request puts a bio in fl
(previous incomplete message sent accidentally)
On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe wrote:
>
> So, what did we decide? To bloat bio a bit (add a queue pointer) or
> to use physical device limits? The latter requires to r
On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe
([EMAIL PROTECTED]) wrote:
>
> So, what did we decide? To bloat bio a bit (add a queue pointer) or
> to use physical device limits? The latter requires to replace all
> occurence of bi
On Tuesday 07 August 2007 13:55, Jens Axboe wrote:
> I don't like structure bloat, but I do like nice design. Overloading
> is a necessary evil sometimes, though. Even today, there isn't enough
> room to hold bi_rw and bi_flags in the same variable on 32-bit archs,
> so that concern can be scratche
Hi Evgeniy,
Sorry for not getting back to you right away, I was on the road with
limited email access. Incidentally, the reason my mails to you keep
bouncing is, your MTA is picky about my mailer's IP reversing to a real
hostname. I will take care of that pretty soon, but for now my direct
m
On Wed, Aug 08, 2007 at 02:17:09PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED])
wrote:
> This throttling mechanism allows to limit maximum amount of queued bios
> per physical device. By default it is turned off and old block layer
> behaviour with unlimited number of bios is used. When turned on
This throttling mechanism allows to limit maximum amount of queued bios
per physical device. By default it is turned off and old block layer
behaviour with unlimited number of bios is used. When turned on (queue
limit is set to something different than -1U via blk_set_queue_limit()),
generic_make
On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote:
> I don't like structure bloat, but I do like nice design. Overloading is
So, what did we decide? To bloat bio a bit (add a queue pointer) or to
use physical device limits? The latter requires to replace all occurence
On Tue, Aug 07 2007, Daniel Phillips wrote:
> On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> > On Sun, Aug 05 2007, Daniel Phillips wrote:
> > > A simple way to solve the stable accounting field issue is to add a
> > > new pointer to struct bio that is owned by the top level submitter
> > > (
On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> On Sun, Aug 05 2007, Daniel Phillips wrote:
> > A simple way to solve the stable accounting field issue is to add a
> > new pointer to struct bio that is owned by the top level submitter
> > (normally generic_make_request but not always) and is n
On Sun, Aug 05 2007, Daniel Phillips wrote:
> A simple way to solve the stable accounting field issue is to add a new
> pointer to struct bio that is owned by the top level submitter
> (normally generic_make_request but not always) and is not affected by
> any recursive resubmission. Then getti
On Sun, Aug 05, 2007 at 02:35:04PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote:
> > On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote:
> > > > DST original code worked as device mapper plugin too, but its two
> > > > addi
On Sun, Aug 05, 2007 at 02:23:45PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote:
> > If we are sleeping in memory pool, then we already do not have memory
> > to complete previous requests, so we are in trouble.
>
> Not at all. Any re
On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote:
> On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote:
> > > DST original code worked as device mapper plugin too, but its two
> > > additional allocations (io and clone) per block request ended up
> > > for me as a show stopper.
>
On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote:
> If we are sleeping in memory pool, then we already do not have memory
> to complete previous requests, so we are in trouble.
Not at all. Any requests in flight are guaranteed to get the resources
they need to complete. This is guaranteed
Hi Daniel.
On Sun, Aug 05, 2007 at 01:04:19AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > we can wait in it for memory in mempool. Although that means we
> > already in trouble.
>
> Not at all. This whole block writeout path needs to be written to run
> efficiently even when normal
On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> > DST original code worked as device mapper plugin too, but its two
> > additional allocations (io and clone) per block request ended up for
> > me as a show stopper.
>
> Ah, sorry, I misread. A show stopper i
On Saturday 04 August 2007 09:44, Evgeniy Polyakov wrote:
> > On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> > > * storage can be formed on top of remote nodes and be
> > > exported simultaneously (iSCSI is peer-to-peer only, NBD requires
> > > device mapper and is synchronous)
> >
>
On Saturday 04 August 2007 09:37, Evgeniy Polyakov wrote:
> On Fri, Aug 03, 2007 at 06:19:16PM -0700, I wrote:
> > To be sure, I am not very proud of this throttling mechanism for
> > various reasons, but the thing is, _any_ throttling mechanism no
> > matter how sucky solves the deadlock problem.
On Fri, Aug 03, 2007 at 09:04:51AM +0400, Manu Abraham ([EMAIL PROTECTED])
wrote:
> On 7/31/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
>
> > TODO list currently includes following main items:
> > * redundancy algorithm (drop me a request of your own, but it is highly
> > unlikley
Hi Daniel.
> On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> > * storage can be formed on top of remote nodes and be exported
> > simultaneously (iSCSI is peer-to-peer only, NBD requires device
> > mapper and is synchronous)
>
> In fact, NBD has nothing to do with device mappe
On Fri, Aug 03, 2007 at 06:19:16PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
> It depends on the characteristics of the physical and virtual block
> devices involved. Slow block devices can produce surprising effects.
> Ddsnap still qualifies as "slow" under certain circumstances (big
On 8/4/07, Dave Dillow <[EMAIL PROTECTED]> wrote:
> On Fri, 2007-08-03 at 09:04 +0400, Manu Abraham wrote:
> > On 7/31/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> >
> > > TODO list currently includes following main items:
> > > * redundancy algorithm (drop me a request of your own, but it
On Fri, 2007-08-03 at 09:04 +0400, Manu Abraham wrote:
> On 7/31/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
>
> > TODO list currently includes following main items:
> > * redundancy algorithm (drop me a request of your own, but it is highly
> > unlikley that Reed-Solomon based wil
On Friday 03 August 2007 03:26, Evgeniy Polyakov wrote:
> On Thu, Aug 02, 2007 at 02:08:24PM -0700, I wrote:
> > I see bits that worry me, e.g.:
> >
> > + req = mempool_alloc(st->w->req_pool, GFP_NOIO);
> >
> > which seems to be callable in response to a local request, just the
> > case w
Hi Mike,
On Thursday 02 August 2007 21:09, Mike Snitzer wrote:
> But NBD's synchronous nature is actually an asset when coupled with
> MD raid1 as it provides guarantees that the data has _really_ been
> mirrored remotely.
And bio completion doesn't?
Regards,
Daniel
-
To unsubscribe from this l
Hi Evgeniy,
Nit alert:
On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> * storage can be formed on top of remote nodes and be exported
> simultaneously (iSCSI is peer-to-peer only, NBD requires device
> mapper and is synchronous)
In fact, NBD has nothing to do with device
On Friday 03 August 2007 07:53, Peter Zijlstra wrote:
> On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote:
> > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra wrote:
> > ...my main position is to
> > allocate per socket reserve from socket's queue, and copy data
> > there from main
On Friday 03 August 2007 06:49, Evgeniy Polyakov wrote:
> ...rx has global reserve (always allocated on
> startup or sometime way before reclaim/oom)where data is originally
> received (including skb, shared info and whatever is needed, page is
> just an exmaple), then it is copied into per-socket
1 - 100 of 110 matches
Mail list logo