Marco Gerards <[EMAIL PROTECTED]> writes: > Hi, > > Fatfs still has serious locking problems write writing and I had a > close look at this today. > > First I need to separate the problem into sub-problems: > > -- The diskfs_node_refcnt_lock problem -- > > diskfs_node_refcnt_lock can be locked while write_node tries to lock > it. > > The problem that is (perhaps) easy to fix is the problem caused by > diskfs_drop_node. This function locks diskfs_node_refcnt_lock until > the function returns. In the function (so while > diskfs_node_refcnt_lock is locked) diskfs_node_update is called. > > Diskfs_node_update calls the function write_node in fatfs > (indirectly) if the node is the disk node structure is dirty. > > Because fatfs has to lookup the directory that holds the node that > should be updated it (or one of the functions used for that, such > details don't matter right now) it has to lock > diskfs_node_refcnt_lock. Ofcourse it is _not_ an option not to lock > diskfs_node_refcnt_lock!! > > This problem is hopefully not too hard to fix, here are some possible > solutions: > > - Make sure that for every opened file the directory is know so we > don't have to look it up. Just add a "struct node *dirnode" to the > "struct dirnode" of fatfs. (fatfs only solution, evading the problem)
I wrote a patch to do this today. See patch #1798 on savannah for the patch and a description. > - Make sure diskfs_drop_node doesn't call diskfs_node_update while > diskfs_node_refcnt_lock is locked. This requires some careful libdiskfs > hacking and could be really hard. (generic solution, fixing the problem) I've also tried to fix this problem in libdiskfs. I've added the diskfs_drop_disknode, that should be implemented by all libdiskfs servers. It is used to drop the disknode, this was done by diskfs_node_norefs before. Now diskfs_node_norefs is changed too so it only removes the referenced to the node, nothing more. By making these changes it became possible to release the lock earlier. When the lock is released diskfs_update_node is called and it doesn't lock up like I described. :) Here is the changed version of libdiskfs/node-drop.c, please understand this is just a concept. If someone wants me to finish this instead of using the other solution, just ask me. /* Node NP now has no more references; clean all state. The diskfs_node_refcnt_lock must be held, and will be released upon return. NP must be locked. */ void diskfs_drop_node (struct node *np) { mode_t savemode; if (np->dn_stat.st_nlink == 0) { diskfs_check_readonly (); assert (!diskfs_readonly); if (np->dn_stat.st_mode & S_IPTRANS) diskfs_set_translator (np, 0, 0, 0); if (np->allocsize != 0 || (diskfs_create_symlink_hook && S_ISLNK (np->dn_stat.st_mode) && np->dn_stat.st_size)) { /* If the node needs to be truncated, then a complication arises, because truncation might require gaining new references to the node. So, we give ourselves a reference back, unlock the refcnt lock. Then we are in the state of a normal user, and do the truncate and an nput. The next time through, this routine will notice that the size is zero, and not have to do anything. */ np->references++; spin_unlock (&diskfs_node_refcnt_lock); diskfs_truncate (np, 0); /* Force allocsize to zero; if truncate consistently fails this will at least prevent an infinite loop in this routine. */ np->allocsize = 0; diskfs_nput (np); return; } assert (np->dn_stat.st_size == 0); savemode = np->dn_stat.st_mode; np->dn_stat.st_mode = 0; np->dn_stat.st_rdev = 0; np->dn_set_ctime = np->dn_set_atime = 1; /* Drop all references to the node. After this it is safe to unlock diskfs_node_refcnt_lock. */ diskfs_node_norefs (np); spin_unlock (&diskfs_node_refcnt_lock); diskfs_node_update (np, 1); diskfs_free_node (np, savemode); } else { /* Drop all references to the node. After this it is safe to unlock diskfs_node_refcnt_lock. */ diskfs_node_norefs (np); spin_unlock (&diskfs_node_refcnt_lock); diskfs_node_update (np, diskfs_synchronous); } fshelp_drop_transbox (&np->transbox); if (np->dirmod_reqs) free_modreqs (np->dirmod_reqs); if (np->filemod_reqs) free_modreqs (np->filemod_reqs); assert (!np->sockaddr); diskfs_drop_disknode (np); } _______________________________________________ Bug-hurd mailing list [EMAIL PROTECTED] http://mail.gnu.org/mailman/listinfo/bug-hurd