On Mon, Sep 23, 2013 at 04:21:49PM +0800, Weijie Yang wrote:
> Consider the following scenario:
> thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page)
> thread 1: call zswap_frontswap_invalidate_page to invalidate entry x.
>       finished, entry x and its zbud is not freed as its refcount != 0
>       now, the swap_map[x] = 0
> thread 0: now call zswap_get_swap_cache_page
>       swapcache_prepare return -ENOENT because entry x is not used any more
>       zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM
>       zswap_writeback_entry do nothing except put refcount
> Now, the memory of zswap_entry x and its zpage leak.
> 
> Modify:
>  - check the refcount in fail path, free memory if it is not referenced.

Hmm, I don't like this because the zswap refcount routines are already a mess
to me. I'm not sure why they were designed this way from the beginning. I
think we should fix that first.

1. zswap_rb_search could include zswap_entry_get semantics if it finds an
   entry in the tree. Of course, we should rename it to find_get_zswap_entry,
   like find_get_page.
2. zswap_entry_put could hide the resource-freeing function, zswap_free_entry,
   so that all callers can simply use the following pattern.
   
  find_get_zswap_entry
  ...
  ...
  zswap_entry_put

Of course, zswap_entry_put has to check whether the entry is still in the
tree, so that if someone has already removed it from the tree, it avoids a
double remove.
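
To make the idea concrete, here is a rough userspace sketch of what I mean,
not a tested kernel patch. It assumes the caller holds the tree lock around
every call. The linked list standing in for the rbtree, the in_tree flag
(in place of RB_EMPTY_NODE), the freed counter, and the helpers
zswap_lookup, zswap_entry_insert, zswap_tree_erase and zswap_invalidate are
all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: a linked list stands in for the rbtree. */
struct zswap_entry {
	unsigned long offset;
	int refcount;
	bool in_tree;			/* hypothetical, models !RB_EMPTY_NODE() */
	struct zswap_entry *next;
};

struct zswap_tree {
	struct zswap_entry *head;
	int freed;			/* hypothetical, counts frees for checking */
};

/* Plain search, takes no reference. */
static struct zswap_entry *zswap_lookup(struct zswap_tree *tree,
					unsigned long offset)
{
	struct zswap_entry *e;

	for (e = tree->head; e; e = e->next)
		if (e->offset == offset)
			return e;
	return NULL;
}

/* Insert a new entry; the tree holds the initial reference. */
static struct zswap_entry *zswap_entry_insert(struct zswap_tree *tree,
					      unsigned long offset)
{
	struct zswap_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->offset = offset;
	e->refcount = 1;
	e->in_tree = true;
	e->next = tree->head;
	tree->head = e;
	return e;
}

/* Unlink from the tree; a second call on the same entry is a no-op. */
static void zswap_tree_erase(struct zswap_tree *tree,
			     struct zswap_entry *entry)
{
	struct zswap_entry **pp;

	if (!entry->in_tree)
		return;
	for (pp = &tree->head; *pp; pp = &(*pp)->next) {
		if (*pp == entry) {
			*pp = entry->next;
			break;
		}
	}
	entry->in_tree = false;
}

/* Search and take a reference in one step, like find_get_page. */
static struct zswap_entry *find_get_zswap_entry(struct zswap_tree *tree,
						unsigned long offset)
{
	struct zswap_entry *e = zswap_lookup(tree, offset);

	if (e)
		e->refcount++;
	return e;
}

/* Drop a reference; the last put erases (if needed) and frees. */
static void zswap_entry_put(struct zswap_tree *tree,
			    struct zswap_entry *entry)
{
	assert(entry->refcount > 0);
	if (--entry->refcount == 0) {
		zswap_tree_erase(tree, entry);
		free(entry);
		tree->freed++;
	}
}

/* Invalidate: unlink the entry and drop the tree's initial reference. */
static void zswap_invalidate(struct zswap_tree *tree, unsigned long offset)
{
	struct zswap_entry *e = zswap_lookup(tree, offset);

	if (!e)
		return;
	zswap_tree_erase(tree, e);
	zswap_entry_put(tree, e);
}
```

With this shape, the scenario in the changelog resolves itself: the
writeback path takes its reference with find_get_zswap_entry, a concurrent
invalidate only unlinks the entry and drops the tree's reference, and the
final zswap_entry_put in the writeback fail path frees the entry without any
special refcount check in the caller.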

One concern I can think of is that this approach extends the critical
section, but I think that would not be a problem because the bigger
bottleneck would be the [de]compress functions. If it really were a problem,
we could mitigate it by moving unnecessary functions out of zswap_free_entry,
because it seems to be rather over-engineered.

>  - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM as the fail path
> can be not only caused by nomem but also by invalidate.
> 
> Signed-off-by: Weijie Yang <[email protected]>
> Reviewed-by: Bob Liu <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: [email protected]
> Acked-by: Seth Jennings <[email protected]>
> ---
>  mm/zswap.c |   21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/zswap.c b/mm/zswap.c
> index cbd9578..1be7b90 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -387,7 +387,7 @@ static void zswap_free_entry(struct zswap_tree *tree, struct zswap_entry *entry)
>  enum zswap_get_swap_ret {
>       ZSWAP_SWAPCACHE_NEW,
>       ZSWAP_SWAPCACHE_EXIST,
> -     ZSWAP_SWAPCACHE_NOMEM
> +     ZSWAP_SWAPCACHE_FAIL,
>  };
>  
>  /*
> @@ -401,9 +401,9 @@ enum zswap_get_swap_ret {
>   * added to the swap cache, and returned in retpage.
>   *
>   * If success, the swap cache page is returned in retpage
> - * Returns 0 if page was already in the swap cache, page is not locked
> - * Returns 1 if the new page needs to be populated, page is locked
> - * Returns <0 on error
> + * Returns ZSWAP_SWAPCACHE_EXIST if page was already in the swap cache
> + * Returns ZSWAP_SWAPCACHE_NEW if the new page needs to be populated, page is locked
> + * Returns ZSWAP_SWAPCACHE_FAIL on error
>   */
>  static int zswap_get_swap_cache_page(swp_entry_t entry,
>                               struct page **retpage)
> @@ -475,7 +475,7 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
>       if (new_page)
>               page_cache_release(new_page);
>       if (!found_page)
> -             return ZSWAP_SWAPCACHE_NOMEM;
> +             return ZSWAP_SWAPCACHE_FAIL;
>       *retpage = found_page;
>       return ZSWAP_SWAPCACHE_EXIST;
>  }
> @@ -529,11 +529,11 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
>  
>       /* try to allocate swap cache page */
>       switch (zswap_get_swap_cache_page(swpentry, &page)) {
> -     case ZSWAP_SWAPCACHE_NOMEM: /* no memory */
> +     case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
>               ret = -ENOMEM;
>               goto fail;
>  
> -     case ZSWAP_SWAPCACHE_EXIST: /* page is unlocked */
> +     case ZSWAP_SWAPCACHE_EXIST:
>               /* page is already in the swap cache, ignore for now */
>               page_cache_release(page);
>               ret = -EEXIST;
> @@ -591,7 +591,12 @@ static int zswap_writeback_entry(struct zbud_pool *pool, unsigned long handle)
>  
>  fail:
>       spin_lock(&tree->lock);
> -     zswap_entry_put(entry);
> +     refcount = zswap_entry_put(entry);
> +     if (refcount <= 0) {
> +             /* invalidate happened, consider writeback as success */
> +             zswap_free_entry(tree, entry);
> +             ret = 0;
> +     }
>       spin_unlock(&tree->lock);
>       return ret;
>  }
> -- 
> 1.7.10.4
> 
> 

-- 
Kind regards,
Minchan Kim