The Bloom filter does its job pretty well as long as you make sure it is
updated properly. That is what this is all about.
The old DCBase forum is offline at the moment and archive.org does not
seem to have archived it properly, so all I can see is that there was a
topic there about the Bloom filter having 'latency', presumably in how
soon hashed files become available. But this is exactly what is
addressed here.

If you understand why it's not updated properly then the fix is rather
trivial. There are multiple ways to resolve this on the client side and
I am already testing one. It works as far as I can see, I'd just like to
see at least one more confirmation during a proper testing session with
someone else before committing.

The already committed update for the hub side is only a partial remedy
against clients that don't want to fix this issue, whereas a proper
client-side fix works 100% even on hubs that do not include the hub-side
fix at all.

With a client-side fix applied, there is at most 1 minute of latency
(worst case) before hashed files become searchable by TTH, and for now
this applies after any type of share change.

Generally, the solution is to send an SF value to the hub right after
any share refresh, before any new or changed files have started hashing.
That SF value gives the hub's Bloom requester a reference to compare
subsequent SFs against, I mean the SFs that are sent later, as hashing
progresses, with the minutely INF.
This approach guarantees that any type of update in the share is
signaled to the Bloom requester.
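
To illustrate the idea, here is a rough Python sketch. It is not DC++ or
ADCH++ code: BloomRequester and refresh_share are hypothetical names, and
the sketch assumes the hub asks for a new filter whenever the SF in an
INF differs from the last SF it saw. It models the worst case, where the
hasher finishes before the queued INF is compiled.

```python
class BloomRequester:
    """Hypothetical stand-in for the hub's Bloom plugin.

    Assumption: it requests a fresh filter whenever the SF (shared file
    count) in an INF differs from the last SF it saw.
    """

    def __init__(self, initial_sf):
        self.last_sf = initial_sf
        self.filter_requests = 0

    def on_inf(self, sf):
        if sf != self.last_sf:
            self.filter_requests += 1
            self.last_sf = sf


def refresh_share(hub, sf_before, changed_files, send_baseline):
    """Model a refresh that replaces `changed_files` already-shared files.

    Worst case: hashing completes before the queued INF is compiled.
    """
    sf = sf_before - changed_files   # refresh drops the stale entries
    if send_baseline:
        hub.on_inf(sf)               # reference SF, sent before hashing starts
    sf += changed_files              # hasher re-adds the files
    hub.on_inf(sf)                   # INF compiled after hashing (or next minutely INF)
    return sf


# Without the baseline the hub only ever sees the unchanged SF.
unfixed = BloomRequester(100)
refresh_share(unfixed, 100, 1, send_baseline=False)

# With the baseline the hub sees 99 and then 100, so it asks for a filter.
fixed = BloomRequester(100)
refresh_share(fixed, 100, 1, send_baseline=True)
```

In this model the second request, triggered by the later INF, is the one
that picks up the newly hashed files.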

** Changed in: adchpp
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/2110291

Title:
  One time small updates in the share may not trigger a Bloom filter
  update request which makes such updated files unsearchable by TTH for
  other hub users

Status in ADCH++:
  Fix Released
Status in AirDC++:
  New
Status in DC++:
  Confirmed

Bug description:
  There is a possible scenario where other users logged into the same
  ADCH++ hub with Bloom filter support may not receive search results
  (by TTH) for one or more updated files after manually refreshing the
  share in DC++, until the user updates the share once more or
  reconnects to the hub.

  The problem is consistently reproducible after one or a few files
  have been updated and the share refreshed, if the overall size of the
  changed files is relatively small.

  To reproduce this, you need to update already shared file(s) with
  different content, or perform a similar number of file removals and
  additions to the share, then manually refresh the share.

  The cause of the issue is that sending INFs, just like any other
  command, is not instantaneous.
  The function that compiles the INF command is placed into the async
  task queue of all connected hubs' sockets, to be run when feasible.
  If, for example, you update one small file and refresh the share,
  normally that would result in sending SF = lastSF - 1 with the
  infoupdate() right after the refresh.
  Then, the hashing thread's TTHDone event handler updates the total
  number of files after the file with the updated content has been
  hashed.
  This change is then sent with the next scheduled infoupdate()
  (typically minutely).

  But if the small updated file is already hashed by the time the hub's
  respective infoupdate() is called, then SF is back at lastSF again
  (lastSF - 1 + 1). The value is correct, but the Bloom plugin won't be
  signaled to request a filter update.
  Note that if the hasher's queue is empty before the share refresh, it
  starts working almost instantaneously, so if the total size of the
  updated file(s) is small enough, the hasher often wins the race, it
  seems.
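
  The race can be sketched in a few lines of Python. This is only an
  illustrative model, not DC++ code: the function names and the
  millisecond timings are made up, and it assumes the Bloom plugin
  reacts only to an SF that differs from the previous one it saw.

```python
def sf_values_seen_by_hub(last_sf, hash_done_ms, inf_compiled_ms):
    """SF carried by the post-refresh INF and by the next scheduled INF.

    Models one replaced file; both times are measured in milliseconds
    from the end of the share refresh (hypothetical values).
    """
    after_refresh = last_sf - 1        # stale entry removed by the refresh
    after_hash = after_refresh + 1     # TTHDone re-adds the hashed file
    # The queued INF reports whatever the count is when it is compiled.
    if hash_done_ms <= inf_compiled_ms:
        first_inf = after_hash
    else:
        first_inf = after_refresh
    return [first_inf, after_hash]


def update_signaled(last_sf, seen):
    """True if any SF in `seen` differs from the one the hub saw before it."""
    previous = last_sf
    for sf in seen:
        if sf != previous:
            return True
        previous = sf
    return False


# Large file: hashing takes longer than the INF queue delay.
slow_hash = sf_values_seen_by_hub(100, hash_done_ms=500, inf_compiled_ms=20)

# Small file: the hasher wins the race and the hub never sees a change.
fast_hash = sf_values_seen_by_hub(100, hash_done_ms=5, inf_compiled_ms=20)
```

  With these inputs, slow_hash contains an SF that differs from lastSF,
  while fast_hash repeats lastSF twice, so update_signaled() is False
  for it.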

  The largest total updated file size to reproduce this depends on your
  hardware.
  It is higher with faster CPUs and storage, and also depends on how
  busy the hub/socket is at the time.

  On a system with a 100Mb/s HDD read speed and an i5-6600 CPU, the
  threshold is about 15 MiB.
  Obviously, this could easily be 10 times larger on modern hardware.


