** Description changed:

- <eMTee> So not getting a result for a changed file (same path/different 
content) in the share after re-hashing is because the hub requesting a new 
bloom filter only if the number of shared files are changed in the INF coming 
from the client. In common examples like when you share an updated binary or 
change a text file and reindex this would not happen at all.
- <eMTee> Bloom request is only triggered by an SF and not SS in the INF. See 
https://sourceforge.net/p/adchpp/code/ci/default/tree/plugins/Bloom/src/BloomManager.cpp#L98
 
- <eMTee> And with adding SS to the check there we're still not completly out 
of water since if the share change is a same path, same size, different content 
change then it still sucks. Minor editing of a text file or change of a 
fix-sized metadata e.g. an MP3 IDv1 tag resulting exactly this scenario.
- <eMTee> You can change even all of your share in this special way and if you 
don't change the sizes and number of files then you won't provide hits at all 
until you do some other kind of share change or reconnect the hub.
+ Update: rephrase and clarify the initial report.
  
- [2023-02-28 09:03] <eMTee> So Blom request is based on an inadequate signal 
that's not enough for all cases.
- [2023-02-28 09:07] <eMTee> SS also should be hooked on at the very least but 
a perfect solution would be something that is signalling the share change in 
general or the number of re-hashes in the current client session. Or the last 
rehash timestamp. These signals would be adequate for requesting a new Bloom 
filter in all cases when it is needed to.
- [2023-02-28 09:11] <eMTee> Of course the client could force to send an INF SF 
after all rehashes in case it supports Blom, but it's pretty ugly to implement 
in DC++ and, more importantly, it is against the protocol since you send INFs 
only if some values change and in these special cases we investigate this would 
mean sending multiple INF SF's with the same value.
- [2023-02-28 09:13] <eMTee> "Each time this is received, it means that the 
fields specified have been added or updated." in 
https://adc.sourceforge.io/ADC.html#_inf
- [2023-02-28 09:17] <eMTee> If an extension is allowed to specify new INF 
fields then a last rehash timestamp field would probably be the cleanest 
solution for this both protocol and implementation wise...
+ --------------
+ 
+ TTH searches return no results for any number of changed files (same
+ path, different content) after the share is re-hashed in an ADC client
+ connected to an ADC hub that uses Bloom filters for TTH searches.
+ 
+ The cause is that the hub requests a new Bloom filter only when the
+ number of shared files changes, i.e. when the INF coming from the
+ client carries SF. In common cases, such as sharing an updated binary
+ or changing a text file and re-hashing, this does not happen because
+ the file count stays the same. Changing fixed-size metadata, e.g. an
+ MP3 ID3v1 tag, results in exactly this scenario.
+ So the filter request is based on a signal that does not cover all
+ common use cases.
+ 
+ A solution would be a signal that indicates share changes in general,
+ such as the number of re-hashes in the current client session or the
+ timestamp of the last re-hash. Either signal would be adequate for
+ requesting a new Bloom filter whenever files change in a client's
+ share.
+ 
+ Of course, a BLOM-supporting client could force an INF SF to be sent
+ after every re-hash that finds a content change in the share, but that
+ is against the protocol: an INF may only be sent when one of the field
+ values has changed, and in these special cases it would mean sending
+ multiple INF SFs with the same SF value (see "Each time this is
+ received, it means that the fields specified have been added or
+ updated." in https://adc.sourceforge.io/ADC.html#_inf ).
+ 
+ If an extension is allowed to specify new INF fields, a new field
+ ("SC"?) could be added, optionally with parameters that give the hub
+ more data about the actual share change, such as a last re-hash
+ timestamp and the number of changed files. This would probably be the
+ cleanest solution, but it needs a protocol update for the BLOM ADC
+ extension.
  
  Within the currently defined standards another possibility is to do some
- client side trickery, an ugly hack to slightly fake SF or SS (eg. by
- incrementing one of them by 1) in each of this special share change case
- so then that triggers a Bloom update.
+ client side trickery, an ugly hack to slightly fake SF (e.g. by
+ incrementing it by 1) in each of these special share change cases so
+ that it triggers a BLOM request for an updated filter.

** Summary changed:

- Certain type of changes in the share do not trigger a Bloom filter update 
which makes such changed files temporarily unsearchable
+ Certain type of changes in the share do not trigger a Bloom filter update 
which makes such changed files temporarily unsearchable by TTH

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/2009492

Title:
  Certain type of changes in the share do not trigger a Bloom filter
  update which makes such changed files temporarily unsearchable by TTH

Status in ADCH++:
  New
Status in DC++:
  Confirmed

Bug description:
  Update: rephrase and clarify the initial report.

  --------------

  TTH searches return no results for any number of changed files (same
  path, different content) after the share is re-hashed in an ADC client
  connected to an ADC hub that uses Bloom filters for TTH searches.

  The cause is that the hub requests a new Bloom filter only when the
  number of shared files changes, i.e. when the INF coming from the
  client carries SF. In common cases, such as sharing an updated binary
  or changing a text file and re-hashing, this does not happen because
  the file count stays the same. Changing fixed-size metadata, e.g. an
  MP3 ID3v1 tag, results in exactly this scenario.
  So the filter request is based on a signal that does not cover all
  common use cases.
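
  A minimal sketch of why the signal is inadequate, assuming a much
  simplified model of both sides (the helper names are illustrative and
  are not ADCH++ or DC++ code): the client sends only the INF fields
  whose values changed, and the hub asks for a new filter only when SF
  is among them.

```python
def changed_fields(old, new):
    """Return only the INF fields whose values differ (hypothetical helper)."""
    return {key: value for key, value in new.items() if old.get(key) != value}


def hub_requests_filter(inf_update):
    """Simplified model of the trigger: only the presence of SF is checked."""
    return "SF" in inf_update


# State last announced to the hub: 3 shared files, 3000 bytes in total.
announced = {"SF": "3", "SS": "3000"}

# A file was added: SF changes, so the hub asks for a new filter.
after_add = {"SF": "4", "SS": "4500"}

# A binary was replaced by a larger build: only SS changes.
after_resize = {"SF": "3", "SS": "3100"}

# An ID3v1 tag was edited: same path, same size, new TTH. Nothing changes.
after_retag = {"SF": "3", "SS": "3000"}

for label, state in [("add", after_add), ("resize", after_resize), ("retag", after_retag)]:
    update = changed_fields(announced, state)
    print(label, update, hub_requests_filter(update))
```

  Only the first case leads to a filter request. In the last case the
  client has no changed field to send at all, so the hub keeps matching
  TTH searches against the stale filter.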

  A solution would be a signal that indicates share changes in general,
  such as the number of re-hashes in the current client session or the
  timestamp of the last re-hash. Either signal would be adequate for
  requesting a new Bloom filter whenever files change in a client's
  share.

  Of course, a BLOM-supporting client could force an INF SF to be sent
  after every re-hash that finds a content change in the share, but that
  is against the protocol: an INF may only be sent when one of the field
  values has changed, and in these special cases it would mean sending
  multiple INF SFs with the same SF value (see "Each time this is
  received, it means that the fields specified have been added or
  updated." in https://adc.sourceforge.io/ADC.html#_inf ).

  If an extension is allowed to specify new INF fields, a new field
  ("SC"?) could be added, optionally with parameters that give the hub
  more data about the actual share change, such as a last re-hash
  timestamp and the number of changed files. This would probably be the
  cleanest solution, but it needs a protocol update for the BLOM ADC
  extension.
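
  A rough sketch of how such a field could behave, under loud
  assumptions: "SC" is not part of ADC or BLOM today, and its encoding
  here (the Unix time of the last re-hash that changed the share) is
  only one of the options mentioned above. The helper names are
  illustrative.

```python
def changed_fields(old, new):
    """Return only the INF fields whose values differ (hypothetical helper)."""
    return {key: value for key, value in new.items() if old.get(key) != value}


def hub_requests_filter(inf_update):
    """Hypothetical hub check: SF or the proposed SC field triggers a request."""
    return "SF" in inf_update or "SC" in inf_update


# SC is an assumed field: time of the last re-hash that changed the share.
announced = {"SF": "3", "SS": "3000", "SC": "1677575000"}

# Same path, same size, different content: only SC moves forward.
after_retag = {"SF": "3", "SS": "3000", "SC": "1677575400"}

# A re-hash that found no changes leaves SC alone, so no INF is sent.
after_idle_rehash = dict(announced)

update = changed_fields(announced, after_retag)
print(update, hub_requests_filter(update))
print(changed_fields(announced, after_idle_rehash))
```

  Because SC itself changes, the client sends an INF that contains a
  genuinely updated field, which stays within the rule quoted from the
  ADC specification.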

  Within the currently defined standards another possibility is to do
  some client side trickery, an ugly hack to slightly fake SF (e.g. by
  incrementing it by 1) in each of these special share change cases so
  that it triggers a BLOM request for an updated filter.
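
  One way the hack could look, as a sketch only: the function below is
  hypothetical and is a variant of the "increment by 1" idea that
  alternates between the real count and the real count plus one, an
  assumption made here so the misreported value never drifts further
  than one file from the truth. It is meant to be called only after a
  re-hash that actually changed content.

```python
def faked_sf(real_count, last_reported):
    """Return the SF value to announce after a re-hash that changed content."""
    if real_count != last_reported:
        # A genuine change in the file count is enough on its own.
        return real_count
    # Otherwise misreport by one so that SF differs from the last INF.
    return real_count + 1


# Three content-only changes in a row while the real count stays at 100.
reported = 100
history = []
for _ in range(3):
    reported = faked_sf(100, reported)
    history.append(reported)

print(history)
```

  Every call yields a value different from the one last sent, so each
  INF carries a changed SF, at the price of advertising a file count
  that is sometimes off by one.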

To manage notifications about this bug go to:
https://bugs.launchpad.net/adchpp/+bug/2009492/+subscriptions


_______________________________________________
Mailing list: https://launchpad.net/~linuxdcpp-team
Post to     : linuxdcpp-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~linuxdcpp-team
More help   : https://help.launchpad.net/ListHelp
