<eMTee> I've checked klondike's geoip lib and, as I perceive it, it does the 
lookups straight from the data file. That doesn't sound too efficient. Also, 
I'm not sure the whole thing is thread-safe at all. So maybe Maxmind's lib is 
the only option...
<cologic> I agree with you (or, well, your conclusion -- whatever your 
reasoning was). klondike's design is, as he says, "This is a purposefully slow 
implementation for memory constrained environments", and it achieves that by 
fseek()ing and similar around the file in a thread-unsafe way, to avoid mmap() 
and other access methods. The mmdb_t data structure itself can only 
meaningfully be used by one thread at a time, so the choice ends up being 
either loading the files multiple times -- which, actually, sort of works with 
his design, since it is so low-memory-usage -- or keeping fewer mmdb_t objects 
around than threads and muxing access to them somehow (say, with mutexes: 
"TODO: mutex handling").
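
The second option above (fewer reader objects than threads, with access 
serialized by a mutex) might look roughly like the sketch below. 
FakeSeekingReader and LockedReader are hypothetical names used only for 
illustration; they are not part of klondike's library, whose real API is not 
shown here.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a reader whose lookups mutate internal state
// (for example a file position), so one instance cannot be shared unguarded.
struct FakeSeekingReader {
    long position = 0;  // changes on every lookup, like an fseek() cursor

    std::string lookup(std::uint32_t ip) {
        position = static_cast<long>(ip % 1024);
        return (ip >> 24) == 2 ? "GB" : "";  // toy data: 2.0.0.0/8 is "GB"
    }
};

// Serializes all access to a single reader instance behind a mutex.
template <typename Reader>
class LockedReader {
public:
    explicit LockedReader(Reader reader) : reader_(std::move(reader)) {}

    std::string lookup(std::uint32_t ip) {
        std::lock_guard<std::mutex> lock(mutex_);  // one thread at a time
        return reader_.lookup(ip);
    }

private:
    std::mutex mutex_;
    Reader reader_;
};
```

Every lookup then contends on the same lock, which is the cost that an 
all-in-memory, read-only representation would avoid.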
<cologic> It's a reasonable tradeoff for a different environment than DC++ 
lives in these days.
<cologic> He jumps through endless hoops just not to keep anything he doesn't 
need in memory, IMO to the code/design's detriment in this context.
<cologic> The whole dbip-country-lite-2020-05.mmdb is about 5MB, which seems 
completely acceptable to mmap() or just load completely into memory and not 
deal with all those seeks, mutex questions, etc.
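
As a minimal sketch of the "just load it completely into memory" option, 
assuming a file of a few MB: the function name loadWholeFile is hypothetical, 
and parsing the loaded bytes is out of scope here. mmap() would be the other 
route and is not shown.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Reads an entire file into a byte vector. Throws if it cannot be opened.
// Once loaded, the buffer is never modified, so any number of threads can
// read from it concurrently without seeks or locks.
std::vector<std::uint8_t> loadWholeFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}
```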
<cologic> Just to take the most obvious option (the same overall approach, 
just, all in memory). Other options exist too.
<cologic> The completely unoptimized representation (text-based, no fancy 
reused DAG-like substructures, etc) from their CSV file is still only 17MB. So 
not saying this is a great option, but just a dumb CSV parser on that could 
work too.
<cologic> line by line of 2.17.115.0,2.17.115.255,GB -- trivial.
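
A constrained parser for one such line could be sketched as follows. It 
assumes exactly the three-field "first,last,country" layout shown above and 
handles IPv4 only; any other line is rejected rather than interpreted. 
Ipv4Range, parseIpv4 and parseLine are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

struct Ipv4Range {
    std::uint32_t first;
    std::uint32_t last;
    std::string country;
};

// Parses a dotted-quad IPv4 address. Returns false on any malformed input.
bool parseIpv4(const std::string& text, std::uint32_t& out) {
    std::uint32_t result = 0;
    std::size_t i = 0;
    for (int octet = 0; octet < 4; ++octet) {
        if (octet > 0) {
            if (i >= text.size() || text[i] != '.') return false;
            ++i;
        }
        std::uint32_t value = 0;
        std::size_t digits = 0;
        while (i < text.size() && text[i] >= '0' && text[i] <= '9' && digits < 3) {
            value = value * 10 + static_cast<std::uint32_t>(text[i] - '0');
            ++i;
            ++digits;
        }
        if (digits == 0 || value > 255) return false;
        result = (result << 8) | value;
    }
    if (i != text.size()) return false;  // trailing garbage
    out = result;
    return true;
}

// Parses "first,last,CC". Returns false unless the line is well formed.
bool parseLine(const std::string& line, Ipv4Range& out) {
    const std::size_t c1 = line.find(',');
    if (c1 == std::string::npos) return false;
    const std::size_t c2 = line.find(',', c1 + 1);
    if (c2 == std::string::npos) return false;

    Ipv4Range range;
    if (!parseIpv4(line.substr(0, c1), range.first)) return false;
    if (!parseIpv4(line.substr(c1 + 1, c2 - c1 - 1), range.last)) return false;
    range.country = line.substr(c2 + 1);
    if (range.first > range.last || range.country.size() != 2) return false;

    out = range;
    return true;
}
```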
<cologic> The other concern I have with the style of klondike's code is that 
it's full of basically untrusted-data-driven pointer-chasing (via seeks at the 
moment, but still) which as klondike acknowledges can result in exponential 
blowup.
<cologic> It's not his fault -- he did what one could with the format -- but 
from working on other code dealing with a conceptually similar file format, I'm 
not a fan of it.
<cologic> There's something to be said for the relative 
simplicity/verifiability/fewer-weird-failure-modes of a constrained CSV parser 
which builds an in-memory representation with some reasonable std::foo data 
structure.
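
One possible "reasonable std::foo data structure" for this is a std::map 
keyed by range start, sketched below. CountryTable is a hypothetical name, 
and the sketch assumes the ranges do not overlap and are IPv4 addresses 
already converted to 32-bit integers.

```cpp
#include <cstdint>
#include <map>
#include <string>

class CountryTable {
public:
    // Registers the inclusive range [first, last] for a country code.
    void add(std::uint32_t first, std::uint32_t last, const std::string& country) {
        ranges_[first] = Entry{last, country};
    }

    // Returns the country code for ip, or an empty string if no range covers it.
    std::string lookup(std::uint32_t ip) const {
        auto it = ranges_.upper_bound(ip);  // first range starting after ip
        if (it == ranges_.begin()) {
            return std::string();           // every range starts after ip
        }
        --it;                               // last range starting at or before ip
        if (ip > it->second.last) {
            return std::string();           // ip falls in a gap between ranges
        }
        return it->second.country;
    }

private:
    struct Entry {
        std::uint32_t last;
        std::string country;
    };
    std::map<std::uint32_t, Entry> ranges_;
};
```

Lookups are logarithmic in the number of ranges and involve no 
data-driven pointer chasing beyond what std::map does internally. Once the 
table is built and no longer modified, concurrent lookup() calls from 
multiple threads are safe.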
<cologic> But I suspect that if one fuzzed most of these geoip format parsers, 
the results would be dire.

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/1774502

Title:
  Free GeoIP Database Format Change

Status in DC++:
  Confirmed

Bug description:
  "Updated versions of the GeoLite Legacy databases are now only available to 
redistribution license customers, although anyone can continue to download the 
March 2018 GeoLite Legacy builds. Starting January 2, 2019, the last build will 
be removed from our website. GeoLite Legacy database users will need to switch 
to the GeoLite2 or commercial GeoIP databases and update their integrations by 
January 2, 2019."
  https://dev.maxmind.com/geoip/legacy/geolite/

To manage notifications about this bug go to:
https://bugs.launchpad.net/dcplusplus/+bug/1774502/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~linuxdcpp-team
Post to     : linuxdcpp-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~linuxdcpp-team
More help   : https://help.launchpad.net/ListHelp
