> On 6 Oct 2021, at 13:56, mn <[email protected]> wrote:
> 
> On 06.10.21 11:38, Christiaan Hofman wrote:
>>>>>>>> For efficiency, we don’t fetch all the results at once. If you show
>>>>>>>> the status bar, you may see the total number of available results.
>>>>>>>> 
>>>>>>>> If you search repeatedly, further results will be fetched.
>>>>>>> 
>>>>>>> OK.
>>>>>>> Is there some way to fetch all results at once?
>>>>>>> A setting or option to customize the numbers of results fetched?
>>>>>>> If not, adding those would be most welcome.
>>>>>> 
>>>>>> No, there isn’t.
>>>>> 
>>>>> BTW, this is not just our choice. It is also the policy for the server
>>>>> for the web interface. And they threaten to block your IP address when
>>>>> you don’t comply with their policy, so I don’t think it is a good idea
>>>>> to ignore that.
>>>> 
>>>> On their website
>>>> https://www.ncbi.nlm.nih.gov/books/NBK25497/#chapter2.Usage_Guidelines_and_Requiremen
>>>> they give the following:
>>>> 
>>>> > In order not to overload the E-utility servers, NCBI recommends that
>>>> users post no more than three URL requests per second and limit large
>>>> jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time
>>>> during weekdays. Failure to comply with this policy may result in an IP
>>>> address being blocked from accessing NCBI.
>>>> 
>>>> The mechanics behind the scenes here elude me: BibDesk fetches 50
>>>> _results_ in one go, apparently fine with the '≤3 URL requests/s'?
>>>> 
>>>> On the face of it, I'd conclude that smaller portions (like 20 results
>>>> per 'search') would be fine in any case.
>>>> But then 150 results at a time as well?
>>>> 
>>>> The weekday angle seems quite vague, but open to the interpretation
>>>> that larger requests on weekends will be possible/tolerated?
>>>> 
>>>> But I suspect that 'URL requests' as limited by the server and the
>>>> 'BibDesk results fetching' do not even correspond in that matter?
>>>> 
>>> 
>>> A comment in our code also mentions a limit on the number of results.
>>> Perhaps they have changed that policy over time. But getting a larger
>>> number of results can also slow down the search (a lot). Also, every
>>> fetch action is one URL request. The first one is two, because we first
>>> need to get the number of results.
>>> 
>> 
>> BTW, if you really get a very large number of results you probably did
>> not target your search very well, and fetching a large number of results
>> is really not that useful, just wasteful. Are you going to look for the
>> result you need in 15000 items?
>> 
> 
> The waste angle is in the eye of the beholder. I'll certainly not read
> line by line thru 79000 results. But I also find the result fetching in
> predefined chunks of 50 problematic.
> 
> There are of course different usage scenarios for BibDesk.
> 
> The goal here is to approach something on the scale of a systematic
> literature review, with as much of the work done within BibDesk as
> possible, to get a minimized workflow.
> 
> This is inspired by JabRef's functionality for this (but that is just my
> understanding of how it's advertised in the menus of that app, as it
> then always crashes on my system due to some Java bugs; plus I
> previously much preferred BibDesk for the work I needed done, which were
> admittedly much smaller projects that just grew larger over time).
> 
> So I noticed that in BibDesk, with results expected to be (somewhat)
> largish, it gets difficult for me to keep track of what's done and
> what remains to be done. Re-downloading the same results and analyzing
> them again is certainly also wasteful. How to solve that with more
> advanced queries and better targeting is unresolved so far.
> 
> My thinking was to fetch results online, then work offline to search
> and sort within those downloaded…
> 
> It may be a suboptimal approach after all, but with the advanced search
> on PubMed, I have a much easier time narrowing down the results via
> (better?) queries. My understanding is that within BibDesk, that is
> limited to spelling out the exact query directly (perhaps doable, but
> with GUI options it certainly gets that much easier to eg limit
> publication dates).
> 
> From that search results page I also just created a query that then
> allowed me to download 2712 results – with abstracts included – into one
> text file. Is that not reproducible via BibDesk's access to Entrez?
> 
> 
> 
> — Mike
> 

I’ve increased the batch size to 100, and let it fetch 10 batches in a row 
automatically. So that gives you 1000 with each search.
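For readers curious how such batching maps onto the E-utilities interface: the scheme above (100 results per batch, up to 10 batches, while staying under NCBI's 3-requests-per-second policy) corresponds to efetch's retstart/retmax paging. Here is a minimal sketch, assuming a prior esearch call with usehistory=y has returned a WebEnv and query_key; this is an illustration only, not BibDesk's actual implementation:

```python
import time
import urllib.parse

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
BATCH_SIZE = 100          # results per efetch request
MAX_BATCHES = 10          # 10 batches in a row -> up to 1000 results
MIN_INTERVAL = 1.0 / 3.0  # NCBI policy: at most 3 URL requests per second

def batch_urls(webenv, query_key, total, db="pubmed"):
    """Build efetch URLs for up to MAX_BATCHES batches of BATCH_SIZE
    results, paging through the History server result set with
    retstart/retmax."""
    urls = []
    for start in range(0, min(total, BATCH_SIZE * MAX_BATCHES), BATCH_SIZE):
        params = {
            "db": db,
            "query_key": query_key,
            "WebEnv": webenv,
            "retstart": start,
            "retmax": BATCH_SIZE,
            "rettype": "medline",
            "retmode": "text",
        }
        urls.append(BASE + "efetch.fcgi?" + urllib.parse.urlencode(params))
    return urls

def fetch_all(urls, fetch):
    """Issue the requests via the given fetch callable, sleeping as
    needed so consecutive requests stay under 3 per second."""
    results = []
    for url in urls:
        t0 = time.monotonic()
        results.append(fetch(url))
        elapsed = time.monotonic() - t0
        if elapsed < MIN_INTERVAL:
            time.sleep(MIN_INTERVAL - elapsed)
    return results
```

With a hypothetical search returning 2712 hits (as in the example above), this would cap the download at the first 1000 results across 10 paced requests.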

Christiaan

_______________________________________________
Bibdesk-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bibdesk-users
