I'm kind of seconding Rod here. It might make more sense, depending on your use case and local computing resources, to just get a download of Plantae *AND* Brazil from GBIF periodically, then process that to exclude existing Brazilian datasets. You could then use something like Apache Hadoop or Spark to efficiently split the file by dataset or by institution code.
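
A minimal sketch of that filter-and-split step, in plain Python rather than Hadoop or Spark (reasonable while the file fits on one machine). It assumes the download has been unpacked to a tab-delimited text file with a header row and a `datasetKey` column; the function name, the column default, and the exclusion set are hypothetical and would need adjusting to the actual download.

```python
import os


def split_by_dataset(download_path, out_dir, exclude_keys=frozenset(),
                     key_column="datasetKey"):
    """Split a tab-delimited occurrence file into one file per dataset.

    Rows whose dataset key is in exclude_keys are skipped (for example,
    Brazilian datasets that are already harvested directly).
    Returns a dict mapping each written dataset key to its record count.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}  # dataset key -> open output file
    counts = {}   # dataset key -> number of records written
    with open(download_path, encoding="utf-8") as src:
        header = src.readline()
        # Assumption: the first line is a header naming the columns.
        key_index = header.rstrip("\n").split("\t").index(key_column)
        try:
            for line in src:
                key = line.rstrip("\n").split("\t")[key_index]
                if key in exclude_keys:
                    continue
                if key not in handles:
                    # One output file per dataset; with thousands of
                    # datasets this may hit the OS open-file limit.
                    path = os.path.join(out_dir, key + ".tsv")
                    handles[key] = open(path, "w", encoding="utf-8")
                    handles[key].write(header)
                    counts[key] = 0
                handles[key].write(line)
                counts[key] += 1
        finally:
            for handle in handles.values():
                handle.close()
    return counts
```

Calling `split_by_dataset("occurrence.txt", "by_dataset", exclude_keys=already_harvested)` would leave one `.tsv` per remaining dataset in `by_dataset/`, and the returned counts give the per-dataset totals Rod mentioned computing locally. Passing a different `key_column` would split by institution code instead, provided the file has such a column.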
This would greatly simplify your interactions with GBIF (down to just periodically generating a download programmatically), and you would have an easy place to insert any additional data transformations you want. This is the path I take for my work, at least: the incremental cost of a couple million more records is worth the overall reduction in complexity.

- Alex

On 09/09/2015 12:16 PM, Eduardo Dalcin wrote:
> Hi Rod,
>
> The real purpose is to have a list of UUIDs and the "source web page"
> for each dataset. Thus, one way to do it is to select those resources
> with counts <> 0 for PLANTAE *AND* Brazil.
>
> I don't want to do any statistical analysis, just feed a local
> harvester / aggregator.
>
> The problem is that, considering the reply from Jan Legind on Sep 3, we
> have to check the datasets one by one (https://goo.gl/3wysaA) to see
> whether each is a Herbarium / Preserved Specimen (Plantae) dataset or
> not, from the request
> http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN.
>
> Does it make sense?
>
> Thanks for your curiosity!
> :)
>
> Cheers,
>
> Eduardo
>
> --------------------------------
> Eduardo Dalcin <http://eduardo.dalc.in>
> Instituto de Pesquisas Jardim Botânico do Rio de Janeiro - JBRJ
> e-mail: edalcin at jbrj.gov.br
> Trabalho / Work: +55 21 3204 2116
> --------------------------------
> e-mail alternativo / alternate email: edalcin at jbrj.org
> --------------------------------
> Agendar reunião / Schedule a meeting: http://agendar.dalc.in
>
> On Mon, Sep 7, 2015 at 12:33 PM, Roderic Page
> <Roderic.Page at glasgow.ac.uk> wrote:
>
>     Hi Eduardo,
>
>     I'm curious: is the purpose to get counts by dataset by country,
>     or to get all the plant occurrences for Brazil? The latter can be
>     obtained by downloading all plant occurrences in Brazil:
>     http://www.gbif.org/occurrence/search?TAXON_KEY=6&COUNTRY=BR (you
>     could then compute the per-dataset stats locally). I realise that
>     this isn't as convenient as having GBIF slice the data for you in
>     the API.
>
>     Regards
>
>     Rod
>
>     ---------------------------------------------------------
>     Roderic Page
>     Professor of Taxonomy
>     Institute of Biodiversity, Animal Health and Comparative Medicine
>     College of Medical, Veterinary and Life Sciences
>     Graham Kerr Building
>     University of Glasgow
>     Glasgow G12 8QQ, UK
>
>     Email: Roderic.Page at glasgow.ac.uk
>     Tel: +44 141 330 4778
>     Skype: rdmpage
>     Facebook: http://www.facebook.com/rdmpage
>     LinkedIn: http://uk.linkedin.com/in/rdmpage
>     Twitter: http://twitter.com/rdmpage
>     Blog: http://iphylo.blogspot.com
>     ORCID: http://orcid.org/0000-0002-7101-9767
>     Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
>     ResearchGate: https://www.researchgate.net/profile/Roderic_Page
>
>
>> On 4 Sep 2015, at 10:39, Eduardo Dalcin <edalcin at jbrj.org> wrote:
>>
>> Hi Markus,
>>
>> Yes, it's a shame I can't use country and "nub" together.
>> Is there any hope of that?
>>
>> Eduardo
>>
>> On Thu, Sep 3, 2015 at 4:29 PM, Markus Döring <mdoering at gbif.org> wrote:
>>
>>     Eduardo,
>>
>>     as you might have seen from my issue comment, the webservice
>>     uses a different parameter name for taxonKey, which is a bug
>>     we need to fix at some point.
>>     Please use nubKey for now, like this:
>>
>>     http://api.gbif.org/v1/occurrence/counts/datasets?nubKey=6
>>
>>     The real problem for you will be that we do not support the
>>     combination of the country and the taxon filter, just one of
>>     the two. So you cannot search for plants in Brazil, I am
>>     afraid, just for datasets about Brazil and datasets with
>>     plant records.
>>
>>     Markus
>>
>> > On 03 Sep 2015, at 14:12, Eduardo Dalcin <edalcin at jbrj.org> wrote:
>> >
>> > Thanks Jan. I'll keep exploring and I'll be in touch if I
>> > need to.
>> >
>> > Best,
>> >
>> > Eduardo
>> >
>> > On Thu, Sep 3, 2015 at 4:51 AM, Jan Legind [GBIF] <jlegind at gbif.org> wrote:
>> > Dear Eduardo,
>> >
>> > Thanks for getting in touch with us about these issues.
>> >
>> > The first request,
>> > http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> > returns the number of records located in Brazil for the
>> > facets in the request.
>> >
>> > The second query,
>> > http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> > uses the Occurrence Inventories web service
>> > (http://www.gbif.org/developer/occurrence#inventories), which
>> > does not support the basis-of-record facet in the /datasets
>> > request. I understand that it would be better if the API
>> > response yielded an error message in this instance.
>> >
>> > Concerning the other issues, you are indeed right that the
>> > counts do not make sense in the context of taxon key 6, which
>> > is Plantae. Actually, the API does not handle the taxonKey
>> > search at all, contrary to what the documentation states:
>> >
>> > Resource URL: /occurrence/counts/datasets
>> > Method: GET
>> > Response: Counts
>> > Description: Lists occurrence counts for datasets that cover a
>> > given taxon or country.
>> > Parameters: country, taxonKey
>> >
>> > As you can see here,
>> > http://api.gbif.org/v1/occurrence/counts/datasets?taxonKey=6
>> > this request doesn't return anything.
>> >
>> > The GBIF developers will handle this issue in due time.
>> >
>> > You can follow the issue in our bug tracking service here:
>> > http://dev.gbif.org/issues/browse/POR-2828
>> >
>> > With best regards,
>> >
>> > Jan K. Legind
>> > Data manager, GBIF Secretariat
>> >
>> > From: API-users [mailto:api-users-bounces at lists.gbif.org] On Behalf Of Eduardo Dalcin
>> > Sent: 2 September 2015 20:06
>> > To: api-users at lists.gbif.org; dev at gbif.org
>> > Cc: João Monnerat Lanna; Natália Queiroz; Diogo Silva; Laura; Ricardo Avancini
>> > Subject: [API-users] Some questions from a begginer
>> >
>> > Hi folks,
>> >
>> > This is my first message to the list. So, please, be nice :)
>> >
>> > I'm working here at the Rio de Janeiro Botanical Garden,
>> > together with the guys at the National Center for Flora
>> > Conservation. We are doing the risk assessment of the
>> > Brazilian flora for the government. We have assessed, so far,
>> > the risk of ca. 6,000 species, but we still have to assess ca.
>> > 35,000. Access to occurrence records for Brazil is crucial, and
>> > every occurrence is important.
>> >
>> > That means that we have to put together occurrence data
>> > from different sources and, after the first batch of the risk
>> > assessment, we realized that we need to build our own
>> > aggregator. We are planning to do this with the
>> > Lontra-harvester, with the help of the guys at the Brazilian
>> > GBIF Node.
>> >
>> > So, one of the first steps was to list the available
>> > resources to understand the dimension of the task, and that
>> > brings me to my questions.
>> >
>> > First:
>> >
>> > The request:
>> >
>> > http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> > returns 4,982,689 records.
>> >
>> > And the request:
>> >
>> > http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> > returns (here) 7,406,310 records.
>> >
>> > Comments?
>> >
>> > Second:
>> >
>> > The request:
>> >
>> > http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> > returns things like this:
>> >
>> > "197908d0-5565-11d8-b290-b8a03c50a862":27629
>> >
>> > But a search on the same dataset:
>> >
>> > http://www.gbif.org/occurrence/search?TAXON_KEY=6&DATASET_KEY=197908d0-5565-11d8-b290-b8a03c50a862
>> >
>> > returns "null" (of course, it is FishBase!).
>> >
>> > I have plenty of examples like this, highlighted in yellow
>> > here (not finished!):
>> >
>> > https://docs.google.com/spreadsheets/d/1msUjwMLoKwnXxJFzF20SeN_C65RIkGLbwaYyj459VTc/edit?usp=sharing
>> >
>> > Comments?
>> >
>> > I think those two questions are a good start. Please let me
>> > know if I'm doing something wrong.
>> >
>> > Cheers,
>> >
>> > Eduardo
>>
>> _______________________________________________
>> API-users mailing list
>> API-users at lists.gbif.org
>> http://lists.gbif.org/mailman/listinfo/api-users
