Re: How to deal with many files using solr external file field
Can you provide a stack trace for the OOM exception?

On Tue, Jun 7, 2011 at 4:25 PM, Bohnsack, Sven wrote:
> Hi all,
>
> we're using solr 1.4 and external file field ([1]) for sorting our
> search results. We have about 40.000 Terms, for which we use this sorting
> option.
> Currently we're running into massive OutOfMemory-Problems and we're not
> quite sure what's the matter. It seems that the garbage collector stops
> working or some processes are going wild. However, solr starts to allocate
> more and more RAM until we experience this OutOfMemory-Exception.
>
> We noticed the following:
>
> For some terms one could see in the solr log that there appear some
> java.io.FileNotFoundExceptions, when solr tries to load an external file for
> a term for which there is no such file, e.g. solr tries to load the
> external score file for "trousers" but there is none in the
> /solr/data-Folder.
>
> Question: is it possible that those exceptions are responsible for the
> OutOfMemory-Problem, or could it be due to the large(?) number of 40k terms
> for which we want to sort the result via external file field?
>
> I'm looking forward to your answers, suggestions and ideas :)
>
> Regards
> Sven
>
> [1]:
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
Re: Find newly added documents
"newly added" is a bit vague. Do you mean "since last Sunday" ? "between the last and the one before that" ? Also, do you need to distinguish between updated and newly added documents ? Perhaps you could be more specific about the use case. -Simon On Fri, Jan 22, 2010 at 4:25 AM, Erik Hatcher wrote: > You can do a search, sort by the special _docid_ "field" (underscores > mandatory) descending and the top documents listed will be the latest added. > > Like this, un-url-encoded: q=*:*&sort=_docid_ desc > >Erik > > > > On Jan 22, 2010, at 3:39 AM, Sandeep Tagore wrote: > > >> Thanks a lot Erik. Is there any other alternate way? >> Thanks a lot for your response. >> >> Regards, >> Sandeep >> >> >> You'll be able to find them only after a commit. >> >> One way to do this is index a timestamp with every document, and find >> the latest ones using that field. There's an example of an automatic >> timestamp field in the example schema. >> >>Erik >> >> -- >> View this message in context: >> http://old.nabble.com/Find-newly-added-documents-tp27254813p27270104.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> >
Re: unloading a solr core doesn't free any memory
Which garbage collection parameters is the JVM using? The memory will not always be freed immediately after an event like unloading a core or starting a new searcher.

2010/2/8 Tim Terlegård
> To me it doesn't look like unloading a Solr Core frees the memory that
> the core has used. Is this how it should be?
>
> I have a big index with 50 million documents. After loading a core it
> takes 300 MB RAM. After a query with a couple of sort fields Solr
> takes about 8 GB RAM. Then I unload (CoreAdminRequest.unloadCore) the
> core. The core is not shown in /solr/ anymore. Solr still takes 8 GB
> RAM. Creating new cores is super slow because I have hardly any memory
> left. Do I need to free the memory explicitly somehow?
>
> /Tim
Re: HttpDataSource consume REST API with Authentication required
http://issues.apache.org/jira/browse/SOLR-1490 has a patch which will do what you want.

-Simon

On Thu, Mar 4, 2010 at 2:21 PM, javaxmlsoapdev wrote:
> I have to use
> http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource
> HttpDataSource to ask Solr consume my REST service and index data returned
> from that service. My application/service has authentication/authorization.
> When Solr invokes this service it MUST have valid credentials and stuff.
> How/where do I configure/write authentication part before Solr consumes my
> REST service?
>
> Any pointers would be appreciated.
>
> Thanks,
>
> --
> View this message in context:
> http://old.nabble.com/HttpDataSource-consume-REST-API-with-Authentication-required-tp27785340p27785340.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Indexing a word in url
I also couldn't get the exact results I wanted for indexing URL components using WordDelimiterFilter or PatternTokenizer, so I resorted to adding a new field ('pathparts'), plus a few lines of code in our content preprocessor, which generates the tokens before it submits documents to Solr for indexing.

-Simon

On Tue, Apr 1, 2008 at 7:24 PM, Chris Hostetter wrote:
> : Actually I want to use anything that is not alphabet or digit to be the
> : separator - anything between them will be a word (so that I can use the URL
> : fragment to see what is indexed about this site)...any suggestion?
>
> In addition to Mike's suggestion of trying out the WordDelimiterFilter,
> take a look at the PatternTokenizerFactory.
>
> -Hoss
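
The preprocessor code itself is not shown in this thread. As a rough sketch of the idea only, the function below splits a URL on every run of characters that is not a letter or digit, which is the behaviour asked for in the quoted question. The function name and the lowercasing step are assumptions; `pathparts` is the field name from the message above.

```python
import re

# Any run of characters that is not an ASCII letter or digit acts as a separator.
_SEPARATORS = re.compile(r"[^A-Za-z0-9]+")


def path_parts(url):
    """Return the lowercased alphanumeric tokens of a URL, in order."""
    return [token.lower() for token in _SEPARATORS.split(url) if token]


if __name__ == "__main__":
    # Hypothetical document; the tokens would be sent as a multi-valued field.
    doc = {"id": "1", "url": "http://www.example.com/products/mens-trousers.html"}
    doc["pathparts"] = path_parts(doc["url"])
    print(doc["pathparts"])
```

Generating the tokens before indexing keeps the analysis chain for the field trivial, at the cost of moving that logic into the client.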
Re: SolrClient from inside processAdd function
Similarly, I had considered a URP which would call the Solr Tagger to add new metadata fields for indexing to incoming documents (and recall discussing this with David Smiley), but eventually decided against this approach on the grounds of complexity.

-Simon

On Wed, Sep 4, 2019 at 2:10 PM Arnold Bronley wrote:
> I need to search some other collection inside processAdd function and
> append that information to the indexing request.
>
> On Tue, Sep 3, 2019 at 7:55 PM Erick Erickson wrote:
>
> > This really sounds like an XY problem. What do you need the SolrClient
> > _for_? I suspect there’s an easier way to do this…..
> >
> > Best,
> > Erick
> >
> > > On Sep 3, 2019, at 6:17 PM, Arnold Bronley wrote:
> > >
> > > Hi,
> > >
> > > Is there a way to create SolrClient from inside processAdd function for
> > > custom update processor for the same Solr on which it is executing?
Re: Solr Text Tagger | All tags in desc order
Hi Vipul:

I'm not sure what you mean by 'score' in this context, as tagging requests do not return a standard Solr/Lucene score. If you're looking for the number of times a specific tag occurs in the tagged text, then you'll need to calculate that in your application from the returned JSON.

HTH

-Simon

On Fri, Oct 4, 2019 at 5:41 AM Vipul Sharma wrote:
> Hi All,
>
> After putting all the master data in Solr Text Tagger, I want to parse
> resume text to fetch the top five skills based on their score. Is there any
> way to fetch the result in descending order?
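
Counting occurrences in the application could look roughly like the sketch below. It assumes the tagger response has already been parsed into a dict with a `tags` list, where each entry carries an `ids` list of matched document ids. That shape is an assumption: the actual JSON layout depends on the Solr version and response-writer settings, so adapt the lookups to what your instance returns. The function name is hypothetical.

```python
from collections import Counter


def top_tags(tagger_response, n=5):
    """Count how often each tagged id occurs and return the n most frequent.

    Assumes (hypothetically) a parsed response of the form
    {"tags": [{"startOffset": ..., "endOffset": ..., "ids": [...]}, ...]}.
    """
    counts = Counter()
    for tag in tagger_response.get("tags", []):
        for doc_id in tag.get("ids", []):
            counts[doc_id] += 1
    # most_common sorts by count, descending.
    return counts.most_common(n)


if __name__ == "__main__":
    sample = {
        "tags": [
            {"startOffset": 0, "endOffset": 4, "ids": ["java"]},
            {"startOffset": 10, "endOffset": 14, "ids": ["solr"]},
            {"startOffset": 20, "endOffset": 24, "ids": ["java"]},
        ]
    }
    print(top_tags(sample))
```

The result is a list of `(id, count)` pairs in descending order of count, which is the ordering asked for in the question.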