Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread dinesh naik
Thanks Erick and Upayavira for your inputs. Is there a way I can associate this with the unique id of a document, either using schema browser or TermsComponent? Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 2:55 AM, Upayavira wrote: > Use the schema browser on the admin UI, and click the "load te

Re: Some guidance on memory requirements/usage/tuning

2015-06-30 Thread Toke Eskildsen
On Tue, 2015-06-30 at 16:39 +1000, Caroline Hind wrote: > We have very recently upgraded from SOLR 4.1 to 5.2.1, and at the same > time increased the physical RAM from 24Gb to 96Gb. We run multiple > cores on this one server, approximately 20 in total, but primarily we > have one that is huge in co

Solr DIH from MySQL with unique ID

2015-06-30 Thread kurt
Hello. I have a question about the Solr Data Import Handler. I'm using Solr 5.2.1 on a Linux server with 32G ram. I have five different collections, and for each collection, I'm trying to import data from a MySQL database. All of the MySQL queries work properly in MySQL, and previously I was abl

Re: Solr Suggester not working.

2015-06-30 Thread ssharma7...@gmail.com
davidphilip cherian & Alessandro Benedetti, Thanks for your feedback & links, I was able to get the suggestions from suggester component. Thanks & Regards, Sachin Vyas. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Suggester-not-working-tp4214086p4214873.html Sent fro

Re: fq versus q

2015-06-30 Thread Esther Goldbraich
Thank you Erick. This solution fits part of our queries, will adopt it for those. Yet we have use-cases in which the results can not be cached. Everyone, What do you think about our assumptions and conclusions? As a general rule of thumb, at least in our case, would you please comment on the

Re: Solr Suggester not working.

2015-06-30 Thread Vincenzo D'Amore
Hi, can you post your final configuration? On Tue, Jun 30, 2015 at 9:57 AM, ssharma7...@gmail.com < ssharma7...@gmail.com> wrote: > davidphilip cherian & Alessandro Benedetti, > Thanks for you feedback & links, I was able to get the suggestions from > suggester component. > > Thanks & Regards, >

Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Thomas Michael Engelke
God damn. Thank you. *ashamed* Am 30.06.2015 00:21 schrieb Erick Erickson: > Try not putting it in double quotes? > > Best, > Erick > > On Mon, Jun 29, 2015 at 12:22 PM, Thomas Michael Engelke > wrote: > >> A friend and I are trying to develop some software using Solr in the >> background

Re: Solr Suggester not working.

2015-06-30 Thread ssharma7...@gmail.com
Vincenzo D'Amore, The following is my (CURRENT) Working Final Configuration: *Scheme.xml* . . . . . .

Re: optimize status

2015-06-30 Thread Erick Erickson
I've actually seen this happen right in front of my eyes "in the field". However, that was a very high-performance environment. My assumption was that fragmented index files were causing more disk seeks especially for the first-pass query response in distributed mode. So, if the problem is similar,

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Erick Erickson
In short, not unless you want to get into low-level Lucene coding. Inverted indexes are, well, inverted so their very structure makes this difficult. It looks like this: But I'm not convinced yet that this isn't an XY problem. What is the high-level problem you're trying to solve here? Maybe there

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread dinesh naik
Hi Erick, This is mainly for debugging purposes. If I have 20M records and a few fields in some of the documents are not indexed as expected, or something went wrong during indexing, then how do we pinpoint the exact issue and fix the problem? Best Regards, Dinesh Naik On Tue, Jun 30, 2015 at 5:56 P

Re: Some guidance on memory requirements/usage/tuning

2015-06-30 Thread Erick Erickson
bq: The type of queries that are run can return anything from 1 million to 9.5 million documents, and typically run for anything from 20 to 45 minutes. Uhhh, are you literally setting the &rows parameter to over 9.5M and getting that many docs all at once? Or is that just numFound and you're _real

SolrCloud 5.2.1 upgrade

2015-06-30 Thread Vincenzo D'Amore
Hi All, I have a bunch of java clients connecting to a solrcloud cluster 4.8.1 with Solrj 4.8.0. The question is, do I have to switch clients and cluster to the new version at the same time? Could I upgrade the cluster and in the following months upgrade clients? BTW, looking at Solrj 5.2.1 I have seen

Re: Solr DIH from MySQL with unique ID

2015-06-30 Thread Erick Erickson
Two very quick questions: 1> how big is your transaction log? Well, do you even have one? If Solr is abnormally terminated, it'll replay the tlog on startup. The scenario here would be something like you were running DIH without any kind of hard commit specified and killed Solr for some reason. Th

Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Erick Erickson
Pesky computers, they keep doing exactly what I tell 'em to do, not what I mean ;) I'll open a JIRA for making Solr DWIM-compliant, Do What I Mean ;) ;) On Tue, Jun 30, 2015 at 4:17 AM, Thomas Michael Engelke wrote: > God damn. Thank you. > > *ashamed* > > Am 30.06.2015 00:21 schrieb Erick Eric

Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Alessandro Benedetti
I would like to add some considerations if possible. I find the field type heavily analysed; are you sure this is OK for your suggestion requirements? Usually it is better to keep the suggestion field as lightly analysed as possible and then play with the different types of suggesters. If you not

Re: SolrCloud 5.2.1 upgrade

2015-06-30 Thread Vincenzo D'Amore
Update: regarding the solrj changelog I found this: - https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+4+to+Solr+5 and this: - https://issues.apache.org/jira/browse/SOLR/component/12324331?selectedTab=com.atlassian.jira.jira-projects-plugin:component-changelog-panel On Tue

Re: Solr Suggester not working.

2015-06-30 Thread Vincenzo D'Amore
Thanks Sachin Vyas. Maybe I have found a typo, but there is a closed comment "-->" alone at end of tag false --> On Tue, Jun 30, 2015 at 2:09 PM, ssharma7...@gmail.com < ssharma7...@gmail.com> wrote: > Vincenzo D'Amore, > The following is my (CURRENT) Working Final Configuration: > > *Scheme.

Re: Solr Suggester not working.

2015-06-30 Thread ssharma7...@gmail.com
Vincenzo D'Amore, Yes You are right, it's a typo, I missed it while cleaning the XML to put on the Solr-User list. But, *REMOVE *the following line, this was not used in my Solr 5.1 configuration: * false--> * Regards, Sachin Vyas. -- View this message in context: http://lucene.472066.n3.nabb

Re: Solr DIH from MySQL with unique ID

2015-06-30 Thread kurt
Erick, Many thanks for your reply. 1. The file solr.log does not show any errors, however, there is a file solr.log.8 which is 5MB and has a ton of text that was trying to index, but there was an invalid date error. I fixed that. Is it possible that Solr keeps trying to use that log? Can I simply

Re: Some guidance on memory requirements/usage/tuning

2015-06-30 Thread Alessandro Benedetti
Am I wrong, or has the default directory factory been the "NRTCachingDirectoryFactory" since Solr 4.x? If I remember correctly, this factory creates a Directory implementation built on top of an "MMapDirectory". In this case we should rely on the operating system's memory-mapping feature to prope

Restricting fields returned by Suggester result.

2015-06-30 Thread ssharma7...@gmail.com
Hi, Is it possible to restrict the result returned by Suggester to "selected" fields only? i.e. Currently, Suggester returns data in the following structure (XML). Can I restrict the Solr (5.1) Suggester to return ONLY "term" & EXCLUDE & as per Suggester result XML below ? 0 16

Suggester Result Exception in specific scenario

2015-06-30 Thread ssharma7...@gmail.com
Hi, I have the following Solr 5.1 configuration: *schema.xml* . . . . . .

Suggester configuration queries.

2015-06-30 Thread ssharma7...@gmail.com
Hi, I have the following Solr 5.1 configuration: *schema.xml* . . . . . .

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Erick Erickson
Dinesh: This is what the admin/analysis page is for. It shows you exactly what tokens are produced by what steps in the analysis chain. That would be far better than trying to analyze the indexed terms. Best, Erick On Tue, Jun 30, 2015 at 8:35 AM, dinesh naik wrote: > Hi Erick, > This is mainly

Re: Restricting fields returned by Suggester result.

2015-06-30 Thread Alessandro Benedetti
Actually what you are asking does not make any sense. Solr response is returning that data structure because it must return as much as possible. It is responsibility of the client to get what it needs from the response. Talking about the Java Client, I contributed the SolrJ code to parse the Sugge

Re: Solr DIH from MySQL with unique ID

2015-06-30 Thread Erick Erickson
1> Not solr log. The transaction log. If it is present it'll be a child directory of your data directory called "tlog", a sibling to your index directory. And "big" here is gigabytes. And yes, you can just nuke it if you want. You get one automatically if you are using SolrCloud. 2> OK, it was a l
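
A minimal sketch of the size check described above, assuming the layout the message gives: a "tlog" directory inside the core's data directory, next to "index". The function name and any path passed to it are hypothetical, not part of Solr.

```python
import os

def tlog_size_bytes(data_dir):
    """Return the total size in bytes of all files under <data_dir>/tlog.

    Returns 0 when there is no tlog directory, i.e. no transaction log
    has been written for this core.
    """
    tlog_dir = os.path.join(data_dir, "tlog")
    if not os.path.isdir(tlog_dir):
        return 0
    total = 0
    # Walk the directory in case the log files are nested.
    for root, _dirs, files in os.walk(tlog_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total
```

Calling it with the core's data directory and dividing the result by 1024 ** 3 gives gigabytes, which is the scale the message calls "big".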

Re: Suggester configuration queries.

2015-06-30 Thread Erick Erickson
This will be pretty much unworkable for any large corpus. The DocumentDictionaryFactory builds its index by reading the stored value from every document in your index to put into a sidecar Solr index (for free text suggester). This can take many minutes so doing this on every commit is an anti-pat

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread dinesh naik
Hi Erick, I agree with you. But I was checking if we could get hold of the whole document (to see all analyzed field values). There might be chances that a field value is common to multiple documents. In such cases it will be difficult to backtrack which document has the issue. Because admin

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Mark Ehle
Thanks to all for the help - it's now storing text and I can search and get results just as before in 4.6, but I cannot get snippets to appear when I ask for highlighting. When I add documents, here is the URL my script generates: http://localhost:8080/solr/newspapers/update/extract?literal.id=2015

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Alessandro Benedetti
Do you have the original document available ? Or stored in the field of interest ? Should be quite an easy test to reproduce the Analysis simply using the analysis tool Upaya and Erick suggested. Just use your real document content and you will see how it is exactly analysed. Cheers 2015-06-30 15

Re: How to do a Data sharding for data in a database table

2015-06-30 Thread wwang525
Hi, I am currently investigating the queries with a much smaller index size (1M) to see the effect of grouping and faceting on the performance degradation. This will allow me to do a lot of tests in a short period of time. However, it looks like the query is executed much faster the second time. This is tested

Re: SolrCloud 5.2.1 upgrade

2015-06-30 Thread Shawn Heisey
On 6/30/2015 6:40 AM, Vincenzo D'Amore wrote: > I have a bunch of java clients connecting to a solrcloud cluster 4.8.1 with > Solrj 4.8.0. > The question is, I have to switch clients and cluster to the new version at > same time? > Could I upgrade the cluster and in the following months upgrade cli

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Alessandro Benedetti
Instead of your immense schema, can you give us the details of the highlighting you are trying to use? And how are you trying to use it? Which client? Direct API calls? Let us know! Cheers 2015-06-30 15:10 GMT+01:00 Mark Ehle : > Thanks to all for the help - it's now storing text and I can sea

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread dinesh naik
Hi Alessandro, I am able to check the field wise analyzed results. I was interested in getting the complete document. As Erick mentioned - Reconstructing the doc from the postings lists is actually quite tedious. The Luke program (not request handler) has a function that does this, it's not fast t

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Mark Ehle
Alessandro - Someone asked to see the schema, I posted it. Should I have just attached it? Does this mailing list support that? I am by no means a SOLR expert. I am a PHP coder who wrote a (very-much-loved by our library staff and patrons) newspaper indexing tool that I am trying to update. I onl

Re: optimize status

2015-06-30 Thread Shawn Heisey
On 6/29/2015 2:48 PM, Reitzel, Charles wrote: > I take your point about shards and segments being different things. I > understand that the hash ranges per segment are not kept in ZK. I guess I > wish they were. > > In this regard, I liked Mongodb, uses a 2-level sharding scheme. Each shard

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Alessandro Benedetti
No worries, it is not a big deal you shared the schema.xml, I said that only because it turned the mail a little hard to read, anyway, in my opinion the query is correct, so the problem should reside elsewhere. Can you share the solrconfig.xml piece for your select request handler ? Probably it is

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Alessandro Benedetti
But what do you mean with the complete document ? Is it not available anymore ? So you have lost your original document and you want to try to reconstruct from the index ? 2015-06-30 16:05 GMT+01:00 dinesh naik : > Hi Alessandro, > I am able to check the field wise analyzed results. > > I was int

Re: DIH deletes cause opening of searchers

2015-06-30 Thread Shawn Heisey
On 6/25/2015 2:20 AM, Mikhail Khludnev wrote: On Tue, Jun 23, 2015 at 9:23 AM, Rudolf Grigeľ wrote: How can I prevent opening new searcher after every delete statement ? comment tag in solrconfig.xml (it always help) The presence or absence of the updateLog should not affect whether new se

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Mark Ehle
Do you mean this?: explicit 10 On Tue, Jun 30, 2015 at 12:11 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > No worries, it is not a big deal you shared the schema.xml, I said that > only because it turned the mail a little hard to read, anyway, in

Re: How to do a Data sharding for data in a database table

2015-06-30 Thread Erick Erickson
I'd set filterCache and queryResultCache to zero (size and autowarm count). Leave documentCache alone IMO, as it's used to store documents on disk as they pass through various query components and doesn't autowarm anyway. I'd think taking it out would skew your results because of multiple decompress

Re: DIH deletes cause opening of searchers

2015-06-30 Thread Erick Erickson
>From the log fragment it's at least worth further investigation. You've had 4 searchers open in less than 1/2 second. That's horribly fast, but you already know that... Let's see the DIH configs, perhaps there's something innocent-seeming there that's causing this. Or, there's a bug somewhere.

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Erick Erickson
Something's not right here. Your query does not specify any field, you have q="JOHN GRAP". Which should parse as q=default_search_field:"JOHN GRAP". BUT, you've commented the default field out of the select request handler. I don't _think_ that there's a default in the code, but I've been surprise

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Mark Ehle
Here's what I get: { "responseHeader":{ "status":0, "QTime":27, "params":{ "echoParams":"all", "fl":"year", "df":"_text_", "indent":"true", "q":"\"JOHN GRAP\"", "hl.simple.pre":"", "debug":"true", "hl.simple.post":"", "hl.fl":"tex

RE: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Dinesh Naik
Hi Alessandro, Let's say I have 20M documents with 50 fields in each. I have applied text analysis like compression, ngram, synonym expansion on these fields. Checking field-level analysis individually can easily be done via admin/analysis. But I need to do 50 times analysis check for these 5

Re: How to do a Data sharding for data in a database table

2015-06-30 Thread wwang525
Test_results_round_2.doc -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-do-a-Data-sharding-for-data-in-a-database-table-tp4212765p4215016.html Sent from the Solr - User mailing list archive

Re: optimize status

2015-06-30 Thread Upayavira
On Tue, Jun 30, 2015, at 04:42 PM, Shawn Heisey wrote: > On 6/29/2015 2:48 PM, Reitzel, Charles wrote: > > I take your point about shards and segments being different things. I > > understand that the hash ranges per segment are not kept in ZK. I guess I > > wish they were. > > > > In this r

Re: How to do a Data sharding for data in a database table

2015-06-30 Thread wwang525
Hi All, I did many tests with very consistent test results. Each query was executed after re-indexing, and only one request was sent to query the index. I disabled filterCache and queryResultCache for this test based on Erick's recommendation. The test document was posted to this email list earli

Re: Reading indexed data from solr 5.1.0 using admin/luke?

2015-06-30 Thread Upayavira
You can call the same API as the admin UI does. Pass it strings, it returns tokens in JSON/XML/whatever. Upayavira On Tue, Jun 30, 2015, at 06:55 PM, Dinesh Naik wrote: > Hi Alessandro, > > Lets say I have 20M documents with 50 fields in each. > > I have applied text analysis like compression,
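
A hedged sketch of scripting that call, assuming the core exposes Solr's field analysis handler at /analysis/field (the endpoint the admin analysis page relies on). The base URL, core name, and function name here are hypothetical; the snippet only builds the request URL and sends nothing.

```python
from urllib.parse import urlencode

def field_analysis_url(base_url, core, field_name, field_value):
    """Build a field-analysis request URL for one field and one input string.

    The parameter names follow Solr's field analysis handler; the response
    format is requested as JSON via wt=json.
    """
    params = urlencode({
        "analysis.fieldname": field_name,
        "analysis.fieldvalue": field_value,
        "wt": "json",
    })
    return "%s/%s/analysis/field?%s" % (base_url.rstrip("/"), core, params)
```

Looping this over the field names of one document would give one request per field, which may be enough to automate the 50-field check raised in the quoted message.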

Using the DataImportHandler to get filepath from MySQL DataBase BackSlash Character Problem

2015-06-30 Thread Paden
Hello, I'm having a slight "Catch-22" scenario going on with my Solr indexing process. I'm using the DataImportHandler to pull a filepath from a database. The problem is that Windows filepaths have the backslash character inside their paths. \\some\filepath So when I insert this data into MySQL

RE: Using the DataImportHandler to get filepath from MySQL DataBase BackSlash Character Problem

2015-06-30 Thread Keswani, Nitin - BLS CTR
Hi Paden, I believe you could use a PatternReplaceFilterFactory ( http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceFilterFactory.html ) configured in your fieldtype that could replace '' with '\\' at index time. Thanks. Regards, Nitin

Re: Correcting text at index time

2015-06-30 Thread hossmaa
Hi all Thanks for the replies. So there's no getting away from doing it on my own then... @Jack: I need to replace a whole list of shortened words... It would make a crazy regex (which I incidentally wouldn't even know how to formulate). Cheers A. -- View this message in context: http://luc

Re: Upgrade to 5.2 from 4.6, no storing of text

2015-06-30 Thread Erick Erickson
It _looks_ like you're searching against _text_ and trying to highlight on text. On a very brief grep of all the Java code I don't see _text_ defined anywhere (of course I could be missing something here). So none of this makes sense. > you have no "df" field defined, yet you're getting a default

Re: How to do a Data sharding for data in a database table

2015-06-30 Thread Erick Erickson
bq: The index size is only 1 M records. A 10 times of the record size (> 10M) will likely bring the total response time to > 1 second This is an extrapolation you simply cannot make. Plus you cannot really tell anything from just a few queries about system performance. In fact you must disregard t

Re: Correcting text at index time

2015-06-30 Thread Jack Krupansky
You would have to have a separate instance of the update processor, each with one of the words. Or, you could code a JavaScript script with the stateless script update processor that has the long list of words and replacements as two arrays or an array of objects, and then iterate through the inpu
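
The "array of objects, then iterate" idea can be sketched as follows. This is only an illustration of the replacement loop, written in Python rather than the JavaScript a script update processor would run; the word list and function name are made up for the example.

```python
import re

# Hypothetical sample entries: each pairs a shortened word with its full form.
REPLACEMENTS = [
    {"short": "dept", "full": "department"},
    {"short": "mgmt", "full": "management"},
]

def expand_shortened(text, replacements=REPLACEMENTS):
    """Replace each shortened word in text with its full form.

    Matches whole words only, so 'dept' inside 'deptford' is left alone.
    """
    for entry in replacements:
        pattern = r"\b" + re.escape(entry["short"]) + r"\b"
        # A function replacement avoids backslash interpretation in 'full'.
        text = re.sub(pattern, lambda m, full=entry["full"]: full, text)
    return text
```

Keeping the list as data means new words are added without touching any regular expression, which addresses the concern about one very large regex.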

AUTO: Nicholas M. Wertzberger is out of the office (returning 07/06/2015)

2015-06-30 Thread Nicholas M. Wertzberger
I am out of the office until 07/06/2015. I'll be out of the office through July 4th. Please contact Jason Brown for any pressing JAS Team related items. Note: This is an automated response to your message "Re: Correcting text at index time" sent on 6/30/2015 8:55:16 PM. This is the only noti

Re: Restricting fields returned by Suggester result.

2015-06-30 Thread ssharma7...@gmail.com
Alessandro Benedetti, Thanks for the update. Actually, what I meant by - "Is it possible to restrict the result returned by Suggester to "selected" fields only?" was something like the "fl" option available for querying (/select) in Solr, wherein there could be some fields as defined in "schema.xml", but we