Re: optional vs. prohibited aka standard vs. dismax handler

2010-06-29 Thread Jan Høydahl / Cominvent
ocument_title' pf='content document_title' v=$qq}&qq=decade -domestic to: q=decade -domestic&defType=dismax&qf=content document_title&pf=content document_title&fq=tag_ids:(23)&fq=document_code_prefix:(A/RES/58) Also, you may want to apply patch SOLR-1553

Re: How I can use score value for my function

2010-06-29 Thread Geert-Jan Brits
It's possible using function queries. See this link. http://wiki.apache.org/solr/FunctionQuery#query 2010/6/29 MitchK > > Ramzesua, > > this is not possible, because Solr does not know what is the resulting > score > at query-time (as far as I know). > The score will be computed, when every hit

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Jan Høydahl / Cominvent
Hi, You need to use HTTP POST in order to send those parameters I believe. Try with curl: curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" --data-binary "uid:6-HOST*" -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.co
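The archive appears to have stripped the XML markup from the curl payload above. As a minimal sketch, assuming the body was a standard Solr delete-by-query update message, the payload for the wildcard query could be built like this (the helper name delete_by_query_xml is hypothetical):

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query: str) -> str:
    """Build a Solr XML update message that deletes every document matching `query`."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(delete, encoding="unicode")


# The query string is the one quoted in the message above.
payload = delete_by_query_xml("uid:6-HOST*")
print(payload)
```

The resulting string is what would be passed to curl via --data-binary, POSTed to the /update URL with Content-Type text/xml.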

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Jan Høydahl / Cominvent
Hmm, nice one - I was not aware of that trick. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 30. juni 2010, at 18.41, bbarani wrote: > > Hi, > > I was able to successfully delete multiple documents using t

Re: Dilemma - Very Frequent Synonym updates for Huge Index

2010-07-01 Thread Jan Høydahl / Cominvent
reindexing now and then could be that if your OpenNLP extraction dictionaries have changed, it will be reflected too. BTW: Could you share details of your OpenNLP integration with us? I'm about to do it on another project.. -- Jan Høydahl, search solution architect Cominve

Re: Very basic questions: Faceted front-end?

2010-07-01 Thread Jan Høydahl / Cominvent
Have you had a look at www.twigkit.com ? Could be worth the bucks... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 1. juli 2010, at 00.59, Peter Spam wrote: > Wow, thanks Lance - it's really fast now! > >

Re: Multilingual - Search against the appropriate field

2010-07-01 Thread Jan Høydahl / Cominvent
y try to pull out title_XY where XY is pulled from documents "language" metadata. I think which you choose depends on taste, each has its + and - -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 1. juli 2010, at 12.26

SolrJ: BinaryRequestWriter with StreamingUpdateSolrServer

2010-07-01 Thread Jan Høydahl / Cominvent
Hi, I had the impression that the StreamingUpdateSolrServer in SolrJ would automatically use the /update/javabin UpdateRequestHandler. Is this not true? Do we need to call server.setRequestWriter(new BinaryRequestWriter()) for it to transmit content with the binary protocol? -- Jan Høydahl

Re: DisMax, multi fields, and phrase fields

2010-07-01 Thread Jan Høydahl / Cominvent
Hi, Check out the new eDisMax handler (src) and the new pf2 parameter. Also available as patch SOLR-1553. Another option to avoid match for doc2 is to add application specific logic in your frontend which detects car brands and years and rewrites the query into a phrase or a filter. -- Jan

Re: Dilemma - Very Frequent Synonym updates for Huge Index

2010-07-01 Thread Jan Høydahl / Cominvent
where your indexers only do indexing (except at disaster where they can do search as well) - in that case you can happily reindex without worrying about affecting user experience. What exactly is the issue you see with the query-side-only synonym expansion when using KeywordTokenizer? -- Jan

Re: Use free text to search against boolean fields?

2010-07-02 Thread Jan Høydahl / Cominvent
s, e.g. through a set of regex ((not|non|no) (smoker|smoking|smoke))... You could always do a mix also - to keep a free-text field as well, and any words that your parser does not understand can be passed through to the free-text as a "should" term with a boost. -- Jan Høydahl,

Re: SolrJ-1.4.0 client needs slf4j-jdk14-1.5.5 library on J2SE 1.5 Update 21

2010-07-03 Thread Jan Høydahl / Cominvent
Hi, SolrJ uses slf4j logging. As you can read on the wiki http://wiki.apache.org/solr/Solrj#Solr_1.4 you need to provide the slf4j-jdk14 binding (or any other log framework you wish to bind to) yourself and add the jar to your classpath. -- Jan Høydahl, search solution architect Cominvent AS

Re: Use free text to search against boolean fields?

2010-07-03 Thread Jan Høydahl / Cominvent
well educated on how to query your system and behave, then what you suggest makes more sense. It's quick to test and see how it works. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 3. juli 2010, at 01.11, Saïd Radhouani

Re: Unicode processing - Issue with CharStreamAwareWhitespaceTokenizerFactory

2010-07-06 Thread Jan Høydahl / Cominvent
The Char-filters MUST come before the Tokenizer, due to their nature of processing the character-stream and not the tokens. If you need to apply the accent normalization later in the analysis chain, either use ISOLatin1AccentFilterFactory or help with the implementation of SOLR-1978. -- Jan

MultiValue dynamicField and copyField

2010-07-14 Thread Jan Simon Winkelmann
Hi everyone, i was wondering if the following was possible somehow: As in: using copyField to copy a multiValued field into another multiValued field. Cheers, Jan

Re: MultiValue dynamicField and copyField

2010-07-14 Thread Jan Simon Winkelmann
I figured out where the problem was. The destination wildcard was actually matching the wrong field. I changed the field names around a bit and now everything works fine. Thanks! > -Original Message- > From: kenf_nc [mailto:ken.fos...@realestate.com] > Sent: Wednesday, 14 July 2

Re: Re:Re: How to speed up solr search speed

2010-07-16 Thread Geert-Jan Brits
B ram instead of the standard which is much lower (I'm not sure what exactly but it could well be 64MB for non -server, aligning with what you're seeing) Geert-Jan 2010/7/16 marship > Hi Tom Burton-West. > > Sorry looks my email ISP filtered out your replies. I checked web versi

Re: Re: How to speed up solr search speed

2010-07-17 Thread Geert-Jan Brits
ould be pretty low and would be used independently of other queries, this would be an excellent candidate for the FQ-param. http://wiki.apache.org/solr/CommonQueryParameters#fq This was a longer reply than I wanted it to be. Really think about yo

Re: indexing best practices

2010-07-18 Thread Geert-Jan Brits
on ssd's is of course going to boost performance a lot as well (on large indexes, bc small ones may fit in disk cache entirely) <http://wiki.apache.org/lucene-java/ImproveIndexingSpeed> Hope that helps a bit, Geert-Jan 2010/7/18 kenf_nc > > No one has done performance analysis?

Re: Tree Faceting in Solr 1.4

2010-07-23 Thread Geert-Jan Brits
sort of hierarchy. The first part of your question seemed to be more about hierarchical faceting as per SOLR-792, but I couldn't quite distill a question from that part. Also, just a suggestion, consider using IDs instead of names for filtering; you will get burned sooner or later othe

Re: help with a schema design problem

2010-07-23 Thread Geert-Jan Brits
With the use case you specified it should work to just index each "Row" as you described in your initial post to be a separate document. This way p_value and p_type both become single-valued and you get a correct combination of p_value and p_type. However, this may not go so well with other use-cases yo

Re: help with a schema design problem

2010-07-23 Thread Geert-Jan Brits
design my schema ? I have some solutions > but none seems to be a good solution. One way would be to define a single > field in the schema as p_value_type = "client pramod" i.e. combine the > value > from both the field and store it in a single field. > > > On Sat, Ju

Re: filter query on timestamp slowing query???

2010-07-23 Thread Geert-Jan Brits
omment) but if it is, it would perhaps be more performant. Big IF, I know. Geert-Jan 2010/7/23 Chris Hostetter > : On top of using trie dates, you might consider separating the timestamp > : portion and the type portion of the fq into separate fq parameters -- > : that will allow them

Re: help with a schema design problem

2010-07-23 Thread Geert-Jan Brits
alue:"Pramod" AND p_type:"Supplier" > > > > > > > > it would give me result as document 1. Which is incorrect, since in > > > > document > > > > 1 Pramod is a Client and not a Supplier. > > Would it? I would expect it to

Re: Tree Faceting in Solr 1.4

2010-07-24 Thread Geert-Jan Brits
Perhaps completely unnecessary when you have a controlled domain, but I meant to use ids for places instead of names, because names will quickly become ambiguous, e.g.: there are numerous different places over the world called Washington, etc. 2010/7/24 SR > Hi Geert-Jan, > > What did

Re: Tree Faceting in Solr 1.4

2010-07-24 Thread Geert-Jan Brits
I believe we use an in-process WeakHashMap to store the id-name relationship. It's not that we're talking billions of values here. For anything more mem-intensive we use no-sql (tokyo tyrant through memcached protocol at the moment) 2010/7/24 Jonathan Rochkind > > Perhaps completely unnecessary

Re: Which is a good XPath generator?

2010-07-25 Thread Geert-Jan Brits
r products, etc. HTH, Geert-Jan 2010/7/25 Li Li > it's not a related topic in solr. maybe you should read some papers > about wrapper generation or automatic web data extraction. If you > want to generate xpath, you could possibly read liubing's papers such > as "

Re: 2 type of docs in same schema?

2010-07-26 Thread Geert-Jan Brits
regular docs becomes: q=title:some+title&fq=type:type_normal and searching for searchqueries becomes (I think this is what you want): q=searchquery:bmw+car&fq=type:type_search Geert-Jan 2010/7/26 > > > > I need you expertise on this one... > > We would like to index
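As a rough sketch of the two request shapes above: the main query goes in q and the document type is restricted with fq. The base URL and the helper name select_url are assumptions for illustration; only the q and fq values come from the message.

```python
from urllib.parse import urlencode


def select_url(base: str, q: str, doc_type: str) -> str:
    """Build a select URL that restricts results to one document type via fq."""
    params = [("q", q), ("fq", f"type:{doc_type}")]
    # urlencode percent-encodes ':' and turns spaces into '+'
    return f"{base}/select?{urlencode(params)}"


BASE = "http://localhost:8983/solr"  # hypothetical Solr location

normal_url = select_url(BASE, "title:some title", "type_normal")
search_url = select_url(BASE, "searchquery:bmw car", "type_search")
print(normal_url)
print(search_url)
```

Keeping the type restriction in fq rather than in q means it does not influence scoring and can be cached independently of the main query.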

Re: 2 type of docs in same schema?

2010-07-26 Thread Geert-Jan Brits
" as required so you don't forget to include in in your indexing-program) 2010/7/26 > > Thanks for you answer! That's great. > > Now to index search quieries data is there something special to do? or it > stay as usual? > > > > > > >

Re: advice on creating a solr index when data source is from many unrelated db tables

2010-07-29 Thread Geert-Jan Brits
f you did mean this, please show an example of what you want to achieve. HTH, Geert-Jan 2010/7/29 S Ahmed > I understand (and it's straightforward) when you want to create an index for > something simple like Products. > > But how do you go about creating a Solr index when you have d

Re: Quering the database

2010-08-02 Thread Geert-Jan Brits
://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Geert-Jan 2010/8/2 Hando420 > > Thank you for your reply. Still the problem persists even when I tested with > a > simple example by defining a column of type text as varchar in database and > in schema.xml used the default id which i

Re: Quering the database

2010-08-03 Thread Geert-Jan Brits
n overview of what's possible. cheers, Geert-Jan <http://lucene.apache.org/solr/tutorial.html> 2010/8/3 Hando420 > > Thanks a lot to all, now it's clear the problem was in the schema. One more > thing i would like to know is if the user queries for something does i

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
Field Collapsing (currently as patch) is exactly what you're looking for imo. http://wiki.apache.org/solr/FieldCollapsing Geert-Jan 2010/8/4 Ken Krugler > Hi all, > > I've got a situation where the key result from

Re: Best solution to avoiding multiple query requests

2010-08-04 Thread Geert-Jan Brits
-issues to make sure this isn't already available now, but just not updated on the wiki) Also I found a blogpost (from the patch creator afaik) with in the comments someone with the same issue + some pointers. http://blog.jteam.nl/2009/10/20/result-grouping-field-collapsing-with-solr/ hope that

Re: how to take a value from the query result

2010-08-05 Thread Geert-Jan Brits
you should parse the xml and extract the value. Lots of libraries undoubtedly exist for PHP to help you with that (I don't know PHP) Moreover, if all you want from the result is AUC_CAT you should consider using the fl param like: http://172.16.17.126:8983/search/select/?q=AUC_ID:607136&fl=AUC_CA

Re: No "group by"? looking for an alternative.

2010-08-05 Thread Geert-Jan Brits
raints specified. This would likely be something outside of solr (a simple sql-select on a single product) hope that helps, Geert-Jan 2010/8/5 Mickael Magniez > > I've got only one document per shoes, whatever its size or color. > > My first try was to create one document per

Re: XML Format

2010-08-06 Thread Geert-Jan Brits
at first glance I see no difference between the 2 documents. Perhaps you can illustrate which fields are not in the resultset that you want to be there? also use the 'fl'-param to describe which fields should be outputted in your results. Of course, you have to first make sure the fields you want

Re: stemming the index

2010-08-06 Thread Jan Høydahl / Cominvent
Check out slides 36-38 in this presentation for some hint on a possible solution: http://www.slideshare.net/janhoy/migrating-fast-to-solr-jan-hydahl-cominvent-as-euro-con -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 7

Re: Deleting old index data from solr. But HDD spaces doesn`t free.

2010-08-06 Thread Jan Høydahl / Cominvent
What you are missing is a final server.optimize(); Deleting a document will only mark it as deleted in the index until an optimize. If disk space is a real problem in your case because you e.g. update all docs in the index frequently, you can trigger an optimize(), say nightly. -- Jan Høydahl

Re: how to create a custom type in Solr

2010-08-06 Thread Jan Høydahl / Cominvent
Your use case can be solved by splitting the range into two ints: Document: {title: My document, from: 8000, to: 9000} Query: q=title:"My" AND (from:[* TO 8500] AND to:[8500 TO *]) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Tra
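The two-field trick works because a point lies inside a stored range exactly when from <= point and to >= point. A small sketch of that reasoning, using the field names from the message (the helper names are hypothetical):

```python
def containing_range_clause(point: int, lo_field: str = "from", hi_field: str = "to") -> str:
    """Query clause matching documents whose [from, to] range contains `point`."""
    return f"{lo_field}:[* TO {point}] AND {hi_field}:[{point} TO *]"


def contains(doc: dict, point: int) -> bool:
    """Plain-Python equivalent of the clause above, for checking the logic."""
    return doc["from"] <= point <= doc["to"]


doc = {"title": "My document", "from": 8000, "to": 9000}
clause = containing_range_clause(8500)
print(clause)
print(contains(doc, 8500), contains(doc, 9500))
```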

Re: SOLR QUERY

2010-08-06 Thread Jan Høydahl / Cominvent
Another way is to use DisMax parser, and give it a &qf=field1 field2 field3... parameter, and it will automatically search in all fields specified. It is more powerful than having one default field, and saves that disk space. But you sacrifice some extra resources during querying. --

Re: How do i update some document when i use sharding indexs?

2010-08-09 Thread Geert-Jan Brits
from the chars (256 * first char + 16 * 2nd char + 3rd char), and take that nr modulo 20. That should give you a nr in [0,20) which is the shard-index. use the same algorithm to determine which shard contains the document that you want to change. Geert-Jan 2010/8/9 lu.rongbin > >My i
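The start of the message is cut off, so which characters are meant is not visible here. A minimal sketch, assuming they are the first three hex digits of an MD5 hash of the document id (the function name, the choice of MD5, and the sample id are illustrative assumptions):

```python
import hashlib

NUM_SHARDS = 20  # shard count taken from the message


def shard_index(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document id to a shard number in [0, num_shards)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    # Assumption: "chars" means the first three hex digits of the hash.
    first, second, third = (int(ch, 16) for ch in digest[:3])
    number = 256 * first + 16 * second + third
    return number % num_shards


idx = shard_index("doc-42")  # hypothetical document id
print(idx)
```

Because the function is deterministic, the same call at update time points to the shard that already holds the document.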

Re: How do i update some document when i use sharding indexs?

2010-08-09 Thread Geert-Jan Brits
Just to be completely clear: the program that splits your index in 20 shards should employ this algo as well. 2010/8/9 Geert-Jan Brits > I'm not sure if Solr has some built-in support for sharding-functions, but > you should generally use some hashing-algorithm to split the indi

Re: how to support "implicit trailing wildcards"

2010-08-10 Thread Geert-Jan Brits
you could satisfy this by making 2 fields: 1. exactmatch 2. wildcardmatch use copyfield in your schema to copy 1 --> 2 . q=exactmatch:mount+wildcardmatch:mount*&q.op=OR this would score exact matches above (solely) wildcard matches Geert-Jan 2010/8/10 yandong yao > Hi Bastian, >

Re: Indexing fieldvalues with dashes and spaces

2010-08-10 Thread Jan Høydahl / Cominvent
=String. I often create a dynamic field for such, e.g. and then do a copyField. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 9. aug. 2010, at 09.54, PeterKerk wrote: > > Hi Erick, > > Ok. its more clear n

Re: how to support "implicit trailing wildcards"

2010-08-10 Thread Jan Høydahl / Cominvent
t trickier, but something along these lines: q=(mount OR mount*) AND (everest OR everest*) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 10. aug. 2010, at 09.38, Geert-Jan Brits wrote: > you could satisfy this by ma
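A frontend could produce that query shape from raw user input along these lines. This is only a sketch: it splits on whitespace and does no escaping of query syntax characters, and the function name is hypothetical.

```python
def with_trailing_wildcards(user_input: str) -> str:
    """Rewrite each term as (term OR term*) and require all terms to match."""
    terms = user_input.split()
    return " AND ".join(f"({t} OR {t}*)" for t in terms)


query = with_trailing_wildcards("mount everest")
print(query)
```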

Re: solr query result not read the latest xml file

2010-08-10 Thread Jan Høydahl / Cominvent
-Ddata=files > -Durl=http://localhost:8983/solr/update > -Dcommit=yes > Thus for your index, try: java -Durl=http://localhost:80/search/update -jar post.jar myfile.xml -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.co

Re: delete Problem..

2010-08-10 Thread Jan Høydahl / Cominvent
f this does not break other functionality you need on that field. Then it would support searching part of the field. You should make this as a phrase search to avoid ambiguities. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com O

Re: solr query result not read the latest xml file

2010-08-11 Thread Jan Høydahl / Cominvent
egory is deleted. How would you know by simply looking at the file system? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 11. aug. 2010, at 04.10, e8en wrote: > > thanks for you response Jan, > I just knew that the

Re: timestamp field

2010-08-11 Thread Jan Høydahl / Cominvent
Hi, Which time zone are you located in? Do you have DST? Solr uses UTC internally for dates, which means that "NOW" will be the time in London right now :) Does that appear to be right 4 u? Also see this thread: http://search-lucene.com/m/hqBed2jhu2e2/ -- Jan Høydahl, search solution

Re: Delta-import with solrj client

2010-08-11 Thread Jan Høydahl / Cominvent
Hi, Make sure you use a proper "ID" field, which does *not* change even if the content in the database changes. In this way, when your delta-import fetches changed rows to index, they will update the existing rows in your index. -- Jan Høydahl, search solution architect Co

Re: Filter Performance in Solr 1.3

2010-08-11 Thread Geert-Jan Brits
you use them repeatedly. If on the other hand you're seeing slower response times with a fq-filter applied all the time, than the same queries without the fq-filter, there must be something strange going on since this really shouldn't happen in normal situations. Geert-Jan 2010/8/11

Re: how to support "implicit trailing wildcards"

2010-08-11 Thread Jan Høydahl / Cominvent
I guess q=mount OR (mount*)^0.01 would work equally well, i.e. diminishing the effect of wildcard matches. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 11. aug. 2010, at 17.53, yandong yao wrote: > Hi Jan, >

Re: Analysing SOLR logfiles

2010-08-11 Thread Jan Høydahl / Cominvent
Have a look at www.splunk.com -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 11. aug. 2010, at 19.34, Jay Flattery wrote: > Hi there, > > > Just wondering what tools people use to analyse SOLR log files

Re: bug or feature???

2010-08-11 Thread Jan Høydahl / Cominvent
Your syntax looks a bit funny. Which version of Solr are you using? Pure negative queries are not supported, try q=(*:* -title:janitor) instead. Also, for debugging what's going on, please add &debugQuery=true and share the parsed query for both cases with us. -- Jan Høydahl, search
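The suggested rewrite can be applied mechanically: if every clause is negated, prefix the match-all query *:* so there is a set to subtract from. A naive sketch, assuming clauses are separated by whitespace and contain no quoted phrases (the function name is hypothetical):

```python
def fix_pure_negative(q: str) -> str:
    """Wrap a query made only of negated clauses as (*:* -a -b ...)."""
    clauses = q.split()
    if clauses and all(c.startswith("-") for c in clauses):
        return "(*:* " + " ".join(clauses) + ")"
    return q  # at least one positive clause: leave unchanged


fixed = fix_pure_negative("-title:janitor")
print(fixed)
```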

Re: Indexing and ExtractingRequestHandler

2010-08-11 Thread Jan Høydahl / Cominvent
Hi, You can try Tika command line to parse your Excel file, then you will see the exact textual output from it, which will be indexed into Solr, and thus inspect whether something is missing. Are you sure you use a version of Luke which supports your version of Lucene? -- Jan Høydahl, search

Re: Wiki documentation Packaged as single HTML or PDF

2010-08-16 Thread Jan Høydahl / Cominvent
it's still the best out there. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 13. aug. 2010, at 13.49, Samuel Lopes Grigolato wrote: > Hello, > > I need to ship the Solr wiki documentation, preferably in PDF

Re: Function query to boost scores by a constant if all terms are present

2010-08-18 Thread Jan Høydahl / Cominvent
, which it does whenever all three terms match. PS: You can achieve the same in a Lucene query, using q=a fox _val_:"map(query($qq),0,0,0,100.0)" -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 17. aug. 2010, at
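To see why the map(...) expression yields a constant boost, here is a plain-Python model of the five-argument form as I understand it from the FunctionQuery wiki: values inside [min, max] become target, everything else becomes the default. This models the semantics only and is not Solr code; treat the exact behaviour as an assumption to verify against your Solr version.

```python
def solr_map(x: float, lo: float, hi: float, target: float, default=None) -> float:
    """Model of map(x,min,max,target[,default]): remap values inside [lo, hi]."""
    if lo <= x <= hi:
        return target
    return x if default is None else default


# map(query($qq),0,0,0,100.0): a score of 0 (no match for $qq) stays 0,
# any other score is replaced by the constant 100.0.
no_match = solr_map(0.0, 0, 0, 0, 100.0)
match = solr_map(0.37, 0, 0, 0, 100.0)  # 0.37 is an arbitrary example score
print(no_match, match)
```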

Re: Solr data type for date faceting

2010-08-18 Thread Jan Høydahl / Cominvent
to extract all string values offline from the index and somehow rebuild the index offline? Andrzej, is that possible? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 18. aug. 2010, at 11.12, Karthik K wrote: > Thanks Mark.

Re: Missing tokens

2010-08-18 Thread Jan Høydahl / Cominvent
Hi, Can you share with us how your schema looks for this field? What FieldType? What tokenizer and analyser? How do you parse the PDF document? Before submitting to Solr? With what tool? How do you do the query? Do you get the same results when doing the query from a browser, not SolrJ? -- Jan

Re: improving search response time

2010-08-18 Thread Jan Høydahl / Cominvent
It includes timings for each component. High latency could be caused by a number of different factors, and it is important to first isolate the bottleneck. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 18. aug. 2010, at 1

Re: Solr's Index Live Updates

2010-08-18 Thread Jan Høydahl / Cominvent
Hi, I'm afraid you'll have to post the full document again, then do a commit. But it WILL be lightning fast, as it is only the updated document which is indexed, all the other existing documents will not be re-indexed. -- Jan Høydahl, search solution architect Cominvent AS - www.com

Re: Missing tokens

2010-08-18 Thread Jan Høydahl / Cominvent
wiki.apache.org/solr/ExtractingRequestHandler ?? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 18. aug. 2010, at 16.25, paul.mo...@dds.net wrote: > Here's my field description. I mentioned 'contents' fi

Re: Solr data type for date faceting

2010-08-19 Thread Jan Høydahl / Cominvent
Yes, I forgot that strings support alphanumeric ranges. However, they will potentially be very memory intensive since you don't get the trie-optimization and since strings take up more space than ints. Only way is to try it out. -- Jan Høydahl, search solution architect Cominvent AS

Re: Missing tokens

2010-08-19 Thread Jan Høydahl / Cominvent
Hi, Your bug is right there in the WhitespaceTokenizer, where you see that it does NOT strip away the "." as whitespace. Try with StandardTokenizerFactory instead, as it removes punctuation. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training
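The effect can be approximated outside Solr. The sketch below is not the Lucene implementation: str.split stands in for whitespace tokenization and a \w+ regex stands in, very roughly, for a tokenizer that drops punctuation. The sample sentence is made up.

```python
import re


def whitespace_tokens(text: str) -> list:
    """Split on whitespace only; punctuation stays attached to the token."""
    return text.split()


def punctuation_stripping_tokens(text: str) -> list:
    """Keep only runs of word characters, dropping punctuation."""
    return re.findall(r"\w+", text)


sample = "see chapter 4."
ws = whitespace_tokens(sample)
stripped = punctuation_stripping_tokens(sample)
print(ws)
print(stripped)
```

A query for the bare last word would miss the first token list, because the indexed term still carries the trailing period.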

Re: improving search response time

2010-08-19 Thread Jan Høydahl / Cominvent
search"~50^9.5)~0.01 (sum(sdouble(yearScore)))^1.1 (sum(sdouble(readerScore)))^2.0 Do you need "pf" at all? Can you smash together similarly weighted fields with copyField into a new one, reducing the number of fields to lookup from 7 to perhaps 5? -- Jan Høydahl, search

Re: Basic conceptual questions about solr

2010-08-19 Thread Jan Høydahl / Cominvent
r, exposing some API on some port. And then when user searches your search portal, e.g. search.mycompany.com/?q=foo, the GUI uses some AJAX to reach out to the local search service and filter that in to the results... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.co

Re: How to get most indexed keyword from SOLR

2010-08-20 Thread Jan Høydahl / Cominvent
Check out the luke request handler: http://localhost:8983/solr/admin/luke?fl=my_ad_field&numTerms=100 - you'll find topTerms for the fields specified -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 20. aug. 2010,

Re: How to Debug Sol-Code in Eclipse ?!

2010-08-22 Thread Geert-Jan Brits
rServer and step-through/debug the source-code from there. It works just like it is your own source-code. HTH, Geert-Jan 2010/8/22 stockii > > thx for you reply. > > i dont want to test my own classes in unittest. i try to understand how > solr > works , because i write a little t

Re: Scoring of documents, boost partial and exact hits in one field

2010-08-22 Thread Jan Høydahl / Cominvent
Hi, Try a wildcard term with lower score: q=title:work AND title:work*&debugQuery=true You will now see from the debug printout that you get an extra boost for workload. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com O

Re: Solr search speed very low

2010-08-25 Thread Geert-Jan Brits
have a look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters to see how that works. 2010/8/25 Marco Martinez > You should use the tokenizer solr.WhitespaceTokenizerFactory in your field > type to get your terms indexed, once you have indexed the data, you dont > need to use the * i

Re: how to deal with virtual collection in solr?

2010-08-25 Thread Jan Høydahl / Cominvent
> > 3. I got an error when I index pdf files which are version 1.5 or 1.6. Would > you please tell me if there is a patch to fix it? How did you try to index these PDFs? What version of Solr are you using? Exactly what error message did you get? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com

Re: solr working...

2010-08-26 Thread Geert-Jan Brits
Check out Drew Farris' explanation for remote debugging Solr with Eclipse posted a couple of days ago: http://lucene.472066.n3.nabble.com/How-to-Debug-Sol-Code-in-Eclipse-td1262050.html Geert-Jan 2010/8

Re: how to deal with virtual collection in solr?

2010-08-27 Thread Jan Høydahl / Cominvent
n add that field to your schema, and then inject it as metadata on the ExtractingRequestHandler call: curl "http://localhost:8983/solr/update/extract?literal.collection=aaprivate&literal.id=doc1&commit=true"; -F "fi...@myfile.pdf" -- Jan Høydahl, search solution arch

Re: Creating new Solr cores using relative paths

2010-08-27 Thread Jan Høydahl / Cominvent
clude syntax to embed XML snippets into solrconfig.xml. Since it does not support variable substitution, without a stable base path it's very hard to use. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 27. aug. 2010

Re: how to deal with virtual collection in solr?

2010-08-31 Thread Jan Høydahl / Cominvent
And as Lance pointed out, make sure your XML files conform to the Solr XML format (http://wiki.apache.org/solr/UpdateXmlMessages). -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 27. aug. 2010, at 15.04, Ma, Xiaohui (NIH/NLM

Re: questions about synonyms

2010-08-31 Thread Geert-Jan Brits
concerning: > . I got a very big text file of synonyms. How I can use it? Do I need to index this text file first? Have you seen http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter ? Cheers, Geert-Jan

Re: High - Low field value?

2010-09-01 Thread Geert-Jan Brits
StatsComponent is exactly what you're looking for. http://wiki.apache.org/solr/StatsComponent Cheers, Geert-Jan 2010/9/1 kenf_nc > > I want to do range facets on a couple fields, a Price field in particular. > But Price is rel

Re: how to deal with virtual collection in solr?

2010-09-03 Thread Jan Høydahl / Cominvent
You did not supply your actual query. Try to add a &q=foobar parameter, also you don't need a & before shards since you have the ?. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 1. sep. 2010, at 20.14, Ma, Xia

Re: In Need of Direction; Phrase-Context Tracking / Injection (Child Indexes) / Dismissal

2010-09-03 Thread Jan Høydahl / Cominvent
instead of milliseconds though... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 3. sep. 2010, at 01.03, Scott Gonyea wrote: > Hi Grant, > > Thanks for replying--sorry for sticking this on dev; I had imagined that > develo

Re: Auto Suggest

2010-09-03 Thread Jan Høydahl / Cominvent
Are you phrasing the query, like &q="app mou" ? I guess with edgeNgram you use KeywordTokenizer which stores phrases as single terms. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrtraining.com On 2. sep. 2010, at 14.53, Ja

Re: In Need of Direction; Phrase-Context Tracking / Injection (Child Indexes) / Dismissal

2010-09-06 Thread Jan Høydahl / Cominvent
there which could do much of what you need. Did you know you can also embed Solr through EmbeddedSolr to include it in a workflow (See SOLR-1301). Also, I just found http://sna-projects.com/azkaban/ which looks promising to control advanced Hadoop workflows. Just some pointers.. -- Jan Høydahl, s

Re: Is there a way to fetch the complete list of data from a particular column in SOLR document?

2010-09-07 Thread Geert-Jan Brits
>Please let me know if there are any other ideas / suggestions to implement this. Your indexing program should really take care of this IMHO. Each time your indexer inserts a document to Solr, flag the corresponding entity in your RDBMS, each time you delete, remove the flag. You should implemen

Re: Is there a way to fetch the complete list of data from a particular column in SOLR document?

2010-09-09 Thread Geert-Jan Brits
e process query solr for documents in the indexing > state and set them to committed if they are queryable in solr. > > On Tue, Sep 7, 2010 at 14:26, Geert-Jan Brits wrote: > >>Please let me know if there are any other ideas / suggestions to > implement > > this. > >

Re: Date faceting +1MONTH problem

2010-09-10 Thread Jan Høydahl / Cominvent
ular use case I would consider a workaround where you add a new year-month field for this kind of faceting, avoiding date math completely since it will be a string facet, giving you: [2008-12] => 0 [2009-01] => 0 [2009-02] => 0 [2009-03] => 0 [2009-04] => 0 -- Jan Høydahl, search solut
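The year-month value could be derived at index time from the full timestamp. A sketch, assuming dates arrive in Solr's UTC format (yyyy-MM-ddTHH:mm:ssZ); the function name and the sample timestamps are made up for illustration:

```python
from collections import Counter
from datetime import datetime


def year_month(timestamp: str) -> str:
    """Turn a Solr-style UTC timestamp into a 'YYYY-MM' string facet value."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").strftime("%Y-%m")


# Hypothetical document timestamps
timestamps = ["2009-01-31T23:59:59Z", "2009-01-02T08:00:00Z", "2009-03-15T12:30:00Z"]

# Counting the derived values mimics what a string facet on that field returns.
facet_counts = Counter(year_month(ts) for ts in timestamps)
print(sorted(facet_counts.items()))
```

Since each document carries a fixed bucket label, month boundaries no longer depend on +1MONTH arithmetic at query time.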

Re: Sorting not working on a string field

2010-09-13 Thread Jan Høydahl / Cominvent
"2", but after "02". -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 10. sep. 2010, at 19.14, n...@frameweld.com wrote: > Hello, I seem to be having a problem with sorting. I have a string field > (time_code) that I want to order by. When the re
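String fields sort character by character, not numerically, which is easy to reproduce outside Solr. The sketch below shows the ordering and two common remedies (sorting numerically, or zero-padding the values before indexing); the sample values are made up.

```python
values = ["2", "10", "02"]

# Lexicographic order compares character by character: '0' < '1' < '2'.
as_strings = sorted(values)

# Numeric order; the sort is stable, so "2" and "02" keep their input order.
as_numbers = sorted(values, key=int)

# Zero-padding to a fixed width makes string order agree with numeric order.
padded = sorted(v.zfill(3) for v in values)

print(as_strings)
print(as_numbers)
print(padded)
```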

Re: mm=0?

2010-09-13 Thread Jan Høydahl / Cominvent
urned on (e.g. alpha~) * Redirect user to some other, broader source (wikipedia, google...) if relevant to your domain. No matter what you do, it is important to communicate it to the user in a very clear way. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 11. sep. 20

Re: Our SOLR instance seems to be single-threading and therefore not taking advantage of its multi-proc host

2010-09-14 Thread Jan Høydahl / Cominvent
it max out? In your case it's Tomcat which handles the threading of requests, and Solr is definitely capable of utilizing multiple cores. Could it be that you are bound by something other than CPU? Like disk, memory, network or such? -- Jan Høydahl, search solution architect Cominve

Re: Solr UIMA integration

2010-09-20 Thread Jan Høydahl / Cominvent
urable on a per instance basis? It could be done as follows: concept concept concept ... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 20. sep. 2010, at 12.35, Tommaso Teofili wrote: > Hi all, > I am working on integrating Apache UIMA as un UpdateRequestP

Re: Restrict possible results based on relational information

2010-09-20 Thread Jan Høydahl / Cominvent
NGramFilter in the "to" field, you get the effect of an automatic wildcard search since "John Doe" will be indexed as (conceptually) "J Jo Joh John D Do Doe" -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 20. sep. 2010, at 12.36, Stefan M
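To make the "automatic wildcard" effect concrete, here is a rough sketch of what edge n-gramming does to a value. It is a plain-Python imitation, not Solr's filter; the function name `edge_ngrams` and the `min_gram`/`max_gram` defaults are assumptions chosen for the example.

```python
def edge_ngrams(text: str, min_gram: int = 1, max_gram: int = 15) -> list:
    """Return the leading-edge n-grams of each whitespace-separated token."""
    grams = []
    for token in text.split():
        upper = min(len(token), max_gram)
        for n in range(min_gram, upper + 1):
            grams.append(token[:n])
    return grams


indexed = edge_ngrams("John Doe")
# A query term matches if it equals one of the indexed prefixes,
# which is why no explicit wildcard is needed at query time.
matches_prefix = "Joh" in indexed
```

Only prefixes of each word are produced, so a term taken from the middle of a word (for example "ohn") would not match with this approach.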

Re: Calculating distances in Solr using longitude latitude

2010-09-22 Thread Jan Høydahl / Cominvent
:-) Also, that Wiki page clearly states in the very first line that it talks about uncommitted stuff "Solr4.0". I think that is pretty clear. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 22. sep. 2010, at 03.31, Lance Norskog wrote: > Dev

Re: Different analyzers for different documents in different languages?

2010-09-22 Thread Jan Høydahl / Cominvent
ide Solr or write an UpdateRequestProcessor which does the renaming for you. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 22. sep. 2010, at 12.01, Andy wrote: > I have documents that are in different languages. There's a field in the > documents specifying

Re: Autocomplete: match words anywhere in the token

2010-09-22 Thread Jan Høydahl / Cominvent
Hmm, the terms component can only give you terms, so I don't think you can use that method. Try to go for creating a new Solr Core for your usecase. A bit more work but much more flexible. See http://search-lucene.com/m/Zfxp52FX49G1 -- Jan Høydahl, search solution architect Cominve

Re: is indexing single-threaded?

2010-09-23 Thread Jan Høydahl / Cominvent
SolrJ threads speeds up feeding throughput. Building the index is still single-threaded (per core), isn't it? Don't know about analysis. But you cannot have two threads write to the same file... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 23. sep

Re: Autocomplete: match words anywhere in the token

2010-09-23 Thread Jan Høydahl / Cominvent
Make sure you're using AND as default operator... -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 22. sep. 2010, at 20.14, Arunkumar Ayyavu wrote: > Thanks for the responses. Now, I included the EdgeNGramFilter. But, I get > the following results when I

Re: Is Solr right for our project?

2010-09-27 Thread Jan Høydahl / Cominvent
Solr will match this in version 3.1, which is the next major release. Read http://wiki.apache.org/solr/SolrCloud for feature descriptions. Coming to a trunk near you - see https://issues.apache.org/jira/browse/SOLR-1873 -- Jan Høydahl, search solution architect Cominvent AS

Re: Is Solr right for our project?

2010-09-28 Thread Jan Høydahl / Cominvent
. However, we encourage you to do a test install based on TRUNK+SOLR-1873 and give it a try. But we cannot guarantee that the APIs will not change in the released version (hopefully 3.1 sometime this year). -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 28. sep

Conditional Function Queries

2010-09-28 Thread Jan Høydahl / Cominvent
":100, "green":sum(30,20)) What do you think? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

Re: Conditional Function Queries

2010-09-28 Thread Jan Høydahl / Cominvent
Ok, I created the issues: IF function: SOLR-2136 AND, OR, NOT: SOLR-2137 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 28. sep. 2010, at 19.36, Yonik Seeley wrote: > On Tue, Sep 28, 2010 at 11:33 AM, Jan Høydahl / Cominvent > wrote: >> Have anyone

Re: can i have more update processors with solr

2010-10-01 Thread Jan Høydahl / Cominvent
I think the parameter name is confusing. I have proposed renaming it to processor.chain: https://issues.apache.org/jira/browse/SOLR-2105 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 30. sep. 2010, at 22.25, Markus Jelsma wrote: > Almost, you can defin
