Re: Need help with facets

2010-08-10 Thread Ahmet Arslan
--- On Wed, 8/11/10, Moazzam Khan wrote: > From: Moazzam Khan > Subject: Re: Need help with facets > To: solr-user@lucene.apache.org > Date: Wednesday, August 11, 2010, 1:32 AM > Thanks Ahmet that worked! > > Here's another issues I have : > > Like I said before, I have these fields in Solr

Re: solr query result not read the latest xml file

2010-08-10 Thread e8en
Thanks for your response, Jan. I just learned that post.jar is only an example tool, so what should I use instead of post.jar in production? By the way, I already tried this command: java -Durl=http://localhost:8983/search/update -jar post.jar cat_817.xml and IT WORKS!! the cat_817.xml reflected directl
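
On the question of what to use instead of post.jar: any HTTP client can POST the same XML to the update handler. Below is a minimal Python sketch, assuming the update URL quoted in the message above. The helper name `build_update_request` is hypothetical (not part of Solr), and the content type is an assumption about what the update handler expects.

```python
import urllib.request


def build_update_request(update_url: str, xml_body: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying an XML update message.

    The Content-Type value is an assumption, not taken from the thread.
    """
    return urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


# Usage sketch (needs a running server, so it is left commented out):
# with open("cat_817.xml", encoding="utf-8") as f:
#     req = build_update_request("http://localhost:8983/search/update", f.read())
# urllib.request.urlopen(req)
```

The request is only built here, so the sketch can be inspected without a server; sending it, and following it with a commit message, is left to the caller.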

Re: DIH and multivariable fields problems

2010-08-10 Thread kenf_nc
Glad I could help. I also would think it was a very common issue. Personally my schema is almost all dynamic fields. I have unique_id, content, last_update_date and maybe one other field specifically defined, the rest are all dynamic. This lets me accept an almost endless variety of document type

Re: PDF file

2010-08-10 Thread Jayendra Patil
Try ... curl " http://lhcinternal.nlm.nih.gov:8989/solr/lhc/update/extract?stream.file= /pub2009001.pdf&literal.id=777045&commit=true" stream.file - specify full path literal. - specify any extra params if needed Regards, Jayendra On Tue, Aug 10, 2010 at 4:49 PM, Ma, Xiaohui (NIH/NLM/LHC) [C] <

Re: Solr 1.4 - stats page slow

2010-08-10 Thread entdeveloper
Apologies if this was resolved, but we just deployed Solr 1.4.1 and the stats page takes over a minute to load for us as well and began causing OutOfMemory errors so we've had to refrain from hitting the page. From what I gather, it is the fieldCache part that's causing it. Was there ever an offi

Re: How to compile nightly build?

2010-08-10 Thread harrysmith
In this particular case I would like to get the trunk. Is there a different link for binary distributions of nightly builds? I had been downloading from here: http://hudson.zones.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/ In the case I did want to compile fro

Re: Modifications to AbstractSubTypeFieldType

2010-08-10 Thread Lance Norskog
Compound types are young and will probably mutate. I will do my own hack until things settle down. Lance On Mon, Jul 12, 2010 at 12:47 AM, Mark Allan wrote: > On 7 Jul 2010, at 6:24 pm, Yonik Seeley wrote: >> >> On Wed, Jul 7, 2010 at 8:15 AM, Grant Ingersoll >> wrote: >>> >>> Originally, I had

Re: How to compile nightly build?

2010-08-10 Thread Moazzam Khan
You don't have to download the source. You can just download the binary distribution from their site and run it without compiling it. - Moazzam On Tue, Aug 10, 2010 at 1:48 PM, harrysmith wrote: > > I am attempting to follow the instructions located at: > > http://wiki.apache.org/solr/Extracting

Re: Need help with facets

2010-08-10 Thread Moazzam Khan
Thanks Ahmet that worked! Here's another issues I have : Like I said before, I have these fields in Solr documents FirstName LastName RecruitedDate VolumeDate (just added this in this email) VolumeDone (just added this in this email) Now I have to get sum of all VolumeDone (integer field) for

RE: PDF file

2010-08-10 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Thanks so much for your help! I tried to index a pdf file and got the following. The command I used is curl 'http://lhcinternal.nlm.nih.gov:8989/solr/lhc/update/extract?map.content=text&map.stream_name=id&commit=true' -F "fi...@pub2009001.pdf" Did I do something wrong? Do I need modify anythi

RE: PDF file

2010-08-10 Thread Sharp, Jonathan
Xiaohui, You need to add the following jars to the lib subdirectory of the solr config directory on your server. (path inside the solr 1.4.1 download) /dist/apache-solr-cell-1.4.1.jar plus all the jars in /contrib/extraction/lib HTH -Jon From: Ma, X

Re: Improve Query Time For Large Index

2010-08-10 Thread Peter Karich
Hi Tom, my index is around 3GB and I am using 2GB RAM for the JVM, although some more is available. If I look at the RAM usage while a slow query runs (via jvisualvm), I see that only 750MB of the JVM RAM is used. > Can you give us some examples of the slow queries? for example the

RE: PDF file

2010-08-10 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Does anyone have any experience with PDF file? I really appreciate your help! Thanks so much in advance. -Original Message- From: Ma, Xiaohui (NIH/NLM/LHC) [C] Sent: Tuesday, August 10, 2010 10:37 AM To: 'solr-user@lucene.apache.org' Subject: PDF file I have a lot of pdf files. I am tryi

Do we need index analyzer for query elevation component

2010-08-10 Thread Darniz
Hello, In order to use query elevation we define a type. Do we really need an index-time analyzer for the query elevation type? Let's say we have some documents already indexed and I added only the query-time analyzer; it looks like Solr reads the words in elevate.xml and maps the words to the respective document. in

How to compile nightly build?

2010-08-10 Thread harrysmith
I am attempting to follow the instructions located at: http://wiki.apache.org/solr/ExtractingRequestHandler#Getting_Started_with_the_Solr_Example I have downloaded the most recent clean build from Hudson. After running 'ant example' I get the following error: C:\solr_build\apache-solr-4.0-201

Re: Need help with facets

2010-08-10 Thread Ahmet Arslan
> I have a solr index whose documents have the following > fields: > > FirstName > LastName > RecruitedDate > > I update the index when any of the three fields change for > that specific person. > > I need to get facets based on when someone was recruited. > The facets are : > > Recruited withi
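
The reply above is cut off in this listing, so the following is not necessarily what was suggested. One common way to get "recruited within N months" counts is one `facet.query` per time window using Solr date math. A minimal Python sketch, assuming `RecruitedDate` is indexed as a date field; the function names are hypothetical.

```python
from urllib.parse import urlencode


def recruited_within(months: int) -> str:
    """Range query for RecruitedDate values from `months` months ago up to now."""
    if months < 1:
        raise ValueError("months must be at least 1")
    return f"RecruitedDate:[NOW-{months}MONTHS TO NOW]"


def facet_params(month_windows) -> str:
    """URL-encoded parameters requesting one facet.query per time window."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    params += [("facet.query", recruited_within(m)) for m in month_windows]
    return urlencode(params)
```

Calling `facet_params([1])` yields the query string for the "Recruited within 1 month" bucket named in the question; other windows are passed the same way.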

Re: delete Problem..

2010-08-10 Thread Moazzam Khan
Are you running a commit command after every delete command? I had the same problem with updates. I wasn't committing my updates. - Moazzam Khan http://moazzam-khan.com On Tue, Aug 10, 2010 at 8:52 AM, kenf_nc wrote: > > I'd try 2 things. > First do a query >   q=EMAIL_HEADER_FROM:test.de > and
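
The delete-then-commit sequence discussed in this thread can be sketched as two XML update messages. This Python sketch only builds the message bodies; the helper names are hypothetical, and the query string is the one quoted in the thread.

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query: str) -> str:
    """XML body asking Solr to delete every document matching `query`."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query  # text is XML-escaped on output
    return ET.tostring(delete, encoding="unicode")


def commit_xml() -> str:
    """XML body for a commit, which makes earlier adds/deletes visible."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


# The two messages are sent in this order: delete first, then commit.
messages = [delete_by_query_xml("EMAIL_HEADER_FROM:test.de"), commit_xml()]
```

Building the body with an XML library rather than string concatenation keeps characters such as `&` or `<` in the query from breaking the message.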

Need help with facets

2010-08-10 Thread Moazzam Khan
Hi guys, I have a solr index whose documents have the following fields: FirstName LastName RecruitedDate I update the index when any of the three fields change for that specific person. I need to get facets based on when someone was recruited. The facets are : Recruited within 1 month Recruite

Re: DIH and multivariable fields problems

2010-08-10 Thread Alexey Serba
> Have others successfully imported dynamic multivalued fields in a > child entity using the DataImportHandler via the child entity returning > multiple records through a RDBMS? Yes, it's working ok with static fields. I didn't even know that it's possible to use variables in field names ( "dynami

RE: Improve Query Time For Large Index

2010-08-10 Thread Burton-West, Tom
Hi Peter, A few more details about your setup would help list members to answer your questions. How large is your index? How much memory is on the machine and how much is allocated to the JVM? Besides the Solr caches, Solr and Lucene depend on the operating system's disk caching for caching of

PDF file

2010-08-10 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
I have a lot of pdf files. I am trying to import pdf files to solr and index them. I added ExtractingRequestHandler to solrconfig.xml. Please tell me if I need to download some jar files. The Solr 1.4 Enterprise Search Server book uses the following command to import mccm.pdf: curl 'http://loc

Re: Implementing lookups while importing data

2010-08-10 Thread Alexey Serba
> We are currently doing this via a JOIN on the numeric > field, between the main data table and the lookup table, but this > dramatically slows down indexing. I believe SQL JOIN is the fastest and easiest way in your case (in comparison with nested entity even using CachedSqlEntity). You probably

Improve Query Time For Large Index

2010-08-10 Thread Peter Karich
Hi, I have 5 Million small documents/tweets (=> ~3GB) and the slave index replicates itself from master every 10-15 minutes, so the index is optimized before querying. We are using solr 1.4.1 (patched with SOLR-1624) via SolrJ. Now the search speed is slow >2s for common terms which hits more tha

Re: delete Problem..

2010-08-10 Thread kenf_nc
I'd try 2 things. First do a query q=EMAIL_HEADER_FROM:test.de and make sure some documents are found. If nothing is found, there is nothing to delete. Second, how are you testing to see if the document is deleted? The physical data isn't removed from the index until you Optimize I believe. I

Re: delete Problem..

2010-08-10 Thread Jan Høydahl / Cominvent
Hi, Since EMAIL_HEADER_FROM is a String type, you need to specify the whole field every time. Wildcards could also work, but you'll get a problem with leading wildcards. The solution would be to change the fieldType into a "text" type using e.g. StandardTokenizerFactory - if this does not brea

RE: hl.usePhraseHighlighter

2010-08-10 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Thanks so much for your help! It works. I really appreciate it. -Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com] Sent: Monday, August 09, 2010 6:05 PM To: solr-user@lucene.apache.org Subject: RE: hl.usePhraseHighlighter > I used text type and found the following in schema

Re: solr query result not read the latest xml file

2010-08-10 Thread Jan Høydahl / Cominvent
Hi, Beware that post.jar is just an example tool to play with the default example index located at the /solr/ namespace. It is very limited and you should look elsewhere for a more production-ready and robust tool. However, it has the ability to specify a custom url. Please try: java -jar post.jar -h

Re: Facet Fields - ID vs. Display Value

2010-08-10 Thread kenf_nc
If your concern is performance, faceting integers versus faceting strings, I believe Lucene makes the differences negligible. Given that choice I'd go with string. Now..if you need to keep an association between id and string, you may want to facet a combined field <id>:<string> or some other delimiter. Then
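
The combined-field idea above can be sketched as a pair of helpers: one builds the value stored in the facet field at index time, the other splits a returned facet value back into id and display string. This is a Python illustration with hypothetical names, assuming an integer id and ":" as the delimiter.

```python
from typing import Tuple

DELIMITER = ":"  # any character that cannot appear in the id would do


def combine(item_id: int, label: str) -> str:
    """Value to index in the facet field, e.g. '42:Red'."""
    return f"{item_id}{DELIMITER}{label}"


def split_facet_value(value: str) -> Tuple[int, str]:
    """Recover (id, label) from a facet value returned by the search."""
    raw_id, label = value.split(DELIMITER, 1)  # split once: label may contain ':'
    return int(raw_id), label
```

Splitting only on the first delimiter means the display string may itself contain the delimiter without corrupting the id.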

Re: how to support "implicit trailing wildcards"

2010-08-10 Thread Jan Høydahl / Cominvent
Hi, You don't need to duplicate the content into two fields to achieve this. Try this: q=mount OR mount* The exact match will always get higher score than the wildcard match because wildcard matches use "constant score". Making this work for multi term queries is a bit trickier, but somethin
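
The single-term rewrite above can be done on the client before the query is sent. A minimal Python sketch with hypothetical helper names; it handles one plain word only and does not attempt the multi-term case or escaping of query-syntax characters.

```python
from urllib.parse import urlencode


def exact_or_prefix_query(term: str) -> str:
    """Rewrite a single word into 'word OR word*' (implicit trailing wildcard)."""
    if not term or any(ch.isspace() for ch in term):
        raise ValueError("expected exactly one word")
    return f"{term} OR {term}*"


def select_params(term: str) -> str:
    """URL-encoded q parameter for the rewritten query."""
    return urlencode({"q": exact_or_prefix_query(term)})
```

For the word in the message, `exact_or_prefix_query("mount")` gives `mount OR mount*`, and `select_params` takes care of encoding the spaces and the asterisk.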

Re: DIH: Rows fetch OK, Total Documents Failed??

2010-08-10 Thread Alexey Serba
Do you have any required fields or uniqueKey in your schema.xml? Do you provide values for all these fields? AFAIU you don't need commonField attribute for id and title fields. I don't think that's your problem but anyway... On Sat, Jul 31, 2010 at 11:29 AM, wrote: > >  Hi, > > I'm a bit lost

Re: Indexing fieldvalues with dashes and spaces

2010-08-10 Thread Jan Høydahl / Cominvent
Hi, Try solr.KeywordTokenizerFactory. However, in your case it looks as if you have certain requirements for searching that requires tokenization. So you should leave the WhitespaceTokenizer as is and create a separate field specially for the faceting, with indexed=true, stored=false and type=

Re: Process entire result set

2010-08-10 Thread Eloi Rocha
Thanks Jonathan! We decided to create offline results and store them in a NoSQL store (HBase). So we can answer requests by selecting one of the offline-generated results. These offline results are generated every day. Thanks! Eloi On Thu, Aug 5, 2010 at 8:59 PM, Jonathan Rochkind wrote:

delete Problem..

2010-08-10 Thread Jörg Agatz
Hello users... I have a problem deleting some indexed items. I tried it with: java -Ddata=args -jar /home/service/solr/apache-solr-nightly/example/exampledocs/post.jar "EMAIL_HEADER_FROM:test.de" but nothing happened. EMAIL_HEADER_FROM is a "String", and in the past this always worked. But now? I can't delete

Re: solr query result not read the latest xml file

2010-08-10 Thread e8en
Finally I found the cause of my problem. Yes, you don't need to delete the index and restart Tomcat just to get the query results updated; you just need to commit the xml files. I made a custom url as per a requirement from my client. default url -- > http://localhost/solr/select/?q=ITEM_C

AW: AW: solr query result not read the latest xml file

2010-08-10 Thread Bastian Spitzer
You can check the admin panel to see if there are pending deletes/commits in the statistics section. Older versions of post.jar don't auto-commit the changes, so if your xml doesn't contain a <commit/> you could just create a commit.xml containing only the following: <commit/> and send it via post.jar. you can al

Re: solr query result not read the latest xml file

2010-08-10 Thread Ahmet Arslan
> yes I try with both value, never304="true" and > never304="false" and none of > them make it works It must be <httpCaching never304="true">, so let's forget about never304="false". But when you change something in solrconfig.xml you need to restart jetty/tomcat. java -jar post.jar *.xml does a <commit/> by default at the end. > w

Re: AW: solr query result not read the latest xml file

2010-08-10 Thread e8en
Hi Bastian, how do I send a <commit/>? Is it by typing: java -jar post.jar cat_978.xml? If yes, then I've already done that. Any solution, please?

Re: solr query result not read the latest xml file

2010-08-10 Thread e8en
Yes, I tried both values, never304="true" and never304="false", and neither of them works. What are curl and wget? I use the Mozilla Firefox browser. I'm really a newbie in the programming world, especially Solr.

Solr Delta import where last_modified

2010-08-10 Thread Hando420
Hi all. I have set up my data-config with a MySQL database. The problem I am having is that MySQL doesn't execute the deltaQuery. The where last_modified clause is not executed and throws an error: unknown column last_modified in where clause. Shouldn't this be treated as a deltaQuery instead of a column in the table? A

AW: solr query result not read the latest xml file

2010-08-10 Thread Bastian Spitzer
Make sure you send a <commit/> after add/delete to make the changes visible. -Original Message- From: e8en [mailto:e...@tokobagus.com] Sent: Tuesday, 10 August 2010 10:04 To: solr-user@lucene.apache.org Subject: Re: solr query result not read the latest xml file I already set in my s

Re: solr query result not read the latest xml file

2010-08-10 Thread Ahmet Arslan
> I already set in my solrconfig.xml as you told me: > > > and then I commit the xml > and it's still not working > the query result still show the old data :( > > do you have any suggestion? Shouldn't it be never304="true"? You wrote never304="false" Additionally cant you try with something

Re: solr query result not read the latest xml file

2010-08-10 Thread e8en
I already set in my solrconfig.xml as you told me: and then I commit the xml and it's still not working the query result still show the old data :( do you have any suggestion? Eben -- View this message in context: http://lucene.472066.n3.nabble.com/solr-query-result-not-read-the-latest-xml-f

Re: how to support "implicit trailing wildcards"

2010-08-10 Thread Geert-Jan Brits
you could satisfy this by making 2 fields: 1. exactmatch 2. wildcardmatch use copyfield in your schema to copy 1 --> 2 . q=exactmatch:mount+wildcardmatch:mount*&q.op=OR this would score exact matches above (solely) wildcard matches Geert-Jan 2010/8/10 yandong yao > Hi Bastian, > > Sorry for n