Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi,

wildcard and fuzzy queries are not analyzed.

-Sascha



Alok Bhandari wrote:

Hello ,

I am pushing "Chuck Follett'.?.?" in solr and when I query for this field
with query string field:Follett'.* I am getting 0 results.

field type declared is






 

and parser we are using is EdisMax .

Is it the case that text analysis is not done for prefix queries, which is
why I am getting 0 results, or is there something fundamentally wrong with my
data/schema?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Prefix-query-is-not-analysed-tp3992435.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Prefix query is not analysed?

2012-07-02 Thread Alok Bhandari
Thanks for reply.

If I check the debug query through solr-admin I can see that the lower case
filter is applied and 

  "rawquerystring":"em_to_name:Follett'.*",
"querystring":"em_to_name:Follett'.*",
"parsedquery":"+em_to_name:follett'.*",
"parsedquery_toString":"+em_to_name:follett'.*",
"explain":{},
"QParser":"ExtendedDismaxQParser",


I can see this query. So is it the case that only tokenization is not done
for the wildcard queries but other filters specified are applied?
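The behaviour seen in the debug output can be illustrated with a small simulation. This is only a sketch: the tokenizer below is a toy stand-in (the actual field type definition did not survive in the post), and the function names are made up for illustration. It shows why a prefix that still contains punctuation cannot match tokens that had the punctuation stripped at index time.

```python
import re


def index_tokens(text):
    # Toy stand-in for index-time analysis (an assumption, not the real chain):
    # split on anything that is not a letter or digit, then lowercase.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def prefix_term(raw_query):
    # Multi-term handling as observed in the debug output:
    # the term is lowercased but NOT tokenized, so punctuation survives.
    return raw_query.rstrip("*").lower()


def prefix_matches(raw_query, indexed_text):
    # Return the indexed tokens that start with the (lowercased) prefix.
    prefix = prefix_term(raw_query)
    return [t for t in index_tokens(indexed_text) if t.startswith(prefix)]


print(prefix_matches("Follett'.*", "Chuck Follett'.?.?"))
print(prefix_matches("Follett*", "Chuck Follett'.?.?"))
```

Under these assumptions, a prefix without the trailing punctuation (Follett*) would find the indexed token, while Follett'.* would not.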

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Prefix-query-is-not-analysed-tp3992435p3992450.html


Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi,

I suppose you are using Solr 3.6. Then take a look at

http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/

-Sascha






Re: Can't find solr.xml

2012-07-02 Thread Nabeel Sulieman
Argh! (and hooray!)

I started from scratch again, following the wiki instructions. I did only
one thing differently; put my data directory in /opt instead of /home/dev.
And now it works!

I'm glad it's working now. I just wish I knew exactly what the difference
is. The directory in /opt has exactly the same permissions as the one in
/home/dev (chown -R tomcat solr).
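The thread never established why /opt worked and /home/dev did not. One check suggested in the quoted reply below is whether the Tomcat user can traverse every directory leading up to the Solr home. A minimal sketch of that check follows; the function name is hypothetical, and it only reports what the user running the script can enter, so it would have to be run as the Tomcat user to be meaningful.

```python
import os
from pathlib import Path


def untraversable_dirs(directory):
    """List the directory and its ancestors that the current user cannot enter.

    Note: a path that does not exist is also reported, because
    os.access() returns False for it.
    """
    target = Path(os.path.abspath(directory))
    candidates = [target, *target.parents]
    # Report from the root downwards so the first blocked level is obvious.
    return [str(p) for p in reversed(candidates) if not os.access(p, os.X_OK)]
```

For example, calling untraversable_dirs("/home/dev/solr") as the Tomcat user would list /home/dev if that directory lacks execute permission for that user. An empty list does not rule out other OS-level causes.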


On Sun, Jul 1, 2012 at 10:08 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Possibly an obvious thing you've already looked at, but does the user
> that Tomcat's running under have execute permissions on all the
> directories leading up to your solr directory? It might be being
> prevented from traversing the directory tree.
>
> Also I would verify there's no ACLs on anything in that directory
> tree. Those can get bizarre if you're like me and don't understand
> them. :)
>
> I'm assuming you're using a UNIX variant here...
>
> Michael Della Bitta
>
> 
> Appinions, Inc. -- Where Influence Isn’t a Game.
> http://www.appinions.com
>
>
> On Sun, Jul 1, 2012 at 11:40 AM, Nabeel Sulieman
>  wrote:
> > I tried going back to 3.5 and ran into the same problem, so it's
> definitely
> > an issue on my server.
> >
> > It's just so bizarre; I can cat or vi "/home/dev/solr/solr.xml" with no
> > problems, and I've even tried setting the permission to read/write for
> all,
> > but tomcat still can't seem to find the file.
> >
> > Hmmm... I wonder if my tomcat user is running in some kind of jailed
> > environment? I'll look into that next.
> >
> > Anyways, I'll retrace my steps and see if I come up with anything.
> >
> > -Original Message-
> > From: Mark Miller [mailto:markrmil...@gmail.com]
> > Sent: Sunday, July 01, 2012 5:27 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can't find solr.xml
> >
> > Can't think of anything that would cause this from 3.5 to 3.6.
> >
> > If you indeed have a solr home located at home/dev/solr/ and there is a
> conf
> > dir under that, all I can imagine is that it's a permissions issue or
> > something at the OS level.
> >
> > On Jun 30, 2012, at 2:28 PM, Nabeel Sulieman wrote:
> >
> >> Hi,
> >>
> >>
> >>
> >> I really hate bothering this group with something that should be
> >> trivial, but I've been googling and experimenting to get this to work
> >> for the last week now. I had no trouble getting my simple
> >> configuration working on 3.5, but when I moved over to 3.6, I seem to
> have
> > hit something strange.
> >>
> >>
> >>
> >> As I said I'm on the latest version of solr (3.6.0), and I'm using
> >> exactly the standard war file, with the "solr/home" section
> >> uncommented and set to my Solr directory.
> >>
> >>
> >>
> >> However, even though the path is correct, Solr/Tomcat don't seem to be
> >> able to find the solr.xml file, nor the solrconfig.xml file.
> >>
> >>
> >>
> >> Java version is 1.6.0_29-b11, tomcat 5.5.35, CentOS.
> >>
> >>
> >>
> >> What am I missing here?
> >>
> >>
> >>
> >> Thanks. Below is the error log.
> >>
> >>
> >>
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer$Initializer initialize
> >> INFO: looking for solr.xml: /home/dev/solr/solr.xml
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer$Initializer initialize
> >> INFO: no solr.xml file found - using default
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer load
> >> INFO: Loading CoreContainer using Solr Home: '/home/dev/solr/'
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.SolrResourceLoader <init>
> >> INFO: new SolrResourceLoader for directory: '/home/dev/solr/'
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer create
> >> INFO: Creating SolrCore '' using instanceDir: /home/dev/solr/.
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.core.SolrResourceLoader <init>
> >> INFO: new SolrResourceLoader for directory: '/home/dev/solr/./'
> >> Jun 30, 2012 12:47:58 PM org.apache.solr.common.SolrException log
> >> SEVERE: java.lang.RuntimeException: Can't find resource
> >> 'solrconfig.xml' in classpath or '/home/dev/solr/./conf/',
> >> cwd=/usr/local/jakarta/apache-tomcat-5.5.35/bin
> >>     at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:273)
> >>     at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:239)
> >>     at org.apache.solr.core.Config.<init>(Config.java:141)
> >>     at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:138)
> >>     at org.apache.solr.core.CoreContainer.create(CoreContainer.java:455)
> >>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:335)
> >>     at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:165)
> >>
> >>

Re: Prefix query is not analysed?

2012-07-02 Thread Alok Bhandari
Yes I am using Solr 3.6.

Thanks for the link it is very useful.
From the link I could make out that if the analyzer includes any of the
following, they are applied to wildcard queries, and any other elements
specified under the analyzer are not applied as they are not multi-term aware.

ASCIIFoldingFilterFactory
LowerCaseFilterFactory
LowerCaseTokenizerFactory
MappingCharFilterFactory
PersianCharFilterFactory
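The rule summarized above can be sketched as a simple filter over an analyzer chain. The set below is exactly the list from this message; the example chain and the function name are hypothetical, chosen only to illustrate which components would survive for a wildcard or prefix term.

```python
# Components listed in this message as multi-term aware (Solr 3.6).
MULTITERM_AWARE = {
    "ASCIIFoldingFilterFactory",
    "LowerCaseFilterFactory",
    "LowerCaseTokenizerFactory",
    "MappingCharFilterFactory",
    "PersianCharFilterFactory",
}


def multiterm_chain(analyzer_chain):
    # Keep only the components that are applied to wildcard/prefix terms,
    # preserving their original order.
    return [name for name in analyzer_chain if name in MULTITERM_AWARE]


# Hypothetical analyzer chain, for illustration only.
chain = [
    "WhitespaceTokenizerFactory",
    "WordDelimiterFilterFactory",
    "LowerCaseFilterFactory",
]
print(multiterm_chain(chain))
```

With this hypothetical chain, only the lowercasing step remains, which is consistent with the debug output earlier in the thread where the term was lowercased but otherwise left untouched.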




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Prefix-query-is-not-analysed-tp3992435p3992463.html


Facet Order when the count is the same

2012-07-02 Thread maurizio1976
Hi,
does anybody know whether there is a rule for the order of the values returned
by a facet when the counts are all the same?
Are they ordered by the time the doc was indexed in Solr?

cheers
Maurizio

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Facet-Order-when-the-count-is-the-same-tp3992471.html


Dynamic Field Name?

2012-07-02 Thread Bruno Mannina

Dear All,

In my XML files to index I have several abstract fields with different 
languages.

I.e: ABEN, ABIT, ABFR, ABPT, etc

I would like to:
Index & Store only the ABEN
and
Only Store all other AB* fields

Is it possible to write in the schema.xml

multivalued="true" />
multivalued="true" />


PS: multivalues because sometimes I have 2 or 3 items in the same 

Thanks for your comment,
Bruno


Re: Atomic Multicore Operations - E.G. Move Docs

2012-07-02 Thread Nicholas Ball

That could work, but then how do you ensure commit is called on the two
cores at the exact same time?
Also, is there any way to commit a specific update rather than all the
backlogged ones?

Cheers,
Nicholas

On Sat, 30 Jun 2012 16:19:31 -0700, Lance Norskog 
wrote:
> Index all documents to both cores, but do not call commit until both
> report that indexing worked. If one of the cores throws an exception,
> call roll back on both cores.
> 
> On Sat, Jun 30, 2012 at 6:50 AM, Nicholas Ball
>  wrote:
>>
>> Hey all,
>>
>> Trying to figure out the best way to perform atomic operation across
>> multiple cores on the same solr instance i.e. a multi-core environment.
>>
>> An example would be to move a set of docs from one core onto another core
>> and ensure that a softcommit is done at the exact same time. If one were to
>> fail so would the other.
>> Obviously this would probably require some customization but wanted to
>> know what the best way to tackle this would be and where should I be
>> looking in the source.
>>
>> Many thanks for the help in advance,
>> Nicholas a.k.a. incunix
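The approach quoted above (index to both cores, commit only if both succeed, otherwise roll back both) can be sketched as follows. FakeCore is a hypothetical in-memory stand-in, not a Solr client; a real implementation would issue add, commit and rollback requests to each core. Note that even in this sketch the commits run one after another, so it does not answer the "exact same time" concern: a failure between the two commits would still leave the cores out of step.

```python
class FakeCore:
    """Hypothetical in-memory stand-in for a Solr core."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on  # document id that triggers a simulated failure
        self.pending = []       # added but not yet committed
        self.committed = []     # visible after commit

    def add(self, doc):
        if doc["id"] == self.fail_on:
            raise RuntimeError(f"{self.name}: indexing failed for {doc['id']}")
        self.pending.append(doc)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []


def index_to_all(cores, docs):
    """Add docs to every core; commit only if every add succeeded."""
    try:
        for core in cores:
            for doc in docs:
                core.add(doc)
    except Exception:
        # Any failure: discard uncommitted work on all cores.
        for core in cores:
            core.rollback()
        return False
    # Commits are still sequential, so this is not truly atomic.
    for core in cores:
        core.commit()
    return True
```

The design choice here is simply to delay every commit until all adds have been acknowledged, which narrows the window for inconsistency without closing it.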


Re: Dismax Question

2012-07-02 Thread Ahmet Arslan
> So, my question is how do we get Solr search to work with
> AND when it is splitting words? The splitting part is good,
> the bad part is that it is searching for any one of those
> split words.

Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.



http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
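When mm is passed as a request parameter, the percent sign has to be URL-encoded, otherwise the value may not arrive as intended. A small sketch of building such a query string follows; the function name is made up, and the parameter set is only an example (autoGeneratePhraseQueries is a fieldType attribute in schema.xml and is not part of the request).

```python
from urllib.parse import urlencode


def dismax_params(user_query, mm="100%"):
    # urlencode() percent-encodes the "%" in the mm value as "%25".
    return urlencode({"defType": "edismax", "q": user_query, "mm": mm})


print(dismax_params("DualHead2Go"))
```

The resulting string can be appended to the select handler URL after a "?".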


Re: Dynamic Field Name?

2012-07-02 Thread Ahmet Arslan


--- On Mon, 7/2/12, Bruno Mannina  wrote:

> From: Bruno Mannina 
> Subject: Dynamic Field Name?
> To: solr-user@lucene.apache.org
> Date: Monday, July 2, 2012, 1:02 PM
> Dear All,
> 
> In my XML files to index I have several abstract fields with
> different languages.
> I.e: ABEN, ABIT, ABFR, ABPT, etc
> 
> I would like to:
> Index & Store only the ABEN
> and
> Only Store all other AB* fields
> 
> Is it possible to write in the schema.xml
> 
>  stored="true" multivalued="true" />
>  stored="true" multivalued="true" />
> 
> PS: multivalues because sometimes I have 2 or 3 items in the
> same 

Yes it is possible. Note that you need to use capital 'V' in multiValued 
option. 
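The field definitions in the original post were lost, so the following is only a sketch of the intended behaviour, with a hypothetical configuration: an explicitly declared ABEN field that is indexed and stored, and an AB* dynamic field that is stored only. It illustrates that an explicit field name takes precedence over a matching dynamic field pattern.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration mirroring the stated intent of the question.
EXPLICIT_FIELDS = {
    "ABEN": {"indexed": True, "stored": True, "multiValued": True},
}
DYNAMIC_FIELDS = [
    ("AB*", {"indexed": False, "stored": True, "multiValued": True}),
]


def resolve_field(name):
    # Explicitly declared fields win over dynamic field patterns.
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    for pattern, options in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return options
    return None  # no matching field definition


for field in ("ABEN", "ABFR", "TITLE"):
    print(field, resolve_field(field))
```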






Re: Dynamic Field Name?

2012-07-02 Thread Bruno Mannina


ok thx !



indexing documents in Apache Solr using php-curl library

2012-07-02 Thread Asif
I am indexing the file using php curl library. I am stuck here with the code
echo "Stored in: " . "upload/" . $_FILES["file"]["name"];
 $result=move_uploaded_file($_FILES["file"]["tmp_name"],"upload/" .
$_FILES["file"]["name"]);
 if ($result == 1) echo "Upload done .";
$options = getopt("f:");
$infile = $options['f'];

$url = "http://localhost:8983/solr/update/";
$filename = "upload/" . $_FILES["file"]["name"];
$handle = fopen($filename, "rb");
$contents = fread($handle, filesize($filename));
fclose($handle);
echo $url;
$post_string = file_get_contents("upload/" .
$_FILES["file"]["name"]);
echo $contents;
$header = array("Content-type:text/xml; charset=utf-8");

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLINFO_HEADER_OUT, 1);

$data = curl_exec($ch);

if (curl_errno($ch)) {
   print "curl_error:" . curl_error($ch);
} else {
   curl_close($ch);
   print "curl exited okay\n";
   echo "Data returned...\n";
   echo "\n";
   echo $data;
   echo "\n";
}

Nothing is showing as a result. Moreover there is nothing shown in the event
log of Apache Solr. please help me with the code
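For comparison, the same request can be sketched in a few lines of Python (used here instead of PHP purely for illustration). The update URL is the one from the post; the commit=true parameter is an addition on the assumption that a missing commit could be one reason nothing appears in the index, which the thread does not confirm. The function names are hypothetical.

```python
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # URL taken from the post


def build_update_request(xml_bytes, commit=True):
    # Ask Solr to commit right away so the documents become searchable
    # (assumption: no autocommit is configured).
    url = SOLR_UPDATE_URL + ("?commit=true" if commit else "")
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def post_update(xml_bytes):
    # Sends the request; requires a running Solr instance at SOLR_UPDATE_URL.
    with urllib.request.urlopen(build_update_request(xml_bytes), timeout=10) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Printing the returned status and body from post_update() should show whether Solr accepted the document or reported an error.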

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-documents-in-Apache-Solr-using-php-curl-library-tp3992452.html


Re: Dismax Question

2012-07-02 Thread Vadim Kisselmann
in your schema.xml you can set the default query parser operator, in
your case , but it's
deprecated.
When you use edismax, read this: http://drupal.org/node/1559394 .
The mm param is the answer here.

Best regards
Vadim





2012/7/2 Steve Fatula :
> Let's say a user types in:
>
> DualHead2Go
>
>
> The way solr is working, it splits this into:
>
> Dual Head 2 Go
>
> And searches the index for various fields, and finds records where any ONE of 
> them matches.
>
> Now, if I simply type the search terms Dual Head 2 Go, it finds records where 
> ALL of them match. This is because we set q.op to AND.
>
> Recently, we went from Solr 3.4 to 3.6, and, 3.4 used to work ok, 3.6 seems to
> behave differently, or, perhaps we mucked something up.
>
> So, my question is how do we get Solr search to work with AND when it is 
> splitting words? The splitting part is good, the bad part is that it is 
> searching for any one of those split words.
>
> Steve


solr error, cannot run PhraseQuery

2012-07-02 Thread jimmon
http://172.17.100.16:/solr/collection1/select?q=name:(4U OR Cabs)^12.8
category:(4U OR Cabs)^1.6 subcategory:(4U OR Cabs)^0.4 description:(4U OR
Cabs)^0.1

the above query is throwing the following error

field "subcategory" was indexed with Field.omitTermFreqAndPositions=true;
cannot run PhraseQuery (term=4)



subcategory field in schema.xml is like this



The only difference between name,category,subcategory & description is
subcategory is multivalued.
In the above solr query if we change subcategory:(4U OR Cabs) to 
subcategory:(5U OR Cabs) ,subcategory:(6U OR Cabs)  etc it works.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-error-cannot-run-PhraseQuery-tp3992469.html


Re: solr error, cannot run PhraseQuery

2012-07-02 Thread Ahmet Arslan
> http://172.17.100.16:/solr/collection1/select?q=name:(4U
> OR Cabs)^12.8
> category:(4U OR Cabs)^1.6 subcategory:(4U OR Cabs)^0.4
> description:(4U OR
> Cabs)^0.1
> 
> the above query is throwing the following error
> 
> field "subcategory" was indexed with
> Field.omitTermFreqAndPositions=true;
> cannot run PhraseQuery (term=4)
> 
> 
> 
> subcategory field in schema.xml is like this
> 
>  multiValued="true"/>

omitTermFreqAndPositions="true" is an option that field types can have. 
Positions are required for phrase queries. Do you have this option set in 
type="text_en" definition?
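Why positions matter can be shown with a toy inverted index. The error message mentions term=4, so presumably "4U" was split into "4" and "U" by the analysis chain and turned into a phrase query; that is an assumption, since the field type definition is not visible in the thread. The functions below are illustrative only and do not reflect Lucene internals.

```python
def build_index(docs, with_positions=True):
    # Map term -> {doc_id: [positions]} or, without positions, {doc_id: None}.
    index = {}
    for doc_id, tokens in docs.items():
        for pos, tok in enumerate(tokens):
            postings = index.setdefault(tok, {})
            if with_positions:
                postings.setdefault(doc_id, []).append(pos)
            else:
                postings.setdefault(doc_id, None)
    return index


def phrase_match(index, first, second):
    # Find docs where `second` occurs directly after `first`.
    hits = []
    second_postings = index.get(second, {})
    for doc_id, positions in index.get(first, {}).items():
        if doc_id not in second_postings:
            continue
        if positions is None:
            # Without stored positions, adjacency cannot be verified.
            raise ValueError("index has no positions; cannot run a phrase query")
        if any(p + 1 in second_postings[doc_id] for p in positions):
            hits.append(doc_id)
    return hits


# Hypothetical tokenized documents.
docs = {1: ["4", "u", "cabs"], 2: ["u", "4"]}
print(phrase_match(build_index(docs), "4", "u"))
```

With positions, only the document where "u" directly follows "4" matches. With with_positions=False the same call raises an error, which mirrors the situation reported above.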


Re: Dismax Question

2012-07-02 Thread Steve Fatula

>From: Vadim Kisselmann 
>To: solr-user@lucene.apache.org; Steve Fatula  
>Sent: Monday, July 2, 2012 4:31 AM
>Subject: Re: Dismax Question
> 
>in your schema.xml you can set the default query parser operator, in
>your case , but it's
>deprecated.
>
>
>I do set the default query operator; as shown by using separate words in my
>example, it correctly ANDs them. The difference is when using one word: Solr
>splits it into 3 words and does not AND them. I don't understand why it does
>not AND them when Solr splits the words, but does when Solr does not split them.
>
>
>I've specified mm as 100% as well with no impact.
>
>This used to work on Solr 3.4.

Re: Dismax Question

2012-07-02 Thread Steve Fatula

>From: Ahmet Arslan 
>To: solr-user@lucene.apache.org; Steve Fatula  
>Sent: Monday, July 2, 2012 6:22 AM
>Subject: Re: Dismax Question
> 
>> So, my question is how do we get Solr search to work with
>> AND when it is splitting words? The splitting part is good,
>> the bad part is that it is searching for any one of those
>> split words.
>
>Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.
>
>
>
>I set mm to 100%, no effect at all. It works only for words typed in that are 
>separated already. Remember, the example here is:
>
>
>DualHead2Go finds all kinds of matches (it splits into dual head 2 go)
>
>
>Dual Head 2 Go finds the correct matches, indicating it is ANDing them based
>on q.op, defaultOperator, and mm.

Re: indexing documents in Apache Solr using php-curl library

2012-07-02 Thread Sascha SZOTT
Hi,

perhaps it's better to use a PHP Solr client library. I used

   https://code.google.com/p/solr-php-client/

in a project of mine and it worked just fine.

-Sascha




Fuzzy Search issues using Solr 4.0

2012-07-02 Thread Matteo Diarena
Dear Solr Users,

I'm an enthusiastic solr user since version 1.4. I'm now working on a new
solr based application heavily using fuzzy searches for string matching.

Unfortunately I'm facing a strange problem using fuzzy search and I hope
someone can help me to get more information.

 

I indexed several company names in a field named ENTITY_NAME using the
following parameters in schema.xml

 

.



   

   

   

   



.



.

 

One of these companies is "TS PUBLISHING INC"

Following the list of queries with the returned and the expected result

1)  ENTITY_NAME:(ts AND publising)   => matches, OK

2)  ENTITY_NAME:(ts AND publising~1)  => matches, OK

3)  ENTITY_NAME:(td~1 AND publishing)  => doesn't match, KO (it was
supposed to match)

4)  ENTITY_NAME:(ts AND pablisin~3)=> doesn't match, KO (it was
supposed to match)

 

Why does td~1 not match ts?

Why does pablisin~3 not match publishing?

 

How can I investigate the problem? 

Is there any parameter I can set in solrconfig.xml? 

Is there any tool I can use to see how the automaton is built?
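A first step in investigating is to compute the edit distances involved. The sketch below is a plain Levenshtein implementation, independent of Lucene, and assumes the indexed terms are lowercased. As far as I know, the automaton-based fuzzy matching in Lucene 4.0 supports at most 2 edits, which would explain query 4 if the distance is really 3; it would not explain query 3.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance
    # (insertions, deletions, substitutions each cost 1).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


for query_term, indexed_term in [("td", "ts"),
                                 ("publising", "publishing"),
                                 ("pablisin", "publishing")]:
    print(query_term, indexed_term, levenshtein(query_term, indexed_term))
```

The three pairs come out at distances 1, 1 and 3 respectively, so "pablisin" is three edits away from "publishing" (one substitution and two insertions).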

 

Thanks a lot in advance,

Matteo Diarena
Senior KM Developer - VOLO.com S.r.l.
Via Luigi Rizzo, 8/1 - 20151 MILANO
Fax  +39 02 8945 3500

Tel  +39 02 8945 3023
Cell +39 345 2129244
  m.diar...@volocom.it
  http://www.volocom.it

 



Re: index writer in searchComponent

2012-07-02 Thread Dmitry Kan
Hi Peyman,

It is, at least from your perspective and probably your system design. In our
case, for example, we take a slightly different approach: a user query
is formed on the client side and then caught and pre-processed on a "backend"
side before being sent over to Solr. That "backend" side would then be the
ideal place for any check of a query.

But back to your setup. Could you share more about where you get
the queries from, and why they intermix with your indexing processes? Could
you separate the gathering of queries from indexing? If you could, I can't see
why you wouldn't be able to query your index prior to indexing and figure
out item 3.

This could be an XY problem, but I don't insist.

Regards,
Dmitry

On Mon, Jul 2, 2012 at 12:32 AM, Peyman Faratin wrote:

> Hi Dmitry
> Which SolrJ API would I use to receive the user query? I was under the
> impression the request handler mechanism was the (RESTFUL) interface
> between user query and the index/s.
> thank you
> Peyman
>
> On Jul 1, 2012, at 10:11 AM, Dmitry Kan wrote:
>
> > Hi Peyman,
> >
> > Could you just use solrj api for this purpose? That is, ask via solrj api
> > 1-2 and perform 3 if entity (assuming you mean document or some field
> value
> > by X) didn't exist, i.e. add it to the index.
> >
> > // Dmitry
> >
> > On Sun, Jul 1, 2012 at 6:03 AM, Peyman Faratin  >wrote:
> >
> >> Hi Erik
> >>
> >> The workflow I'd like to implement is
> >>
> >> 1- search the index using the incoming query
> >> 2- the query is of the type "does entity X exist"
> >> 3- if X does not exist in the index then I'd like to add X to the index
> >>
> >> Currently I am using a custom search component to achieve this by
> creating
> >> a solrserver within the init (or inform) method of the search component
> and
> >> using that instance to update (and commit) the index. I am not sure
> this is
> >> the best approach either and thought using the IndexReader of the search
> >> component itself maybe better.
> >>
> >> Is there a better approach in your opinion?
> >>
> >> thank you Erik
> >>
> >> Peyman
> >>
> >> On Jun 30, 2012, at 8:13 PM, Erick Erickson wrote:
> >>
> >>> Lots of the index modification (all of it?) has been removed in 4.0
> >>> from IndexReaders...
> >>>
> >>> It seems like you could always get the directory and open a
> >>> SolrIndexWriter wherever you wanted,
> >>> but I'm not sure it's a good idea, are there other processes that will
> >>> be writing to the index at the
> >>> same time?
> >>>
> >>> What's the purpose here anyway? There might be a better approach
> >>>
> >>> Best
> >>> Erick
> >>>
> >>> On Thu, Jun 28, 2012 at 4:02 PM, Peyman Faratin <
> pey...@robustlinks.com>
> >> wrote:
>  Hi
> 
>  Is it possible to add a new document to the index in a custom
> >> SearchComponent (that also implements a SolrCoreAware)? I can get a
> >> reference to the indexReader via the ResponseBuilder parameter of the
> >> process() method using
> 
>  rb.req.getSearcher().getReader()
> 
>  But is it possible to actually add a new document to the index _after_
> >> searching the index? I.e accessing the indexWriter?
> 
>  thank you
> 
>  Peyman
> >>
> >>
> >
> >
> > --
> > Regards,
> >
> > Dmitry Kan
>
>


-- 
Regards,

Dmitry Kan


RE: Can't find solr.xml

2012-07-02 Thread Noordeen, Roxy
Try adding the file below, and set it to any path you would like to use:

/usr/local/tomcat/conf/Catalina/localhost/solr.xml 

  


Roxy





Re: Dismax Question

2012-07-02 Thread Joel Rosen
I and another user recently posted about this exact same issue.  It sounds
like maybe this is a new bug introduced in 3.6:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E

I've managed to figure out a fix that is working well enough for my own
application right now.  I set autoGeneratePhraseQueries to "true" on my
field, and also set qs=2.  The high query slop value simulates the AND
behavior that I want since my documents are relatively short, but this is
obviously not the correct solution, and I don't know if there are any
performance issues with using really high query slop values.

On Mon, Jul 2, 2012 at 9:16 AM, Steve Fatula wrote:

>
> >From: Ahmet Arslan 
> >To: solr-user@lucene.apache.org; Steve Fatula 
> >Sent: Monday, July 2, 2012 6:22 AM
> >Subject: Re: Dismax Question
> >
> >> So, my question is how do we get Solr search to work with
> >> AND when it is splitting words? The splitting part is good,
> >> the bad part is that it is searching for any one of those
> >> split words.
> >
> >Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.
> >
> > autoGeneratePhraseQueries="true">
> >
> >I set mm to 100%, no effect at all. It works only for words typed in that
> are separated already. Remember, the example here is:
> >
> >
> >DualHead2Go finds all kinds of matches (it splits into dual head 2 go)
> >
> >
> >Dual Head 2 Go finds the correct matches, indicating it is ANDing them
> based on q.op, defaultOperator, and mm.


query filter with OR reduces results

2012-07-02 Thread Breese,John
I'm trying to figure out why adding an "OR" condition to a query filter would 
end up restricting results, instead of unioning them? I would expect to get 
back more results by adding a union to the filter query.


Case 1 – 380 Results

380 results are returned with a query of xyz with a query filter of

(NOT sentence_id:[* TO *])


Case 2 – 41 Results

41 results are returned with the same query and a filter query of

(usage_flags:PRESCRIPTION)


Case 3 – Union of Case 1 and 2

Only 41 results are returned using the same query and a filter query of

(NOT sentence_id:[* TO *])

OR

(usage_flags:PRESCRIPTION)



I would expect that I would get at least 380 results with the union, instead of 
the 41 that I am seeing.

I only really notice this issue when using the NOT operator, so perhaps the 
problem lies with not understanding quite how the NOT operator is applied here. 
Any ideas?


I'm testing using the Solr /admin/form.jsp

Solr Implementation Version: 3.6.0 1310449 - rmuir - 2012-04-06 11:34:53

The default query parser is EDisMax, though I don't think that will change the 
filter queries.

--
CONFIDENTIALITY NOTICE This message and any included attachments are from 
Cerner Corporation and are intended only for the addressee. The information 
contained in this message is confidential and may constitute inside or 
non-public information under international, federal, or state securities laws. 
Unauthorized forwarding, printing, copying, distribution, or use of such 
information is strictly prohibited and may be unlawful. If you are not the 
addressee, please promptly delete this message and notify the sender of the 
delivery error by e-mail or you may call Cerner's corporate offices in Kansas 
City, Missouri, U.S.A at (+1) (816)221-1024.


Re: How to improve this solr query?

2012-07-02 Thread Michael Della Bitta
Hi Chamnap,

The first thing that jumped out at me was "facet.mincount=1". Are you
sure you need this? Increasing this number should drastically improve
speed.

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Mon, Jul 2, 2012 at 12:35 PM, Chamnap Chhorn  wrote:
> Hi all,
>
> I'm using Solr 3.5 with a nested query on a 4-core CPU server with 17 GB. The
> problem is that my query is very slow; the average response time is 12 secs
> against 13 million documents.
>
> What I am doing is sending a quoted string (q2) to string fields and a
> non-quoted string (q1) to other fields, then combining the results.
>
> facet=true&sort=score+desc&q2=*"apartment"*&facet.mincount=1&q1=*apartment*
> &tie=0.1&q.alt=*:*&wt=json&version=2.2&rows=20&fl=uuid&facet.query=has_map:+true&facet.query=has_image:+true&facet.query=has_website:+true&start=0&q=
> *
> _query_:+"{!dismax+qf='.'+fq='..'+v=$q1}"+OR+_query_:+"{!dismax+qf='..'+fq='...'+v=$q2}"
> *
> &facet.field={!ex%3Ddt}sub_category_uuids&facet.field={!ex%3Ddt}location_uuid
>
> I have already run a Solr optimize, but it's still slow. Any idea how to
> improve the speed? Have I done anything wrong?
>
> --
> Chhorn Chamnap
> http://chamnap.github.com/


Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Hi Michael,

Thanks for the quick response. Based on the documentation, "facet.mincount" means
that Solr will return only facet values whose count is at least that number. For
me, I just want to ensure my facet counts don't include zero values.

I tried increasing it to 10, but it is still slow, even for the same query.

Actually, those 13 million documents are divided into 200 portals. I
already include "fq=portal_uuid: kjkjkjk" inside each nested query, but
it's still slow.




-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: edismax parser ignores mm parameter when tokenizer splits tokens (hypenated words, WDF splitting etc)

2012-07-02 Thread Tom Burton-West
Opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-3589, which
also lists a couple other related mailing list posts.




On Thu, Jun 28, 2012 at 12:18 PM, Tom Burton-West wrote:

> Hello,
>
> My previous e-mail with a CJK example has received no replies.   I
> verified that this problem also occurs for English.  For example, in the
> case of the word "fire-fly", the ICUTokenizer and the WordDelimiterFilter
> both split this into two tokens, "fire" and "fly".
>
> With an edismax query and a must match of 2 (q={!edismax mm=2}), if the
> words are entered separately as [fire fly], the edismax parser honors the
> mm parameter and does the equivalent of a Boolean AND query.  However, if
> the words are entered as a hyphenated word [fire-fly], the tokenizer splits
> these into two tokens "fire" and "fly" and the edismax parser does the
> equivalent of a Boolean OR query.
>
> I'm not sure I understand the output of the debugQuery, but judging by the
> number of hits returned it appears that edismax is not honoring the mm
> parameter. Am I missing something, or is this a bug?
>
>  I'd like to file a JIRA issue, but want to find out if I am missing
> something here.
>
> Details of several queries are appended below.
>
> Tom Burton-West
>
> edismax query mm=2   query with hyphenated word [fire-fly]
>
> 
> {!edismax mm=2}fire-fly
> {!edismax mm=2}fire-fly
> +DisjunctionMaxQuery(((ocr:fire ocr:fly)))
> +((ocr:fire ocr:fly))
>
>
> Entered as separate words [fire fly]  numFound="184962"
>  edismax mm=2
> 
> {!edismax mm=2}fire fly
> {!edismax mm=2}fire fly
> 
> +((DisjunctionMaxQuery((ocr:fire)) DisjunctionMaxQuery((ocr:fly)))~2)
> 
>
> Regular Boolean AND query:   [fire AND fly] numFound="184962"
> fire AND fly
> fire AND fly
> +ocr:fire +ocr:fly
> +ocr:fire +ocr:fly
>
> Regular Boolean OR query: fire OR fly 366047  numFound="366047"
> 
> fire OR fly
> fire OR fly
> ocr:fire ocr:fly
> ocr:fire ocr:fly
>
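
The difference between the two parsed queries quoted above can be illustrated with a toy model. This is only a sketch in Python, not Solr or Lucene code: the function names and the two sample documents are made up for illustration, and it assumes that mm counts top-level optional clauses (which is what the ~2 in the [fire fly] parsed query suggests) and is capped at the number of clauses.

```python
def matches_any(doc_terms, terms):
    """One optional clause: true if the document contains any of its terms."""
    return any(t in doc_terms for t in terms)


def min_should_match(doc_terms, clauses, mm):
    """Toy minimum-should-match: at least mm top-level clauses must match.

    Assumption: mm is capped at the number of clauses, so a query with a
    single clause can never require more than one match.
    """
    required = min(mm, len(clauses))
    hits = sum(1 for clause in clauses if matches_any(doc_terms, clause))
    return hits >= required


# Hypothetical documents, reduced to the set of terms they contain.
docs = {
    "both": {"fire", "fly"},
    "fire_only": {"fire", "truck"},
}

# [fire fly]: two top-level clauses, so mm=2 behaves like AND.
separate = [["fire"], ["fly"]]
# [fire-fly]: the split happens inside ONE clause, so mm only sees one clause.
hyphenated = [["fire", "fly"]]

for name, terms in docs.items():
    print(name,
          min_should_match(terms, separate, 2),
          min_should_match(terms, hyphenated, 2))
```

In this model the document that contains only "fire" is rejected for [fire fly] but accepted for [fire-fly], which is consistent with the much larger hit count reported for the hyphenated query.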


Re: Dismax Question

2012-07-02 Thread Steve Fatula
From: Joel Rosen 

To: solr-user@lucene.apache.org; Steve Fatula  
>Cc: Ahmet Arslan ; Tom Burton-West  
>Sent: Monday, July 2, 2012 10:31 AM
>Subject: Re: Dismax Question
> 
>
>I and another user recently posted about this exact same issue.  It sounds 
>like maybe this is a new bug introduced in 3.6:
>
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E
>
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E
>
>That sounds like the same thing, I've noticed searching for SEP-100 actually 
>does a SEP or 100, even though it's supposed to be AND. 

Has a bug report been filed? 

I've managed to figure out a fix that is working well enough for my own 
application right now.  I set autoGeneratePhraseQueries to "true" on my field, 
and also set qs=2.  The high query slop value simulates the AND behavior 
that I want since my documents are relatively short, but this is obviously not 
the correct solution, and I don't know if there are any performance issues with 
using really high query slop values.
>
>Ok, so, I need to figure out autoGeneratePhraseQueries I guess. The way I 
>understand it, if I search for WORD1 WORD2, it will only find "WORD1 WORD2", 
>is that correct? That would be bad if so since I'd really want WORD1 AND 
>WORD2, but not the phrase.

Trying to find some good doc for this feature!

Re: Dismax Question

2012-07-02 Thread Steve Fatula


From: Joel Rosen 
>To: solr-user@lucene.apache.org; Steve Fatula  
>Cc: Ahmet Arslan ; Tom Burton-West  
>Sent: Monday, July 2, 2012 10:31 AM
>Subject: Re: Dismax Question
> 
>I and another user recently posted about this exact same issue.  It sounds
>like maybe this is a new bug introduced in 3.6:
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E
>
>Does anyone happen to know if 3.5 (we went from 3.4 to 3.6) happens to have 
>the problem? If not, we'd probably revert since we can't deal with the 
>millions of extra search results that should not be there.

Check if input xml file is ok?

2012-07-02 Thread Bruno Mannina

Dear All,

I have several input files to index, containing more than 17,000 docs.

Sometimes indexing fails because there is a mistake in the XML structure 
(like a ]]> inside a CDATA field).


Is there a "test.jar" somewhere that I can run before running post.jar?

Thanks a lot,
Bruno
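
One possible pre-check, sketched here in Python with the standard library rather than with any Solr tool (the function name and the sample strings are hypothetical): parse each file's content before posting it and report the first parse error. This only checks that the XML is well formed; it does not check that it is a valid Solr update message.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column of the first problem.
        return False, str(err)
    return True, ""


# Hypothetical inputs: one well-formed document and one with a missing </doc>.
good = "<add><doc><field name='id'>1</field></doc></add>"
bad = "<add><doc><field name='id'>1</field></add>"

for label, text in (("good", good), ("bad", bad)):
    ok, message = check_well_formed(text)
    print(label, ok, message)
```

For files on disk, the same idea applies with ET.parse(path) in place of ET.fromstring(text).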


Re: Trunk error in Tomcat

2012-07-02 Thread Erik Hatcher
Interestingly, I just logged the issue of it not showing the right error in the 
UI here: 

As for your specific issue, not sure, but the error should at least also show 
in the admin view.

Erik


On Jul 2, 2012, at 18:59 , Briggs Thompson wrote:

> Hi All,
> 
> I just grabbed the latest version of trunk and am having a hard time
> getting it running properly in tomcat. It does work fine in Jetty. The
> admin screen gives the following error:
> This interface requires that you activate the admin request handlers, add
> the following configuration to your  Solrconfig.xml
> 
> I am pretty certain the front end error has nothing to do with the actual
> error. I have seen some other folks on the distro with the same problem,
> but none of the threads have a solution (that I could find). Below is the
> stack trace. I also tried with different versions of Lucene but none
> worked. Note: my index is EMPTY and I am not migrating over an index build
> with a previous version of lucene. I think I ran into this a while ago with
> an earlier version of trunk, but I don't recall doing anything to fix it.
> Anyhow, if anyone has an idea with this one, please let me know.
> 
> Thanks!
> Briggs Thompson
> 
> SEVERE: null:java.lang.NoSuchFieldError: LUCENE_50
> at
> org.apache.solr.analysis.SynonymFilterFactory$1.createComponents(SynonymFilterFactory.java:83)
> at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:83)
> at
> org.apache.lucene.analysis.synonym.SynonymMap$Builder.analyze(SynonymMap.java:120)
> at
> org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:99)
> at
> org.apache.lucene.analysis.synonym.SolrSynonymParser.add(SolrSynonymParser.java:70)
> at
> org.apache.solr.analysis.SynonymFilterFactory.loadSolrSynonyms(SynonymFilterFactory.java:131)
> at
> org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:93)
> at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:584)
> at org.apache.solr.schema.IndexSchema.(IndexSchema.java:112)
> at org.apache.solr.core.CoreContainer.create(CoreContainer.java:812)
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:510)
> at org.apache.solr.core.CoreContainer.load(CoreContainer.java:333)
> at
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:282)
> at
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:101)
> at
> org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
> at
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
> at
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
> at
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:103)
> at
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4649)
> at
> org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5305)
> at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
> at
> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
> at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:618)
> at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:963)
> at
> org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1600)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)



DIH - unable to ADD individual new documents

2012-07-02 Thread Klostermeyer, Michael
I am not able to ADD individual documents via the DIH, but updating works as 
expected.   The stored procedure that is called within the DIH returns the 
expected data for the new document, and Solr appears to "do its thing", but the 
document never makes it to the Solr server, as evidenced by subsequent queries 
not returning it.

Is there a trick to adding new documents using the DIH?

Mike



Re: how do I search the archives for solr-user

2012-07-02 Thread Chris Hostetter


http://lucene.apache.org/solr/discussion.html#mail-archives


-Hoss


Re: Using custom user-defined caches to store user app data while indexing

2012-07-02 Thread Chris Hostetter

: If you implement SolrCoreAware interface in your custom
: UpdateRequestProcessorFactory, you could then access your cache via Solr
: Core in the inform method, I think. Haven't tried it myself, but it looks
: logical to me to start from there.

right ... but you really only need to be SolrCoreAware if you have to 
access the SolrCore during "initialization"

You can also access the SolrCore from the SolrQueryRequest -- which is 
probably all you need for your UpdateRequestProcessorFactory.


-Hoss


RE: DIH - unable to ADD individual new documents

2012-07-02 Thread Klostermeyer, Michael
I should add that I am using the full-import command in all cases, and setting 
clean=false for the individual adds.

Mike





How to space between spatial search results? (Declustering)

2012-07-02 Thread mcb
I have a classic spatial search schema in Solr with a lat_long field. Is
there a way to do a bounding-box type search that will pull back results that
are uniformly distributed, so there isn't the case of 10 pins being on top of
each other? I.e., if I have a 100-mile box, no result will be returned within 5
miles or so of another?

Does anyone have any insight or experience with this? 

Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-space-between-spatial-search-results-Declustering-tp3992668.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: DIH - unable to ADD individual new documents

2012-07-02 Thread Klostermeyer, Michael
The URL I am using is 
http://localhost/solr/dataimport?commit=true&wt=json&clean=false&uniqueID=2028046&command=full%2Dimport&entity=myEntityName

uniqueID is the ID of the newly created DB record.  This ID gets passed to the 
stored procedure and returns the expected data when I run the SP directly.
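
For reference, the same request can be assembled from its parameters, which makes it easier to see what is being sent. This is a minimal sketch in Python; the host, handler path and parameter values are copied from the URL above, while the function name is made up for illustration. It assumes the DIH configuration reads the custom uniqueID request parameter itself (for example as ${dataimporter.request.uniqueID}), which is not shown in this thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://localhost/solr/dataimport"  # host and path from the post


def build_import_url(unique_id, entity):
    """Build a full-import request that keeps the existing index (clean=false)."""
    params = {
        "command": "full-import",
        "clean": "false",    # do not delete existing documents first
        "commit": "true",    # commit when the import finishes
        "wt": "json",
        "entity": entity,
        "uniqueID": str(unique_id),  # custom parameter passed through to the config
    }
    return BASE_URL + "?" + urlencode(params)


url = build_import_url(2028046, "myEntityName")
print(url)

# Round-trip check: the query string decodes back to the intended values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["command"], decoded["clean"])
```

The hyphen in full-import does not need escaping, so full-import and full%2Dimport decode to the same value.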

Mike





Re: How to improve this solr query?

2012-07-02 Thread Lance Norskog
Wildcards are slow. Leading wildcards are even slower. Is there
some way to search that data differently? If it is a string, can you
change it to a text field and make sure 'apartment' is a separate
word?



-- 
Lance Norskog
goks...@gmail.com


Re: query filter with OR reduces results

2012-07-02 Thread Jack Krupansky

Purely negative queries are problematic. You need to add a "*:*":

(*:* NOT sentence_id:[* TO *])

Meaning all documents except those with values in the sentence_id field.
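
Why the union shrinks can be shown with a toy model of boolean matching. This is a sketch in Python with made-up document ids, not Solr code; it assumes that prohibited (NOT) clauses only subtract from what the positive clauses of the same sub-query matched, so a nested sub-query with no positive clause matches nothing. A purely negative query used on its own as the whole filter (Case 1) appears to be handled specially, which would explain why it still returns results.

```python
def boolean_query(should=(), must_not=()):
    """Toy boolean query over sets of document ids.

    Prohibited clauses only remove documents that a positive clause
    already matched; with no positive clause the result is empty.
    """
    matched = set()
    for clause in should:
        matched |= clause
    for clause in must_not:
        matched -= clause
    return matched


# Hypothetical index of ten documents.
all_docs = set(range(1, 11))
has_sentence_id = {1, 2, 3, 4}   # sentence_id:[* TO *]
prescription = {3, 4, 5}         # usage_flags:PRESCRIPTION

# (NOT sentence_id:[* TO *]) as a nested clause: nothing to subtract from.
pure_negative = boolean_query(must_not=[has_sentence_id])
# (*:* NOT sentence_id:[* TO *]): start from all documents, then subtract.
fixed_negative = boolean_query(should=[all_docs], must_not=[has_sentence_id])

broken_union = pure_negative | prescription
fixed_union = fixed_negative | prescription
print(len(broken_union), len(fixed_union))
```

In the model, the broken union is exactly the PRESCRIPTION set, matching the 41 results in Case 3, while the version with *:* returns the larger union.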

-- Jack Krupansky




Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Hi Lance,

I didn't use wildcards at all. This is a normal text search only. I need a
string field because it needs to be matched exactly, and the value is
sometimes multi-word, so quoting it is necessary.

By the way, if I do a super plain query, it takes at least 600ms. I'm not
sure why. On another solr instance with similar amount of data, it takes
only 50ms.

I see something strange on the response, there is always

build

What does that mean?




-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: DIH - unable to ADD individual new documents

2012-07-02 Thread Gora Mohanty
On 3 July 2012 07:54, Klostermeyer, Michael
 wrote:
> I should add that I am using the full-import command in all cases, and 
> setting clean=false for the individual adds.

What does the data-import page report at the end of the
full-import, i.e., how many documents were indexed?
Are there any error messages in the Solr logs? Please
share with us your DIH configuration file, and Solr
schema.xml.

Regards,
Gora


Re: How to improve this solr query?

2012-07-02 Thread Lance Norskog
&q2=*"apartment"*
q1=*apartment*

These are wildcards



-- 
Lance Norskog
goks...@gmail.com


Re: How to improve this solr query?

2012-07-02 Thread Chamnap Chhorn
Lance, I didn't use wildcards at all. I use only this; the only difference is
whether it is quoted or not.

q2=*"apartment"*
q1=*apartment*




-- 
Chhorn Chamnap
http://chamnap.github.com/