How to approach to analyze Solr Edismax Query log

2012-09-14 Thread Fumio Takayama
Hi all,

I provide a search service built on Solr, and searches are issued using the
edismax query parser.

I would like to analyze Solr's query logs to understand what kinds of search
terms users are entering. To do this analysis, I am investigating whether I
can re-parse the logged queries using Solr's ExtendedDismaxQParserPlugin.

I have two questions:

- Is this approach right?
- If it is, how do I use ExtendedDismaxQParserPlugin directly
  (initialization, method parameters, etc.)?

Could someone help me?

Regards,

Fumio Takayama


Solr cloud in 4.0 with NRT performance

2012-09-14 Thread samarth s
Hi,

I am currently using features like faceting and grouping/collapsing on Solr
3.6. Writes are user driven, and hence updates are expected to be visible in
real time, or at least near real time. These updates should be reflected
consistently in facet and group results as well. Also, to handle the query
load, I may have to use replication/sharding, with or without SolrCloud.

I am planning to migrate to Solr 4.0 and use its NRT (soft commit) and
SolrCloud (ZooKeeper-based) features to achieve the above requirements.

Is a SolrCloud with a replication level greater than 1 capable of giving
NRT results?
If yes, do these NRT results work with all kinds of querying, like
faceting and grouping?

It would be great if someone could share their insights and numbers on
these questions.
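For reference, the NRT piece in Solr 4.0 is configured in solrconfig.xml; a minimal sketch (the interval values here are arbitrary illustrations, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flushes to stable storage, does not open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: makes documents visible to searchers in near real time -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

Facet and group results are computed against whatever searcher is current, so they reflect documents as of the last (soft or hard) commit.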

-- 
Regards,
Samarth


Re: Can i have more than one field as defaultSearchField in schema.xml

2012-09-14 Thread Jack Krupansky

As of Solr 3.6, defaultSearchField is deprecated:

* SOLR-2724: Specifying <defaultSearchField/> and <solrQueryParser
  defaultOperator="..."/> in schema.xml is now considered deprecated.
  Instead you are encouraged to specify these via the "df" and "q.op"
  parameters in your request handler definition.  (David Smiley)

So, you can specify the "df" request parameter on each request handler in 
your solrconfig, and each can be different.


Or the edismax query parser can be used with its "qf" parameter which 
specifies a list of fields.
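As a sketch of what that looks like in solrconfig.xml (the handler names and field names below are made up for illustration):

```xml
<!-- per-handler default search field for the standard query parser -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">techskill</str>
  </lst>
</requestHandler>

<!-- edismax searches several fields at once, with optional per-field boosts -->
<requestHandler name="/multi" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">techskill^2 title description</str>
  </lst>
</requestHandler>
```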


What is your specific use case?

-- Jack Krupansky

-Original Message- 
From: veena rani

Sent: Friday, September 14, 2012 1:30 AM
To: solr-user@lucene.apache.org
Subject: Can i have more than one field as defaultSearchField in schema.xml

Hi,

Can I have more than one field as defaultSearchField in schema.xml?
This is my default search field in Solr:

<defaultSearchField>techskill</defaultSearchField>

But I need to add one more field to the default search field.

--
Regards,
Veena.
Banglore. 



Re: How to approach to analyze Solr Edismax Query log

2012-09-14 Thread Jack Krupansky
Are you trying to re-parse the queries that you extract from the log to 
determine the query terms?


You might look at how the highlighter works since it accesses the query 
terms.


-- Jack Krupansky

-Original Message- 
From: Fumio Takayama

Sent: Friday, September 14, 2012 4:39 AM
To: solr-user@lucene.apache.org
Subject: How to approach to analyze Solr Edismax Query log

Hi all,

I provide a search service built on Solr, and searches are issued using the
edismax query parser.

I would like to analyze Solr's query logs to understand what kinds of search
terms users are entering. To do this analysis, I am investigating whether I
can re-parse the logged queries using Solr's ExtendedDismaxQParserPlugin.

I have two questions:

- Is this approach right?
- If it is, how do I use ExtendedDismaxQParserPlugin directly
  (initialization, method parameters, etc.)?

Could someone help me?

Regards,

Fumio Takayama 



Re: DIH import from MySQL results in garbage text for special chars

2012-09-14 Thread Erick Erickson
Is your _browser_ set to handle the appropriate character set? Or whatever
you're using to inspect your data? How about your servlet container?



Best
Erick
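If Tomcat is in the picture, one common culprit is the HTTP connector's URI encoding. A hedged example of forcing UTF-8 in Tomcat's server.xml (the attributes other than URIEncoding are the stock defaults):

```xml
<!-- server.xml: make Tomcat decode request URIs as UTF-8 -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
```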

On Mon, Sep 10, 2012 at 7:47 AM, Pranav Prakash  wrote:
> Hi Folks,
>
> I am attempting to import documents to Solr from MySQL using DIH. One of
> the field contains the text - “Future of Mobile Value Added Services (VAS)
> in Australia” .Notice the character “ and ”.
>
> When I am importing, it gets stored as - “Future of Mobile Value Added
> Services (VAS) in Australia�.
>
> The datasource config clearly mentions use of UTF-8 as follows:
>
> <dataSource type="JdbcDataSource"
>             driver="com.mysql.jdbc.Driver"
>             url="jdbc:mysql://localhost/ohapp_devel"
>             user="username"
>             useUnicode="true"
>             characterEncoding="UTF-8"
>             password="password"
>             zeroDateTimeBehavior="convertToNull"
>             name="app" />
>
>
> A plain SQL Select statement on the MySQL Console gives appropriate text. I
> even tried using following scriptTransformer to get rid of this char, but
> it was of no particular use in my case.
>
> function gsub(source, pattern, replacement) {
>   var match, result;
>   if (!((pattern != null) && (replacement != null))) {
> return source;
>   }
>   result = '';
>   while (source.length > 0) {
> if ((match = source.match(pattern))) {
>   result += source.slice(0, match.index);
>   result += replacement;
>   source = source.slice(match.index + match[0].length);
> } else {
>   result += source;
>   source = '';
> }
>   }
>   return result;
> }
>
> function fixQuotes(c){
>   c = gsub(c, /\342\200(?:\234|\235)/,'"');
>   c = gsub(c, /\342\200(?:\230|\231)/,"'");
>   c = gsub(c, /\342\200\223/,"-");
>   c = gsub(c, /\342\200\246/,"...");
>   c = gsub(c, /\303\242\342\202\254\342\204\242/,"'");
>   c = gsub(c, /\303\242\342\202\254\302\235/,'"');
>   c = gsub(c, /\303\242\342\202\254\305\223/,'"');
>   c = gsub(c, /\303\242\342\202\254"/,'-');
>   c = gsub(c, /\342\202\254\313\234/,'"');
>   c = gsub(c, /“/, '"');
>   return c;
> }
>
> function cleanFields(row){
>   var fieldsToClean = ['title', 'description'];
>   for(i =0; i< fieldsToClean.length; i++){
> var old_text = String(row.get(fieldsToClean[i]));
> row.put(fieldsToClean[i], fixQuotes(old_text) );
>   }
>   return row;
> }
>
> My understanding is that this must be a very common problem. It also
> occurs with human names which contain these characters. What is the right
> way to get the text indexed and made searchable correctly? The fieldtype
> where this is stored goes as follows:
>
>   <fieldType ...>
>     <analyzer>
>       <filter class="solr.SnowballPorterFilterFactory"
>               protected="protwords.txt"/>
>       <filter class="solr.SynonymFilterFactory"
>               synonyms="synonyms.txt"
>               ignoreCase="true"
>               expand="true" />
>       <filter class="solr.StopFilterFactory"
>               words="stopwords_en.txt"
>               ignoreCase="true" />
>       <filter class="solr.StopFilterFactory"
>               words="stopwords_en.txt"
>               ignoreCase="true" />
>       <filter class="solr.WordDelimiterFilterFactory"
>               generateWordParts="1"
>               generateNumberParts="1"
>               catenateWords="1"
>               catenateNumbers="1"
>               catenateAll="0"
>               preserveOriginal="1" />
>     </analyzer>
>   </fieldType>
>
>
> *Pranav Prakash*
>
> "temet nosce"


Re: solr issue with seaching words

2012-09-14 Thread Erick Erickson
And you have SnowballPorterFilterFactory in your analysis chain which
is transforming "jacke" into "jack".

Best
Erick

On Mon, Sep 10, 2012 at 7:15 AM, zainu  wrote:
> http://lucene.472066.n3.nabble.com/file/n4006583/Unbenannt.png
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-issue-with-seaching-words-tp4005200p4006583.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Atomic Updates, Payloads, Non-stored data

2012-09-14 Thread Erick Erickson
You can't. This is the whole "stacked segment" discussion.

But you might consider external file fields; sometimes they
can work out...

Best
Erick
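For what it's worth, an external file field is declared roughly like the sketch below (the field and type names are illustrative). The values live in a plain text file named external_<fieldname> in the index's data directory, and can be updated without reindexing the documents:

```xml
<!-- schema.xml: per-document float values kept outside the inverted index;
     usable in function queries, but not searchable or returnable -->
<fieldType name="ratingFile" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="user_rating" type="ratingFile" indexed="false" stored="false"/>
```

The data file then holds `key=value` lines, one per document, reloaded on commit.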

On Mon, Sep 10, 2012 at 1:19 PM, jimtronic  wrote:
> Hi,
>
> I'm using payloads to tie a value to an attribute for a document -- eg a
> user's rating for a document. I do not store this data, but I index it and
> access the value through function queries.
>
> I was really excited about atomic updates, but they do not work for me
> because they are blowing out all of my non-stored payload data.
>
> I can make the fields stored, but that is not desirable as in some cases
> there's a lot of data.
>
> I was wondering how feasible it would be for me to modify the
> DistributedUpdateProcessor so that it preserves my non-stored payloads while
> performing the atomic updates.
>
> Thanks! Jim
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Atomic-Updates-Payloads-Non-stored-data-tp4006678.html
> Sent from the Solr - User mailing list archive at Nabble.com.


How does Solr handle overloads so well?

2012-09-14 Thread Mike Gagnon
Hi,

I have been studying how server software responds to requests that cause
CPU overloads (such as infinite loops).

In my experiments I have observed that Solr performs unusually well when
subjected to such loads. Every other piece of web software I've
experimented with drops to zero service under such loads. Do you know how
Solr achieves such good performance?

I am guessing that when Solr is overloaded it sheds load to make room for
incoming requests, but I could not find any documentation that describes
Solr's overload strategy.

Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB RAM,
using it to index and search about 10,000 pages from MediaWiki. I tested both
Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
of 300 requests per second. At the same time, I submitted "overload
requests" at a rate of 60 requests per second. Each overload request caused
an infinite loop in Solr via https://issues.apache.org/jira/browse/SOLR-2631
 .

With Jetty about 70% of non-overload requests completed --- 95% of requests
completing within 0.6 seconds.
With Tomcat about 34% of non-overload requests completed --- 95% of
requests completing within 0.6 seconds.

I also ran Solr+Jetty with non-overload requests coming in at 65 requests per
second (overload requests remaining at 60 requests per second). In this
workload, the completion rate drops to 15% and the 95th percentile latency
increases to 25 seconds.

Cheers,
Mike Gagnon


Re: Solr search not working after copying a new field to an existing Indexed Field

2012-09-14 Thread Erick Erickson
This should not be the case. Any time you add a document with
a pre-existing <uniqueKey>, the old document is completely
replaced by the new document.

You do have to issue a commit before you'll see the new info
though. And if you do not have a <uniqueKey> defined, you'll
see two copies of the doc.

And sometimes your browser cache will fool you...

Best
Erick
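To make the replacement visible, an explicit commit has to reach Solr. A sketch using the XML update format (field names are illustrative; the add and the commit are sent as two separate update requests):

```xml
<!-- request 1: re-add the document under the same unique key -->
<add>
  <doc>
    <field name="id">doc-42</field>
    <field name="name">new name</field>
  </doc>
</add>

<!-- request 2: open a new searcher so queries see the change -->
<commit/>
```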

On Tue, Sep 11, 2012 at 8:21 PM, Mani  wrote:
> Eric,
>  "When you add a doc with the same unique key as an old doc,
> the data associated with the first version of the doc is entirely
> thrown away and its as though you'd never indexed it at all", I did exactly
> the same. The old doc and new doc there is not a change except the Name has
> changed. When I query Solr for the document, I do see the Name field with
> the correct recent changes. However if I search for for the new name, I do
> not get the result. So I removed all the documents entirely and then added
> the same new document. It worked. Not sure if this is a bug.
>
> So whenever I add a new field to an existing search field, the document
> needs to be thrown away (not just adding with the same key as its not
> working in my case) for the search to take effect.
>
> Thanks
>
>
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-search-not-working-after-copying-a-new-field-to-an-existing-Indexed-Field-tp4005993p4007096.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to approach to analyze Solr Edismax Query log

2012-09-14 Thread Fumio Takayama
Hi, Jack

> Are you trying to re-parse the queries that you extract from the log to
> determine the query terms?

Yes, I am trying to re-parse the queries from the log.

> You might look at how the highlighter works since it accesses the query
> terms.

Thanks for your help. I will look at how the highlighter works.

Regards

Fumio Takayama

2012/9/14 Jack Krupansky 

> Are you trying to re-parse the queries that you extract from the log to
> determine the query terms?
>
> You might look at how the highlighter works since it accesses the query
> terms.
>
> -- Jack Krupansky
>
> -Original Message- From: Fumio Takayama
> Sent: Friday, September 14, 2012 4:39 AM
> To: solr-user@lucene.apache.org
> Subject: How to approach to analyze Solr Edismax Query log
>
>
> Hi all,
>
> I provide a search service built on Solr, and searches are issued using the
> edismax query parser.
>
> I would like to analyze Solr's query logs to understand what kinds of
> search terms users are entering. To do this analysis, I am investigating
> whether I can re-parse the logged queries using Solr's
> ExtendedDismaxQParserPlugin.
>
> I have two questions:
>
> - Is this approach right?
> - If it is, how do I use ExtendedDismaxQParserPlugin directly
>   (initialization, method parameters, etc.)?
>
> Could someone help me?
>
> Regards,
>
> Fumio Takayama
>


Broken highlight truncation for hl.alternateField

2012-09-14 Thread Arcadius Ahouansou
Hello.

I am using the fastVectorHighlighter in Solr 3.5 to highlight and truncate
the summary of my results.

The standard breakIterator is being used with hl.bs.type = WORD as
per http://lucidworks.lucidimagination.com/display/solr/Highlighting

Search is being performed on the document title and summary.

In my edismax request handler, I have as defaults:

<bool name="hl">true</bool>
<str name="hl.fl">summary</str>
<str name="f.summary.hl.alternateField">summary</str>


A simplified query looks like this:

/solr/search?q=help&hl=true&f.summary.hl.fragsize=250&f.summary.hl.maxAlternateFieldLength=250

So, I am truncating only the summary.

1- When a search term is found in the description, everything works as
expected: the summary is truncated and contains whole words only (the
breakIterator is applied properly).

2- However, when there is no match in the summary, then
f.summary.hl.alternateField kicks in and the summary returned is often
truncated in the middle of a word (i.e. we may get "peo" instead of
"people").
This leads me to suppose that the breakIterator is not applied to
f.summary.hl.alternateField.


My question is: how can I get whole-word truncation when the summary is
fetched from f.summary.hl.alternateField? (i.e. no match in the summary)
Or is there any other way I could get proper truncation when there is no
match in the summary?


 Thank you very much.

Arcadius


Re: Multiple structured datasource(rss,db,xml) in single schema.xml

2012-09-14 Thread nishi
Is what I am trying to do something Solr doesn't support, or has no one
done it this way? Please suggest whether my requirement and my approach
are do-able with Solr.

Thanks in advance.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-structured-datasource-rss-db-xml-in-single-schema-xml-tp4007497p4007799.html
Sent from the Solr - User mailing list archive at Nabble.com.


Accidental multivalued fields?

2012-09-14 Thread Travis Low
Greetings.  I am using Solr 3.4.0 with tomcat 7.0.22.  I've been using
these versions successfully for a while, but on my latest project, I cannot
sort ANY field without getting this exception:

SEVERE: org.apache.solr.common.SolrException: can not sort on multivalued
field: id
at
org.apache.solr.schema.SchemaField.checkSortability(SchemaField.java:161)
at org.apache.solr.schema.TrieField.getSortField(TrieField.java:126)
at
org.apache.solr.schema.SchemaField.getSortField(SchemaField.java:144)
at
org.apache.solr.search.QueryParsing.parseSort(QueryParsing.java:385)
at org.apache.solr.search.QParser.getSort(QParser.java:251)
at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:82)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
[snip]

The thing is, I have only one multivalued field in my schema, at least, I
thought so.  I even tried sorting on id, which is the unique key, and got
the same error.  Here are the fields in my schema:

[the field definitions were stripped of their markup in the archive]

<uniqueKey>id</uniqueKey>

I can post the entire schema.xml if need be.  Can anyone please tell me
what's going on?

cheers,

Travis

-- 

**

*Travis Low, Director of Development*


** * *

*Centurion Research Solutions, LLC*

*14048 ParkEast Circle *•* Suite 100 *•* Chantilly, VA 20151*

*703-956-6276 *•* 703-378-4474 (fax)*

*http://www.centurionresearch.com* 

**The information contained in this email message is confidential and
protected from disclosure.  If you are not the intended recipient, any use
or dissemination of this communication, including attachments, is strictly
prohibited.  If you received this email message in error, please delete it
and immediately notify the sender.

This email message and any attachments have been scanned and are believed
to be free of malicious software and defects that might affect any computer
system in which they are received and opened. No responsibility is accepted
by Centurion Research Solutions, LLC for any loss or damage arising from
the content of this email.


Re: Accidental multivalued fields?

2012-09-14 Thread Chris Hostetter

: Greetings.  I am using Solr 3.4.0 with tomcat 7.0.22.  I've been using
: these versions successfully for a while, but on my latest project, I cannot
: sort ANY field without getting this exception:
: 
: SEVERE: org.apache.solr.common.SolrException: can not sort on multivalued

...

: The thing is, I have only one multivalued field in my schema, at least, I
: thought so.  I even tried sorting on id, which is the unique key, and got
: the same error.  Here are the fields in my schema:

a) multiValued can be set on fieldType and is then inherited by the fields

b) Check the "version" property on your <schema> tag.  If the value is 
"1.0" then all fields are assumed to be multiValued.

Here's the comment from the example schema included with Solr 3.4...

  <!-- version="x.y" is Solr's version number for the schema syntax and
       semantics.  It should not normally be changed by applications.
       1.0: multiValued attribute did not exist, all fields are multiValued by nature
       1.1: multiValued attribute introduced, false by default
       1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
       1.3: removed optional field compress feature
       1.4: default auto-phrase (QueryParser feature) to off -->

-Hoss
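For example, a schema fragment that opts into the newer semantics (the name and field definitions below are illustrative); with version="1.1" or later, a field is only multivalued when it says so explicitly:

```xml
<schema name="example" version="1.4">
  <fields>
    <!-- single-valued unless multiValued="true" is set -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
```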


Re: Accidental multivalued fields?

2012-09-14 Thread Travis Low
Thanks much!  It was the schema version attribute -- the recycled
schema.xml I used did not contain that very useful comment.  Everything
works great now!

On Fri, Sep 14, 2012 at 1:56 PM, Chris Hostetter
wrote:

>
> : Greetings.  I am using Solr 3.4.0 with tomcat 7.0.22.  I've been using
> : these versions successfully for a while, but on my latest project, I
> cannot
> : sort ANY field without getting this exception:
> :
> : SEVERE: org.apache.solr.common.SolrException: can not sort on multivalued
>
> ...
>
> : The thing is, I have only one multivalued field in my schema, at least, I
> : thought so.  I even tried sorting on id, which is the unique key, and got
> : the same error.  Here are the fields in my schema:
>
> a) multiValued can be set on fieldType and is then inherited by the fields
>
> b) Check the "version" property on your <schema> tag.  If the value is
> "1.0" then all fields are assumed to be multiValued.
>
> Here's the comment from the example schema included with Solr 3.4...
>
> 
>   
>
>
> -Hoss
>



-- 

**

*Travis Low, Director of Development*


** * *

*Centurion Research Solutions, LLC*

*14048 ParkEast Circle *•* Suite 100 *•* Chantilly, VA 20151*

*703-956-6276 *•* 703-378-4474 (fax)*

*http://www.centurionresearch.com* 

**The information contained in this email message is confidential and
protected from disclosure.  If you are not the intended recipient, any use
or dissemination of this communication, including attachments, is strictly
prohibited.  If you received this email message in error, please delete it
and immediately notify the sender.

This email message and any attachments have been scanned and are believed
to be free of malicious software and defects that might affect any computer
system in which they are received and opened. No responsibility is accepted
by Centurion Research Solutions, LLC for any loss or damage arising from
the content of this email.


1.3 to 3.6 migration

2012-09-14 Thread Sujatha Arun
Hi,

Just migrated to 3.6.1 from 1.3 version with the following observation

Indexed content using the same source

                                    1.3       3.6.1
Number of documents indexed         11505     13937
Index Time - Full Index             170 ms    171 ms
Index size                          23 MB     31 MB
Query Time [first time] for *:*     44 ms     187 ms

and the *:* query is not cached in 3.6.1 in the query result cache - is this
expected?

Some points:

Even though I used the same data source, the number of documents indexed
seems to be higher in 3.6.1 [not sure why?]
All the other numbers, including index size and query time, are higher
rather than lower in 3.6.1, and queries are not getting cached in 3.6.1.

Attached are the schemas - any pointers?

Regards
Sujatha

[The two attached schema.xml files (1.3 and 3.6.1) were stripped of their
markup in the archive; only fragments such as the uniqueKey "id" and
defaultSearchField "content" declarations survive.]



Re: 3.6.1 - Suggester and spellcheker Implementation

2012-09-14 Thread Sujatha Arun
Thanks . :(

Regards
Sujatha

On Thu, Sep 13, 2012 at 2:28 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hi Sujatha,
>
> No, suggester and spellchecker are separate beasts.
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Wed, Sep 12, 2012 at 3:18 PM, Sujatha Arun  wrote:
> > Hi ,
> >
> > If I am looking to implement Suggester Implementation with 3.6.1 ,I
> beleive
> > this creates it own index , now If I want to also use the spellcheck
>  also
> > ,would it be using the same index as suggester?
> >
> > Regards
> > Sujatha
>
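For completeness, the 3.x Suggester is configured as its own spellcheck component backed by a separate in-memory structure, along these lines (the component and field names are illustrative):

```xml
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <!-- ternary search tree lookup, rebuilt from the terms of one field -->
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">name</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

A dictionary-based spellchecker would be declared as a second, independent spellchecker entry with its own index.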


Re: 1.3 to 3.6 migration

2012-09-14 Thread Otis Gospodnetic
Hi,

Maybe your indexer is different/modified/buggy?

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Fri, Sep 14, 2012 at 3:23 PM, Sujatha Arun  wrote:
> Hi,
>
> Just migrated to 3.6.1 from 1.3 version with the following observation
>
> Indexed content using the same source
>
>                                     1.3       3.6.1
> Number of documents indexed         11505     13937
> Index Time - Full Index             170 ms    171 ms
> Index size                          23 MB     31 MB
> Query Time [first time] for *:*     44 ms     187 ms
>
> and *:* query is not cached in 3.6.1 in query result cache ,is this
> expected?
>
> some points:
>
> Even though I used the same data source ,the number of documents indexed
> seem to be more in 3.6.1 [ not sure why?]
> All the other params including index size and query time seem to be more
> instnead of less in 3.6.1 and  queries are not getting cached in 3.6.1
>
> Attached the schema's - any pointers?
>
> Regards
> Sujatha
>


Re: Is it possible to do an "if" statement in a Solr query?

2012-09-14 Thread Gustav
Hello Hoss!
The case here would be: "if total result set contains any original medicines
X, 
then remove all generic medicines Y such that Y is a generic form of X."

In your example and in my case, the result should be Vaxidrop + Generipill



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-do-an-if-statement-in-a-Solr-query-tp4007311p4007853.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is it possible to do an "if" statement in a Solr query?

2012-09-14 Thread Chris Hostetter

: Hello Hoss!
: The case here would be: "if total result set contains any original medicines
: X, 
: then remove all generic medicines Y such that Y is a generic form of X."
: 
: In your example and in my case, the result should be Vaxidrop + Generipill

then wunder's suggestion of grouping on the family, with a group.sort that 
puts the "originals" first, is the simplest idea I can think of OOTB.  
You'll still have to look at each group and "ignore" the generics if the 
first item in the group is an "original" -- which of course means things 
like "numFound" won't really be accurate.

An alternative solution using multiple requests would simply be to do the 
query once with an "fq=type:original" filter, and if you get any results, 
then re-issue the query using all of the doc ids returned in some new 
"fq=-generic_of:(...)" filter to exclude any docs that are generics of the 
matching originals.

That logic could also be encapsulated in a custom component -- you could 
look at the "exclude" logic in QueryElevationComponent for inspiration.  The 
crux of the difference is that while QEC gets its excludes from a 
config file, you would get them from a (Query->DocSet)+FilterCache lookup

-Hoss
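The grouping variant might be wired up as handler defaults along these lines (the handler name and the "family"/"is_generic" fields are hypothetical, assuming is_generic is 0 for originals and 1 for generics so that ascending sort puts originals first):

```xml
<requestHandler name="/meds" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="group">true</str>
    <!-- one group per medicine family -->
    <str name="group.field">family</str>
    <!-- originals sort ahead of generics inside each group -->
    <str name="group.sort">is_generic asc</str>
    <str name="group.limit">10</str>
  </lst>
</requestHandler>
```

The client then keeps only the first document of each group when that document is an original, per the logic described above.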


Re: Solr 4.0-BETA facet pivot returns no result

2012-09-14 Thread Chris Hostetter

: I got the answer,apache-solr-4.0.0-BETA.tgz is OK, I used the
: apache-solr-4.0.0-BETA.zip before

That still doesn't really explain the problem you were seeing -- I just 
checked both the tgz and zip artifacts for 4.0-BETA and confirmed that the 
results of your example URL are correct & identical (as are the 
solr.war files) in both cases


hossman@frisbee:~/tmp/solr-4.0-BETA$ md5sum 
*/apache-solr-4.0.0-BETA/example/webapps/solr.war
2a1b6aaf690da53e9fc7692002e99210  
bin-tgz/apache-solr-4.0.0-BETA/example/webapps/solr.war
2a1b6aaf690da53e9fc7692002e99210  
bin-zip/apache-solr-4.0.0-BETA/example/webapps/solr.war


-Hoss


Re: Broken highlight truncation for hl.alternateField

2012-09-14 Thread Koji Sekiguchi

Hi Arcadius,

I think it is a feature. If no matching terms are found in the hl.fl fields,
that triggers the hl.alternateField function, and if you set
hl.maxAlternateFieldLength=[LENGTH], the highlighter extracts the first
[LENGTH] characters of the stored data of the hl.fl field. As this is a
feature common to both the highlighter and FVH, it doesn't take into account
hl.bs.type (that is a param specific to the boundary scanner).

For now, implement boundary scanning in your client if you want.

koji
--
http://soleami.com/blog/starting-lab-work.html

(12/09/15 0:13), Arcadius Ahouansou wrote:

Hello.

I am using the fastVectorHighlighter in Solr3.5 to highight  and truncate
the summary of my results.

The standard breakIterator is being used with hl.bs.type = WORD as
per http://lucidworks.lucidimagination.com/display/solr/Highlighting

Search is being performed on the document title and summary.

In my edismax requesthandler, I have as default:

true
summary
summary


A simplified query looks like this:

/solr/search?q=help&hl=true&f.summary.hl.fragsize=250&f.summary.hl.maxAlternateFieldLength=250

So, I am truncating only the summary.

1- When a search term is found in the decription, everyting works well as
expected
and the summary is truncated and contains whole words only (the
breakIterator is being applied properly)

2- However, when there is no match in the summary, then
the f.summary.hl.alternateField quicks-in and the summary returned is often
truncated in the middle of a word (i.e we may get "peo" instead of
"people").
This lets me suppose that the breakIterator is not applied to
f.summary.hl.alternateField.


My question is: how to return full word truncation when summary is fetched
from  f.summary.hl.alternateField ? (i.e no match in summary)
Or is there any other way I could get proper truncation when there is no
match in the summary?


  Thank you very much.

Arcadius






Re: 1.3 to 3.6 migration

2012-09-14 Thread Sujatha Arun
Can you please elaborate?

Regards
Sujatha

On Sat, Sep 15, 2012 at 1:34 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hi,
>
> Maybe your indexer is different/modified/buggy?
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Fri, Sep 14, 2012 at 3:23 PM, Sujatha Arun  wrote:
> > Hi,
> >
> > Just migrated to 3.6.1 from 1.3 version with the following observation
> >
> > Indexed content using the same source
> >
> >1.3
> > 3.6.1
> >  Number of documents indexed 11505  13937
> > Index Time  - Full Index 170ms
> 171ms
> > Index size 23 MB
> > 31MB
> > Query Time [first time] for *:*  44 ms
>  187
> >
> > and *:* query is not cached in 3.6.1 in query result cache ,is this
> > expected?
> >
> > some points:
> >
> > Even though I used the same data source ,the number of documents indexed
> > seem to be more in 3.6.1 [ not sure why?]
> > All the other params including index size and query time seem to be more
> > instnead of less in 3.6.1 and  queries are not getting cached in 3.6.1
> >
> > Attached the schema's - any pointers?
> >
> > Regards
> > Sujatha
> >
>