Regd WSTX EOFException

2010-08-25 Thread Pooja Verlani
Hi,
Sometimes while indexing to Solr, I get the following exception:
"com.ctc.wstx.exc.WstxEOFException: Unexpected end of input block in end tag"
I think it's a configuration issue. Kindly suggest.

I am running Solr on Tomcat 6.

Thanks
Pooja


Re: Regd WSTX EOFException

2010-08-26 Thread Pooja Verlani
Hi,
The client being used is PHP cURL.
Could that be the problem?
On Wed, Aug 25, 2010 at 7:10 PM, Yonik Seeley wrote:
> Sounds like the input is sometimes being truncated (or corrupted) when
> it's sent to solr.
> What client are you using?
>
> -Yonik
>
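
Truncation of the kind Yonik describes can be caught on the client side
before the request is sent. A minimal Python sketch, assuming the update
body is available as a string; the helper name is hypothetical and is not
part of Solr or of the PHP client:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as one complete XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


complete = '<add><doc><field name="id">1</field></doc></add>'
# Simulate a request body that was cut off inside a closing tag.
truncated = complete[:-8]
```

A body that fails this check would also fail in Solr's XML parser, so
logging it on the client should show whether the truncation happens before
or after the request leaves the client.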


Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
Hi,

I have an "X" field in my index, which is a feature hash that I would like to
use to remove duplicates from my results.
I can't use it as the unique id field. Is there any method or
parameter at search time to remove duplicates on a particular
field (the hash in this case)?

Thanks in advance,

Regards,
Pooja


Re: Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
Hi Erick,

Thanks for the quick reply.
I had looked at deduplication, but as far as I can tell it works at
index time, right? I would prefer to deduplicate at search time!

Regards,
Pooja

On Tue, Jun 21, 2011 at 11:15 PM, Erick Erickson wrote:

> I think this is what you're looking for:
> http://wiki.apache.org/solr/Deduplication
>
> Best
> Erick
>


Re: Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
For this use case I am fine with removing the duplicates and not showing them.
But grouping could also help me show one representative from each group.
At present I am using Solr 1.4. Any idea how to achieve this without
moving to Solr 3.3?

Regards,
Pooja

On Tue, Jun 21, 2011 at 11:55 PM, Erick Erickson wrote:

> Well, in trunk and the soon-to-be-released Solr 3.3, you could use grouping.
> What is the use-case here? Are you going to show all the docs (even
> duplicates) some of the time?
>
> Best
> Erick
>
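
Without grouping, one fallback is to collapse duplicates in the client
after the results come back. A minimal Python sketch, assuming each result
is a dict and the hash is stored in a field named "X" as in the question;
the function name is hypothetical:

```python
def dedupe_by_field(docs, field):
    """Keep the first document seen for each value of `field`.

    Documents that lack the field are always kept.
    """
    seen = set()
    unique = []
    for doc in docs:
        key = doc.get(field)
        if key is None:
            unique.append(doc)
        elif key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


results = [
    {"id": "1", "X": "abc"},
    {"id": "2", "X": "abc"},  # same hash as id 1, dropped
    {"id": "3", "X": "def"},
    {"id": "4"},              # no hash, kept
]
```

Because results arrive in ranked order, the first document per hash is the
best-scoring representative. The trade-off is that numFound and page sizes
no longer match what is displayed, so the client may need to request more
rows than it shows.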


Query time noun, verb boosting

2011-06-22 Thread Pooja Verlani
Hi,

At query time, I want to build the Lucene query so that it boosts
only the nouns in the query, or some concept that exists in the index. Is
there any way to do this, or any idea that could serve as a workaround?


Regards,
Pooja


Re: Query time noun, verb boosting

2011-06-22 Thread Pooja Verlani
Hi,

Say, for example, a query like "manmohan singh dancing". I would like to
make the nouns a compulsory condition, while the verb isn't
important to me: I would like to get results for manmohan singh and
not for dancing. If I can extract the nouns and verbs, or can find out that my
index contains a concept (or at least an identity) "manmohan singh", I
would like to define rules for a strict (compulsory) match on the
noun (concept) and a loose match (non-compulsory boosting) on the verb.

Basically, I want to avoid getting zero results from a compulsory match on
all 3 tokens of the query (in this case manmohan singh dancing). Instead I
want a compulsory match on manmohan singh, since that exists in my
index, and "dancing" shouldn't be compulsory, so that I get a non-zero
number of results.

Hope this explains.
Any suggestions?

Regards,
Pooja


On Thu, Jun 23, 2011 at 11:07 AM, Anshum wrote:

> What do you mean by 'noun or some concept'? It would be better if you could
> give a concrete example.
> As for detecting parts of speech, there are a lot of libraries you could
> use, but I didn't understand the part about boosting terms from the index.
>
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
>
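
Once the nouns have been identified (by a part-of-speech tagger or a lookup
against known concepts, neither of which is shown here), the strict/loose
split described above maps onto the Lucene query syntax: a "+" prefix makes
a term required, and a bare term stays optional and only adds to the score.
A minimal Python sketch; the function name is hypothetical, and it assumes
the default operator is OR and the terms contain no special characters:

```python
def build_query(required_terms, optional_terms):
    """Build a Lucene query string with required and optional terms."""
    required = ["+" + term for term in required_terms]
    return " ".join(required + list(optional_terms))


query = build_query(["manmohan", "singh"], ["dancing"])
```

With this query, documents must match both "manmohan" and "singh", and
those that also mention "dancing" score higher instead of being the only
ones returned.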


Restricting the Solr Posting List (retrieved set)

2011-07-11 Thread Pooja Verlani
Hi,

We want to search an index in such a way that even if a clause has a long
posting list, Solr stops collecting documents for the clause
after receiving X documents that match it.

For example, if Solr can return 5M documents for the query "India", we would
like to restrict the set to only 500K documents.

Since we post documents chronologically, we would like only the X most
recent documents to be matched for the clause.

Is this possible in any way?

Regards,
Pooja


Re: Restricting the Solr Posting List (retrieved set)

2011-07-11 Thread Pooja Verlani
Thanks for the reply.

I have a very large index, so retrieving older documents when they are not
needed definitely wastes time, and at the same time I would need to do
recency boosts or a time sort. So, I am looking for a way to avoid that.
That's why I need to restrict my docset to the recently added documents. I
would prefer not to use the "rows" parameter for this.

Thanks,
pooja

On Mon, Jul 11, 2011 at 5:49 PM, Bob Sandiford  wrote:

> A good answer may also depend on WHY you are wanting to restrict to 500K
> documents.
>
> Are you seeking to reduce the time spent by Solr in determining the doc
> count?  Are you just wanting to prevent people from moving too far into the
> result set?  Is it the case that you can only display 6 digits for your return
> count? :)
>
> If Solr is performing adequately, you could always just artificially
> restrict the result set.  Solr doesn't actually 'return' all 5M documents -
> it only returns the number you have specified in your query (as well as
> having some cache for the next results in anticipation of a subsequent
> query).  So, if the total count returned exceeds 500K, then just report 500K
> as the number of results, and similarly restrict how far a user can page
> through the results...
>
> (And - you can (and sounds like you should) sort your results by descending
> post date so that you do in fact get the most recent ones coming back
> first...)
>
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com
>
>
> > -Original Message-
> > From: Ahmet Arslan [mailto:iori...@yahoo.com]
> > Sent: Monday, July 11, 2011 7:43 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Restricting the Solr Posting List (retrieved set)
> >
> > Looks like your use-case is suitable for time based sharding.
> > http://wiki.apache.org/solr/DistributedSearch
> >
> > Let's say you divide your shards according to months. You will have a
> > separate core for each month.
> > http://wiki.apache.org/solr/CoreAdmin
> >
> > When a query comes in, you will hit the most recent core. If you don't
> > obtain enough results add a new value (previous month core) to &shards=
> > parameter.
> >
>
>
>
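
The widening strategy Ahmet describes can be sketched as a loop that starts
with the newest core and adds older ones to the shards parameter until
enough documents match. A minimal Python sketch; the core addresses and the
count_hits callback are hypothetical stand-ins for real Solr requests:

```python
def widen_until_enough(cores_newest_first, wanted, count_hits):
    """Query the newest core first and add older cores until enough hits.

    count_hits is a caller-supplied function (hypothetical) that takes the
    value of the shards parameter and returns numFound for the query.
    """
    shards, found = "", 0
    for n in range(1, len(cores_newest_first) + 1):
        shards = ",".join(cores_newest_first[:n])
        found = count_hits(shards)
        if found >= wanted:
            break
    return shards, found


# Hypothetical monthly cores, newest first.
CORES = [
    "localhost:8983/solr/2011_07",
    "localhost:8983/solr/2011_06",
    "localhost:8983/solr/2011_05",
]
```

Older cores are only touched when the recent ones do not hold enough
matches, which is the effect asked for in the original question, at the
cost of an extra request each time the search has to widen.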


csv responsewriter and numfound

2011-08-03 Thread Pooja Verlani
Hi,

Is there any way to get numFound from the CSV response format? Some parameter?
Or should I change the code of csvResponseWriter for this?

Thanks,
Pooja


Same id on two shards

2011-08-06 Thread Pooja Verlani
Hi,

We have a multicore Solr with 6 cores. We merge the results using the shards
parameter or the distrib handler.
I have a problem: I might post a document on one of the cores and then
post it again some days later on another core, as I have a time-sliced
multicore setup!

The question is: if I retrieve a document that was posted on both shards,
will Solr return only one document or both? And if only one document is
returned, which one?

Regards,
Pooja


Hierarchical schema design

2009-08-31 Thread Pooja Verlani
Hi all,
Is it possible to have a hierarchical schema in Solr, meaning can we
have objects under objects?
For example, for a doc like:

[XML example not preserved in the archive]


I need to make a schema with 3 types of such objects, each of them having
different field values.

Please reply if such a possibility exists.

Regards.
Pooja


Phrase stopwords

2009-09-23 Thread Pooja Verlani
Hi,
Is it possible to have a phrase as a stopword in Solr? If so, please share
how to do it.

regards,
Pooja


Using recency rord on /distrib

2009-09-23 Thread Pooja Verlani
Hi,
I have to add a recency boost using the recip and rord functions in an app
that uses the /distrib request handler.
Can I put the bf param in /distrib and call the URL directly, like:
http://localhost:8983/solr/distrib/?q=cable

where in /distrib requesthandler bf is defined as:

recip(rord(last_sold_date),1,1000,1000)^0.7
 

I am not able to see any difference in the results with or without the bf
param defined.

Please share your views.

regards,
Pooja
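
For reference, Solr's function query documentation defines recip(x,m,a,b)
as a/(m*x+b), and rord gives the most recent date the smallest reverse
ordinal. A minimal Python sketch of the arithmetic for the parameters used
above (m=1, a=1000, b=1000); this only evaluates the formula and does not
call Solr:

```python
def recip(x, m, a, b):
    """Evaluate a / (m * x + b), the formula behind Solr's recip()."""
    return a / (m * x + b)


newest = recip(1, 1, 1000, 1000)      # reverse ordinal 1: just under 1.0
middle = recip(1000, 1, 1000, 1000)   # reverse ordinal 1000: exactly 0.5
oldest = recip(100000, 1, 1000, 1000)

# With the ^0.7 weight from the configuration above:
boosts = [0.7 * value for value in (newest, middle, oldest)]
```

The boost therefore never exceeds 0.7. One possible reason for seeing no
difference, offered only as a guess, is that such a small additive value is
swamped by the text relevance score; another is that bf is a DisMax
parameter and has no effect if /distrib does not use the DisMax parser.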


Date field being null

2009-10-06 Thread Pooja Verlani
Hi,
My fieldtype definition is like:


I am defining a field:


Can I have a null for such a field? Or is there a way I can use it as a date
field only if the value is null? I can't make the field a string type as I
have to apply a recency sort and some filters on that field.
Regards,
Pooja


Solr Random field

2009-10-26 Thread Pooja Verlani
Hi,
I want a random sort type in the search results. The scenario is:
I want to return random results, unrelated to the query that was fired,
if I am not able to find any relevant results.
I want something like:

http://localhost:8083/solr/select/?q=*:*&sort=RANDOM.
Please suggest.

Regards,
Pooja


Re: Solr Random field

2009-10-26 Thread Pooja Verlani
No, I don't have a field of this type, and I don't want to re-index either.
Is it possible otherwise?

2009/10/26 Noble Paul നോബിള്‍ नोब्ळ् 

> Do you have a field whose type="random"? If yes, then you can sort by that
> field.
>
>
>
>
> --
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
>


Hierarchical xml

2009-12-01 Thread Pooja Verlani
Hi,
I want to index an xml like following:


John
1979-29-17T28:14:48Z


   ABC College
   1998
 
 
   PQRS College
   2001
 
  
   XYZ College
   2003
 



I can't work out what the schema should look like.
Also, if I flatten such an XML and make collegename and year multivalued,
like this:
ABC College, PQRS College, XYZ College
1998,2001,2003

then I can't keep the correspondence between ABC College and the year
1998.

If someone has an efficient way out, please share.
Thanks in anticipation.

Regards,
Pooja
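
One way to keep the correspondence when flattening is to store each
college and its year together as a single value of one multivalued field,
instead of two parallel fields. A minimal Python sketch of that encoding,
assuming a "|" separator that never occurs in a year; the function names
and the dict layout are hypothetical:

```python
def flatten_colleges(colleges, sep="|"):
    """Encode each (name, year) pair as one multivalued-field value."""
    return ["%s%s%s" % (c["name"], sep, c["year"]) for c in colleges]


def parse_pair(value, sep="|"):
    """Split an encoded value back into (name, year)."""
    name, year = value.rsplit(sep, 1)
    return name, int(year)


colleges = [
    {"name": "ABC College", "year": 1998},
    {"name": "PQRS College", "year": 2001},
    {"name": "XYZ College", "year": 2003},
]
values = flatten_colleges(colleges)
```

The separate multivalued collegename and year fields can still be indexed
alongside for searching; the combined field is only needed to recover which
year belongs to which college.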


Dynamic Boosting at query time with boost value as another fieldvalue

2008-12-11 Thread Pooja Verlani
Hi all,

I have a specific requirement for query-time boosting.
I have to boost a field on the basis of the value returned from one of the
fields of the document.

Basically, I have the creationDate for a document, and in order to introduce
a recency factor in the search, I need to give a boost to the creation field,
where the boost value is something like a log(1/x) function and x is
(presentDate - creationDate).
From what I have seen so far, we can only give a static boost to
documents.

If you can provide a solution to my problem, please do reply :)

Thanks a lot,
Regards.
Pooja


Re: Dynamic Boosting at query time with boost value as another fieldvalue

2008-12-12 Thread Pooja Verlani
Hi,

Will this currentDate work with epoch time only, or can it work with any date
format supported by Java's SimpleDateFormat class?

Thank you,
Regards,
Pooja

On Thu, Dec 11, 2008 at 7:20 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> Take a look at FunctionQuery support in Solr:
>
> http://wiki.apache.org/solr/FunctionQuery
>
> http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd
>
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: Dynamic Boosting at query time with boost value as another fieldvalue

2008-12-14 Thread Pooja Verlani
Hi,
Is it possible to have a field name with a colon, for example "source:site"? I
want to apply a query-time recency boost to this field with the recency
function.
The recip function with rord isn't accepting my source:site field name; it
throws an exception. I have tried escape characters too.
Please suggest something.

Thank you,
Regards
Pooja



Re: Dynamic Boosting at query time with boost value as another fieldvalue

2008-12-14 Thread Pooja Verlani
OK. Does that mean I can never use a colon in the field name in such a
scenario?

On Mon, Dec 15, 2008 at 12:24 PM, Akshay wrote:

> The colon is used to specify the value for a field. E.g. in the query box of
> Solr admin you would type something like fieldName:
> (title:Java). You can use a hyphen '-' or some other character in the field
> name instead of a colon.
>
> --
> Regards,
> Akshay Ukey.
>


Fwd: Distributed Searching - Limitations?

2008-12-18 Thread Pooja Verlani
Hi,
I am planning to use Solr's distributed searching for my project. But while
going through http://wiki.apache.org/solr/DistributedSearch, I found a few
limitations. Can anyone please explain the 2nd and 3rd points in the
limitations section on the page? The points are:

   - When duplicate doc IDs are received, Solr chooses the first doc and
     discards subsequent ones
   - No distributed idf

Thanks.
Regards,
Pooja
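
The first of those two points can be illustrated with a small merge routine
in which the first copy of each ID wins. A minimal Python sketch of that
behaviour only; it is not Solr's merge code, and the function name and the
"core" field are hypothetical:

```python
def merge_shard_results(shard_results, id_field="id"):
    """Merge per-shard result lists, keeping the first doc seen per ID."""
    seen = set()
    merged = []
    for docs in shard_results:
        for doc in docs:
            doc_id = doc[id_field]
            if doc_id in seen:
                continue  # later copies of the same ID are discarded
            seen.add(doc_id)
            merged.append(doc)
    return merged


shard_a = [{"id": "7", "core": "a"}, {"id": "8", "core": "a"}]
shard_b = [{"id": "7", "core": "b"}, {"id": "9", "core": "b"}]
merged = merge_shard_results([shard_a, shard_b])
```

Here the copy of document 7 from the first list survives. Since "first"
depends on the order in which shard responses are processed, it seems safer
not to rely on a particular copy being the one returned when the same ID
lives on two shards.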


Problem with WT parameter when upgrading from Solr1.2 to solr1.3

2008-12-29 Thread Pooja Verlani
Hi,

I just upgraded my system from Solr 1.2 to Solr 1.3. I am using the same
queryResponseWriter plugin that I used in Solr 1.2. The problem is
that when I set the *wt* parameter to the plugin name with its full package,
I don't get the response I used to get in 1.2, whereas when I omit the
wt parameter, I get the expected response from the default
XMLResponseWriter. The problem occurs only when we use shards, that is,
only with a distributed query across multiple shards. On individual
shards, i.e. when we use /select on a single shard, it works fine.
(http://localhost:8081/solr/select?q=%22indian%20railways%22&qt=modified&fl=*,score&wt=custom&hl=true).




On individual shards the custom response writers work absolutely fine,
but not when combining shards or using /distrib/:


http://localhost:8081/solr/distrib?q=%22indian%20railways%22&qt=modified&fl=*,score&wt=custom&hl=true




Please help.





This is part of solrconfig.xml



   

 

   x,y,z

 

   






Thanks & Regards,

Almas


Problem with query response writer when upgrading from Solr1.2 to solr1.3

2008-12-30 Thread Pooja Verlani
Hi,

I just upgraded my system from Solr 1.2 to Solr 1.3. I am using the same
queryResponseWriter plugin that I used in Solr 1.2. The problem is
that when I set the *wt* parameter to the plugin name with its full package,
I don't get the response I used to get in 1.2, whereas when I omit the
wt parameter, I get the expected response from the default
XMLResponseWriter. The problem occurs only when we use shards; also, the
response is perfect when wt is given with hl=true. It happens only with a
distributed query across multiple shards. On individual shards, i.e. when
we use /select on a single shard, it works fine.
(http://localhost:8081/solr/select?q=%22indian%20railways%22&qt=modified&fl=*,score&wt=custom&hl=true).




On individual shards the custom response writers work absolutely fine,
but not when combining shards or using /distrib/:


http://localhost:8081/solr/distrib?q=%22indian%20railways%22&qt=modified&fl=*,score&wt=custom&hl=true




Please help.





This is part of solrconfig.xml



   

 

   x,y,z

 

   



Regards,
Pooja


Querying back with top few results in the same XMLWriter!

2009-01-08 Thread Pooja Verlani
Hi,
I am implementing a ranking algorithm by modifying the XMLWriter. The
formulation takes the top 3 results, queries again with each of those 3
results, and presents the final result as a function of the results from
these 3 queries. Can anyone tell me whether I can take the top 3 results
and query with them in the same response writer?
Or is there any such functionality provided by Solr in either version 1.2
or 1.3?

Thank you.
Regards,
Pooja Verlani


Re: Problem with WT parameter when upgrading from Solr1.2 to solr1.3

2009-01-09 Thread Pooja Verlani
Yes, I finally got it working by modifying the writer to use the required
SolrDocumentList instead of the DocList object used in Solr 1.2.

Thanks
Pooja

On Fri, Jan 9, 2009 at 9:01 AM, Yonik Seeley wrote:

> On Thu, Jan 8, 2009 at 9:40 PM, Chris Hostetter wrote:
> > you have a custom response writer you had working in
> > Solr 1.2, and now you are trying to use that same custom response writer
> > in Solr 1.3 with distributed requests?
>
> Right, that's probably the crux of it - distributed search required
> some extensions to response writers... things like handling
> SolrDocument and SolrDocumentList.
>
> -Yonik
>


Release of solr 1.4 & autosuggest

2009-02-15 Thread Pooja Verlani
Hi All,
I am interested in the TermsComponent addition in Solr 1.4 (
http://wiki.apache.org/solr/TermsComponent). When
should we expect Solr 1.4 to be available for use?
Also, can this TermsComponent be made available as a plugin for Solr 1.3?

Kindly reply if you have any idea.

Regards,
Pooja


javax.xml.stream.XMLStreamException while posting

2009-03-02 Thread Pooja Verlani
Hi,
When I post a valid XML document to Solr, it gives the following error:

{http--10003-7} javax.xml.stream.XMLStreamException: :2:20 expected '-' at '['
{http--10003-7} at com.caucho.xml.stream.XMLStreamReaderImpl.error(XMLStreamReaderImpl.java:1268)
{http--10003-7} at com.caucho.xml.stream.XMLStreamReaderImpl.expect(XMLStreamReaderImpl.java:1127)
{http--10003-7} at com.caucho.xml.stream.XMLStreamReaderImpl.readNext(XMLStreamReaderImpl.java:642)
{http--10003-7} at com.caucho.xml.stream.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
{http--10003-7} at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:321)
{http--10003-7} at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
{http--10003-7} at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
{http--10003-7} at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
{http--10003-7} at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
{http--10003-7} at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
{http--10003-7} at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
{http--10003-7} at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:87)
{http--10003-7} at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:187)
{http--10003-7} at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:266)
{http--10003-7} at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:270)
{http--10003-7} at com.caucho.server.port.TcpConnection.run(TcpConnection.java:678)
{http--10003-7} at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:721)
{http--10003-7} at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:643)
{http--10003-7} at java.lang.Thread.run(Thread.java:619)

No matter what I do with the XML files, the error recurs. I am using Solr 1.3
with Resin 3.1.6 on an Intel Xeon running CentOS release 4.6. The Java
version I am using is 1.6.0_10.
Please let me know if someone can shed some light on it :)

Thank you,

Regards,
Pooja


Re: javax.xml.stream.XMLStreamException while posting

2009-03-03 Thread Pooja Verlani
Thanks, it worked once the standard Woodstox parser was set in the Resin
configuration file.

Regards,
Pooja

On Mon, Mar 2, 2009 at 9:55 PM, Walter Underwood wrote:

> Also, open your document in a browser to make sure that it really is
> well-formed. Most browsers will pinpoint the syntax error. --wunder
>
> On 3/2/09 6:46 AM, "Noble Paul നോബിള്‍  नोब्ळ्" wrote:
>
> > the parser you are using is not the standard woodstox one.
> > try this http://docs.sun.com/app/docs/doc/819-3672/gfkoy?a=view
> >
>
>


Regd. Difference check at the time of updation

2009-04-07 Thread Pooja Verlani
Hi all,

I am looking for a mechanism to measure how much a document already in the
index differs from an updated version with some new content. Basically,
I want to design a criterion for deciding whether or not to replace the
indexed document with the new one.
If Solr already has something like that, or if anyone has an idea,
please do share.

Thanks...
Regards,
Pooja


user feedback in solr

2009-06-10 Thread Pooja Verlani
Hi all,

I wanted to know if there is any provision to accommodate user feedback in
the form of query logs and click logs,
to improve the search relevance and ranking.
Also, is there a possibility of it being included in the next version ?

Thank you,
Regards,
Pooja


Word frequency count in the index

2009-07-16 Thread Pooja Verlani
Hi,

Is there any way in Solr to get the count of each word indexed in the
index?
I want to look at the word frequencies to figure out
'application-specific stop words'.

Please let me know if it's possible.

Thank you,
Regards,
Pooja


Re: Word frequency count in the index

2009-07-22 Thread Pooja Verlani
Hi Grant,
Thanks for your reply. I have one more question: if I use the Luke request
handler in Solr for this, are the top terms I get ranked by term frequency
or by document frequency?
I would like to get the terms that occur most often within a document,
where those documents make up a good percentage of the total index.
Kindly reply if any other option, simple or elaborate, is available.

Thank you,


Pooja

On Thu, Jul 16, 2009 at 4:05 PM, Grant Ingersoll wrote:

> In the trunk version, the TermsComponent should give you this:
> http://wiki.apache.org/solr/TermsComponent.  Also, you can use the
> LukeRequestHandler to get the top words in each field.
>
> Alternatively, you may just want to point Luke at your index.
>
>
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>
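
The distinction between the two counts asked about above can be made
concrete on a small sample of text outside Solr. A minimal Python sketch
that computes both the total term frequency and the document frequency,
and lists stop-word candidates by the share of documents they appear in;
the function names, the whitespace tokenizer, and the threshold are
assumptions:

```python
from collections import Counter


def term_and_doc_frequency(docs):
    """Return (term frequency, document frequency) counters for docs."""
    tf = Counter()
    df = Counter()
    for text in docs:
        tokens = text.lower().split()
        tf.update(tokens)       # every occurrence counts
        df.update(set(tokens))  # each document counts once per term
    return tf, df


def stopword_candidates(docs, min_doc_ratio):
    """Terms that appear in at least min_doc_ratio of the documents."""
    if not docs:
        return []
    _, df = term_and_doc_frequency(docs)
    return sorted(t for t, n in df.items() if n / len(docs) >= min_doc_ratio)


sample = ["the cable the cable", "the router", "the modem cable"]
tf, df = term_and_doc_frequency(sample)
```

A term with high document frequency is the usual stop-word candidate; a
term with high total frequency but low document frequency is concentrated
in a few documents and is more likely to be meaningful.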


Synonyms from index

2009-07-22 Thread Pooja Verlani
Hi,
Is there a way to generate synonyms from the index? I have an
index where many searchable terms turn out to have synonyms, and
users use different synonyms too.
If not, then the only way is to learn from the query logs and click logs,
but in case such a way exists, please share.

regards,
Pooja


Dynamic boosting of ids at search time

2010-01-17 Thread Pooja Verlani
Hi,
I have to boost certain ids at search time, and these ids are not fixed,
so I can't keep them in the DisMax request handler configuration.
For example, if for query x the ids to be boosted are 243452, 346563 and
773567, then for query y the ids to be boosted won't be the same. They are
calculated at search time.
Also, I can't put them in the Lucene query, as the list runs into the
thousands.
Please suggest a good solution.

Thanks and regards,
Pooja


SOLR Multivalued field and length norm

2010-02-25 Thread Pooja Verlani
Hi,
As I understand it, when I query a multivalued field, the length norm is
based on the total length of all the values in that field.
Is it possible to use the length of only the particular matching value of
the multivalued field? That would make searching easier and more
efficient.

Regards,
Pooja
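
To make the question concrete, here is a small numeric sketch. It assumes
the default Lucene length norm of 1/sqrt(number of terms) and ignores the
lossy encoding Lucene applies when storing norms; the field values are
hypothetical and whitespace splitting stands in for the real analyzer.

```python
import math


def length_norm(num_terms):
    """Default-style length norm: 1 / sqrt(number of terms)."""
    return 1.0 / math.sqrt(num_terms)


# Two hypothetical values of one multivalued field.
values = ["red shoes", "a long description of a red leather running shoe"]

per_value_counts = [len(v.split()) for v in values]

# What the question describes today: one norm over all values together.
whole_field_norm = length_norm(sum(per_value_counts))

# What the question asks for: a separate norm for each value.
per_value_norms = [length_norm(n) for n in per_value_counts]
```

With these values the short entry would get a noticeably higher norm on
its own than it does when its length is pooled with the long entry.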


Changing term frequency according to value of one of the fields

2010-02-25 Thread Pooja Verlani
Hi,
I want to modify the Similarity class for my app as follows.
Right now tf is Math.sqrt(termFrequency).
I would like to change it to
Math.sqrt(termFrequency/solrDoc.getFieldValue("count"))
where count is one of the fields in the particular Solr document.
Is it possible to do so? Can I import the SolrDocument class and access
the particular document when calculating tf in the Similarity class?

Please suggest.

regards,
Pooja
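
The two formulas above can be compared with a short numeric sketch. This
is plain Python, not a Similarity implementation; how the per-document
count value would reach the scoring code is exactly the open question, so
here it is simply passed in as an argument. The function names are made
up for illustration.

```python
import math


def default_tf(freq):
    """Current behaviour described above: sqrt(termFrequency)."""
    return math.sqrt(freq)


def scaled_tf(freq, count):
    """Proposed behaviour: sqrt(termFrequency / count).

    count stands for the value of the document's "count" field.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return math.sqrt(freq / count)


# A term occurring 4 times in a document whose count field is 16.
current = default_tf(4)
proposed = scaled_tf(4, 16)
```

Dividing by count before taking the square root lowers the tf
contribution for documents with a large count value.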


distributed solr and tf-idf

2010-03-22 Thread Pooja Verlani
Hi,
How good is tf-idf scoring across distributed Solr shards (if it works at
all with Solr 1.4)?
Is there a chance of it getting better? I have to implement a huge index
with many shards. How can I get a global tf-idf across them?
Any ideas?

Regards,
Pooja
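
As an illustration of why a global value matters, the sketch below
compares per-shard idf with an idf computed from summed statistics. It
assumes the default Lucene formula idf = 1 + ln(numDocs / (docFreq + 1))
and hypothetical per-shard numbers; it does not show how Solr itself
would collect or exchange these statistics.

```python
import math


def idf(doc_freq, num_docs):
    """Default-style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def global_idf(shard_stats):
    """idf for one term from summed per-shard statistics.

    shard_stats is a list of (doc_freq, num_docs) pairs, one per shard.
    """
    total_df = sum(df for df, _ in shard_stats)
    total_docs = sum(n for _, n in shard_stats)
    return idf(total_df, total_docs)


# Hypothetical term: rare on shard 1, common on shard 2.
shards = [(9, 1000), (499, 1000)]
local = [idf(df, n) for df, n in shards]
combined = global_idf(shards)
```

The same term gets a very different idf on each shard, so scores from
unevenly distributed shards are not directly comparable unless a combined
value like this is used.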


numFound:0 when documents exist

2010-04-08 Thread Pooja Verlani
Hi,
In our search engine, we are getting numFound="0" for some queries even
though matching documents exist and are returned in the response. It
happens randomly. Does anyone have an idea what the possible reason
could be?

Regards,
Pooja


Long Lucene queries

2010-05-07 Thread Pooja Verlani
Hi all,

In my web app, I have to fire a query that's too long because of the
various boosts I have to apply. The size changes with the query, and many
times I get a blank page, probably because I exceed Lucene's character
limit. Is it possible to send it to Solr some other way? Should I be using
POST instead of GET here? Any better suggestions?

Regards,
Pooja


Re: Long Lucene queries

2010-05-11 Thread Pooja Verlani
Hi,
Thanks, Erik.
The search parameters are too long to send in a GET, so I am thinking of
opting for POST. Is it possible to send a POST request to Solr? Are any
configuration or code changes required for that? I have many parameters,
but only one is expected to be very lengthy.

Any suggestions?

Regards,
Pooja

On Fri, May 7, 2010 at 4:39 PM, Erik Hatcher  wrote:

>
> On May 7, 2010, at 6:56 AM, Pooja Verlani wrote:
>
>> In my web-app, i have to fire a query thats too long due to the various
>> boosts I have to give. The size changes according to the query and many a
>> times I get a blank page as I probably cross lucene's character limit. Is
>> it
>> possible to post it otherwise, to solr. Shall I be using POST instead of a
>> GET here? Any other better suggestion?
>>
>
> A few options:
>
>  * Use POST (except you won't see the params in the log files)
>
>  * Tomcat: <
> http://wiki.apache.org/solr/SolrTomcat#Enabling_Longer_Query_Requests>
>
>  * Jetty: <http://wiki.apache.org/solr/SolrJetty#Long_HTTP_GET_Query_URLs>
>
> Or, possibly a lot of your query params can be put into solrconfig.xml, and
> you send over just what changed.  You can do some tricks with param
> substitution to streamline this stuff in some cases.  Some examples of what
> you're sending over would help us see where some improvements could be made.
>
>Erik
>
>
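
The POST option mentioned above can be sketched from the client side as
follows. This only builds the request with the Python standard library
and does not send it. The URL, the id field name and the use of a bq
boost parameter are assumptions for illustration, not details from this
thread.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_solr_post(base_url, params):
    """Build a form-encoded POST so long parameters travel in the body."""
    body = urlencode(params, doseq=True).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"
        },
        method="POST",
    )


# One very long parameter (hypothetical boosts) plus a few short ones.
long_boost = " OR ".join(f"id:{i}^2" for i in range(1, 501))
req = build_solr_post(
    "http://localhost:8080/solr/select",  # assumed URL, adjust to your setup
    {"q": "shoes", "bq": long_boost, "rows": 10},
)
# To actually send it: urllib.request.urlopen(req).read()
```

Because the parameters are in the request body, the URL stays short no
matter how long the boost list grows.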


Too many clauses in lucene query

2010-05-12 Thread Pooja Verlani
Hi all,
I am forming a query to boost certain ids; the list of ids can go up to
2000. I sometimes get a "too many clauses" error for the boolean query,
and at other times I get a blank page. Can you suggest any config changes
for this?
I am using Solr 1.3.

Regards,
Pooja
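
A quick way to see why the error appears is to count the clauses before
sending the query. The sketch below assumes Lucene's default limit of
1024 boolean clauses (the maxBooleanClauses setting in solrconfig.xml) is
still in effect, and uses a hypothetical id field and boost value.

```python
MAX_BOOLEAN_CLAUSES = 1024  # Lucene's default; assumed unchanged here


def boost_clauses(ids, boost=2.0, field="id"):
    """One boosted clause per id, e.g. 'id:42^2.0'."""
    return [f"{field}:{doc_id}^{boost}" for doc_id in ids]


def fits_clause_limit(clauses, limit=MAX_BOOLEAN_CLAUSES):
    """True if the clause list stays within the boolean clause limit."""
    return len(clauses) <= limit


# 2000 ids, as in the message above.
clauses = boost_clauses(range(1, 2001))
within_default = fits_clause_limit(clauses)
within_raised = fits_clause_limit(clauses, limit=4096)
```

With 2000 ids the default limit is exceeded, so either the limit has to
be raised in the configuration or the id list has to be kept shorter.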