Re: how to get one word frequency from a document

2011-07-16 Thread Ahmet Arslan
> I am trying to use TermVectorComponent to get the word frequency from
> a particular document. Here is the url I used:
> "q=someword+id%3A"somedoc"&qt=tvrh&tv.all=true".
> But the result includes all the words' frequency in that document.
> Are there any query filters or request parameters that I can use to
> get one particular word's frequency from a particular document?

Maybe this: http://wiki.apache.org/solr/FunctionQuery#tf ?
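
A minimal sketch of what such a request might look like, assuming the tf function is available in the Solr version in use and that the text is indexed in a field named "text" (a hypothetical name, the original post does not say). Wrapping the function in {!func} should make the score carry the function value, and the fq restricts the result to the one document. The host and path are placeholders.

```python
from urllib.parse import urlencode


def tf_query_url(base_url, doc_id, field, word):
    """Build a select URL whose score is tf(field, word) for one document."""
    params = {
        # Function query: the score of each hit is the value of tf(...).
        "q": "{!func}tf(%s,'%s')" % (field, word),
        # Filter down to the single document of interest.
        "fq": 'id:"%s"' % doc_id,
        # Only the id and the score (the tf value) are needed.
        "fl": "id,score",
    }
    return base_url + "?" + urlencode(params)


# Placeholder host, field, and values for illustration only.
url = tf_query_url("http://localhost:8983/solr/select", "somedoc", "text", "someword")
print(url)
```

Note that tf goes through the Similarity's tf factor, so it is not necessarily the raw count. If the raw count is wanted and the version supports it, termfreq(field,term) is the function to look at.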


Start parameter messes with rows

2011-07-16 Thread Markus Jelsma
Hi,

Got something very strange here with Solr Implementation Version: 3.4-SNAPSHOT 
1145597M - markus - 2011-07-12 17:10:47. It's an index with ~1.4 million 
records indexed from a Nutch crawl. I noticed strange behaviour when using an 
interface and paging through the result set: the result set got longer for 
each page. It looks like the start parameter is ignored but used to increment 
the rows parameter instead!

So start = 1 returns 11 rows and start = 500 returns 510 rows, funky! Can 
anyone confirm this behaviour, or did something happen with my configuration?

This does not happen with Solr Implementation Version: 3.2-SNAPSHOT 1098534M - 
markus - 2011-05-25 15:51:45.

Thanks
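
For comparison, the expected semantics are that start is a zero-based offset into the ranked result list and rows caps the number of documents returned. The sketch below contrasts that with the behaviour described above, using a plain list as a stand-in for Solr. rows=10 is an assumption inferred from the numbers in the report.

```python
def expected_page(results, start, rows):
    """Expected semantics: skip `start` documents, return at most `rows`."""
    return results[start:start + rows]


def reported_page(results, start, rows):
    """Behaviour described in the report: `start` is added to `rows` instead."""
    return results[:start + rows]


results = list(range(1000))  # stand-in for a ranked result list

for start in (1, 500):
    expected = expected_page(results, start, 10)
    reported = reported_page(results, start, 10)
    print(start, len(expected), len(reported))
```

A check like len(docs) <= rows on each response is a cheap way to spot this from the client side.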


Re: Start parameter messes with rows

2011-07-16 Thread jame vaalet
Hi,
I just want to be clear on the concepts of core and shard.
Is a single core an index with a single schema? Is that what a core really is?
Can a single core contain two separate indexes with different schemas?
Does a shard refer to a collection of indexes on a single physical machine?
Can a single core be present in different shards?



- JAME


Document IDs instead of count for facets?

2011-07-16 Thread Jeff Schmidt
Hello:

I have a need for applying certain terms and filter queries to my index 
(producing the result set), and to create a response to clients that indicates 
for a number of fields (some are multi-valued), the document IDs with values 
for those fields. These fields are both indexed and stored. For existing 
operations, these fields are treated as facets and that works great.

I'd like to do this as efficiently as possible. It seems my problem will be 
solved if there is some way to get the document IDs associated with the facet 
counts.

I'm using SolrJ to talk to Solr in my application, and I don't see a way to do 
this. I'm using Solr 3.3.

Many thanks!

Jeff
--
Jeff Schmidt
535 Consulting
j...@535consulting.com
http://www.535consulting.com
(650) 423-1068


Re: POST VS GET and NON English Characters

2011-07-16 Thread Paul Libbrecht
If you have the option, try setting the default charset of the 
servlet-container to utf-8.
Typically this is done by setting a system property on startup.

My experience has been that the default used to be UTF-8, but that is less and 
less the case, sometimes in surprising ways!
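
A quick sketch of why converting the query to ISO-8859-1, as in the message quoted below, cannot be lossless for Lithuanian: several of the sample words contain characters that charset has no code point for, while German umlauts fit. The German word "für" is an added example, not from the thread. The sketch also builds a form-encoded POST body in UTF-8 with the charset declared explicitly. Whether the servlet container honours that header is an assumption that depends on its configuration.

```python
from urllib.parse import urlencode


def fits_latin1(text):
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# "für" is an assumed German example; the other words come from the thread.
for word in ("für", "Iš", "MEDŽIAGOS", "garbę"):
    print(word, fits_latin1(word))

# Form-encode the query as UTF-8 and say so in the Content-Type header.
body = urlencode({"q": "taškuose"}, encoding="utf-8").encode("ascii")
headers = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"}
print(body, headers)
```

Note that "garbę", listed as working in the thread, does not fit ISO-8859-1 either, so with //TRANSLIT it is presumably being sent in a transliterated form. That is a guess, not something the thread confirms.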

paul


On 16 Jul 2011, at 05:34, Sujatha Arun wrote:

> It works fine with the GET method, but I am wondering why it does not with
> the POST method.
> 
> 2011/7/15 pankaj bhatt 
> 
>> Hi Arun,
>> This looks like an encoding issue to me.
>> Can you change your browser settings to UTF-8 and hit the search URL
>> via the GET method?
>> 
>> We faced a similar problem with Chinese and Korean, and this
>> solved the problem.
>> 
>> / Pankaj Bhatt.
>> 
>> 2011/7/15 Sujatha Arun 
>> 
>>> Hello,
>>> 
>>> We have implemented Solr search in several languages. Initially we used
>>> the "GET" method for querying, but later moved to the "POST" method to
>>> accommodate lengthy queries.
>>> 
>>> When we moved from the GET to the POST method, the German characters could
>>> no longer be searched, and I had to use the function utf8_decode in my
>>> application for the search to work for German characters.
>>> 
>>> Currently I am doing this while querying using the POST method; we are
>>> using the standard request handler:
>>> 
>>> 
>>> $this->_queryterm=iconv("UTF-8", "ISO-8859-1//TRANSLIT//IGNORE",
>>> $this->_queryterm);
>>> 
>>> 
>>> This makes the query work for German characters and other languages, but
>>> it does not work for certain characters in Lithuanian and Spanish. Example:
>>> 
>>> Not working:
>>> 
>>>  - Iš
>>>  - Estremadūros
>>>  - sNaująjį
>>>  - MEDŽIAGOTYRA
>>>  - MEDŽIAGOS
>>>  - taškuose
>>> 
>>> Working:
>>> 
>>>  - garbę
>>>  - ieškoti
>>>  - ispanų
>>> 
>>> Any ideas or input?
>>> 
>>> Regards
>>> Sujatha
>>> 
>> 



Re: Document IDs instead of count for facets?

2011-07-16 Thread Erik Hatcher
I'm a bit confused by what you're asking for, but maybe it's as simple as 
making a fq=facet_field:facet_value&fl=id request?  You'd have to do this for 
each value in the field.
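
A rough sketch of the per-value requests, assuming the facet values have already been read from a normal facet response. The host, the "color" field, and the sample values are all hypothetical placeholders.

```python
from urllib.parse import urlencode


def ids_for_value_url(base_url, query, field, value):
    """Build a request returning only the IDs of docs where field has value."""
    # Escape backslashes and quotes so the value can sit inside a quoted phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", query),
        ("fq", '%s:"%s"' % (field, escaped)),
        ("fl", "id"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical facet values; in practice these come from the facet counts.
facet_values = ["red", "dark blue"]
urls = [
    ids_for_value_url("http://localhost:8983/solr/select", "*:*", "color", v)
    for v in facet_values
]
for u in urls:
    print(u)
```

Since rows defaults to 10, a rows parameter large enough to cover the facet count would be needed to get every ID back for a value.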

Erik

On Jul 16, 2011, at 09:59 , Jeff Schmidt wrote:

> Hello:
> 
> I have a need for applying certain terms and filter queries to my index 
> (producing the result set), and to create a response to clients that 
> indicates for a number of fields (some are multi-valued), the document IDs 
> with values for those fields. These fields are both indexed and stored. For 
> existing operations, these fields are treated as facets and that works great.
> 
> I'd like to do this as efficiently as possible. It seems my problem will be 
> solved if there is some way to get the document IDs associated with the facet 
> counts.
> 
> I'm using SolrJ to talk to Solr in my application, and I don't see a way to 
> do this. I'm using Solr 3.3.
> 
> Many thanks!
> 
> Jeff
> --
> Jeff Schmidt
> 535 Consulting
> j...@535consulting.com
> http://www.535consulting.com
> (650) 423-1068