New to Solr can someone help me to know if Solr fits my use case

2014-03-26 Thread Saurabh Agarwal
Hi All,

I am new to Solr, and from my initial reading I am fairly convinced that Solr
will be of great help. Can anyone help me make that decision?

Use case:
1. I will have PDF and Word documents generated daily/weekly (a lot of them),
which get overwritten frequently.
2. I have a dictionary of sorts: a list of words/short sentences that should
appear in the above documents, words that must not appear, and alternatives
for some of them.
3. I want Solr to search the documents from step 1 for the words/short
sentences from step 2 and give me the document name and line number where
each one occurs.

Will Solr be a good fit for this? If anybody can share some examples, that
would be great.

Appreciate your help and patience.

Thanks
Saurabh


Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Saurabh Agarwal
Can anyone help me, please?


Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-27 Thread Saurabh Agarwal
Thanks a lot for your reply, Alex. I appreciate it.

So, if I leave out the line-number part:
1. I guess putting PDF/Word documents into Solr for search can be done. These
documents will go into Solr.
2. For search, is there any automatic way to supply an Excel sheet or a large
list of keywords to search for? That is, I have thousands of words that I
want to search for in the documents. Can I do that collectively, or do I
have to send the search queries one by one?

Thanks
Saurabh



On Fri, Mar 28, 2014 at 6:48 AM, Alexandre Rafalovitch
 wrote:
> This feels somewhat backwards. It's very hard to extract line-number
> information from MS Word and next to impossible from PDF. So the question
> is not whether Solr is a good fit here, but whether your whole
> architecture has a major issue. Can you do what you want by hand at least
> once? Down to the precision you want?
>
> If you can, then yes, you can probably automate the searching with
> Solr, though you will still have serious issues (sentences crossing
> line boundaries, etc.). But I suspect your whole approach will change
> once you try to do this manually.
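
The manual check suggested above can be approximated on text that has already been extracted to plain lines. The following is only a minimal sketch under that assumption: `find_terms`, the sample text, and the document name `report.txt` are hypothetical and not part of Solr. It uses case-insensitive substring matching, and it also shows the line-boundary problem: a phrase split across two lines is not reported.

```python
def find_terms(doc_name, text, terms):
    """Return (doc_name, line_number, term) for every term found on a line.

    Matching is a case-insensitive substring test, one line at a time,
    so a phrase that wraps across two lines is missed.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for term in terms:
            if term.lower() in lowered:
                hits.append((doc_name, line_no, term))
    return hits


# Hypothetical extracted text: "lazy dog" is split across lines 2 and 3.
sample = "The quick brown fox\njumps over the lazy\ndog near the river bank"
print(find_terms("report.txt", sample, ["brown fox", "lazy dog", "river"]))
```

Here "brown fox" is found on line 1 and "river" on line 3, while "lazy dog" goes unreported because it straddles a line break.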
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr 
> proficiency


Re: New to Solr can someone help me to know if Solr fits my use case

2014-03-31 Thread Saurabh Agarwal
Thanks a lot for the response, Alexandre. Much appreciated.

Thanks
Saurabh

On Fri, Mar 28, 2014 at 8:56 AM, Alexandre Rafalovitch
 wrote:
> 1. You don't actually put PDF/Word files into Solr. Instead, each file is
> run through a content and metadata extraction process, and the extracted
> text is what gets indexed. This is important because "a computer" does not
> understand what you are looking for when you open a PDF. It only
> understands whatever text it is possible to extract. In the case of PDF
> that is often not much at all, unless the file was generated with an
> accessibility layer in place. You can experiment with what you can extract
> by downloading a standalone Apache Tika install, which has a command-line
> version, or by using Solr's extractOnly flag. Solr uses Tika internally,
> so the results should be the same.
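
A minimal sketch of trying the extractOnly route from a script, using only the Python standard library. The host, port, core name `collection1`, and the `/update/extract` handler path are assumptions about a typical setup and may differ on your install; `extract_only_url` and `extract_text` are hypothetical helper names. The generic `application/octet-stream` content type is also an assumption; a specific type such as `application/pdf` may work better.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def extract_only_url(base_url, core):
    """Build the URL for Solr's extracting handler in extract-only mode."""
    params = urlencode(
        {"extractOnly": "true", "extractFormat": "text", "wt": "json"}
    )
    return f"{base_url}/{core}/update/extract?{params}"


def extract_text(path, base_url="http://localhost:8983/solr", core="collection1"):
    """POST a file to Solr and return the raw response; nothing is indexed."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        extract_only_url(base_url, core),
        data=body,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


print(extract_only_url("http://localhost:8983/solr", "collection1"))
```

Calling `extract_text("some.pdf")` against a running Solr should return whatever text Tika managed to pull out, which makes it easy to see how much of a given PDF is actually searchable.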
>
> 2. When you do a search, you can use "field:(Keyword1 Keyword2 Keyword3
> Keyword4)" and you get as results any document that matches at least one
> of those. I'm not sure about 1000 of them in one go, but certainly a large
> number.
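
A rough sketch of building such queries from a long keyword list, split into batches so that no single query grows too large. The helper name `build_or_queries`, the field name `text`, and the batch size are hypothetical. It assumes the default OR operator between clauses; Solr's `maxBooleanClauses` setting (commonly 1024 by default) limits how many clauses one query may contain.

```python
def build_or_queries(field, keywords, batch_size=500):
    """Return a list of 'field:("kw1" "kw2" ...)' query strings.

    Each keyword is wrapped in double quotes so multi-word entries are
    treated as phrases; backslashes and embedded quotes are escaped.
    """
    queries = []
    for start in range(0, len(keywords), batch_size):
        batch = keywords[start:start + batch_size]
        quoted = " ".join(
            '"' + kw.replace("\\", "\\\\").replace('"', '\\"') + '"'
            for kw in batch
        )
        queries.append(f"{field}:({quoted})")
    return queries


for q in build_or_queries("text", ["alpha", "small sentence", "gamma"], batch_size=2):
    print(q)
```

Each resulting string can be sent as the `q` parameter of a normal select request. Note that one combined query tells you which documents matched, not which keyword matched in each one.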
>
> On the other hand, if you have the same keywords all the time and you are
> trying to match documents against them, you might be more interested
> in Elasticsearch's percolator
> (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-percolate.html
> ) or in Luwak (https://github.com/flaxsearch/luwak).
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr 
> proficiency


question related to solr LTR plugin

2017-03-06 Thread Saurabh Agarwal (BLOOMBERG/ 731 LEX)
Hi, 

I have a question about the Solr LTR plugin. I have a personalization use
case and am wondering whether you can help me with it. I would like to
rerank my query results based on the relationship between the searcher and
the author of each returned document. I have relationship scores in an
external datastore in the form: user1 (searcher), user2 (author),
relationship score. In my query, I can pass the searcher ID as an external
feature. My question is: at query time, how do I retrieve the relationship
score for each document as a feature and rerank the documents? Would I need
to implement a custom feature to do so, and if so, how do I implement it?

Thanks,
Saurabh
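
To make the desired behaviour concrete, here is a small client-side illustration of the rerank described above. It is not the LTR plugin's API and not a custom feature implementation; it only shows the intended lookup of a (searcher, author) score and its effect on ordering. The function `rerank`, the document fields `author` and `score`, and the additive `weight` are all assumptions made for the example.

```python
def rerank(searcher_id, docs, relationship_scores, weight=1.0):
    """Reorder docs by original score plus a weighted relationship score.

    relationship_scores maps (searcher, author) pairs to a float; pairs
    that are missing from the mapping contribute 0.0.
    """
    def combined(doc):
        rel = relationship_scores.get((searcher_id, doc["author"]), 0.0)
        return doc["score"] + weight * rel

    return sorted(docs, key=combined, reverse=True)


# Hypothetical search results and relationship data.
docs = [
    {"id": "d1", "author": "u2", "score": 1.0},
    {"id": "d2", "author": "u3", "score": 0.8},
]
scores = {("u1", "u3"): 0.5}
print([d["id"] for d in rerank("u1", docs, scores)])
```

In this example d2 moves ahead of d1 because searcher u1 has a relationship score with its author u3.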