Local Solr and Webserver-Solr act differently ("and" treated like "or")
Hello Solr experts,

I am currently having a strange issue with my Solr queries. I am running a small PHP/MySQL website that uses Solr for faster text searches in name lists, movie titles, etc. Recently I noticed that the results on my local development environment differ from those on my webserver. Both use the exact same MySQL database with identical Solr queries for the data import.

This is a sample query:

http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid

It is autogenerated by a PHP script and 100% identical on my local machine and on my webserver. My local Solr gives me the expected results: all entries that contain the words "into" AND "the" AND "wild*". But my webserver acts as if I were searching for "into" OR "the" OR "wild*", even though the query is the same (as shown above). That is why I get useless (far too many) results on the webserver side.

I don't know what the issue could be. I have tried to check the config files, but I don't really know what to look for, so searching through such a big file without guidance is overwhelming.

What could be the problem, where can I check for it, and how can I solve it? If additional information is needed, please let me know.

Thank you!

(Please excuse my poor English; it is not my native language.)
Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")
My local Solr gives me: http://pastebin.com/Q6d9dFmZ

and my webserver this: http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because the webserver output was too big even for pastebin.

On 16.10.2013 12:27, Erik Hatcher wrote:
> What does the debug output from debugQuery=true say between the two?
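For reference, the debug output Erik asks about can be requested by appending &debugQuery=true to the same select URL; using the sample query from the original post:

  http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&fl=titleid&debugQuery=true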
Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")
Thank you,

I found the file with the stopwords and noticed that my local file is empty (comments only) while the one on my webserver has a big list of English stopwords. That seems to be the problem.

I think stopwords are a good idea for general searches, but they are not useful in my particular case. Is there a way to (de)activate stopwords per query? For example, I would like to ignore stopwords when searching in titles, but use them when users do a full-text search on whole articles, etc.

Thanks again,
Stavros

On 17.10.2013 09:13, Upayavira wrote:
> Stopwords are small words such as "and", "the" or "is" that we might
> choose to exclude from our documents and queries because they are such
> common terms. Once you have stripped stop words from your above query,
> all that is left is the word "wild", or so is being suggested.
>
> Somewhere in your config, close to solrconfig.xml, you will find a file
> called something like stopwords.txt. Compare these files between your
> two systems.
>
> Upayavira
>
> On Thu, Oct 17, 2013, at 07:18 AM, Stavros Delsiavas wrote:
>> Unfortunately, I don't really know what stopwords are. I would like it
>> to not ignore any words of my query.
>> How/where can I change this stopwords behaviour?
>>
>> Am 16.10.2013 23:45, schrieb Jack Krupansky:
>>> So, the stopwords.txt file is different between the two systems - the
>>> first has stop words but the second does not. Did you expect stop
>>> words to be removed, or not?
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Stavros Delsiavas
>>> Sent: Wednesday, October 16, 2013 5:02 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Local Solr and Webserver-Solr act differently ("and"
>>> treated like "or")
>>>
>>> Okay, I understand.
>>>
>>> Here is the rawquerystring. It was at about line 3000. The webserver
>>> debug output:
>>>
>>> rawquerystring: title:(into AND the AND wild*)
>>> querystring: title:(into AND the AND wild*)
>>> parsedquery: +title:wild*
>>> parsedquery_toString: +title:wild*
>>>
>>> At this place the debug output DOES differ from the one on my local
>>> system, but I don't understand why. This is the local debug output:
>>>
>>> rawquerystring: title:(into AND the AND wild*)
>>> querystring: title:(into AND the AND wild*)
>>> parsedquery: +title:into +title:the +title:wild*
>>> parsedquery_toString: +title:into +title:the +title:wild*
>>>
>>> Why is that? Any ideas?
>>>
>>> Am 16.10.2013 21:03, schrieb Shawn Heisey:
>>>> What's really needed here is the first part of the debug section,
>>>> which has rawquerystring, querystring, parsedquery, and
>>>> parsedquery_toString. The info from your local Solr has this part,
>>>> but what you pasted from the webserver one didn't include those
>>>> parts, because it's further down than the first few hundred lines.
>>>>
>>>> Thanks,
>>>> Shawn
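For orientation, stop word handling is wired into schema.xml through a StopFilterFactory inside a field type's analyzer; the field type name below is only an illustration of what to look for when comparing the two systems:

  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <!-- removes the words listed in stopwords.txt at index time -->
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <!-- the same list is usually applied to queries as well -->
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>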
Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")
Okay, I emptied the stopword file. I don't know where the word list came from; I have never seen or touched that file before. Anyway...

Now my queries do work with a single stopword like "in" or "to", but they still do not work when I use more than one stopword within one query. Instead of too many results I now get NO results at all.

What could be the problem?

On 17.10.2013 15:02, Jack Krupansky wrote:
> The default Solr stopwords.txt file is empty, so SOMEBODY created that
> non-empty stop words file.
>
> The StopFilterFactory token filter in the field type analyzer controls
> stop word processing. You can remove that step entirely, or different
> field types can reference different stop word files, or some field type
> analyzers can use the stop filter and some would not have it. This does
> mean that you would have to use different field types for fields that
> want different stop word processing.
>
> -- Jack Krupansky
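Following Jack's suggestion, a separate field type without the stop filter could be used for title-like fields; the type and field names below are only illustrative. Note that changes to index-time analysis (including emptying stopwords.txt) only affect documents indexed afterwards, so existing data needs a full reindex, which may be why queries containing stopwords currently match nothing.

  <!-- hypothetical field type with no stop word removal -->
  <fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- the title field uses the stopword-free type; article text can keep a stop-filtered type -->
  <field name="title" type="text_title" indexed="true" stored="true"/>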
How to work with remote Solr safely?
Hello Solr friends,

I have a question about working with Solr installed on a remote server. I have a PHP project with a very big MySQL database of about 10 GB, and I am also using Solr, with about 10,000,000 entries indexed, for fast search and access to the MySQL data. I have a local copy myself so I can continue to work on the PHP project, but I want to make it available to more developers too.

How can I make Solr accessible ONLY to those selected developers? For MySQL it is no problem to add an additional MySQL user with limited access, but for Solr it seems difficult to me. I have had my administrator restrict the Java port 8080 to localhost only, so no one outside can access Solr or the Solr admin interface. How can I allow access to other developers without making the whole Solr interface (port 8080) available to the public?

Thanks,
Stavros
Re: How to work with remote Solr safely?
Thanks for your fast reply.

First of all, HTTP basic authentication unfortunately is not secure. Also, this would give every developer full admin privileges. Anyway, can you tell me where I can do those configurations? Are there any alternative or more secure ways to restrict Solr access?

In general, external developers need search-query access only. They should not be able to write/change the documents or access the Solr admin pages.

Thank you

Am 22.11.2013 15:34, schrieb michael.boom:
> Use HTTP basic authentication, setup in your servlet container
> (jetty/tomcat).
>
> That should work fine if you are *not* using SolrCloud.
>
> -----
> Thanks,
> Michael
Re: How to work with remote Solr safely?
Thanks for the suggestions. I will have a look at them and try them out.

Am 22.11.2013 16:01, schrieb Hoggarth, Gil:
> You could also use one of the proxy scripts, such as
> http://code.google.com/p/solr-php-client/, which is coincidentally
> linked (eventually) from Michael's suggested SolrSecurity URL.
>
> -----Original Message-----
> From: michael.boom [mailto:my_sky...@yahoo.com]
> Sent: 22 November 2013 14:53
> To: solr-user@lucene.apache.org
> Subject: Re: How to work with remote Solr safely?
>
> http://wiki.apache.org/solr/SolrSecurity#Path_Based_Authentication
>
> Maybe you could achieve write/read access limitation by setting up path
> based authentication: the update handler "/solr/core/update" should be
> protected by authentication, with credentials only known to you. But
> then of course, your indexing client will need to authenticate in order
> to add docs to Solr.
> Your select handler "/solr/core/select" could then be open or protected
> by HTTP auth with credentials open to developers.
>
> That's the first idea that comes to mind - haven't tested it.
> If you do, feedback and let us know how it went.
>
> -----
> Thanks,
> Michael
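As a rough sketch of the path-based idea at the servlet-container level (not taken from this thread; the role name, realm, and URL patterns are made up and would need to match the real handler paths), a web.xml fragment might look like this:

  <!-- sketch only: require a login for update and admin paths, leave select open -->
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr write and admin paths</web-resource-name>
      <url-pattern>/core1/update/*</url-pattern>   <!-- hypothetical core name -->
      <url-pattern>/admin/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>solr-admin</role-name>            <!-- hypothetical role -->
    </auth-constraint>
  </security-constraint>
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Solr</realm-name>
  </login-config>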
How to use Solr for two different projects on one server
Dear Solr experts,

I am using Solr successfully for my current web application on my server. Now I would like to use it in a second web application that is hosted on the same server. Is it possible to create two independent instances/databases in Solr? I know that I could create another set of fields with alternative field names, but I would prefer to keep my field naming independent across projects.

I would also like to have one state of my development version and one state of my production version on the server, so that I can run tests on the development state without interfering with the production version.

What is the best practice to achieve this, or how can this be done in general? I have searched Google but could not get any useful results because I don't even know what terms to search for with Solr. A minimal example would be most helpful.

Thanks a lot!

Stavros
Re: How to use Solr for two different projects on one server
Thanks for the fast responses. This looks like exactly what I was looking for!

Am 23.01.2014 09:46, schrieb Furkan KAMACI:
> Hi;
>
> Firstly you should read here and learn the terminology of Solr:
> http://wiki.apache.org/solr/SolrTerminology
>
> Thanks;
> Furkan KAMACI
>
> 2014/1/23 Alexandre Rafalovitch
>
>> If you are not worried about them stepping on each other's toes
>> (performance, disk space, etc), just create multiple collections.
>> There are examples of that in the standard distribution (e.g. the
>> badly named example/multicore).
>>
>> Regards,
>>    Alex.
Re: How to use Solr for two different projects on one server
I didn't know that the "core"-term is associated with this use case. I expected it to be some technical feature that allows to run more solr-instances for better multithread-cpu-usage. For example to activate two solr-cores when two cpu-cores are available on the server. So in general, I have the feeling that the term "core" is somewhat confusing for solr-beginners like me. Am 23.01.2014 09:54, schrieb Alexandre Rafalovitch: > Which is why it is curious that you did not find it. Looking back at > it now, do you have a suggestion of what could be improved to insure > people find this easier in the future? > > Regards, >Alex. > Personal website: http://www.outerthoughts.com/ > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch > - Time is the quality of nature that keeps events from happening all > at once. Lately, it doesn't seem to be working. (Anonymous - via GTD > book) > > > On Thu, Jan 23, 2014 at 3:49 PM, Stavros Delisavas > wrote: >> Thanks for the fast responses. Looks like exactly what I was looking for! >> >> >> >> >> Am 23.01.2014 09:46, schrieb Furkan KAMACI: >>> Hi; >>> >>> Firstly you should read here and learn the terminology of Solr: >>> http://wiki.apache.org/solr/SolrTerminology >>> >>> Thanks; >>> Furkan KAMACI >>> >>> >>> 2014/1/23 Alexandre Rafalovitch >>> >>>> If you are not worried about them stepping on each other's toes >>>> (performance, disk space, etc), just create multiple collections. >>>> There are examples of that in standard distribution (e.g. badly named >>>> example/multicore). >>>> >>>> Regards, >>>> Alex. >>>> Personal website: http://www.outerthoughts.com/ >>>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch >>>> - Time is the quality of nature that keeps events from happening all >>>> at once. Lately, it doesn't seem to be working. (Anonymous - via GTD >>>> book) >>>> >>>> >>>> On Thu, Jan 23, 2014 at 3:36 PM, Stavros Delisavas >>>> wrote: >>>>> Dear Solr-Experts, >>>>> >>>>> I am using Solr for my current web-application on my server successfully. >>>>> Now I would like to use it in my second web-application that is hosted >>>>> on the same server. Is it possible in any way to create two independent >>>>> instances/databases in Solr? I know that I could create another set of >>>>> fields with alternated field names, but I would prefer to be independent >>>>> on my field naming for all my projects. >>>>> >>>>> Also I would like to be able to have one state of my development version >>>>> and one state of my production version on my server so that I can do >>>>> tests on my development-state without interference on my >>>> production-version. >>>>> What is the best-practice to achieve this or how can this be done in >>>>> general? >>>>> >>>>> I have searched google but could not get any usefull results because I >>>>> don't even know what terms to search for with solr. >>>>> A minimal-example would be most helpfull. >>>>> >>>>> Thanks a lot! >>>>> >>>>> Stavros
Re: How to use Solr for two different projects on one server
So far, I have successfully managed to create a core from my existing configuration by opening this URL in my browser:

http://localhost:8080/solr/admin/cores?action=CREATE&name=glPrototypeCore&instanceDir=/etc/solr

The new status from http://localhost:8080/solr/admin/cores?action=STATUS is:

status: 0, QTime: 4

Core "" (the default core):
  instanceDir:  /usr/share/solr/./
  dataDir:      /var/lib/solr/data/
  startTime:    2014-01-23T08:42:39.087Z
  uptime:       3056197
  numDocs:      4401029
  maxDoc:       4401029
  version:      1370010628806
  segmentCount: 12
  current:      true
  hasDeletions: false
  directory:    org.apache.lucene.store.MMapDirectory:org.apache.lucene.store.MMapDirectory@/var/lib/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@77c58801
  lastModified: 2013-10-29T14:17:22Z

Core "glPrototypeCore":
  instanceDir:  /etc/solr/
  dataDir:      /var/lib/solr/data/
  startTime:    2014-01-23T09:29:30.019Z
  uptime:       245267
  numDocs:      4401029
  maxDoc:       4401029
  version:      1370010628806
  segmentCount: 12
  current:      true
  hasDeletions: false
  directory:    org.apache.lucene.store.MMapDirectory:org.apache.lucene.store.MMapDirectory@/var/lib/solr/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@5ad83862
  lastModified: 2013-10-29T14:17:22Z

From my understanding, I now have an unnamed core and a core named "glPrototypeCore" which uses the same configuration.

I copied the files data-config.xml and schema.xml into a new directory "/etc/solr/glinstance" and tried to create another core, but this always throws error 400. I even tried adding the schema and config parameters with full paths, but that made no difference. Also, I don't understand what the "dataDir" parameter is for; I could not find any data directories in /etc/solr/, but the creation of the first core worked anyway.

Can someone help? Is there a better place for my new instance directory, and what files do I really need?
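Both cores in the status above point at the same dataDir (/var/lib/solr/data/), which is presumably why they report identical index statistics. For what it is worth, the CoreAdmin CREATE call also accepts dataDir, config and schema parameters, so each core can get its own index directory; the core name and paths below are only placeholders:

  http://localhost:8080/solr/admin/cores?action=CREATE&name=glinstance&instanceDir=/etc/solr/glinstance&dataDir=/var/lib/solr/glinstance/data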
Re: How to use Solr for two different projects on one server
Thanks a lot, those are great examples. I managed to get my cores working.

What I noticed so far is that the first (auto-created) core symlinks files to /etc/solr/... or to /var/lib/solr/... I am now not sure where my self-made collections should live. Shall I create folders in /usr/share/solr/ and symlink to my files in /etc/solr, or can I keep real copies inside my collection folders? Is /usr/share/solr/ a good place for my collection folders at all?

Am 23.01.2014 12:16, schrieb Alexandre Rafalovitch:
> You need config-dir level schema.xml and solrconfig.xml. For multiple
> collections, you also need a top-level solr.xml. And unless the config
> files have a lot of references to other files, you need nothing else.
>
> For examples, check the example directory in the distribution. Or have
> a look at the examples from my book:
> https://github.com/arafalov/solr-indexing-book/tree/master/published .
> This shows the solr.xml that points at a lot of collections. The first
> nearly minimal collection is collection1, but you can then explore
> others for various degrees of complexity.
>
> Regards,
>    Alex.
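To picture what Alexandre describes, a minimal multi-core layout might look roughly like this; the directory names are placeholders and the actual Solr home depends on how Solr was installed:

  <solr home>/
    solr.xml                  (lists the cores/collections)
    core1/
      conf/
        solrconfig.xml
        schema.xml
        stopwords.txt
        data-config.xml       (only if the core uses the DataImportHandler)
      data/                   (index files, created by Solr)
    core2/
      conf/...
      data/...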
Re: How to use Solr for two different projects on one server
I installed Solr via apt-get and followed the online tutorials I found to adjust the existing schema.xml and to create data-config.xml the way I needed them. Was this the wrong approach? I don't know what a Bitnami stack is.

Am 23.01.2014 12:50, schrieb Alexandre Rafalovitch:
> You are not doing this on a download distribution, are you? You are
> using a Bitnami stack or something. That's why you are not seeing the
> examples folder, etc.
>
> I recommend stepping back: use the downloaded distribution and do your
> learning and setup using that. Then, go and see where your production
> stack put the various bits of Solr. Otherwise, you are doing two (15?)
> things at once.
>
> Regards,
>    Alex.
> P.s. If you like the examples, the book actually explains what they
> do. You could be a quarter of the way to mastery in less than 24 hours...
How to query multiple words correctly
Hello Solr community,

I am seeing some strange behaviour that I don't understand; I hope you can help. I am trying to query/search for two words, for example:

(*foo* AND *bar*)

What I want is all entries that contain the string foo AND contain the string bar. What I get is all entries that contain foo OR contain bar. But I want entries that contain BOTH words, like: "foobar 123", "bla foo bla bar", "blafoobla bar", etc.

What do I have to change in my query to get the desired result?

Thank you!
Re: How to query multiple words correctly
Thank you, problem solved!

On 13.07.2013 12:16, Otis Gospodnetic wrote:
> Hi,
>
> Does the same happen if you use +*foo* +*bar* syntax?
>
> If such queries turn out to be too slow, consider indexing ngrams.
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
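One practical note when applying Otis's +term syntax in a raw URL: a literal + has to be percent-encoded as %2B, because a bare + in a query string decodes to a space. An illustrative encoded query (the field name is assumed):

  q=title%3A%28%2B*foo*+%2B*bar*%29        decodes to  q=title:(+*foo* +*bar*)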
data-import problem
Hello Solr friends,

I have a problem with my current Solr configuration. I want to import two tables into Solr. I got it to work for the first table, but the second table does not get imported (no error message, 0 rows skipped).

I have two tables called name and title, and I want to load their fields id, name and id, title (two id columns that have nothing to do with each other).

This is in my data-config.xml:

[...]

and this is in my schema.xml:

<uniqueKey>id</uniqueKey>

I chose that unique key only because Solr asked for it. In my Solr Admin schema browser I can see three fields, id, name and title, but titleid is missing and title itself is empty with no entries.

I don't know how to get it to index two separate lists. I hope someone can help, thank you!

PS: I am sorry if this mail reached you twice. I sent it the first time when I was not yet registered and don't know if it was received, so I am sending it again after registering to the mailing list.
Re: data-import problem
Thanks so far. This change makes Solr go over the title entries too, yay! Unfortunately they don't get processed (skipped rows). In my log it says "missing required field id" for every entry.

I checked my schema.xml; in there, "id" is not set as a required field. Removing the uniqueKey property also leads to no improvement.

Any further ideas?

Am 05.06.2013 18:01, schrieb sodoo:
> Maybe the problem is the two document declarations in data-config.xml.
> You could try changing that.
Re: data-import problem
Thanks for the hints.

I am not sure how to solve this issue. I previously made a typo; there are definitely two different tables. Here is my real configuration: http://pastebin.com/JUDzaMk0

For testing purposes I added "LIMIT 10" to the SQL statements because my tables are very large and tests would take too long (about 5 GB, 6.5 million rows). I included my whole data-config and what I have changed from the default schema.xml.

I don't know how to solve the "all ids have to be unique" problem. I cannot believe that Solr offers no way at all to handle multiple data sources with their own individual ids. Maybe it is possible to have Solr create its own ids while importing the data?

Actually, there is no direct relation between my "name" table and my "title" table. All I want is to be able to do fast text search in those two tables in order to find the corresponding ids of the entries.

Let me know if you need more information. Thank you!

Am 05.06.2013 20:54, schrieb Gora Mohanty:
> There are several things wrong with your problem statement. You say
> that you have two tables, but both SELECTs seem to use the same table.
> I am going to assume that you really have two different tables.
>
> Unless you have changed the default schema.xml, "id" should be defined
> as the uniqueKey for the document. You probably do not want to remove
> that, and even if you just remove the uniqueKey property, the field
> "id" remains defined as a required field.
>
> The issue is with your SELECT for the second entity: it renames "id"
> to titleid, and hence the required field "id" in schema.xml is missing.
> You will also need to ensure that the ids are unique in the two tables,
> else entries from the second entity will overwrite matching ids from
> the first.
>
> Also, do you have field definitions within the entities? Please share
> the complete schema.xml and the DIH configuration file with us, rather
> than snippets. Use pastebin.com if they are large.
>
> Regards,
> Gora
Re: data-import problem
I tried to deactivate the uniqueKey, but that made Solr not work at all. I got error 500 for everything (no admin page, etc.), so I had to reactivate it.

This is my current configuration, as you recommended. Unfortunately, still no improvement; the second table doesn't get recorded. I included the error message from the log file: http://pastebin.com/0vut38qL

Has no one ever successfully imported two tables into Solr before?

Am 06.06.2013 00:01, schrieb bbarani:
> A Solr index does not need a unique key, but almost all indexes use one.
> http://wiki.apache.org/solr/UniqueKey
>
> Try the below query, passing id as id instead of titleid. A proper
> dataimport config will look like this:
Re: Heap space problem with mlt query
I recently had the same issue, and it could be fixed very easily: add the property batchSize="-1" to your <dataSource> tag. Tell me if that helped.

Am 06.06.2013 11:30, schrieb Varsha Rani:
> Hi,
>
> As per the suggestions, I changed my config file: reduced the document
> cache size from 31067 to 16384 and autowarmCount from 2046 to 1024. My
> machine's RAM size is 16 GB; 1 GB of RAM is used once the index of
> 85 GB is started.
>
> I am running 20-25 mlt queries per second. With each mlt query the RAM
> used increases continuously. When the RAM used reaches 6 GB, the Java
> heap space problem occurs. With every 5 consecutive mlt queries the RAM
> used increases by 1 GB.
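For context, batchSize belongs on the JDBC data source definition in data-config.xml; with MySQL, a value of -1 makes the driver stream rows instead of buffering the whole result set in memory. The connection details below are placeholders:

  <dataConfig>
    <!-- batchSize="-1" streams rows from MySQL instead of loading them all at once -->
    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb"
                user="solr_user" password="secret"
                batchSize="-1"/>
    <!-- document/entity definitions follow here -->
  </dataConfig>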
Re: data-import problem
It is surprising to me that all tables have to have a relationship in order to be used in Solr. What if I have two independent projects running on the same webserver? Would I really not be able to use Solr for both of them? That would be very disappointing...

Anyway, luckily there is an indirect relationship between the two tables, but it is an N-to-N relationship with a third table in between. The full join in MySQL would be something like this:

SELECT (cast.id??), title.id, title.title, name.id, name.name
FROM name, title, cast
WHERE title.id = cast.movie_id
  AND cast.person_id = name.id

But this will definitely lead to multiple entries of name.name and title.title because they are connected with an N-to-N relationship, so the resulting table would not have unique keys either: neither title.id nor name.id. There is another id available, cast.id, which could be used as a unique id, but it is a completely useless and irrelevant id with no connection/relation to anything else at all. So there is no real reason to include it, unless Solr really needs a unique id.

I am still a noob with Solr. Can you please help me adapt the given join to the XML syntax for my data-config.xml? That would be great!

Am 06.06.2013 17:58, schrieb bbarani:
> The below error clearly says that you have declared a unique id but
> that unique id is missing for some documents:
>
> org.apache.solr.common.SolrException: [doc=null] missing required field: nameid
>
> This is mainly because you are just trying to import 2 tables into a
> document without any relationship between the data of the 2 tables.
> Table 1 has the nameid (unique key), but table 2 has to be joined with
> table 1 to form a relationship between the 2 tables. You can't just
> dump the values, since table 2 might have more values than table 1 (but
> table 1 has the unique id).
>
> I am not sure of your table structure; I am assuming that there is a
> key (e.g. nameid in the title table) that can be used to join the name
> and title tables. Try something like this:
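bbarani's suggested snippet is not shown above. Purely as an illustration of how such a join can be expressed with nested DIH entities (table and column names taken from the SQL above, everything else assumed), it might look roughly like this; note that nested entities run one sub-query per parent row, so for millions of rows a single joined query or CachedSqlEntityProcessor is usually preferable:

  <entity name="title" query="SELECT id, title FROM title">
    <field column="id" name="titleid"/>
    <field column="title" name="title"/>
    <entity name="cast" query="SELECT person_id FROM cast WHERE movie_id='${title.id}'">
      <entity name="name" query="SELECT id, name FROM name WHERE id='${cast.person_id}'">
        <field column="name" name="name"/>
      </entity>
    </entity>
  </entity>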
Re: data-import problem
Unfortunately, my two tables do not share a unique key; they both have integer keys starting at 1. Is there any way to overcome this problem? Removing the uniqueKey property from my schema.xml leads to Solr not working at all (I have tried that already).

The link you provided shows what I had already tried before, which led to my current problem: when I set up my data-config as shown in that thread, my second table does not get recorded because of the missing field (name.id/nameid, the unique key) in my "title" table...

Am 06.06.2013 18:32, schrieb bbarani:
> You don't really need to have a relationship, but the unique id should
> be unique in a document. I had mentioned the relationship due to the
> fact that the unique key was present only in one table but not the
> other.
>
> Check out this link for more information on importing data from
> several unrelated tables:
> http://lucene.472066.n3.nabble.com/Create-index-on-few-unrelated-table-in-Solr-td4068054.html
Re: data-import problem
Perfect! This finally worked! Shawn, thank you a lot!

How do I set up multiple cores?

Again, thank you so much! I was looking for a solution for days!

Am 06.06.2013 19:23, schrieb Shawn Heisey:
> Change the id field to a StrField in your schema, and then use
> something like this:
>
> If these documents have no connection to each other at all, set up
> multiple cores so they are entirely separate indexes.
>
> Thanks,
> Shawn
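The snippet Shawn refers to is not preserved above. One common way to realize his description (a string id plus a per-table prefix) is DIH's TemplateTransformer; the sketch below reuses the table names from this thread but is otherwise only an assumption of what such a config could look like:

  <!-- ids become e.g. "name-123" and "title-123", so the two tables cannot collide -->
  <entity name="name" transformer="TemplateTransformer"
          query="SELECT id, name FROM name">
    <field column="id" template="name-${name.id}"/>
    <field column="name" name="name"/>
  </entity>
  <entity name="title" transformer="TemplateTransformer"
          query="SELECT id, title FROM title">
    <field column="id" template="title-${title.id}"/>
    <field column="title" name="title"/>
  </entity>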
Re: data-import problem
Think about movies and the cast of a movie. There are movies (title), which have their unique ids, and there are many people (name), like the producer, actors, etc., who have their unique ids. But there are people who have acted in more than one movie. That's why I have a third table which connects those two tables via name.id and title.id.

Anyway, I think my problem is satisfactorily solved for me. Do you think I did something wrong?

Am 06.06.2013 19:45, schrieb bbarani:
> Not sure if I understand your situation. I am not sure how you would
> relate the data between 2 tables if there is no relationship. You are
> trying to just dump random values from 2 tables into a document?
>
> Consider:
>
> Table 1:         Table 2:
> Name   id        Title      TitleId
> peter  1         CEO        111
> john   2         developer  222
> mike   3         Officer    333
>                  Cleaner    444
>                  IT         555
>
> Your document will look something like "1 peter CEO", but Peter is a
> cleaner and not a CEO.
Re: data-import problem
That's okay; for now, I guess it is fine. I was finally able to import all 6.6 million entries successfully. I am happy.

Am 06.06.2013 19:44, schrieb Shawn Heisey:
> Cores are defined in solr.xml - the default example core is named
> collection1. I am struggling to find documentation for multicore that
> is suitable for a novice. There is some information on this wiki page,
> but it is geared towards the use of the CoreAdmin API, not multiple
> cores themselves.
>
> http://wiki.apache.org/solr/CoreAdmin
>
> To access a specific core with query URLs, you don't use URLs like
> /solr/select that you might have seen in documentation; you use
> /solr/corename/select or /solr/corename/update instead.
>
> Thanks,
> Shawn
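To flesh out Shawn's pointer, a legacy-style solr.xml (as used in the Solr 3.x/4.x era) defining two cores might look roughly like this; the core names and paths are placeholders:

  <solr persistent="true">
    <cores adminPath="/admin/cores" defaultCoreName="collection1">
      <!-- each core has its own instanceDir (with conf/schema.xml and conf/solrconfig.xml) and dataDir -->
      <core name="collection1" instanceDir="collection1" dataDir="/var/lib/solr/collection1/data"/>
      <core name="project2" instanceDir="project2" dataDir="/var/lib/solr/project2/data"/>
    </cores>
  </solr>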