Re: very slow add/commit time

2009-11-03 Thread Bruno
How many MB of cache have you set in your solrconfig.xml?

On Tue, Nov 3, 2009 at 12:24 PM, Marc Des Garets wrote:

> Hi,
>
>
>
> I am experiencing a problem with an index of about 80 million documents
> (41 GB). I am trying to update documents in this index using SolrJ.
>
>
>
> When I do:
>
> solrServer.add(docs);  // docs is a List<SolrInputDocument> containing
> 1000 documents (takes 36 sec)
>
> solrServer.commit(false,false); // either ends with an OutOfMemoryError
> or takes forever
>
>
>
> I have -Xms4g -Xmx4g
>
>
>
> Any idea what could be the problem?
>
>
>
> Thanks for your help.
>
>
> --
> This transmission is strictly confidential, possibly legally privileged,
> and intended solely for the
> addressee.  Any views or opinions expressed within it are those of the
> author and do not necessarily
> represent those of 192.com, i-CD Publishing (UK) Ltd or any of its
> subsidiary companies.  If you
> are not the intended recipient then you must not disclose, copy or take any
> action in reliance of this
> transmission. If you have received this transmission in error, please
> notify the sender as soon as
> possible.  No employee or agent is authorised to conclude any binding
> agreement on behalf of
> i-CD Publishing (UK) Ltd with another party by email without express
> written confirmation by an
> authorised employee of the Company. http://www.192.com (Tel: 08000 192
> 192).  i-CD Publishing (UK) Ltd
> is incorporated in England and Wales, company number 3148549, VAT No. GB
> 673128728.




-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: very slow add/commit time

2009-11-03 Thread Bruno
Try raising your ramBufferSizeMB (it helped a lot when my team had
performance issues).

Also try checking this link (it helps a lot):
http://wiki.apache.org/solr/SolrPerformanceFactors
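For reference, a minimal sketch of where the setting being discussed lives in solrconfig.xml; the element placement matches Solr 1.x-era configs, and the value is illustrative only (it must fit within the JVM heap):

```xml
<!-- solrconfig.xml, inside the index settings section (<indexDefaults> /
     <mainIndex> in Solr 1.x): how many MB Lucene may buffer in RAM before
     flushing a segment to disk. 512 is illustrative, not a recommendation. -->
<ramBufferSizeMB>512</ramBufferSizeMB>
```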

Regards

On Tue, Nov 3, 2009 at 12:38 PM, Marc Des Garets wrote:

> If you mean ramBufferSizeMB, I have it set to 512. The maxBufferedDocs
> setting is commented out. If you mean queryResultMaxDocsCached, it is set
> to 200, but is it used when indexing?
>
> -Original Message-
> From: Bruno [mailto:brun...@gmail.com]
> Sent: 03 November 2009 14:27
> To: solr-user@lucene.apache.org
> Subject: Re: very slow add/commit time
>
> How many MB of cache have you set in your solrconfig.xml?
>




-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Grouping

2009-12-04 Thread Bruno
Is there a way to make a group by or distinct query?

-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: SolrJ: Highlighting not Working

2009-06-18 Thread Bruno
I've tried with the default values and it didn't work either.


On Thu, Jun 18, 2009 at 2:31 PM, Mark Miller  wrote:

> Why do you have:
> query.set("hl.maxAnalyzedChars", -1);
>
> Have you tried using the default? Unless -1 is an undoc'd feature, this
> means you wouldn't get anything back! This should normally be a fairly hefty
> value and defaults to 51200, according to the wiki.
>
> And why:
> query.set("hl.fragsize", 1);
>
> That means a fragment could only be 1 char - again, I'd try the default
> (take out the param), and adjust from there.
> (wiki says the default is 100).
>
> Let us know how it goes.
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
> Bruno wrote:
>
>>  Hi guys.
>>  I'm new to using highlighting, so probably I'm making some stupid mistake;
>> however, I'm not finding anything wrong.
>>  I use highlighting from a query within an EmbeddedSolrServer, and in the
>> query I've set the parameters necessary for enabling highlighting. Attached
>> are my schema and solrconfig.xml, and below is the Java code.
>> Content from the SolrDocumentList is not highlighted.
>>
>> EmbeddedSolrServer server = SolrServerManager.getServerEv();
>> String queryString = filter;
>> SolrQuery query = new SolrQuery();
>>
>> query.setQuery(queryString);
>> query.setHighlight(true);
>> query.addHighlightField(LOG_FIELD);
>> query.setHighlightSimplePost("");
>> query.setHighlightSimplePre("");
>> query.set("hl.usePhraseHighlighter", true);
>> query.set("hl.highlightMultiTerm", true);
>> query.set("hl.snippets", 100);
>> query.set("hl.fragsize", 1);
>> query.set("hl.mergeContiguous", false);
>> query.set("hl.requireFieldMatch", false);
>> query.set("hl.maxAnalyzedChars", -1);
>>
>> query.addSortField(DATE_FIELD, SolrQuery.ORDER.asc);
>> query.setFacetLimit(LogUtilProperties.getInstance().getProperty(LogUtilProperties.LOGEVENT_SEARCH_RESULT_SIZE, 1));
>> query.setRows(LogUtilProperties.getInstance().getProperty(LogUtilProperties.LOGEVENT_SEARCH_RESULT_SIZE, 1));
>> query.setIncludeScore(true);
>> QueryResponse rsp = server.query(query);
>> SolrDocumentList docs = rsp.getResults();
>>
>> --
>> Bruno Morelli Vargas
>> Mail: brun...@gmail.com
>> Msn: brun...@hotmail.com
>> Icq: 165055101
>> Skype: morellibmv
>>
>>
>
>
>


-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: SolrJ: Highlighting not Working

2009-06-18 Thread Bruno
A couple of things I forgot to mention:

Solr version: 1.3
Environment: WebSphere



-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: SolrJ: Highlighting not Working

2009-06-18 Thread Bruno
Here is the query, searching for the term "ipod" in the "log" field:
q=log%3Aipod+AND+requestid%3A1029+AND+logfilename%3Apayxdev-1245272062125-USS.log.zip&hl=true&hl.fl=log&hl.fl=message&hl.simple.post=%3Ci%3E&hl.simple.pre=%3C%2Fi%3E&hl.usePhraseHighlighter=true&hl.highlightMultiTerm=true&hl.snippets=100&hl.fragsize=100&hl.mergeContiguous=false&hl.requireFieldMatch=false&hl.maxAnalyzedChars=-1&sort=timestamp+asc&facet.limit=6000&rows=6000&fl=score
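Decoded for readability, a small sketch using only the JDK (not part of the original thread). Note that in the encoded query above, hl.simple.pre decodes to </i> and hl.simple.post to <i>, which look swapped:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecodeQuery {
    // Decodes an application/x-www-form-urlencoded fragment:
    // '+' becomes a space, %3A becomes ':', %2F becomes '/', etc.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("log%3Aipod+AND+requestid%3A1029"));
        // prints: log:ipod AND requestid:1029
        System.out.println(decode("hl.simple.pre=%3C%2Fi%3E"));
        // prints: hl.simple.pre=</i>
    }
}
```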

On Thu, Jun 18, 2009 at 2:51 PM, Mark Miller  wrote:

> Nothing off the top of my head ...
>
> I can play around with some of the solrj unit tests a bit later and perhaps
> see if I can dig anything up.
>
> Note:
> if you expect wildcard/prefix/etc queries to highlight, they will not with
> Solr 1.3.
>
> query.set("hl.highlightMultiTerm", true);
>
> The above only applies to Solr 1.4.
> So if your query is just a wildcard ...
>
> What is your query, by the way?
>
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>


-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: SolrJ: Highlighting not Working

2009-06-18 Thread Bruno
I've checked the NamedList you told me about, but it contains only one
highlighted doc, when I have more docs that should be highlighted.

On Thu, Jun 18, 2009 at 3:03 PM, Erik Hatcher wrote:

> Note that highlighting is NOT part of the document list returned.  It's in
> an additional NamedList section of the response (with name="highlighting")
>
>Erik
>


-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: SolrJ: Highlighting not Working

2009-06-18 Thread Bruno
Just figured out what happened... The schema needs to have a uniqueKey set;
otherwise highlighting will have at most one entry, as the map's key is the
doc's uniqueKey. On debugging I found that QueryResponse tries to put all the
highlighted results into a map with a null key, and in the end, putting tons
of entries all under a null key results in a one-entry map.

Thanks for the help guys.
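Bruno's diagnosis can be reproduced with plain JDK collections. The sketch below only simulates the map behaviour he describes (the class and method names are made up, not SolrJ API): every put under a null key overwrites the previous one, so the highlighting map collapses to a single entry.

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyCollapse {
    // Simulates building a highlighting map keyed by each doc's uniqueKey.
    // With no <uniqueKey> in the schema, every key is null, and HashMap
    // treats null as one key, so each put() overwrites the last entry.
    static int entryCount(String[] uniqueKeys) {
        Map<String, String> highlighting = new HashMap<>();
        for (String key : uniqueKeys) {
            highlighting.put(key, "snippets for " + key);
        }
        return highlighting.size();
    }

    public static void main(String[] args) {
        // Three docs without a uniqueKey collapse into one map entry.
        System.out.println(entryCount(new String[]{null, null, null})); // prints 1
        // Three docs with distinct ids keep three entries.
        System.out.println(entryCount(new String[]{"d1", "d2", "d3"})); // prints 3
    }
}
```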



-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: Slowness during submit the index

2009-06-20 Thread Bruno
d your email correctly, but I think you
>>>> are saying
>>>> you are indexing your DB content into a Solr index.  If this is
>>>> correct, here
>>>> are things to look at:
>>>> * is the java version the same on both machines (QA vs. PROD)
>>>> * are the same java parameters being used on both machines
>>>> * is the connection to the DB the same on both machines
>>>> * are both the PROD and QA DB servers the same and are both DB
>>>> instances the
>>>> same
>>>> ...
>>>>
>>>>
>>>> Otis
>>>> --
>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>>
>>>>
>>>>
>>>> - Original Message 
>>>>> From: Francis Yakin
>>>>> To: "solr-user@lucene.apache.org"
>>>>> Sent: Friday, June 19, 2009 5:27:59 PM
>>>>> Subject: Slowness during submit the index
>>>>>
>>>>>
>>>>> We are experiencing slowness during reloading/resubmitting index
>>>>> from
>>> Database
>>>>> to the master.
>>>>>
>>>>> We have two environments:
>>>>>
>>>>> QA and Prod.
>>>>>
>>>>> The slowness is happened only in Production but not in QA.
>>>>>
>>>>> It takes only one hour to reload 2.5 million documents in QA, compared
>>>>> to 5-6 hours to load the same size of index in Prod.
>>>>>
>>>>> I checked both the config files in QA and Prod, they are all
>>>>> identical,
>>>> except:
>>>>>
>>>>>
>>>>> In QA:
>>>>> false
>>>>> In Prod:
>>>>> true
>>>>>
>>>>> I believe that we use the HTTP protocol to reload/submit the index
>>>>> from the database to the Solr master.
>>>>> I did test copying big files thru network from database to the
>>>>> solr box, I
>>>> don't
>>>>> see any issue.
>>>>>
>>>>> We are running solr 1.2
>>>>>
>>>>> Any inputs will be much appreciated.
>>
>
>

-- 
Sent from my phone

Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Reading a parameter from a String.

2010-04-14 Thread Bruno
I need to change a parameter from within a query string.

*:* AND requestid:100 AND timestamp:[2010-04-13T20:30:00.000Z TO
2010-04-13T21:00:00.000Z] AND
source:"LogCollector-risidev3was2.201002020100._opt_ISI_logs.FNM.stdout_ISIREG_10.02.01_02.00.00.txt.tar.gz-stdout_ISIREG_10.02.01_02.00.00.txt.FNM.risidev3was2_opt_ISI_logs.201002020100"

In this case I have to change the timestamp parameters.

Is there a way?
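One client-side approach (a sketch, not a Solr feature; the helper below is hypothetical) is to rewrite the range clause with a regular expression before sending the query:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampRewrite {
    // Replaces the bounds of the first timestamp:[A TO B] range clause
    // in an existing query string with new bounds.
    static String replaceTimestampRange(String query, String from, String to) {
        Pattern range = Pattern.compile("timestamp:\\[[^\\]]*\\]");
        return range.matcher(query).replaceFirst(
                Matcher.quoteReplacement("timestamp:[" + from + " TO " + to + "]"));
    }

    public static void main(String[] args) {
        String q = "*:* AND requestid:100 AND "
                + "timestamp:[2010-04-13T20:30:00.000Z TO 2010-04-13T21:00:00.000Z]";
        System.out.println(replaceTimestampRange(q,
                "2010-04-14T00:00:00.000Z", "2010-04-14T01:00:00.000Z"));
    }
}
```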

-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


document support for file system crawling

2006-08-29 Thread Bruno

Hi there,

Browsing through the message threads, I tried to find a thread addressing file
system crawls. I want to implement an enterprise search over a networked
filesystem, crawling all sorts of documents, such as HTML, DOC, PPT and PDF.
Nutch provides plugins enabling it to read proprietary formats.
Is there support for the same functionality in Solr?

Bruno
-- 
View this message in context: 
http://www.nabble.com/document-support-for-file-system-crawling-tf2188066.html#a6053318
Sent from the Solr - User forum at Nabble.com.



Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Hi All,

Solr 5.4, Ubuntu

I thought it would be simple to query across two collections with the same
schema, but apparently not.
I have one Solr instance running, with 300,000 records in each collection.

I tried this request, but it does not return results from both collections:

http://my_adress:my_port/solr/C1/select?collection=C1,C2&q=fid:34520196&wt=json

this request returns only C1 results and if I do:

http://my_adress:my_port/solr/C2/select?collection=C1,C2&q=fid:34520196&wt=json

it returns only C2 results.

I have 5 identical fields in both collections:
id, fid, st, cc, timestamp
where id is the unique key field.

Can someone explain why it doesn't work?

Thanks a lot !
Bruno

---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com



Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

yes id value is unique in C1 and unique in C2.
id in C1 is never present in C2
id in C2 is never present in C1

On 06/01/2016 11:12, Binoy Dalal wrote:

Are the id values for docs in both collections exactly the same?
To get proper results, the ids should be unique across both cores.

--

Regards,
Binoy Dalal







Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Hi Susheel, Emir,

Yes, I checked, and I have one result in c1 and one in c2 with the same query
fid:34520196:


http://xxx.xxx.xxx.xxx:/solr/c1/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"fid,cc*,st",
      "indent":"true",
      "q":"fid:34520196",
      "collection":"c1,c2",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"EP1680447",
        "st":"LAPSED",
        "fid":"34520196"}]
  }
}


http://xxx.xxx.xxx.xxx:/solr/c2/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "fl":"id,fid,cc*,st",
  "indent":"true",
  "q":"fid:34520196",
  "collection":"c1,c2",
  "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
  {
"id":"WO2005040212",
"st":"PENDING",
"cc_CA":"LAPSED",
"cc_EP":"LAPSED",
"cc_JP":"PENDING",
"cc_US":"LAPSED",
"fid":"34520196"}]
  }}


I have the same xxx.xxx.xxx.xxx: (server:port).
unique key field C1, C2 : id

id data in C1 is different from id data in C2.

Must I configure something in Solr?

thanks,
Bruno

On 06/01/2016 14:56, Emir Arnautovic wrote:

Hi Bruno,
Can you check the counts? Is it possible that the first page contains only
results from the collection that you sent the request to, so you assumed it
returns only results from a single collection?


Thanks,
Emir

On 06.01.2016 14:33, Susheel Kumar wrote:

Hi Bruno,

I just tested this scenario in my local Solr 5.3.1 and it returned results
from two identical collections. I doubt it is broken in 5.4; just double-check
that you are not missing anything else.

Thanks,
Susheel

http://localhost:8983/solr/c1/select?q=id_type%3Ahello&wt=json&indent=true&collection=c1,c2 



{"responseHeader": {"status": 0, "QTime": 98, "params": {"q": "id_type:hello",
  "indent": "true", "collection": "c1,c2", "wt": "json"}},
 "response": {"numFound": 2, "start": 0, "maxScore": 1, "docs": [
   {"id": "1", "id_type": "hello", "_version_": 1522623395043213300},
   {"id": "3", "id_type": "hello", "_version_": 1522623422397415400}]}}











Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

I have a dev server; I will do some tests on it...

On 06/01/2016 17:31, Susheel Kumar wrote:

I'd suggest you set up some test data locally and try this out.
That will confirm your understanding.

Thanks,
Susheel

On Wed, Jan 6, 2016 at 10:39 AM, Bruno Mannina  wrote:


Hi Susheel, Emir,

yes I check, and I have one result in c1 and in c2 with the same query
fid:34520196

http://xxx.xxx.xxx.xxx:
/solr/c1/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{ "responseHeader":{ "status":0, "QTime":1, "params":{ "fl":"fid,cc*,st",
"indent":"true", "q":"fid:34520196", "collection":"c1,c2", "wt":"json"}},
"response":{"numFound":1,"start":0,"docs":[ {

 "id":"EP1680447",
 "st":"LAPSED",
 "fid":"34520196"}]
   }
}


http://xxx.xxx.xxx.xxx:
/solr/c2/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
   "responseHeader":{
 "status":0,
 "QTime":0,
 "params":{
   "fl":"id,fid,cc*,st",
   "indent":"true",
   "q":"fid:34520196",
   "collection":"c1,c2",
   "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
   {
 "id":"WO2005040212",
 "st":"PENDING",
 "cc_CA":"LAPSED",
 "cc_EP":"LAPSED",
 "cc_JP":"PENDING",
 "cc_US":"LAPSED",
 "fid":"34520196"}]
   }}


I have the same xxx.xxx.xxx.xxx: (server:port).
unique key field C1, C2 : id

id data in C1 is different of id data in C2

Must I config/set something in solr ?

thanks,
Bruno


Le 06/01/2016 14:56, Emir Arnautovic a écrit :


Hi Bruno,
Can you check counts? Is it possible that first page is only with results
from collection that you sent request to so you assumed it returns only
results from single collection?

Thanks,
Emir

On 06.01.2016 14:33, Susheel Kumar wrote:


Hi Bruno,

I just tested this scenario in my local solr 5.3.1 and it returned
results
from two identical collections. I doubt if it is broken in 5.4 just
double
check if you are not missing anything else.

Thanks,
Susheel


http://localhost:8983/solr/c1/select?q=id_type%3Ahello&wt=json&indent=true&collection=c1,c2

responseHeader": {"status": 0,"QTime": 98,"params": {"q":
"id_type:hello","
indent": "true","collection": "c1,c2","wt": "json"}},
response": {"numFound": 2,"start": 0,"maxScore": 1,"docs": [{"id": "1","
id_type": "hello","_version_": 1522623395043213300},{"id":
"3","id_type":"
hello","_version_": 1522623422397415400}]}

On Wed, Jan 6, 2016 at 6:13 AM, Bruno Mannina  wrote:

yes id value is unique in C1 and unique in C2.

id in C1 is never present in C2
id in C2 is never present in C1


Le 06/01/2016 11:12, Binoy Dalal a écrit :

Are Id values for docs in both the collections exactly same?

To get proper results, the ids should be unique across both the cores.

On Wed, 6 Jan 2016, 15:11 Bruno Mannina  wrote:

Hi All,


Solr 5.4, Ubuntu

I thought it was simple to request across two collections with the
same
schema but not.
I have one solr instance launch. 300 000 records in each collection.

I try to use this request without having both results:

http://my_adress:my_port
/solr/C1/select?collection=C1,C2&q=fid:34520196&wt=json

this request returns only C1 results and if I do:

http://my_adress:my_port
/solr/C2/select?collection=C1,C2&q=fid:34520196&wt=json

it returns only C2 results.

I have 5 identical fields on both collection
id, fid, st, cc, timestamp
where id is the unique key field.

Could someone explain why it doesn't work?

Thanks a lot !
Bruno

---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com

--

Regards,

Binoy Dalal





Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina
Same result on my dev' server; it seems the collection param has no effect
on the query...

Q: I don't see the "collection" param for the select handler in the Solr 5.4
doc; is it still present in version 5.4?


On 06/01/2016 17:38, Bruno Mannina wrote:

I have a dev' server, I will do some test on it...

On 06/01/2016 17:31, Susheel Kumar wrote:

I'll suggest if you can setup some some test data locally and try this
out.  This will confirm your understanding.

Thanks,
Susheel

On Wed, Jan 6, 2016 at 10:39 AM, Bruno Mannina  wrote:


Hi Susheel, Emir,

yes I checked, and I have one result in c1 and one in c2 with the same query
fid:34520196

http://xxx.xxx.xxx.xxx:/solr/c1/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"fid,cc*,st",
      "indent":"true",
      "q":"fid:34520196",
      "collection":"c1,c2",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"EP1680447",
        "st":"LAPSED",
        "fid":"34520196"}]
  }
}


http://xxx.xxx.xxx.xxx:
/solr/c2/select?q=fid:34520196&wt=json&indent=true&fl=id,fid,cc*,st&collection=c1,c2 



{
   "responseHeader":{
 "status":0,
 "QTime":0,
 "params":{
   "fl":"id,fid,cc*,st",
   "indent":"true",
   "q":"fid:34520196",
   "collection":"c1,c2",
   "wt":"json"}},
   "response":{"numFound":1,"start":0,"docs":[
   {
 "id":"WO2005040212",
 "st":"PENDING",
 "cc_CA":"LAPSED",
 "cc_EP":"LAPSED",
 "cc_JP":"PENDING",
 "cc_US":"LAPSED",
 "fid":"34520196"}]
   }}





Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Hi Esther,

yes, I saw it, but if I use:

q={!join from=fid to=fid}fid:34520196 (with or without &collection=c1,c2)

I get only the result from the collection used in the select (c1)

On 06/01/2016 17:52, esther.quan...@lucidworks.com wrote:

Hi Bruno,

You might consider using the JoinQueryParser. Details here : 
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser

Best,
Esther



Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

:( it doesn't work for me

http://my_adress:my_port/solr/c1/select?q={!join from=fid to=fid fromIndex=c2}fid:34520196&wt=json

the result is always the same; it answers only for c1
34520196 has results in both collections



On 06/01/2016 18:16, Binoy Dalal wrote:

Bruno,
Use join like so:
{!join from=f1 to=f2 fromIndex=c2}
On c1


Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Yeah! It works with your method!

Thanks a lot, Esther!


On 06/01/2016 19:15, Esther-Melaine Quansah wrote:

Ok, so join won’t work. Distributed search is your answer. This worked for me:

http://localhost:8983/solr/temp/select?shards=localhost:8983/solr/job,localhost:8983/solr/temp&q=*:*

so for you it’d look something like:

http://localhost:8983/solr/c1/select?shards=localhost:8983/solr/c1,localhost:8983/solr/c2&q=fid:34520196

and obviously, you’ll just choose the ports that correspond to your
configuration.

Esther
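As a side note for readers, the shards request Esther describes can be assembled programmatically. A minimal Python sketch, using the thread's example host, port, and core names; `build_sharded_query` is an illustrative helper, not part of any Solr client API:

```python
# Build a Solr select URL that fans one query out over several cores via
# the "shards" parameter. Host/port/core names are the example values
# from this thread.
from urllib.parse import urlencode

def build_sharded_query(host, port, cores, query):
    """Return a select URL on the first core that searches all cores."""
    shards = ",".join("%s:%d/solr/%s" % (host, port, c) for c in cores)
    params = urlencode({"shards": shards, "q": query, "wt": "json"})
    # The request can go to any one of the cores; "shards" controls fan-out.
    return "http://%s:%d/solr/%s/select?%s" % (host, port, cores[0], params)

url = build_sharded_query("localhost", 8983, ["c1", "c2"], "fid:34520196")
print(url)
```

Note that the special characters in the shards list are percent-encoded by `urlencode`; Solr accepts either form.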


Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Hi Shawn,

thanks for this info. I use Solr alone on my own server.

On 06/01/2016 20:13, Shawn Heisey wrote:

On 1/6/2016 2:41 AM, Bruno Mannina wrote:

I try to use this request without having both results:

http://my_adress:my_port/solr/C1/select?collection=C1,C2&q=fid:34520196&wt=json


this request returns only C1 results and if I do:

http://my_adress:my_port/solr/C2/select?collection=C1,C2&q=fid:34520196&wt=json


it returns only C2 results.

Are you running in SolrCloud mode (with zookeeper)?  If you're not, then
the collection parameter doesn't do anything, and old-style distributed
search (with the shards parameter) will be your only option.

Thanks,
Shawn









Re: Newbie: Searching across 2 collections ?

2016-01-06 Thread Bruno Mannina

Hi,

is it possible that the problem was the one Shawn described, and that you
are running SolrCloud mode (with ZooKeeper)?

The solution given by Esther works fine, so it's OK for me :)

**

Are you running in SolrCloud mode (with zookeeper)?  If you're not, then
the collection parameter doesn't do anything, and old-style distributed
search (with the shards parameter) will be your only option.

Thanks,
Shawn

***

On 06/01/2016 19:17, Susheel Kumar wrote:

Hi Bruno,  I just tested on 5.4 for your sake and it works fine.  You are
goofing up somewhere.  Please create a new simple schema, different from
your use case, with 2-3 fields and 2-3 documents, and test this out
independently of your current problem.  That's the suggestion I can make,
and I did the same to confirm this.

On Wed, Jan 6, 2016 at 11:48 AM, Bruno Mannina  wrote:



Wildcard "?" ?

2015-10-21 Thread Bruno Mannina

Dear Solr-user,

I'm surprised to see that in my Solr 5.0 the wildcard ? always matches
exactly 1 character.

my request is:

title:magnet? AND tire?

Solr finds only titles with a character after magnet and tire, but does not
find titles with only magnet AND tire.

Do you know where I can tell Solr that the ? wildcard should mean [0, 1]
characters and not exactly [1] character?
Is it possible?


Thanks a lot !

my field in my schema is defined like that:


   Field: title
   Field-Type: org.apache.solr.schema.TextField
   PI Gap: 100

   Flags:        Indexed   Tokenized   Stored   Multivalued
   Properties:      y          y          y          y
   Schema:          y          y          y          y
   Index:           y          y          y

   Analyzers:
     * org.apache.solr.analysis.TokenizerChain
     * org.apache.solr.analysis.TokenizerChain





Re: Wildcard "?" ?

2015-10-21 Thread Bruno Mannina

title:/magnet.?/ doesn't work for me because Solr answers:

|title = "Magnetic folding system"|

but thanks for giving me the idea to use regexps!

On 21/10/2015 18:46, Upayavira wrote:

No, you cannot tell Solr to handle wildcards differently. However, you
can use regular expressions for searching:

title:/magnet.?/ should do it.

Upayavira
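The difference between the two ?'s can be seen with plain regular expressions. A small Python sketch (the terms are invented examples): in a regex, ? makes the preceding atom optional (zero or one), while a Lucene/Solr wildcard ? consumes exactly one character, equivalent to a mandatory . in a regex:

```python
# Contrast regex "?" (zero-or-one) with a Lucene-style wildcard "?"
# (exactly one character). fullmatch mirrors how Lucene regex queries
# match the whole indexed term.
import re

terms = ["magnet", "magnets", "magnetic"]

# Regex magnet.? : "magnet" plus an OPTIONAL single character.
regex_hits = [t for t in terms if re.fullmatch(r"magnet.?", t)]

# Wildcard magnet? : "magnet" plus exactly one character; as a regex
# that is magnet. (the dot is mandatory).
wildcard_hits = [t for t in terms if re.fullmatch(r"magnet.", t)]

print(regex_hits)     # ['magnet', 'magnets']
print(wildcard_hits)  # ['magnets']
```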



Re: Wildcard "?" ?

2015-10-22 Thread Bruno Mannina

Upayavira,

Thanks a lot for this information

Regards,
Bruno

On 21/10/2015 19:24, Upayavira wrote:

regexp will match the whole term. So, if you have stemming on, magnetic
may well stem to magnet, and that is the term against which the regexp
is executed.

If you want to do the regexp against the whole field, then you need to
do it against a string version of that field.

The process of using a regexp (and a wildcard for that matter) is:
  * search through the list of terms in your field for terms that match
  your regexp (uses an FST for speed)
  * search for documents that contain those resulting terms

Upayavira
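The two-step process described above can be sketched with a toy inverted index. The postings map below is invented for illustration; real Lucene walks the term dictionary with an FST rather than a Python dict:

```python
# Toy sketch of a regexp query: (1) enumerate indexed terms the regexp
# fully matches, (2) union the posting lists of the surviving terms.
import re

# Hypothetical inverted index: term -> set of document ids.
postings = {
    "magnet":   {1, 3},
    "magnets":  {2},
    "magnetic": {4},   # would only remain as a term if stemming is off
    "bicycle":  {1, 4},
}

def regexp_query(pattern):
    rx = re.compile(pattern)
    # Step 1: term enumeration.
    matching_terms = [t for t in postings if rx.fullmatch(t)]
    # Step 2: collect documents containing any matching term.
    docs = set()
    for t in matching_terms:
        docs |= postings[t]
    return docs

print(sorted(regexp_query(r"magnet.?")))  # docs containing magnet/magnets
```

This also shows why regexps run against terms, not stored field values: if stemming turned "magnetic" into the term "magnet", the regexp would be tested against "magnet".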




Solr 3.6, Highlight and multi words?

2015-03-29 Thread Bruno Mannina

Dear Solr User,

I'm trying to work with highlighting; it works well, but only if I have a
single keyword in my query?!
If my request is plastic AND bicycle, then only plastic is highlighted.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5

Could you please help me understand? I read the docs and googled without
success... so I'm posting here...

my result is:



 

  (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made
  from<em>plastic</em> material, particularly for touring bike.
  #CMT#ADVANTAGE : #/CMT# The bicycle pedal has a pedal body made
  from<em>plastic</em>

  between<em>plastic</em> tapes 3 and 3 having two heat fusion layers, and
  the two<em>plastic</em> tapes 3 and 3 are stuck

  elements. A connecting element is formed as a hinge, a flexible foil or a
  flexible<em>plastic</em> part. #CMT#USE

  A bicycle handlebar grip includes an inner fiber layer and an
  outer<em>plastic</em> layer. Thus, the fiber handlebar grip, while
  the<em>plastic</em> layer is soft and has an adjustable thickness to
  provide a comfortable sensation to a user. In addition,
  the<em>plastic</em> layer includes a holding portion coated on the outer
  surface layer to enhance the combination strength between the fiber layer
  and the<em>plastic</em> layer and to enhance

  






---
This email contains no viruses or malware because avast! Antivirus protection is active.
http://www.avast.com


Re: Solr 3.6, Highlight and multi words?

2015-03-29 Thread Bruno Mannina

Additional information, in my schema.xml, my field is defined like this:

 

Maybe it's missing something, like termVectors?





Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina
Sorry to bump this thread, but does nobody use multi-term highlighting, or
have a problem with it?


regards,






Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

Dear Charles,

Thanks for your answer, please find below my answers.

OK, it works if I use "aben" as the field in my query, as you say in answer 1.
It doesn't work if I use "ab", maybe because the "ab" field is a copyField 
for abfr, aben, abit, abpt.


Concerning point 2, yes, you are right: it's not "and" but "AND".

I have this result:



  <em>Bicycle</em>  frame comprises holder, particularly for 
water bottle, where holder is connected


  #CMT# #/CMT# The<em>bicycle</em>  frame (7) comprises a holder 
(1), particularly for a water bottle
  . The holder is connected with the<em>bicycle</em>  frame by a 
screw (5), where a mounting element has a compensation
section which is made of an elastic material, particularly 
a<em>plastic</em>  material. The compensation section

  


So my last question is: why don't I get the colored markers?
How can I tell Solr to use the colors?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?

^^
2. Try removing the word "and" from the query.  There may be some interaction 
with a stop word filter.  If you want a phrase query, wrap it in quotes.

3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the renew but nobody use or have problem with 
multi-terms and highlight ?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I try to work with highlight, it works well but only if I have only
one keyword in my query?!
If my request is plastic AND bicycle then only plastic is highlight.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&row
s=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you help me please to understand ? I read doc, google, without
success...
so I post here...

my result is:



  
 
   (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal
body (10) made from<em>plastic</em> material
   , particularly for touring bike. #CMT#ADVANTAGE : #/CMT#
The bicycle pedal has a pedal body made
from<em>plastic</em>
 
   
   
 
between<em>plastic</em>  tapes 3 and 3 having
two heat fusion layers, and the two<em>plastic</em>  tapes
3 and 3 are stuck
 
   
   
 
 elements. A connecting element is formed as a hinge, a
flexible foil or a flexible<em>plastic</em>  part.
#CMT#USE
 
   
   
 
   A bicycle handlebar grip includes an inner fiber layer and
an outer<em>plastic</em> layer. Thus, the fiber
 handlebar grip, while the<em>plastic</em>
layer is soft and has an adjustable thickness to provide a
comfortable
 sensation to a user. In addition,
the<em>plastic</em>  layer includes a holding portion
coated on the outer surface
 layer to enhance the combination strength between the
fiber layer and the<em>plastic</em>  layer and to
enhance
 
   


*
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and 
then delete it.

TIAA-CREF
*




Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

OK for qf (I can't test right now).

But concerning hl.simple.pre/hl.simple.post: with those I can define only one color, no?

In the sample solrconfig.xml there are several colors:


  

  
  

  

How can I tell Solr to use these colors instead of hl.simple.pre/post?



On 01/04/2015 20:58, Reitzel, Charles wrote:

If you want to query on the field ab, you'll probably need to add it the qf 
parameter.

To control the highlighting markup, with the standard highlighter, use 
hl.simple.pre and hl.simple.post.

https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter
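As a minimal sketch of what Charles describes (the host, port, core name 
"docdb", and the yellow marker are assumptions, not from this thread), 
hl.simple.pre/hl.simple.post replace the default <em>...</em> markers on a 
per-request basis:

```shell
# Sketch only: localhost:8983 and core "docdb" are assumptions.
BASE='http://localhost:8983/solr/docdb/select'
Q='aben:(plastic AND bicycle)'
# curl -G URL-encodes each --data-urlencode argument; we only echo the
# command here so the sketch runs without a live Solr server.
echo curl -G "$BASE" \
  --data-urlencode "q=$Q" \
  --data-urlencode 'hl=true' \
  --data-urlencode 'hl.fl=tien,aben' \
  --data-urlencode 'hl.simple.pre=<b style="background:yellow">' \
  --data-urlencode 'hl.simple.post=</b>'
```

Dropping the leading echo sends the real request.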


-Original Message-----
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 2:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Dear Charles,

Thanks for your answer, please find below my answers.

ok it works if I use "aben" as field in my query as you say in Answer 1.
it doesn't work if I use "ab" may be because "ab" field is a copyField for 
abfr, aben, abit, abpt

Concerning the 2., yes you have right it's not and but AND

I have this result:


  
<em>Bicycle</em>  frame comprises holder, particularly for 
water bottle, where holder is connected
  
  
#CMT# #/CMT# The<em>bicycle</em>  frame (7) comprises a 
holder (1), particularly for a water bottle
. The holder is connected with the<em>bicycle</em>  frame by 
a screw (5), where a mounting element has a compensation
  section which is made of an elastic material, particularly 
a<em>plastic</em>  material. The compensation section
  



So my last question is why I haven't  instead having colored ?
How can I tell to solr to use the colored ?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?
 
^^
2. Try removing the word "and" from the query.  There may be some interaction 
with a stop word filter.  If you want a phrase query, wrap it in quotes.


3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the renew but nobody use or have problem with 
multi-terms and highlight ?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I try to work with highlight, it works well but only if I have only
one keyword in my query?!
If my request is plastic AND bicycle then only plastic is highlight.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&ro
w
s=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you help me please to understand ? I read doc, google, without
success...
so I post here...

my result is:



   
  
(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal
body (10) made from<em>plastic</em> material
, particularly for touring bike. #CMT#ADVANTAGE : #/CMT#
The bicycle pedal has a pedal body made
from<em>plastic</em>
  


  
 between<em>plastic</em>  tapes 3 and 3
having two heat fusion layers, and the
two<em>plastic</em>  tapes
3 and 3 are stuck
  


  
  elements. A connecting element is formed as a hinge, a
flexible foil or a flexible<em>plastic</em>  part.
#CMT#USE
  


  
A bicycle handlebar grip includes an inner fiber layer
and an outer<em>plastic</em> layer. Thus, the fiber
  handlebar grip, while the<em>plastic</em>
layer is soft and has an adjustable thickness to provide a
comfortable
  sensation to a user. In addition,
the<em>plastic</em>  layer includes a holding portion
coated on the outer surface
  layer to enhance the combination strength between the
fiber layer and the<em>plastic</em>  layer and to
enhance
  








Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

Of course, no problem Charles, you have already helped me!

On 01/04/2015 21:54, Reitzel, Charles wrote:

Sorry, I've never tried highlighting in multiple colors...

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 3:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

ok for qf (i can't test now)

but concerning hl.simple.pre hl.simple.post I can define only one color no ?

in the sample solrconfig.xml there are several color,



  


  


How can I tell to solr to use these color instead of hl.simple.pre/post ?



On 01/04/2015 20:58, Reitzel, Charles wrote:

If you want to query on the field ab, you'll probably need to add it the qf 
parameter.

To control the highlighting markup, with the standard highlighter, use 
hl.simple.pre and hl.simple.post.

https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter


-Original Message-----
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 2:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Dear Charles,

Thanks for your answer, please find below my answers.

ok it works if I use "aben" as field in my query as you say in Answer 1.
it doesn't work if I use "ab" may be because "ab" field is a copyField
for abfr, aben, abit, abpt

Concerning the 2., yes you have right it's not and but AND

I have this result:


   
 <em>Bicycle</em>  frame comprises holder, particularly for 
water bottle, where holder is connected
   
   
 #CMT# #/CMT# The<em>bicycle</em>  frame (7) comprises a 
holder (1), particularly for a water bottle
 . The holder is connected with the<em>bicycle</em>  frame 
by a screw (5), where a mounting element has a compensation
   section which is made of an elastic material, particularly 
a<em>plastic</em>  material. The compensation section
   
 


So my last question is why I haven't  instead having colored ?
How can I tell to solr to use the colored ?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?
  
^^
2. Try removing the word "and" from the query.  There may be some interaction 
with a stop word filter.  If you want a phrase query, wrap it in quotes.


3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the renew but nobody use or have problem with 
multi-terms and highlight ?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I try to work with highlight, it works well but only if I have only
one keyword in my query?!
If my request is plastic AND bicycle then only plastic is highlight.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&r
o
w
s=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you help me please to understand ? I read doc, google, without
success...
so I post here...

my result is:




   
 (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a
pedal body (10) made from<em>plastic</em> material
 , particularly for touring bike. #CMT#ADVANTAGE :
#/CMT# The bicycle pedal has a pedal body made
from<em>plastic</em>
   
 
 
   
  between<em>plastic</em>  tapes 3 and 3
having two heat fusion layers, and the
two<em>plastic</em>  tapes
3 and 3 are stuck
   
 
 
   
   elements. A connecting element is formed as a hinge,
a flexible foil or a flexible<em>plastic</em>  part.
#CMT#USE
   
 
 
   
 A bicycle handlebar grip includes an inner fiber layer
and an outer<em>plastic</em> layer. Thus, the fiber
   handlebar grip, while the<em>plastic</em>
layer is soft and has an adjustable thickness to provide a
comfortable
   sensation to a user. In addition,
the<em>plastic</em>  layer includes a holding portion
coated on the outer surface
   layer to enhance the combination strength between the
fiber layer and the<em>plastic</em>  layer

Solr 5.0, defaultSearchField, defaultOperator ?

2015-04-17 Thread Bruno Mannina

Dear Solr users,

As of today I am using Solr 5.0 (I previously used Solr 3.6), so I am trying
to adapt my old schema for Solr 5.0.

I have two questions:
- how can I set the defaultSearchField ?
I don't want to use the df parameter in each query, because that would mean
a lot of modifications to my web project.

- how can I set the defaultOperator (and|or) ?

It seems that these "options" are now deprecated in SOLR 5.0 schema.

Thanks a lot for your comment,

Regards,
Bruno




Re: Solr 5.0, defaultSearchField, defaultOperator ?

2015-04-18 Thread Bruno Mannina

Thx Chris & Ahmet !

On 17/04/2015 23:56, Chris Hostetter wrote:

: df and q.op are the ones you are looking for.
: You can define them in defaults section.

specifically...

https://cwiki.apache.org/confluence/display/solr/InitParams+in+SolrConfig
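Following the InitParams page linked above, the defaults could be sketched 
roughly like this (the handler paths and the default field name "text" are 
assumptions, not from this thread):

```shell
# Print a hedged solrconfig.xml fragment: df and q.op replace the old
# schema-level defaultSearchField / defaultOperator settings.
CONF=$(cat <<'EOF'
<initParams path="/select,/query">
  <lst name="defaults">
    <str name="df">text</str>
    <str name="q.op">AND</str>
  </lst>
</initParams>
EOF
)
printf '%s\n' "$CONF"
```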


:
: Ahmet
:
:
:
: On Friday, April 17, 2015 9:18 PM, Bruno Mannina  wrote:
: Dear Solr users,
:
: Since today I used SOLR 5.0 (I used solr 3.6) so i try to adapt my old
: schema for solr 5.0.
:
: I have two questions:
: - how can I set the defaultSearchField ?
: I don't want to use in the query the df tag  because I have a lot of
: modification to do for that on my web project.
:
: - how can I set the defaultOperator (and|or) ?
:
: It seems that these "options" are now deprecated in SOLR 5.0 schema.
:
: Thanks a lot for your comment,
:
: Regards,
: Bruno
:
:

-Hoss
http://www.lucidworks.com/






Correspondance table ?

2015-04-20 Thread Bruno Mannina

Dear Solr Users,

Solr 5.0.0

I currently have around 90,000,000 docs in my Solr, and I have a field
with one char which represents a category, e.g.:
value = a, definition: Nature and Health
etc...
I have a few categories, around 15.

These category definitions can change over the years.

Can I use a file where I will have
a\tNature and Health
b\tComputer science
etc...

and instead of having the code letter in my Solr JSON result, I will
have the definition ?
Only in the result.
The query will still be done with the code letter.

I'm sure it's possible !

Additional question: is it possible to do that also with a big 
correspondence file? Around 5000 definitions?

Thanks for your help,
Bruno
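One way the result-side substitution could be sketched client-side (the 
codes.tsv file name and the lookup helper are illustrative assumptions, not 
from this thread):

```shell
# Build a tab-separated mapping file like the one described above.
printf 'a\tNature and Health\nb\tComputer science\n' > codes.tsv

# Replace a code letter coming back from Solr with its definition.
lookup() { awk -F'\t' -v c="$1" '$1 == c { print $2 }' codes.tsv; }

lookup a   # prints: Nature and Health
```

An awk table lookup like this stays fast even with thousands of entries, 
since the file is small enough to scan per call or preload into a map.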




Re: Correspondance table ?

2015-04-20 Thread Bruno Mannina

Hi Alex,

well, OK, but what if I have a big table, with more than 10,000 entries?
Is it safe to do that client-side?

note:
I have one little table,
but I also have 2 big tables for 2 other fields

On 20/04/2015 10:57, Alexandre Rafalovitch wrote:

The best place to do so is in the client software, since you are not
using it for search in any way. So, wherever you get your Solr's
response JSON/XML/etc, map it there.

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 20 April 2015 at 18:23, Bruno Mannina  wrote:

Dear Solr Users,

Solr 5.0.0

I have actually around 90 000 000 docs in my solr, and I have a field with
one char which represents a category. i.e:
value = a, definition : nature and health
etc...
I have fews categories, around 15.

These definition categories can changed during years.

Can I use a file where I will have
a\tNature and Health
b\tComputer science
etc...

and instead of having the code letter in my json result solr, I will have
the definition ?
Only in the result.
The query will be done with the code letter.

I'm sure it's possible !

Additional question: is it possible to do that also with a big
correspondance file? around 5000 definitions?

Thanks for your help,
Bruno










Re: Correspondance table ?

2015-04-20 Thread Bruno Mannina

Hi Jack,

OK, it's not for many millions of users, just 100 max per day.
It will be used on traditional "PC" and also on mobile clients.

So I need to run tests to verify the possibilities.

Thx

On 20/04/2015 14:20, Jack Krupansky wrote:

It depends on the specific nature of your clients. Are they in-house users,
like only dozens or hundreds, or is this a large web app with many millions
of users, and with mobile clients as well as traditional "PC" clients?

If it feels too much to do in the client, then a middleware API service
layer could be the way to go. In any case, don't try to load too much work
onto the Solr server itself.

-- Jack Krupansky

On Mon, Apr 20, 2015 at 7:32 AM, Bruno Mannina  wrote:


Hi Alex,

well ok but if I have a big table ? more than 10 000 entries ?
is it safe to do that client side ?

note:
I have one little table
but I have also 2 big tables for 2 other fields


On 20/04/2015 10:57, Alexandre Rafalovitch wrote:


The best place to do so is in the client software, since you are not
using it for search in any way. So, wherever you get your Solr's
response JSON/XML/etc, map it there.

Regards,
 Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 20 April 2015 at 18:23, Bruno Mannina  wrote:


Dear Solr Users,

Solr 5.0.0

I have actually around 90 000 000 docs in my solr, and I have a field
with
one char which represents a category. i.e:
value = a, definition : nature and health
etc...
I have fews categories, around 15.

These definition categories can changed during years.

Can I use a file where I will have
a\tNature and Health
b\tComputer science
etc...

and instead of having the code letter in my json result solr, I will have
the definition ?
Only in the result.
The query will be done with the code letter.

I'm sure it's possible !

Additional question: is it possible to do that also with a big
correspondance file? around 5000 definitions?

Thanks for your help,
Bruno












Solr5.0.0, do a commit alone ?

2015-04-21 Thread Bruno Mannina

Dear Solr Users,

With Solr 3.6, when I wanted to force a commit without giving any data, I did:
java -jar post.jar

Now with Solr 5.0.0, I use
bin/post .

but it does not accept doing a commit if I don't give a data directory, i.e.:
bin/post -c mydb -commit yes

I want to do that because I have a file with delete actions.
Each line in this file contains one ref to delete:
bin/post -c mydb -commit no -d "..."
So I would like to do the commit only after running my file, with a single
command line.

bin/post -c mydb -commit yes (without data) is not accepted by post

Thanks,
Sincerely,
Bruno
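For what it's worth, a commit with no documents does not need bin/post at 
all: it can be sent straight to the update handler. A minimal sketch (host, 
port, and the core name "mydb" are taken from the mail above; the echo keeps 
it runnable without a live server):

```shell
# Standalone commit: the update handler accepts ?commit=true with no body.
COMMIT_URL='http://localhost:8983/solr/mydb/update?commit=true'
echo curl "$COMMIT_URL"
# An explicit XML body also works:
#   curl http://localhost:8983/solr/mydb/update -H 'Content-Type: text/xml' -d '<commit/>'
```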







Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem

2015-05-04 Thread Bruno Mannina

Dear Solr Community,

I have a recent computer with 8 GB RAM, I installed Ubuntu 14.04 and SOLR 
5.0, Java 7

This is a brand new installation.

Everything works fine, but I would like to increase SOLR_JAVA_MEM (to 40% of 
total RAM available).

So I edit the bin/solr.in.sh

# Increase Java Min/Max Heap as needed to support your indexing / query 
needs

SOLR_JAVA_MEM="-Xms3g –Xmx3g -XX:MaxPermSize=512m -XX:PermSize=512m"

but with this param, the Solr server can't be started. I use:
bin/solr start

Do you have an idea of the problem ?

Thanks a lot for your comment,
Bruno




Delete document stop my solr 5.0 ?!

2015-05-04 Thread Bruno Mannina

Dear Solr Users,

I have a brand new computer where I installed Ubuntu 14.04, 8 GB RAM,
SOLR 5.0, Java 7.
I indexed 92,000,000 docs (small text files, ~2 KB each).
I have around 30 fields.

All works fine, but each Tuesday I need to delete some docs inside, so I
create a batch file with lines like this:
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"<delete><query>f1:58644</query></delete>"
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"<delete><query>f1:162882</query></delete>"
..
.
/home/solr/solr-5.0.0/bin/post -c docdb  -commit yes -d
"<delete><query>f1:2868668</query></delete>"

my f1 field is my key field. It is unique.

But if my file contains more than one or two hundred lines, my Solr
shuts down. Two hundred lines always shuts down Solr 5.0.
I have no error in my console; Solr just can't be reached on port 8983.

Does a variable exist that I must increase to avoid this error?

On my old Solr 3.6, I don't use the same line to delete documents; I use:
java -jar -Ddata=args -Dcommit=no  post.jar
"<delete><id>113422</id></delete>"

You can see that I use <id> directly, not <query>, and my schema between
Solr 3.6 and Solr 5.0 is almost the same.
I just have some more fields.
Why does this method not work now?

Thanks a lot,
Bruno
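A gentler alternative can be sketched (the deletes.xml file name is an 
assumption; the ids come from the commands above): batch every id into a 
single delete document and post it once, with one commit, instead of 
spawning one bin/post JVM per id:

```shell
# Build one <delete> document holding many ids; Solr accepts multiple
# <id> (or <query>) children in a single delete request.
printf '<delete>\n' > deletes.xml
for id in 58644 162882 2868668; do
  printf '  <id>%s</id>\n' "$id" >> deletes.xml
done
printf '</delete>\n' >> deletes.xml
cat deletes.xml
# then post it once, e.g.: bin/post -c docdb -commit yes deletes.xml
```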





Re: Delete document stop my solr 5.0 ?!

2015-05-04 Thread Bruno Mannina

ok I have this OOM error in the log file ...

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/home/solr/solr-5.0.0/bin/oom_solr.sh 
8983 /home/solr/solr-5.0.0/server/logs"
#   Executing /bin/sh -c "/home/solr/solr-5.0.0/bin/oom_solr.sh 
8983 /home/solr/solr-5.0.0/server/logs"...

Running OOM killer script for process 28233 for Solr on port 8983
Killed process 28233

I will try in a few minutes to increase

formdataUploadLimitInKB

and I will tell you the result.

On 04/05/2015 14:58, Shawn Heisey wrote:

On 5/4/2015 3:19 AM, Bruno Mannina wrote:

All work fine but each Tuesday I need to delete some docs inside, so I
create a batch file
with inside line like this:
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"f1:58644"
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"f1:162882"
..
.
/home/solr/solr-5.0.0/bin/post -c docdb  -commit yes -d
"f1:2868668"

my f1 field is my key field. It is unique.

But if my file contains more than one or two hundreds line, my solr
shutdown.
Two hundreds line shutdown always solr 5.0.
I have no error in my console, just Solr can't be reach on the port 8983.

Is exists a variable that I must increase to disable this error ?

As far as I know, the only limit that can affect that is the maximum
post size.  Current versions of Solr default to a 2MB max post size,
using the formdataUploadLimitInKB attribute on the requestParsers
element in solrconfig.xml, which defaults to 2048.
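The fragment Shawn refers to could be sketched like this (the 8192 value is 
illustrative, not a recommendation):

```shell
# Print a hedged solrconfig.xml fragment: the limit is an attribute of
# requestParsers inside requestDispatcher.
SNIPPET=$(cat <<'EOF'
<requestDispatcher>
  <requestParsers formdataUploadLimitInKB="8192"
                  multipartUploadLimitInKB="2048"/>
</requestDispatcher>
EOF
)
printf '%s\n' "$SNIPPET"
```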

Even if that limit is exceeded by a request, it should not crash Solr,
it should simply log an error and ignore the request.  It would be a bug
if Solr does crash.

What happens if you increase that limit?  Are you seeing any error
messages in the Solr logfile when you send that delete request?

Thanks,
Shawn









Re: Delete document stop my solr 5.0 ?!

2015-05-04 Thread Bruno Mannina

I increased

formdataUploadLimitInKB

to 2048000 and the problem is the same, same error.

Any idea?



On 04/05/2015 16:38, Bruno Mannina wrote:

ok I have this OOM error in the log file ...

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/home/solr/solr-5.0.0/bin/oom_solr.sh 
8983/home/solr/solr-5.0.0/server/logs"
#   Executing /bin/sh -c "/home/solr/solr-5.0.0/bin/oom_solr.sh 
8983/home/solr/solr-5.0.0/server/logs"...

Running OOM killer script for process 28233 for Solr on port 8983
Killed process 28233

I try in few minutes to increase the

formdataUploadLimitInKB

and I will tell you the result.

On 04/05/2015 14:58, Shawn Heisey wrote:

On 5/4/2015 3:19 AM, Bruno Mannina wrote:

All work fine but each Tuesday I need to delete some docs inside, so I
create a batch file
with inside line like this:
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"f1:58644"
/home/solr/solr-5.0.0/bin/post -c docdb  -commit no -d
"f1:162882"
..
.
/home/solr/solr-5.0.0/bin/post -c docdb  -commit yes -d
"f1:2868668"

my f1 field is my key field. It is unique.

But if my file contains more than one or two hundreds line, my solr
shutdown.
Two hundreds line shutdown always solr 5.0.
I have no error in my console, just Solr can't be reach on the port 
8983.


Is exists a variable that I must increase to disable this error ?

As far as I know, the only limit that can affect that is the maximum
post size.  Current versions of Solr default to a 2MB max post size,
using the formdataUploadLimitInKB attribute on the requestParsers
element in solrconfig.xml, which defaults to 2048.

Even if that limit is exceeded by a request, it should not crash Solr,
it should simply log an error and ignore the request.  It would be a bug
if Solr does crash.

What happens if you increase that limit?  Are you seeing any error
messages in the Solr logfile when you send that delete request?

Thanks,
Shawn















Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem

2015-05-04 Thread Bruno Mannina

Yes! It works!!!

Scott, perfect.

For my config 3g does not work, but 2g does!

Thanks

On 04/05/2015 16:50, Scott Dawson wrote:

Bruno,
You have the wrong kind of dash (a long dash) in front of the Xmx flag.
Could that be causing a problem?

Regards,
Scott

On Mon, May 4, 2015 at 5:06 AM, Bruno Mannina  wrote:


Dear Solr Community,

I have a recent computer with 8Go RAM, I installed Ubuntu 14.04 and SOLR
5.0, Java 7
This is a brand new installation.

all work fine but I would like to increase the JAVA_MEM_SOLR (40% of total
RAM available).
So I edit the bin/solr.in.sh

# Increase Java Min/Max Heap as needed to support your indexing / query
needs
SOLR_JAVA_MEM="-Xms3g –Xmx3g -XX:MaxPermSize=512m -XX:PermSize=512m"

but with this param, the solr server can't be start, I use:
bin/solr start

Do you have an idea of the problem ?

Thanks a lot for your comment,
Bruno









Re: Delete document stop my solr 5.0 ?!

2015-05-04 Thread Bruno Mannina
Yes, it was that! I increased SOLR_JAVA_MEM to 2g (with 8 GB RAM I can't do 
more; 3g fails to run Solr on my brand new computer).


thanks !

On 04/05/2015 17:03, Shawn Heisey wrote:

On 5/4/2015 8:38 AM, Bruno Mannina wrote:

ok I have this OOM error in the log file ...

#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="/home/solr/solr-5.0.0/bin/oom_solr.sh
8983/home/solr/solr-5.0.0/server/logs"
#   Executing /bin/sh -c "/home/solr/solr-5.0.0/bin/oom_solr.sh
8983/home/solr/solr-5.0.0/server/logs"...
Running OOM killer script for process 28233 for Solr on port 8983

Out Of Memory errors are a completely different problem.  Solr behavior
is completely unpredictable after an OutOfMemoryError exception, so the
5.0 install includes a script to run on OOME that kills Solr.  It's the
only safe way to handle that problem.

Your Solr install is not being given enough Java heap memory for what it
is being asked to do.  You need to increase the heap size for Solr.  If
you look at the admin UI for Solr in a web browser, you can see what the
max heap is set to ... on a default 5.0 install running Solr with
"bin/solr" the max heap will be 512m ... which is VERY small.  Try using
bin/solr with the -m option, set to something like 2g (for 2 gigabytes
of heap).
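The restart Shawn suggests, as a one-line sketch (the install path is taken 
from earlier in this thread and may differ on your machine):

```shell
# Start Solr with an explicit 2 GB heap via the -m flag.
SOLR=/home/solr/solr-5.0.0/bin/solr
CMD="$SOLR start -m 2g"
echo "$CMD"
```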

Thanks,
Shawn









Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem

2015-05-04 Thread Bruno Mannina

Shawn, thanks a lot for this comment.

So, I have this information; no information about 32 or 64 bits...

solr@linux:~$ java -version
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.5) (7u79-2.5.5-0ubuntu0.14.04.2)
OpenJDK Server VM (build 24.79-b02, mixed mode)
solr@linux:~$

solr@linux:~$ uname -a
Linux linux 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:11:46 UTC 
2015 i686 i686 i686 GNU/Linux

solr@linux:~$

Do I need to install a new version of Java? I installed my Ubuntu just one 
week ago :)

updates are up to date.

On 04/05/2015 17:23, Shawn Heisey wrote:

On 5/4/2015 9:09 AM, Bruno Mannina wrote:

Yes ! it works !!!

Scott perfect 

For my config 3g do not work, but 2g yes !

If you can't start Solr with a 3g heap, chances are that you are running
a 32-bit version of Java.  A 32-bit Java cannot go above a 2GB heap.  A
64-bit JVM requires a 64-bit operating system, which requires a 64-bit
CPU.  Since 2006, Intel has only been providing 64-bit chips to the
consumer market, and getting a 32-bit chip in a new computer has gotten
extremely difficult.  The server market has had only 64-bit chips from
Intel since 2005.  I am not sure what those dates look like for AMD
chips, but it is probably similar.

Running "java -version" should give you enough information to determine
whether your Java is 32-bit or 64-bit.  This is the output from that
command on a Linux machine that is running a 64-bit JVM from Oracle:

root@idxa4:~# java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

If you are running Solr on Linux, then the output of "uname -a" should
tell you whether your operating system is 32 or 64 bit.

Thanks,
Shawn









Re: Solr 5.0, Ubuntu 14.04, SOLR_JAVA_MEM problem

2015-05-04 Thread Bruno Mannina

OK, I have noted all this information, thanks!

I will upgrade if needed. 2 GB seems to be OK.

On 04/05/2015 18:46, Shawn Heisey wrote:

On 5/4/2015 10:28 AM, Bruno Mannina wrote:

solr@linux:~$ java -version
java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.5) (7u79-2.5.5-0ubuntu0.14.04.2)
OpenJDK Server VM (build 24.79-b02, mixed mode)
solr@linux:~$

solr@linux:~$ uname -a
Linux linux 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:11:46 UTC
2015 i686 i686 i686 GNU/Linux
solr@linux:~$

Both Linux and Java are 32-bit.  For linux, I know this because your
arch is "i686", which means it is coded for a newer generation 32-bit
CPU.  You can't be running a 64-bit Java, and the Java version confirms
that because it doesn't contain "64-bit".

Run this command:

cat /proc/cpuinfo

If the "flags" on the CPU contain the string "lm" (long mode), then your
CPU is capable of running a 64-bit (sometimes known as amd64 or x86_64)
version of Linux, and a 64-bit Java.  You will need to re-install both
Linux and Java to get this capability.
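Shawn's "lm" check can be scripted; a small sketch (the sample flags line is made up for illustration):

```shell
# is_64bit_capable: decide from a /proc/cpuinfo "flags" line whether the CPU
# advertises long mode ('lm'), i.e. can run a 64-bit OS and a 64-bit JVM.
is_64bit_capable() {
  printf '%s\n' "$1" | grep -qw lm && echo yes || echo no
}

# Sample flags line (illustrative, not from a real machine):
sample='flags : fpu vme de pse tsc msr pae mce lm constant_tsc'
is_64bit_capable "$sample"

# Against the current machine (Linux only):
# is_64bit_capable "$(grep -m1 '^flags' /proc/cpuinfo)"
```

If it prints "no", a 64-bit reinstall of both Linux and Java is needed, as Shawn describes.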

Here's "uname -a" from a 64-bit version of Ubuntu:

Linux lb1 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:08:34 UTC 2015
x86_64 x86_64 x86_64 GNU/Linux

Since you are running 5.0, I would recommend Oracle Java 8.

http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html

Thanks,
Shawn









Solr 5.0 - uniqueKey case insensitive ?

2015-05-04 Thread Bruno Mannina

Dear Solr users,

I have a problem with SOLR5.0 (and not on SOLR3.6)

What kind of field can I use for my uniqueKey field named "code" if I
want it case insensitive ?

On SOLR3.6, I defined a string_ci field like this:

<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

and it works fine.
- If I add a document with the same code then the doc is updated.
- If I search a document with lower or upper case, the doc is found


But in SOLR5.0, if I use this definition then :
- I can search in lower/upper case, it's OK
- BUT if I add a doc with the same code then the doc is added not updated !?

I read that the problem could be that the type of the field is tokenized
instead of being a plain string.

If I change from string_ci to string, then
- I lost the possibility to search in lower/upper case
- but it works fine to update the doc.

So, could you help me to find the right field type to:

- search in case insensitive
- if I add a document with the same code, the old doc will be updated

Thanks a lot !





Re: Solr 5.0 - uniqueKey case insensitive ?

2015-05-04 Thread Bruno Mannina

Hello Chris,

yes, I confirm that on my Solr 3.6 it has worked fine for several years: each
doc added with the same code is updated, not added.


To be more clear, I receive docs with a field named "pn"; it is the
uniqueKey, and it is always in uppercase


so I must define in my schema.xml

<field name="id" type="string_ci" indexed="true" required="true" stored="true"/>
<field name="pn" type="string_ci" indexed="true" stored="false"/>
...
<uniqueKey>id</uniqueKey>
...
<copyField source="id" dest="pn"/>

but the application that uses Solr already exists, so it requests with the
pn field, not id; I cannot change that.
And in each doc I receive there is no id field, just a pn field, and I
cannot change that either.


So there is a problem, no ? I must import an id field and query a pn
field, but I only have a pn field in the import...



Le 05/05/2015 01:00, Chris Hostetter a écrit :

: On SOLR3.6, I defined a string_ci field like this:
:
: 
: 
:   
:   
: 
: 
:
: 


I'm really surprised that field would have worked for you (reliably) as a
uniqueKey field even in Solr 3.6.

the best practice for something like what you describe has always (going
back to Solr 1.x) been to use a copyField to create a case insensitive
copy of your uniqueKey for searching.

if, for some reason, you really want case insensitive *updates* (so a doc
with id "foo" overwrites a doc with id "FOO") then the only reliable way to
make something like that work is to do the lowercasing in an
UpdateProcessor to ensure it happens *before* the docs are distributed to
the correct shard, and so the correct existing doc is overwritten (even if
you aren't using solr cloud)
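A minimal schema.xml sketch of the copyField approach described here, assuming a string_ci type (keyword tokenizer plus lowercase filter) as discussed earlier in the thread; field names are illustrative:

```xml
<!-- Plain string uniqueKey: updates overwrite reliably, even when sharded -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<!-- Lowercased copy used only for case-insensitive searching -->
<field name="id_ci" type="string_ci" indexed="true" stored="false"/>
<copyField source="id" dest="id_ci"/>
<uniqueKey>id</uniqueKey>
```

Queries then go against id_ci, while adds and deletes key off the exact-case id.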



-Hoss
http://www.lucidworks.com/








Re: Solr 5.0 - uniqueKey case insensitive ?

2015-05-06 Thread Bruno Mannina

Yes, thanks, it's clear for me now too.

Daniel, my pn is always in uppercase and I index it always in uppercase.
The problem (solved now after all your answers, thanks) was on the query
side: if users

query with lowercase then Solr returns no result, and that was not good.

But now the problem is solved: in my source file I changed the name of the
pn field to id,

and in my schema I use a copy field named pn and it works perfectly.

Thanks a lot !!!

Le 06/05/2015 09:44, Daniel Collins a écrit :

Ah, I remember seeing this when we first started using Solr (which was 4.0
because we needed Solr Cloud), I never got around to filing an issue for it
(oops!), but we have a note in our schema to leave the key field a normal
string (like Bruno we had tried to lowercase it which failed).
We didn't really know Solr in those days, and hadn't really thought about
it since then, but Hoss' and Erick's explanations make perfect sense now!

Since shard routing is (basically) done on hashes of the unique key, if I
have 2 documents which are the "same", but have values "HELLO" and "hello",
they might well hash to completely different shards, so the update
logistics would be horrible.

Bruno, why do you need to lowercase at all then?  You said in your example,
that your client application always supplies "pn" and it is always
uppercase, so presumably all adds/updates could be done directly on that
field (as a normal string with no lowercasing).  Where does the case
insensitivity come in, is that only for searching?  If so couldn't you add
a search field (called id), and update your app to search using that (or
make that your default search field, I guess it depends if your calling app
explicitly uses the pn field name in its searches).


On 6 May 2015 at 01:55, Erick Erickson  wrote:


Well, "working fine" may be a bit of an overstatement. That has never
been officially supported, so it "just happened" to work in 3.6.

As Chris points out, if you're using SolrCloud then this will _not_
work as routing happens early in the process, i.e. before the analysis
chain gets the token so various copies of the doc will exist on
different shards.

Best,
Erick

On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina  wrote:












How to index 20 000 files with a command line ?

2015-05-29 Thread Bruno Mannina

Dear Solr Users,

Habitually I use this command line to index my files:
>bin/post -c hbl /data/hbl-201522/*.xml

but today I have a big update: there are 20 000 xml files (each file about
1 KB).



Re: How to index 20 000 files with a command line ?

2015-05-29 Thread Bruno Mannina

oh yes like this:

 find /data/hbl-201522/ -name "*.xml" -exec bin/post -c hbl {} \;

?
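For reference, bin/post also accepts many files per invocation, so a sketch that batches the 20 000 files (1000 per launch) instead of starting one process per file — directory path as in the thread, bin/post location relative to the Solr install assumed:

```shell
# Real command (not run here):
#   find /data/hbl-201522/ -name '*.xml' -print0 | xargs -0 -n 1000 bin/post -c hbl
# -print0 / -0 keep filenames with spaces safe; -n 1000 caps each bin/post call.
# Small demonstration of how xargs chunks its input:
printf '%s\n' 1.xml 2.xml 3.xml 4.xml 5.xml | xargs -n 2 echo batch:
```

With -n 1000 the 20 000 files become 20 bin/post launches instead of 20 000.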

Le 29/05/2015 14:15, Sergey Shvets a écrit :

Hello Bruno,

You can use find command with exec attribute.

regards
  Sergey











Help for a field in my schema ?

2015-05-29 Thread Bruno Mannina

Dear Solr-Users,

(SOLR 5.0 Ubuntu)

I have xml files with tags like this:
<claimXXYYY>

where XX is a language code like FR EN DE PT etc... (I don't know the
number of language codes I can have)
and YYY is a number [1..999]

i.e.:
<claimen1>
<claimen2>
<claimen3>
<claimfr1>
<claimfr2>
<claimfr3>

I would like to define fields named:
*claimen* equal to claimenYYY (EN language, all numbers, indexed=true,
stored=true) (search needed and must be displayed)
*claim* equal to all claimXXYYY (all languages, all numbers,
indexed=true, stored=false) (search not needed but must be displayed)

Is it possible to have these 2 fields ?

Could you help me to declare them in my schema.xml ?

Thanks a lot for your help !

Bruno
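One possible schema.xml sketch for this — field, type, and wildcard names are illustrative guesses, not a confirmed solution; a dynamicField accepts the incoming per-language tags and wildcard copyFields build the aggregates:

```xml
<!-- Accept any incoming claimXXYYY field without declaring each one -->
<dynamicField name="claim*" type="text_general" indexed="true" stored="true"/>

<!-- One searchable aggregate of all English claims -->
<field name="en_claims" type="text_en" indexed="true" stored="true" multiValued="true"/>
<copyField source="claimen*" dest="en_claims"/>

<!-- One displayable aggregate of every claim, all languages -->
<field name="all_claims" type="text_general" indexed="false" stored="true" multiValued="true"/>
<copyField source="claim*" dest="all_claims"/>
```

copyField sources match the fields of the incoming document, so the aggregate fields themselves are not re-copied.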





Possible or not ?

2015-06-05 Thread Bruno Mannina

Dear Solr Users,

I would like to post 1 000 000 records (1 record = 1 file) in one shot,
and do the commit at the end.

Is it possible to do that ?

I've several directories with each 20 000 files inside.
I would like to do:
bin/post -c mydb /DATA

under DATA I have
/DATA/1/*.xml (20 000 files)
/DATA/2/*.xml (20 000 files)
/DATA/3/*.xml (20 000 files)

/DATA/50/*.xml (20 000 files)

Actually, I post 5 directories in one time (it takes around 1h30 for 100
000 records/files)

But it's Friday and I would like to run it during the W.E. alone.

Thanks for your comment,

Bruno




Re: Possible or not ?

2015-06-05 Thread Bruno Mannina

Hi Alessandro,

I'm actually on my dev computer, so I would like to post 1 000 000 xml
files (with a structure defined in my schema.xml).


I have already imported 1 000 000 xml files by using
bin/post -c mydb /DATA0/1 /DATA0/2 /DATA0/3 /DATA0/4 /DATA0/5
where /DATA0/X contains 20 000 xml files (I do it 20 times by just
changing X from 1 to 50)


I would like to do now:
bin/post -c mydb /DATA1

I would like to know if my Solr 5 will run fine and not give a memory
error because there are too many files

in one post, without doing a commit.

The commit will be done at the end of the 1 000 000.

Is it ok ?


Le 05/06/2015 16:59, Alessandro Benedetti a écrit :

Hi Bruno,
I can not see what is your challenge.
Of course you can index your data in the flavour you want and do a commit
whenever you want…
Are those xml Solr xml ?
If not you would need to use the DIH, the extract update handler or any
custom Indexer application.
Maybe I missed your point…
Give me more details please !

Cheers

2015-06-05 15:41 GMT+01:00 Bruno Mannina :












Re: Possible or not ?

2015-06-05 Thread Bruno Mannina

Ok thanks for these information !

Le 05/06/2015 17:37, Erick Erickson a écrit :

Picking up on Alessandro's point. While you can post all these docs
and commit at the end, unless you do a hard commit
(openSearcher=true or false doesn't matter), if your server should
abnormally terminate for _any_ reason, all these docs will be
replayed on startup from the transaction log.

I'll also echo Alessandro's point that I don't see the advantage of this.
Personally I'd set my hard commit interval with openSearcher=false
to something like 60000 (60 seconds; it's in milliseconds) and forget
about it. You're not imposing much extra load on the system, you're
durably saving your progress, and you're avoiding really, really, really
long restarts if your server should stop for some reason.

If you don't want the docs to be _visible_ for searches, be sure your
autocommit has openSearcher set to false and disable soft commits
(set the interval to -1 or remove it from your solrconfig).

Best,
Erick
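A solrconfig.xml sketch of the settings Erick describes, with the intervals suggested in the thread:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 60 s for durability; no new searcher is opened -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commits disabled during the bulk load: nothing becomes
       visible until an explicit commit at the end -->
  <autoSoftCommit>
    <maxTime>-1</maxTime>
  </autoSoftCommit>
</updateHandler>
```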

On Fri, Jun 5, 2015 at 8:21 AM, Alessandro Benedetti
 wrote:

I can not see any problem in that, but talking about commits I would like
to make a difference between "Hard" and "Soft" .

Hard commit -> durability
Soft commit -> visibility

I suggest you this interesting reading :
https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
It's an old interesting Erick post.

It explains you better what are the differences between different commit
types.

I would put you in this scenario :

Heavy (bulk) indexing

The assumption here is that you’re interested in getting lots of data to
the index as quickly as possible for search sometime in the future. I’m
thinking original loads of a data source etc.

- Set your soft commit interval quite long. As in 10 minutes or even
longer (-1 for no soft commits at all). *Soft commit is about
visibility, *and my assumption here is that bulk indexing isn’t about
near real time searching so don’t do the extra work of opening any kind of
searcher.
- Set your hard commit intervals to 15 seconds, openSearcher=false.
Again the assumption is that you’re going to be just blasting data at Solr.
The worst case here is that you restart your system and have to replay 15
seconds or so of data from your tlog. If your system is bouncing up and
down more often than that, fix the reason for that first.
- Only after you’ve tried the simple things should you consider
refinements, they’re usually only required in unusual circumstances. But
they include:
   - Turning off the tlog completely for the bulk-load operation
   - Indexing offline with some kind of map-reduce process
   - Only having a leader per shard, no replicas for the load, then
   turning on replicas later and letting them do old-style replication to
   catch up. Note that this is automatic, if the node discovers it is “too
   far” out of sync with the leader, it initiates an old-style replication.
   After it has caught up, it’ll get documents as they’re indexed to the
   leader and keep its own tlog.
   - etc.



Actually you could do the commit only at the end, but I can not see any
advantage in that.
I suggest you to play with auto hard/soft commit config and get a better
idea of the situation !

Cheers


Re: Possible or not ?

2015-06-05 Thread Bruno Mannina

Thanks for the link,

So, I launched this post; I will see on Monday if it is ok :)

Le 05/06/2015 17:21, Alessandro Benedetti a écrit :





How to index text field with html entities ?

2016-07-29 Thread Bruno Mannina

Dear Solr User,

Solr 5.0.1

I have several xml files that contain html entities in some fields.

I have an author field (english text) with this kind of text:

Brown & Gammon

If I set my field like this:

<au>Brown & Gammon</au>

Solr generates the error "Undeclared general entity"

If I add CDATA like this:

<au><![CDATA[Brown & Gammon]]></au>

it seems that I can't search with the &:

au:"brown & gammon"

Could you help me to find the right syntax ?

Thanks a lot,

Bruno







Re: How to index text field with html entities ?

2016-07-29 Thread Bruno Mannina

Hi Chris,

Thanks for your answer, and I add a little thing,

after checking my log it seems that it concerns only some html entities.
No problem with &amp;, but I have problems with:

&uuml;
&ldquo;
etc...

I will check your answer to find a solution,

Thanks !

Le 29/07/2016 à 23:58, Chris Hostetter a écrit :

: I have several xml files that contains html entities in some fields.

...

: If I set my field like this:
:
: Brown & Gammon
:
: Solr generates error "Undeclared general entity"

...because that's not valid XML...

: if I add CDATA like this:
:
: 
:
: it seems that I can't search with the &

...because that is valid xml, and tells solr you want the literal string
"Brown & Gammon" to be indexed -- given a typical analyzer you are
probably getting either "&" or "amp" as a term in your index.

: Could you help me to find the right syntax ?

the client code you are using for indexing can either "parse" these HTML
snippets using an HTML parser, and then send solr the *real* string you
want to index, or you can configure solr with something like
HTMLStripFieldUpdateProcessorFactory (if you want both the indexed form
and the stored form to be plain text) or HTMLStripCharFilterFactory (if
you want to preserve the html markup in the stored value, but strip it as
part of the analysis chain for indexing).


http://lucene.apache.org/solr/6_1_0/solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html
http://lucene.apache.org/core/6_1_0/analyzers-common/org/apache/lucene/analysis/charfilter/HTMLStripCharFilterFactory.html


-Hoss
http://www.lucidworks.com/
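A fieldType sketch for the second option Hoss mentions (the type name and the rest of the analysis chain are illustrative); the stored value keeps the markup, while entities are decoded and tags stripped for indexing:

```xml
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Decodes HTML entities such as &uuml; and strips tags before tokenizing -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```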







Re: How to index text field with html entities ?

2016-07-30 Thread Bruno Mannina

Thanks Shawn for these precisions

Le 30/07/2016 à 00:43, Shawn Heisey a écrit :

On 7/29/2016 4:05 PM, Bruno Mannina wrote:

after checking my log it seems that it concerns only some html entities.
No problem with & but I have problem with:

ü
“
etc...

Those are valid *HTML* entities, but they are not valid *XML* entities.
The list of entities that are valid in XML is quite short -- there are
only five of them.

https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML

When Solr processes XML, it is only going to convert entities that are
valid for XML -- the five already mentioned.  It will fail on the other
247 entities that are only valid for HTML.

If you are seeing the problem with & (which is one of the five valid
XML entities) then we'll need the Solr version and the full error
message/stacktrace from the solr logfile.

Thanks,
Shawn








Strange error when I try to copy....

2016-09-09 Thread Bruno Mannina

Dear Solr Users,

I have been using Solr for several years, and for two weeks I have had a
problem when I try to copy my Solr index.

My Solr index is around 180 GB (~100 000 000 docs, 1 doc ~ 3 KB).

My method to save my index every Sunday:

- I stop Solr 5.4 on Ubuntu 14.04 LTS - 16 GB RAM - i3-2120 CPU @ 3.30GHz

- I do a simple directory copy of /data to my backup HDD (from a 2 TB SATA
drive to a 2 TB SATA drive directly connected to the motherboard).

All files are copied fine but one: the biggest (~65 GB) fails.

I have the message: "Error splicing file: Input/output error"

I also tried on Windows (I have a dual boot); there I get a "redundancy error".

I checked my HDD, no error; I checked the file "_k46.fdt", no error: I can
delete docs, add docs, and my database can be reached and works fine.

Does someone have an idea how to back up my database, or why I have this error ?

Many thanks for your help,

Sincerely,

Bruno







Re: Strange error when I try to copy....

2016-09-09 Thread Bruno Mannina

Le 09/09/2016 à 17:57, Shawn Heisey a écrit :

On 9/8/2016 9:41 AM, Bruno Mannina wrote:

- I stop SOLR 5.4 on Ubuntu 14.04LTS - 16Go - i3-2120 CPU @ 3.30Ghz

- I do a simple directory copy /data to my HDD backup (from 2To SATA
to 2To SATA directly connected to the Mothercard).

All files are copied fine but one not ! the biggest (~65Go) failed.

I have the message : "Error splicing file: Input/output error"

This isn't a Solr issue, which is easy to determine by the fact that
you've stopped Solr and it's not even running.  It's a problem with the
filesystem, probably the destination filesystem.

The most common reason that I have found for this error is a destination
filesystem that is incapable of holding a large file -- which can happen
when the disk is formatted fat32 instead of ntfs or a Linux filesystem.
You can have a 2TB filesystem with fat32, but no files larger than 4GB
-- so your 65GB file won't fit.

I think you're going to need to reformat that external drive with
another filesystem.  If you choose NTFS, you'll be able to use the disk
on either Linux or Windows.

Thanks,
Shawn



Hi Shawn,

First, thanks for your answer; indeed, it is a bit clearer now.
Tonight I will check the file system of my hdd.

And sorry for this question out of solr subject.

Cdlt,
Bruno
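A quick way to run the filesystem check Shawn describes (the "." is a placeholder; point it at the backup mount, e.g. /media/backup, on the real machine):

```shell
# Print the filesystem type of the mount that holds a directory.
df -T .
# A "vfat" type on the backup drive would mean FAT32, whose 4 GB per-file
# limit matches the symptom: every file copies except the 65 GB one.
```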






Solr 5.4.0: Colored Highlight and multi-value field ?

2017-10-03 Thread Bruno Mannina
Dear all,

Is it possible to have a colored highlight in a multi-value field ?

I succeed in doing it on a text field, but not in a multi-value field; there
SOLR takes hl.simple.pre / hl.simple.post as the tag.

Thanks a lot for your help,

Cordialement, Best Regards

Bruno Mannina
www.matheo-software.com
www.patent-pulse.com
Tél. +33 0 970 738 743
Mob. +33 0 634 421 817






RE: Solr 5.4.0: Colored Highlight and multi-value field ?

2017-10-06 Thread Bruno Mannina
Hi Erik,

Sorry for the late reply, I wasn't in my office this week...

So, I give more information:

* IC is a multi-value field defined like this:


* The request I use (i.e):
http://my_host/solr/collection/select?
q=ic:(A63C10* OR G06F22/086)
&start=0
&rows=10
&wt=json
&indent=true
&sort=pd+desc
&fl=*
// HighLight
&hl=true
&hl.fl=ti,ab,ic,inc,cpc,apc
&hl.simple.pre=
&hl.simple.post=
&hl.fragmentsBuilder=colored
&hl.useFastVectorHighlighter=true
&hl.highlightMultiTerm=true
&hl.usePhraseHighlighter=true
&hl.fragsize=999
&hl.preserveMulti=true

* Result:
I have only one color (in my case the yellow) for all different values found

* BUT *

If I use a non-multi-value field like ti (title) with a query with some keywords,

* Result (i.e. ti:(foo OR merge)):
I have different colors for each different term found


Question:
- Is it because the IC field is not defined with all the term*="true" options ?
- How can I have different colors and not use the pre and post tags ?


Many thanks for your help !
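For reference, the FastVectorHighlighter that the colored fragments builder relies on only works on fields indexed with full term vectors, so the multi-valued field would need the term* attributes — a sketch, with the field type being an assumption:

```xml
<field name="ic" type="string" indexed="true" stored="true" multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

Changing these attributes requires reindexing the collection before highlighting behaves differently.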

-Message d'origine-
De : Erick Erickson [mailto:erickerick...@gmail.com]
Envoyé : mercredi 4 octobre 2017 15:48
À : solr-user
Objet : Re: Solr 5.4.0: Colored Highlight and multi-value field ?

How does it not work for you? Details matter, an example set of values and the 
response from Solr are good bits of info for us to have.

On Tue, Oct 3, 2017 at 3:59 PM, Bruno Mannina 
wrote:

> Dear all,
>
>
>
> Is it possible to have a colored highlight in a multi-value field ?
>
>
>
> I’m succeed to do it on a textfield but not in a multi-value field,
> then SOLR takes hl.simple.pre / hl.simple.post as tag.
>
>
>
> Thanks a lot for your help,
>
>
>
> Cordialement, Best Regards
>
> Bruno Mannina
>
> www.matheo-software.com
>
> www.patent-pulse.com
>
> Tél. +33 0 970 738 743
>
> Mob. +33 0 634 421 817
>
>
>
>
>





Get docs with same value in one other field ?

2017-02-22 Thread Bruno Mannina


Hello all,



I'm facing a problem, and I would like to know if it's possible to solve it
with one request in Solr.

I have SOLR 5.



I have docs with several fields but here two are useful for us.

Field 1 : id (unique key)

Field 2 : fid (family Id)



i.e:



id:XXX

fid: 1254



id: YYY

fid: 1254



id: ZZZ

fid:3698



id: QQQ

fid: 3698

.



I query only by id in my project, and I would like my result to also include
all docs that have the same fid.

i.e. if I request :

..q=id:ZZZ&.



I get the docs ZZZ of course but also QQQ because QQQ_fid = ZZZ_fid



MoreLikeThis, grouping, etc. don't answer my question (but maybe I just don't know
how to use them for that)



Thanks for your help,



Bruno









RE: Get docs with same value in one other field ?

2017-02-22 Thread Bruno Mannina
Yes, it's perfect!!! It works.

Thanks David & Alexandre !

-----Original Message-----
From: David Hastings [mailto:hastings.recurs...@gmail.com]
Sent: Wednesday, February 22, 2017 23:00
To: solr-user@lucene.apache.org
Subject: Re: Get docs with same value in one other field ?

sorry embedded link:

q={!join+from=fid+to=fid}id:ZZZ
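For illustration, a minimal sketch of building that join-query URL with Python's standard library (the host, port, and core name are assumptions, not from the thread):

```python
from urllib.parse import urlencode

def family_query_url(base_url, doc_id):
    # {!join from=fid to=fid} maps the matched doc's family id (fid)
    # back onto fid, returning every document in the same family.
    params = {"q": "{!join from=fid to=fid}id:%s" % doc_id, "wt": "json"}
    return base_url + "/select?" + urlencode(params)

url = family_query_url("http://localhost:8983/solr/collection1", "ZZZ")
```

urlencode escapes the braces and colon, so the raw query string stays valid in a URL.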

On Wed, Feb 22, 2017 at 4:58 PM, David Hastings < hastings.recurs...@gmail.com> 
wrote:

> for a reference to some examples:
>
> https://wiki.apache.org/solr/Join
>
> sor youd want something like:
>
> q={!join+from=fid+to=fid}i
> <http://localhost:8983/solr/select?q=%7B!join+from=manu_id_s+to=id%7Di
> pod>
> d:ZZZ
>
> i dont have much experience with this function however
>
>
>
> On Wed, Feb 22, 2017 at 4:40 PM, Alexandre Rafalovitch
>  > wrote:
>
>> Sounds like two clauses, with the second clause being a JOIN search
>> where you match by ID and then join on FID.
>>
>> Would that work?
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>>
>>
>> On 22 February 2017 at 16:27, Bruno Mannina  wrote:
>> >
>> >
>> > Hello all,
>> >
>> >
>> >
>> > I'm facing a problem that I would like to know if it's possible to
>> > do it with one request in SOLR.
>> >
>> > I have SOLR 5.
>> >
>> >
>> >
>> > I have docs with several fields but here two are useful for us.
>> >
>> > Field 1 : id (unique key)
>> >
>> > Field 2 : fid (family Id)
>> >
>> >
>> >
>> > i.e:
>> >
>> >
>> >
>> > id:XXX
>> >
>> > fid: 1254
>> >
>> >
>> >
>> > id: YYY
>> >
>> > fid: 1254
>> >
>> >
>> >
>> > id: ZZZ
>> >
>> > fid:3698
>> >
>> >
>> >
>> > id: QQQ
>> >
>> > fid: 3698
>> >
>> > .
>> >
>> >
>> >
>> > I request only by id in my project, and I would like in my result
>> > have
>> also
>> > all docs that have the same fid .
>> >
>> > i.e. if I request :
>> >
>> > ..q=id:ZZZ&.
>> >
>> >
>> >
>> > I get the docs ZZZ of course but also QQQ because QQQ_fid = ZZZ_fid
>> >
>> >
>> >
>> > MoreLikeThis, Group, etc. don't answer to my question (but may I
>> > don't
>> know
>> > how to use it to do that)
>> >
>> >
>> >
>> > Thanks for your help,
>> >
>> >
>> >
>> > Bruno
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
>





RE: Get docs with same value in one other field ?

2017-02-22 Thread Bruno Mannina
Just one more thing: I need to request up to 1000 ids.
Currently I test with 2 or 3 and it already takes time (my db is around 100 000 000
docs, 128 GB RAM).

Do you think it could cause an OOM error if I try with up to 1000 ids?

-----Original Message-----
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, February 22, 2017 23:47
To: solr-user@lucene.apache.org
Subject: RE: Get docs with same value in one other field ?

Yes, it's perfect!!! It works.

Thanks David & Alexandre !

-----Original Message-----
From: David Hastings [mailto:hastings.recurs...@gmail.com]
Sent: Wednesday, February 22, 2017 23:00
To: solr-user@lucene.apache.org
Subject: Re: Get docs with same value in one other field ?

sorry embedded link:

q={!join+from=fid+to=fid}id:ZZZ

On Wed, Feb 22, 2017 at 4:58 PM, David Hastings < hastings.recurs...@gmail.com> 
wrote:

> for a reference to some examples:
>
> https://wiki.apache.org/solr/Join
>
> sor youd want something like:
>
> q={!join+from=fid+to=fid}i
> <http://localhost:8983/solr/select?q=%7B!join+from=manu_id_s+to=id%7Di
> pod>
> d:ZZZ
>
> i dont have much experience with this function however
>
>
>
> On Wed, Feb 22, 2017 at 4:40 PM, Alexandre Rafalovitch 
>  > wrote:
>
>> Sounds like two clauses, with the second clause being a JOIN search 
>> where you match by ID and then join on FID.
>>
>> Would that work?
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and 
>> experienced
>>
>>
>> On 22 February 2017 at 16:27, Bruno Mannina  wrote:
>> >
>> >
>> > Hello all,
>> >
>> >
>> >
>> > I'm facing a problem that I would like to know if it's possible to 
>> > do it with one request in SOLR.
>> >
>> > I have SOLR 5.
>> >
>> >
>> >
>> > I have docs with several fields but here two are useful for us.
>> >
>> > Field 1 : id (unique key)
>> >
>> > Field 2 : fid (family Id)
>> >
>> >
>> >
>> > i.e:
>> >
>> >
>> >
>> > id:XXX
>> >
>> > fid: 1254
>> >
>> >
>> >
>> > id: YYY
>> >
>> > fid: 1254
>> >
>> >
>> >
>> > id: ZZZ
>> >
>> > fid:3698
>> >
>> >
>> >
>> > id: QQQ
>> >
>> > fid: 3698
>> >
>> > .
>> >
>> >
>> >
>> > I request only by id in my project, and I would like in my result 
>> > have
>> also
>> > all docs that have the same fid .
>> >
>> > i.e. if I request :
>> >
>> > ..q=id:ZZZ&.
>> >
>> >
>> >
>> > I get the docs ZZZ of course but also QQQ because QQQ_fid = ZZZ_fid
>> >
>> >
>> >
>> > MoreLikeThis, Group, etc. don't answer to my question (but may I 
>> > don't
>> know
>> > how to use it to do that)
>> >
>> >
>> >
>> > Thanks for your help,
>> >
>> >
>> >
>> > Bruno
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
>





RE: Get docs with same value in one other field ?

2017-02-22 Thread Bruno Mannina
OK Alex, I will look for a better solution. I'm afraid of getting an OOM with a 
huge number of ids.

And yes, I already use a POST query; the URL above was just to show my problem.
Anyway, thanks for pointing that out as well.

-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Thursday, February 23, 2017 00:08
To: solr-user
Subject: Re: Get docs with same value in one other field ?

A thousand IDs could be painful to send and perhaps to run against.

At minimum, look into splitting your query into multiple variables (so you 
could reuse the list in both the direct and the join query). Also look at the terms 
query parser, which specializes in lists of IDs. You may also need to send 
your ID list as a POST, not a GET, request to avoid blowing the URL length.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced
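A minimal sketch of those suggestions, assuming a Solr version that ships the {!terms} query parser (it arrived via SOLR-6318, cited earlier in this archive; the chunk size and field name are illustrative):

```python
def terms_query_chunks(ids, field="id", chunk=500):
    # {!terms} takes a comma-separated value list, which is far
    # cheaper to parse than a 1000-clause boolean OR query.
    for i in range(0, len(ids), chunk):
        yield "{!terms f=%s}%s" % (field, ",".join(ids[i:i + chunk]))

chunks = list(terms_query_chunks([str(n) for n in range(1000)]))
```

Each chunk would then go in the body of a POST request, keeping the URL short.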


On 22 February 2017 at 17:55, Bruno Mannina  wrote:
> Just a little more thing, I need to request up to 1000 id's Actually I
> test with 2 or 3 and it takes times (my db is around 100 000 000 docs, 128Go 
> RAM).
>
> Do you think, it could be OOM error ? if I test with up to 1000 id ?
>
> -----Original Message-----
> From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, February
> 22, 2017 23:47 To: solr-user@lucene.apache.org Subject: RE: Get
> docs with same value in one other field ?
>
> Ye it's perfect !!! it works.
>
> Thanks David & Alexandre !
>
> -----Original Message-----
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: Wednesday, February 22, 2017 23:00 To:
> solr-user@lucene.apache.org Subject: Re: Get docs with same value in
> one other field ?
>
> sorry embedded link:
>
q={!join+from=fid+to=fid}id:ZZZ
>
> On Wed, Feb 22, 2017 at 4:58 PM, David Hastings < 
> hastings.recurs...@gmail.com> wrote:
>
>> for a reference to some examples:
>>
>> https://wiki.apache.org/solr/Join
>>
>> sor youd want something like:
>>
>> q={!join+from=fid+to=fid}i
>> <http://localhost:8983/solr/select?q=%7B!join+from=manu_id_s+to=id%7D
>> i
>> pod>
>> d:ZZZ
>>
>> i dont have much experience with this function however
>>
>>
>>
>> On Wed, Feb 22, 2017 at 4:40 PM, Alexandre Rafalovitch
>> > > wrote:
>>
>>> Sounds like two clauses, with the second clause being a JOIN search
>>> where you match by ID and then join on FID.
>>>
>>> Would that work?
>>>
>>> Regards,
>>>Alex.
>>> 
>>> http://www.solr-start.com/ - Resources for Solr users, new and
>>> experienced
>>>
>>>
>>> On 22 February 2017 at 16:27, Bruno Mannina  wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> >
>>> > I'm facing a problem that I would like to know if it's possible to
>>> > do it with one request in SOLR.
>>> >
>>> > I have SOLR 5.
>>> >
>>> >
>>> >
>>> > I have docs with several fields but here two are useful for us.
>>> >
>>> > Field 1 : id (unique key)
>>> >
>>> > Field 2 : fid (family Id)
>>> >
>>> >
>>> >
>>> > i.e:
>>> >
>>> >
>>> >
>>> > id:XXX
>>> >
>>> > fid: 1254
>>> >
>>> >
>>> >
>>> > id: YYY
>>> >
>>> > fid: 1254
>>> >
>>> >
>>> >
>>> > id: ZZZ
>>> >
>>> > fid:3698
>>> >
>>> >
>>> >
>>> > id: QQQ
>>> >
>>> > fid: 3698
>>> >
>>> > .
>>> >
>>> >
>>> >
>>> > I request only by id in my project, and I would like in my result
>>> > have
>>> also
>>> > all docs that have the same fid .
>>> >
>>> > i.e. if I request :
>>> >
>>> > ..q=id:ZZZ&.
>>> >
>>> >
>>> >
>>> > I get the docs ZZZ of course but also QQQ because QQQ_fid =
>>> > ZZZ_fid
>>> >
>>> >
>>> >
>>> > MoreLikeThis, Group, etc. don't answer to my question (but may I
>>> > don't
>>> know
>>> > how to use it to do that)
>>> >
>>> >
>>> >
>>> > Thanks for your help,
>>> >
>>> >
>>> >
>>> > Bruno
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>
>>
>
>
>





Get docs with same value in one other field ?

2017-02-22 Thread Bruno Mannina
Hello all,



I’m facing a problem, and I would like to know if it’s possible to solve it
with one request in Solr.

I have SOLR 5.



I have docs with several fields but here two are useful for us.

Field 1 : id (unique key)

Field 2 : fid (family Id)



i.e:



id:XXX

fid: 1254



id: YYY

fid: 1254



id: ZZZ

fid:3698



id: QQQ

fid: 3698

…



I query only by id in my project, and I would like my result to also include
all docs that have the same fid.

i.e. if I request :

..q=id:ZZZ&…



I get the docs ZZZ of course but also QQQ because QQQ_fid = ZZZ_fid



MoreLikeThis, grouping, etc… don’t answer my question (but maybe I just don’t know
how to use them for that)



Thanks for your help,



Bruno





Bruno Mannina

 <http://www.matheo-software.com> www.matheo-software.com

 <http://www.patent-pulse.com> www.patent-pulse.com

Tél. +33 0 430 650 788
Fax. +33 0 430 650 728



Stay in touch!








Solr5, Clustering & exact phrase problem

2017-03-13 Thread Bruno Mannina
Dear Solr-User,



I’m trying to use solr clustering (Lingo algorithm) on my database (notices
with id, title, abstract fields)



Everything works fine when my query is simple (with or without Boolean operators),
but if I try an exact phrase like:

..&q=ti:“snowboard binding”&…



Then Solr generates only one cluster, named “Other”, and puts all the notices
inside it.



As I have only been testing it for a short while, my solrconfig still contains the
sample configuration that the example ships with.

Of course, I changed field names.



Do you know if I made a mistake or am missing something, or maybe exact phrases are
not supported by clustering ?



Just one other question: I want to generate clusters using the abstract and title
fields. Is this what I put in my solrconfig correct:

carrot.title = title

carrot.snippet = abstract
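For reference, those two parameters normally sit in the clustering request handler's defaults. A sketch following the stock Solr clustering example, with only the field names taken from this message (handler path, engine name, and the rest are assumptions):

```xml
<requestHandler name="/clustering" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">lingo</str>
    <!-- map the document fields onto Carrot2's title/content inputs -->
    <str name="carrot.title">title</str>
    <str name="carrot.snippet">abstract</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
```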



Thanks a lot for your help,



Bruno Mannina

 <http://www.matheo-software.com> www.matheo-software.com

 <http://www.patent-pulse.com> www.patent-pulse.com

Tél. +33 0 430 650 788
Fax. +33 0 430 650 728








Shards, delete duplicates ?

2017-04-14 Thread Bruno Mannina
Dear Solr users,



I have two collections C1 and C2

For C1 and C2 the unique key is ID.



IDs in C1 are normalized patent numbers, i.e. US + 12 digits + A1.

IDs in C2 are patent numbers as I receive them: US + 13 digits + A1 (a
leading 0 is added).



My collection C2 has a field named ID12, which is not defined as a unique
field.

This ID12 is a copy of C1's ID field (US + 12 digits + A1).

Data in ID12 are unique in the whole C2 collection.



Data in C1_ID and C2_ID12 are the same.



I try to query both collections using shards in the URL.

It works fine, but I get duplicate documents. That’s normal, I know.



Does a method, a parameter, or anything else exist that allows me to tell
Solr to compare ID in C1 with ID12 in C2 and remove the duplicates ?



Many thanks for your help,





Bruno Mannina

 <http://www.matheo-software.com> www.matheo-software.com

 <http://www.patent-pulse.com> www.patent-pulse.com

Tél. +33 0 430 650 788
Fax. +33 0 430 650 728








How can I request a big list of values ?

2014-08-09 Thread Bruno Mannina

Hi All,

I'm currently using Solr 3.6 and I have around 91 000 000 docs inside.

All work fine, it's great :)

But now, I would like to request a list of values in the same field
(more than 2000 values)

I know I can use ?q=x:(AAA BBB CCC ...) (my default operator is OR)

but I have a list of 2000 values! I don't think this method is a good
idea.

Can someone help me find a good solution?
Can I use a JSON structure via a POST method?

Thanks a lot,
Bruno


---
This email contains no viruses or malware because avast! Antivirus protection
is active.
http://www.avast.com


Re: How can I request a big list of values ?

2014-08-10 Thread Bruno Mannina

Hi Jack,

OK, but for 2000 values it means I must make 40 requests if I choose 
50 values per request :'(
And in my case a user can choose about 8 topics, so it can generate 8 
times 40 requests... hmm...


Is it not possible to send a text, JSON, or XML file?
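Jack's suggestion in this thread, several smaller parallel requests merged in the application, could be sketched like this (the batch size of 50 and the field name x come from the thread; everything else is an assumption):

```python
def split_batches(values, size=50):
    # 2000 values at 50 per request -> 40 smaller requests
    return [values[i:i + size] for i in range(0, len(values), size)]

def or_clause(field, batch):
    # Each batch becomes one moderate boolean OR query on the field
    return "%s:(%s)" % (field, " OR ".join(batch))

batches = split_batches(["V%04d" % n for n in range(2000)])
queries = [or_clause("x", b) for b in batches]
```

The application would then run these queries in parallel and merge the per-batch results.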

On 10/08/2014 17:38, Jack Krupansky wrote:
Generally, "large requests" are an anti-pattern in modern distributed 
systems. Better to have a number of smaller requests executing in 
parallel and then merge the results in the application layer.


-- Jack Krupansky

-----Original Message----- From: Bruno Mannina
Sent: Saturday, August 9, 2014 7:18 PM
To: solr-user@lucene.apache.org
Subject: How can I request a big list of values ?

Hi All,

I'm using actually SOLR 3.6 and I have around 91 000 000 docs inside.

All work fine, it's great :)

But now, I would like to request a list of values in the same field
(more than 2000 values)

I know I can use ?q=x:(AAA BBB CCC ...) (my default operator is OR)

but I have a list of 2000 values ! I think it's not the good idea to use
this method.

Can someone help me to find the good solution ?
Can I use a json structure by using a POST method ?

Thanks a lot,
Bruno










Re: How can I request a big list of values ?

2014-08-10 Thread Bruno Mannina

Hi Anshum,

I can do it with the 3.6 release, no?

My main problem is that I have around 2000 values, so I can't use one 
request with all of them; it's too wide. :'(


I will take a look at generating (as Jack proposed) several requests, 
but even in that case it doesn't seem safe...


On 10/08/2014 19:45, Anshum Gupta wrote:

Hi Bruno,

If you had been on a more recent release,
https://issues.apache.org/jira/browse/SOLR-6318 would perhaps have come in
handy.
You might want to look at patching your version with it, though (as a
workaround).

On Sat, Aug 9, 2014 at 4:18 PM, Bruno Mannina  wrote:

Hi All,

I'm using actually SOLR 3.6 and I have around 91 000 000 docs inside.

All work fine, it's great :)

But now, I would like to request a list of values in the same field (more
than 2000 values)

I know I can use ?q=x:(AAA BBB CCC ...) (my default operator is OR)

but I have a list of 2000 values ! I think it's not the good idea to use
this method.

Can someone help me to find the good solution ?
Can I use a json structure by using a POST method ?

Thanks a lot,
Bruno











statistic on a field?

2013-10-20 Thread Bruno Mannina

Dear,

I have a field named "Authors"; is it possible to get
the frequency of terms (e.g. the first 2000) for this field?

Thanks,

Bruno


Re: statistic on a field?

2013-10-20 Thread Bruno Mannina

On 20/10/2013 17:52, Bruno Mannina wrote:

Dear,

I have a field named "Authors"; is it possible to get
the frequency of terms (e.g. the first 2000) for this field?

Thanks,

Bruno


Using the Schema Browser, I get information on my Authors field, but there 
is a problem:

I get statistics on parts of the terms of this field...

i.e.

term    freq
co      256875
ltd     235899
corp    195554
etc...

Has the field been split to compute the stats?!

FieldType: TEXT_GENERAL
Properties: Indexed, Tokenized, Stored, Multivalued
Schema: Indexed, Tokenized, Stored, Multivalued
Index: indexed, Tokenized, Stored

Position Increment Gap: 100

Distinct: 1803034

I think it's because this field is tokenized, no?

Regards,
Bruno




Is Solr can create temporary sub-index ?

2013-10-23 Thread Bruno Mannina

Dear Solr User,

We have to do a new web project, which is: connect our Solr database to 
a web platform.


This web platform will be used by several users at the same time.
They make requests to our Solr and can apply filters to the result.

i.e.:
Our Solr contains 87M docs.
A user makes a request; the result is around a few hundred to several thousand docs.
On the web platform, the user will see the first 20 results (or more by using 
the Next Page button).
But he will also need to filter the whole result by additional terms 
(terms that our platform will propose to him).


Can Solr create a temporary index (managed by Solr itself during a web 
session)?


My goal is not to download the whole result to a local computer in order to 
filter it, nor to re-send
the same request several times with the new criteria added.

Many thanks for your comment,

Regards,
Bruno


Re: Is Solr can create temporary sub-index ?

2013-10-23 Thread Bruno Mannina

Hello Tim,

Yes, Solr's facets could be a solution, but I need to re-send the q= each 
time.

I'm just wondering whether another solution exists.

Facet seems to be the good solution.

Bruno



On 23/10/2013 17:03, Timothy Potter wrote:

Hi Bruno,

Have you looked into Solr's facet support? If I'm reading your post
correctly, this sounds like the classic case for facets. Each time the user
selects a facet, you add a filter query (fq clause) to the original query.
http://wiki.apache.org/solr/SolrFacetingOverview

Tim


On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina  wrote:


Dear Solr User,

We have to do a new web project which is : Connect our SOLR database to a
web plateform.

This Web Plateform will be used by several users at the same time.
They do requests on our SOLR and they can apply filter on the result.

i.e.:
Our SOLR contains 87M docs
An user do requests, result is around few hundreds to several thousands.
On the Web Plateform, user will see first 20 results (or more by using
Next Page button)
But he will need also to filter the whole result by additional terms.
(Terms that our plateform will propose him)

Is SOLR can create temporary index (manage by SOLR himself during a web
session) ?

My goal is to not download the whole result on local computer to provide
filter, or to re-send
the same request several times added to the new criterias.

Many thanks for your comment,

Regards,
Bruno
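Tim's facet-then-filter flow amounts to re-sending q with extra fq clauses. A minimal sketch as query-string construction (the facet field ap and the value are illustrative assumptions):

```python
from urllib.parse import urlencode

def facet_filter_params(q, facet_field, selected):
    # The original q is re-sent unchanged; each facet value the user
    # clicks becomes an additional fq clause narrowing the result.
    params = [("q", q), ("facet", "true"), ("facet.field", facet_field)]
    params += [("fq", "%s:%s" % (facet_field, v)) for v in selected]
    return urlencode(params)

qs = facet_filter_params("ti:snowboard", "ap", ["burton"])
```

Because fq clauses are cached independently by Solr, repeated drill-downs on the same facet values stay cheap.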





Re: Is Solr can create temporary sub-index ?

2013-10-23 Thread Bruno Mannina

I have a little question concerning statistics on a request:

I have a field defined like that:
<field name="ic" type="text_classification" indexed="true" stored="true"
multiValued="true"/>

<fieldType name="text_classification" class="solr.TextField"
positionIncrementGap="100" autoGeneratePhraseQueries="true">
 <analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

Data sample for this field:

 A23L1/22066
 A23L1/227
 A23L1/231
 A23L1/2375


My question is:
  Is it possible to get the frequency of terms over the whole result of the 
initial user's request?


Thanks a lot,
Bruno

On 23/10/2013 18:12, Timothy Potter wrote:

Yes, absolutely you resend the q= each time, optionally with any facets
selected by the user using fq=


On Wed, Oct 23, 2013 at 10:00 AM, Bruno Mannina  wrote:


Hello Tim,

Yes solr's facet could be a solution, but I need to re-send the q= each
time.
I'm asking me just if an another solution exists.

Facet seems to be the good solution.

Bruno



On 23/10/2013 17:03, Timothy Potter wrote:

  Hi Bruno,

Have you looked into Solr's facet support? If I'm reading your post
correctly, this sounds like the classic case for facets. Each time the
user
selects a facet, you add a filter query (fq clause) to the original query.
http://wiki.apache.org/solr/SolrFacetingOverview

Tim


On Wed, Oct 23, 2013 at 8:16 AM, Bruno Mannina  wrote:

  Dear Solr User,

We have to do a new web project which is : Connect our SOLR database to a
web plateform.

This Web Plateform will be used by several users at the same time.
They do requests on our SOLR and they can apply filter on the result.

i.e.:
Our SOLR contains 87M docs
An user do requests, result is around few hundreds to several thousands.
On the Web Plateform, user will see first 20 results (or more by using
Next Page button)
But he will need also to filter the whole result by additional terms.
(Terms that our plateform will propose him)

Is SOLR can create temporary index (manage by SOLR himself during a web
session) ?

My goal is to not download the whole result on local computer to provide
filter, or to re-send
the same request several times added to the new criterias.

Many thanks for your comment,

Regards,
Bruno






Re: Is Solr can create temporary sub-index ?

2013-10-23 Thread Bruno Mannina
Hmm, I think my fieldType "text_classification" is not appropriate for 
this kind of data...


I don't need stopwords, synonyms, etc...

The IC field is a field that contains codes, and codes often contain the 
char "/",

and if I use the Terms option, I get:


...
4563254
3763554
2263254
...
..

On 23/10/2013 18:51, Bruno Mannina wrote:
<fieldType name="text_classification" class="solr.TextField"
positionIncrementGap="100" autoGeneratePhraseQueries="true">
 <analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
 <analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>



Re: Is Solr can create temporary sub-index ?

2013-10-23 Thread Bruno Mannina

I need your help to define the right fieldType, please.

This field must be indexed and stored, and each value must be considered as 
one term.

The char / must not be considered a separator.

Could String be a good fieldType?

thanks

On 23/10/2013 18:51, Bruno Mannina wrote:


 A23L1/22066
 A23L1/227
 A23L1/231
 A23L1/2375
 




What is the right fieldType for this kind of field?

2013-10-23 Thread Bruno Mannina

Dear,

Data looks like:

A23L1/22066
 A23L1/227
 A23L1/231
 A23L1/2375

I tried:
- String
but I can't search with truncation (i.e. A23*)

- Text_General
but as my codes contain /, the data are split...

What kind of field should I choose to allow truncation and treat a code with 
/ as one term?


thanks a lot for your help,
Bruno


Re: What is the right fieldType for this kind of field?

2013-10-23 Thread Bruno Mannina

Hi Jack,

Yes, String works fine; I forgot to restart my Solr server after changing 
my schema.xml... argh, I'm so stupid, sorry!


On 23/10/2013 20:09, Jack Krupansky wrote:
Trailing wildcard should work fine for strings, but "a23*" will not 
match "A23*" due to case. You could use the keyword tokenizer plus the 
lower case filter.


-- Jack Krupansky

-----Original Message----- From: Bruno Mannina
Sent: Wednesday, October 23, 2013 1:54 PM
To: solr-user@lucene.apache.org
Subject: What is the right fieldType for this kind of field?

Dear,

Data look likes:

A23L1/22066
 A23L1/227
 A23L1/231
 A23L1/2375

I tried:
- String
but I can't search with truncation (i.e. A23*)

- Text_General
but as my codes contain /, the data are split...

What kind of field should I choose to allow truncation and treat a code with
/ as one term?

thanks a lot for your help,
Bruno
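Jack's suggestion (keyword tokenizer plus lower-case filter) corresponds to a fieldType along these lines; a sketch, where the name code_exact is an assumption and the factories are the standard Solr ones:

```xml
<fieldType name="code_exact" class="solr.TextField">
  <analyzer>
    <!-- KeywordTokenizer keeps the whole value, "/" included, as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- LowerCaseFilter lets a23* match a field value of A23L1/227 -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```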






Re: What is the right fieldType for this kind of field?

2013-10-23 Thread Bruno Mannina

On 23/10/2013 20:09, Jack Krupansky wrote:
You could use the keyword tokenizer plus the lower case filter. 

Jack,

Could you help me to write the right fieldType please?
(index and query)

Another thing: I don't know whether I should use the Keyword tokenizer, because 
codes contain the "/" char,

and a tokenizer seems to split the code, no?

Many thanks,

Bruno


Re: What is the right fieldType for this kind of field?

2013-10-23 Thread Bruno Mannina

On 23/10/2013 22:44, Bruno Mannina wrote:

On 23/10/2013 20:09, Jack Krupansky wrote:
You could use the keyword tokenizer plus the lower case filter. 

Jack,

Could you help me to write the right fieldType please?
(index and query)

Another thing, I don't know if I must use the Keyword tokenizer 
because codes contain "/" char,

and Tokenizer seems split code no ?

Many thanks,

Bruno



Maybe an answer (I haven't tested it yet):

http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/


Re: What is the right fieldType for this kind of field?

2013-10-23 Thread Bruno Mannina

On 23/10/2013 22:49, Bruno Mannina wrote:

On 23/10/2013 22:44, Bruno Mannina wrote:

On 23/10/2013 20:09, Jack Krupansky wrote:
You could use the keyword tokenizer plus the lower case filter. 

Jack,

Could you help me to write the right fieldType please?
(index and query)

Another thing, I don't know if I must use the Keyword tokenizer 
because codes contain "/" char,

and Tokenizer seems split code no ?

Many thanks,

Bruno



may be an answer (i don't tested yet)

http://pietervogelaar.nl/solr-3-5-search-case-insensitive-on-a-string-field-for-exact-match/ 





OK, it works fine!


Terms function join with a Select function ?

2013-10-23 Thread Bruno Mannina

Dear Solr users,

I use the Terms function to see term frequency data in a field, but it's 
for the whole database.


I have 2 questions:
- Is it possible to increase the number of statistics? Currently I get only 
the 10 most frequent terms.


- Is it possible to limit these statistics to the result of a request?

PS: the second question is very important for me.

Many thanks


Re: Terms function join with a Select function ?

2013-10-24 Thread Bruno Mannina

Dear All,

OK, I have an answer concerning the first question (the limit):
it's the terms.limit parameter.

But I can't find how to apply a Terms request to a query result

any idea ?

Bruno
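For the first question, a sketch of a TermsComponent request built with Python's standard library (the handler path /terms and the field name ap are assumptions based on this thread and the stock example config):

```python
from urllib.parse import urlencode

# TermsComponent parameters: enumerate raw index terms of one field.
params = urlencode({
    "terms": "true",
    "terms.fl": "ap",        # field whose terms to enumerate
    "terms.limit": "2000",   # raise the default of 10
    "terms.sort": "count",   # most frequent first
})
url = "http://localhost:8983/solr/terms?" + params
```

Note that TermsComponent walks the raw index, so it cannot be restricted to a query result; that is what faceting (suggested later in this thread) is for.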

On 23/10/2013 23:19, Bruno Mannina wrote:

Dear Solr users,

I use the Terms function to see the frequency data in a field but it's 
for the whole database.


I have 2 questions:
- Is it possible to increase the number of statistic ? actually I have 
the 10 first frequency term.


- Is it possible to limit this statistic to the result of a request ?

PS: the second question is very important for me.

Many thanks








Re: Terms function join with a Select function ?

2013-10-24 Thread Bruno Mannina

Dear,

Hmm, I don't know how I can use it...

I tried:

my query:
ti:snowboard (3095 results)

I would like to have, at the end of my XML, the term statistics for the 
field AP (the applicant field of a patent notice),


but I don't get that...

Please help,
Bruno

/select?q=ti%3Asnowboard&version=2.2&start=0&rows=10&indent=on&facet=true&f.ap.facet.limit=10

On 24/10/2013 14:04, Erik Hatcher wrote:

That would be called faceting :)

 http://wiki.apache.org/solr/SimpleFacetParameters




On Oct 24, 2013, at 5:23 AM, Bruno Mannina  wrote:


Dear All,

Ok I have an answer concerning the first question (limit)
It's the terms.limit parameters.

But I can't find how to apply a Terms request on a query result....

any idea ?

Bruno

On 23/10/2013 23:19, Bruno Mannina wrote:

Dear Solr users,

I use the Terms function to see the frequency data in a field but it's for the 
whole database.

I have 2 questions:
- Is it possible to increase the number of statistic ? actually I have the 10 
first frequency term.

- Is it possible to limit this statistic to the result of a request ?

PS: the second question is very important for me.

Many thanks














Re: Terms function join with a Select function ?

2013-10-24 Thread Bruno Mannina

Hmm, facet performance is very bad (Solr 3.6.0).
My index is around 87 000 000 docs (4 dual-core processors, 24 GB RAM).

I thought facets would work only on the result, but it seems that's not the 
case.


My request:
http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5

Do you think my request is wrong ?

Maybe it's not possible to get statistics on a field (like the Terms 
function provides) restricted to a query...


Thx for your help,

Bruno
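
[For what it's worth: facet counts are computed over the documents matching q 
(plus any fq filters), not over the whole index. A request of this shape — 
host/port taken from this thread, facet.mincount an optional addition — would 
look like:

```shell
# Facet counts cover only documents matching q; memory use, however,
# scales with the number of unique values in the faceted field.
curl 'http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5&facet.mincount=1'
```

So the cost being hit here comes from the field's unique-value count, not from 
the size of the result set.]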





Re: Terms function join with a Select function ?

2013-10-24 Thread Bruno Mannina

Just a little precision: Solr went down after running my URL :( so bad...




Normalized data during indexing ?

2013-10-25 Thread Bruno Mannina

Dear,

I would like to know if SOLR can do that:

I have a field named "Assignee" with values like:

Int Business Machines Corp
Int Business Mach Inc

I would like to have a "result field" in the schema.xml named
"Norm_Assignee" which contains
the mapping from a lexical file:

Int Business Machines Corp > IBM
Int Business Mach Inc > IBM

So, I will have:

<doc>
  <field name="Assignee">Int Business Machines Corp</field>
  <field name="Norm_Assignee">IBM</field>
</doc>

<doc>
  <field name="Assignee">Int Business Mach Inc</field>
  <field name="Norm_Assignee">IBM</field>
</doc>
and if the correspondence does not exist, then don't create the data.

I'm sure this is possible with Solr, but I couldn't find it on the Wiki,
Google, or Solr support.

Thanks for any idea,

Bruno





Re: Terms function join with a Select function ?

2013-10-25 Thread Bruno Mannina

Hi Erick,

I think it's a memory problem; I do my tests on a little computer at home 
(8 GB RAM, i3-2120 3.30 GHz, 64-bit)


and my database is very big: 87M docs, about 200 GB.

I thought Solr could compute statistics on only the query answer, so here on 
around 3000 docs (around 6000 terms);

that's not so big.

I haven't analyzed the logs yet; I will do it in a few hours when I get back home.

Thanks,
Bruno

On 25/10/2013 15:36, Erick Erickson wrote:

How many unique values are in the field? Solr has to create a counter
for each and every one of them, you may be blowing memory up. What
do the logs say?


Best,
Erick
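
[One knob that may be worth trying on a high-cardinality field in this Solr 
version — hedged, behaviour should be measured on your own data — is 
facet.method:

```shell
# facet.method=enum enumerates the field's terms and intersects bitsets
# via the filterCache, instead of the default fc method, which allocates
# a FieldCache entry plus a per-unique-value counter; facet.enum.cache.minDf
# keeps very rare terms out of the filterCache.
curl 'http://localhost:2727/solr/select?q=ti:snowboard&rows=0&facet=true&facet.field=ap&facet.limit=5&facet.method=enum&facet.enum.cache.minDf=10'
```
]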





Re: Normalized data during indexing ?

2013-10-25 Thread Bruno Mannina

Hi Michael,

thanks, it sounds like what I'm looking for.

I need to investigate.

Thanks a lot !

On 25/10/2013 14:46, michael.boom wrote:

Maybe this can help you:
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Normalized-data-during-indexing-tp4097750p4097752.html
Sent from the Solr - User mailing list archive at Nabble.com.
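
[For the record, a minimal sketch of the synonym approach Michael links to — 
the field-type name and the assignee_synonyms.txt file name are illustrative. 
One caveat: a synonym filter normalizes the *indexed* terms only, so a search 
on Norm_Assignee:IBM would match, but the stored value returned in results 
stays as originally sent; producing a separate stored "IBM" value would need 
an update processor or client-side mapping.

```xml
<!-- schema.xml sketch: copy Assignee into a search-only field whose
     analyzer collapses known variants to one canonical token. -->
<fieldType name="assignee_norm" class="solr.TextField">
  <analyzer>
    <!-- keep the whole company name as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- parse the synonyms file with the same keyword tokenizer so that
         multi-word entries match the single keyword token -->
    <filter class="solr.SynonymFilterFactory" synonyms="assignee_synonyms.txt"
            ignoreCase="true" expand="false"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="Norm_Assignee" type="assignee_norm" indexed="true" stored="false"/>
<copyField source="Assignee" dest="Norm_Assignee"/>

<!-- assignee_synonyms.txt (one mapping per line):
Int Business Machines Corp => IBM
Int Business Mach Inc => IBM
-->
```

Names without a mapping simply pass through unchanged.]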








How to request not directly my SOLR server ?

2013-11-26 Thread Bruno Mannina

Dear All,

I showed my Solr server to a friend and his first question was:

"You can query your Solr database directly from your internet
browser?! Isn't that a security problem?
Anyone who has your request link can use your database directly?"

So I ask the question here. I protect my admin panel, but is it possible
to protect against direct requests?

Using Google, lots of results concern admin panel security, but I can't
find information about this.

Thanks for your comment,

Bruno




Re: How to request not directly my SOLR server ?

2013-11-26 Thread Bruno Mannina

On 26/11/2013 18:52, Shawn Heisey wrote:

On 11/26/2013 8:37 AM, Bruno Mannina wrote:

I show my SOLR server to a friend and its first question was:

"You can request directly your solr database from your internet 
explorer?! is it not a security problem?

each person which has your request link can use your database directly?"

So I ask the question here. I protect my admin panel but is it 
possible to protect a direct request ?


Don't make your Solr server directly accessible from the Internet.  
Only make it accessible from the machines that serve your website and 
whoever needs to administer it.


Solr has no security features.  You can use the security features in 
whatever container is running Solr, but that is outside the scope of 
this mailing list.


Thanks,
Shawn




Thanks a lot for this information,

Bruno
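
[A hedged sketch of Shawn's advice, assuming Linux with iptables, Solr's 
container listening on port 8983, and a web server at 10.0.0.5 — all of 
these are illustrative values, not from this thread:

```shell
# Allow only the web-application host to reach Solr; drop everyone else.
iptables -A INPUT -p tcp --dport 8983 -s 10.0.0.5 -j ACCEPT
iptables -A INPUT -p tcp --dport 8983 -j DROP
```

The same effect can be achieved by binding the servlet container to a 
private interface only.]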




Indexed a new big database while the old is running?

2014-02-18 Thread Bruno Mannina

Dear Solr Users,

We currently have a Solr DB with around 88 000 000 docs.
All works fine :)

We receive each year a new backfile with the same content (but improved).

Indexing these docs takes several days on Solr,
so is it possible to create a new collection (and restart Solr) and
index these new 88 000 000 docs without stopping the current collection?

We have around 1 million connections per month.

Do you think this new indexing may cause problems for the running Solr?
Note: the new database will not be used until the current collection is 
retired.


Thx for your comment,
Bruno



Re: Indexed a new big database while the old is running?

2014-02-19 Thread Bruno Mannina

Hi Shawn,

Thanks for your answer.

Currently we don't have performance problems because we only do select requests.
We have 4 CPUs / 8 cores and 24 GB RAM.

I know how to create an alias; my question was just about performance, 
and you are right:
impossible to answer this question without more information about my 
system, sorry.


I will do a real test and check whether performance drops; if it does, I will 
stop the new indexing.


If you have more information concerning indexing performance with my 
server config, don't hesitate to write me. :)

Have a nice day,

Regards,
Bruno


On 18/02/2014 16:30, Shawn Heisey wrote:

On 2/18/2014 5:28 AM, Bruno Mannina wrote:

We have actually a SOLR db with around 88 000 000 docs.
All work fine :)

We receive each year a new backfile with the same content (but improved).

Index these docs takes several days on SOLR,
So is it possible to create a new collection (restart SOLR) and
Index these new 88 000 000 docs without stopping the current collection ?

We have around 1 million connections by month.

Do you think that this new indexation may cause problem to SOLR using?
Note: new database will not be used until the current collection will be
stopped.

You can instantly switch between collections by using the alias feature.
  To do this, you would have collections named something like test201302
and test201402, then you would create an alias named 'test' that points
to one of these collections.  Your code can use 'test' as the collection
name.

Without a lot more information, it's impossible to say whether building
a new collection will cause performance problems for the existing
collection.

It does seem like a problem that rebuilding the index takes several
days.  You might already be having performance problems.  It's also
possible that there's an aspect to this that I am not seeing, and that
several days is perfectly normal for YOUR index.

Not enough RAM is the most common reason for performance issues on a
large index:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
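
[The instant switch Shawn describes can be sketched with the Collections API 
CREATEALIAS command — this assumes SolrCloud; the collection names come from 
his example, and host/port are illustrative:

```shell
# Point the 'test' alias at the freshly built collection; clients keep
# querying /solr/test and switch instantly, with no downtime.
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=test&collections=test201402'
```
]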







Help with SolrCloud exceptions while recovering

2014-11-08 Thread Bruno Osiek
Hi,

I am a newbie SolrCloud enthusiast. My goal is to implement an
infrastructure to enable text analysis (clustering, classification,
information extraction, sentiment analysis, etc).

My development environment consists of one machine, quad-core processor,
16GB RAM and 1TB HD.

Have started implementing Apache Flume, with Twitter as source and SolrCloud
(within JBoss AS 7) as sink, using ZooKeeper (5 servers) to upload the
configuration and manage the cluster.

The pseudo-distributed cluster consists of one collection with three shards,
each with three replicas.

Everything runs smoothly for a while. After 50,000 tweets are committed
(actually CloudSolrServer commits every batch of 500 documents), SolrCloud
randomly starts logging exceptions: Lucene file not found, IndexWriter cannot
be opened, replication unsuccessful, and the like. Recovery starts, with no
success, until the replica goes down.

Have tried different Solr versions (4.10.2, 4.9.1 and lastly 4.8.1) with
same results.

I have looked everywhere for help before writing this email. My guess right
now is that the problem lies with the SolrCloud-ZooKeeper connection,
although I haven't seen any such exception.

Any reference or help will be welcomed.

Cheers,
B.


Re: Help with SolrCloud exceptions while recovering

2014-11-09 Thread Bruno Osiek
Hi Erick,

Thank you very much for your reply.
I disabled client commits while setting commits in solrconfig.xml as follows:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:300000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
</autoSoftCommit>

The picture changed for the better. No more index corruption or endless
replication trials; up till now, 16 hours since start-up and more than
142k tweets downloaded, shards and replicas are "active".

One problem remains though. While auto-committing, Solr logs the following
stack trace:

00:00:40,383 ERROR [org.apache.solr.update.CommitTracker]
(commitScheduler-25-thread-1) auto commit
error...:org.apache.solr.common.SolrException: *Error opening new searcher*
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1550)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1662)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:603)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
*Caused by: java.lang.RuntimeException: java.io.FileNotFoundException:
_1.nvm*
at
org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
at
org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
at java.util.TimSort.sort(TimSort.java:203)
at java.util.TimSort.sort(TimSort.java:173)
at java.util.Arrays.sort(Arrays.java:659)
at java.util.Collections.sort(Collections.java:217)
at
org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
at
org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:2017)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1986)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:407)
at
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:287)
at
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:272)
at
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1461)
... 10 more
*Caused by: java.io.FileNotFoundException: _1.nvm*
at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
at
org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
at
org.apache.lucene.index.SegmentCommitInfo.sizeInBytes(SegmentCommitInfo.java:141)
at org.apache.lucene.index.MergePolicy.size(MergePolicy.java:513)
at
org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:242)
... 24 more

This file "_1.nvm" once existed. It was deleted during one auto commit, but
remains somewhere in a queue for deletion. I believe the consequence is
that in the SolrCloud Admin UI -> Core Admin -> Stats, the "Current" status is
off for all shards' replica number 3. If I understand correctly, this means
that changes to the index are not becoming visible.

Once again I tried to find possible reasons for that situation, but none of
the threads found seems to reflect my case.

My lock type is set to <lockType>${solr.lock.type:single}</lockType>. This
is due to a lock-wait timeout error with both "native" and "simple" when
trying to create the collection using the commands API. There is a thread
discussing this issue:

http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-td4098731.html

The only thing is that "single" should only be used if "there is no
possibility of another process trying to modify the index", and I
cannot guarantee that. Could that be the cause of the file-not-found
exception?

Thanks once again for your help.

Regards,
Bruno.



2014-11-08 18:36 GMT-02:00 Erick Erickson :

> First. for tweets committing every 500 docs is much too frequent.
> Especially from the client and super-especially if you have multiple
> clients running. I'd recommend you just configure solrconfig this way
> as a place to start and do NOT commit from any clients.
> 1> a hard commit (openSearcher=false) every minute (or maybe 5 minutes)
> 2> a soft commit every minute
>
> This latter governs how long it'll be between when a doc is indexed and
> when
> can be searched.
>
> Here'

Re: Help with SolrCloud exceptions while recovering

2014-11-09 Thread Bruno Osiek
Erick,

Once again thank you very much for your attention.

Now my pseudo-distributed SolrCloud is configured with no inconsistency. An
additional problem was starting JBoss with "solr.data.dir" set to a path
not expected by Solr (actually it was not even underneath the solr.home
directory).

This thread (
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3ccao8xr5zv8o-s6zn7ypaxpzpourqjknbsm59mbe6h3dpfykg...@mail.gmail.com%3E)
explains the inconsistency.

I found no need to change the Solr data directory. After commenting out this
property in JBoss' standalone.xml and setting
<lockType>${solr.lock.type:native}</lockType>, everything started to work
properly.

Regards,
Bruno



2014-11-09 14:35 GMT-02:00 Erick Erickson :

> OK, we're _definitely_ in the speculative realm here, so don't think
> I know more than I do ;)...
>
> The next thing I'd try is to go back to "native" as the lock type on the
> theory that the lock type wasn't your problem, it was the too-frequent
> commits.
>
> bq: This file "_1.nvm" once existed. Was deleted during one auto commit ,
> but
> remains somewhere in a queue for deletion
>
> Assuming Unix, this is entirely expected. Searchers have all the files
> open. Commits
> do background merges, which may delete segments. So the current searcher
> may
> have the file open even though it's been "merged away". When the searcher
> closes, the file will actually truly disappear.
>
> It's more complicated on Windows but eventually that's what happens
>
> Anyway, keep us posted. If this continues to occur, please open a new
> thread,
> that might catch the eye of people who are deep into Lucene file locking...
>
> Best,
> Erick
>

Request two databases at the same time ?

2015-01-09 Thread Bruno Mannina

Dear All,

I use Apache-SOLR3.6, on Ubuntu (newbie user).

I have a big database named BigDB1 with 90M documents,
each document contains several fields (docid, title, author, date, etc...)

I received today, from another source, abstracts for some documents (this
source also contains the same docid field).
I don't want to modify my BigDB1 to update documents with abstracts,
because BigDB1 is updated twice a week.

Do you think it's possible to create a new database named AbsDB1 and
query both databases at the same time?
 If I do, for example:
title:airplane AND abstract:plastic

I would like to obtain documents from BigDB1 and AbsDB1.

Many thanks for your help, information and anything else that can help me.

Regards,
Bruno




Re: Request two databases at the same time ?

2015-01-09 Thread Bruno Mannina

Dear Erick,

thank you for your answer.

My answers are below.

On 09/01/2015 20:43, Erick Erickson wrote:

bq: I don't want to modify my BigDB1 to update documents with abstract
because BigDB1 is always updated twice by week.

Why not? Solr/Lucene handle updating docs, if a doc in the index has
the same , the old doc is deleted and the new one takes its
place. So why not just put the new abstracts into BigDB1? If you
re-index the docs later (your twice/week comment), then they'll be
overwritten. This will be much simpler than trying to maintain two.
I understand this process; I use it for other collections, and twice a 
week for BigDB1.
But, e.g., Doc1 is updated with an abstract on Monday. On Tuesday I must update 
it with new data, and then the abstract will be lost.
I can't check/fetch the abstract before re-inserting it in the new doc, because 
I receive several thousand docs every week (new and amended);

I think it would take a long time to do that.


But if you cannot update BigDB1 just fire off two queries and combine
them. Or specify the shards parameter on the URL pointing to both
collections. Do note, though, that the relevance calculations may not
be absolutely comparable, so mixing the results may show some
surprises...

Shards... I will take a look at this; I don't know this param.
Concerning relevance, I don't really use it, so it won't be a problem I 
think.
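
[For reference, Erick's shards suggestion can be sketched like this — Solr 3.x 
distributed search, with core names from this thread and host/port illustrative:

```shell
# Distributed request fanned out to both cores via the shards parameter.
# Caveat: each document matches within a single shard, so a clause on a
# field that exists only in the other core will not match -- shards merges
# result lists, it does not join documents across cores.
curl 'http://localhost:8983/solr/BigDB1/select?q=title:airplane&shards=localhost:8983/solr/BigDB1,localhost:8983/solr/AbsDB1'
```

Both cores should share the same uniqueKey field for the merged results to 
deduplicate correctly.]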



Sincerely,


Best,
Erick



