loading solr from Pig?

2013-08-21 Thread geeky2
Hello All,

Is anyone loading Solr from a Pig script / process?

I was talking to another group in our company and they have standardized on
MongoDB instead of Solr - apparently there is very good support between
MongoDB and Pig, allowing users to "stream" data directly from a Pig
process into MongoDB.

Does solr have anything like this as well?
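I can't point to a built-in Pig-to-Solr connector, but one workaround is to have the producing process post documents to Solr's JSON update handler over HTTP. A minimal sketch in Python - the host, core name, and field names below are placeholders, not anything from this thread:

```python
import json
import urllib.request

def build_update_payload(docs):
    """Serialize a list of field dicts into the JSON body that
    Solr's /update handler accepts."""
    return json.dumps(docs)

def post_to_solr(docs, base_url="http://localhost:8983/solr/core1"):
    """POST documents and commit in one request. base_url is a
    placeholder; point it at a real core before use."""
    req = urllib.request.Request(
        base_url + "/update?commit=true",
        data=build_update_payload(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

# Build (but do not send) a payload for a single toy document.
payload = build_update_payload([{"id": "1", "name_s": "widget"}])
```

A Pig job could drive something like this from a UDF, or simply write files that a separate loader posts - either way it is batch-style rather than the direct streaming that the MongoDB/Pig integration offers.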

thx
mark







--
View this message in context: 
http://lucene.472066.n3.nabble.com/loading-solr-from-Pig-tp4085933.html
Sent from the Solr - User mailing list archive at Nabble.com.


why does "*" affect case sensitivity of query results

2013-04-29 Thread geeky2
hello,

environment: solr 3.5


problem statement: when a query has "*" appended, it becomes case sensitive.

assumption: query should NOT be case sensitive

actual value in database at time of index: 4387828BULK

here is a snapshot of what works and does not work.

what works:

  itemModelNoExactMatchStr:4387828bULk (and any variation of upper and lower
case letters for *bulk*)

  itemModelNoExactMatchStr:4387828bu*
  itemModelNoExactMatchStr:4387828bul*
  itemModelNoExactMatchStr:4387828bulk*


what does NOT work:

 itemModelNoExactMatchStr:4387828BU*
 itemModelNoExactMatchStr:4387828BUL*
 itemModelNoExactMatchStr:4387828BULK*
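Because wildcard terms bypass analysis (including the lowercase filter) in this version, one client-side workaround is to normalize the literal portion of the term before sending the query. A small illustrative sketch, assuming the index side of the field type lowercases:

```python
def normalize_wildcard(term):
    """Lowercase terms that contain wildcard characters ourselves,
    since the query parser applies no analysis (not even the
    lowercase filter) to wildcard terms."""
    if "*" in term or "?" in term:
        return term.lower()
    return term

fixed = normalize_wildcard("4387828BULK*")  # the failing query above
```

Non-wildcard terms are left alone so the field's normal query-time analysis still applies to them.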


below are the specifics of my field and fieldType

[field and fieldType XML stripped by the archive]
thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801.html


Re: why does "*" affect case sensitivity of query results

2013-04-29 Thread geeky2
i was looking in Smiley's book on pages 129 and 130.

from the book,

>>
No text analysis is performed on the search word containing the wildcard,
not even lowercasing. So if you want to find a word starting with Sma, then
sma* is required instead of Sma*, assuming the index side of the field's
type
includes lowercasing. This shortcoming is tracked on SOLR-219. Moreover,
if the field that you want to use the wildcard query on is stemmed in the
analysis, then smashing* would not find the original text Smashing because
the stemming process transforms this to smash. Consequently, don't stem.
<<

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801p4059812.html


Re: why does "*" affect case sensitivity of query results

2013-04-29 Thread geeky2
here is the jira link:

https://issues.apache.org/jira/browse/SOLR-219





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801p4059814.html


Re: why does "*" affect case sensitivity of query results

2013-04-30 Thread geeky2
hello erik,

thank you for the info - yes - i did notice ;)

one more reason for us to upgrade from 3.5.

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801p406.html


having trouble storing large text blob fields - returns binary address in search results

2013-05-16 Thread geeky2
hello 

environment: solr 3.5

can someone help me with the correct configuration for some large text blob
fields?

we have two fields in informix tables that are of type text. 

when we do a search the results for these fields come back looking like
this: 

[B@17c232ee

i have tried setting them up as clob fields - but this is not working (see
details below)

i have also tried treating them as plain string fields (removing the
references to clob in the DIH) - but this does not work either.


DIH configuration:

[entity configuration XML stripped by the archive]

Schema.xml:

[field definition XML stripped by the archive]
thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-storing-large-text-blob-fields-returns-binary-address-in-search-results-tp4063979.html


Re: having trouble storing large text blob fields - returns binary address in search results

2013-05-17 Thread geeky2
Hello Gora,


thank you for the reply - 

i did finally get this to work.  i had to cast the column in the DIH to a
clob - like this.

cast(att.attr_val AS clob) as attr_val,
cast(rsr.rsr_val AS clob) as rsr_val,

once this was done, the ClobTransformer worked.

to my knowledge - this particular use case and the need for the cast is not
documented anywhere.  i checked the solr wiki and searched the threads on
this forum for things like clobtransformer, informix and blob without luck.
i also did quite a few google searches, but no luck (maybe i missed
something ;)

maybe this is just some "edge case".  i also realize that informix is not
that common.

i have posted a question to the solr developers list - just so i can better
understand what is actually happening, why the "cast" was necessary, and the
limitations / parameters of the ClobTransformer.

the thread on the developers list is located here:

http://lucene.472066.n3.nabble.com/have-developer-question-about-ClobTransformer-and-DIH-td4064256.html

thx
mark






--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-storing-large-text-blob-fields-returns-binary-address-in-search-results-tp4063979p4064286.html


Re: having trouble storing large text blob fields - returns binary address in search results

2013-05-18 Thread geeky2
hello

your comment made me think - so i decided to double check myself.

i opened up the schema in squirrel and made sure that the two columns in
question were actually of type TEXT in the schema - check

i went in to the db-config.xml and removed all references to
ClobTransformer, removed the cast directives from the fields as well as the
clob="true" on the two fields - i pasted the db-config.xml below for
reference - check

i restarted jboss - thus restarting solr - check

i went in to the solr dataimport admin screen and did a clean import - check

after the import was complete - i queried a part that i knew would have one
of the clob fields - results are pasted below as well - you can see the
binary address in the field.




  
N
 *   [B@5b372219*
PIA

  Refrigerators and Freezers

0046
12001892,0046,464
VALVE, WATER
12001892
12001892
1
Y
1
N

  

464
N
13
53.54
464 
Y
  

[db-config.xml stripped by the archive]

mark



--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-storing-large-text-blob-fields-returns-binary-address-in-search-results-tp4063979p4064407.html


seeing lots of "autowarming" messages in log during DIH indexing

2013-05-20 Thread geeky2
hello,

we are tracking down some performance issues with our DIH process.

not sure if this is related - but i am seeing tons of the messages below in
the logs during re-indexing of the core.

what do these messages mean?


2013-05-18 19:37:30,623 INFO  [org.apache.solr.update.UpdateHandler]
(pool-11-thread-1) end_commit_flush
2013-05-18 19:37:30,623 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
main
   
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,624 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming result for Searcher@5b8d745 main
   
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,624 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
main
   
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,625 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming result for Searcher@5b8d745 main
   
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,625 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
main
   
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=1,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,628 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming result for Searcher@5b8d745 main
   
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=3,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,628 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
main
   
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
2013-05-18 19:37:30,628 INFO  [org.apache.solr.search.SolrIndexSearcher]
(pool-10-thread-1) autowarming result for Searcher@5b8d745 main
   
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/seeing-lots-of-autowarming-messages-in-log-during-DIH-indexing-tp4064649.html


Re: seeing lots of "autowarming" messages in log during DIH indexing

2013-05-20 Thread geeky2
you mean i would add this switch to my script that kicks off the dataimport?

example:


OUTPUT=$(curl -v
http://${SERVER}.intra.searshc.com:${PORT}/solrpartscat/${CORE}/dataimport
-F command=full-import -F clean=${CLEAN} -F commit=${COMMIT} -F
optimize=${OPTIMIZE} -F openSearcher=false)


what needs to be done _AFTER_ the DIH finishes (if anything)?

eg, does this need to be turned back on after the DIH has finished?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/seeing-lots-of-autowarming-messages-in-log-during-DIH-indexing-tp4064649p4064695.html


Re: seeing lots of "autowarming" messages in log during DIH indexing

2013-05-31 Thread geeky2
the DIH is launched via a script - called by a "cron like" scheduler.

clean, commit and optimize are all true.

thx
mark



#!/bin/bash
SERVER=$1
PORT=$2
CLEAN=$3
COMMIT=$4
OPTIMIZE=$5
COREPATH=$6

echo SERVER: $SERVER
echo PORT: $PORT
echo CLEAN: $CLEAN
echo COMMIT: $COMMIT
echo OPTIMIZE: $OPTIMIZE
echo COREPATH: $COREPATH


if [ $# != 6 ]; then
echo "USAGE: $0 [SERVER] [PORT] [CLEAN: true/false] [COMMIT:
true/false] [OPTIMIZE: true/false] [COREPATH] "
exit 1;
fi

...






--
View this message in context: 
http://lucene.472066.n3.nabble.com/seeing-lots-of-autowarming-messages-in-log-during-DIH-indexing-tp4064649p4067477.html


translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello all,

environment: solr 3.5, centos

problem statement:  i have several character codes that i want to translate
to ordinal (integer) values (for sorting), while retaining the original code
field in the document.

i was thinking that i could use a copyField from my "code" field to my "ord"
field - then employ a pattern replace filter factory during indexing.

but won't the copyField fail because the two field types are different?

ps: i also read the wiki about
http://wiki.apache.org/solr/DataImportHandler#Transformer the script
transformer and regex transformer - but was hoping to avoid this - if i
could.
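If the copyField route proves to be the blocker, the translation can also happen before the data reaches the index, in the spirit of the ScriptTransformer mentioned above. The sketch below is plain Python showing only the row-transform logic - DIH's actual ScriptTransformer is JavaScript, and these code values and ordinals are invented for illustration:

```python
# Hypothetical code-to-ordinal table; substitute the real codes.
CODE_ORDINALS = {"A": 1, "B": 2, "C": 3}

def add_ordinal(row):
    """Derive an integer sort field from the character code while
    leaving the original code field on the document untouched."""
    row["code_ord"] = CODE_ORDINALS.get(row.get("code"), 0)
    return row

row = add_ordinal({"id": "7", "code": "B"})
```

This keeps both fields on the document: the original code for display/filtering and the integer for sorting.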




thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966.html


Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
hello jack,

thank you for the code ;)

what "book" are you referring to?  AFAICT - all of the 4.0 books are "future
order".

we won't be moving to 4.0 (soon enough).

so i take it - copyfield will not work, eg - i cannot take a code like ABC
and copy it to an int field and then use the regex to turn it in to an
ordinal?

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4068984.html


Re: translating a character code to an ordinal?

2013-06-07 Thread geeky2
thx,


please send me a link to the book so i can get/purchase it.


thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4068997.html


custom field tutorial

2013-06-07 Thread geeky2
can someone point me to a "custom field" tutorial.

i checked the wiki and this list - but still a little hazy on how i would do
this.

essentially - when the user issues a query, i want my class to interrogate a
string field (containing several codes - example boo, baz, bar) 

and return a single integer field that maps to the string field (containing
the code).

example: 

boo=1
baz=2
bar=3

thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/custom-field-tutorial-tp4068998.html


Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it.

i guess i made a "poor" assumption that you would not get predictable
results when copying a code like "mycode" to an int field where the desired
end result in the int field is, say, "1".

i was worried that some sort of ascii conversion or "wrap around" would
happen in the int field.

thx for the insight.

mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4069335.html


Re: translating a character code to an ordinal?

2013-06-10 Thread geeky2
i will try it out and let you know - 





--
View this message in context: 
http://lucene.472066.n3.nabble.com/translating-a-character-code-to-an-ordinal-tp4068966p4069339.html


Re: best way to force substitutions in data

2012-01-10 Thread geeky2
thank you both for the information.

Gora, when you mentioned:

>>
- For keeping both values, use synonyms. 
<<

what did you mean exactly.

mark

--
View this message in context: 
http://lucene.472066.n3.nabble.com/best-way-to-force-substitutions-in-data-tp3646195p3647920.html


linking query in DIH fails with sql syntax error when specific fields contain bad data

2012-01-13 Thread geeky2

hello all,


some of my records contain bad data in the orb_itm_id column.

example:

select * from prtxtps_prt_summ where orb_itm_id like '''%';

prd_gro_id spp_id  orb_itm_id ds_tx rnk_no
0022   335 ' LONG. (TERMINAL ATTACH   )' LONG. (TERMINAL
ATTACH)   0
0042   596 ', FAN MOTOR CAPACITOR S   TRAP 0


this is causing the indexing process to fail on the bad records - with a sql
syntax error


is there a way i can trap for this and cleanse the "'" before the sql is
constructed?
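One option, if the driver can't parameterize the sub-entity query, is to double the embedded single quotes when building the literal - the standard SQL escape. A sketch of just that cleansing step, using one of the bad values from the rows above (a parameterized query is the safer fix wherever the driver allows it):

```python
def escape_sql_literal(value):
    """Double embedded single quotes so the value can sit safely
    inside a single-quoted SQL string literal."""
    return value.replace("'", "''")

cleaned = escape_sql_literal("' LONG. (TERMINAL ATTACH")
```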

mark

[DIH entity configuration stripped by the archive]

2012-01-13 12:27:38,912 SEVERE
[org.apache.solr.handler.dataimport.DataImporter] (Thread-27) Full Import
failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
to execute query: SELECT pa.uom_hi, pa.att_val_hi, pa.uom_low,
pa.att_val_low, a.att_nm FROM prtxtpa_att_val pa, prtxtat_att a WHERE
pa.att_id = a.att_id and pa.orb_itm_id = '' LONG. (TERMINAL ATTACH' and
pa.spp_id = '335' and pa.prd_gro_id = '0022' and pa.att_val_hi is not NULL
Processing Document # 119
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.(JdbcDataSource.java:253)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:617)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.sql.SQLException: A syntax error has occurred.







--
View this message in context: 
http://lucene.472066.n3.nabble.com/linking-query-in-DIH-fails-with-sql-syntax-error-when-specific-fields-contain-bad-data-tp3657482p3657482.html


Re: is it possible to save the search query?

2012-11-20 Thread geeky2
Hello,

i think you are asking two questions here - i'll see if i can give you some
simple examples for both

1) how can i pull data from a solr search result set and compare it to
another for analysis?

one way might be to drive the results into files and then use xslt to
extract the relevant information.

here is an example xslt file that pulls specific fields from a result:

[XSLT stylesheet stripped by the archive]

2) how can i embed data into a solr query, making it easier to do analysis
in the log files?

here is a simple example that "bookmarks" or brackets transactions in the
logs - used only during stress testing

#!/bin/bash

TYPE=$1
TAG=$2

if [ "$TYPE" == 1 ]
then
# beginning - quote the URL so the shell does not treat '&' as a background operator
curl -v "http://something:1234/boo/core1/select/?q=partImageURL%3A${TAG}-test-begin&version=2.2&start=0&rows=777&indent=on"
else
# end
curl -v "http://something:1234/boo/core1/select/?q=partImageURL%3A${TAG}-test-end&version=2.2&start=0&rows=777&indent=on"
fi


hopefully this will give you something to start with.

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-it-possible-to-save-the-search-query-tp4018925p4021315.html


performing a boolean query (OR) with a large number of terms

2013-01-09 Thread geeky2
hello,

environment: solr 3.5

i have a requirement to perform a boolean query (like the example below)
with a large number of terms.

the number of terms could be 15 or possibly larger.

after looking over several threads and the smiley book - i think i just have
to include the parens and string all of the terms together with OR's

i just want to make sure that i am not missing anything.

is there a better or more efficient way of doing this?

http://server:port/dir/core1/select?qt=modelItemNoSearch&q=itemModelNoExactMatchStr:%285-100-NGRT7%20OR%205-10-10MS7%20OR%20404%29&rows=30&debugQuery=on&rows=40
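For what it's worth, assembling and URL-encoding that OR list can be scripted so the parentheses and encoding stay correct as the term count grows. A sketch using the field and terms from the URL above:

```python
from urllib.parse import quote

def build_or_query(field, terms):
    """Join terms with OR inside one set of parentheses, then
    URL-encode the whole clause for use as the q parameter."""
    return quote("%s:(%s)" % (field, " OR ".join(terms)), safe="")

q = build_or_query("itemModelNoExactMatchStr",
                   ["5-100-NGRT7", "5-10-10MS7", "404"])
```

A single parenthesized clause like this keeps the query one boolean expression, which is what the thread's approach already does by hand.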


thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/performing-a-boolean-query-OR-with-a-large-number-of-terms-tp4032039.html


searching for q terms that start with a dash/hyphen being interpreted as prohibited clauses

2013-01-17 Thread geeky2
hello

environment: solr 3.5

problem statement:

i have a requirement to search for part numbers that start with a dash /
hyphen.

example q= term: *-0004A-0436*

example query:

http://some_url:some_port/some_core/select?facet=false&sort=score+desc%2C+rankNo+asc%2C+partCnt+desc&start=0&q=*-0004A-0436*+itemType%3A1&wt=xml&qt=itemModelNoProductTypeBrandSearch&rows=4

what is happening: the query is returning a huge result set.  in reality
there is one (1) and only one record in the database with this part number.

i believe this is happening because the dash is being interpreted by the
query parser as a prohibited clause and the effective result is, "give me
everything that does NOT have this part number".

how is this handled so that the search is conducted for the actual part:
-0004A-0436
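One common answer is to backslash-escape the Lucene query metacharacters in the raw term before it goes into q, so the leading dash is read as a literal character rather than a prohibited-clause operator. A sketch - the metacharacter list follows the classic Lucene query syntax and should be checked against the parser in use:

```python
# Single-character Lucene query-syntax metacharacters ('&&' and '||'
# are two-character operators and are not handled here).
LUCENE_SPECIALS = set('+-!(){}[]^"~*?:\\')

def escape_lucene_term(term):
    """Prefix each metacharacter with a backslash so it is searched
    literally instead of being parsed as an operator."""
    return "".join("\\" + c if c in LUCENE_SPECIALS else c for c in term)

escaped = escape_lucene_term("-0004A-0436")
```

The escaped term then still needs normal URL-encoding (the backslash itself becomes %5C) before it goes on the query string.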

thx
mark

more information:

request handler in solrconfig.xml

  

  edismax
  all
  10
  itemModelNoExactMatchStr^30 itemModelNo^.9
divProductTypeDesc^.8 plsBrandDesc^.5
  *:*
  score desc, rankNo desc, partCnt desc
  true
  itemModelDescFacet
  plsBrandDescFacet
  divProductTypeIdFacet





  


field information from schema.xml (if helpful):

[field and fieldType XML stripped by the archive]

--
View this message in context: 
http://lucene.472066.n3.nabble.com/searching-for-q-terms-that-start-with-a-dash-hyphen-being-interpreted-as-prohibited-clauses-tp4034310.html


question about syntax for multiple terms in filter query

2013-03-11 Thread geeky2
hello everyone,

i have a question on the filter query syntax for multiple terms, after
reading this:

http://wiki.apache.org/solr/CommonQueryParameters#fq

i see from the above that two (2) syntax constructs are supported

fq=term1:foo & fq=term2:bar

and

fq=+term1:foo +term2:bar

is there a reason why i would want to use one syntax over the other?

does the first syntax support the "AND" operator as well as the "&"?

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442.html


Re: question about syntax for multiple terms in filter query

2013-03-11 Thread geeky2
otis and jack - 

thank you VERY much for the feedback - 

jack - 

>>
use a single fq containing two mandatory
clauses if those clauses appear together often
<<

this is the use case i  have to account for - eg, 

right now i have this in my request handler

 
  ...
  itemType:1
  ...
 

which says - i only want parts 

but i need to augment the filter so only parts that have a price >= 1.0 are
returned from the request handler

so i believe i need to have this in the RH
 
  ...
  +itemType:1 +sellingPrice:[1 TO *]
  ...
 

thx
mark







--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442p4046548.html


Re: question about syntax for multiple terms in filter query

2013-03-12 Thread geeky2
hello jack,

yes - i will always be using the two constraints at the same time.

thank you again for the info.

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442p4046650.html


Re: question about syntax for multiple terms in filter query

2013-03-12 Thread geeky2
jack,

did you mean "function query" or filter query

i was going to do this in my request handler for parts

   +itemType:1 +sellingPrice:[1 TO *] 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442p4046715.html


having trouble escaping a character string

2013-03-12 Thread geeky2
hello all,

i am searching on this field type:

[fieldType definition stripped by the archive]

for this string: 30326R-26" TILLER

when i use the analyzer and issue the query - it indicates success (please
see attached screen shot)

but when i issue the search url - it does not return a document

http://bogus/solrpartscat/core2/select?qt=modelItemNoSearch&q=itemModelNoExactMatchStr:%2230326R-26%22%20TILLER%22

can someone tell me what i am missing?
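One likely culprit: after URL-decoding, every %22 becomes a plain double quote, which the query parser reads as a phrase delimiter. The embedded quote needs a backslash escape *before* the whole value is URL-encoded, so only the outer quotes delimit the phrase. A sketch of that two-step encoding (illustrative; check the result against your parser):

```python
from urllib.parse import quote

def quoted_phrase_param(value):
    """Backslash-escape embedded double quotes, wrap the value in
    phrase quotes, then URL-encode the result for the q parameter."""
    return quote('"%s"' % value.replace('"', '\\"'), safe="")

param = quoted_phrase_param('30326R-26" TILLER')
```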

thx
mark


--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-escaping-a-character-string-tp4046796.html


Re: having trouble escaping a character string

2013-03-12 Thread geeky2
attempting to upload the screenshot bmp file.  the embedded image is
difficult to make out.

temp1.bmp   



--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-escaping-a-character-string-tp4046796p4046798.html


Re: having trouble escaping a character string

2013-03-12 Thread geeky2
oh - 

now i see what i was doing wrong.


i kept trying to use the URL-encoding %22 as a replacement for the double
quote - but that was not working.

thank you jack,

mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-escaping-a-character-string-tp4046796p4046821.html


need general advice on how others version and mange core deployments over time

2013-03-14 Thread geeky2
hello everyone,

i know this is a general topic - but would really appreciate info from
others that are doing this now.

  - how are others managing this so that users are impacted the least 
  - how are others handling the scenario where users don't want to migrate
forward.

thx
mark






--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-general-advice-on-how-others-version-and-mange-core-deployments-over-time-tp4047390.html


having trouble searching on EdgeNGramFilterFactory field with a length < minGramSize

2013-03-19 Thread geeky2
hello,

i am trying to debug the following query in the analyzer:

*+itemModelNoExactMatchStr:JVM1640CJ01 +plsBrandId:0432 +plsBrandDesc:ge*

the query is going against a field (plsBrandDesc) that is being indexed with 
solr.EdgeNGramFilterFactory and a  minGramSize of 3.  i have included the
complete field definition below.

after doing some experimenting in the analyzer, i believe the query may be
failing because the queried value of "ge" is only two (2) characters long -
and the minimum gram size is three (3) characters.

for example - this query does work in the analyzer.  it has a plsBrandDesc >
three characters and does return exactly one document:

+itemModelNoExactMatchStr:404 +plsBrandId:0431 *+plsBrandDesc:general*


i have tried overriding this behavior by using mm=2, but this does not seem
to work:

+itemModelNoExactMatchStr:JVM1640CJ01 +plsBrandId:0432 +plsBrandDesc:ge mm=2

am i misunderstanding how mm works - or am i getting the syntax for mm
incorrect?

thx
mark

[plsBrandDesc field definition stripped by the archive]


--
View this message in context: 
http://lucene.472066.n3.nabble.com/having-trouble-searching-on-EdgeNGramFilterFactory-field-with-a-length-minGramSize-tp4049107.html


struggling with solr.WordDelimiterFilterFactory and periods "." or dots

2012-02-07 Thread geeky2
hello all,

i am struggling with getting solr.WordDelimiterFilterFactory to behave as is
indicated in the solr book (Smiley) on page 54.

the example in the books reads like this:

>>
Here is an example exercising all options:
WiFi-802.11b to Wi, Fi, WiFi, 802, 11, 80211, b, WiFi80211b
<<

essentially - i have the same requirement with embedded periods and need to
return a successful search on a field, even if the user does NOT enter the
period.

i have a field, itemNo that can contain periods ".".

example content in the itemNo field:

B12.0123

when the user searches on this field, they need to be able to enter an
itemNo without the period, and still find the item.

example:

user enters: B120123 and a document is returned with B12.0123.


unfortunately, the search will NOT return the appropriate document, if the
user enters B120123.

however - the search does work if the user enters B12 0123 (a space in place
of the period).

can someone help me understand what is missing from my configuration?


this is snipped from my schema.xml file:

[itemNo field and fieldType XML stripped by the archive]

--
View this message in context: 
http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-and-periods-or-dots-tp3724822p3724822.html


Re: struggling with solr.WordDelimiterFilterFactory and periods "." or dots

2012-02-08 Thread geeky2
hello,

thank you for the reply.

yes - i did re-index after the changes to the schema.

also - thank you for the direction on using the analyzer - but i am not sure
if i am interpreting the feedback from the analyzer correctly.

here is what i did:

in the Field value (Index) box - i placed this: BP2.1UAA

in the Field value (Query) box - i placed this: BP21UAA

then after hitting the Analyze button - i see the following:

Under Index Analyzer for: 

org.apache.solr.analysis.WordDelimiterFilterFactory {splitOnCaseChange=1,
generateNumberParts=1, catenateWords=1, luceneMatchVersion=LUCENE_33,
generateWordParts=1, catenateAll=1, catenateNumbers=1}

i see 

position1   2   3   4
term text   BP  2   1   UAA
21  BP21UAA

Under Query Analyzer for:

org.apache.solr.analysis.WordDelimiterFilterFactory {splitOnCaseChange=1,
generateNumberParts=1, catenateWords=1, luceneMatchVersion=LUCENE_33,
generateWordParts=1, catenateAll=1, catenateNumbers=1}

i see 

position1   2   3
term text   BP  21  UAA
BP21UAA

the above information leads me to believe that i "should" have BP21UAA as an
indexed term generated from the BP2.1UAA value coming from the database.

also - the query analysis lead me to believe that i "should" find a document
when i search on BP21UAA in the itemNo field

do i have this correct?

am i missing something here?

i am still unable to get a hit when i search on BP21UAA in the itemNo field.

thank you,
mark

--
View this message in context: 
http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-and-periods-or-dots-tp3724822p3726021.html


Re: struggling with solr.WordDelimiterFilterFactory and periods "." or dots

2012-02-08 Thread geeky2
hello,

thanks for sticking with me on this ...very frustrating 

ok - i did perform the query with the debug parms using two scenarios:

1) a successful search (where i insert the period / dot) in to the itemNo
field and the search returns a document.

itemNo:BP2.1UAA

http://hfsthssolr1.intra.searshc.com:8180/solrpartscat/core1/select/?q=itemNo%3ABP2.1UAA&version=2.2&start=0&rows=10&indent=on&debugQuery=on

results from debug





  0
  1
  
on
10

2.2
on
0
itemNo:BP2.1UAA
  


  

PHILIPS
0333500
0333500,1549  ,BP2.1UAA   
PLASMA TELEVISION
BP2.1UAA   
2

BP2.1UAA   
Plasma Television^
0
1549  
  


  itemNo:BP2.1UAA

  itemNo:BP2.1UAA
  MultiPhraseQuery(itemNo:"bp 2 (1 21) (uaa
bp21uaa)")
  itemNo:"bp 2 (1 21) (uaa bp21uaa)"
  

22.539911 = (MATCH) weight(itemNo:"bp 2 (1 21) (uaa bp21uaa)" in 134993),
product of:
  0.9994 = queryWeight(itemNo:"bp 2 (1 21) (uaa bp21uaa)"), product of:
45.079826 = idf(itemNo: bp=829 2=29303 1=43943 21=6716 uaa=32 bp21uaa=1)
0.02218287 = queryNorm
  22.539913 = (MATCH) fieldWeight(itemNo:"bp 2 (1 21) (uaa bp21uaa)" in
134993), product of:
1.0 = tf(phraseFreq=1.0)
45.079826 = idf(itemNo: bp=829 2=29303 1=43943 21=6716 uaa=32 bp21uaa=1)
0.5 = fieldNorm(field=itemNo, doc=134993)

  

  LuceneQParser
  
1.0

  0.0
  
0.0

  
  
0.0
  
  
0.0
  
  

0.0
  
  
0.0
  
  
0.0

  


  1.0
  
1.0
  
  

0.0
  
  
0.0
  
  
0.0

  
  
0.0
  
  
0.0
  


  









2) a NON-successful search (where i do NOT insert a period / dot) in to the
itemNo field and the search does NOT return a document

 itemNo:BP21UAA

http://hfsthssolr1.intra.searshc.com:8180/solrpartscat/core1/select/?q=itemNo%3ABP21UAA&version=2.2&start=0&rows=10&indent=on&debugQuery=on





  0
  1
  
on
10

2.2
on
0
itemNo:BP21UAA
  




  itemNo:BP21UAA
  itemNo:BP21UAA
  MultiPhraseQuery(itemNo:"bp 21 (uaa
bp21uaa)")
  itemNo:"bp 21 (uaa bp21uaa)"
  
  LuceneQParser

  
1.0

  1.0
  
1.0
  

  
0.0
  
  
0.0
  
  
0.0

  
  
0.0
  
  
0.0
  



  0.0
  
0.0
  
  
0.0

  
  
0.0
  
  
0.0
  
  

0.0
  
  
0.0
  

  




the parsedquery part of the debug output looks like it DOES contain the term
that i am entering as my search criteria on the itemNo field?

does this make sense?

thank you,
mark



--
View this message in context: 
http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-and-periods-or-dots-tp3724822p3726614.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: struggling with solr.WordDelimiterFilterFactory and periods "." or dots

2012-02-09 Thread geeky2

>>
OK, first question is why are you searching on two different values?
Is that intentional? 
<<

yes - our users have to be able to locate a part or model number (that may
or may not have periods in that number) even if they do NOT enter the number
with the embedded periods.  

example: 

actual part number in our database is BP2.1UAA

however the user needs to be able to search on BP21UAA and find that part.

there are business reasons why a user may see something different in the
field than what is actually in the database.

does this make sense?



>>
If I'm reading your problem right, you should
be able to get/not get any response just by toggling whether the
period is in the search URL, right? 
<<

yes - simply put - the user MUST get a hit on the above mentioned part if
they enter BP21UAA or BP2.1UAA.
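for reference, here is a rough python sketch (not solr code) of why both inputs should meet at the same indexed term. it approximates the word/number splitting plus catenateAll from the field type above, and omits catenateNumbers, so the intermediate "21" term is not shown:

```python
import re

def word_delimiter_terms(value):
    """Rough approximation of WordDelimiterFilterFactory with
    splitOnCaseChange=1, generateWordParts=1, generateNumberParts=1
    and catenateAll=1: split on non-alphanumerics and letter/digit
    boundaries, lowercase, and also emit the fully catenated term."""
    # split on anything that is not a letter or digit (e.g. the ".")
    chunks = re.split(r"[^A-Za-z0-9]+", value)
    parts = []
    for chunk in chunks:
        # further split on letter<->digit transitions (BP2 -> BP, 2)
        parts.extend(re.findall(r"[A-Za-z]+|[0-9]+", chunk))
    terms = [p.lower() for p in parts if p]
    catenated = "".join(terms)
    if catenated and catenated not in terms:
        terms.append(catenated)   # catenateAll=1
    return terms

print(word_delimiter_terms("BP2.1UAA"))  # ['bp', '2', '1', 'uaa', 'bp21uaa']
print(word_delimiter_terms("BP21UAA"))   # ['bp', '21', 'uaa', 'bp21uaa']
```

both inputs end up producing the catenated term bp21uaa, which is why the analysis page shows a match even though the raw values differ.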

>>
But assuming that's not the problem, there's something you're
not telling us. In particular, why is this parsing as "MultiPhraseQuery"?
<<

sorry - i did not know i was doing this or how it happened - it was not
intentional and i did not notice this until your posting.  i am not sure of
the implications related to this or what it means to have something as a
MultiPhraseQuery.

>>
Are you putting quotes in somehow, either through the URL or by
something in your solrconfig.xml?
<<

i did not use quotes in the url - i cut and pasted the urls for my tests in
the message thread.  i do not see quotes as part of the url in my previous
post.

what would i be looking for in the solrconfig.xml file that would force the
MultiPhraseQuery?

it seems that this is the crux of the issue - but i am not sure how to
determine what is introducing the quotes.  as previously stated - the quotes
are not being entered via the url - the urls were pasted (in this message
thread) exactly as i pulled them from the browser.

thank you,
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-and-periods-or-dots-tp3724822p3730070.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: struggling with solr.WordDelimiterFilterFactory and periods "." or dots

2012-02-10 Thread geeky2
hello,

>>
Or does your field in schema.xml have anything like
autoGeneratePhraseQueries="true" in it?
<<

there is no reference to this in our production schema.

this is extremely confusing.

i am not completely clear on the issue?

reviewing our previous messages - it looks like the data is being tokenized
correctly according to the analysis page and output from Luke.

it also looks like the definition of the field and field type is correct in
the schema.xml

it also looks like there is no errant data (quotes) being introduced in to
the query string submitted to solr:

example:

*http://hfsthssolr1.intra.searshc.com:8180/solrpartscat/core1/select?indent=on&version=2.2&q=itemNo%3ABP21UAA&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=&debugQuery=on&explainOther=&hl.fl=*

so - does the real issue reside in HOW the query is being constructed /
parsed?

and if so - what drives this query to become a MultiPhraseQuery with
embedded quotes?

itemNo:BP21UAA
itemNo:BP21UAA
MultiPhraseQuery(itemNo:"bp 21 (uaa
bp21uaa)")itemNo:"bp 21 (uaa
bp21uaa)"

please note - i also mocked up a simple test on my personal linux box - just
using the solr 3.5 distro (we are using 3.3.0 on our production box under
centOS)

i was able to get a simple test to work and yes - my query does look
different

output from my simple mock up on my personal box:

*http://localhost:8983/solr/select?indent=on&version=2.2&q=manu%3ABP21UAA&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=&debugQuery=on&explainOther=&hl.fl=*

manu:BP21UAAmanu:BP21UAAmanu:bp manu:21
manu:uaa manu:bp21uaamanu:bp manu:21
manu:uaa manu:bp21uaa

schema.xml





any suggestions would be greatly appreciated.

mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/struggling-with-solr-WordDelimiterFilterFactory-and-periods-or-dots-tp3724822p3733486.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: spellcheck configuration not providing suggestions or corrections

2012-02-13 Thread geeky2
hello 

thank you for the suggestion - however this did not work.

i went into solrconfig and changed the count to 20 - then restarted the
server and then did a reimport.



is it possible that i am not firing the request handler that i think i am
firing?


  


default

false

true

20
  explicit


  spellcheck

  


query sent to server:

http://hfsthssolr1.intra.searshc.com:8180/solrpartscat/core1/select/?q=itemDescSpell%3Agusket%0D%0A&version=2.2&start=0&rows=10&indent=on&spellcheck=true&spellcheck.build=true

results:

00trueon0itemDescSpell:gusket
true102.2

--
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-configuration-not-providing-suggestions-or-corrections-tp3740877p3741521.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: spellcheck configuration not providing suggestions or corrections

2012-02-13 Thread geeky2
thank you sooo much - that was it.

also - thank you for the tip on which field to hit, e.g. itemDesc instead of
itemDescSpell.

thank you,
mark



--
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-configuration-not-providing-suggestions-or-corrections-tp3740877p3741783.html
Sent from the Solr - User mailing list archive at Nabble.com.


proper syntax for using sort query parameter in responseHandler

2012-02-17 Thread geeky2
what is the proper syntax for including a sort directive in my requestHandler?

i tried this but got an error:


  

  edismax
  all
  10
  itemNo^1.0
  *:*
 * rankNo desc*


  itemType:1


  false

  


thank you
mark

--
View this message in context: 
http://lucene.472066.n3.nabble.com/proper-syntax-for-using-sort-query-parameter-in-responseHandler-tp3755077p3755077.html
Sent from the Solr - User mailing list archive at Nabble.com.


need to support bi-directional synonyms

2012-02-22 Thread geeky2
hello all,

i need to support the following:

if the user enters "sprayer" in the desc field - then they get results for
BOTH "sprayer" and "washer".

and in the other direction

if the user enters "washer" in the desc field - then they get results for
BOTH "washer" and "sprayer". 

would i set up my synonym file like this?

assuming expand = true..

sprayer => washer
washer => sprayer
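for comparison, here is a small python sketch of the expanding behavior - with expand=true, a single comma-separated line like `sprayer, washer` maps each term to the whole group (the parsing below is a simplification of solr's synonym file format, for illustration only):

```python
def load_synonyms(lines):
    """Build a bidirectional (expanding) synonym map from
    comma-separated lines: every term in a group maps to the
    whole group, in both directions."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        group = [t.strip().lower() for t in line.split(",")]
        for term in group:
            mapping.setdefault(term, set()).update(group)
    return mapping

syns = load_synonyms(["sprayer, washer"])
print(sorted(syns["sprayer"]))  # ['sprayer', 'washer']
print(sorted(syns["washer"]))   # ['sprayer', 'washer']
```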

thank you,
mark

--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-to-support-bi-directional-synonyms-tp3767990p3767990.html
Sent from the Solr - User mailing list archive at Nabble.com.


does the location of a match (within a field) affect the score?

2012-03-02 Thread geeky2
hello all,

example:

i have a field named itemNo

the user does a search, itemNo:665

there are three documents in the core that look like this:

doc1 - itemNo = 1237899*665*

doc2 - itemNo = *665*1237899

doc3 - itemNo = 123*665*7899



does the location or placement of the search string (beginning, middle, end)
affect the scoring of the document?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/does-the-location-of-a-match-within-a-field-affect-the-score-tp3793634p3793634.html
Sent from the Solr - User mailing list archive at Nabble.com.


need input - lessons learned or best practices for data imports

2012-03-05 Thread geeky2
hello all,

we are approaching the time when we will move our first solr core into a
more "production-like" environment.  as a precursor to this, i am attempting
to write some documents on impact assessment and batch load / data import
strategies.

does anyone have processes or lessons learned - that they can share?

maybe a good place to start - but not limited to - would be how people
monitor data imports (we are using a very simple DIH hooked to an informix
schema) and send out appropriate notifications?

thank you for any help or suggestions,
mark


--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-input-lessons-learned-or-best-practices-for-data-imports-tp3801327p3801327.html
Sent from the Solr - User mailing list archive at Nabble.com.


does solr have a mechanism for intercepting requests - before they are handed off to a request handler

2012-03-09 Thread geeky2
hello all,

does solr have a mechanism that could intercept a request (before it is
handed off to a request handler)?

the intent (from the business) is to send in a generic request - then
pre-parse the url and send it off to a specific request handler.

thank you,
mark 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/does-solr-have-a-mechanism-for-intercepting-requests-before-they-are-handed-off-to-a-request-handler-tp3813255p3813255.html
Sent from the Solr - User mailing list archive at Nabble.com.


suggestions on automated testing for solr output

2012-03-16 Thread geeky2
hello all,

i know this is never a fun topic for people, but our SDLC mandates that we
have unit test cases that attempt to validate the output from specific solr
queries.

i have some ideas on how to do this, but would really appreciate feedback
from anyone that has done this or is doing it now.

the ideal situation (for this environment) would be something script based
and automated.
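one script-based approach is to fetch each query with wt=json and assert on the parsed response. a minimal sketch - the `id` field name and the canned response below are illustrative stand-ins, not your schema:

```python
def doc_ids(solr_json):
    """Pull the ids out of a parsed Solr select response
    (wt=json shape: response.docs is a list of documents)."""
    return [d["id"] for d in solr_json["response"]["docs"]]

# canned response, standing in for json.load(urlopen(query_url))
sample = {"response": {"numFound": 2,
                       "docs": [{"id": "0333500"}, {"id": "0333501"}]}}

# a "test case" is then just an expected-id assertion per query
assert doc_ids(sample) == ["0333500", "0333501"]
```

a batch of these, one per query url, can run from cron or a CI job and fail loudly when results drift.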

thanks for any input,
mark


--
View this message in context: 
http://lucene.472066.n3.nabble.com/suggestions-on-automated-testing-for-solr-output-tp3833049p3833049.html
Sent from the Solr - User mailing list archive at Nabble.com.


spellcheck file format - multiple words on a line?

2012-03-23 Thread geeky2
hello all,

for business reasons, we are sourcing the spellcheck file from another
business group.  

the file we receive looks like the example data below

can solr support this type of format - or do i need to process this file
into a format that has a single word on a single line?

thanks for any help
mark



// snipped from spellcheck file sourced from business group

14-INCH CHAIN
14-INCH RIGHT TINE
1/4 open end ignition wrench
150 DEGREES CELSIUS
15 foot I wire
15 INCH
15 WATT
16 HORSEPOWER ENGINE
16 HORSEPOWER GASOLINE ENGINE
16-INCH BAR
16-INCH CHAIN
16l Cross
16p SIXTEEN PIECE FLAT FLEXIBLE CABLE
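if a single-word-per-line file does turn out to be required, the conversion is a short script. a sketch - lowercasing and deduping are assumptions about what the dictionary should hold, not requirements from solr:

```python
def to_dictionary_words(lines):
    """Flatten multi-word phrases into the one-word-per-line form a
    file-based spellcheck dictionary expects; lowercases and dedupes."""
    words = set()
    for line in lines:
        for token in line.split():
            token = token.strip().lower()
            if token:
                words.add(token)
    return sorted(words)

phrases = ["14-INCH CHAIN", "15 WATT", "16-INCH CHAIN"]
print(to_dictionary_words(phrases))
# ['14-inch', '15', '16-inch', 'chain', 'watt']
```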


--
View this message in context: 
http://lucene.472066.n3.nabble.com/spellcheck-file-format-multiple-words-on-a-line-tp3853096p3853096.html
Sent from the Solr - User mailing list archive at Nabble.com.


preventing words from being indexed in spellcheck dictionary?

2012-03-27 Thread geeky2
hello all,

i am creating a spellcheck dictionary from the itemDescSpell field in my
schema.

is there a way to prevent certain words from entering the dictionary - as
the dictionary is being built?

thanks for any help
mark

// snipped from solarconfig.xml


  default
  itemDescSpell
  true
  spellchecker_mark



--
View this message in context: 
http://lucene.472066.n3.nabble.com/preventing-words-from-being-indexed-in-spellcheck-dictionary-tp3861472p3861472.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: preventing words from being indexed in spellcheck dictionary?

2012-03-27 Thread geeky2
thank you very much for the info ;)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/preventing-words-from-being-indexed-in-spellcheck-dictionary-tp3861472p3861987.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: preventing words from being indexed in spellcheck dictionary?

2012-03-27 Thread geeky2
hello,

should i apply the StopFilterFactory at index time or query time?

right now - per the schema below - i am applying it at BOTH index time and
query time.

is this correct?
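either way, here is a small python sketch of what the stop filter does; applying the same stop set on both sides keeps the index and query token streams aligned (the stop set below is illustrative, not your stopwords.txt):

```python
def strip_stopwords(tokens, stopwords):
    """Drop stop words from a token stream, as StopFilterFactory
    does; running the same filter in both the index and query
    analyzers means neither side keeps a term the other dropped."""
    return [t for t in tokens if t.lower() not in stopwords]

stop = {"the", "a", "of"}
indexed = strip_stopwords(["the", "gasket", "of", "doom"], stop)
queried = strip_stopwords(["the", "gasket"], stop)
print(indexed, queried)  # ['gasket', 'doom'] ['gasket']
```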

thank you,
mark


// snipped from schema.xml






  

  
  
  
  
  
  


  
  
  
  
  

  


--
View this message in context: 
http://lucene.472066.n3.nabble.com/preventing-words-from-being-indexed-in-spellcheck-dictionary-tp3861472p3862722.html
Sent from the Solr - User mailing list archive at Nabble.com.


authentication for solr admin page?

2012-03-28 Thread geeky2
hello,

environment:

running solr 3.5 under jboss 5.1

i have been searching the user list along with the locations below - to find
out how to require a user to authenticate into the solr /admin page.  i
thought this would be a common issue - but maybe not ;)

any help would be appreciated

thank you,
mark



http://drupal.org/node/658466

http://wiki.apache.org/solr/SolrSecurity#Write_Your_Own_RequestHandler_or_SearchComponent





--
View this message in context: 
http://lucene.472066.n3.nabble.com/authentication-for-solr-admin-page-tp3865665p3865665.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: preventing words from being indexed in spellcheck dictionary?

2012-03-28 Thread geeky2
thank you, James.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/preventing-words-from-being-indexed-in-spellcheck-dictionary-tp3861472p3865670.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: authentication for solr admin page?

2012-03-28 Thread geeky2
update -

ok - i was reading about replication here:

http://wiki.apache.org/solr/SolrReplication

and noticed comments in the solrconfig.xml file related to HTTP Basic
Authentication and the usage of the following tags:

username
password

*Can i place these tags in the request handler to achieve an authentication
scheme for the /admin page?*

// snipped from the solrconfig.xml file

  

thanks for any help
mark

--
View this message in context: 
http://lucene.472066.n3.nabble.com/authentication-for-solr-admin-page-tp3865665p3865747.html
Sent from the Solr - User mailing list archive at Nabble.com.


why does building war from source produce a different size file?

2012-03-29 Thread geeky2

hello all,

i have been pulling down the 3.5 solr war file from the mirror site.

the size of this file is:

6403279 Nov 22 14:54 apache-solr-3.5.0.war

when i build the war file from source - i get a different sized file:

 ./dist/apache-solr-3.5-SNAPSHOT.war

6404098 Mar 29 11:41 ./dist/apache-solr-3.5-SNAPSHOT.war

am i building from the wrong source?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-building-war-from-source-produce-a-different-size-file-tp3868307p3868307.html
Sent from the Solr - User mailing list archive at Nabble.com.


is there a downside to combining search fields with copyfield?

2012-04-12 Thread geeky2
hello everyone,

can people give me their thoughts on this.

currently, my schema has individual fields to search on.

are there advantages or disadvantages to taking several of the individual
search fields and combining them into a single search field?

would this affect search times, term tokenization or possibly other things?

example of individual fields

brand
category
partno

example of a single combined search field

part_info (would combine brand, category and partno)
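for what it's worth, copyField at index time behaves roughly like this python sketch - each source value is appended into a multivalued destination field (field names are the ones from the example above):

```python
def copy_fields(doc, sources, dest):
    """Mimic schema.xml copyField: append each source field's
    value(s) into a multivalued destination field at index time."""
    combined = []
    for name in sources:
        value = doc.get(name)
        if value is None:
            continue
        # multivalued sources contribute each of their values
        combined.extend(value if isinstance(value, list) else [value])
    doc[dest] = combined
    return doc

doc = {"brand": "PHILIPS", "category": ["PLASMA TELEVISION"],
       "partno": "BP2.1UAA"}
copy_fields(doc, ["brand", "category", "partno"], "part_info")
print(doc["part_info"])  # ['PHILIPS', 'PLASMA TELEVISION', 'BP2.1UAA']
```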

thank you for any feedback
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-there-a-downside-to-combining-search-fields-with-copyfield-tp3905349p3905349.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: is there a downside to combining search fields with copyfield?

2012-04-12 Thread geeky2

>>
You end up with one multivalued field, which means that you can only
have one analyzer chain.
<<

actually two of the three fields being considered for combination into a
single field ARE multivalued fields.

would this be an issue?

>>
  With separate fields, each field can be
analyzed differently.  Also, if you are indexing and/or storing the
individual fields, you may have data duplication in your index, making
it larger and increasing your disk/RAM requirements.
<<

this makes sense


>>
  That field will
have a higher termcount than the individual fields, which means that
searches against it will naturally be just a little bit slower.
<<

ok

>>
  Your
application will not have to do as much work to construct a query, though.
<<

actually this is the primary reason this came up.  

>>
If you are already planning to use dismax/edismax, then you don't need
the overhead of a copyField.  You can simply provide access to (e)dismax
search with the qf (and possibly pf) parameters predefined, or your
application can provide these parameters.

http://wiki.apache.org/solr/ExtendedDisMax
<<

can you elaborate on this and how EDisMax would preclude the need for
copyfield?

i am using extended dismax now in my response handlers.

here is an example of one of my requestHandlers

  

  edismax
  all
  5
  itemNo^1.0
  *:*


  itemType:1
  rankNo asc, score desc


  false

  






Thanks,
Shawn 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-there-a-downside-to-combining-search-fields-with-copyfield-tp3905349p3906265.html
Sent from the Solr - User mailing list archive at Nabble.com.


searching across multiple fields using edismax - am i setting this up right?

2012-04-12 Thread geeky2
hello all,

i just want to check to make sure i have this right.

i was reading on this page: http://wiki.apache.org/solr/ExtendedDisMax,
thanks to shawn for educating me.

*i want the user to be able to fire a requestHandler but search across
multiple fields (itemNo, productType and brand) WITHOUT them having to
specify in the query url what fields they want / need to search on*

this is what i have in my request handler


  

  edismax
  all
  5
  *itemNo^1.0 productType^.8 brand^.5*
  *:*


  rankNo asc, score desc


  false

  

this would be an example of a single term search going against all three of
the fields

http://bogus:bogus/somecore/select?qt=partItemNoSearch&q=*dishwasher*&debugQuery=on&rows=100

this would be an example of a multiple term search across all three of the
fields

http://bogus:bogus/somecore/select?qt=partItemNoSearch&q=*dishwasher
123-xyz*&debugQuery=on&rows=100
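the urls above can also be composed programmatically; a small sketch, where the host, core and handler names are the same placeholders as in the urls above - the handler supplies defType and qf, so the client only sends q:

```python
from urllib.parse import urlencode

def search_url(base, handler, query, rows=100):
    """Compose a select URL against a named requestHandler."""
    params = {"qt": handler, "q": query,
              "rows": rows, "debugQuery": "on"}
    return base + "/select?" + urlencode(params)

url = search_url("http://bogus:bogus/somecore", "partItemNoSearch",
                 "dishwasher 123-xyz")
print(url)
```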


do i understand this correctly?

thank you,
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/searching-across-multiple-fields-using-edismax-am-i-setting-this-up-right-tp3906334p3906334.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: searching across multiple fields using edismax - am i setting this up right?

2012-04-13 Thread geeky2
thank you for the response.

it seems to be working well ;)

1) i tried your suggestion about removing the qt parameter - 

*somecore/partItemNoSearch*&q=dishwasher&debugQuery=on&rows=10

but this results in a 404 error message - is there some configuration i am
missing to support this short-hand syntax for specifying the requestHandler
in the url?



2) ok - good suggestion.



3) yes it looks like it IS searching across all three (3) fields.

i noticed that for the itemNo field, it reduced the search string from
dishwasher to dishwash - is this because of stemming on the field type used
for the itemNo field?

dishwasherdishwasher+DisjunctionMaxQuery((brand:dishwasher^0.5 |
*itemNo:dishwash* | productType:dishwasher^0.8))+(brand:dishwasher^0.5 | itemNo:dishwash |
productType:dishwasher^0.8)





--
View this message in context: 
http://lucene.472066.n3.nabble.com/searching-across-multiple-fields-using-edismax-am-i-setting-this-up-right-tp3906334p3907875.html
Sent from the Solr - User mailing list archive at Nabble.com.


solr replication failing with error: Master at: is not available. Index fetch failed

2012-04-23 Thread geeky2
hello all,

environment: centOS and solr 3.5

i am attempting to set up replication between two solr boxes (master and
slave).

i am getting the following in the logs on the slave box.

2012-04-23 10:54:59,985 SEVERE [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Master at:
http://someip:someport/somepath/somecore/admin/replication/ is not
available. Index fetch failed. Exception: Invalid version (expected 2, but
10) or the data in not in 'javabin' format

master jvm (jboss host) is being started like this:

-Denable.master=true

slave jvm (jboss host) is being started like this:

-Denable.slave=true

does anyone have any ideas?

i have done the following:

used curl http://someip:someport/somepath/somecore/admin/replication/ from
slave to successfully see master

used ping from slave to master

switched out the dns name for master to hard coded ip address

made sure i can see
http://someip:someport/somepath/somecore/admin/replication/ in a browser


this is my request handler - i am using the same config file on both the
master and slave - but sending in the appropriate switch on start up (per
the solr wiki page on replication)



  ${enable.master:false}
  startup
  commit



  schema.xml,stopwords.txt,elevate.xml

  00:00:10


1


  ${enable.slave:false}
  http://someip:someport/somecore/admin/replication/

  00:00:20


  internal

  5000
  1


  


any suggestions would be great
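one way to sanity-check a master/slave pair is to compare the index versions each box reports from /replication?command=indexversion&wt=json. a sketch of the comparison against canned payloads - the `indexversion` key name is recalled from the 3.x response and worth verifying against your install:

```python
def versions_match(master_payload, slave_payload):
    """Compare the index version reported by master and slave
    ReplicationHandlers; equal versions mean the slave is in
    sync (key name assumed from the 3.x JSON response)."""
    return (master_payload.get("indexversion")
            == slave_payload.get("indexversion"))

# canned payloads, standing in for json.load(urlopen(
#   base + "/replication?command=indexversion&wt=json"))
master = {"indexversion": 1335275959996, "generation": 5}
slave = {"indexversion": 1335275959996, "generation": 5}
print(versions_match(master, slave))  # True
```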

thank you,
mark



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-replication-failing-with-error-Master-at-is-not-available-Index-fetch-failed-tp3932921p3932921.html
Sent from the Solr - User mailing list archive at Nabble.com.


correct location in chain for EdgeNGramFilterFactory ?

2012-04-24 Thread geeky2
hello all,

i want to experiment with the EdgeNGramFilterFactory at index time.

i believe this needs to go in post tokenization - but i am doing a pattern
replace as well as other things.

should the EdgeNGramFilterFactory go in right after the pattern replace?





  






*put EdgeNGramFilterFactory here ===> ?*





  
  







  


thanks for any help,



--
View this message in context: 
http://lucene.472066.n3.nabble.com/correct-location-in-chain-for-EdgeNGramFilterFactory-tp3935589p3935589.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr replication failing with error: Master at: is not available. Index fetch failed

2012-04-24 Thread geeky2
hello,

thank you for the reply,

yes - master has been indexed.

ok - makes sense - the polling interval needs to change

i did check the solr war file on both boxes (master and slave).  they are
identical.  actually - if they were not identical - this would point to a
different issue altogether - since our deployment infrastructure rolls the
war file to the slaves when you do a deployment on the master.

this has me stumped - not sure what to check next.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-replication-failing-with-error-Master-at-is-not-available-Index-fetch-failed-tp3932921p3935699.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr replication failing with error: Master at: is not available. Index fetch failed

2012-04-24 Thread geeky2
that was it!

thank you.

i did notice something else in the logs now ...

what is the meaning or implication of the message "Connection reset"?



2012-04-24 12:59:19,996 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 12:59:39,998 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
*2012-04-24 12:59:59,997 SEVERE [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Master at:
http://bogus:bogusport/somepath/somecore/replication/ is not available.
Index fetch failed. Exception: Connection reset*
2012-04-24 13:00:19,998 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:00:40,004 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:00:59,992 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:01:19,993 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:01:39,992 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:01:59,989 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:02:19,990 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:02:39,989 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-24 13:02:59,991 INFO  [org.a

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-replication-failing-with-error-Master-at-is-not-available-Index-fetch-failed-tp3932921p3936107.html
Sent from the Solr - User mailing list archive at Nabble.com.


faceted searches - design question - facet field not part of qf search fields

2012-04-24 Thread geeky2


hello all,

this is more of a design / newbie question on how others combine faceted
search fields in to their requestHandlers.

say you have a request handler set up like below.

does it make sense (from a design perspective) to add a faceted search field
that is NOT part of the main search fields (itemNo, productType, brand) in
the qf param?

for example, augment the requestHandler below to include a faceted search on
itemDesc?

would this be confusing? - to be searching across three fields - but
offering faceted suggestions on itemDesc?

just trying to understand how others approach this

thanks

  

  edismax
  all
  10
  itemNo^1.0 productType^.8 brand^.5
  *:*


 

  false

  



  


--
View this message in context: 
http://lucene.472066.n3.nabble.com/faceted-searches-design-question-facet-field-not-part-of-qf-search-fields-tp3936509p3936509.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: faceted searches - design question - facet field not part of qf search fields

2012-04-25 Thread geeky2
thank you BOTH, Erick and Hoss, for the insight.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/faceted-searches-design-question-facet-field-not-part-of-qf-search-fields-tp3936509p3938080.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr replication failing with error: Master at: is not available. Index fetch failed

2012-04-26 Thread geeky2
hello,

sorry - i overlooked this message - thanks for checking back and thanks for
the info.

yes - replication seems to be working now:

tailed from logs just now:

2012-04-26 09:21:33,284 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-26 09:21:53,279 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-26 09:22:13,279 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.
2012-04-26 09:22:33,279 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Slave in sync with master.



 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-replication-failing-with-error-Master-at-is-not-available-Index-fetch-failed-tp3932921p3941447.html
Sent from the Solr - User mailing list archive at Nabble.com.


impact of EdgeNGramFilterFactory on indexing process?

2012-04-26 Thread geeky2

Hello all,

i am experimenting with EdgeNGramFilterFactory - on two of the fieldTypes in
my schema.

   

i believe i understand this - but want to verify:

1) will this increase my index time?
2) will it increase the number of documents in my index?
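a rough python sketch of what the filter emits may help frame the questions - edge n-grams add extra terms per token (which does grow the index and the indexing work) rather than extra documents; the gram sizes below are illustrative values, not your config:

```python
def edge_ngrams(token, min_gram=2, max_gram=6):
    """Front-anchored n-grams in the style of EdgeNGramFilterFactory:
    one extra term per prefix length between min_gram and max_gram."""
    return [token[:n]
            for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("bp21uaa", 2, 6))
# ['bp', 'bp2', 'bp21', 'bp21u', 'bp21ua']
```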

thank you

--
View this message in context: 
http://lucene.472066.n3.nabble.com/impact-of-EdgeNGramFilterFactory-on-indexing-process-tp3941743p3941743.html
Sent from the Solr - User mailing list archive at Nabble.com.


should slave replication be turned off / on during master clean and re-index?

2012-04-27 Thread geeky2
hello all,

i am just getting replication going on our master and two (2) slaves.

from time to time, i may need to do a complete re-index and clean on the
master.

should replication on the slave - remain On or Off during a full clean and
re-index on the Master?

thank you,

--
View this message in context: 
http://lucene.472066.n3.nabble.com/should-slave-replication-be-turned-off-on-during-master-clean-and-re-index-tp3945531p3945531.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: should slave replication be turned off / on during master clean and re-index?

2012-04-27 Thread geeky2
hello,

thank you for the reply,

>>
Does a "clean" mean issuing a deletion query (e.g.
*:*) prior to re-indexing all of your content?  I
don't think the slaves will download any changes until you've committed at
some point on the master.  
<<

well, in this case when i say, "clean"  (on the Master), i mean selecting
the "Full Import with Cleaning" button from the DataImportHandler
Development Console page in solr.  at the top of the page, i have the check
boxes selected for verbose and clean (*but i don't have the commit checkbox
selected*).

by doing the above process - doesn't this issue a deletion query - then
start the import?

and as a follow-up - when is the commit actually done?


here is the relevant snippet from my solrconfig.xml file on the master

  
*
  6
  1000
*
10
  






--
View this message in context: 
http://lucene.472066.n3.nabble.com/should-slave-replication-be-turned-off-on-during-master-clean-and-re-index-tp3945531p3945954.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: should slave replication be turned off / on during master clean and re-index?

2012-05-01 Thread geeky2
hello shawn,

thanks for the reply.

ok - i did some testing and yes you are correct.

autocommit is doing the "commit" work in chunks. yes - the slaves are also
going from having everything to nothing, then slowly building back up again,
lagging behind the master.

... and yes - this is probably not what we need - as far as a replication
strategy for the slaves.

you said you don't use autocommit.  if so - why don't you use / like
autocommit?

since we have not done this here - there is no established reference point,
from an operations perspective.

i am looking to formulate some sort of operation strategy, so ANY ideas or
input is really welcome.



it seems to me that we have to account for two operational strategies - 

the first operational mode is a "daily" append to the solr core after the
database tables have been updated.  this can probably be done with a simple
delta import.  i would think that autocommit could remain on for the master
and replication could also be left on so the slaves picked up the changes
ASAP.  this seems like the mode that we would / should be in most of the
time.


the second operational mode would be a "build from scratch" mode, where
changes in the schema necessitated a full re-index of the data.  given that
our site (powered by solr) must be up all of the time, and that our full
index time on the master (for the moment) is hovering somewhere around 16
hours - it makes sense that some sort of parallel path - with a cut-over,
must be used.

in this situation is it possible to have the indexing process going on in
the background - then have one commit at the end - then turn replication on
for the slaves?

are there disadvantages to this approach?

also - i really like your suggestion of a "build core" and "live core".  is
this approach you use?
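The build-core / live-core cut-over mentioned above is usually done with the CoreAdmin SWAP action. A hedged sketch of the URL involved - host, port, and the core names "live" and "build" are assumptions, and the curl call is left commented out:

```shell
# index into the "build" core in the background, commit once at the end,
# then atomically exchange it with the serving core via CoreAdmin SWAP
HOST="localhost"
PORT="8983"
SWAP_URL="http://${HOST}:${PORT}/solr/admin/cores?action=SWAP&core=live&other=build"
echo "$SWAP_URL"
# curl "$SWAP_URL"   # run against the real master when ready
```

After the swap, "live" points at the freshly built index and the old index becomes the next build target.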

thank you for all of the great input








dataimport handler (DIH) - notify when it has finished?

2012-05-01 Thread geeky2
Hello all,

is there a notification / trigger / callback mechanism people use that
allows them to know when a dataimport process has finished?

we will be doing daily delta-imports and i need some way for an operations
group to know when the DIH has finished.

thank you,





Re: should slave replication be turned off / on during master clean and re-index?

2012-05-03 Thread geeky2
thanks for all of the advice / help.

i appreciate it ;)





solr snapshots - old school and replication - new school ?

2012-05-03 Thread geeky2
hello all,

environment: centOS and solr 3.5

i want to make sure i understand the difference between  snapshots and solr
replication.

snapshots are "old school" and have been deprecated with solr replication
"new school".

do i have this correct?

btw: i have replication working (now), between my master and two slaves - i
just want to make sure i am not missing a larger picture ;)
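For reference, the "new school" HTTP replication is configured through the ReplicationHandler in solrconfig.xml. A minimal master/slave sketch - the host, port, core name, and the 10-minute pollInterval are illustrative assumptions:

```xml
<!-- master solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
    <str name="pollInterval">00:10:00</str>
  </lst>
</requestHandler>
```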

i have been reading the Smiley Pugh book (pg 349) as well as material on the
wiki at:

http://wiki.apache.org/solr/SolrCollectionDistributionScripts

http://wiki.apache.org/solr/SolrReplication


thank you,





not getting expected results when doing a delta import via full import

2012-05-14 Thread geeky2
hello all,


i am not getting the expected results when trying to set up delta imports
according to the wiki documentation here:

http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport?highlight=%28delta%29|%28import%29



i have the following set up in my DIH,

<entity name="..."
        query="select [complicated sql goes here] and
               ('${dataimporter.request.clean}' != 'false' OR some_table.upd_by_ts >
               '${dataimporter.last_index_time}')">

i have the following set up in the shell script to invoke my import process
(either a full w/clean or delta)

# change clean=true for full, clean=false for delta

SERVER="http://some_server:port/some_core/dataimport -F command=full-import
-F clean=false"

curl $SERVER


when i do a full import (clean=true) i see all of the documents (via the
stats page) show up in the core.

when i do a delta import (clean=false) the import processes only ~900 fewer
records than the full import, but it should process far fewer - only the
~84,000 records whose upd_by_ts field i updated to the current timestamp!

can someone tell me what i am missing?

thank you,






Re: not getting expected results when doing a delta import via full import

2012-05-14 Thread geeky2
update on this:

i also tried manipulating the timestamps in the dataimport.properties file
to advance the date so that no records could be newer than last_index_time

example:

#Mon May 14 12:42:49 CDT 2012
core1-model.last_index_time=2012-05-15 14\:38\:55
last_index_time=2012-05-15 14\:38\:55

this leads me to believe that date comparisons are not being done correctly
or have not been configured correctly.

so what needs to be configured for the date comparison to work?

example from wiki:

OR last_modified > '${*dataimporter.last_index_time*}'">
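The value in dataimport.properties is easy to sanity-check: the backslashes are just Java properties-file escaping, and the value DIH substitutes into `${dataimporter.last_index_time}` is the plain 'yyyy-MM-dd HH:mm:ss' string, which the database column must be comparable against. The sample value is the one from the message above:

```shell
# strip the properties-file escapes to see what DIH actually substitutes
RAW='2012-05-15 14\:38\:55'
TS=$(echo "$RAW" | tr -d '\\')
echo "$TS"
```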





need help with getting exact matches to score higher

2012-05-15 Thread geeky2
Hello all,


i am trying to tune our core for exact matches on a single field (itemNo)
and having issues getting it to work.  

in addition - i need help understanding the output from debugQuery=on where
it presents the scoring.

my goal is to get exact matches to arrive at the top of the results. 
however - what i am seeing is non-exact matches arrive at the top of the
results with MUCH higher scores.



// from schema.xml - i am copying itemNo in to the string field for use in
boosting
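The field declarations below were stripped by the archive; the setup described (copying itemNo into a string field for exact matching) would look roughly like this - the attribute values are assumptions, only the field names come from the message:

```xml
<field name="itemNoExactMatchStr" type="string" indexed="true" stored="true"/>
<copyField source="itemNo" dest="itemNoExactMatchStr"/>
```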

  
  

// from solrconfig.xml - i have the boost set for my special exact match
field and the sorting on score desc.

  

  <requestHandler name="..." class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="echoParams">all</str>
      <int name="rows">10</int>
      <str name="qf">itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8
           brand^.5</str>
      <str name="q.alt">*:*</str>
      <str name="sort">score desc</str>
      <bool name="facet">true</bool>
      <str name="facet.field">itemDescFacet</str>
      <str name="facet.field">brandFacet</str>
      <str name="facet.field">divProductTypeIdFacet</str>
    </lst>
  </requestHandler>



// analysis output from debugQuery=on

here you can see that the top score for itemNo:9030 belongs to a part that
does not start with 9030.

the entries below (there are 4) all have exact matches - but they rank below
this part - ???



2TTZ9030C1000A* ">
0.585678 = (MATCH) max of:
  0.585678 = (MATCH) weight(itemNo:9030^0.9 in 582979), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
27.173943 = (MATCH) fieldWeight(itemNo:9030 in 582979), product of:
  2.6457512 = tf(termFreq(itemNo:9030)=7)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=582979)




9030*   ">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 499864), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 499864), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=499864)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 538826), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 538826), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=538826)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544313), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544313), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=544313)


9030   *">
0.22136548 = (MATCH) max of:
  0.22136548 = (MATCH) weight(itemNo:9030^0.9 in 544657), product of:
0.021552926 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  10.270785 = idf(docFreq=55, maxDocs=594893)
  0.0023316324 = queryNorm
10.270785 = (MATCH) fieldWeight(itemNo:9030 in 544657), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  10.270785 = idf(docFreq=55, maxDocs=594893)
  1.0 = fieldNorm(field=itemNo, doc=544657)










doing a full-import after deleting records in the database - maxDocs

2012-05-15 Thread geeky2

hello,

After doing a DIH full-import (with clean=true) after deleting records in
the database, i noticed that the number of documents processed did change.


example:

Indexing completed. Added/Updated: 595908 documents. Deleted 0 documents.

however, i noticed the numbers on the statistics page did not change nor do
they match the number of indexed records -


can someone help me understand the difference in these numbers and the
meaning of maxDoc / numDoc?

numDocs : 594893
maxDoc : 594893 





Re: doing a full-import after deleting records in the database - maxDocs

2012-05-15 Thread geeky2
hello 

thanks for the reply

this is the output - docsPending = 0

commits : 1786
autocommit maxDocs : 1000
autocommit maxTime : 6ms
autocommits : 1786
optimizes : 3
rollbacks : 0
expungeDeletes : 0
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 1787752
cumulative_deletesById : 0
cumulative_deletesByQuery : 3
cumulative_errors : 0 



index-time boosting using DIH

2012-05-22 Thread geeky2
hello all,

can i use the technique described on the wiki at:

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts

if i am populating my core using a DIH?

looking at the posts on this subject and the wiki docs - leads me to believe
that you can only use this when you are using the xml interface for
importing data?

thank you



RE: index-time boosting using DIH

2012-05-22 Thread geeky2
thanks for the reply,

so to use the $docBoost pseudo-field name, would you do something like below
- and would this technique likely increase my total index time?




  

 
  

 
 
 ... 
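The entity markup above was stripped by the archive. A minimal sketch of the $docBoost pseudo-field set through a ScriptTransformer in data-config.xml - the function name, entity name, query, and boost value are all placeholders:

```xml
<dataConfig>
  <script><![CDATA[
      function addBoost(row) {
          row.put('$docBoost', 2.0);  /* boost value is an assumption */
          return row;
      }
  ]]></script>
  <document>
    <entity name="item" transformer="script:addBoost"
            query="select ... from some_table"/>
  </document>
</dataConfig>
```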



RE: index-time boosting using DIH

2012-05-22 Thread geeky2
thank you james for the feedback - i appreciate it.

ultimately - i was trying to decide if i was missing the boat by ONLY using
query time boosting, and i should really be using index time boosting.

but after your reply, reading the solr book, and looking at the lucene dox -
it looks like index-time boosting is not what i need.  i can probably do
better by using query-time boosting and the proper sort params.

thanks again



need to verify my understanding of default value of mm (minimum match) for edismax

2012-05-24 Thread geeky2
environment: solr 3.5
default operator is OR

i want to make sure i understand how the mm param(minimum match) works for
the edismax parser

http://wiki.apache.org/solr/ExtendedDisMax?highlight=%28dismax%29#mm_.28Minimum_.27Should.27_Match.29

it looks like the rule is 100% of the terms must match across the fields,
unless i override this with the mm=x param - do i have this right?

what i am seeing is a query that matches on:

q=singer sewing 9010

will fail if it is changed to:

q=singer sewing machine 9010

for the second query - if i add mm=3 - then it comes back with results
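A hedged sketch of that second query with mm passed explicitly - host, port, and core name are made up:

```shell
# with edismax all terms must match by default; mm relaxes that to
# "at least this many SHOULD clauses must match"
Q='singer+sewing+machine+9010'
URL="http://localhost:8983/solr/core1/select?defType=edismax&q=${Q}&mm=3"
echo "$URL"
```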

thank you




possible status codes from solr during a (DIH) data import process

2012-05-31 Thread geeky2
hello all,

i have been asked to write a small polling script (bash) to periodically
check the status of an import on our Master.  our import times are small,
but there are business reasons why we want to know the status of an import
after a specified amount of time.

i need to perform certain actions based on the "status" of the import, and
therefore need to quantify which tags to check and their appropriate states.

i am using the command from the DataImportHandler HTTP API to get the status
of the import:

OUTPUT=$(curl -v
http://${SERVER}:${PORT}/somecore/dataimport?command=status)




can someone tell me if i have these rules correct?

1) during an import - the status tag will have a busy state:

example:

  busy

2) at the completion of an import (regardless of failure or success) the
status tag will have an "idle" state:

example:

  idle


3) to determine if an import failed or succeeded - you must interrogate the
tags under <statusMessages> and specifically look for:

success: 
Indexing completed. Added/Updated: 603378 documents. Deleted 0
documents.

failure: 
Indexing completed. Added/Updated: 603378 documents. Deleted 0
documents.
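The busy/idle check in rules 1 and 2 can be sketched in shell. Here a canned response is parsed instead of hitting a live server - swap in the commented curl call (SERVER and PORT are assumptions) for real use:

```shell
# extract the <str name="status"> value from a DIH status response and branch
RESPONSE='<str name="status">idle</str>'
# RESPONSE=$(curl -s "http://${SERVER}:${PORT}/somecore/dataimport?command=status")
STATUS=$(echo "$RESPONSE" | sed -n 's/.*<str name="status">\([^<]*\)<\/str>.*/\1/p')
if [ "$STATUS" = "busy" ]; then
  echo "import still running"
else
  echo "import done (status=$STATUS) - now check the status messages"
fi
```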

thank you,




eliminate adminPath tag from solr.xml file?

2012-06-01 Thread geeky2
hello all,

referring to:

http://wiki.apache.org/solr/CoreAdmin#Core_Administration

if you wanted to eliminate administration of the core from the web site,

could you eliminate either solr.xml, or remove the adminPath attribute from
the <cores> element in the solr.xml file?
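For reference, adminPath is an attribute on the <cores> element in solr.xml; per the CoreAdmin wiki, if it is not specified the cores still run but cannot be managed at runtime. A minimal sketch - core names are placeholders:

```xml
<solr persistent="true">
  <!-- omit adminPath to disable the runtime CoreAdmin handler -->
  <cores adminPath="/admin/cores" defaultCoreName="core1">
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```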

thank you,




Re: possible status codes from solr during a (DIH) data import process

2012-06-01 Thread geeky2
thank you ALL for the great feedback - very much appreciated!





seeing errors during replication process on slave boxes - read past EOF

2012-06-03 Thread geeky2
hello all,

environment: solr 3.5

1 - master
2 - slave

slaves are set to poll master every 10 minutes.

i have had replication running on one master and two slaves - for a few
weeks now.  these boxes are not production boxes - just QA/test boxes.

right after i started a re-index on the master - i started to see the
following errors on both of the slave boxes.

in previous test runs - i have not noticed any errors.

can someone help me understand what is causing these errors?

thank you,

2012-06-03 19:30:23,104 INFO  [org.apache.solr.update.UpdateHandler]
(pool-16-thread-1) start
commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
2012-06-03 19:30:23,164 SEVERE [org.apache.solr.handler.ReplicationHandler]
(pool-16-thread-1) SnapPull failed
org.apache.solr.common.SolrException: Index fetch failed :
at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:331)
at
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: java.io.IOException: read past EOF:
MMapIndexInput(path="/appl/solr/stress/partcatalog/index/core1/index.20120514101522/_5kgm.fdx")
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1103)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:418)
at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:470)
at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:321)
... 11 more
Caused by: java.io.IOException: read past EOF:
MMapIndexInput(path="/appl/solr/stress/partcatalog/index/core1/index.20120514101522/_5kgm.fdx")
at
org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:279)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
at
org.apache.lucene.store.MMapDirectory$MMapIndexInput.readInt(MMapDirectory.java:315)
at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:138)
at
org.apache.lucene.index.SegmentCoreReaders.openDocStores(SegmentCoreReaders.java:212)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:117)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:93)
at
org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:235)
at
org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:34)
at
org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:506)
at
org.apache.lucene.index.DirectoryReader.access$000(DirectoryReader.java:45)
at
org.apache.lucene.index.DirectoryReader$2.doBody(DirectoryReader.java:498)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:754)
at
org.apache.lucene.index.DirectoryReader.doOpenNoWriter(DirectoryReader.java:493)
at
org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:450)
at
org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:396)
at
org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:520)
at org.apache.lucene.index.IndexReader.reopen(IndexReader.java:697)
at
org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:414)
at
org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:425)
at
org.apache.solr.search.SolrIndexReader.reopen(SolrIndexReader.java:35)
at
org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:501)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1083)
... 14 more
2012-06-03 19:30:23,197 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Skipping download for
/appl/solr/stress/partcatalog/index/core1/index.20120514101522/_5kiq.tis
2012-06-03 19:30:23,198 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Skipping download for
/appl/solr/stress/partcatalog/index/core1/index.20120514101522/_5kit.tis
2012-06-03 19:30:23,198 INFO  [org.apache.solr.handler.SnapPuller]
(pool-12-thread-1) Skipping download for
/appl/solr/stress/partcatalog/index/core1/index.20120514101522/_5kgm.fdt
2012-06-03 19:30:23,198 INFO  [org.apa

Re: seeing errors during replication process on slave boxes - read past EOF

2012-06-04 Thread geeky2
hello,

i have shell scripts that handle all of the operational tasks.  

example:

curl -v http://${SERVER}.bogus.com:${PORT}/somecore/dataimport -F
command=full-import -F clean=${CLEAN} -F commit=${COMMIT} -F
optimize=${OPTIMIZE}




question about jmx value (avgRequestsPerSecond) output from solr

2012-06-27 Thread geeky2
hello all,

environment: centOS, solr 3.5, jboss 5.1

i have been using wily (a monitoring tool) to instrument our solr instances
in stress.

can someone help me to understand something about the jmx values being
output from solr?  please note - i am new to JMX.

problem / issue statement: for a given request handler (partItemDescSearch),
i see non-zero output from the jmx MBean metric avgRequestsPerSecond AFTER
my test harness has completed and there is NO request activity taking place
against this request handler (verified in the solr log files).

example scenario during testing:  during a test run - the test harness will
fire requests at request handler (partItemDescSearch) and all numbers look
fine.   then after the test harness is done - the metric
avgRequestsPerSecond does not immediately drop to 0.  instead - it appears
as if JMX is somehow averaging this metric and gradually trending it
downward toward 0.

continual checking of this metric (in the JMX tree - see screen shot) shows
the number trending downward instead of a hard stop at 0.

is this behavior - just the way jmx works?

thanks mark

http://lucene.472066.n3.nabble.com/file/n3991616/test1.bmp 






avgTimePerRequest JMX M-Bean displays with NaN instead of 0 - when no activity

2012-06-28 Thread geeky2
hello all,

environment: solr 3.5, jboss, wily

we have been setting up jmx monitoring for our solr installation.

while running tests - i noticed that of the 6 JMX M-Beans
(avgRequestsPerSecond, avgTimePerRequest, errors, requests, timeouts,
totalTime) ...

the avgTimePerRequest M-Bean was producing "NaN" when there was no search
activity.

all of the other M-Beans displayed a 0 (zero) when there was no search
activity.

we were able to compensate for this issue with custom scripting in wily on
our side.

can someone help me understand this inconsistency?

is this just WAD (works as designed)?

thanks for any help or insight





maxNumberOfBackups does not cleanup - jira 3361

2012-07-10 Thread geeky2

environment: solr 3.5

hello all,

i have a question on this jira -
https://issues.apache.org/jira/browse/SOLR-3361

the jira states that, "with "backupAfter"=commit, the backups do not get
cleaned up"

however - we are noticing this same issue in our environment, when using
optimize.

can someone confirm that this bug applies to "optimize" as well?

thank you



example:


<lst name="master">
  <str name="backupAfter">optimize</str>
</lst>




RE: maxNumberOfBackups does not cleanup - jira 3361

2012-07-10 Thread geeky2
thank you James - that is good to know.

for the short-term we'll just use cron and kill backup directories that are
older than x.
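The short-term cron cleanup can be sketched with find. BACKUP_DIR defaults to a scratch directory here; the snapshot.* naming matches Solr's default backup directory names, and the 7-day cutoff is an assumption:

```shell
# list replication backup directories older than 7 days
BACKUP_DIR="${BACKUP_DIR:-$(mktemp -d)}"
OLD=$(find "$BACKUP_DIR" -maxdepth 1 -type d -name 'snapshot.*' -mtime +7)
echo "stale backups: $OLD"
# echo "$OLD" | xargs rm -rf   # enable once the printed list looks right
```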

for the long-term, we'll just migrate to 4.0

thanks again




need help understanding times used in dataimport?command=status

2012-07-11 Thread geeky2
hello all,

i noticed something in one of our logs that periodically polls the status of
an data import.

can someone help me understand where / how the "times" for "Full Dump
Started" are derived?

here it shows the dataimport dump starting at 1:32




  
0
0
  
  

  db-data-config.xml

  
  status
  busy
  A command is still running...
  
0:0:8.182
2
18834
18818
0
*2012-07-11 01:32:18*
  
  This response format is experimental. It is likely to
change in the future.




however - here is shows the dump starting at 2:17



  
0
0
  
  

  db-data-config.xml

  
  status
  busy
  A command is still running...
  
0:45:8.373
3
8138060
0
*2012-07-11 02:17:11*
  
  This response format is experimental. It is likely to
change in the future.







  
0
0
  
  

  db-data-config.xml

  
  status
  idle
  
  
3
8528239
0
*2012-07-11 02:17:11*
Indexing completed. Added/Updated: 8464051 documents.
Deleted 0 documents.
*2012-07-11 02:21:17*
*2012-07-11 02:21:17*
8464051
*0:48:58.712*
  
  This response format is experimental. It is likely to
change in the future.






Re: need help understanding an issue with scoring

2012-08-23 Thread geeky2
update:

as an experiment - i changed the query to a wildcard (9030*) instead of an
explicit value (9030)

example:

QUERY="http://$SERVER.intra.searshc.com:${PORT}/solrpartscat/core1/select?qt=itemNoProductTypeBrandSearch&q=9030*&rows=2000&debugQuery=on&fl=*,score";

this resulted in a results list that appears much more rational from a sort
order perspective -

however - the wildcard query is not acceptable from a performance
standpoint.

any input or illumination would be appreciated ;)

thank you

itemNo, score, rankNo, partCnt

  [9030],1.0,10353,1
[90302   ],1.0,6849,1
[9030P   ],1.0,444,1
[903093  ],1.0,51,1
[9030430 ],1.0,47,1
[9030],1.0,37,1
[903057-9010 ],1.0,26,1
[903061-9010 ],1.0,20,1
[903046-9010 ],1.0,18,1
[903056-9010 ],1.0,14,1
[903095  ],1.0,14,1
[90303-MR1-000   ],1.0,14,1
[903097-9050 ],1.0,12,1
[903046-9011 ],1.0,12,1
[903097-9010 ],1.0,11,1
[903097-9040 ],1.0,11,1
[903063-9100 ],1.0,6,1
[903066-9011 ],1.0,6,1
[903098  ],1.0,3,1






Re: need help understanding an issue with scoring

2012-08-23 Thread geeky2
looks like the original complete list of the results did not get attached to
this thread 

here is a snippet of the list.

what i am trying to demonstrate, is the difference in scoring and
ultimately, sorting - and the breadth of documents (a few hundred) between
the two documents of interest (9030 and 90302)

thank you,

itemNo, score, rankNo, partCnt

  [9030],12.014701,10353,1
[9030],12.014701,37,1
[9030],12.014701,1,1
[9030   ],12.014701,0,167
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[9030],12.014701,0,1
[PC-9030],7.509188,0,169
[58-9030 ],7.509188,0,1
[9030-1R ],7.509188,0,1
[903028-9030 ],7.509188,0,1
[903139-9030 ],7.509188,0,1
[903091-9030 ],7.509188,0,1
[903099-9030 ],7.509188,0,1
[903153-9030 ],7.509188,0,1
[031-9030],7.509188,0,1
[308-9030],7.509188,0,1
[9030-6010   ],7.509188,0,1
[9030-6010   ],7.509188,0,1
[9030-6006   ],7.509188,0,1
[9030-6008   ],7.509188,0,1
[9030-6008   ],7.509188,0,1
[9030-6001   ],7.509188,0,1
[9030-6003   ],7.509188,0,1
[9030-6006   ],7.509188,0,1
[208568-9030 ],7.509188,0,1
[79-9030 ],7.509188,0,1
[33-9030 ],7.509188,0,1
[M-9030  ],7.509188,0,1

... a few hundred more ...

[LGQ9030PQ1 ],0.41475832,0,150
[LEQ9030PQ0 ],0.41475832,0,124
[LEQ9030PQ1 ],0.41475832,0,123
[CWE9030BCE ],0.41475832,0,115
[PJDS9030Z   ],0.29327843,0,1
[8A-CT9-030-010  ],0.29327843,0,1
[RDT9030A],0.29327843,0,1
[PJDG9030Z   ],0.29327843,0,1
[90302   ],0.20737916,6849,1





Re: Holy cow do I love 4.0's admin screen

2012-08-23 Thread geeky2
Andy,

we are not running solr 4.0 here in production.

can you elaborate on your comment related to your polling script written in
ruby and how the new data import status screen makes your polling app
obsolete?

i wrote my own polling app (in shell) to work around the very same issues:

http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-td3987110.html

thx for the post





using tie parameter of edismax to raise a score (disjunction max query)?

2012-08-23 Thread geeky2

Hello all,

this "more specific" question is related to my earlier post at:
http://lucene.472066.n3.nabble.com/need-help-understanding-an-issue-with-scoring-td4002897.html

i am reading here about the tie parameter:
http://wiki.apache.org/solr/ExtendedDisMax?highlight=%28edismax%29#tie_.28Tie_breaker.29

*can i use the edismax, tie= parameter, to "raise" the following score?*

my goal is to raise the total score of this document (see score snippet
below) to 9.11329.

to do this - would i use tie=0.0 to make a pure "disjunction max query" --
only the maximum scoring sub query contributes to the final score?
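For reference, the DisMax tie-breaker combines sub-query scores as final = max + tie * (sum of the other sub-scores), so tie=0.0 does collapse to a pure disjunction max. A tiny arithmetic sketch - the numbers are illustrative, not this document's scores:

```shell
# final = max + tie * (other sub-scores); with TIE=0.0 this is just MAX
MAX=9.11329; OTHER=0.20738; TIE=0.1
FINAL=$(awk -v m="$MAX" -v o="$OTHER" -v t="$TIE" 'BEGIN { printf "%.6f", m + t * o }')
echo "$FINAL"
```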


  
*0.20737723* = (MATCH) max of:
  0.20737723 = (MATCH) weight(itemNo:9030^0.9 in 1796597), product of:
0.022755474 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  0.0027743944 = queryNorm
*9.11329* = (MATCH) fieldWeight(itemNo:9030 in 1796597), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  1.0 = fieldNorm(field=itemNo, doc=1796597)


thank you










need help understanding an issue with scoring

2012-08-23 Thread geeky2
hello,

i am trying to understand the "debug" output from a query, and specifically
- how scores for two (2) documents are derived and why they are so far
apart.

the user is entering 9030 for the search

the search is rightfully returning the top document, however - the question
is why the document with id 90302 is so far down on the list.

i have attached a text file i generated with xslt, pulling the document
information.  the text file has the itemNo, the rankNo and the partCnt.  the
sort order of the response handler is:

  score desc, rankNo desc, partCnt desc



if you look at the text file - you will see that 90302 is 174'th on the
list!  90302 has a rankNo of 6849 - and i would think that would drive it
much higher on the list and therefore much closer to 9030.

what is happening from a business perspective - is - 9030 is one of our top
selling parts as is 90302.  they need to be closer together in the results
instead of separated by 170+ documents that have a rankNo of 0.

i have also CnP the response handler that is being used - below

can someone help me understand the scoring so i can correct this?

this is the scoring for the two documents:

  
12.014634 = (MATCH) max of:
  0.20737723 = (MATCH) weight(itemNo:9030^0.9 in 2308681), product of:
0.022755474 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  0.0027743944 = queryNorm
9.11329 = (MATCH) fieldWeight(itemNo:9030 in 2308681), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  1.0 = fieldNorm(field=itemNo, doc=2308681)
  12.014634 = (MATCH) fieldWeight(itemNoExactMatchStr:9030 in 2308681),
product of:
1.0 = tf(termFreq(itemNoExactMatchStr:9030)=1)
12.014634 = idf(docFreq=140, maxDocs=8566704)
1.0 = fieldNorm(field=itemNoExactMatchStr, doc=2308681)





  
0.20737723 = (MATCH) max of:
  0.20737723 = (MATCH) weight(itemNo:9030^0.9 in 1796597), product of:
0.022755474 = queryWeight(itemNo:9030^0.9), product of:
  0.9 = boost
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  0.0027743944 = queryNorm
9.11329 = (MATCH) fieldWeight(itemNo:9030 in 1796597), product of:
  1.0 = tf(termFreq(itemNo:9030)=1)
  9.11329 = idf(docFreq=2565, maxDocs=8566704)
  1.0 = fieldNorm(field=itemNo, doc=1796597)
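
The idf numbers in the explain output above come from Lucene 3.x's DefaultSimilarity, where idf = 1 + ln(maxDocs / (docFreq + 1)). A quick sketch that reproduces the values shown:

```python
import math

# Lucene 3.x DefaultSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
# Reproduces the idf values shown in the explain output above.
def idf(doc_freq, max_docs):
    return 1.0 + math.log(max_docs / (doc_freq + 1))

print(round(idf(2565, 8566704), 5))  # 9.11329   (itemNo:9030)
print(round(idf(140, 8566704), 6))   # 12.014634 (itemNoExactMatchStr:9030)
```

The rarer itemNoExactMatchStr:9030 term (docFreq=140 vs 2565) is what gives the first document its much larger "max of" sub-score.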



  <requestHandler name="itemNoProductTypeBrandSearch" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="echoParams">all</str>
      <int name="rows">10</int>
      <str name="qf">itemNoExactMatchStr^30 itemNo^.9 divProductTypeDesc^.8
        brand^.5</str>
      <str name="q.alt">*:*</str>
      <str name="sort">score desc, rankNo desc, partCnt desc</str>
      <str name="facet">true</str>
      <str name="facet.field">itemDescFacet</str>
      <str name="facet.field">brandFacet</str>
      <str name="facet.field">divProductTypeIdFacet</str>
    </lst>
  </requestHandler>

 
thank you for any help




--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-understanding-an-issue-with-scoring-tp4002897.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help understanding an issue with scoring

2012-08-23 Thread geeky2
hello,


This is the query I am using:

 cat goquery.sh
#!/bin/bash

SERVER=$1
PORT=$2


QUERY="http://$SERVER.blah.blah.com:${PORT}/solrpartscat/core1/select?qt=itemNoProductTypeBrandSearch&q=9030&rows=2000&debugQuery=on&fl=*,score"

# Quote the URL so the shell does not glob-expand the * in fl=*,score.
curl -v "$QUERY"




--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-understanding-an-issue-with-scoring-tp4002897p4002969.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help understanding an issue with scoring

2012-08-28 Thread geeky2
Chris, Jack,

thank you for the detailed replies and help ;)






--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-understanding-an-issue-with-scoring-tp4002897p4003782.html
Sent from the Solr - User mailing list archive at Nabble.com.


need help with exact match search

2012-10-19 Thread geeky2
environment: solr 3.5

Hello,

I have a query for an exact match that is bringing back one (1) additional
record that is NOT an exact match.

When I do an exact match search for 404, I should get back three (3)
documents, *but I get back an additional record, with an
itemModelNoExactMatchStr of DUS-404-19*.

Can someone help me understand what I am missing or not setting up
correctly?


response from solr with 4 documents 



  
0
1

  itemModelNoExactMatchStr asc
  itemType:2
  all
  itemModelNoExactMatchStr^30.0
  *:*
  50
  edismax
  true
*  itemModelNoExactMatchStr:404*
  modelItemNoSearch
  50
  false

  
  **

  
Kitchen Equipment*
  
  0212020
  0212020,0431  ,404   

  ELECTRIC GENERAL SLICER WITH VACU BASE
 * 404*
  404   

  2
  13
  
GENERAL
  
  0431  
  0


  
Vacuum, Canister
  
  0642000
  0642000,0517  ,404   

  HOOVER 
  404
 * 404   
*
  2
  48
  
HOOVER
  
  0517  
  0


  
Power roller
  
  0733200
  0733200,1164  ,404   

  POWER PAINTER
  404
 * 404   
*
  2
  39
  
WAGNER
  
  1164  
  0


  
Dishwasher^
  
  013
  013,0164  ,DUS-404-19

  DISHWASHERS
  DUS-404-19 
  *DUS-404-19
*
  2
  185
  
CALORIC
  
  0164  
  0

  
  
itemModelNoExactMatchStr:404
itemModelNoExactMatchStr:404
+itemModelNoExactMatchStr:404
+itemModelNoExactMatchStr:404

  
10.053003 = (MATCH) fieldWeight(itemModelNoExactMatchStr:404 in 4745495),
product of:
  1.0 = tf(termFreq(itemModelNoExactMatchStr:404)=1)
  10.053003 = idf(docFreq=971, maxDocs=8304922)
  1.0 = fieldNorm(field=itemModelNoExactMatchStr, doc=4745495)

  
10.053003 = (MATCH) fieldWeight(itemModelNoExactMatchStr:404 in 4781972),
product of:
  1.0 = tf(termFreq(itemModelNoExactMatchStr:404)=1)
  10.053003 = idf(docFreq=971, maxDocs=8304922)
  1.0 = fieldNorm(field=itemModelNoExactMatchStr, doc=4781972)

  
10.053003 = (MATCH) fieldWeight(itemModelNoExactMatchStr:404 in 8186768),
product of:
  1.0 = tf(termFreq(itemModelNoExactMatchStr:404)=1)
  10.053003 = idf(docFreq=971, maxDocs=8304922)
  1.0 = fieldNorm(field=itemModelNoExactMatchStr, doc=8186768)

  
5.0265017 = (MATCH) fieldWeight(itemModelNoExactMatchStr:404 in 4665718),
product of:
  1.0 = tf(termFreq(itemModelNoExactMatchStr:404)=1)
  10.053003 = idf(docFreq=971, maxDocs=8304922)
  0.5 = fieldNorm(field=itemModelNoExactMatchStr, doc=4665718)


ExtendedDismaxQParser



  itemType:2


  itemType:2


(timing section of the debug output elided)

  





I have looked at some of the threads up here related to this topic, but I
still do not understand why the additional document is coming back.

here is my query:

http://someserver/somecore/select?qt=modelItemNoSearch&q=itemModelNoExactMatchStr:404&debugQuery=true&rows=50
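
One possible cause, purely an assumption on my part since the analyzer chain isn't visible: if the field type behind itemModelNoExactMatchStr still splits on punctuation the way a WordDelimiterFilter does, then "DUS-404-19" emits a standalone "404" token at index time. A crude sketch of that effect (the splitting rule and the catenated form are assumptions, not taken from the actual schema):

```python
import re
import math

# Crude stand-in for word-delimiter style tokenization: split the value
# on every run of non-alphanumeric characters.
def split_like_word_delimiter(value):
    return [t for t in re.split(r"[^0-9A-Za-z]+", value) if t]

tokens = split_like_word_delimiter("DUS-404-19")
print(tokens)           # ['DUS', '404', '19']
print("404" in tokens)  # True -- so a query for 404 matches this document

# If a catenated form (e.g. a hypothetical "DUS40419") were emitted as
# well, the field would hold 4 terms, and lengthNorm = 1 / sqrt(4) = 0.5,
# which matches the 0.5 fieldNorm shown for doc 4665718 in the explain
# output.
print(1 / math.sqrt(4))  # 0.5
```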


here is my RH from the solrconfig.xml

  <requestHandler name="modelItemNoSearch" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="echoParams">all</str>
      <int name="rows">10</int>
      <str name="qf">itemModelNoExactMatchStr^30.0</str>
      <str name="q.alt">*:*</str>
    </lst>
    <lst name="appends">
      <str name="fq">itemType:2</str>
      <str name="sort">itemModelNoExactMatchStr asc</str>
    </lst>
    <lst name="invariants">
      <str name="facet">false</str>
    </lst>
  </requestHandler>


here is the field, copyField and text type from schema.xml


 


  



  
  



  






--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-with-exact-match-search-tp4014832.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help with exact match search

2012-10-19 Thread geeky2
hello jack,

thank you very much for the reply - i will re-test and let you know.

really appreciate it ;)

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-with-exact-match-search-tp4014832p4014848.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: need help with exact match search

2012-10-22 Thread geeky2
hello jack,

that was it!

thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-help-with-exact-match-search-tp4014832p4015103.html
Sent from the Solr - User mailing list archive at Nabble.com.


large text blobs in string field

2012-11-02 Thread geeky2
hello,

environment - solr 3.5

I would like to know if anyone is using the technique of placing large text
blobs into a "non-indexed" string field and, if so, whether there are any
good/bad aspects to consider.

We are thinking of doing this to represent a 1:M relationship, with the
"many" side being represented as a string in the schema (probably comprised
of either xml or json objects).

We are looking at the classic part : model scenario, where the client would
look up a part and the document would contain a string field with
potentially 200+ model numbers.  Edge cases for this could be 400+ model
numbers.
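
A minimal sketch of the shape we have in mind, using json for the blob; all field names and model numbers here are hypothetical:

```python
import json

# The "many" side (model numbers for a part) serialized into a single
# stored, non-indexed string field on the part document.
models = ["MOD-0001", "MOD-0002", "MOD-0003"]  # hypothetical model numbers

part_doc = {
    "id": "part-42",                  # hypothetical unique key
    "modelBlob": json.dumps(models),  # stored only; never indexed/analyzed
}

# The client deserializes the blob after retrieving the document:
restored = json.loads(part_doc["modelBlob"])
print(restored == models)  # True
```

The obvious trade-off, as far as we can see, is that the blob inflates the stored-field size and must be fully re-serialized whenever the model list changes, in exchange for avoiding a second query or join at read time.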

thx

 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/large-text-blobs-in-string-field-tp4017882.html
Sent from the Solr - User mailing list archive at Nabble.com.

