Re: complex multi valued fields

2010-01-18 Thread Shalin Shekhar Mangar
On Tue, Jan 12, 2010 at 7:55 PM, Adamsky, Robert wrote:

>
> I have a document that has a multi-valued field where each value in
> the field itself is comprised of two values itself.  Think of an invoice
> doc
> with multi value line items - each line item having quantity and product
> name.
>
> One option I see is to have a line item multi value field and when
> producing
> the document to pass to Solr, concat the quantity and desc and put it in
> the multi
> value field.
>
> My preference would be the ability to define such complex multi valued
> fields
> out of the box.  Is that supported in a Solr 1.4 environment?  Basically a
> field
> type that allows you to define the other fields that make up a field.
>
> This could look like something like this in schema.xml if supported:
>
> 
>  
>  
> 
>
>
Well, no, at least not with Solr 1.4. This is a new feature being added to
Solr and it is already in trunk. The existing poly field does not let you
have different types for the individual items, but you should be able to
write your own poly field for this task.

-- 
Regards,
Shalin Shekhar Mangar.


Re: schema question

2010-01-18 Thread Uri Boness
Yeah, the SignatureUpdateProcessorFactory can probably do the trick, but
you still need to write a custom Signature.
(we should really offer a simple "ConcatSignature" implementation for 
generating predictable combination keys)

+1

Cheers,
Uri

Chris Hostetter wrote:

: TemplateTransformer. Otherwise, if you really must do it in Solr you can write
: your own custom UpdateProcessor and plug it in:

Can't the SignatureUpdateProcessorFactory handle this using something like 
Lookup3Signature?


(we should really offer a simple "ConcatSignature" implementation for 
generating predictable combination keys)


http://wiki.apache.org/solr/Deduplication#solrconfig.xml


-Hoss


  


Re: NullPointerException in ReplicationHandler.postCommit + question about compression

2010-01-18 Thread Shalin Shekhar Mangar
On Wed, Jan 13, 2010 at 12:51 AM, Stephen Weiss wrote:

> Hi Solr List,
>
> We're trying to set up java-based replication with Solr 1.4 (dist tarball).
>  We are running this to start with on a pair of test servers just to see how
> things go.
>
> There's one major problem we can't seem to get past.  When we replicate
> manually (via the admin page) things seem to go well.  However, when
> replication is triggered by a commit event on the master, the master gets a
> NullPointerException and no replication seems to take place.
>
>  SEVERE: java.lang.NullPointerException
>>at
>> org.apache.solr.handler.ReplicationHandler$4.postCommit(ReplicationHandler.java:922)
>>at
>> org.apache.solr.update.UpdateHandler.callPostCommitCallbacks(UpdateHandler.java:78)
>>at
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:411)
>>
>
> Does anyone know off the top of their head what this might indicate, or
> know what further troubleshooting steps we should be taking to isolate the
> issue?
>

That is a strange one. It looks like the latest commit point was null. Do
you have a deletion policy section in your solrconfig.xml? Are you always
able to reproduce the exception?


>
> Also, on a (probably) unrelated topic, we're kinda confused by this section
> of the slave config:
>
>
>        <str name="compression">internal</str>
>
> Since we *are* on a LAN, what exactly should we be doing here?  The
> language is somewhat unclear... I thought that meant that we should just
> comment out the line altogether, but others think it means that we should
> leave it set to "internal".  We get that compression is probably unnecessary
> for our more vanilla setup, we're just not 100% sure how to express that
> correctly.
>
>
During our tests we found that enabling compression on a gigabit ethernet
actually degrades transfer rate because of the compress/de-compress
overhead. Just comment out that line to disable compression.
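For illustration, the relevant slave section with that line commented out might look like this (a sketch based on the standard wiki replication config; the masterUrl here is a placeholder):

```xml
<lst name="slave">
    <str name="masterUrl">http://master:8080/solr/replication</str>
    <str name="pollInterval">00:00:20</str>
    <!-- compression disabled on a LAN: comment the line out entirely
    <str name="compression">internal</str>
    -->
</lst>
```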

-- 
Regards,
Shalin Shekhar Mangar.


Updating a single field in a Solr document

2010-01-18 Thread Raghuveer Kancherla
Hi,
I have 2 fields: one that captures the category of the document and another
that holds the pre-processed text of the document. The text of the document is
fairly large.
The category of the document changes often while the text remains the same.
Search happens on both fields.

The problem is that I have to index both the text and the category each time
the category changes. The text being large obviously makes this suboptimal. Is
there a patch or a trick to avoid indexing the text field every time?

Thanks,
Raghu


TermsComponent, multiple fields, total count

2010-01-18 Thread Lukas Kahwe Smith
Hi,

I want to use TermsComponent both for autocomplete suggestions and for showing
a search "quality" meter, i.e. indicating the total number of matches (it
doesn't need to be accurate, just a ballpark figure, especially if there are a
lot of matches). I also want to match multiple fields at once.

I guess I can just issue multiple requests in order to get multiple fields
searched. But the total number is a bit trickier. I can of course simply add up
the counts for the limited number of results, but that may be too inaccurate,
and it also seems like Lucene/Solr should be able to give me this number more
efficiently.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: NullPointerException in ReplicationHandler.postCommit + question about compression

2010-01-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
When you copy-paste config from the wiki, copy just what you need,
excluding documentation and comments.

On Wed, Jan 13, 2010 at 12:51 AM, Stephen Weiss  wrote:
> Hi Solr List,
>
> We're trying to set up java-based replication with Solr 1.4 (dist tarball).
>  We are running this to start with on a pair of test servers just to see how
> things go.
>
> There's one major problem we can't seem to get past.  When we replicate
> manually (via the admin page) things seem to go well.  However, when
> replication is triggered by a commit event on the master, the master gets a
> NullPointerException and no replication seems to take place.
>
>> SEVERE: java.lang.NullPointerException
>>        at
>> org.apache.solr.handler.ReplicationHandler$4.postCommit(ReplicationHandler.java:922)
>>        at
>> org.apache.solr.update.UpdateHandler.callPostCommitCallbacks(UpdateHandler.java:78)
>>        at
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:411)
>>        at
>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
>>        at
>> org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:169)
>>        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
>>        at
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>>        at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>>        at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
>>        at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
>>        at
>> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115)
>>        at
>> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361)
>>        at
>> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>        at
>> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>>        at
>> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>        at
>> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>>        at
>> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>        at
>> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>        at
>> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>        at org.mortbay.jetty.Server.handle(Server.java:324)
>>        at
>> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>>        at
>> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
>>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:741)
>>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:213)
>>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
>>        at
>> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>>        at
>> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>>
>
>
> This is the master config:
>
>  <requestHandler name="/replication" class="solr.ReplicationHandler">
>    <lst name="master">
>        <str name="replicateAfter">commit</str>
>
>        <str name="confFiles">solrconfig_slave.xml:solrconfig.xml,schema.xml,synonyms.txt,stopwords.txt,elevate.xml</str>
>
>        <str name="commitReserveDuration">00:00:10</str>
>    </lst>
>  </requestHandler>
>
>
> and... the slave config:
>
>  <requestHandler name="/replication" class="solr.ReplicationHandler">
>    <lst name="slave">
>
>        <str name="masterUrl">http://hostname.obscured.com:8080/solr/calendar_core/replication</str>
>
>        <str name="pollInterval">00:00:20</str>
>
>        <str name="compression">internal</str>
>
>        <str name="httpConnTimeout">5000</str>
>        <str name="httpReadTimeout">1</str>
>    </lst>
>  </requestHandler>
>
>
> Does anyone know off the top of their head what this might indicate, or know
> what further troubleshooting steps we should be taking to isolate the issue?
>
> Also, on a (probably) unrelated topic, we're kinda confused by this section
> of the slave config:
>
>        
>        <str name="compression">internal</str>
>
> Since we *are* on a LAN, what exactly should we be doing here?  The language
> is somewhat unclear... I thought that meant that we should just comment out
> the line altogether, but others think it means that we should leave it set
> to "internal".  We get that compression is probably unnecessary for our more
> vanilla setup, we're just not 100% sure how to express that correctly.
>
> Thanks in advance for any advice!
>
> --
> Steve
>



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-18 Thread Erick Erickson
Nope. The problem is that SOLR needs to create a ranked
list. It has to search the entire corpus every time. There's
always the possibility that the very last document examined
would rank highest.

So the search times should be unchanged, no matter how
many rows you return, but the time to assemble and transmit
the return packet will vary.

HTH
Erick

On Mon, Jan 18, 2010 at 12:05 AM, Gora Mohanty  wrote:

> Hi,
>
> Does specifying a smaller number of rows, e.g., with
> q=test&start=0&rows=XX affect the query efficiency?
> I realise, of course, that increasing the number of
> rows will lead to inefficiencies in data transfer.
> This is with a standard Solr setup, without sharding,
> etc.
>
> Seems like a newbie query, but I cannot seem to find
> documentation on it, nor did my experiments with measuring
> search times show any definitive results.
>
> Regard,
> Gora
>


multi field search

2010-01-18 Thread Lukas Kahwe Smith
Hi,

I realize that I can copy all fields together into one multiValued field and
set that as the defaultSearchField. However, in that case I cannot leverage the
various custom analyzers I want to apply to the fields separately (name should
use double metaphone, street should use the word splitter, etc.). I can of
course also do an OR query. But it would be nice to be able to do:

q=*:foo

and that would simply search all fields against the query "foo".

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: multi field search

2010-01-18 Thread Sven Maurmann

Hi,

you might want to use the Dismax-Handler.

Sven

--On Monday, January 18, 2010 02:58:09 PM +0100 Lukas Kahwe Smith 
 wrote:



Hi,

I realize that I can copy all fields together into one multiValue
field and set that as the defaultSearchField. However in that case
I cannot leverage the various custom analyzers I want to apply to
the fields separately (name should use doublemetaphone, street
should use the world splitter etc.). I can of course also do an OR
query as well. But it would be nice to be able to do:

q=*:foo

and that would simply search all fields against the query "foo".

regards,
Lukas Kahwe Smith
m...@pooteeweet.org
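For illustration, the dismax handler suggested above lets you search several fields at once, each analyzed with its own chain; a sketch of such a request (field names are only examples):

```
http://localhost:8983/solr/select?qt=dismax&q=foo&qf=name+street+city
```

The qf parameter lists the fields to search, so each field keeps its own analyzer, which is exactly what the single copyField approach loses.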


Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-18 Thread Yonik Seeley
On Mon, Jan 18, 2010 at 8:57 AM, Erick Erickson  wrote:
> Nope. The problem is that SOLR needs to create a ranked
> list. It has to search the entire corpus every time. There's
> always the possibility that the very last document examined
> would rank highest.

There's also the priority queue used to collect the top matches that
needs to remain ordered.
Finding and scoring matching documents will normally dominate the
time, but if "N" becomes large (for collecting the top N matches), the
priority queue operations can become significant.

-Yonik
http://www.lucidimagination.com
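To illustrate why a large N matters: the collector keeps the current top N in a bounded min-heap, and every hit that beats the heap's minimum pays O(log N) queue work. A toy sketch in plain Java, not Lucene's actual implementation:

```java
import java.util.PriorityQueue;

/** Collect the top-N largest scores with a bounded min-heap,
 *  mirroring (in spirit) how a top-docs collector works. */
public class TopN {
    public static float[] topN(float[] scores, int n) {
        PriorityQueue<Float> heap = new PriorityQueue<>(n); // natural order = min-heap
        for (float s : scores) {
            if (heap.size() < n) {
                heap.offer(s);
            } else if (s > heap.peek()) { // only competitive scores touch the queue
                heap.poll();
                heap.offer(s);
            }
        }
        // drain the heap into descending order
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) out[i] = heap.poll();
        return out;
    }

    public static void main(String[] args) {
        float[] top = topN(new float[]{0.2f, 0.9f, 0.5f, 0.7f, 0.1f}, 3);
        System.out.println(java.util.Arrays.toString(top)); // [0.9, 0.7, 0.5]
    }
}
```

With rows=10 the heap stays tiny; with rows=10000 every competitive hit sifts through a much deeper heap, which is the point about priority queue operations becoming significant.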


Re: analyzer type="query" with NGramTokenFilterFactory forces phrase query

2010-01-18 Thread Wangsheng Mei
I faced a similar problem when I was dealing with Chinese word search.
By simply adding a PositionFilter at the end of the analyzer, the damn phrase
query disappeared and was replaced by term queries, which is what I expected.
That's very nice, thank you very much!

Note that Chinese word segmentation is very different from English word
segmentation, in that the latter uses whitespace as the delimiter.
So if I search "中国汉字", Solr (Lucene) will treat it as a phrase search because
it doesn't see any whitespace within the query string. But in fact, it should
be considered a BooleanQuery (OR) of two term queries in this case.
Anyway, I am confused by Solr (Lucene)'s behavior on this. Is it a bug?
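For reference, the arrangement being described, an n-gram filter followed by PositionFilter on the query side, might be sketched like this (tokenizer choice and gram sizes are illustrative):

```xml
<fieldType name="ngram_text" class="solr.TextField">
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="3"/>
    <!-- PositionFilter flattens token positions so the query parser
         builds a BooleanQuery of terms instead of a PhraseQuery -->
    <filter class="solr.PositionFilterFactory"/>
  </analyzer>
</fieldType>
```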

2010/1/1 AHMET ARSLAN 

> > "if this is the expected behaviour is
> > there a way to override it?"[1]
> >
> > [1] me
>
>
> Using PositionFilterFactory[1] after NGramFilterFactory can yield parsed
> query:
>
> field:fa field:am field:mi field:il field:ly field:fam field:ami field:mil
> field:ily
>
> [1]
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory
>
>
>
>


-- 
梅旺生


Re: analyzer type="query" with NGramTokenFilterFactory forces phrase query

2010-01-18 Thread Robert Muir
the way that queryparser treats whitespace is also a problem for
languages that have words that contain spaces, like vietnamese.
i think it also causes grief for multi-word synonyms, such that they
don't work correctly at querytime:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter

2010/1/18 Wangsheng Mei :
> I faced a similar problem when I was dealing with Chinese words search.
> By simply adding a PositionFilter at the end of analyzer, the damn phrase
> query disappeared  and replaced by term queries which is what I've expected.
> That's very nice, thank you very much!
>
> Note that Chinese words segmentation is very different from English words
> segmentation in that the latter use a whitespace as the delimiter.
> So if I search "中国汉字", solr(lucene) will treat is as a phrase search because
> it doesn't see any whitespace within the query string.But in fact, it should
> be considered as BooleanQuery(OR) with two term queries search in this case.
> Anyway, I am confused by solr(lucene)'s behavior on this. Is it a bug?
>
> 2010/1/1 AHMET ARSLAN 
>
>> > "if this is the expected behaviour is
>> > there a way to override it?"[1]
>> >
>> > [1] me
>>
>>
>> Using PositionFilterFactory[1] after NGramFilterFactory can yield parsed
>> query:
>>
>> field:fa field:am field:mi field:il field:ly field:fam field:ami field:mil
>> field:ily
>>
>> [1]
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory
>>
>>
>>
>>
>
>
> --
> 梅旺生
>



-- 
Robert Muir
rcm...@gmail.com


Re: Multi-word Terms

2010-01-18 Thread shamrockstores

Thank you.

While interesting, what I'm really after is a programmatic way to get at
multi-word terms and their frequencies from a given document.

Is this possible?



Ahmet Arslan wrote:
> 
>> What is the best way to essentially get a term frequency
>> vector for
>> multi-word terms?
> 
> To use solr.ShingleFilterFactory and TermVectorComponent.
> 
> http://wiki.apache.org/solr/TermVectorComponent
> 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
> 
> 
>   
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Multi-word-Terms-tp27182199p27214838.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-18 Thread Walter Underwood
"Search the entire corpus" makes it sound like Solr is grepping the documents. 
The corpus has already been converted to an inverted index before the search, 
so only the terms in the query are retrieved.

For basic, relevance-sorted search, there are two kinds of work done by Solr: 
work per query term and work per document. The work per query term depends on 
the number of query terms. The work per document depends on the number of 
documents (rows) requested and the number of fields requested.

Search times are not "unchanged" when you request more documents. If you 
request 10K documents, your search will be much, much slower than if you 
request 10.

wunder

On Jan 18, 2010, at 5:57 AM, Erick Erickson wrote:

> Nope. The problem is that SOLR needs to create a ranked
> list. It has to search the entire corpus every time. There's
> always the possibility that the very last document examined
> would rank highest.
> 
> So the search times should be unchanged, no matter how
> many rows you return, but the time to assemble and transmit
> the return packet will vary.
> 
> HTH
> Erick
> 
> On Mon, Jan 18, 2010 at 12:05 AM, Gora Mohanty  wrote:
> 
>> Hi,
>> 
>> Does specifying a smaller number of rows, e.g., with
>> q=test&start=0&rows=XX affect the query efficiency?
>> I realise, of course, that increasing the number of
>> rows will lead to inefficiencies in data transfer.
>> This is with a standard Solr setup, without sharding,
>> etc.
>> 
>> Seems like a newbie query, but I cannot seem to find
>> documentation on it, nor did my experiments with measuring
>> search times show any definitive results.
>> 
>> Regard,
>> Gora
>> 



Re: Multi-word Terms

2010-01-18 Thread Ahmet Arslan
> Thank you.
> 
> While interesting what I'm really after is a programmatic
> way to get at
> multi-word terms and their frequencies from a given
> document.  
> 
> Is this possible?
> 

What do you mean by programmatic way? You mean without indexing? Multi-word 
terms means phrases right? Like "tap water"?

you can use this field type to index your documents:

<fieldType name="shingle_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory"/>
  </analyzer>
</fieldType>

and if you register TermsComponent in solrconfig.xml by doing:

<searchComponent name="termsComponent" class="org.apache.solr.handler.component.TermsComponent"/>

<requestHandler name="/terms" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <str name="terms.fl">shingle_text_field</str>
  </lst>
  <arr name="components">
    <str>termsComponent</str>
  </arr>
</requestHandler>
http://localhost:8983/solr/terms will give you multi-word terms sorted by term
frequency. You can also use TermVectorComponent to get the term frequencies of
the multi-word terms of a particular document.
Additionally, admin/schema.jsp shows the top n terms if you want.





filter query parsing problem

2010-01-18 Thread John Thorhauer
I am submitting a query and it seems to be parsing incorrectly.  Here
is the query with the debug output.  Any ideas what the problem is:


  
<arr name="filter_queries">
  <str>((VLog:814124 || VLog:12342) && (PublisherType:U || PublisherType:A))</str>
</arr>
<arr name="parsed_filter_queries">
  <str>+(VLog:814124 VLog:12342) +PublisherType:u</str>
</arr>

I would have thought that the parsed filter would have looked like this:
+(VLog:814124 VLog:12342) +(PublisherType:u PublisherType:a)

Thanks for the help,
John Thorhauer


Re: filter query parsing problem

2010-01-18 Thread Ahmet Arslan
> I am submitting a query and it seems
> to be parsing incorrectly.  Here
> is the query with the debug output.  Any ideas what
> the problem is:
> 
> 
>   
>     ((VLog:814124 || VLog:12342) &&
> (PublisherType:U || PublisherType:A))
>   
> 
> 
>     +(VLog:814124 VLog:12342)
> +PublisherType:u
> 
> 
> I would have thought that the parsed filter would have
> looked like this:
>         +(VLog:814124
> VLog:12342) +(PublisherType:u PublisherType:a)

It seems that StopFilterFactory is eating "A", which is a stop word. You can
remove StopFilterFactory from the analyzer chain of PublisherType's field type,
or you can remove the entry "a" from stopwords.txt.





Specify logging options from command line in Solr 1.4?

2010-01-18 Thread Mat Brown
Hi all,

Wondering if anyone can point me at a simple way to specify basic
logging options (log level, log file location) when starting the Solr
example jar from the command line.

As a bit of background, I maintain a Ruby library for Solr called
Sunspot that ships with a Solr installation for ease of use. Sunspot
includes a script for starting Solr with various options, including
logging options. With Solr 1.3, I was able to write out a
logging.properties file and then set the system property
java.util.logging.config.file via the command line; this no longer
seems to work with Solr 1.4.

I understand that Solr 1.4 has moved to SLF4J, but I haven't been able
to find a readily available answer to the above question in the SLF4J
or Solr logging documentation. To be honest, I've always found logging
in Java rather mystifying.

Any help much appreciated!
Mat


Re: Specify logging options from command line in Solr 1.4?

2010-01-18 Thread Mark Miller
Mat Brown wrote:
> Hi all,
>
> Wondering if anyone can point me at a simple way to specify basic
> logging options (log level, log file location) when starting the Solr
> example jar from the command line.
>
> As a bit of background, I maintain a Ruby library for Solr called
> Sunspot that ships with a Solr installation for ease of use. Sunspot
> includes a script for starting Solr with various options, including
> logging options. With Solr 1.3, I was able to write out a
> logging.properties file and then set the system property
> java.util.logging.config.file via the command line; this no longer
> seems to work with Solr 1.4.
>
> I understand that Solr 1.4 has moved to SLF4J, but I haven't been able
> to find a readily available answer to the above question in the SLF4J
> or Solr logging documentation. To be honest, I've always found logging
> in Java rather mystifying.
>
> Any help much appreciated!
> Mat
>   
By default, even though Solr uses SLF4J, it will actually use the Java
Util logging Impl:

http://wiki.apache.org/solr/SolrLogging

So you just specify a util logging properties file on the command line with:

-Djava.util.logging.config.file=myLoggingConfigFilePath 

An example being:

handlers=java.util.logging.FileHandler, java.util.logging.ConsoleHandler

# Default global logging level. 
# Loggers and Handlers may override this level 
.level=INFO

java.util.logging.ConsoleHandler.level=INFO
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter


# --- FileHandler --- 
# Override of global logging level 
java.util.logging.FileHandler.level=ALL

# Naming style for the output file: 
# (The output file is placed in the directory 
# defined by the "user.home" System property.) 
java.util.logging.FileHandler.pattern=%h/java%u.log

# Limiting size of output file in bytes: 
java.util.logging.FileHandler.limit=5

# Number of output files to cycle through, by appending an 
# integer to the base file name: 
java.util.logging.FileHandler.count=1

# Style of output (Simple or XML): 
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter


-- 
- Mark

http://www.lucidimagination.com





Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-18 Thread Gora Mohanty
On Mon, 18 Jan 2010 13:21:27 -0800
Walter Underwood  wrote:
[...]
> For basic, relevance-sorted search, there are two kinds of work
> done by Solr: work per query term and work per document. The work
> per query term depends on the number of query terms. The work per
> document depends on the number of documents (rows) requested and
> the number of fields requested.
[...]

Thanks to both you and Erick for your responses. This makes a lot
more sense now.

Regards,
Gora


Tokenization and wild card search

2010-01-18 Thread johnmunir

Hi,
 
I have an issue and I'm not sure how to address it, so I hope someone can help 
me.
 
I have the following text in one of my fields: "ABC_Expedition_ERROR". When I
search on it like MyField:ABC_Expedition_ERROR (without quotes) it fails to
find this word "ABC_Expedition_ERROR", which I think is due to tokenization
because of the underscores.
 
My solution is MyField:"ABC_Expedition_ERROR" (with quotes around the word).
This works fine. But then, how do I search on "ABC_Expedition_ERROR" with a
wildcard? For example, MyField:ABC_Expedition* will not work.
 
Any help is greatly appreciated.
 
Thanks.
 
-- JM
 


How can I boost bq in FieldQParserPlugin?

2010-01-18 Thread Wangsheng Mei
Hi, ALL.

My original query is:
http://myhost:8080/solr/select?q=ipod&bq=userId:12345^0.5
&fq=&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&debugQuery=on&explainOther=&hl.fl=

It works this way.
But I would like to place the bq phrase in the default solrconfig.xml
configuration to make the query string more brief, so I did the following:
http://myhost:8080/solr/select?q=ipod&bq={!field f=userId v=$qq}&qq=12345
&fq=&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&debugQuery=on&explainOther=&hl.fl=

However, the field query parser doesn't accept a boost parameter, so what shall
I do?

Thanks in advance.




-- 
梅旺生


build path

2010-01-18 Thread Siv Anette Fjellkårstad
Hi!
I am trying to run the tests of Solr 1.4 in Eclipse, but most of them fail. The
error messages indicate that I am missing some config files in my build path. Is
there any documentation on how to get Solr up and running in Eclipse? If not,
how did you set up the build path for Solr in Eclipse?

Another question: some of the tests also fail when I run ant test. Is that
normal?

Sincerely,
Siv


This email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 OSLO, 
http://www.steria.no. This email and any attachments may contain 
confidential/intellectual property/copyright information and is only for the 
use of the addressee(s). You are prohibited from copying, forwarding, 
disclosing, saving or otherwise using it in any way if you are not the 
addressee(s) or responsible for delivery. If you receive this email by mistake, 
please advise the sender and cancel it immediately. Steria may monitor the 
content of emails within its network to ensure compliance with its policies and 
procedures. Any email is susceptible to alteration and its integrity cannot be 
assured. Steria shall not be liable if the message is altered, modified, 
falsified, or even edited.


Re: build path

2010-01-18 Thread Wangsheng Mei
maybe you should add "-Dsolr.solr.home=" to your JAVA_OPTS
before your servlet container starts.


2010/1/19 Siv Anette Fjellkårstad 

> Hi!
> I try to run the tests of Solr 1.4 in Eclipse, but a most of them fails.
> The error messages indicate that I miss some config files in my build path.
> Is there any documentation of how to get Solr up and running in Eclipse? If
> not; How did you set up (build path for) Solr in Eclipse?
>
> Another question; Some of the tests also fail when I run ant test. Is that
> normal?
>
> Sincerely,
> Siv
>
>
>



-- 
梅旺生


Fastest way to use solrj

2010-01-18 Thread Tim Terlegård
There are a few ways to use solrj. I just learned that I can use the
javabin format to get some performance gain. But when I try the binary
format nothing is added to the index. This is how I try to use this:

server = new CommonsHttpSolrServer("http://localhost:8983/solr")
server.setRequestWriter(new BinaryRequestWriter())
request = new UpdateRequest()
request.setAction(UpdateRequest.ACTION.COMMIT, true, true);
request.setParam("stream.file", "/tmp/data.bin")
request.process(server)

Should this work? Could there be something wrong with the file? I
haven't found a good reference for how to create a javabin file, but
by reading the source code I came up with this (groovy code):

fieldId = new NamedList()
fieldId.add("name", "id")
fieldId.add("val", "9-0")
fieldId.add("boost", null)
fieldText = new NamedList()
fieldText.add("name", "text")
fieldText.add("val", "Some text")
fieldText.add("boost", null)
fieldNull = new NamedList()
fieldNull.add("boost", null)
doc = [fieldNull, fieldId, fieldText]
docs = [doc]
root = new NamedList()
root.add("docs", docs)
fos = new FileOutputStream("data.bin")
new JavaBinCodec().marshal(root, fos)

I haven't found any examples of using stream.file like this with a
binary file. Is it supported? Is it better/faster to use
StreamingUpdateSolrServer and send everything over HTTP instead? Would
code for that look something like this?

while (moreDocs) {
    xmlDoc = readDocFromFileUsingSaxParser()
    doc = new SolrInputDocument()
    doc.addField("id", "9-0")
    doc.addField("text", "Some text")
    server.add(doc)
}

To me it instinctively looks as if stream.file would be faster because
it doesn't have to use HTTP and it doesn't have to create a bunch of
SolrInputDocument objects.

/Tim


SV: build path

2010-01-18 Thread Siv Anette Fjellkårstad
I apologize for the newbie questions :|
Do I need a servlet container to run the tests?

Kind regards,
Siv




From: Wangsheng Mei [mailto:hairr...@gmail.com]
Sent: Tue 19.01.2010 08:49
To: solr-user@lucene.apache.org
Subject: Re: build path



maybe you should add "-Dsolr.solr.home=" to your JAVA_OPTS
before your servlet container starts.


2010/1/19 Siv Anette Fjellkårstad 

> Hi!
> I try to run the tests of Solr 1.4 in Eclipse, but a most of them fails.
> The error messages indicate that I miss some config files in my build path.
> Is there any documentation of how to get Solr up and running in Eclipse? If
> not; How did you set up (build path for) Solr in Eclipse?
>
> Another question; Some of the tests also fail when I run ant test. Is that
> normal?
>
> Sincerely,
> Siv
>
>
>



--
梅旺生


