Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-06-21 Thread Surendra
Hi Chris

I did a proper checkout of Tika 0.9 and built the jars as specified in
"http://tika.apache.org/0.9/gettingstarted.html", and replaced the existing
Tika 0.4 jars with the 0.9 jars. I don't see any difference. The documents are
getting indexed, but the fmap.content (attr_content) field is still not available for
me. Am I missing something? Meanwhile, I'm digging further into this issue... if I can
get any further help it would be great! Thanks for your time...

-- Surendra




Re: Complex situation

2011-06-21 Thread roySolr
Thanks it works!!

I want to change the format of NOW in Solr. Is it possible? Currently the date
format looks like this:

yyyy-MM-ddTHH:mm:ssZ

In my db the format is dd-MM. How can I change NOW so I can do something
like * TO NOW(dd-mm)?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Complex-situation-tp3071936p3089632.html
Sent from the Solr - User mailing list archive at Nabble.com.


Using two repeaters to rapidly switch Master and Slave (Replication)?

2011-06-21 Thread Mark Schoy
Hi,

I have an idea for switching master and slave in case one server
crashes:

Set up two servers as repeaters, but disable the master and slave
config on both by setting them to false.

Now you can dynamically disable and enable master or slave option by url:

enable / disable replication on master:
http://master_host:port/solr/replication?command=disablereplication
http://master_host:port/solr/replication?command=enablereplication

enable / disable polling on slave:
http://slave_host:port/solr/replication?command=disablepoll
http://slave_host:port/solr/replication?command=enablepoll

Does this work?


Where is LogTransformer log file path??

2011-06-21 Thread Alucard
Hi all.

I followed the steps for creating a LogTransformer in the DataImportHandler wiki:

<entity name="office_address" pk="office_add_Key" transformer="LogTransformer" logLevel="debug"
        logTemplate="office_add_Key: ${office_address.office_add_Key}, last_index_time: ${dataimporter.last_index_time}">
  ...
</entity>

The Java command that starts Solr:

java "-Dremarks=solr:8983"
"-Djava.util.logging.config.file=logging.properties" -jar start.jar

logging.properties file content

# Default global logging level:
.level = DEBUG

# Write to a file:
handlers = java.util.logging.FileHandler

# Write log messages in human readable format:
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter

# Log to the logs subdirectory, with log files named solrxxx.log
java.util.logging.FileHandler.pattern = logs/solr_log-%g.log
java.util.logging.FileHandler.append = true
java.util.logging.FileHandler.count = 10
java.util.logging.FileHandler.limit = 500 #Roughly 5MB



So the log file (solr_log0.log) is there, and startup messages are properly
logged. However, when I do a delta import, the message defined in the
logTemplate attribute is not logged.

I have done some research but cannot find anything related to the
LogTransformer file path/log path or so on...

So, can anyone please tell me where those messages are logged?

Thank you in advance for any help.

Ellery


Re: Where is LogTransformer log file path??

2011-06-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
it will be in the solr logs

On Tue, Jun 21, 2011 at 2:18 PM, Alucard  wrote:
> Hi all.
>
> I follow the steps of creating a LogTransformer in DataImportHandler wiki:
>
>  pk="office_add_Key" transformer="LogTransformer" logLevel="debug"
>                    logTemplate="office_add_Key:
> ${office_address.office_add_Key}, last_index_time:
> ${dataimporter.last_index_time}"
> ...
>>
> 
>
> The java statement that start Solr:
>
> java "-Dremarks=solr:8983"
> "-Djava.util.logging.config.file=logging.properties" -jar start.jar
>
> logging.properties file content
>
> # Default global logging level:
> .level = DEBUG
>
> # Write to a file:
> handlers = java.util.logging.FileHandler
>
> # Write log messages in human readable format:
> java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
>
> # Log to the logs subdirectory, with log files named solrxxx.log
> java.util.logging.FileHandler.pattern = logs/solr_log-%g.log
> java.util.logging.FileHandler.append = true
> java.util.logging.FileHandler.count = 10
> java.util.logging.FileHandler.limit = 500 #Roughly 5MB
>
> 
>
> So the log file (solr_log0.log) is there, startup message are properly
> logged.  However,
> when I do a delta import, the message defined in logTemplate attribute is
> not logged.
>
> I have done some research but cannot find anything related to:
> LogTransformer file path/log path or so on...
>
> So, can anyone please tell me where are those messgae logged?
>
> Thank you in advance for any help.
>
> Ellery
>



-- 
-
Noble Paul


Re: Where is LogTransformer log file path??

2011-06-21 Thread Alucard
Thank you, but what do you mean by "solr logs"?

Actually I cannot find my message in the "Solr logs", which reside in:
/logs/solr_log-%g.log


2011/6/21 Noble Paul നോബിള്‍ नोब्ळ् 

> it will be in the solr logs
>
> On Tue, Jun 21, 2011 at 2:18 PM, Alucard  wrote:
> > Hi all.
> >
> > I follow the steps of creating a LogTransformer in DataImportHandler
> wiki:
> >
> >  > pk="office_add_Key" transformer="LogTransformer" logLevel="debug"
> >logTemplate="office_add_Key:
> > ${office_address.office_add_Key}, last_index_time:
> > ${dataimporter.last_index_time}"
> > ...
> >>
> > 
> >
> > The java statement that start Solr:
> >
> > java "-Dremarks=solr:8983"
> > "-Djava.util.logging.config.file=logging.properties" -jar start.jar
> >
> > logging.properties file content
> >
> > # Default global logging level:
> > .level = DEBUG
> >
> > # Write to a file:
> > handlers = java.util.logging.FileHandler
> >
> > # Write log messages in human readable format:
> > java.util.logging.FileHandler.formatter =
> java.util.logging.SimpleFormatter
> >
> > # Log to the logs subdirectory, with log files named solrxxx.log
> > java.util.logging.FileHandler.pattern = logs/solr_log-%g.log
> > java.util.logging.FileHandler.append = true
> > java.util.logging.FileHandler.count = 10
> > java.util.logging.FileHandler.limit = 500 #Roughly 5MB
> >
> > 
> >
> > So the log file (solr_log0.log) is there, startup message are properly
> > logged.  However,
> > when I do a delta import, the message defined in logTemplate attribute is
> > not logged.
> >
> > I have done some research but cannot find anything related to:
> > LogTransformer file path/log path or so on...
> >
> > So, can anyone please tell me where are those messgae logged?
> >
> > Thank you in advance for any help.
> >
> > Ellery
> >
>
>
>
> --
> -
> Noble Paul
>


Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Michael McCandless
OK that sounds like a good solution!

You can also have CMS limit how many merges are allowed to run at
once, if your IO system has trouble w/ that much concurrency.

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 20, 2011 at 6:29 PM, Shawn Heisey  wrote:
> On 6/20/2011 3:18 PM, Michael McCandless wrote:
>>
>> With segmentsPerTier at 35 you will easily cross 70 segs in the index...
>> If you want optimize to run in a single merge, I would lower
>> sementsPerTier and mergeAtOnce (maybe back to the 10 default), and set
>> your maxMergeAtOnceExplicit to 70 or higher...
>>
>> Lower mergeAtOnce means merges run more frequently but for shorter
>> time, and, your searching should be faster (than 35/35) since there
>> are fewer segments to visit.
>
> Thanks again for more detailed information.  There is method to my madness,
> which I will now try to explain.
>
> With a value of 10, the reindex involves enough merges that there is are
> many second level merges, and a third-level merge.  I was running into
> situations on my development platform (with its slow disks) where there were
> three merges happening at the same time, which caused all indexing activity
> to cease for several minutes.  This in turn would cause JDBC to time out and
> drop the connection to the database, which caused DIH to fail and rollback
> the entire import about two hours (two thirds) in.
>
> With a mergeFactor of 35, there are no second level merges, and no
> third-level merges.  I can do a complete reindex successfully even on a
> system with slow disks.
>
> In production, one shard (out of six) is optimized every day to eliminate
> deleted documents.  When I have to reindex everything, I will typically go
> through and manually optimize each shard in turn after it's done.  This is
> the point where I discovered this two-pass problem.
>
> I don't want to do a full-import with optimize=true, because all six large
> shards build at the same time in a Xen environment.  The I/O storm that
> results from three optimizes happening on each host at the same time and
> then replicating to similar Xen hosts is very bad.
>
> I have now set maxMergeAtOnceExplicit to 105.  I think that is probably
> enough, given that that I currently do not experience any second level
> merges.  When my index gets big enough, I will increase the ram buffer.  By
> then I will probably have more memory, so the first-level merges can still
> happen entirely from I/O cache.
>
> Shawn
>
>


Applying boost factors at run time

2011-06-21 Thread Kissue Kissue
Hi,

I have the following situation:

1. I am using Solr 3.1
2. I am using the edismax query handler for my queries
3. I am using the SolrJ client library
4. Currently I have configured the fields I want to search on and the boost
factors in the Solr config.

But I have just been told that we need the boost factors to be stored
in a database so that an admin can modify them as and when needed. So I want to
know if it is possible to set the boost factors at runtime for the fields,
using the values stored in the database, via SolrJ?

Thanks


Re: Applying boost factors at run time

2011-06-21 Thread Ahmet Arslan


--- On Tue, 6/21/11, Kissue Kissue  wrote:

> From: Kissue Kissue 
> Subject: Applying boost factors at run time
> To: solr-user@lucene.apache.org
> Date: Tuesday, June 21, 2011, 1:31 PM
> Hi,
> 
> I have the following situation:
> 
> 1. I am using Solr 3.1
> 2. I am using the edismax query handler for my queries
> 3. I am using the SolrJ client library
> 4. Currently i have configured the fields i want to search
> on and the bosst
> factors in solr config.
> 
> But i have just been told that we would need the bosst
> factors to be stored
> in a database so that admin can modify them as at when
> needed. So i want to
> know if it is possible to set the boost factors at runtime
> for the fields
> using the values stored in the database using Solr J?

Yes, you can always override the defaults (defined in solrconfig.xml) in every
request:

solrQuery.set("qf", "myField^newBoostFactor");
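
To make that concrete, a short SolrJ sketch of the same idea; the field names (title,
description), boost values, and Solr URL are placeholders, with the boosts standing in
for values read from a database:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RuntimeBoostExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // Boost factors as loaded from the database (hypothetical values).
        double titleBoost = 2.5;
        double descriptionBoost = 0.8;

        SolrQuery query = new SolrQuery("ipod");
        query.set("defType", "edismax");
        // Overrides the qf defined in solrconfig.xml for this request only.
        query.set("qf", "title^" + titleBoost + " description^" + descriptionBoost);

        QueryResponse response = server.query(query);
        System.out.println("Found " + response.getResults().getNumFound() + " docs");
    }
}

Since the qf sent this way only applies to that one request, the admin-edited values take
effect on the next query without touching solrconfig.xml.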


Re: commit time and lock

2011-06-21 Thread Erick Erickson
What is it you want help with? You haven't told us what the
problem you're trying to solve is. Are you asking how to
speed up indexing? What have you tried? Have you
looked at: http://wiki.apache.org/solr/FAQ#Performance?

Best
Erick

On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods  wrote:
> I am using solrj to index the data. I have around 5000 docs indexed. At
> the time of commit, due to the lock, the server stops giving responses, so I was
> calculating the commit time:
>
> double starttemp = System.currentTimeMillis();
> server.add(docs);
> server.commit();
> System.out.println("total time in commit = " + (System.currentTimeMillis() -
> starttemp)/1000);
>
> It is taking around 9 seconds to commit the 5000 docs with 15 fields. However I
> am not sure about the index lock time: whether it starts
> at server.add(docs) time, or only at server.commit() time.
>
> If I am changing from above to following
>
> server.add(docs);
> double starttemp = System.currentTimeMillis();
> server.commit();
> System.out.println("total time in commit = " + (System.currentTimeMillis() -
> starttemp)/1000);
>
> then the commit time becomes less than 1 second. I am not sure which one is
> right.
>
> please help.
>
> regards
> Jonty
>


Re: Complex situation

2011-06-21 Thread Erick Erickson
No, you can't as far as I know. The time format in Solr is fixed. Besides,
I don't know what NOW(dd-mm) would mean... the day represented
by dd-mm in the current year?

You can probably make your db select emit the dates in the Solr format.

Best
Erick

On Tue, Jun 21, 2011 at 3:37 AM, roySolr  wrote:
> Thanks it works!!
>
> I want to change the format of the NOW in SOLR. Is it possible? Now date
> format looks like this:
>
> yyyy-MM-ddTHH:mm:ssZ
>
> In my db the format is dd-MM. How can i fix the NOW so i can do something
> like * TO NOW(dd-mm)??
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Complex-situation-tp3071936p3089632.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
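
If the select can't be changed, a small sketch of doing the conversion on the client side
instead; the dd-MM input value is hypothetical, and the target is Solr's UTC date format:

import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

public class SolrDateFormatExample {
    public static void main(String[] args) throws Exception {
        String dbValue = "21-06";  // dd-MM value from the database (hypothetical)

        SimpleDateFormat dbFormat = new SimpleDateFormat("dd-MM");
        dbFormat.setTimeZone(TimeZone.getTimeZone("UTC"));

        // Put the parsed day/month into the current year.
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        int currentYear = cal.get(Calendar.YEAR);
        cal.setTime(dbFormat.parse(dbValue));
        cal.set(Calendar.YEAR, currentYear);

        // Solr expects UTC dates like 2011-06-21T00:00:00Z
        SimpleDateFormat solrFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        solrFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(solrFormat.format(cal.getTime()));
    }
}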


Re: Using two repeaters to rapidly switch Master and Slave (Replication)?

2011-06-21 Thread Erick Erickson
It should, but there are a couple of issues.
1> you have to make your remaining slaves poll the new master for index updates.
2> your indexing process has to be pointed at the new master (if it's external)
3> you have to make sure anything you've indexed to the master that has NOT
 been replicated to the new master gets re-indexed. This is often done by
 just re-indexing everything from, say, an hour before the master crashed.

Best
Erick

On Tue, Jun 21, 2011 at 3:44 AM, Mark Schoy  wrote:
> Hi,
>
> I have an idea how to switching master and slave in case of one server
> is crashing:
>
> Setting up two server as repeater but disabling master and slave
> config on both with false.
>
> Now you can dynamically disable and enable master or slave option by url:
>
> enable / disable replication on master:
> http://master_host:port/solr/replication?command=disablereplication
> http://master_host:port/solr/replication?command=enablereplication
>
> enable / disable polling on slave:
> http://slave_host:port/solr/replication?command=disablepoll
> http://slave_host:port/solr/replication?command=enablepoll
>
> Does this work?
>
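
For reference, a rough sketch of the switch-over itself as plain HTTP calls against the
replication handler URLs quoted above. The host names are placeholders, and repointing
the surviving slaves at the new master (point 1 above) is a separate step:

import java.net.HttpURLConnection;
import java.net.URL;

public class ReplicationSwitchExample {
    static void replicationCommand(String coreUrl, String command) throws Exception {
        URL url = new URL(coreUrl + "/replication?command=" + command);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // A 200 response means the replication handler accepted the command.
        System.out.println(command + " -> HTTP " + conn.getResponseCode());
        conn.disconnect();
    }

    public static void main(String[] args) throws Exception {
        String newMaster = "http://repeater1:8983/solr";   // placeholder hosts
        String newSlave  = "http://repeater2:8983/solr";

        // Promote repeater1: serve replication, stop polling.
        replicationCommand(newMaster, "enablereplication");
        replicationCommand(newMaster, "disablepoll");

        // Demote repeater2: stop serving replication, resume polling.
        replicationCommand(newSlave, "disablereplication");
        replicationCommand(newSlave, "enablepoll");
    }
}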


Re: Complex situation

2011-06-21 Thread roySolr
Yes, current year. I understand that something like dd-mm-yy isn't possible.

I will fix this in my db,

Thanks for your help!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Complex-situation-tp3071936p3090247.html
Sent from the Solr - User mailing list archive at Nabble.com.


solr 3.2 and jetty auth shows forbidden 403

2011-06-21 Thread Markus.Rietzler
We are testing the upgrade to Solr 3.2. A quick test looks good. Solr 3.2 comes
up and we can do searches with our configs (using the "old" dismax handler,
which I have inserted into solrconfig.xml). The only problem is that I am not able
to set up user auth in Jetty.
I took the same config files that are working in Solr 1.4.1. In webdefault.xml I have

/CORE/admin/*

which should require user auth when I try to access any page under it. But when I access

http://my.server:8983/solr/CORE/admin/

I get a 403 Forbidden, no auth request, nothing. In the Jetty docs I have
read that when directory browsing is switched off you have to name the
page, but even when I call it with /solr/CORE/admin/index.jsp I get the
403 Forbidden.

I don't find any logfile which shows why I receive the Forbidden. Any help here, or
do I have to ask the Jetty people?


Regards,

Markus Rietzler

Rechenzentrum der Finanzverwaltung

Tel: 0211/4572-2130




Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-06-21 Thread Surendra
Hi Andreas
I tried Solr 3.1 as well as 3.2... I was not able to overcome these issues with
the newer versions either. For me, I need attr_content:* to return
results (with 1.4.1 this is successful), which is not happening. It indexes well
in 3.1, but in 3.2 I have the following issue:
Invalid version or the data in not in 'javabin' format
--Surendra





Re: Applying boost factors at run time

2011-06-21 Thread Kissue Kissue
Many thanks for the tip. I will give it a go.


On Tue, Jun 21, 2011 at 11:48 AM, Ahmet Arslan  wrote:

>
>
> --- On Tue, 6/21/11, Kissue Kissue  wrote:
>
> > From: Kissue Kissue 
> > Subject: Applying boost factors at run time
> > To: solr-user@lucene.apache.org
> > Date: Tuesday, June 21, 2011, 1:31 PM
> > Hi,
> >
> > I have the following situation:
> >
> > 1. I am using Solr 3.1
> > 2. I am using the edismax query handler for my queries
> > 3. I am using the SolrJ client library
> > 4. Currently i have configured the fields i want to search
> > on and the bosst
> > factors in solr config.
> >
> > But i have just been told that we would need the bosst
> > factors to be stored
> > in a database so that admin can modify them as at when
> > needed. So i want to
> > know if it is possible to set the boost factors at runtime
> > for the fields
> > using the values stored in the database using Solr J?
>
> Yes you can always override defaults - defined in solrconfig.xml - in every
> request.
>
> SolrQuery.set("qf", "myField^newBoostFactor");
>


problem with wild card query with spellchecker

2011-06-21 Thread Romi
I am enabling spell checking using Solr in a search application. I also want
to run wildcard queries.
The problem I am facing is that when I search for, for example, diam*, it gives
me a suggestion for diamond and search results for diamond, while I have
some other words in my documents, for example diamender and diamounder, but I
am not getting results for them.

Can't I use a wildcard query with the spell checker?

-
Thanks & Regards
Romi
--
View this message in context: 
http://lucene.472066.n3.nabble.com/problem-with-wild-card-query-with-spellchecker-tp3090651p3090651.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Shawn Heisey

On 6/20/2011 12:31 PM, Michael McCandless wrote:

For back-compat, mergeFactor maps to both of these, but it's better to
set them directly eg:

 
   10
   20
 

(and then remove your mergeFactor setting under indexDefaults)


When I did this and ran a reindex, it merged once it reached 10 
segments, despite what I had defined in the mergePolicy.  This is Solr 
3.2 with the patch from SOLR-1972 applied.  I've included the config 
snippet below into solrconfig.xml using xinclude via another file.  I 
had to put mergeFactor back in to make it work right.  I haven't checked 
yet to see whether an optimize takes one pass.  That will be later today.



<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">35</int>
  <int name="segmentsPerTier">35</int>
  <int name="maxMergeAtOnceExplicit">105</int>
</mergePolicy>


Shawn



Solr 3.2.0 + Jetty 7.4.2

2011-06-21 Thread Benedict, Keith (Digital)
I'm attempting to work through the configuration of the home folder for Solr
running on a standalone Jetty 7.4.2 setup; this is being used on Mac OS X
10.6.7.

I have this working currently under one condition, and that is that I specify
the system property when I start Jetty, either in the terminal window manually
or in my configuration file for the launchd startup.

Essentially it is this line:  java -Dsolr.solr.home=/Library/Solr/Current -jar 
start.jar

This works, but I would prefer to be able to configure this value in the
application-specific context. I seem to see a lot of references to JNDI and then
a lot of half answers to this configuration (or I am missing the
obvious... this is new and I have been scouring but could use a leg up here).

I'm sure there could be more detail, though most times I see people simply
asking for the version of Jetty and Solr being used to confirm what should be
done... let me know what else I might need to specify in order to get some help
on this; in the meantime I'll keep playing with it.  Thanks

Keith

Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-06-21 Thread Mattmann, Chris A (388J)
Hi Surendra,

Thanks. Besides replacing the tika-*-0.9.jar files, you also need to replace 
the dependency jar files for the other libs as well since they have been 
upgraded. It's also possible that b/c of API changes, Solr 1.4.1 won't work 
with Tika 0.9 without modifying the ExtractingRequestHandler  code...

Cheers,
Chris

On Jun 21, 2011, at 12:28 AM, Surendra wrote:

> Hi Chris
> 
> I did a proper checkout of TIKA 0.9 and built the jars as specified in the
> "http://tika.apache.org/0.9/gettingstarted.html"; and replaced the existing
> tika0.4 jars with 0.9 jars. I don't see any difference. The documents are
> getting indexed but the fmap.content(attr_content) is still not available for
> me. Am I missing something? Between I'm digging further in this isse... if I 
> can
> get any further help it would be great! Thanks for your time...
> 
> -- Surendra
> 
> 


++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



Re: Problem with CSV update handler

2011-06-21 Thread Yonik Seeley
On Tue, Jun 21, 2011 at 2:15 AM, Rafał Kuć  wrote:
> Hello!
>
> Once again thanks for the response ;) So the solution is to generate
> the data files once again and either adding the space after doubled
> encapsulator

Maybe...
I can't tell if the file is encoded correctly or not since I don't
know what the decoded values are supposed to be from your example.

-Yonik
http://www.lucidimagination.com

> or changing the encapsulator to the character that does
> not occur in the filed values (of course the one taht will be
> split).
>
>
> --
> Regards,
>  Rafał Kuć
>  http://solr.pl
>
>> Multi-valued CSV fields are double encoded.
>
>> We start with: "aaa ""bbb""ccc"'
>> Then decoding one leve, we get:  aaa "bbb"ccc
>> Decoding again to get individual values results in a decode error
>> because the encapsulator appears unescaped in the middle of the second
>> value (i.e. invalid CSV).
>
>> One easier way to fix this is to use a different encapsulator for the
>> sub-values of a multi-valued field by adding f.title.encapsulator=%27
>> (a single quote char)
>
>> But I can't really tell you exactly how to encode or specify options
>> to the CSV loader when I don't know what the actual values you want
>> after "aaa ""bbb""ccc"' is decoded.
>
>> -Yonik
>> http://www.lucidimagination.com
>
>
>
>> On Mon, Jun 20, 2011 at 5:46 PM, Rafał Kuć  wrote:
>>> Hi!
>>>
>>>  Yonik, thanks for the reply. I just realized that the example I gave
>>> was not full - the error is returned by Solr only when the field is
>>> multivalued and the values in the fields are splited. For example, the
>>> following curl command give me the mentioned error:
>>>
>>> curl
>>> 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&en
>>> capsulator=%22&f.title.split=true&f.title.separator=%20' -H
>>> 'Content-type:text/plain' -d '"1","aaa ""bbb""ccc"'
>>>
>>> while the following is executed without any problem:
>>> curl
>>> 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&en
>>> capsulator=%22&f.title.split=true&f.title.separator=%20' -H
>>> 'Content-type:text/plain' -d '"1","aaa ""bbb"" ccc"'
>>>
>>> The only difference between those two is the additional space
>>> character in between bbb"" and ccc in the second example.
>>>
>>> Am I doing something wrong ? ;)
>>>
>>> --
>>> Regards,
>>>  Rafał Kuć
>>>  http://solr.pl
>>>
 This works fine for me:
>>>
 curl http://localhost:8983/solr/update/csv -H
 'Content-type:text/plain' -d 'id,name
 "1","aaa ""bbb"" ccc"'
>>>
 -Yonik
 http://www.lucidimagination.com
>>>
>>>
 On Mon, Jun 20, 2011 at 3:17 PM, Rafał Kuć  wrote:
> Hello!
>
>  I have a question about the CSV update handler. Lets say I have the
> following file sent to CSV update handler using curl:
>
> id,name
> "1","aaa ""bbb""ccc"
>
> It throws an error, saying that:
> Error 400 java.io.IOException: (line 0) invalid char between encapsulated 
> token end delimiter
>
> If I change the contents of the file to:
>
> id,name
> "1","aaa ""bbb"" ccc"
>
> it works without a problem. This anyone encountered this ? Is it know 
> behavior ?
>
> --
> Regards,
>  Rafał Kuć
>
>
>
>>>
>>>
>>>
>>>
>>>
>
>
>
>
>


Re: [ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-06-21 Thread roySolr
Are you working on some changes to support earlier versions of PHP?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/ANNOUNCEMENT-PHP-Solr-Extension-1-0-1-Stable-Has-Been-Released-tp3024040p3090702.html
Sent from the Solr - User mailing list archive at Nabble.com.


rename a core to same name of existing core

2011-06-21 Thread Koji Sekiguchi
I accidentally renamed a core to the same name as an existing core, e.g. using
example-DIH:

http://localhost:8983/solr/admin/cores?action=RENAME&core=db&other=tika

I expected Solr to throw an exception, but it worked, and the existing core
(tika) is gone.

Is this a known bug (though I couldn't find an open issue in JIRA), or intended
behavior?

koji
-- 
http://www.rondhuit.com/en/


Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Michael McCandless
On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey  wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> For back-compat, mergeFactor maps to both of these, but it's better to
>> set them directly eg:
>>
>>     
>>       10
>>       20
>>     
>>
>> (and then remove your mergeFactor setting under indexDefaults)
>
> When I did this and ran a reindex, it merged once it reached 10 segments,
> despite what I had defined in the mergePolicy.  This is Solr 3.2 with the
> patch from SOLR-1972 applied.  I've included the config snippet below into
> solrconfig.xml using xinclude via another file.  I had to put mergeFactor
> back in to make it work right.  I haven't checked yet to see whether an
> optimize takes one pass.  That will be later today.
>
> 
> 35
> 35
> 105
> 

Hmm something strange is going on.

In Solr 3.2, if you attempt to use mergeFactor and useCompoundFile
inside indexDefaults (and outside the mergePolicy), when your
mergePolicy is TMP, you should see a warning like this:

  Use of compound file format or mergefactor cannot be configured if
merge policy is not an instance of LogMergePolicy. The configured
policy's defaults will be used.

And it shouldn't "work".  But, using the "right" params inside your
mergePolicy section ought to work (though, I don't think this is well
tested...).  I'm not sure why you're seeing the opposite of what I'd
expect...

I wonder if you're actually really getting the TMP?  Can you turn on
verbose IndexWriter infoStream and post the output?

Mike McCandless

http://blog.mikemccandless.com


Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Robert Muir
the problem is that before
https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the
TieredMergePolicy "setters" *before* it tried to apply these 'global'
mergeFactor etc params.

So, even if you set them explicitly inside the <mergePolicy>, they
would then get clobbered by these 'global' params / defaults / etc.

I fixed this order in SOLR-2567 so that the settings inside the
<mergePolicy> *always* take precedence, e.g. they are applied last.

So, I think it might be difficult/impossible to configure this MP with
3.2 due to this.

On Tue, Jun 21, 2011 at 10:58 AM, Michael McCandless
 wrote:
> On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey  wrote:
>> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>>
>>> For back-compat, mergeFactor maps to both of these, but it's better to
>>> set them directly eg:
>>>
>>>     
>>>       10
>>>       20
>>>     
>>>
>>> (and then remove your mergeFactor setting under indexDefaults)
>>
>> When I did this and ran a reindex, it merged once it reached 10 segments,
>> despite what I had defined in the mergePolicy.  This is Solr 3.2 with the
>> patch from SOLR-1972 applied.  I've included the config snippet below into
>> solrconfig.xml using xinclude via another file.  I had to put mergeFactor
>> back in to make it work right.  I haven't checked yet to see whether an
>> optimize takes one pass.  That will be later today.
>>
>> 
>> 35
>> 35
>> 105
>> 
>
> Hmm something strange is going on.
>
> In Solr 3.2, if you attempt to use mergeFactor and useCompoundFile
> inside indexDefaults (and outside the mergePolicy), when your
> mergePolicy is TMP, you should see a warning like this:
>
>  Use of compound file format or mergefactor cannot be configured if
> merge policy is not an instance of LogMergePolicy. The configured
> policy's defaults will be used.
>
> And it shouldn't "work".  But, using the "right" params inside your
> mergePolicy section ought to work (though, I don't think this is well
> tested...).  I'm not sure why you're seeing the opposite of what I'd
> expect...
>
> I wonder if you're actually really getting the TMP?  Can you turn on
> verbose IndexWriter infoStream and post the output?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>


Read past EOF error due to broken connection

2011-06-21 Thread Anuj Kumar
Hello Everyone,

While trying to index a set of documents on remote Solr instance, the
connection broke and it left the index in an inconsistent state. Now, when I
start the instance, it fails while getting the searcher with the following
exception-

Caused by: java.io.IOException: read past EOF
at
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:207)
 at
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
at
org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:40)
 at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:71)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:268)
 at
org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:79)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:753)
 at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:428)
 at org.apache.lucene.index.IndexReader.open(IndexReader.java:371)
at
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:38)
 at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1080)

I saw this JIRA entry- https://jira.atlassian.com/browse/JRA-12030 that asks
to re-index. Is there a way to restore the last known good index snapshot?

Thanks,
Anuj


Re: Optimize taking two steps and extra disk space

2011-06-21 Thread Shawn Heisey

On 6/21/2011 9:09 AM, Robert Muir wrote:

the problem is that before
https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the
TieredMergePolicy "setters" *before* it tried to apply these 'global'
mergeFactor etc params.

So, even if you set them explicitly inside the <mergePolicy>, they
would then get clobbered by these 'global' params / defaults / etc.

I fixed this order in SOLR-2567 so that the settings inside the
<mergePolicy> *always* take precedence, e.g. they are applied last.

So, I think it might be difficult/impossible to configure this MP with
3.2 due to this.


That seems to be confirmed by my infostream.  It's using 
LogByteSizeMergePolicy whether I have mergeFactor configured or not.  
The patch for SOLR-2567 applies with fuzz, but the result won't compile.


Unless I can find a way to patch 3.2 to allow using and configuring TMP, 
I guess I'll just have to live with a two-pass optimize.  It only adds a 
few minutes to the process, and I currently have the disk space 
available, so it's not the end of the world.  I am seeing enough 
improvements coming in 3.3 that I will have to lobby for upgrading to it 
a couple of weeks after it gets released.  It won't come out in time for 
this cycle.


Thanks,
Shawn



Re: upgrading to Tika 0.9 on Solr 1.4.1

2011-06-21 Thread Andreas Kemkes
We are successfully extracting PDF content with Solr 3.1 and Tika 0.9.

Replace
fontbox-1.3.1.jar jempbox-1.3.1.jar pdfbox-1.3.1.jar tika-core-0.8.jar 
tika-parsers-0.8.jar 

with
 
fontbox-1.4.0.jar jempbox-1.4.0.jar pdfbox-1.4.0.jar tika-core-0.9.jar 
tika-parsers-0.9.jar 

I'm not entirely certain if a recompile of Solr was necessary or not.
Andreas




From: Surendra 
To: solr-user@lucene.apache.org
Sent: Tue, June 21, 2011 5:18:31 AM
Subject: Re: upgrading to Tika 0.9 on Solr 1.4.1

Hi Andreas
I tried solr 3.1 as well as 3.2... i was not able to overcome these issues with
the newer versions too. For me, I need the attr_content:* should return me
results (with 1.4.1 this is successful) which is not happening . It indexes well
in 3.1 but in 3.2 i have the following issue.
Invalid version or the data in not in 'javabin' format
--Surendra

Re: Indexing-speed issues (chart included)

2011-06-21 Thread Mathias Hodler
Sorry, here are some details:

requestHandler: XmlUpdateRequestHandler
protocol: http (10 concurrent threads)
document: 1kb size, 15 fields

cpu load: 20%
memory usage: 50%

But generally speaking, is that normal, or must something be wrong with my
configuration, ...



2011/6/17 Erick Erickson 

> Well, it's kinda hard to say anything pertinent with so little
> information. How are you indexing things? What kind of documents?
> How are you feeding docs to Solr?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Fri, Jun 17, 2011 at 8:10 AM, Mark Schoy  wrote:
> > Hi,
> >
> > If I start indexing documents it getting slower the more documents were
> > added without commiting and optimizing:
> >
> > http://imageshack.us/photo/my-images/695/solrchart.png/
> >
> > I've changed the mergeFactor from 10 to 30, changed maxDocs
> (100,1000,1)
> > but it always getting slower the more documents were added.
> > If I'm using elasticsearch which is also based on lucene I'm getting
> > constant indexing rates (without commiting and optimizing too)
> >
> > Does anybody know whats wrong?
> >
>


Good time for an upgrade to Solr/Lucene trunk?

2011-06-21 Thread Gregg Donovan
We (Etsy.com) are currently using a version of trunk from mid-October 2010
(SVN tag 1021515, to be exact). We'd like to upgrade to the current trunk
and are wondering if this is a good time. Is the new stuff (esp. DocValues)
stable? Are any other major features or performance improvements about to
land on trunk that are worth waiting a few weeks for?

Thanks for the guidance!

--Gregg

Gregg Donovan
Technical Lead, Search, Etsy.com
gr...@etsy.com


java.lang.NoSuchMethodError: org.apache.xpath.XPathContext.<init>(Z)V

2011-06-21 Thread Laurent Fleifel
Hi !

I want to integrate Solr (Solr 1.4) into a JOnAS server. However, I get this
error on JOnAS:

java.lang.NoSuchMethodError: org.apache.xpath.XPathContext.<init>(Z)V
at org.apache.xpath.jaxp.XPathImpl.eval(XPathImpl.java:207)
at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:281)
at org.apache.solr.core.Config.evaluate(Config.java:166)
at org.apache.solr.core.SolrConfig.initLibs(SolrConfig.java:435)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:131)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134)


and Solr's logs :
CoreContainer was not shutdown prior to finalize(), indicates a bug --
POSSIBLE RESOURCE LEAK!!!

Does anyone know how to solve this issue ? Maybe the configuration ?

Thanks,

Laurent FLEIFEL


Velocity.properties trouble in Solr 1.4.0

2011-06-21 Thread Chip Calhoun
Hi everyone,
 
I'm trying to get Velocity running in Solr 1.4.0, and I'm having a weird 
problem.  When I navigate to http://localhost:8983/solr/itas , I get an error 
message which I'll paste to the end of this email.  It says it can't find 
velocity.properties, despite the fact that I have this file in my 
C:\Apache\apache-solr-1.4.0\example\solr\conf directory.  What am I missing?  
Thanks.
 
Chip
 
 
HTTP ERROR: 500
 
java.lang.RuntimeException: Can't find resource 'velocity.properties' in 
classpath or 'solr/conf/', cwd=C:\apache\apache-solr-1.4.0\example
 
java.lang.RuntimeException: java.lang.RuntimeException: Can't find resource 
'velocity.properties' in classpath or 'solr/conf/', 
cwd=C:\apache\apache-solr-1.4.0\example
 at 
org.apache.solr.request.VelocityResponseWriter.getEngine(VelocityResponseWriter.java:148)
 at 
org.apache.solr.request.VelocityResponseWriter.write(VelocityResponseWriter.java:44)
 at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:325)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
 at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
 at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
 at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.RuntimeException: Can't find resource 
'velocity.properties' in classpath or 'solr/conf/', 
cwd=C:\apache\apache-solr-1.4.0\example
 at 
org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260)
 at 
org.apache.solr.request.SolrVelocityResourceLoader.getResourceStream(SolrVelocityResourceLoader.java:40)
 at 
org.apache.solr.request.VelocityResponseWriter.getEngine(VelocityResponseWriter.java:137)
 ... 20 more
 
RequestURI=/solr/itas
 
Powered by Jetty://


Re: problem with wild card query with spellchecker

2011-06-21 Thread Erick Erickson
You can use prefix with TermsComponent, which may do what you
need.

Best
Erick

On Tue, Jun 21, 2011 at 9:40 AM, Romi  wrote:
> I am enabling spell checking using solr  in  search application. i also want
> to run wild card queries.
> the problem i am facing is when i search for for example diam* then it gives
> me a suggestion for diamond and search results for diamond. while i have
> some other words in my document say for example diamender,diamounder but i
> am not getting results for them
>
> cant i use wild card query with spell checker??
>
> -
> Thanks & Regards
> Romi
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/problem-with-wild-card-query-with-spellchecker-tp3090651p3090651.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
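
A rough SolrJ sketch of that suggestion; the field name and the /terms handler are the
ones from the example solrconfig.xml and may need adjusting for a real setup:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TermsPrefixExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrQuery query = new SolrQuery();
        query.set("qt", "/terms");           // request handler with the TermsComponent
        query.set("terms", true);
        query.set("terms.fl", "name");       // field to pull raw terms from
        query.set("terms.prefix", "diam");   // should list diamond, diamender, diamounder, ...
        query.set("terms.limit", 20);

        QueryResponse response = server.query(query);
        // The terms section holds term/frequency pairs per field.
        System.out.println(response.getResponse().get("terms"));
    }
}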


Propagating accurate exceptions to the end user

2011-06-21 Thread JohnRodey
Solr3.1 using SolrJ

So I have a GUI that allows folks to search my Solr repository, and I want to
show appropriate errors when something bad happens, but my problem is that
the Solr exceptions are not very pretty and sometimes are not very
descriptive.

For instance, if I enter a bad query the message on the exception is "Error
executing query", and if I do getCause().getMessage() it gives "Bad Request
Bad Request  request: http://1.2.3.4:1234/solr/"
This really doesn't help my user too much.

Another example is if a master search server serves out a request to a bunch
of shards I just get a Connection Refused error that doesn't specify which
connection was refused.

I can't imagine I am the first to run into this and was curious what others
do?  Do people just try to catch all common exceptions and print those
pretty?  What about exceptions that you don't test for?  How about
exceptions that don't really explain the real problem?

Thanks!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Propogating-an-accurate-exceptions-to-the-end-user-tp3091548p3091548.html
Sent from the Solr - User mailing list archive at Nabble.com.
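
In the absence of anything better, one common pattern is to catch the SolrJ exceptions
and surface the deepest cause message; a rough sketch of that pattern:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrException;

public class FriendlyErrorExample {
    public static String search(SolrServer server, String q) {
        try {
            return "hits: " + server.query(new SolrQuery(q)).getResults().getNumFound();
        } catch (SolrServerException e) {
            return "Search failed: " + rootMessage(e);  // e.g. a refused connection
        } catch (SolrException e) {
            return "Bad request: " + rootMessage(e);    // e.g. an unparseable query
        }
    }

    // Walk to the deepest cause, which usually carries the most specific message.
    static String rootMessage(Throwable t) {
        while (t.getCause() != null) {
            t = t.getCause();
        }
        return t.getMessage();
    }
}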


velocity: hyperlinking to documents

2011-06-21 Thread okayndc
hello,

I'm not sure of the correct Velocity syntax to link, let's say, a title
field to the actual document itself. I have hostname, category (which
is also the directory where the file sits) and filename fields in my schema.
Can I potentially use these fields to get at the document itself?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/velocity-hyperlinking-to-documents-tp3091504p3091504.html
Sent from the Solr - User mailing list archive at Nabble.com.


case insensitive searches but return original case

2011-06-21 Thread Jamie Johnson
Is it possible to do case insensitive searches but return the original
case?  So for instance the original field is:

John Smith

I need to be able to do case-insensitive, tokenized searches, but
when the value is returned for faceting I'd like the value to be just "John
Smith", not "john" and "smith" or "john smith".  Is this possible?  I know I
can probably do this by having an additional field which is for faceting and
another which is for searching (that won't give me case insensitivity, I don't
think), but is there a more elegant way to do this?


DIH Scheduling

2011-06-21 Thread sabman
There is information about scheduling on the wiki
(http://wiki.apache.org/solr/DataImportHandler#Scheduling), but I don't understand
how to use it. I am not a Java developer, so maybe I am missing something obvious.

Based on the instructions here:
http://stackoverflow.com/questions/3206171/how-can-i-schedule-data-imports-in-solr/6379306#6379306
it says to create the classes ApplicationListener, HTTPPostScheduler and
SolrDataImportProperties. Where do I create them, and how do I add them to
Solr?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-Scheduling-tp3091764p3091764.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: case insensitive searches but return original case

2011-06-21 Thread Erick Erickson
Not really. The problem here is that facets are done on terms. To
search effectively, Solr needs tokenized, lower-cased etc. terms.

But since faceting is really just faceting on terms, this is incompatible
with returning multi-term facets like "John Smith" so about all you can
do is to copyfield to an un-analyzed field and facet on that... Note that
you do NOT have to store the field you facet on BTW.

Best
Erick

On Tue, Jun 21, 2011 at 12:53 PM, Jamie Johnson  wrote:
> Is it possible to do case insensitive searches but return the original
> case?  So for instance the original field is:
>
> John Smith
>
> I need to be able to do case insensitive searches tokenized searches, but
> when the value is returned for faceting I'd like the value to be just "John
> Smith", not "john" and "smith" or "john smith".  Is this possible?  I know I
> can probably do this by having an additional field which is for faceting and
> another which is for searching (won't give me case insensitive I don't
> think) but is there a more elegant way to do this?
>
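
As a concrete sketch of the copyField approach, assuming hypothetical field names name_t
(tokenized, lower-cased) and name_exact (an un-analyzed string copy, wired up with a
copyField in schema.xml), and a placeholder Solr URL:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CaseInsensitiveFacetExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // Search the lowercased/tokenized field, facet on the raw string copy.
        SolrQuery query = new SolrQuery("name_t:smith");
        query.setFacet(true);
        query.addFacetField("name_exact");

        QueryResponse response = server.query(query);
        for (FacetField.Count count : response.getFacetField("name_exact").getValues()) {
            // Prints the original casing, e.g. "John Smith (3)"
            System.out.println(count.getName() + " (" + count.getCount() + ")");
        }
    }
}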


Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
Hi,

I have a "X" field in my index, which is a feature hash I would like to use
to remove the duplicates in my result.
I cant keep this as the unique id field. Is there any method or any
parameter at the search time to remove the duplicates on a particular
field(hash in this case)?

Thanks in advance,

Regards,
Pooja


Re: Removing duplicate field at the time of search

2011-06-21 Thread Erick Erickson
I think this is what you're looking for:
http://wiki.apache.org/solr/Deduplication

Best
Erick

On Tue, Jun 21, 2011 at 1:40 PM, Pooja Verlani  wrote:
> Hi,
>
> I have a "X" field in my index, which is a feature hash I would like to use
> to remove the duplicates in my result.
> I cant keep this as the unique id field. Is there any method or any
> parameter at the search time to remove the duplicates on a particular
> field(hash in this case)?
>
> Thanks in advance,
>
> Regards,
> Pooja
>


Re: Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
Hi Eric,

Thanks for the quick reply.
I had looked at deduplication, but I found it does deduplication at
index time, right? I would prefer to do deduplication at search time!

Regards,
Pooja

On Tue, Jun 21, 2011 at 11:15 PM, Erick Erickson wrote:

> I think this is what you're looking for:
> http://wiki.apache.org/solr/Deduplication
>
> Best
> Erick
>
> On Tue, Jun 21, 2011 at 1:40 PM, Pooja Verlani 
> wrote:
> > Hi,
> >
> > I have a "X" field in my index, which is a feature hash I would like to
> use
> > to remove the duplicates in my result.
> > I cant keep this as the unique id field. Is there any method or any
> > parameter at the search time to remove the duplicates on a particular
> > field(hash in this case)?
> >
> > Thanks in advance,
> >
> > Regards,
> > Pooja
> >
>


Re: case insensitive searches but return original case

2011-06-21 Thread Jamie Johnson
Thanks for the reply, I did see that but I am displaying the information in
that field as well so I'll need to store them for this case.  For fields I
don't need to display I know that I can just tell it not to store it.

On Tue, Jun 21, 2011 at 1:34 PM, Erick Erickson wrote:

> Not really. The problem here is that facets are done on terms. To
> search effectively, Solr needs tokenized, lower-cased etc. terms.
>
> But since faceting is really just faceting on terms, this is incompatible
> with returning multi-term facets like "John Smith" so about all you can
> do is to copyfield to an un-analyzed field and facet on that... Note that
> you do NOT have to store the field you facet on BTW.
>
> Best
> Erick
>
> On Tue, Jun 21, 2011 at 12:53 PM, Jamie Johnson  wrote:
> > Is it possible to do case insensitive searches but return the original
> > case?  So for instance the original field is:
> >
> > John Smith
> >
> > I need to be able to do case insensitive searches tokenized searches, but
> > when the value is returned for faceting I'd like the value to be just
> "John
> > Smith", not "john" and "smith" or "john smith".  Is this possible?  I
> know I
> > can probably do this by having an additional field which is for faceting
> and
> > another which is for searching (won't give me case insensitive I don't
> > think) but is there a more elegant way to do this?
> >
>


Re: Removing duplicate field at the time of search

2011-06-21 Thread Erick Erickson
Well, in trunk and the soon-to-be-released Solr 3.3, you could use grouping.
What is the use-case here? Are you going to show all the docs (even duplicates)
some of the time?

Best
Erick

On Tue, Jun 21, 2011 at 1:53 PM, Pooja Verlani  wrote:
> Hi Eric,
>
> Thanks for the quick reply.
> I had looked at the deduplication but I found it to deduplication at the
> index time, right? I would prefer to do deduplication at the search time!
>
> Regards,
> Pooja
>
> On Tue, Jun 21, 2011 at 11:15 PM, Erick Erickson 
> wrote:
>
>> I think this is what you're looking for:
>> http://wiki.apache.org/solr/Deduplication
>>
>> Best
>> Erick
>>
>> On Tue, Jun 21, 2011 at 1:40 PM, Pooja Verlani 
>> wrote:
>> > Hi,
>> >
>> > I have a "X" field in my index, which is a feature hash I would like to
>> use
>> > to remove the duplicates in my result.
>> > I cant keep this as the unique id field. Is there any method or any
>> > parameter at the search time to remove the duplicates on a particular
>> > field(hash in this case)?
>> >
>> > Thanks in advance,
>> >
>> > Regards,
>> > Pooja
>> >
>>
>
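
For reference, a minimal sketch of the grouping request from SolrJ, assuming a Solr
version with result grouping and a hypothetical hash field named signature_hash (this
will not work on a stock 1.4 index):

import org.apache.solr.client.solrj.SolrQuery;

public class GroupingParamsExample {
    public static SolrQuery buildQuery(String userQuery) {
        SolrQuery query = new SolrQuery(userQuery);
        // Collapse results so only one representative per hash value comes back.
        query.set("group", true);
        query.set("group.field", "signature_hash");
        query.set("group.limit", 1);
        return query;
    }
}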


Re: case insensitive searches but return original case

2011-06-21 Thread Erick Erickson
Right. I'm saying that you can store one or the other, but there
is no good reason to store both. The facet values are the
values retrieved from the index, not the stored values. So you
can pull the stored values from either the searchable author field
just fine

Best
Erick

On Tue, Jun 21, 2011 at 2:03 PM, Jamie Johnson  wrote:
> Thanks for the reply, I did see that but I am displaying the information in
> that field as well so I'll need to store them for this case.  For fields I
> don't need to display I know that I can just tell it not to store it.
>
> On Tue, Jun 21, 2011 at 1:34 PM, Erick Erickson 
> wrote:
>
>> Not really. The problem here is that facets are done on terms. To
>> search effectively, Solr needs tokenized, lower-cased etc. terms.
>>
>> But since faceting is really just faceting on terms, this is incompatible
>> with returning multi-term facets like "John Smith" so about all you can
>> do is to copyfield to an un-analyzed field and facet on that... Note that
>> you do NOT have to store the field you facet on BTW.
>>
>> Best
>> Erick
>>
>> On Tue, Jun 21, 2011 at 12:53 PM, Jamie Johnson  wrote:
>> > Is it possible to do case insensitive searches but return the original
>> > case?  So for instance the original field is:
>> >
>> > John Smith
>> >
>> > I need to be able to do case insensitive searches tokenized searches, but
>> > when the value is returned for faceting I'd like the value to be just
>> "John
>> > Smith", not "john" and "smith" or "john smith".  Is this possible?  I
>> know I
>> > can probably do this by having an additional field which is for faceting
>> and
>> > another which is for searching (won't give me case insensitive I don't
>> > think) but is there a more elegant way to do this?
>> >
>>
>


Re: Removing duplicate field at the time of search

2011-06-21 Thread Pooja Verlani
I am fine with removing the duplicates and not showing them for this use case.
But grouping can also help me show one representative from each group.
At present I am using Solr 1.4. Any idea how to achieve it otherwise, if not
by using Solr 3.3?

Regards,
Pooja

On Tue, Jun 21, 2011 at 11:55 PM, Erick Erickson wrote:

> Well, in trunk and the soon-to-be-released Solr 3.3, you could use
> grouping,
> what is the use-case here? Are you going to show all the docs (even
> duplicates)
> some of the time?
>
> Best
> Erick
>
> On Tue, Jun 21, 2011 at 1:53 PM, Pooja Verlani 
> wrote:
> > Hi Eric,
> >
> > Thanks for the quick reply.
> > I had looked at the deduplication but I found it to deduplication at the
> > index time, right? I would prefer to do deduplication at the search time!
> >
> > Regards,
> > Pooja
> >
> > On Tue, Jun 21, 2011 at 11:15 PM, Erick Erickson <
> erickerick...@gmail.com>wrote:
> >
> >> I think this is what you're looking for:
> >> http://wiki.apache.org/solr/Deduplication
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Jun 21, 2011 at 1:40 PM, Pooja Verlani  >
> >> wrote:
> >> > Hi,
> >> >
> >> > I have a "X" field in my index, which is a feature hash I would like
> to
> >> use
> >> > to remove the duplicates in my result.
> >> > I cant keep this as the unique id field. Is there any method or any
> >> > parameter at the search time to remove the duplicates on a particular
> >> > field(hash in this case)?
> >> >
> >> > Thanks in advance,
> >> >
> >> > Regards,
> >> > Pooja
> >> >
> >>
> >
>


Re: Question about SolrResponseBase.toString()

2011-06-21 Thread Chris Hostetter

: I'm working with Solrj, and I like to use the SolrResponseBase.toString()
: method, as it seems to return JSON.  However, the JSON returned is not

many of the toString methods on internal solr objects use {} to show 
encapsulation when recursively calling toString() on sub objects, but they 
are not intended to be parsed as JSON -- they are just for 
debugging/human consumption.

: valid, as it misses quotes.  If I search directly against Solr using
: 
http://localhost:8080/apache-solr-3.1-SNAPSHOT/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&wt=json
... 
: When I search through the Solrj API, and to a SolrResponseBase.toString(),
: it looks like this:

the former is the output from the JSONResponseWriter, the latter is the
simple toString of the SolrJ client-side objects (which may have been
parsed from the binary format, or xml, depending on how your SolrJ client
was configured).

If you want access to the JSON String produced by the JSONResponseWriter 
(on the server side), i suggest you either access the server directly 
(bypassing SolrJ completely) or use SolrJ just for constructing/executing 
the requests, but implement your own ResponseParser with a getWriterType() 
that returns "json" and does nothing in the processResponse(...) methods 
except to slurp down the raw JSON string.

-Hoss
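
A rough sketch of that custom ResponseParser, assuming the SolrJ 3.1 ResponseParser
method signatures; the "json" key the raw string is stored under is arbitrary:

import java.io.InputStream;
import java.io.Reader;
import java.util.Scanner;

import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.common.util.NamedList;

public class RawJsonResponseParser extends ResponseParser {
    @Override
    public String getWriterType() {
        return "json";   // makes SolrJ request wt=json
    }

    @Override
    public NamedList<Object> processResponse(InputStream body, String encoding) {
        // Slurp the raw JSON instead of parsing it into SolrJ objects.
        return wrap(new Scanner(body, encoding).useDelimiter("\\A").next());
    }

    @Override
    public NamedList<Object> processResponse(Reader reader) {
        return wrap(new Scanner(reader).useDelimiter("\\A").next());
    }

    private NamedList<Object> wrap(String raw) {
        NamedList<Object> result = new NamedList<Object>();
        result.add("json", raw);
        return result;
    }
}

It could then be installed on the CommonsHttpSolrServer with something like
server.setParser(new RawJsonResponseParser()), and the raw string read back out of the
response's NamedList.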


Re: Solr Clustering For Multiple Pages

2011-06-21 Thread Stanislaw Osinski
Hi,

Currently, only the clustering of search results is implemented in Solr,
clustering of the whole index is not possible out of the box. In other
words, clustering applies only to the records you fetch during searching.
For example, if you set rows=10, only the 10 returned documents will be
clustered. You can try setting larger rows values (e.g. 100, 200, 500) to
get more clusters.

Staszek

On Mon, Jun 20, 2011 at 11:36, nilay@gmail.com wrote:

> Hi
>
> How can i create cluster for all records.
> Currently i  am sending clustering=true  param to solr  and it give the
> cluster in  response ,
> but it give for 10 rows because  rows=10 . So please suggest me how can i
> get the cluster for all records .
>
> How can i search with in cluster .
>
>  e.g  cluster created
>   Model(20)
>   Test(10)
>
> if i click on Model the i should get 20 records by filter so please  give
> me
> idea about   this .
>
>
> Please help me  to resolve this problem
>
> Regards
> Nilay Tiwari
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3085507.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
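
A small SolrJ sketch of the request side; the /clustering handler and the id unique-key
field are the stock example ones and may differ in a real setup:

import org.apache.solr.client.solrj.SolrQuery;

public class ClusteringRequestExample {
    public static SolrQuery buildQuery(String userQuery) {
        SolrQuery query = new SolrQuery(userQuery);
        query.set("qt", "/clustering");   // handler wired up with the clustering component
        query.set("clustering", true);
        query.setRows(200);               // cluster the top 200 hits instead of 10
        return query;
    }

    // "Searching within a cluster": filter the original query by the doc ids
    // the clustering section listed for the chosen cluster.
    public static void drillIntoCluster(SolrQuery query, String... docIds) {
        StringBuilder fq = new StringBuilder("id:(");
        for (int i = 0; i < docIds.length; i++) {
            if (i > 0) fq.append(" OR ");
            fq.append('"').append(docIds[i]).append('"');
        }
        fq.append(')');
        query.addFilterQuery(fq.toString());
    }
}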


searching using solrj and RecordSeparator characters

2011-06-21 Thread Jamie Johnson
I have a field with a RecordSeparator in it. How can I go about searching on
this field using SolrJ and Solr?


Re: searching using solrj and RecordSeparator characters

2011-06-21 Thread Ahmet Arslan
> I have a field with a RecordSeparator
> in it, how can i go about searching on
> this field using solrj and solr?

What do you mean by RecordSeparator?


Re: searching using solrj and RecordSeparator characters

2011-06-21 Thread Jamie Johnson
ASCII RecordSeparator http://www.bbdsoft.com/ascii.html


(char)30 will create it in Java

On Tue, Jun 21, 2011 at 4:41 PM, Ahmet Arslan  wrote:

> > I have a field with a RecordSeparator
> > in it, how can i go about searching on
> > this field using solrj and solr?
>
> What do you mean by RecordSeparator?
>
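
One way to sketch the query side from SolrJ, assuming a hypothetical field my_field and
an analyzer that actually keeps the ASCII 30 character as part of the indexed token:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.util.ClientUtils;

public class RecordSeparatorQueryExample {
    public static SolrQuery buildQuery(String part1, String part2) {
        // Re-create the stored value: two parts joined by ASCII 30 (record separator).
        String value = part1 + (char) 30 + part2;
        // Escape anything the Lucene query parser would treat specially.
        String escaped = ClientUtils.escapeQueryChars(value);
        return new SolrQuery("my_field:" + escaped);
    }
}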


Re: velocity: hyperlinking to documents

2011-06-21 Thread Erik Hatcher
I guess you mean from the /browse view?

You can override/replace hit.vm (in conf/velocity/hit.vm) with whatever you 
like.  Here's an example from a demo I recently did using the open Best Buy 
data where I mapped their url value for a product into a url_s field in Solr 
and rendered a link to it:

   <a href="$doc.getFieldValue('url_s')">#field("name_t")</a>

The #field macro is handy, as it'll show the stored value (if specified in fl) 
or return the highlighted stuff if that is in the response.  And $doc is a 
reference to the SolrDocument instance with the following API:   


Erik


On Jun 21, 2011, at 12:18 , okayndc wrote:

> hello,
> 
> i'm not sure of the correct velocity syntax to link, let's say a title
> field, to the actual document itself. i have a hostname, a category (which
> is also the directory where the file sits) and filename fields in my schema. 
> can i potentially use these fields to get at the document itself?
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/velocity-hyperlinking-to-documents-tp3091504p3091504.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: DIH Scheduling

2011-06-21 Thread Gora Mohanty
On Tue, Jun 21, 2011 at 10:41 PM, sabman  wrote:
> There is information
> http://wiki.apache.org/solr/DataImportHandler#Scheduling here  about
> Scheduling but I don't understand how to use them. I am not a Java developer
> so maybe I am missing something obvious.
[...]

Depending on what operating system you are using, you could probably
use curl, and the OS periodic-task scheduler, like UNIX cron, to achieve
DIH scheduling, without programming in Java.

Regards,
Gora


Re: DIH Scheduling

2011-06-21 Thread sabman
Thanks. Using curl would be an option, but ideally I want to implement it
using this scheduler. I want to add Solr as part of another application
package and send it to clients. So rather than asking them to run a cron job, it
would be easier to have Solr configured to run the scheduler.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-Scheduling-tp3091764p3092985.html
Sent from the Solr - User mailing list archive at Nabble.com.
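
For what it's worth, a stripped-down sketch of that idea in plain Java: a scheduled HTTP
call to the DIH handler. The URL and interval are placeholders, and in a real setup it
would be started from something like a servlet context listener rather than a main method:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DihSchedulerExample {
    public static void main(String[] args) {
        final String dihUrl =
            "http://localhost:8983/solr/dataimport?command=delta-import&clean=false";

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    // Fire the delta-import; DIH reports progress on its status page.
                    HttpURLConnection conn =
                        (HttpURLConnection) new URL(dihUrl).openConnection();
                    System.out.println("delta-import -> HTTP " + conn.getResponseCode());
                    conn.disconnect();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, 0, 30, TimeUnit.MINUTES);
    }
}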


Re: wrong index version of solr3.2?

2011-06-21 Thread Chris Hostetter

: After switching to solr 3.2 and building a new index from scratch I ran
: check_index which reports:
: Segments file=segments_or numSegments=1 version=FORMAT_3_1 [Lucene 3.1]
: 
: Why do I get FORMAT_3_1 and Lucene 3.1, anything wrong with my index?

That's just because the index format didn't change between 3.1 and 3.2, 
but I can understand how it might be confusing.

I filed an issue to try and make this less confusing in the future...
https://issues.apache.org/jira/browse/LUCENE-3226


-Hoss


Re: copyField generates "multiple values encountered for non multiValued field"

2011-06-21 Thread Chris Hostetter

: This is for debugging purposes, so I am sending the exact same data that are
: already stored in Solr's index.
...
: ERROR: [288400] multiple values encountered for non multiValued field
: "field2" [fieldvalue, fieldvalue]
: 
: The scenario:
: - "field1" is implicitly single value, type "text", indexed and stored
: - "field2" is generated via a copyField directive in schema.xml, implicitly
: single value, type "string", indexed and stored
: 
: What appears to happen:
: - On the first "add" (SolrClient::addDocuments(array(SolrInputDocument
: theDocument))), regular fields like "field1" get overwritten as intended
: - "field2", defined with a copyField, but still single value, gets
: _appended_ instead
: - When I retrieve the updated document in a query and try to add it again,
: it won't let me because of the inconsistent multi-value state
...
: But: Solr appears to be generating the corrupted state itsself via
: copyField?
: What's going wrong? I'm pretty confused...

I think you are misunderstanding the error you are seeing.  Solr isn't 
creating any inconsistent state; the multiValued check does in fact happen 
after the copyFields.


Based on your description, this is what it sounds to me like you are 
doing and why you are getting your error...

Initially sending solr a doc that looks like this...

id=1; field1=fieldvalue

...which when copyFields are evaluated winds up looking like this...

id=1; field1=fieldvalue; field2=fieldvalue

...that document goes in the index; you then execute a query that 
matches it and fetch the stored values of that document from Solr -- 
getting all three fields back (i.e., id, field1, field2).

You then attempt to index that document again, sending all 3 fields...

id=1; field1=fieldvalue; field2=fieldvalue

...which when copyFields are evaluated winds up looking like this...

id=1; field1=fieldvalue; field2=fieldvalue; field2=fieldvalue

..and that's why you get the error you are seeing.

If I'm misunderstanding your "retrieve the updated document in a query 
and try to add it again" process, can you please provide some example 
configs and the exact steps to reproduce (using the post.jar, or curl, or 
something simple that doesn't require PECL)?

-Hoss


Re: ampersand, dismax, combining two fields, one of which is keywordTokenizer

2011-06-21 Thread Chris Hostetter

: It seems like the problem is when different fields in the 'qf' produce a
: different number of tokens for a given query.  dismax needs to know the number
: of tokens in the input in order to calculate 'mm', when 'mm' is expressed as a
: percentage, or when different mm's are given for different numbers of input
: tokens.

Actually, the fundamental problem is that when this situation arises, 
dismax has no way of knowing *if* you want the token that only produced a 
TermQuery in fieldA but not fieldB to be counted at all.

In your case, you don't want the "&" query against your simple 
(non-whitespace-stripping) field to count in computing minShouldMatch, but how 
does dismax know that?

If someone has a field that not only strips out punctuation, but also 
ignores anything that doesn't match one of their known keywords (using the 
KeepWordsFilter), they would want the exact opposite of your situation -- they 
are really counting on the cases where a token produces a valid query for 
that special field to be a factor, and don't want the number of clauses used 
to compute minShouldMatch to be lowered artificially just because all the other 
tokens in the input don't produce anything for that field.

bottom line: as long as one field produces a token for a chunk of input, 
that's a clause -- it may only be a clause that's queried against one 
field, but it's still a clause.

: So what if dismax could recognize that different fields were producing
: different arrity of input, and use the _smallest_ number for it's 'mm'
: calculations, instead of current behavior where it's effectively the largest
: number? (Or '1' if the smallest number is '0'?!) That would in some cases
: produce errors in the other direction -- more hits coming back than you
: naively/intuitively expect.   Not sure if that would be worse or better. Seems
: better to me, less bad failure mode.

Consider my previous example, and something similar to Jira searching 
where you might have a "projectCode" field with a query-time 
KeepWordsFilter that only matches project codes ... right now, a query 
like q=SOLR+foo+bar+baz&mm=100%&qf=projectCode^100+text would give you 
some really nice results that match all the input, but if SOLR is a 
projectCode those issues bubble to the top -- with your proposal, the 
effective mm would be "1" (because the projectCode field would only wind 
up with the SOLR clause) and you'd get all sorts of crap -- because those 
other clauses are all still there.  So you'd get *all* projectCode:SOLR 
issues, and *all* issues matching text:foo, and *all* issues matching 
text:bar, etc...

: Or better yet, but surely harder perhaps infeasible to code, it would somehow
: apply the 'mm' differently to each field. Not even sure what that means

That's pretty much impossible.  the whole nature of the dismax style 
parser is that a DisjunctionMaxQuery is computed for each "word" of the 
q, across all "fields" in the qf -- it's those DisjunctionMaxQueries that 
are wrapped in a BooleanQuery with minShouldMatch set on it...

http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

...if you "flipped" that matrix along the diagonal to have a different mm 
per field, you'd lose the value of the field-specific boosts.


Ultimately the problem you had with "&" is the same problem people have 
with stopwords, and comes down to the same thing: if you don't want some 
chunk of text to be "significant" when searching a field in your qf, have 
your analyzer remove it -- if the analyzer for a field in the qf produces 
a token, dismax assumes it's significant to the query and factors it into 
the mm and matching and scoring.


-Hoss


Re: found a bug in query parser upgrading from 1.4.1 to 3.1

2011-06-21 Thread Chris Hostetter

: 
http://localhost:8983/solr/select?q=life&qf=description_text&defType=dismax&sort=scores:rails_f+desc
...
: If I insert the same document into solr 3.1 and run the same query I get the
: error:
: 
: Problem accessing /solr/select. Reason:
: 
: undefined field scores
: 
: For some reason, solr has cutoff the column name from the colon
: forward so "scores:rails_f" becomes "scores"

Yep, this has also been reported in SOLR-2606; I've added some comments 
there...

https://issues.apache.org/jira/browse/SOLR-2606

-Hoss


Re: sending results of function query to range query

2011-06-21 Thread Chris Hostetter

: I am not sure if I can use function queries this way. I have a query 
: like this"attributeX:[* TO ?]" in my DB. I replace the ? with input from 
: the front end. Obviously, this works fine. However, what I really want 
: to do is "attributeX:[* TO (3 * ?)]" Is there anyway to embed the 
: results of a function query inside the query?

not really ... you'd need to do that evaluation in whatever code 
substitutes the value.
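
For example, a minimal SolrJ-side sketch (assuming the field is named 
attributeX, "server" is an existing SolrServer, and userInput is the value 
coming from the front end):

    double x = Double.parseDouble(userInput);
    // evaluate 3 * x in the client before the query string is built
    SolrQuery q = new SolrQuery("attributeX:[* TO " + (3 * x) + "]");
    QueryResponse rsp = server.query(q);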

the thing people tend to forget about "function queries" and the "function 
syntax" in solr is that it's not about simple numerics -- the function is 
applied against every document, and the result is the set of scores across 
all documents, so it isn't even possible to implement a syntax where the 
output of a function is used as the endpoint of a range, because the 
output of that function could be different for every doc.

-Hoss


RE: ampersand, dismax, combining two fields, one of which is keywordTokenizer

2011-06-21 Thread Jonathan Rochkind
Thanks, that's helpful. 

It still seems like current behavior does the "wrong" thing in _many_ cases (I 
know a lot of people get tripped up by it, sometimes on this list) -- but I 
understand your cases where it does the right thing, and where what I'm 
suggesting would be the wrong thing. 

> Ultimately the problem you had with "&" is the same problem people have 
> with stopwords, and comes down to the same thing: if you don't want some 
> chunk of text to be "significant" when searchng a field in your qf, have 
> your analyzer remove it 

Ah, but see the problem people have with stopwords is when they actually DID 
that. They didn't want a term to be 'significant' in one field, but they DID 
want it to be 'significant' in another field... but how this affects the 'mm' 
ends up being kind of counter-intuitive for some (but not other) 
setups/intentions.   It's counter-intuitive to me that adding a field to the 
'qf' set results in _fewer_ hits than the same 'qf' set without the new field 
-- although I understand your cases where you added the field to the 'qf' 
precisely in order to intentionally get that behavior, that's definitely not a 
universal case. 

And the fact that unpredictable changes to field analysis that aren't as simple 
as stopwords can lead to this same problem (as in this case where one field 
ignores punctuation and the other doesn't) -- it's definitely a trap waiting 
for some people. 

I wonder if it would be a good idea to have a parameter to (e)dismax that told 
it which of these two behaviors to use? The one where the 'term count' is based 
on the maximum number of terms from any field in the 'qf', and one where it's 
based on the minimum number of terms produced from any field in the qf?  I am 
still not sure how feasible THAT is, but it seems like a good idea to me. The 
current behavior is definitely a pitfall for many people.  

Or maybe a feature where you tell dismax: the number of tokens produced by 
field X, THAT's the one you should use for your 'term count' for mm; all the 
other fields are really just in there as supplementary -- for boosting, 
or for bringing a few more results in -- but NOT the case where you 
intentionally add a 'qf' with KeepWordsFilter in order to _reduce_ the result 
set. I think that's a pretty common use case too. 

Jonathan


MultiValued facet behavior question

2011-06-21 Thread Bill Bell
I have a field: specialties that is multiValued.

It indicates the doctor's specialties: cardiologist, internist, etc.

When someone does a search: "Cardiologist", I use
q=cardiologist&defType=dismax&qf=specialties&facet=true&facet.field=specialties

What I want to come out in the facet is the Cardiologist (since it matches
exactly) and the number that matches: 700.
I don't want to see the other values that are not Cardiologist.

Now I see:

Cardiologist: 700
Internist: 45
Family Doctor: 20

This means that several Cardiologists are also internists and family
doctors. When it matches exactly, I don't want to see Internists or Family
Doctors. How do I send a query to Solr with a condition, e.g.
facet.query=specialties:Cardiologist&facet.field=specialties?

Then if the facet.query returns something use it, otherwise use the facet.field one?

Other ideas?
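
In SolrJ that facet.query idea might look roughly like this (assuming an
existing SolrServer named server; field and value names are just the ones from
the example above, untested):

    SolrQuery q = new SolrQuery("cardiologist");
    q.set("defType", "dismax");
    q.set("qf", "specialties");
    q.setFacet(true);
    q.addFacetQuery("specialties:Cardiologist");
    QueryResponse rsp = server.query(q);
    // facet.query counts come back keyed by the query string itself
    Integer exactCount = rsp.getFacetQuery().get("specialties:Cardiologist");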





RE: ampersand, dismax, combining two fields, one of which is keywordTokenizer

2011-06-21 Thread Chris Hostetter

: not other) setups/intentions.  It's counter-intuitive to me that adding 
: a field to the 'qf' set results in _fewer_ hits than the same 'qf' set 

Agreed .. but that's where looking at the debug info comes in: the reason 
for that behavior is that your old qf treated part of your input as garbage, 
and the new field respects it and uses it in the calculation.

mind you: the "fewer hits" behavior only happens when using a percentage 
value in mm ... if you had mm=2 you'd get more results, but you've asked 
for "66%" (or whatever) and with that new qf there is a differnet number 
of clauses produced by query parsing.

: I wonder if it would be a good idea to have a parameter to (e)dismax 
: that told it which of these two behaviors to use? The one where the 
: 'term count' is based on the maximum number of terms from any field in 
: the 'qf', and one where it's based on the minimum number of terms 
: produced from any field in the qf?  I am still not sure how feasible 

Even in your use case, I don't think you are fully considering what that 
would produce.  Imagine that an mmType=min param existed and gave you what 
you're asking for.  Now imagine that you have two fields, one named 
"simple" that strips all punctuation and one named "complex" that doesn't, 
and you have a query like this...

q=Foo & Bar
qf=simple complex
mm=100%
mmType=min

  * Foo produces tokens for all qf
  * & only produces tokens for some qf (complex)
  * Bar produces tokens for all qf

your mmType would say "there are only 2 tokens that we can query across 
all fields, so our computed minShouldMatch should be 100% of 2 == 2"

sounds good so far right?

The problem is you still have a query clause coming from that "&" 
character ... you have 3 real clauses, one of which is that term query for 
"complex:&" which means that with your (computed) minShouldMatch of 2 you 
would see matches for any doc that happened to have indexed the "&" symbol 
in the "complex" field and also matched *either* of Foo or Bar (in either 
field)

So while a lot of your results would match both Foo and Bar, you'd still 
get a bunch of weird results.

: Or maybe a feature where you tell dismax, the number of tokens produced 
: by field X, THAT's the one you should use for your 'term count' for mm, 

Hmmm, maybe.  I'd have to see a patch in action and play with it, to 
really think it through ... hmmm ... honestly I really can't imagine how 
that would be helpful in general...

in order to use a feature like that you'd have to really think hard about 
the query analysis of your fields, and which ones will produce which 
tokens in which situations in order to make sure you pick the *right* 
value for that param -- but once you've done that hard thinking you might 
as well feed it back into your schema.xml and say "the query analyzer for 
field 'complex' should prune any tokens that only contain punctuation" 
(instead of saying "'complex' will produce tokens that only contain 
punctuation, so let's tell dismax to compute mm based only on 'simple').  
After all, there might not be one single field that you can pick -- maybe 
'complex' lets tokens that are all punctuation through but strips 
stopwords, and maybe 'simple' does the opposite ... no param value you 
pick will help you with that possibility, you really just need to fix the 
query analyzers to make sense if you want to use both of those two fields 
in the qf.


-Hoss


Re: MultiValued facet behavior question

2011-06-21 Thread Darren Govoni

So are you saying that for all results for "cardiologist",
you don't want facets not matching "Cardiologist" to be
returned as facets?

what happens when you make q=specialties:Cardiologist?
instead of just q=Cardiologist?

Seems that if you make the query on the field, then all
your results will necessarily qualify and you can discard
any additional facets you don't want (e.g. that don't
match the initial query term).

Maybe you can write what you see now, with what you
want to help clarify.

On 06/21/2011 09:47 PM, Bill Bell wrote:

I have a field: specialties that is multiValued.

It indicates the doctor's specialties: cardiologist, internist, etc.

When someone does a search: "Cardiologist", I use
q=cardiologist&defType=dismax&qf=specialties&facet=true&facet.field=specialt
ies

What I want to come out in the facet is the Cardiologist (since it matches
exactly) and the number that matches: 700.
I don't want to see the other values that are not Cardiologist.

Now I see:

Cardiologist: 700
Internist: 45
Family Doctor: 20

This means that several Cardiologist's are also internists and family
doctors. When it matches exactly, I don't want to see Internists, Family
Doctors. How do I send a query to Solr with a condition.
Facet.query=specialties:Cardiologist&facet.field=specialties

Then if the query returns something use it, otherwise use the field one?

Other ideas?








Re: MultiValued facet behavior question

2011-06-21 Thread Bill Bell
Doing it with q=specialties:Cardiologist or
q=Cardiologist&defType=dismax&qf=specialties
does not matter; the issue is how I see facets. I want the facets to only
show the one match,
and not all the multiValued values in specialties that match...

Example,

Name|specialties
Bell|Cardiologist
Smith|Cardiologist,Family Doctor
Adams|Cardiologist,Family Doctor,Internist

When I facet.field=specialties I get:

Cardiologist: 3
Internist: 1
Family Doctor: 2


I only want it to return:

Cardiologist: 3

Because this matches exactly... Facet on the field that matches and only
return the number for that.

It can get more complicated. Here is another example:

q=cardiology&defType=dismax&qf=specialties


(Cardiology and cardiologist are stems)...

But I don't really know which value in specialties matched perfectly.

Again, I only want it to return:

Cardiologist: 3

If I searched on q=internist&defType=dismax&qf=specialties, I want the
result to be:


Internist: 1


Does this all make sense?







On 6/21/11 8:23 PM, "Darren Govoni"  wrote:

>So are you saying that for all results for "cardiologist",
>you don't want facets not matching "Cardiologist" to be
>returned as facets?
>
>what happens when you make q=specialities:Cardiologist?
>instead of just q=Cardiologist?
>
>Seems that if you make the query on the field, then all
>your results will necessarily qualify and you can discard
>any additional facets you don't want (e.g. that don't
>match the initial query term).
>
>Maybe you can write what you see now, with what you
>want to help clarify.
>
>On 06/21/2011 09:47 PM, Bill Bell wrote:
>> I have a field: specialties that is multiValued.
>>
>> It indicates the doctor's specialties: cardiologist, internist, etc.
>>
>> When someone does a search: "Cardiologist", I use
>> 
>>q=cardiologist&defType=dismax&qf=specialties&facet=true&facet.field=speci
>>alt
>> ies
>>
>> What I want to come out in the facet is the Cardiologist (since it
>>matches
>> exactly) and the number that matches: 700.
>> I don't want to see the other values that are not Cardiologist.
>>
>> Now I see:
>>
>> Cardiologist: 700
>> Internist: 45
>> Family Doctor: 20
>>
>> This means that several Cardiologist's are also internists and family
>> doctors. When it matches exactly, I don't want to see Internists, Family
>> Doctors. How do I send a query to Solr with a condition.
>> Facet.query=specialties:Cardiologist&facet.field=specialties
>>
>> Then if the query returns something use it, otherwise use the field one?
>>
>> Other ideas?
>>
>>
>>
>>
>




Re: Solr Clustering For Multiple Pages

2011-06-21 Thread nilay....@gmail.com
Hi

Thanks a lot.

Can you please help me with how to implement the filter for a topic cluster
like Model(10)? When I click on Model, I need to get those 10 docs.

Regards
Nilay Tiwari

On Wed, Jun 22, 2011 at 1:14 AM, Stanislaw Osinski-4 [via Lucene] <
ml-node+3092594-1426669115-405...@n3.nabble.com> wrote:

> Hi,
>
> Currently, only the clustering of search results is implemented in Solr,
> clustering of the whole index is not possible out of the box. In other
> words, clustering applies only to the records you fetch during searching.
> For example, if you set rows=10, only the 10 returned documents will be
> clustered. You can try setting larger rows values (e.g. 100, 200, 500) to
> get more clusters.
>
> Staszek
>
> On Mon, Jun 20, 2011 at 11:36, [hidden 
> email]<[hidden
> email] >wrote:
>
> > Hi
> >
> > How can i create cluster for all records.
> > Currently i  am sending clustering=true  param to solr  and it give the
> > cluster in  response ,
> > but it give for 10 rows because  rows=10 . So please suggest me how can i
>
> > get the cluster for all records .
> >
> > How can i search with in cluster .
> >
> >  e.g  cluster created
> >   Model(20)
> >   Test(10)
> >
> > if i click on Model the i should get 20 records by filter so please  give
>
> > me
> > idea about   this .
> >
> >
> > Please help me  to resolve this problem
> >
> > Regards
> > Nilay Tiwari
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3085507.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
>
> --
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3092594.html
>  To unsubscribe from Solr Clustering For Multiple Pages, click 
> here.
>
>



-- 
Regards

Nilay Tiwari


-
Regards
Nilay Tiwari
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3094290.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Read past EOF error due to broken connection

2011-06-21 Thread pravesh
First commit, and then try searching again.

You can also use Lucene's CheckIndex tool to check and fix your index (it may
remove some corrupt segments from your index).
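
For reference, CheckIndex is usually run from the command line against the
index directory, roughly like this (the jar name and paths are only examples;
-fix drops unreadable segments, so back up the index first):

    java -ea:org.apache.lucene... -cp lucene-core-3.2.0.jar \
        org.apache.lucene.index.CheckIndex /path/to/solr/data/index -fix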

Thanx
Pravesh

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Read-past-EOF-error-due-to-broken-connection-tp3091247p3094334.html
Sent from the Solr - User mailing list archive at Nabble.com.


Problem in accessing a variable's changed value outside of if block in javascript code

2011-06-21 Thread Romi
*$("#submit").click(function(){
var query=getquerystring() ; //get the query string entered by user
// get the JSON response from solr server 
var newquery=query;

   
$.getJSON("http://192.168.1.9:8983/solr/db/select/?wt=json&&start=0&rows=100&q="+query+"&json.wrf=?";,
function(result){
//$.each(result.response.docs, function(result){

if(result.response.numFound==0)
{

   
$.getJSON("http://192.168.1.9:8983/solr/db/select/?wt=json&&start=0&rows=100&q="+query+"&spellcheck=true&json.wrf=?";,
function(result){


$.each(result.spellcheck.suggestions, function(i,item){
newquery=item.suggestion;

});

});

}*



In the above JavaScript code, the variable newquery initially has the value of
query, but when the if condition is true its value is changed. My problem is
that I am not getting the changed value outside of the if block, and I want
that changed value. How can I do this?


-
Thanks & Regards
Romi
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-in-accessing-a-variable-s-changed-value-outside-of-if-block-in-javascript-code-tp3094342p3094342.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Clustering For Multiple Pages

2011-06-21 Thread nilay....@gmail.com
Thanks a lot. I was thinking I was not doing it the correct way.

-
Regards
Nilay Tiwari
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3094379.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Clustering For Multiple Pages

2011-06-21 Thread nilay....@gmail.com
Can you please tell me how I can apply a filter to cluster data in Solr?

Currently I store the docid and topic name in a Map, get the ids by topic
from the Map, and then pass them into Solr separated by OR conditions.

Is there any other way to do this?
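
For what it's worth, a rough SolrJ sketch of that ID-filter approach (assuming
an existing SolrServer named server, a unique key field named id, and a
collection clusterDocIds holding the ids collected for the chosen cluster):

    StringBuilder fq = new StringBuilder("id:(");
    for (Iterator<String> it = clusterDocIds.iterator(); it.hasNext();) {
        fq.append(ClientUtils.escapeQueryChars(it.next()));
        if (it.hasNext()) fq.append(" OR ");
    }
    fq.append(")");
    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery(fq.toString());   // restrict results to the selected cluster's docs
    QueryResponse rsp = server.query(q);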



-
Regards
Nilay Tiwari
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Clustering-For-Multiple-Pages-tp3085507p3094390.html
Sent from the Solr - User mailing list archive at Nabble.com.