Re: MorphlineSolrSink

2013-07-15 Thread Israel Ekpo
Rajesh,

I think this question is better suited for the FLUME user mailing list.

You will need to configure the sink with the expected values so that the
events from the channels are routed to the right place.
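For reference, a sink of this type is typically declared in the Flume agent properties file along these lines (the agent name, channel name, and morphline file path below are placeholders, not values from this thread; check the Flume User Guide for the full property list):

```
# flume.conf sketch -- agent/channel names and the morphline file path
# are placeholders
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = fileChannel
agent.sinks.solrSink.morphlineFile = /etc/flume/conf/morphline.conf
agent.sinks.solrSink.morphlineId = morphline1
```

The morphline file then defines how each Flume event is transformed and loaded into Solr.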

On Mon, Jul 15, 2013 at 4:49 PM, Rajesh Jain  wrote:

> Newbie question:
>
> I have a Flume server, where I am writing to sink which is a RollingFile
> Sink.
>
> I have to take these files from the sink and send them to Solr, which can
> index and provide search.
>
> Do I need to configure MorphlineSolrSink?
>
> What is the mechanism to do this or send this data over to Solr?
>
> Thanks,
> Rajesh
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: SOLR search performance - Linux vs Windows servers

2010-06-16 Thread Israel Ekpo
That's a good note.

I get this kind of question a lot.

Most of the time, the reason is that there are database servers (MySQL),
web servers (Apache), and other processes running on the Linux box.

Try to verify that the load, number of processors/cores, and other
environment settings are similar before drawing a conclusion.
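As a starting point, a couple of portable commands run on both boxes make the comparison concrete (extend with `java -version`, `free -m`, etc. as appropriate for your environment):

```shell
# Quick sanity checks to run on both boxes before comparing query times
uptime   # load averages for the last 1, 5 and 15 minutes
nproc    # number of available processor cores
```

If the load averages or core counts differ significantly, the benchmark comparison is not apples to apples.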

On Wed, Jun 16, 2010 at 5:43 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> BB,
>
> Could it be that you are comparing apples and oranges?
> * Is the hardware identical?
> * Are indices identical?
> * Are JVM versions the same?
> * Are JVM arguments identical?
> * Are the two boxes "equally idle" when Solr is not running?
>
> * etc.
>
> In general, no, there is no reason why Windows would automatically be
> faster than Linux.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: bbarani 
> > To: solr-user@lucene.apache.org
> > Sent: Wed, June 16, 2010 5:06:55 PM
> > Subject: SOLR search performance - Linux vs Windows servers
> >
> >
> > Hi,
> >
> > I have SOLR instances running in both Linux / windows server (same version /
> > same index data). Search performance is good in windows box compared to
> > Linux box.
> >
> > Some queries takes more than 10 seconds in Linux box but takes just a second
> > in windows box. Have anyone encountered this kind of issue before?
> >
> > Thanks,
> > BB
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/SOLR-search-performance-Linux-vs-Windows-servers-tp901069p901069.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


[PECL-DEV] [ANNOUNCEMENT] solr-0.9.11 (beta) Released

2010-06-21 Thread Israel Ekpo
The new PECL package solr-0.9.11 (beta) has been released at
http://pecl.php.net/.

Release notes
-
- Added ability to specify response writer in constructor option ("wt")
- Added new method to set response writer SolrClient::setResponseWriter()
- Currently, the only supported response writers are 'xml' and 'phpnative'
- Added support for new native Solr response writer
- New response writer is available at
https://issues.apache.org/jira/browse/SOLR-1967

Package Info
-
The extension effectively simplifies the process of interacting with Apache
Solr using PHP5, and it comes with built-in readiness for the latest features
added in Solr 1.4. It provides features such as built-in, serializable
query string builder objects that simplify the manipulation
of name-value pair request parameters across repeated requests. The response
from the Solr server is also automatically parsed into native php objects
whose properties can be accessed as array keys or object properties without
any additional configuration on the client-side. Its advanced HTTP client
reuses the same connection across multiple requests and provides built-in
support for connecting to Solr servers secured behind HTTP Authentication or
HTTP proxy servers. It is also able to connect to SSL-enabled containers.
Please consult the documentation for more details on features.

Related Links
-
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-0.9.11.tgz
Documentation: http://docs.php.net/

Authors
-
Israel Ekpo  (lead)

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


[NEWS] New Response Writer for Native PHP Solr Client

2010-06-22 Thread Israel Ekpo
Hi Solr users,

If you are using Apache Solr via PHP, I have some good news for you.

There is a new response writer for the PHP native extension, currently
available as a plugin.

This new feature adds a new response writer class to the
org.apache.solr.request package.

This class is used by the PHP Native Solr Client driver to prepare the query
response from Solr.

This response writer allows you to configure the way the data is serialized
for the PHP client.

You can use your own class name and you can also control how the properties
are serialized as well.

The formatting of the response data is very similar to the way it is
currently done by the PECL extension on the client side.

The only difference now is that this serialization is happening on the
server side instead.

You will find this new response writer particularly useful when dealing with
responses for

- highlighting
- admin threads responses
- more like this responses

to mention just a few

You can pass the "objectClassName" request parameter to specify the class
name to be used for serializing objects.

Please note that the class must be available on the client side to avoid a
PHP_Incomplete_Object error during the unserialization process.

You can also pass in the "objectPropertiesStorageMode" request parameter
with either a 0 (independent properties) or a 1 (combined properties).

These parameters can also be passed as a named list when loading the
response writer in the solrconfig.xml file

Having this control allows you to create custom objects which gives the
flexibility of implementing custom __get methods, ArrayAccess, Traversable
and Iterator interfaces on the PHP client side.

Until this class is incorporated into Solr, you simply have to copy the jar
file containing this plugin into the lib directory under $SOLR_HOME.

The jar file is available here

https://issues.apache.org/jira/browse/SOLR-1967

Then set up the configuration as shown below and restart your servlet
container.

Below is an example configuration in solrconfig.xml (the writer class name
follows the SOLR-1967 plugin):

<queryResponseWriter name="phpnative"
    class="org.apache.solr.request.PHPNativeResponseWriter">
  <str name="objectClassName">SolrObject</str>
  <int name="objectPropertiesStorageMode">0</int>
</queryResponseWriter>

Below is an example implementation on the PHP client side.

Support for specifying custom response writers will be available starting
from the 0.9.11 version (released today) of the PECL extension for Solr
currently available here

http://pecl.php.net/package/solr

Here is an example of how to use the new response writer with the PHP
client.


<?php

class SolrClass
{
    public $_properties = array();

    public function __get($property_name)
    {
        if (property_exists($this, $property_name)) {
            return $this->$property_name;
        } else if (isset($this->_properties[$property_name])) {
            return $this->_properties[$property_name];
        }
        return null;
    }
}

$options = array
(
'hostname' => 'localhost',
'port' => 8983,
'path' => '/solr/'
);

$client = new SolrClient($options);

$client->setResponseWriter("phpnative");

$response = $client->ping();

$query = new SolrQuery();

$query->setQuery("*:*");

$query->set("objectClassName", "SolrClass");
$query->set("objectPropertiesStorageMode", 1);

$response = $client->query($query);

$resp = $response->getResponse();

?>


Documentation of the changes to the PECL extension is available here

http://docs.php.net/manual/en/solrclient.construct.php
http://docs.php.net/manual/en/solrclient.setresponsewriter.php

Please contact me at ie...@php.net, if you have any questions or comments.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Disk usage per-field

2010-07-03 Thread Israel Ekpo
Currently, this feature is not available.

The amount of space a field consumes varies and depends on whether the field
is indexed only, stored only, or indexed and stored.

It also depends on how the field is analyzed.

On Fri, Jul 2, 2010 at 2:59 PM, Shawn Heisey  wrote:

> On 6/30/2010 5:44 PM, Shawn Heisey wrote:
>
>> Is it possible for Solr (or Luke/Lucene) to tell me exactly how much of
>> the total index disk space is used by each field?  It would also be very
>> nice to know, for each field, how much is used by the index and how much is
>> used for stored data.
>>
>>
> Still interested in this.  It would be perfectly OK if such a thing were
> completely external to Solr and required a good chunk of time to calculate.
>  I would not need to do it very often.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: ANNOUNCE: Stump Hoss @ Lucene Revolution

2010-08-23 Thread Israel Ekpo
Chris,

I have a couple of questions I would like to throw your way.

Is there a place where one can sign up for this?

It sounds very interesting.

On Mon, Aug 23, 2010 at 4:49 PM, Chris Hostetter
wrote:

>
> Hey everybody,
>
> As you (hopefully) have heard by now, Lucid Imagination is sponsoring a
> Lucene/Solr conference in Boston about 6 weeks from now.  We've got a lot of
> really great speakers lined up to give some really interesting technical
> talks, so I offered to do something a little bit different.
>
> I'm going to be in the hot seat for a "Stump The Chump" style session,
> where I'll be answering Solr questions live and unrehearsed...
>
>http://bit.ly/stump-hoss
>
> The goal is to really make me sweat and work hard to think of creative
> solutions to non-trivial problems on the spot -- like when I answer
> questions on the solr-user mailing list, except in a crowded room with
> hundreds of people staring at me and laughing.
>
> But in order to be a success, we need your questions/problems/challenges!
>
> If you had a tough situation with Solr that you managed to solve with a
> creative solution (or haven't solved yet) and are interesting to see what
> type of solution I might come up with under pressure, please email a
> description of your problem to st...@lucenerevolution.org -- More details
> online...
>
> http://lucenerevolution.org/Presentation-Abstracts-Day1#stump-hostetter
>
> Even if you won't be able to make it to Boston, please send in any
> challenging problems you would be interested to see me tackle under the gun.
>  The session will be recorded, and the video will be posted online shortly
> after the conference has ended.  And if you can make it to Boston: all the
> more fun to watch live and in person (and maybe answer follow up questions)
>
> In any case, it should be a very interesting session: folks will either get
> to learn a lot, or laugh at me a lot, or both.  (win/win/win)
>
>
> -Hoss
>
> --
> http://lucenerevolution.org/  ...  October 7-8, Boston
> http://bit.ly/stump-hoss  ...  Stump The Chump!
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Null Pointer Exception while indexing

2010-09-16 Thread Israel Ekpo
Try removing the data directory and then restart your Servlet container and
see if that helps.

On Thu, Sep 16, 2010 at 3:28 AM, Lance Norskog  wrote:

> Which version of Solr? 1.4?, 1.4.1? 3.x branch? trunk? if the 3.x or the
> trunk, when did you pull it?
>
>
> andrewdps wrote:
>
>> What could be possible error for
>>
>> 14-Sep-10 4:28:47 PM org.apache.solr.common.SolrException log
>> SEVERE: java.util.concurrent.ExecutionException:
>> java.lang.NullPointerException
>>at java.util.concurrent.FutureTask$Sync.innerGet(libgcj.so.90)
>>at java.util.concurrent.FutureTask.get(libgcj.so.90)
>>at
>>
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:439)
>>at
>>
>> org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:602)
>>at java.util.concurrent.Executors$RunnableAdapter.call(libgcj.so.90)
>>at java.util.concurrent.FutureTask$Sync.innerRun(libgcj.so.90)
>>at java.util.concurrent.FutureTask.run(libgcj.so.90)
>>at
>>
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$2(libgcj.so.90)
>>at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(libgcj.so.90)
>>at java.util.concurrent.ThreadPoolExecutor$Worker.run(libgcj.so.90)
>>at java.lang.Thread.run(libgcj.so.90)
>> Caused by: java.lang.NullPointerException
>>at
>> org.apache.solr.search.FastLRUCache.getStatistics(FastLRUCache.java:252)
>>at org.apache.solr.search.FastLRUCache.toString(FastLRUCache.java:280)
>>at java.lang.StringBuilder.append(libgcj.so.90)
>>at
>> org.apache.solr.search.SolrIndexSearcher.close(SolrIndexSearcher.java:223)
>>at org.apache.solr.core.SolrCore$6.close(SolrCore.java:1246)
>>at org.apache.solr.util.RefCounted.decref(RefCounted.java:57)
>>at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1192)
>>at java.util.concurrent.FutureTask$Sync.innerRun(libgcj.so.90)
>>at java.util.concurrent.FutureTask.run(libgcj.so.90)
>>...3 more
>>
>> I get this error (after indexing a few records I get the above error and
>> indexing starts again; I get the same error after indexing a few hundred
>> records) when I try to index the MARC records on the server. It worked
>> fine on the local system.
>>
>> Thanks
>>
>>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: [PECL-DEV] Re: PHP Solr API

2010-10-01 Thread Israel Ekpo
Scott,

You can also use the SolrClient::setServlet() method with
SolrClient::TERMS_SERVLET_TYPE as the type

http://www.php.net/manual/en/solrclient.setservlet.php



On Fri, Oct 1, 2010 at 12:57 AM, Scott Yeadon wrote:

>  Hi,
>
> Sorry, scrap that, just found that SolrQuery is a subclass of
> ModifiableParams so can do this via "add" method and seems to work ok.
>
> Apologies for the noise.
>
> Scott.
>
>
> On 1/10/10 2:35 PM, Scott Yeadon wrote:
>
>>  Hi,
>>
>> Just wondering if there is a way of setting the "qt" parameter in the Solr
>> PHP API. I want to use the Term Vector Component but not sure this is
>> supported in the API?
>>
>> Thanks
>>
>> Scott.
>>
>
>
> --
> PECL development discussion Mailing List (http://pecl.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-10 Thread Israel Ekpo
Hi All,

I am currently working on a couple of bug fixes for the Solr PECL extension
that will be available in the next release 0.9.12 sometime this month.

http://pecl.php.net/package/solr

Documentation of the current API and features for the PECL extension is
available here

http://www.php.net/solr

A couple of users in the community were asking when the PHP extension will
be moving from beta to stable.

The API looks stable so far with no serious issues, and I am looking to
move it from *Beta* to *Stable* on November 20, 2010.

If you are using Solr via PHP and would like to see any new features in the
extension please feel free to send me a note.

I would like to incorporate those changes in 0.9.12 so that users can try
them out and send me some feedback before the release of version 1.0.

Thanks in advance for your response.

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-12 Thread Israel Ekpo
On Mon, Oct 11, 2010 at 3:33 AM, Lukas Kahwe Smith wrote:

>
> On 11.10.2010, at 07:03, Israel Ekpo wrote:
>
> > I am currently working on a couple of bug fixes for the Solr PECL
> extension
> > that will be available in the next release 0.9.12 sometime this month.
> >
> > http://pecl.php.net/package/solr
> >
> > Documentation of the current API and features for the PECL extension is
> > available here
> >
> > http://www.php.net/solr
> >
> > A couple of users in the community were asking when the PHP extension
> will
> > be moving from beta to stable.
> >
> > The API looks stable so far with no serious issues and I am looking to
> > moving it from *Beta* to *Stable *on November 20 2010
> >
> > If you are using Solr via PHP and would like to see any new features in
> the
> > extension please feel free to send me a note.
> >
> > I would like to incorporate those changes in 0.9.12 so that user can try
> > them out and send me some feedback before the release of version 1.0
> >
> > Thanks in advance for your response.
>
>
> we already had some emails about this.
> imho there are too many methods for specialized tasks, that its easy to get
> lost in the API, especially since not all of them have written documentation
> yet beyond the method signatures.
>
> also i do think that there should be methods for escaping and also
> tokenizing lucene queries to enable "validation" of the syntax used etc.
>
> see here for a use case and a user land implementation:
> http://pooteeweet.org/blog/1796
>
> regards,
> Lukas Kahwe Smith
> m...@pooteeweet.org
>
>
>
>
Thanks, Lukas, for your feedback.

Could you clarify the part about too many methods for specialized tasks? From
the feedback I have received so far, most users like the specialization and
a small fraction do not, so it might be a matter of preference. I
decided to add the specialized methods in the SolrQuery class because, at the
time, that was what most users wanted to see in the API. They cannot
be removed now.

As for the documentation, all of the methods are documented with at least a
brief heading or summary of what each is supposed to do.

http://php.net/solr

The user first needs to understand which query parameters they need to send
to Solr, and then they can use one of the SolrQuery methods for that purpose.
Additional information is available from the Solr tutorials and the wiki
itself. If one chooses not to use a specialized method, the get(), set(), and
add() methods allow you to pass the parameter values directly instead.

For escaping queries, we already have the following method

SolrUtils::escapeQueryChars
http://www.php.net/manual/en/solrutils.escapequerychars.php
http://www.php.net/manual/en/class.solrutils.php
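For readers curious about what such an escaping routine does, here is a language-neutral sketch analogous to SolrUtils::escapeQueryChars. The exact character set handled by the PECL extension is an assumption here, based on the special characters of Lucene query syntax:

```python
# Characters with special meaning in Lucene/Solr query syntax
# (assumed set; the PECL extension's exact list may differ slightly)
LUCENE_SPECIAL_CHARS = '\\+-!():^[]"{}~*?|&;/ '

def escape_query_chars(query: str) -> str:
    """Backslash-escape characters with special meaning in Lucene syntax."""
    return ''.join('\\' + ch if ch in LUCENE_SPECIAL_CHARS else ch
                   for ch in query)

print(escape_query_chars('AT&T (1+1)'))
```

Escaping like this lets user-supplied text be embedded safely in a query without being parsed as operators.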

As for the tokenization, it is not clear exactly what you were referring to.
I think it is best for the analysis of any of the tokens to be handled at
the server layer. There are tools in the admin interface for analyzing and
breaking down the query components into tokens.

I also took a look at your blog, but I could not immediately find the use
case you were referring to. A little more detail on this would be helpful.

Thanks Lukas for your input.

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-12 Thread Israel Ekpo
On Tue, Oct 12, 2010 at 7:42 AM, Peter Blokland  wrote:

> hi,
>
> On Mon, Oct 11, 2010 at 01:03:07AM -0400, Israel Ekpo wrote:
>
> > If you are using Solr via PHP and would like to see any new features in
> the
> > extension please feel free to send me a note.
>
> I'm currently testing a setup with Solr via PHP, and was wondering if
> support for the ExtractingRequestHandler is planned ? It may be that I
> missed something in the documentation, but for now it looks like I need
> to build my own POST's to the /solr/update/extract handler.
>
> --
> CUL8R, Peter.
>
> www.desk.nl --- Sent from my NetBSD-powered Talkie Toaster™
>

Peter,

That is an excellent idea.

I will add that to the wishlist.

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-12 Thread Israel Ekpo
On Tue, Oct 12, 2010 at 8:43 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Hi Isreal,
>
> On Mon, Oct 11, 2010 at 7:03 AM, Israel Ekpo  wrote:
>
> > If you are using Solr via PHP and would like to see any new features in
> the
> > extension please feel free to send me a note.
>
>
> we actually tried to grab some informations from solr's dataimport-page, but
> therefore we had to generate the complete url manually. which means, we have
> to access the solr-object to get hostname, port, etc and construct the
> needed url ourself.
>
> perhaps it's an idea to implement something like
> $solr->executeHttpRequest('GET', 'dataimport', array('command' => 'status'))
> which could easily reuse all given informations and also for example the
> existing proxy handling.
>
> Regards
> Stefan
>

Stefan,

I agree with you. Excellent idea.

I am currently working on a feature that will allow you to specify the
target path (URL) and then send any parameters or XML request to the
server.

I think this feature will take care of this.

What do you think?

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-12 Thread Israel Ekpo
On Tue, Oct 12, 2010 at 12:44 PM, Ken Stanley  wrote:

> >
> > > > If you are using Solr via PHP and would like to see any new features
> in
> > > the
> > > > extension please feel free to send me a note.
> >
>
> I'm new to this list, but in seeing this thread - and using PHP SOLR - I
> wanted to make a suggestion that - while minor - I think would greatly
> improve the quality of the extension.
>
> (I'm basing this mostly off of SolrQuery since that's where I've
> encountered
> the issue, but this might be true elsewhere)
>
> Whenever a method is supposed to return an array (i.e.,
> SolrQuery::getFields(), SolrQuery::getFacets(), etc), if there is no data
> to
> return, a null is returned. I think that this should be normalized across
> the board to return an empty array. First, the documentation is
> contradictory (http://us.php.net/manual/en/solrquery.getfields.php) in
> that
> the method signature says that it returns an array (not mixed), while the
> Return Values section says that it returns either an array or null.
> Secondly, returning an array under any circumstance provides more
> consistency and less logic; for example, let's say that I am looking for
> the
> fields (as-is in its current state):
>
> <?php
> // .. assume a proper set up
> if ($solrquery->getFields() !== null) {
>    foreach ($solrquery->getFields() as $field) {
>        // Do something
>    }
> }
> ?>
>
> This is a minor request, I know. But, I feel that it would go a long way
> toward polishing the extension up for general consumption.
>
> Thank you,
>
> Ken Stanley
>
> PS. I apologize if this request has come through the pipes already; as I've
> stated, I am new to this list; I have yet to find any reference to my
> request. :)
>


Great recommendation Ken.

Thanks for catching that! That should be a quick one.

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Term is duplicated when updating a document

2010-10-15 Thread Israel Ekpo
Which fields are modified when the document is updated/replaced?

Are there any differences in the content of the fields that you are using
for the AutoSuggest?

Have you changed your schema.xml file recently? If you have, then there may
have been changes in the way these fields are analyzed and broken down into
terms.

This may be a bug if you did not change the field or the schema file but the
term counts are still changing.

On Fri, Oct 15, 2010 at 9:14 AM, Thomas Kellerer  wrote:

> Hi,
>
> we are updating our documents (that represent products in our shop) when a
> dealer modifies them, by calling
> SolrServer.add(SolrInputDocument) with the updated document.
>
> My understanding is, that there is no other way of updating an existing
> document.
>
>
> However we also use a term query to autocomplete the search field for the
> user, but each time a document is updated (added) the term count is
> incremented. So after starting with a new index the count is e.g. 1, then
> the document (that contains that term) is updated, and the count is 2, the
> next update will set this to 3 and so on.
>
> Once the index is optimized (by calling SolrServer.optimize()) the count is
> correct again.
>
> Am I missing something or is this a bug in Solr/Lucene?
>
> Thanks in advance
> Thomas
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Commits on service after shutdown

2010-10-18 Thread Israel Ekpo
The documents should be implicitly committed when the Lucene index is
closed.

When you perform a graceful shutdown, the Lucene index gets closed and the
documents get committed implicitly.

When the shutdown is abrupt, as in a kill -9, this does not happen and
the updates are lost.

You can use the auto-commit parameter when sending your updates so that the
changes are saved right away, though this could slow down the indexing
speed considerably. However, I do not believe there are parameters to keep
those uncommitted documents "alive" after a kill.
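If you want the server to commit on its own instead of relying on per-request commits, automatic commits can be configured in solrconfig.xml; a sketch with illustrative threshold values:

```xml
<!-- solrconfig.xml sketch; maxDocs/maxTime values are illustrative -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>  <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

With this in place, the window of uncommitted documents that could be lost in an abrupt kill is bounded by those thresholds.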



On Mon, Oct 18, 2010 at 2:46 PM, Ezequiel Calderara wrote:

>  Hi, i'm new in the mailing list.
> I'm implementing Solr in my actual job, and i'm having some problems.
> I was testing the consistency of the "commits". I found for example that if
> we add X documents to the index (without commiting) and then we restart the
> service, the documents are commited. They show up in the results. This is
> interpreted to me like an error.
> But when we add X documents to the index (without commiting) and then we
> kill the process and we start it again, the documents doesn't appear. This
> behaviour is the one i want.
>
> Is there any param to avoid the auto-committing of documents after a
> shutdown?
> Is there any param to keep those un-commited documents "alive" after a
> kill?
>
> Thanks!
>
> --
> __
> Ezequiel.
>
> Http://www.ironicnet.com 
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Removing Common Web Page Header and Footer from All Content Fetched by Nutch

2010-10-18 Thread Israel Ekpo
Hi All,

I am indexing a web application with approximately 9500 distinct URLs and
their contents using Nutch and Solr.

I use Nutch to fetch the URLs and links, and to crawl the entire web
application to extract the content of all pages.

Then I run the solrindex command to send the content to Solr.

The problem that I have now is that the first 1000 or so characters of some
pages and the last 400 characters of the pages are showing up in the search
results.

These are contents of the common header and footer used in the site
respectively.

The only workaround that I have now is to index everything and then go
through each document one at a time, removing the first 1000 characters if
the Levenshtein distance between the first 1000 characters of the page and
the common header is less than a certain value. The same applies to the
footer content common to all pages.
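The workaround described above can be sketched as a post-processing step. This is a plain illustration, not Nutch configuration; the threshold value and function names are assumptions:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def strip_common_header(content: str, header: str, max_distance: int = 100) -> str:
    """Drop the leading len(header) characters when they are close enough
    to the known site-wide header; otherwise leave the content untouched."""
    prefix = content[:len(header)]
    if levenshtein(prefix, header) <= max_distance:
        return content[len(header):]
    return content

print(strip_common_header("HEADERactual page text", "HEADER", max_distance=2))
```

The same function applied to the reversed suffix handles the common footer. Computing the edit distance over 1000-character prefixes is O(n*m) per document, which is why doing it once per document after fetching is tolerable but doing it inside the indexing pipeline may not be.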

Is there a way to ignore certain "stop phrases", so to speak, in the Nutch
configuration, based on Levenshtein or Jaro-Winkler distance, so that the
parts of the fetched data that match these stop phrases will not be
parsed?

Any useful pointers would be highly appreciated.

Thanks in advance.


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Setting solr home directory in websphere

2010-10-18 Thread Israel Ekpo
You need to make sure that the following system property is one of the
values specified in the JAVA_OPTS environment variable:

-Dsolr.solr.home=path_to_solr_home
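For example, in a shell that sets up the server environment (the path below is a placeholder for your actual Solr home directory):

```shell
# Sketch: append the Solr home system property to JAVA_OPTS before the
# application server starts (the path is a placeholder)
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/opt/solr/home"
echo "$JAVA_OPTS"
```

In WebSphere specifically, the equivalent is usually adding the -D flag to the server's generic JVM arguments in the admin console rather than exporting a shell variable.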



On Mon, Oct 18, 2010 at 10:20 PM, Kevin Cunningham <
kcunning...@telligent.com> wrote:

> I've installed Solr a hundred times using Tomcat (on Windows) but now need
> to get it going with WebSphere (on Windows).  For whatever reason this seems
> to be black magic :)  I've installed the war file but have no idea how to
> set Solr home to let WebSphere know where the index and config files are.
>  Can someone enlighten me on how to do this please?
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Removing Common Web Page Header and Footer from All Content Fetched by Nutch

2010-10-19 Thread Israel Ekpo
Thanks Otis and Markus for your input.

I will check it out today.

On Tue, Oct 19, 2010 at 4:45 AM, Markus Jelsma
wrote:

> Unfortunately, Nutch still uses Tika 0.7 in 1.2 and trunk. Nutch needs to be
> upgraded to Tika 0.8 (when it's released or just the current trunk). Also,
> the Boilerpipe API needs to be exposed through Nutch configuration: which
> extractor can be used, which parameters need to be set, etc.
>
> Upgrading to Tika's trunk might be relatively easy but exposing Boilerpipe
> surely isn't.
>
> On Tuesday, October 19, 2010 06:47:43 am Otis Gospodnetic wrote:
> > Hi Israel,
> >
> > You can use this: http://search-lucene.com/?q=boilerpipe&fc_project=Tika
> > Not sure if it's built into Nutch, though...
> >
> > Otis
> > 
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
> >
> > - Original Message 
> >
> > > From: Israel Ekpo 
> > > To: solr-user@lucene.apache.org; u...@nutch.apache.org
> > > Sent: Mon, October 18, 2010 9:01:50 PM
> > > Subject: Removing Common Web Page Header and Footer from All Content
> > > Fetched by
> > >
> > >Nutch
> > >
> > > Hi All,
> > >
> > > I am indexing a web application with approximately 9500 distinct URL and
> > > contents using Nutch and Solr.
> > >
> > > I use Nutch to fetch the urls, links and the crawl the entire web
> > > application to extract all the content for all pages.
> > >
> > > Then I run the solrindex command to send the content to Solr.
> > >
> > > The problem that I have now is that the first 1000 or so characters of
> > > some pages and the last 400 characters of the pages are showing up in
> > > the search results.
> > >
> > > These are contents of the common header and footer used in the site
> > > respectively.
> > >
> > > The only work around that I have now is to index everything and then go
> > > through each document one at a time to remove the first 1000 characters
> > > if the levenshtein distance between the first 1000 characters of the
> > > page and the common header is less than a certain value. Same applies
> > > to the footer content common to all pages.
> > >
> > > Is there a way to ignore certain "stop phrase" so to speak in the Nutch
> > > configuration based on levenshtein distance or jaro winkler distance so
> > > that certain parts of the fetched data that matches this stop phrases
> > > will not be parsed?
> > >
> > > Any useful pointers would be highly appreciated.
> > >
> > > Thanks in advance.
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536600 / 06-50258350
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-19 Thread Israel Ekpo
Hi All,

Just wanted to post an update on where we stand with all the requests for
new features.


List of Features Requested In SOLR PECL Extension

1. Ability to Send Custom Requests to Custom URLS other than select, update,
terms etc.
2. Ability to add files (pdf, office documents etc)
3. Windows version of latest releases.
4. Ensuring that SolrQuery::getFields(), SolrQuery::getFacets() et al
returns an array consistently.
5. Lowering Libxml version to 2.6.16

If there is anything that you think I left out please let me know. This is a
summary.

On Wed, Oct 13, 2010 at 3:48 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> On Tue, Oct 12, 2010 at 6:29 PM, Israel Ekpo  wrote:
>
> > I think this feature will take care of this.
> >
> > What do you think?
>
>
> sounds good!
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Modelling Access Control

2010-10-23 Thread Israel Ekpo
Hi Paul,

Regardless of how you implement it, I would recommend you use filter queries
for the permissions check rather than making it part of the main query.

On Sat, Oct 23, 2010 at 4:03 AM, Paul Carey  wrote:

> Hi
>
> My domain model is made of users that have access to projects which
> are composed of items. I'm hoping to use Solr and would like to make
> sure that searches only return results for items that users have
> access to.
>
> I've looked over some of the older posts on this mailing list about
> access control and saw a suggestion along the lines of
> acl: AND (actual query).
>
> While this obviously works, there are a couple of niggles. Every item
> must have a list of valid user ids (typically less than 100 in my
> case). Every time a collaborator is added to or removed from a
> project, I need to update every item in that project. This will
> typically be fewer than 1000 items, so I guess is no big deal.
>
> I wondered if the following might be a reasonable alternative,
> assuming the number of projects to which a user has access is lower
> than a certain bound.
> (acl: OR acl: OR ... ) AND (actual query)
>
> When the numbers are small - e.g. each user has access to ~20 projects
> and each project has ~20 collaborators - is one approach preferable
> over another? And when outliers exist - e.g. a project with 2000
> collaborators, or a user with access to 2000 projects - is one
> approach more liable to fail than the other?
>
> Many thanks
>
> Paul
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Modelling Access Control

2010-10-23 Thread Israel Ekpo
Hi All,

I think using filter queries will be a good option to consider because of
the following reasons

* The filter query does not affect the score of the items in the result set.
If the ACL logic is part of the main query, it could influence the scores of
the items in the result set.

* Using a filter query could lead to better performance in complex queries
because the results from the query specified with fq are cached
independently from that of the main query. Since the result of a filter
query is cached, it will be used to filter the primary query result using
set intersection without having to fetch the ids of the documents from the
fq a second time.

I think this will be useful because we can assume that the ACL portion in
the fq is relatively constant, since the permissions for each user do not
change frequently.

http://wiki.apache.org/solr/FilterQueryGuidance
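As a concrete sketch, the permission check can be moved out of the main query and into fq; the host, handler, and the user id value below are placeholders, not taken from this thread (only the acl field name is):

```shell
# Hypothetical request putting the ACL check in a filter query (fq) so it
# is cached independently of the main query and does not affect scoring.
# The host, query, and user id are placeholders.
USER_ID="user42"
MAIN_QUERY="q=wireless+router"
ACL_FILTER="fq=acl:${USER_ID}"
URL="http://localhost:8983/solr/select?${MAIN_QUERY}&${ACL_FILTER}"
echo "$URL"
```

Because the fq result set is cached on its own, repeated searches by the same user reuse the cached ACL filter.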


On Sat, Oct 23, 2010 at 2:58 PM, Dennis Gearon wrote:

> why use filter queries?
>
> Wouldn't reducing the set headed into the filters by putting it in the main
> query be faster? (A question to learn, since I do NOT know :-)
>
> Dennis Gearon
>
> Signature Warning
> 
> It is always a good idea to learn from your own mistakes. It is usually a
> better idea to learn from others’ mistakes, so you do not have to make them
> yourself. from '
> http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
>
> EARTH has a Right To Life,
>  otherwise we all die.
>
>
> --- On Sat, 10/23/10, Israel Ekpo  wrote:
>
> > From: Israel Ekpo 
> > Subject: Re: Modelling Access Control
> > To: solr-user@lucene.apache.org
> > Date: Saturday, October 23, 2010, 7:01 AM
> > Hi Paul,
> >
> > Regardless of how you implement it, I would recommend you
> > use filter queries
> > for the permissions check rather than making it part of the
> > main query.
> >
> > On Sat, Oct 23, 2010 at 4:03 AM, Paul Carey 
> > wrote:
> >
> > > Hi
> > >
> > > My domain model is made of users that have access to
> > projects which
> > > are composed of items. I'm hoping to use Solr and
> > would like to make
> > > sure that searches only return results for items that
> > users have
> > > access to.
> > >
> > > I've looked over some of the older posts on this
> > mailing list about
> > > access control and saw a suggestion along the lines
> > of
> > > acl: AND (actual query).
> > >
> > > While this obviously works, there are a couple of
> > niggles. Every item
> > > must have a list of valid user ids (typically less
> > than 100 in my
> > > case). Every time a collaborator is added to or
> > removed from a
> > > project, I need to update every item in that project.
> > This will
> > > typically be fewer than 1000 items, so I guess is no
> > big deal.
> > >
> > > I wondered if the following might be a reasonable
> > alternative,
> > > assuming the number of projects to which a user has
> > access is lower
> > > than a certain bound.
> > > (acl: OR acl: OR
> > ... ) AND (actual query)
> > >
> > > When the numbers are small - e.g. each user has access
> > to ~20 projects
> > > and each project has ~20 collaborators - is one
> > approach preferable
> > > over another? And when outliers exist - e.g. a project
> > with 2000
> > > collaborators, or a user with access to 2000 projects
> > - is one
> > > approach more liable to fail than the other?
> > >
> > > Many thanks
> > >
> > > Paul
> > >
> >
> >
> >
> > --
> > °O°
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the
> > gift.
> > Quality First. Measure Twice. Cut Once.
> > http://www.israelekpo.com/
> >
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Modelling Access Control

2010-10-25 Thread Israel Ekpo
On Mon, Oct 25, 2010 at 8:16 AM, Paul Carey  wrote:

> Many thanks for all the responses. I now plan on benchmarking and
> validating both the filter query approach, and maintaining the ACL
> entirely outside of Solr. I'll decide from there.
>
> Paul
>


Great.

I am looking forward to some feedback on the benchmarks.
-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Highlighting for non-stored fields

2010-10-26 Thread Israel Ekpo
Check out this link

http://wiki.apache.org/solr/FieldOptionsByUseCase

You need to store the field if you want to use the highlighting feature.

If you need to retrieve and display the highlighted snippets then the field
definitely needs to be stored.

To use term offsets, it is a good idea to enable the following
attributes for that field: termVectors, termPositions, and termOffsets.

The only issue here is that your storage costs will increase because of
these extra features.

Nevertheless, you definitely need to store the field if you need to retrieve
it for highlighting purposes.
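A minimal schema.xml field definition reflecting this advice might look like the following sketch (the field and type names are placeholders):

```xml
<!-- Stored so snippets can be retrieved; the term vector attributes let
     the highlighter reuse indexed offsets instead of re-analyzing text.
     Field and type names are placeholders. -->
<field name="content" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```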

On Tue, Oct 26, 2010 at 6:50 AM, Phong Dais  wrote:

> Hi,
>
> I've been looking thru the mailing archive for the past week and I haven't
> found any useful info regarding this issue.
>
> My requirement is to index a few terabytes worth of data to be searched.
> Due to the size of the data, I would like to index without storing but I
> would like to use the highlighting feature.  Is this even possible?  What
> are my options?
>
> I've read about termOffsets, payload that could possibly be used to do this
> but I have no idea how this could be done.
>
> Any pointers greatly appreciated.  Someone please point me in the right
> direction.
>
>  I don't mind having to write some code or digging thru existing code to
> accomplish this task.
>
> Thanks,
> P.
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Documents are deleted when Solr is restarted

2010-10-26 Thread Israel Ekpo
The Solr home is set via the -Dsolr.solr.home Java system property.

Also make sure that -Dsolr.data.dir is defined for your data directory, if it
is not already set in the solrconfig.xml file.
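For reference, the data directory can also be pinned in solrconfig.xml, with the system property as an override; a sketch, where the fallback path is a placeholder:

```xml
<!-- Uses -Dsolr.data.dir when set; otherwise falls back to the path
     after the colon. The path is a placeholder. -->
<dataDir>${solr.data.dir:/var/solr/data}</dataDir>
```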

On Tue, Oct 26, 2010 at 10:46 AM, Upayavira  wrote:

> You need to watch what you are setting your solr.home to. That is where
> your indexes are being written. Are they getting overwritten/lost
> somehow. Watch the files in that dir while doing a restart.
>
> That's a start at least.
>
> Upayavira
>
> On Tue, 26 Oct 2010 16:40 +0300, "Mackram Raydan" 
> wrote:
> > Hey everyone,
> >
> > I apologize if this question is rudimentary but it is getting to me and
> > I did not find anything reasonable about it online.
> >
> > So basically I have a Solr 1.4.1 setup behind Tomcat 6. I used the
> > SolrTomcat wiki page to setup. The system works exactly the way I want
> > it (proper search, highlighting, etc...). The problem however is when I
> > restart my Tomcat server all the data in Solr (ie the index) is simply
> > lost. The admin shows me the number of docs is 0 when it was before in
> > the thousands.
> >
> > Can someone please help me understand why the above is happening and how
> > can I workaround it if possible?
> >
> > Big thanks for any help you can send my way.
> >
> > Regards,
> >
> > Mackram
> >
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Implementing Search Suggestion on Solr

2010-10-27 Thread Israel Ekpo
I think you may want to configure the field type used for the spell check to
use the synonyms file/database.

That way synonyms are also processed during index time.

This could help.
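A sketch of such a field type in schema.xml, assuming a synonyms.txt file in the conf directory (the type name and analyzer choices are placeholders):

```xml
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Expand synonyms at index time so the spellcheck dictionary
         built from this field contains them too. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```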

On Wed, Oct 27, 2010 at 6:47 AM, Antonio Calo'  wrote:

> Hi
>
> If I understood, you will build a kind of dictionary or ontology or
> thesauru and you will use it if Solr query results are few. At query time
> (before or after) you will perform a query on this dictionary in order to
> retrieve the suggested word.
>
> If you  need to do this, you can try to cvreate a custom request handler
> where you can controll the querying process in a simple manner (
> http://wiki.apache.org/solr/SolrRequestHandler).
>
> With the custom request handler, you can add custom code to check query
> results before submitting query to solr or analizing the query before
> sending result to client. I never coded one, but I think this is a good
> starting point.
>
> Hope this can help you
>
> Antonio
>
>
>
> Il 27/10/2010 11.03, Pablo Recio ha scritto:
>
>  Thanks, it's not what I'm looking for.
>>
>> Actually I need something like search "Ubuntu" and it will prompt "Maybe
>> you
>> will like 'Debian' too" or something like that. I'm not trying to do it
>> automatically, manually will be ok.
>>
>> Anyway, is good article you shared, maybe I will implement it, thanks!
>>
>> 2010/10/27 Jakub Godawa
>>
>>  I am a real rookie at solr, but try this:
>>> http://solr.pl/2010/10/18/solr-and-autocomplete-part-1/?lang=en
>>>
>>> 2010/10/27 Pablo Recio
>>>
>>>  Hi,

 I don't want to be annoying, but I'm looking for a way to do that.

 I repeat the question: is there a way to implement Search Suggestion
 manually?

 Thanks in advance.
 Regards,

 2010/10/18 Pablo Recio Quijano

  Hi!
>
> I'm trying to implement some kind of Search Suggestion on a search
>
 engine
>>>
 I

> have implemented. This search suggestions should not be automatically
>
 like

> the one described for the SpellCheckComponent [1]. I'm looking
>
 something
>>>
 like:
>
> "SAS oppositions" =>  "Public job offers for some-company"
>
> So I will have to define it manually. I was thinking about synonyms [2]
>
 but

> I don't know if it's the proper way to do it, because semantically
>
 those
>>>
 terms are not synonyms.
>
> Any ideas or suggestions?
>
> Regards,
>
> [1] http://wiki.apache.org/solr/SpellCheckComponent
> [2]
>
>
>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>>
>>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


ZendCon 2010 - Slides on Building Intelligent Search Applications with Apache Solr and PHP 5

2010-11-03 Thread Israel Ekpo
Due to popular demand, the link to my slides @ ZendCon are now available
here in case anyone else is looking for it.

http://slidesha.re/bAXNF3

The sample code will be uploaded shortly.

Feedback is also appreciated

http://joind.in/2261

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: solr init.d script

2010-11-09 Thread Israel Ekpo
I think it would be a better idea to load solr via a servlet container like
Tomcat and then create the init.d script for tomcat instead.

http://wiki.apache.org/solr/SolrTomcat#Installing_Tomcat_6

On Tue, Nov 9, 2010 at 2:47 AM, Eric Martin  wrote:

> Er, what flavor?
>
> RHEL / CentOS
>
> #!/bin/sh
>
> # Starts, stops, and restarts Apache Solr.
> #
> # chkconfig: 35 92 08
> # description: Starts and stops Apache Solr
>
> SOLR_DIR="/var/solr"
> JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=mustard -jar start.jar"
> LOG_FILE="/var/log/solr.log"
> JAVA="/usr/bin/java"
>
> case $1 in
>start)
>echo "Starting Solr"
>cd $SOLR_DIR
>$JAVA $JAVA_OPTIONS 2> $LOG_FILE &
>;;
>stop)
>echo "Stopping Solr"
>cd $SOLR_DIR
>$JAVA $JAVA_OPTIONS --stop
>;;
>restart)
>$0 stop
>sleep 1
>$0 start
>;;
>*)
>echo "Usage: $0 {start|stop|restart}" >&2
>exit 1
>;;
> esac
>
> 
>
>
> Debian
>
> http://xdeb.org/node/1213
>
> __
>
> Ubuntu
>
> STEPS
> Type in the following command in TERMINAL to install nano text editor.
> sudo apt-get install nano
> Type in the following command in TERMINAL to add a new script.
> sudo nano /etc/init.d/solr
> TERMINAL will display a new page title "GNU nano 2.0.x".
> Paste the below script in this TERMINAL window.
> #!/bin/sh -e
>
> # Starts, stops, and restarts solr
>
> SOLR_DIR="/apache-solr-1.4.0/example"
> JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=stopkey -jar start.jar"
> LOG_FILE="/var/log/solr.log"
> JAVA="/usr/bin/java"
>
> case $1 in
>start)
>echo "Starting Solr"
>cd $SOLR_DIR
>$JAVA $JAVA_OPTIONS 2> $LOG_FILE &
>;;
>stop)
>echo "Stopping Solr"
>cd $SOLR_DIR
>$JAVA $JAVA_OPTIONS --stop
>;;
>restart)
>$0 stop
>sleep 1
>$0 start
>;;
>*)
>echo "Usage: $0 {start|stop|restart}" >&2
>exit 1
>;;
> esac
> Note: In above script you might have to replace /apache-solr-1.4.0/example
> with appropriate directory name.
> Press CTRL-X keys.
> Type in Y
> When ask File Name to Write press ENTER key.
> You're now back to TERMINAL command line.
>
> Type in the following command in TERMINAL to create all the links to the
> script.
> sudo update-rc.d solr defaults
> Type in the following command in TERMINAL to make the script executable.
> sudo chmod a+rx /etc/init.d/solr
> To test. Reboot your Ubuntu Server.
> Wait until Ubuntu Server reboot is completed.
> Wait 2 minutes for Apache Solr to startup.
> Using your internet browser go to your website and try a Solr search.
>
>
>
> -Original Message-
> From: Nikola Garafolic [mailto:nikola.garafo...@srce.hr]
> Sent: Monday, November 08, 2010 11:42 PM
> To: solr-user@lucene.apache.org
> Subject: solr init.d script
>
> Hi,
>
> Does anyone have some kind of init.d script for solr, that can start,
> stop and check solr status?
>
> --
> Nikola Garafolic
> SRCE, Sveucilisni racunski centar
> tel: +385 1 6165 804
> email: nikola.garafo...@srce.hr
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: solr init.d script

2010-11-09 Thread Israel Ekpo
Yes.

I recommend running Solr via a servlet container.

It is much easier to manage compared to running it by itself.

On Tue, Nov 9, 2010 at 10:03 AM, Nikola Garafolic
wrote:

> I  have two nodes running one jboss server each and using one (single) solr
> instance, thats how I run it for now.
>
> Do you recommend running jboss with solr via servlet? Two jboss run in
> load-balancing for high availability purpose.
>
> For now it seems to be ok.
>
>
> On 11/09/2010 03:17 PM, Israel Ekpo wrote:
>
>> I think it would be a better idea to load solr via a servlet container
>> like
>> Tomcat and then create the init.d script for tomcat instead.
>>
>> http://wiki.apache.org/solr/SolrTomcat#Installing_Tomcat_6
>>
>>
> --
> Nikola Garafolic
> SRCE, Sveucilisni racunski centar
> tel: +385 1 6165 804
> email: nikola.garafo...@srce.hr
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: New PHP API for Solr (Logic Solr API)

2011-03-26 Thread Israel Ekpo
Lukas,

How do you think it should have been designed?

Most libraries are not going to have all the features that you need, and
while there may be features of the library that you do not like, others
may really appreciate them being there.

As I said in an earlier email a couple of months ago, the
SolrQuery::set(), get() and add() methods do exist for you to use if you
prefer not to use the feature-specific methods in the SolrQuery class; that's
the beauty of it.

The PECL extension was something I designed to use on a personal project and
it was really helpful in managing faceted search and other features that
solr has to offer. I decided to share it with the PHP community because I
felt others might need similar functionality. So it is possible that they
may have been use cases that applied to my project that may not be
applicable to yours

I initially used the SolrJ API to access Solr via Java and then when I had a
PHP project I decided to use something similar to SolrJ but at the time
there was nothing similar in the PHP realm

http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/package-summary.html

A review of the SolrJ API will offer more explanations on some of the
features present in the PECL API

I would really love to get feedback from others about the design of the PECL
library and about any other missing or extraneous features.

Thanks.

On Mon, Mar 7, 2011 at 4:04 AM, Lukas Kahwe Smith wrote:

>
> On 07.03.2011, at 09:43, Stefan Matheis wrote:
>
> > Burak,
> >
> > what's wrong with the existing PHP-Extension
> > (http://php.net/manual/en/book.solr.php)?
>
>
> the main issue i see with it is that the API isn't "designed" much. aka it
> just exposes lots of features with dedicated methods, but doesnt focus on
> keeping the API easy to overview (aka keep simple things simple and make
> complex stuff possible). at the same time fundamental stuff like quoting are
> not covered.
>
> that being said, i do not think we really need a proliferation of solr
> API's for PHP, even if this one is based on PHP 5.3 (namespaces etc). btw
> there is already another PHP 5.3 based API, though it tries to also unify
> other Lucene based API's as much as possible:
> https://github.com/dstendardi/Ariadne
>
> regards,
> Lukas Kahwe Smith
> m...@pooteeweet.org
>
>
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr Php Client

2011-04-07 Thread Israel Ekpo
Hi,

Could you send the entire list of parameters you are sending to Solr via the
SolrClient and SolrQuery objects?

Please open a bug report here with the details:

http://pecl.php.net/bugs/report.php?package=solr

On Thu, Apr 7, 2011 at 7:59 PM, Haspadar  wrote:

> Hello
> I updated Solr to version 3.1 on my project. And now when the application
> calls getResponse () method (PECL extension) I get the following:
> "Fatal error: Uncaught exception 'SolrException' with message 'Error
> un-serializing response' in /home/.../Adapter/Solr.php: 78"
>
> How can I fix it?
>
> Thanks
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Solr Php Client

2011-04-09 Thread Israel Ekpo
Cool.

I will take a look at the issue later tomorrow.

On Fri, Apr 8, 2011 at 2:28 AM, Haspadar  wrote:

> I'm entering only a query parameter.
> I posted a bug description there -
> http://pecl.php.net/bugs/bug.php?id=22634
>
>
> 2011/4/8 Israel Ekpo 
>
> > Hi,
> >
> > Could you send the enter list of parameters you are ending to solr via
> the
> > SolrClient and SolrQuery object?
> >
> > Please open a bug request here with the details
> >
> > http://pecl.php.net/bugs/report.php?package=solr
> >
> > On Thu, Apr 7, 2011 at 7:59 PM, Haspadar  wrote:
> >
> > > Hello
> > > I updated Solr to version 3.1 on my project. And now when the
> application
> > > calls getResponse () method (PECL extension) I get the following:
> > > "Fatal error: Uncaught exception 'SolrException' with message 'Error
> > > un-serializing response' in /home/.../Adapter/Solr.php: 78"
> > >
> > > How can I fix it?
> > >
> > > Thanks
> > >
> >
> >
> >
> > --
> > °O°
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the gift.
> > Quality First. Measure Twice. Cut Once.
> > http://www.israelekpo.com/
> >
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: PECL SOLR PHP extension, JSON output

2011-05-08 Thread Israel Ekpo
There are instructions here for Solr 1.4

https://issues.apache.org/jira/browse/SOLR-1967

I have not finished the version of the plugin that will allow you to use
phpnative in 3.1 yet

I will post them as soon as I can

I have not been working on the PECL extension for a while now but I am
planning to modify the source to include support for JSON response writer
soon.

Stay tuned.
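In the meantime, the JSON response writer built into Solr itself can be requested directly, independently of the extension; a sketch of the request URL (host, core path, and query are placeholders):

```shell
# wt=json asks Solr for a JSON response regardless of the client library.
# The host and query parameters below are placeholders.
URL="http://localhost:8983/solr/select?q=*:*&wt=json&indent=on"
echo "$URL"
```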

On Thu, Apr 21, 2011 at 9:47 AM, Ralf Kraus  wrote:

> Am 21.04.2011 13:58, schrieb roySolr:
>
>  I have tried that but it seems like JSON is not supported
>>
>> Parameters
>>
>> responseWriter
>>
>> One of the following :
>>
>> - xml
>>  - phpnative
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/PECL-SOLR-PHP-extension-JSON-output-tp2846092p2846728.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> And I can't get phpnative working with SOLR 3.1 :-(
>
> --
> Greets,
> Ralf Kraus
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: phpnative response writer in SOLR 3.1 ?

2011-05-08 Thread Israel Ekpo
Sorry for the late response.

I am working on an updated version for the latest release of Solr and Lucene

I will post my changes soon within the week.

Thank you for your patience.

On Fri, Apr 15, 2011 at 3:20 AM, Ralf Kraus  wrote:

> Am 14.04.2011 09:53, schrieb Ralf Kraus:
>
>  Hello,
>>
>> I just updatet to SOLR 3.1 and wondering if the phpnative response writer
>> plugin is part of it?
>> ( https://issues.apache.org/jira/browse/SOLR-1967 )
>>
>> When I try to compile the sources files I get some errors :
>>
>> PHPNativeResponseWriter.java:57:
>> org.apache.solr.request.PHPNativeResponseWriter is not abstract and does not
>> override abstract method
>> getContentType(org.apache.solr.request.SolrQueryRequest,org.apache.solr.response.SolrQueryResponse)
>> in org.apache.solr.response.QueryResponseWriter
>> public class PHPNativeResponseWriter implements QueryResponseWriter {
>>   ^
>> PHPNativeResponseWriter.java:70: method does not override a method from
>> its superclass
>>@Override
>> ^
>>
>> Is there a new JAR File or something I could use with SOLR 3.1? Because
>> the SOLR pecl Package only uses XML oder PHPNATIVE as response writer (
>> http://pecl.php.net/package/solr )
>>
>>
> No hints at all ?
>
> --
> Greetings,
> Ralf Kraus
>
>


-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


[ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-06-04 Thread Israel Ekpo
The new PECL package solr-1.0.1 (stable) has been released at
http://pecl.php.net/.

Release notes
-
- Added support for json response writer in SolrClient
- Removed final bit from classes so that they can be mocked in unit tests
- Changed from beta to stable
- Included phpdoc stubs in source to enable autocomplete of Solr classes and
methods in IDE during development
- Lowered libxml2 version requirement to 2.6.16

Package Info
-
It effectively simplifies the process of interacting with Apache Solr using
PHP5 and it already comes with built-in readiness for the latest features
added in Solr 3.1. The extension has features such as built-in, serializable
query string builder objects which effectively simplifies the manipulation
of name-value pair request parameters across repeated requests. The response
from the Solr server is also automatically parsed into native php objects
whose properties can be accessed as array keys or object properties without
any additional configuration on the client-side. Its advanced HTTP client
reuses the same connection across multiple requests and provides built-in
support for connecting to Solr servers secured behind HTTP Authentication or
HTTP proxy servers. It is also able to connect to SSL-enabled containers.
Please consult the documentation for more details on features. Included in
the source code are phpdoc stubs that enable autocomplete of Solr classes
and methods in IDE during development in userland.

Related Links
-
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-1.0.1.tgz

Authors
-
Israel Ekpo  (lead)

-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: [ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-06-11 Thread Israel Ekpo
It looks like you have to upgrade to PHP 5.3.x.

Unfortunately, it looks like that method signature was different in that
version of PHP.

I would have to make additional changes to support the earlier versions of
PHP

On Tue, Jun 7, 2011 at 9:05 AM, roySolr  wrote:

> Hello,
>
> I have some problems with the installation of the new PECL package
> solr-1.0.1.
>
> I run this command:
>
> pecl uninstall solr-beta ( to uninstall old version, 0.9.11)
> pecl install solr
>
> The installing is running but then it gives the following error message:
>
> /tmp/tmpKUExET/solr-1.0.1/solr_functions_helpers.c: In function
> 'solr_json_to_php_native':
> /tmp/tmpKUExET/solr-1.0.1/solr_functions_helpers.c:1123: error: too many
> arguments to function 'php_json_decode'
> make: *** [solr_functions_helpers.lo] Error 1
> ERROR: `make' failed
>
> I have php version 5.2.17.
>
> How can i fix this?
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/ANNOUNCEMENT-PHP-Solr-Extension-1-0-1-Stable-Has-Been-Released-tp3024040p3034350.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: [ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-06-23 Thread Israel Ekpo
I am working on that, I hope to have an answer within a month or so.

On Tue, Jun 21, 2011 at 9:51 AM, roySolr  wrote:

> Are you working on some changes to support earlier versions of PHP?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/ANNOUNCEMENT-PHP-Solr-Extension-1-0-1-Stable-Has-Been-Released-tp3024040p3090702.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
°O°
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Solr Wiki Page Not Responding. Is it down?

2009-08-28 Thread Israel Ekpo
Is the Solr wiki down?

I have tried to access the following URLs and none of them is loading.

http://wiki.apache.org/solr/QueryParametersIndex

http://wiki.apache.org/solr/SimpleFacetParameters


Re: Solr Wiki Page Not Responding. Is it down?

2009-08-28 Thread Israel Ekpo
Thanks Paul.

It was confirmed. I hope it will be back soon.

Here is the result from that page :  "It's not just you!
http://wiki.apache.org looks down from here."


On Fri, Aug 28, 2009 at 1:16 PM, Paul Tomblin  wrote:

> On Fri, Aug 28, 2009 at 1:12 PM, Israel Ekpo wrote:
> > Is the Solr wiki down?
> >
>
> There's a very useful web page for these questions:
>
> http://downforeveryoneorjustme.com/
>
> It confirms that yes, the wiki is down.  I'm currently using the
> Google cache to read the pages I need.
>
> --
> http://www.linkedin.com/in/paultomblin
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: TermsComponent

2009-09-05 Thread Israel Ekpo
Hi Todd,

I have not tried this yet.

But try setting the terms.raw parameter to true.

Maybe that will include the whitespace that is missing from the response.
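A sketch of such a TermsComponent request with terms.raw enabled (the host, field name, and prefix are placeholders):

```shell
# Hypothetical TermsComponent request; terms.fl, terms.prefix, and
# terms.raw are standard parameters, but the field "name", the prefix,
# and the host are placeholders.
URL="http://localhost:8983/solr/terms?terms.fl=name&terms.prefix=ap&terms.raw=true"
echo "$URL"
```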

On Fri, Sep 4, 2009 at 5:46 PM, Todd Benge  wrote:

> Hi,
>
> I was looking at TermsComponent in Solr 1.4 as a way of building a
> autocomplete function.  I have a prototype working but noticed that terms
> that have whitespace in them when indexed are absent the whitespace when
> returned from the TermsComponent.
>
> Any ideas on why that may be happening?  Am I just missing a configuration
> option?
>
> Thanks,
>
> Todd
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: date field

2009-09-08 Thread Israel Ekpo
Hi Gérard,

Concerning the issue with the ":" character, you can use the
ClientUtils.escapeQueryChars() method to handle special characters that are
part of the query syntax.

The complete list of special characters is in the source code.

check out the following resources

org/apache/solr/client/solrj/util/ClientUtils.java

http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Escaping%20Special%20Characters
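For clients outside Java, an escaping helper along the lines of ClientUtils.escapeQueryChars() can be sketched in Python; the character list below is assumed from the Lucene query-parser syntax page rather than copied from the Solr source:

```python
# Characters that are part of the Lucene query syntax and must be
# backslash-escaped when they appear inside user input.
# (List assumed from the Lucene query-parser syntax docs.)
LUCENE_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(value: str) -> str:
    """Backslash-escape Lucene query-syntax characters, e.g. the ':'
    and '-' in a date such as 2009-05-01T12:45:32Z."""
    out = []
    for ch in value:
        if ch in LUCENE_SPECIAL_CHARS or ch.isspace():
            out.append('\\')
        out.append(ch)
    return ''.join(out)

print(escape_query_chars('2009-05-01T12:45:32Z'))
# -> 2009\-05\-01T12\:45\:32Z
```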


2009/9/8 Gérard Dupont 

> Hi all,
>
> I'm currently facing a little difficulty to index and search on date field.
> The indexing is done in the right way (I guess) and I can find valid date
> in
> the field like "2009-05-01T12:45:32Z". However when I'm searching the user
> don't always give an exact date. for instance they give "2008-05-01" to get
> all documents related to that day.  I can do a trick using wildcard but is
> there another way to do it ? Moreover if they give the full date string (or
> if I hack the query parser) I can have the full syntax, but then the ":"
> annoy me because the Lucene parser does not allow it without quotes. Any
> ideas ?
>
> --
> Gérard Dupont
> Information Processing Control and Cognition (IPCC) - EADS DS
> http://weblab.forge.ow2.org
>
> Document & Learning team - LITIS Laboratory
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Spec Version vs Implementation Version

2009-09-11 Thread Israel Ekpo
What are the differences between the specification version and the
implementation version?

I downloaded the nightly build for September 05 2009 and it has a spec
version of 1.3 and the implementation version states 1.4-dev

What does that mean?


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Single Core or Multiple Core?

2009-09-14 Thread Israel Ekpo
I concur with Uri, but I would also add that it might be helpful to specify
a default core somewhere in the configuration file, so that if no core is
specified, the default one is implicitly selected.

I am not sure if this feature is available yet.

What do you think?

On Mon, Sep 14, 2009 at 10:46 AM, Uri Boness  wrote:

> Is it really a problem? I mean, as i see it, solr to cores is what RDBMS is
> to databases. When you connect to a database you also need to specify the
> database name.
>
> Cheers,
> Uri
>
>
> On Sep 14, 2009, at 16:27, Noble Paul നോബിള്‍  नोब्ळ् <
> noble.p...@corp.aol.com> wrote:
>
>  The problem is that, if we use multicore it forces you to use a core
>> name. this is inconvenient. We must get rid of this restriction before
>> we move single-core to multicore.
>>
>>
>>
>> On Sat, Sep 12, 2009 at 3:14 PM, Uri Boness  wrote:
>>
>>> +1
>>> Can you add a JIRA issue for that so we can vote for it?
>>>
>>> Chris Hostetter wrote:
>>>

 : > For the record: even if you're only going to have one SOlrCore,
 using
 the
 : > multicore support (ie: having a solr.xml file) might prove handy
 from
 a
 : > maintence standpoint ... the ability to configure new "on deck
 cores"
 with
   ...
 : Yeah, it is a shame that single-core deployments (no solr.xml) does
 not
 have
 : a way to enable CoreAdminHandler. This is something we should
 definitely
 : look at in Solr 1.5.

 I think the most straight forward starting point is to switch how we
 structure the examples so that all of the examples uses a solr.xml with
 multicore support.

 Then we can move forward on deprecating the specification of "Solr Home"
 using JNDI/systemvars and switch to having the location of the solr.xml
 be
 the one master config option with everything else coming after that.



 -Hoss




>>>
>>
>>
>> --
>> -
>> Noble Paul | Principal Engineer| AOL | http://aol.com
>>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: When to use Solr over Lucene

2009-09-16 Thread Israel Ekpo
Comparing Solr to Lucene is not exactly an apples-to-apples comparison.

Solr is a superset of Lucene. It uses the Lucene engine to index and process
requests for data retrieval.

Start here first:
http://lucene.apache.org/solr/features.html#Solr+Uses+the+Lucene+Search+Library+and+Extends+it

It would be unfair to compare the Apache web server to a CGI scripting
interface.

The Apache web server is just the container through which the web browser
interacts with the CGI scripts.

This is very similar to how Solr is related to Lucene.

On Wed, Sep 16, 2009 at 9:26 AM, balaji.a  wrote:

>
> Hi All,
>   I am aware that Solr internally uses Lucene for search and indexing. But
> it would be helpful if anybody explains about Solr features that is not
> provided by Lucene.
>
> Thanks,
> Balaji.
> --
> View this message in context:
> http://www.nabble.com/When-to-use-Solr-over-Lucene-tp25472354p25472354.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: When to use Solr over Lucene

2009-09-16 Thread Israel Ekpo
Also Solr simplifies the process of implementing the client side interface.
You can use the same indices with clients written in any programming
language.

The client side could be in virtually any programming language of your
choosing.

If you were to work directly with Lucene, that would not be the case.

On Wed, Sep 16, 2009 at 9:49 AM, Israel Ekpo  wrote:

> Comparing Solr to Lucene is not exactly an apples-to-apples comparison.
>
> Solr is a superset of Lucene. It uses the Lucene engine to index and
> process requests for data retrieval.
>
> Start here first : *
> http://lucene.apache.org/solr/features.html#Solr+Uses+the+Lucene+Search+Library+and+Extends+it
> !*
>
> It would be unfair to compare to the Apache webserver to a cgi scripting
> interface.
>
> The apache webserver is just the container through with the webrowser
> interacts with the CGI scripts.
>
> This is very similar to how Solr is related to Lucene.
>
>
> On Wed, Sep 16, 2009 at 9:26 AM, balaji.a  wrote:
>
>>
>> Hi All,
>>   I am aware that Solr internally uses Lucene for search and indexing. But
>> it would be helpful if anybody explains about Solr features that is not
>> provided by Lucene.
>>
>> Thanks,
>> Balaji.
>> --
>> View this message in context:
>> http://www.nabble.com/When-to-use-Solr-over-Lucene-tp25472354p25472354.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


XSD for Solr Response Format Version 2.2

2009-09-21 Thread Israel Ekpo
I am working on an XSD document for all the types in the response xml
version 2.2

Do you think there is a need for this?

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Limit number of docs that can be indexed (security)

2009-09-21 Thread Israel Ekpo
Valdir,

I think you are making it more complicated than it needs to be.

As the administrator, if you don't want them to modify the contents of the
solrconfig.xml file then you should not give them access to do so.

If they already have access to change the contents of the file, you can
revoke such privileges.

That should do it. The users should only work on the client side (adding
documents, sending queries).

On Mon, Sep 21, 2009 at 6:14 PM, Valdir Salgueiro wrote:

> Hello,
>
> I need a way to limit the number of documents that can be indexed on my
> solr-based application. Here is what I have come up with: create a *
> UpdateRequestProcessor* and register it on *solrconfig.xml*. When the user
> tries to add a document, check if the docs limit has been reached. The
> problem is, the user can modify solrconfig.xml and remove the *
> UpdateRequestProcessor* so he can index as much as he wants.
>
> Any ideas how to implement such restriction in a "safer" manner?
>
> Thanks in advance,
> Valdir
>
> PS: Of course, I also need to make sure the user cannot modify how many
> files he can index, but I think some encription on the properties file
> which
> holds that information will do for now.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Quotes in query string cause NullPointerException

2009-10-01 Thread Israel Ekpo
Don't be too hard on yourself.

Sometimes, mistakes like that can happen even to the most brilliant and most
experienced.

On Thu, Oct 1, 2009 at 2:15 PM, Andrew Clegg  wrote:

>
> Sorry! I'm officially a complete idiot.
>
> Personally I'd try to catch things like that and rethrow a
> 'QueryParseException' or something -- but don't feel under any obligation
> to
> listen to me because, well, I'm an idiot.
>
> Thanks :-)
>
> Andrew.
>
>
> Erik Hatcher-4 wrote:
> >
> > don't forget q=...  :)
> >
> >   Erik
> >
> > On Oct 1, 2009, at 9:49 AM, Andrew Clegg wrote:
> >
> >>
> >> Hi folks,
> >>
> >> I'm using the 2009-09-30 build, and any single or double quotes in
> >> the query
> >> string cause an NPE. Is this normal behaviour? I never tried it with
> >> my
> >> previous installation.
> >>
> >> Example:
> >>
> >> http://myserver:8080/solr/select/?title:%22Creatine+kinase%22
> >>
> >> (I've also tried without the URL encoding, no difference)
> >>
> >> Response:
> >>
> >> HTTP Status 500 - null java.lang.NullPointerException at
> >> java.io.StringReader.(StringReader.java:33) at
> >> org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:
> >> 173) at
> >> org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:
> >> 78) at
> >> org.apache.solr.search.QParser.getQuery(QParser.java:131) at
> >> org
> >> .apache
> >> .solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
> >> at
> >> org
> >> .apache
> >> .solr
> >> .handler
> >> .component.SearchHandler.handleRequestBody(SearchHandler.java:174)
> >> at
> >> org
> >> .apache
> >> .solr
> >> .handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at
> >> org
> >> .apache
> >> .solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> >> at
> >> org
> >> .apache
> >> .solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> >> at
> >> org
> >> .apache
> >> .catalina
> >> .core
> >> .ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:
> >> 235)
> >> at
> >> org
> >> .apache
> >> .catalina
> >> .core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >> at
> >> org
> >> .apache
> >> .catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:
> >> 233)
> >> at
> >> org
> >> .apache
> >> .catalina.core.StandardContextValve.invoke(StandardContextValve.java:
> >> 175)
> >> at
> >> org
> >> .apache
> >> .catalina.valves.RequestFilterValve.process(RequestFilterValve.java:
> >> 269)
> >> at
> >> org
> >> .apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:
> >> 81)
> >> at
> >> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:
> >> 568)
> >> at
> >> org
> >> .apache
> >> .catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> >> at
> >> org
> >> .jstripe
> >> .tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
> >> at
> >> org
> >> .jstripe
> >> .tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
> >> at
> >> org
> >> .jstripe
> >> .tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
> >> at
> >> org
> >> .jstripe
> >> .tomcat.probe.Tomcat55AgentValve.invoke(Tomcat55AgentValve.java:20)
> >> at
> >> org
> >> .apache
> >> .catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >> at
> >> org
> >> .apache
> >> .catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:
> >> 109)
> >> at
> >> org
> >> .apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:
> >> 286)
> >> at
> >> org
> >> .apache.coyote.http11.Http11Processor.process(Http11Processor.java:
> >> 844)
> >> at
> >> org.apache.coyote.http11.Http11Protocol
> >> $Http11ConnectionHandler.process(Http11Protocol.java:583)
> >> at org.apache.tomcat.util.net.JIoEndpoint
> >> $Worker.run(JIoEndpoint.java:447)
> >> at java.lang.Thread.run(Thread.java:619)
> >>
> >> Single quotes have the same effect.
> >>
> >> Is there another way to specify exact phrases?
> >>
> >> Thanks,
> >>
> >> Andrew.
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/Quotes-in-query-string-cause-NullPointerException-tp25702207p25702207.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Quotes-in-query-string-cause-NullPointerException-tp25702207p25704050.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Announcing the Apache Solr extension in PHP - 0.9.0

2009-10-04 Thread Israel Ekpo
Fellow Apache Solr users,

I have been working on a PHP extension for Apache Solr in C for quite
sometime now.

I just finished testing it and I have completed the initial user level
documentation of the API

Version 0.9.0-beta has just been released.

It already has built-in readiness for Solr 1.4

If you are using Solr 1.3 or later in PHP, I would appreciate it if you could
check it out and give me some feedback.

It is very easy to install on UNIX systems. I am still working on the build
for windows. It should be available for Windows soon.

http://solr.israelekpo.com/manual/en/solr.installation.php

A quick list of some of the features of the API include :
- Built in serialization of Solr Parameter objects.
- Reuse of HTTP connections across repeated requests.
- Ability to obtain input documents for possible resubmission from query
responses.
- Simplified interface to access server response data (SolrObject)
- Ability to connect to Solr server instances secured behind HTTP
Authentication and proxy servers

The following components are also supported
- Facets
- MoreLikeThis
- TermsComponent
- Stats
- Highlighting

Solr PECL Extension Homepage
http://pecl.php.net/package/solr

Some examples are available here
http://solr.israelekpo.com/manual/en/solr.examples.php

Interim Documentation Page until refresh of official PHP documentation
http://solr.israelekpo.com/manual/en/book.solr.php

The C source is available here
http://svn.php.net/viewvc/pecl/solr/

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Solr 1.4 Release Party

2009-10-10 Thread Israel Ekpo
I can't wait...

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Solr 1.4 Release Party

2009-10-12 Thread Israel Ekpo
It is my email signature.

It is a sort of hybrid/mashup from different sources.

On Mon, Oct 12, 2009 at 6:49 PM, Michael Masters  wrote:

> Where does the quote come from :)
>
> On Sat, Oct 10, 2009 at 6:38 AM, Israel Ekpo  wrote:
> > I can't wait...
> >
> > --
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the gift.
> > Quality First. Measure Twice. Cut Once.
> >
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Version 0.9.3 of the PECL extension for solr has just been released

2009-10-19 Thread Israel Ekpo
Version 0.9.3 of the PECL extension for solr has just been released.

Some of the methods have been updated and more get* methods have been added
to the Query builder classes.

The user level documentation was also updated to make the installation
instructions a lot clearer.

The latest documentation and source code are available from the project home
page

http://pecl.php.net/package/solr

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Indexing multiple entities

2009-10-29 Thread Israel Ekpo
On Thu, Oct 29, 2009 at 3:31 PM, Christian López Espínola <
penyask...@gmail.com> wrote:

> Hi, my name is Christian and I'm a newbie introducing to solr (and solrj).
>
> I'm working on a website where I want to index multiple entities, like
> Book or Magazine.
> The issue I'm facing is both of them have an attribute ID, which I
> want to use as the uniqueKey on my schema, so I cannot identify
> uniquely a document (because ID is saved in a database too, and it's
> autonumeric).
>
> I'm sure that this is a common pattern, but I don't find the way of solving
> it.
>
> How do you usually solve this? Thanks in advance.
>
>
> --
> Cheers,
>
> Christian López Espínola 
>

Hi Christian,

It looks like you are bringing in data to Solr from a database where there
are two separate tables.

One for *Books* and another one for *Magazines*.

If this is the case, you could define the uniqueKey field in your Solr schema
as a "string" instead of an integer. You can then still load documents
from both the books and magazines database tables, prefixing the
uniqueKey value with "B" for books and "M" for magazines

Like so :

<field name="id" type="string" indexed="true" stored="true"/>

<uniqueKey>id</uniqueKey>

Then when loading the books or magazines into Solr you can create the
documents with id fields like this

<add>
  <doc>
    <field name="id">B14000</field>
  </doc>
  <doc>
    <field name="id">M14000</field>
  </doc>
  <doc>
    <field name="id">B14001</field>
  </doc>
  <doc>
    <field name="id">M14001</field>
  </doc>
</add>
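In a loading script, the same prefixing convention can be sketched like this; the helper names are hypothetical, not part of Solr or any client library:

```python
# Map each entity type to a single-letter prefix, mirroring the
# "B"/"M" convention described above.
PREFIXES = {'book': 'B', 'magazine': 'M'}

def solr_doc_id(entity_type: str, db_id: int) -> str:
    """Build a collision-free uniqueKey value from a table name
    and its auto-increment id."""
    return f"{PREFIXES[entity_type]}{db_id}"

def split_doc_id(doc_id: str) -> tuple[str, int]:
    """Recover the entity type and database id from a Solr id."""
    reverse = {v: k for k, v in PREFIXES.items()}
    return reverse[doc_id[0]], int(doc_id[1:])

assert solr_doc_id('book', 14000) == 'B14000'
assert split_doc_id('M14001') == ('magazine', 14001)
```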

I hope this helps
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: adding and updating a lot of document to Solr, metadata extraction etc

2009-10-30 Thread Israel Ekpo
On Fri, Oct 30, 2009 at 11:23 AM, Eugene Dzhurinsky wrote:

> Hi there!
>
> We are trying to evaluate Apache Solr for our custom search implementation,
> which
> includes the following requirements:
>
> - ability to add/update/delete a lot of documents at once
>
> - ability to iterate over all documents, returned in search, as Lucene does
>  provide within a HitCollector instance. We would need to extract and
>  aggregate various fields, stored in index, to group results and aggregate
> them
>  in some way.
>
> After reading the tutorial I've realized that adding and removal of
> documents
> is performed through passing an XML file to controller in POST request.
> However our XML files may be very, very large - so I hope there is some
> another option to avoid interaction through HTTP protocol.
>
> Also I did not find any way in the tutorial to access the search results
> with
> all fields to be processed by our application.
>
> I think I simply did not read the documentation well or missed some point,
> so
> can somebody please point me to the articles, which may explain basics of
> how
> to achieve my goals?
>
> Thank you very much in advance!
>
> --
> Eugene N Dzhurinsky
>

Hi Eugene

Solr has an embedded version but you are encouraged to use the standard web
service interfaces.

Also, the recently released Solr 1.4 white paper covers the
StreamingUpdateSolrServer, which according to the white paper can index
documents at lightning speed, up to 25K documents per second.

The white paper can be downloaded here

http://www.lucidimagination.com/whitepaper/whats-new-in-solr-1-4

Info about Streaming Update Solr Server is available here

http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html

If you are still interested in the Embedded version to avoid the HTTP
version you can check out the following links

http://wiki.apache.org/solr/EmbeddedSolr

http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.html

I hope this helps.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: tracking solr response time

2009-11-02 Thread Israel Ekpo
On Mon, Nov 2, 2009 at 8:41 AM, Yonik Seeley wrote:

> On Mon, Nov 2, 2009 at 8:13 AM, bharath venkatesh
>  wrote:
> >We are using solr for many of ur products  it is doing quite well
> > .  But since no of hits are becoming high we are experiencing latency
> > in certain requests ,about 15% of our requests are suffering a latency
>
> How much of a latency compared to normal, and what version of Solr are
> you using?
>
> >  . We are trying to identify  the problem .  It may be due to  network
> > issue or solr server is taking time to process the request  .   other
> > than  qtime which is returned along with the response is there any
> > other way to track solr servers performance ?
> > how is qtime calculated
> > , is it the total time from when solr server got the request till it
> > gave the response ?
>
> QTime is the time spent in generating the in-memory representation for
> the response before the response writer starts streaming it back in
> whatever format was requested.  The stored fields of returned
> documents are also loaded at this point (to enable handling of huge
> response lists w/o storing all in memory).
>
> There are normally servlet container logs that can be configured to
> spit out the real total request time.
>
> > can we do some extra logging to track solr servers
> > performance . ideally I would want to pass some log id along with the
> > request (query ) to  solr server  and solr server must log the
> > response time along with that log id .
>
> Yep - Solr isn't bothered by params it doesn't know about, so just put
> logid=xxx and it should also be logged with the other request
> params.
>
> -Yonik
> http://www.lucidimagination.com
>



If you are not using Java then you may have to track the elapsed time
manually.

If you are using the SolrJ Java client you may have the following options:

There is a method called getElapsedTime() in
org.apache.solr.client.solrj.response.SolrResponseBase which is available to
all the subclasses

I have not used it personally but I think this should return the time spent
on the client side for that request.

The QTime is not the time on the client side but the time spent internally
at the Solr server to process the request.

http://lucene.apache.org/solr//api/solrj/org/apache/solr/client/solrj/response/SolrResponseBase.html

http://lucene.apache.org/solr//api/solrj/org/apache/solr/client/solrj/response/QueryResponse.html
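For a non-Java client, separating network overhead from server-side time can be sketched in Python; QTime is read from the responseHeader of a JSON response, and the `do_request` callable is a stand-in for whatever HTTP client you actually use:

```python
import time

def measure_overhead(do_request):
    """Time a Solr request on the client and compare it with the
    server-reported QTime. `do_request` is any callable returning
    the decoded JSON response; it is a placeholder for your own
    HTTP call."""
    start = time.monotonic()
    response = do_request()
    elapsed_ms = (time.monotonic() - start) * 1000.0
    qtime_ms = response['responseHeader']['QTime']
    # Anything much larger than QTime was spent outside Solr's
    # query processing: network, serialization, container queueing.
    return elapsed_ms, qtime_ms, elapsed_ms - qtime_ms

# Simulated response, for illustration only:
fake = lambda: {'responseHeader': {'QTime': 12}}
elapsed, qtime, overhead = measure_overhead(fake)
```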

Most likely it could be as a result of an internal network issue between the
two servers or the Solr server is competing with other applications for
resources.

What operating system is the Solr server running on? Is your client
application connecting to a Solr server on the same network or over the
internet? Are there other applications, like database servers, running on
the same machine? If so, the DB server (or any other application) and
the Solr server could be competing for resources like CPU, memory, etc.

If you are using Tomcat, you can take a look in
$CATALINA_HOME/logs/catalina.out, there are timestamps there that can also
guide you.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: tracking solr response time

2009-11-02 Thread Israel Ekpo
On Mon, Nov 2, 2009 at 9:52 AM, bharath venkatesh <
bharathv6.proj...@gmail.com> wrote:

> Thanks for the quick response
> @yonik
>
> >How much of a latency compared to normal, and what version of Solr are
> you using?
>
> latency is usually around 2-4 secs (some times it goes more than that
> )  which happens  to  only 15-20%  of the request  other  80-85% of
> request are very fast it is in  milli secs ( around 200,000 requests
> happens every day )
>
> @Israel  we are not using java client ..  we  r using  python at the
> client with response formatted in json
>
> @yonikn @Israel   does qtime measure the total time taken at the solr
> server ? I am already measuring the time to get the response  at
> client  end . I would want  a means to know how much time the solr
> server is taking to respond (process ) once it gets the request  . so
> that I could identify whether it is a solr server issue or internal
> network issue
>

It is the time spent at the Solr server.

I think Yonik already answered this part in his response to your thread :

This is what he said :

QTime is the time spent in generating the in-memory representation for
the response before the response writer starts streaming it back in
whatever format was requested.  The stored fields of returned
documents are also loaded at this point (to enable handling of huge
response lists w/o storing all in memory).


>
> @Israel  we are using rhel server  5 on both client and server .. we
> have 6 solr sever . one is acting as master . both client and solr
> sever are on the same network . those servers are dedicated solr
> server except 2 severs which have DB and memcahce running .. we have
> adjusted the load accordingly
>
>
>
>
>
>
>
> On 11/2/09, Israel Ekpo  wrote:
> > On Mon, Nov 2, 2009 at 8:41 AM, Yonik Seeley
> > wrote:
> >
> >> On Mon, Nov 2, 2009 at 8:13 AM, bharath venkatesh
> >>  wrote:
> >> >We are using solr for many of ur products  it is doing quite well
> >> > .  But since no of hits are becoming high we are experiencing latency
> >> > in certain requests ,about 15% of our requests are suffering a latency
> >>
> >> How much of a latency compared to normal, and what version of Solr are
> >> you using?
> >>
> >> >  . We are trying to identify  the problem .  It may be due to  network
> >> > issue or solr server is taking time to process the request  .   other
> >> > than  qtime which is returned along with the response is there any
> >> > other way to track solr servers performance ?
> >> > how is qtime calculated
> >> > , is it the total time from when solr server got the request till it
> >> > gave the response ?
> >>
> >> QTime is the time spent in generating the in-memory representation for
> >> the response before the response writer starts streaming it back in
> >> whatever format was requested.  The stored fields of returned
> >> documents are also loaded at this point (to enable handling of huge
> >> response lists w/o storing all in memory).
> >>
> >> There are normally servlet container logs that can be configured to
> >> spit out the real total request time.
> >>
> >> > can we do some extra logging to track solr servers
> >> > performance . ideally I would want to pass some log id along with the
> >> > request (query ) to  solr server  and solr server must log the
> >> > response time along with that log id .
> >>
> >> Yep - Solr isn't bothered by params it doesn't know about, so just put
> >> logid=xxx and it should also be logged with the other request
> >> params.
> >>
> >> -Yonik
> >> http://www.lucidimagination.com
> >>
> >
> >
> >
> > If you are not using Java then you may have to track the elapsed time
> > manually.
> >
> > If you are using the SolrJ Java client you may have the following
> options:
> >
> > There is a method called getElapsedTime() in
> > org.apache.solr.client.solrj.response.SolrResponseBase which is available
> to
> > all the subclasses
> >
> > I have not used it personally but I think this should return the time
> spent
> > on the client side for that request.
> >
> > The QTime is not the time on the client side but the time spent
> internally
> > at the Solr server to process the request.
> >
> >
> http://lucene.apache.org/solr//api/solrj/org/apache/solr/client/solrj/response/SolrResponseBase.html
> >
> >
> http://lucene.apach

Re: How to integrate Solr into my project

2009-11-03 Thread Israel Ekpo
2009/11/3 Licinio Fernández Maurelo 

> Hi Caroline,
>
> i think that you must take an overview tour ;-) , solrj is just a solr java
> client ...
>
> Some clues:
>
>
>   - Define your own index schema
> (it's just like a SQL DDL) .
>   - There are different ways to put docs in your index:
>  - SolrJ (Solr client for java env)
>  - DIH  (Data Import
>  Handler) this one is prefered when doing a huge data import from
> DB's, many
>  source formats are supported.
>   - Try to perform queries over your fancy-new index ;-). Learn about
>   searching syntax and
> faceting
>   .
>
>
>
>
>
>
> 2009/11/3 Caroline Tan 
>
> > Ya, it's a Java projecti just browse this site you suggested...
> > http://wiki.apache.org/solr/Solrj
> >
> > Which means, i declared the dependancy to solr-solrj and solr-core jars,
> > have those jars added to my project lib and by following the Solrj
> > tutorial,
> > i will be able to even index a DB table into Solr as well? thanks
> >
> > ~caroLine
> >
> >
> > 2009/11/3 Noble Paul നോബിള്‍ नोब्ळ् 
> >
> > > is it a java project ?
> > > did you see this page http://wiki.apache.org/solr/Solrj ?
> > >
> > > On Tue, Nov 3, 2009 at 2:25 PM, Caroline Tan 
> > > wrote:
> > > > Hi,
> > > > I wish to intergrate Solr into my current working project. I've
> played
> > > > around the Solr example and get it started in my tomcat. But the next
> > > step
> > > > is HOW do i integrate that into my working project? You see, Lucence
> > > > provides API and tutorial on what class i need to instanstiate in
> order
> > > to
> > > > index and search. But Solr seems to be pretty vague on this..as it is
> a
> > > > working solr search server. Can anybody help me by stating the steps
> by
> > > > steps, what classes that i should look into in order to assimiliate
> > Solr
> > > > into my project?
> > > > Thanks.
> > > >
> > > > regards
> > > > ~caroLine
> > > >
> > >
> > >
> > >
> > > --
> > > -
> > > Noble Paul | Principal Engineer| AOL | http://aol.com
> > >
> >
>
>
>
> --
> Lici
>


I would also recommend buying the Solr 1.4 Enterprise Search Server.

It will give you some tips

http://www.amazon.com/Solr-1-4-Enterprise-Search-Server/dp/1847195881/ref=sr_1_1?ie=UTF8&s=books&qid=1257247932&sr=1-1
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Free live video streaming of ApacheCon US 2009

2009-11-04 Thread Israel Ekpo
Thanks a lot.

This will be very helpful to me.

As I am not able to attend.

On Wed, Nov 4, 2009 at 8:25 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> Team,
>
> For those Lucene fanatics not in Oakland this week for ApacheCon US,
> don't miss the FREE live video streaming, starting today:
>
>  http://streaming.linux-magazin.de/en/program-apachecon-us-2009.htm
>
> Note that there are many talks available, covering Apache Hadoop,
> Apache HTTPD, Lucene, as well as the Apache Pioneer's Panel and
> keynote presentations.
>
> Lucene's track is this Friday (NOTE these times are UTC -- use
> http://www.timeanddate.com to map to your time zone):
>
>  17:00 Implementing an Information Retrieval Framework for an
>   Organizational Repository, Sithu D Sudarsan
>
>  18:00 Apache Mahout - Going from raw data to information
>   Isabel Drost
>
>  19:15 MIME Magic with Apache Tika
>   Jukka Zitting
>
>  20:15 Keynote: How Open Source Developers Can (Still!) Save The World
>   Brian Behlendorf
>
>  22:00 Building Intelligent Search Applications with the Lucene
>   Ecosystem, Ted Dunning
>
>  23:00 Realtime Search
>   Jason Rutherglen
>
> Happy viewing,
>
> Mike
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: how to use ajax-solr - example?

2009-11-04 Thread Israel Ekpo
On Wed, Nov 4, 2009 at 10:48 AM, Joel Nylund  wrote:

> Hi, I looked at the documentation and I have no idea how to get started?
> Can someone point me to or show me an example of how to send a query to a
> solr server and paginate through the results using ajax-solr.
>
> I would glady write a blog tutorial on how to do this if someone can get me
> started.
>
> I dont know jquery but have used prototype & scriptaculous.
>
> thanks
> Joel
>
>

Joel,

It would be best to use a scripting language between Solr and JavaScript.

This is because sending data directly between JavaScript and Solr limits you
to a single domain (the browser's same-origin policy).

However, if you place a scripting language between JavaScript and Solr, you
can use it to retrieve the request parameters from JavaScript and then send
them on to Solr with the response writer set to json.

This will cause Solr to return the response in JSON format, which the
scripting language can pass back to JavaScript.

This example will cause Solr to return the response in JSON:

http://example.com:8443/solr/select?q=searchkeyword&wt=json
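As a sketch of the scripting-language side, assuming PHP as the middle layer: the script would fetch that URL (for example with file_get_contents()), decode the JSON, and hand a trimmed result back to the page. The response body is hard-coded below so the sketch is self-contained; the field names are placeholders.

```php
<?php
// Minimal sketch of the PHP middle layer, with the Solr wt=json response
// hard-coded for illustration. In a real proxy this string would come from
// something like:
// file_get_contents('http://example.com:8443/solr/select?q=...&wt=json')
$rawResponse = '{"responseHeader":{"status":0,"QTime":2},'
             . '"response":{"numFound":2,"start":0,"docs":['
             . '{"id":"1","title":"First match"},'
             . '{"id":"2","title":"Second match"}]}}';

// Decode into associative arrays and pull out what the page needs.
$data     = json_decode($rawResponse, true);
$numFound = $data['response']['numFound'];
$titles   = array();

foreach ($data['response']['docs'] as $doc) {
    $titles[] = $doc['title'];
}

// Hand a trimmed-down JSON result back to the browser.
echo json_encode(array('total' => $numFound, 'titles' => $titles));
```

Because the browser then calls this script on your own domain, the same-origin restriction no longer applies.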


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Representing a complex schema in solr

2009-11-07 Thread Israel Ekpo
On Sat, Nov 7, 2009 at 11:37 PM, Rakhi Khatwani  wrote:

> Hi,
>
> i have a complex schema as shown below:
>
> Book
>-  Title
>-  Category
>-  Publication
>-  Edition
>-  Publish Date
>-  Author (multivalued)   => Author is a multivalued field containing
> the following attributes.
>-  Name
>-  Age
>-  Location
>-  Gender
>- Qualification
>
>
> i wanna store the above information in solr so that i can query in every
> aspect
>
> one small query example would be:
> 1. search for all the books written by females.
> 2. search for all books writen by young authors...for example between the
> age 22 to 30.
>
> i woudn't wanna use RDBMS coz i have more than one million documents like
> this.
>
> i also tried saving the author as a JSON string. but then i cannot use wild
> card and range queries on it.
>
> any suggessions how wud i represent something like this in solr??
>
> Regards,
> Raakhi
>


Hi Rakhi,

I think you should do this to simplify your storage and retrieval process:

Instead of having one multi-valued "author" field, store each attribute as a
separate multi-valued field.

So name, age, location, gender and qualification will be separate fields in
the schema.

This will allow you to query the way you are asking:

q=gender:female

OR by age

q=age:[22 TO 30]

Use tint (solr.TrieIntField) for the age field (if you are using Solr 1.4)
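As an illustrative sketch (not a tested schema), a denormalized book document posted to Solr's update handler could look like the following, with the author attributes declared multiValued="true" in schema.xml and kept in the same order across fields:

```xml
<add>
  <doc>
    <field name="title">Introduction to Information Retrieval</field>
    <field name="category">Computer Science</field>
    <!-- one entry per author, parallel across the author fields -->
    <field name="name">Jane Doe</field>
    <field name="name">John Smith</field>
    <field name="age">28</field>
    <field name="age">45</field>
    <field name="gender">female</field>
    <field name="gender">male</field>
  </doc>
</add>
```

One caveat: with parallel multi-valued fields, a query such as gender:female AND age:[22 TO 30] matches at the book level, so it can combine attributes of different authors of the same book.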

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: adding and updating a lot of document to Solr, metadata extraction etc

2009-11-10 Thread Israel Ekpo
On Tue, Nov 10, 2009 at 8:26 AM, Eugene Dzhurinsky  wrote:

> On Tue, Nov 03, 2009 at 05:49:23PM -0800, Lance Norskog wrote:
> > The DIH has improved a great deal from Solr 1.3 to 1.4. You will be
> > much better off using the DIH from this.
> >
> > This is the current Solr release candidate binary:
> > http://people.apache.org/~gsingers/solr/1.4.0/
>
> In fact we are prohibited to use release candidates/nightly builds, we are
> forced to use only releases of Solr :(
>
> --
> Eugene N Dzhurinsky
>


Well, the official release is out and you can pick it up from your closest
mirror here

http://www.apache.org/dyn/closer.cgi/lucene/solr/


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Commit error

2009-11-11 Thread Israel Ekpo
2009/11/11 Licinio Fernández Maurelo 

> Hi folks,
>
> i'm getting this error while committing after a dataimport of only 12 docs
> !!!
>
> Exception while solr commit.
> java.io.IOException: background merge hit exception: _3kta:C2329239
> _3ktb:c11->_3ktb into _3ktc [optimize] [mergeDocStores]
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2829)
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2750)
> at
>
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:401)
> at
>
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
> at
>
> org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:138)
> at
>
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:66)
> at
> org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:170)
> at
> org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:208)
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
> at
>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
> Caused by: java.io.IOException: No hay espacio libre en el dispositivo ["No space left on device"]
> at java.io.RandomAccessFile.writeBytes(Native Method)
> at java.io.RandomAccessFile.write(RandomAccessFile.java:499)
> at
>
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:191)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:75)
> at org.apache.lucene.store.IndexOutput.writeBytes(IndexOutput.java:45)
> at
>
> org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:229)
> at
>
> org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:184)
> at
>
> org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:217)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5089)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4589)
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
>
> Index info: 2.600.000 docs | 11G size
> System info: 15GB free disk space
>
> When attempting to commit the disk usage increases until solr breaks ... it
> looks like 15 GB is not enought space to do the merge | optimize
>
> Any advice?
>
> --
> Lici
>


Hi Licinio,

During the optimization process, the index can temporarily grow to
approximately double its original size, and the remaining space on disk may
not be enough for the task.

That is exactly what your description suggests: 15 GB of free space is not
enough headroom to optimize an 11 GB index.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Are subqueries possible in Solr? If so, are they performant?

2009-11-12 Thread Israel Ekpo
On Thu, Nov 12, 2009 at 3:39 PM, Chris Hostetter
wrote:

>
> : I am getting results from one query and I just need 2 index attribute
> values
> : . These index attribute values are used for form new Query to Solr.
>
> can you elaborate on what exactly you mean by "These index attribute
> values are used for form new Query to Solr" ... are you saying that you
> want to take the values from *every* document matching query#1 and use
> them to construct query#2
>
> this sounds like you arent' denormalizing your data enough when building
> your index.
>
> : Since Solr gives result only for GET request, hence there is restriction
> on
> : : forming query with all values.
>
> that's false ... you can post a query if you want, and there are not hard
> constraints on how big a query can be (just practical constraints on what
> your physical hardware can handle in a reasonable amount of time)
>
> : >> > SELECT id, first_name
> : >> > FROM student_details
> : >> > WHERE first_name IN (SELECT first_name
> : >> > FROM student_details
> : >> > WHERE subject= 'Science');
> : >> >
> : >> > If so, how performant is this kind of queries?
>
> even as a sql query this doesn't relaly make much sense to me (at least
> not w/o a better understanding of the table+data)
>
> why wouldn't you just say:
>
>SELECT id, first_name FROM ...WHERE subject='Science'
>
> ..or in Solr...
>
>q=subject:Science&fl=id,first_name
>
>
>
> -Hoss
>
>
It's also important to note that the Solr schema contains only one "table",
so to speak; whereas in the traditional database schema you can have more
than one table in the same schema where you can do JOINs and sub queries
across multiple tables to retrieve the target data.

If you are bringing data from multiple database tables into the Solr index,
it has to be denormalized to fit into just one "table" in Solr.

So you will have to use a BOOLEAN AND or a filter query to simulate the sub
query you are trying to make.

I hope this clears things a bit.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Newbie tips: migrating from mysql fulltext search / PHP integration

2009-11-15 Thread Israel Ekpo
On Mon, Nov 16, 2009 at 12:34 AM, Mattmann, Chris A (388J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> WOW, +1!! Great job, PHP!
>
> Cheers,
> Chris
>
>
>
> On 11/15/09 10:13 PM, "Otis Gospodnetic" 
> wrote:
>
> Hi,
>
> I'm not sure if you have a specific question there.
> But regarding "PHP integration" part, I just learned PHP now has native
> Solr (1.3 and 1.4) support:
>
>  http://twitter.com/otisg/status/5757184282
>
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
>
>
> - Original Message 
> > From: mbneto 
> > To: solr-user@lucene.apache.org
> > Sent: Sun, November 15, 2009 4:56:15 PM
> > Subject: Newbie tips: migrating from mysql fulltext search / PHP
> integration
> >
> > Hi,
> >
> > I am looking for alternatives to MySQL fulltext searches.  The combo
> > Lucene/Solr is one of my options and I'd like to gather as much
> information
> > I can before choosing and even build a prototype.
> >
> > My current need does not seem to be different.
> >
> > - fast response time (currently some searches can take more than 11sec)
> > - API to add/update/delete documents to the collection
> > - way to add synonymous or similar words for misspelled ones (ex. Sony =
> > Soni)
> > - way to define relevance of results (ex. If I search for LCD return
> > products that belong to the LCD category, contains LCD in the product
> > definition or ara marked as special offer)
> >
> > I know that I may have to add external code, for example, to take the
> > results and apply some business logic to resort the results but I'd like
> to
> > know, besides the wiki and the solr 1.4 Enterprise Seacrh Server book
> (which
> > I am considering to buy) the tips for solr usage.
>
>
>
> ++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.mattm...@jpl.nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++
>
>
>

Hi,

There is native support for Solr in PHP but currently you have to build it
as a PECL extension.

It is not yet bundled with the PHP source, but it is downloadable from the
PECL project homepage:

http://pecl.php.net/package/solr

If you have PECL support built into your PHP installation, you can install
it by running the following command:

pecl install solr-beta

Some usage examples are available here

http://us3.php.net/manual/en/solr.examples.php

More details are available here

http://www.php.net/manual/en/book.solr.php

I use Solr with PHP 5.2

- In PHP, the SolrClient class has methods to add, update, delete and
rollback changes to the index made since the last commit.
- There are also built-in tools in Solr that allow you to analyze and modify
the data before indexing it and when searching for it.
- with Solr you can define synonyms (check the wiki for more details)
- Solr also allows you to sort by score (relevance)
- You can specify the fields that you want either as (optional, required or
prohibited)

My last two points could take care of your last requirement.

Solr is awesome, and most of the searches I perform return sub-second
response times.

It is several hundred times easier and more efficient than MySQL fulltext
search. Believe me.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: PhP, Solr and Delta Imports

2009-11-16 Thread Israel Ekpo
On Mon, Nov 16, 2009 at 2:49 PM, Pablo Ferrari wrote:

> Hello,
>
> I have an already working Solr service based un full imports connected via
> php to a Zend Framework MVC (I connect it directly to the Controller).
> I use the SolrClient class for php which is great:
> http://www.php.net/manual/en/class.solrclient.php
>
> For now on, every time I want to edit a document I have to do a full import
> again or I can delete the document by its id and add it again with the
> updated info...
> Anyone can guide me a bit in how to do delta imports? If its via php,
> better!
>
> Thanks in advance,
>
> Pablo Ferrari
> Tinkerlabs.net
>


Hello Pablo,

You have a couple of options and you do not have to do a full data re-import
for the entire index.

My examples below use 'doc_id' as the uniqueKey field in your schema and
assume that it is an integer type.

1. You can remove the document from the index by query or by id (assuming
you have its id or uniqueKey field) if you want to just take it out of the
active index.

$client = new SolrClient($options);

$client->deleteById(400); // I recommend this one

OR

$client->deleteByQuery('doc_id:400'); // This should work too.

2. If all you want to do is to replace/update an existing document in the
Solr index and you still want the document to remain active in the index
then you can just update it by building a SolrInputDocument object and then
submitting just that document using the SolrClient.

$client = new SolrClient($options);

$doc = new SolrInputDocument();

$doc->addField('doc_id', 334455);
$doc->addField('other_field', 'Other Field Value');
$doc->addField('another_field', 'Another Field Value');

$updateResponse = $client->addDocument($doc);

If your changes are coming from the database, it would be helpful to have a
timestamp column that changes each time the record is modified.

Then you can keep track of when the last indexing run was done, and on the
next run retrieve only the 'active' documents that have been modified or
created since then. You can send a SolrInputDocument to the Solr index using
the SolrClient object, as shown above, for each such document.

Do not forget to save the changes to the index with a call to
SolrClient::commit().

If you are updating a lot of records, I would recommend waiting until the
end to do the commit (and the optimize call, if needed).

More examples are available here

http://us2.php.net/manual/en/solr.examples.php

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Solr - Load Increasing.

2009-11-16 Thread Israel Ekpo
On Mon, Nov 16, 2009 at 5:22 PM, Walter Underwood wrote:

> Probably "lakh": 100,000.
>
> So, 900k qpd and 3M docs.
>
> http://en.wikipedia.org/wiki/Lakh
>
> wunder
>
> On Nov 16, 2009, at 2:17 PM, Otis Gospodnetic wrote:
>
> > Hi,
> >
> > Your autoCommit settings are very aggressive.  I'm guessing that's what's
> causing the CPU load.
> >
> > btw. what is "laks"?
> >
> > Otis
> > --
> > Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> > Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >
> >
> >
> > - Original Message 
> >> From: kalidoss 
> >> To: solr-user@lucene.apache.org
> >> Sent: Mon, November 16, 2009 9:11:21 AM
> >> Subject: Solr - Load Increasing.
> >>
> >> Hi All.
> >>
> >>   My server solr box cpu utilization  increasing b/w 60 to 90% and some
> time
> >> solr is getting down and we are restarting it manually.
> >>
> >>   No of documents in solr 30 laks.
> >>   No of add/update requrest solr 30 thousand / day. Avg of every 30
> minutes
> >> around 500 writes.
> >>   No of search request 9laks / day.
> >>   Size of the data directory: 4gb.
> >>
> >>
> >>   My system ram is 8gb.
> >>   System available space 12gb.
> >>   processor Family: Pentium Pro
> >>
> >>   Our solr data size can be increase in number like 90 laks. and writes
> per day
> >> will be around 1laks.   - Hope its possible by solr.
> >>
> >>   For write commit i have configured like
> >>
> >>  1
> >>  10
> >>
> >>
> >>   Is all above can be possible? 90laks datas and 1laks per day writes
> and
> >> 30laks per day read??  - if yes what type of system configuration would
> require.
> >>
> >>   Please suggest us.
> >>
> >> thanks,
> >> Kalidoss.m,
> >>
> >>
> >> Get your world in your inbox!
> >>
> >> Mail, widgets, documents, spreadsheets, organizer and much more with
> your
> >> Sifymail WIYI id!
> >> Log on to http://www.sify.com
> >>
> >> ** DISCLAIMER **
> >> Information contained and transmitted by this E-MAIL is proprietary to
> Sify
> >> Limited and is intended for use only by the individual or entity to
> which it is
> >> addressed, and may contain information that is privileged, confidential
> or
> >> exempt from disclosure under applicable law. If this is a forwarded
> message, the
> >> content of this E-MAIL may not have been sent with the authority of the
> Company.
> >> If you are not the intended recipient, an agent of the intended
> recipient or a
> >> person responsible for delivering the information to the named
> recipient,  you
> >> are notified that any use, distribution, transmission, printing, copying
> or
> >> dissemination of this information in any way or in any manner is
> strictly
> >> prohibited. If you have received this communication in error, please
> delete this
> >> mail & notify us immediately at ad...@sifycorp.com
> >
>
>


Thanks Walter for clarifying that.

I too was wondering what "laks" meant.

It was a bit distracting when I read the original post.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


[ANNOUNCEMENT] solr-0.9.7 (beta) Released

2009-11-17 Thread Israel Ekpo
The new PECL package solr-0.9.7 (beta) has been released at
http://pecl.php.net/.

Release notes
-
- Fixed bug 16924 AC_MSG_NOTICE() is undefined in autoconf 2.13
- Added new method SolrClient::getDebug()
- Modified SolrClient::__construct() so that port numbers and other integer
values for the options can be passed as strings.
- Changed internal string handling mechanism to allow for tracking of memory
allocation in debug mode.
- Lowered minimum php version to 5.2.3. Unfortunately, this is the lowest
PHP version that will be supported. PHP versions lower than 5.2.3 are not
compatible or are causing tests to FAIL.
- Added php stubs for code-completion assists in IDEs and editors.
- Added more examples

Package Info
-
It effectively simplifies the process of interacting with Apache Solr using
PHP5 and it already comes with built-in readiness for the latest features
available in Solr 1.4. The extension has features such as built-in,
serializable query string builder objects which effectively simplifies the
manipulation of name-value pair request parameters across repeated requests.
The response from the Solr server is also automatically parsed into native
php objects whose properties can be accessed as array keys or object
properties without any additional configuration on the client-side. Its
advanced HTTP client reuses the same connection across multiple requests and
provides built-in support for connecting to Solr servers secured behind HTTP
Authentication or HTTP proxy servers. It is also able to connect to
SSL-enabled containers. Please consult the documentation for more details on
features.

Related Links
-
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-0.9.7.tgz

Authors
-
Israel Ekpo  (lead)


Re: Solr 1.3 query and index perf tank during optimize

2009-11-17 Thread Israel Ekpo
On Tue, Nov 17, 2009 at 2:24 PM, Chris Hostetter
wrote:

>
> : Basically, search entries are keyed to other documents.  We have finite
> : storage,
> : so we purge old documents.  My understanding was that deleted documents
> : still
> : take space until an optimize is done.  Therefore, if I don't optimize,
> the
> : index
> : size on disk will grow without bound.
> :
> : Am I mistaken?  If I don't ever have to optimize, it would make my life
> : easier.
>
> deletions are purged as segments get merged.  if you want to force
> deleted documents to be purged, the only way to do that at the
> moment is to optimize (which merges all segments).  but if you are
> continually deleteing/adding documents, the deletions will eventaully get
> purged even if you never optimize.
>
>
>
>
> -Hoss
>
>

Chris,

Since the mergeFactor controls the segment merge frequency and size, and the
number of segments is limited to mergeFactor - 1:

Would one be correct to state that if some documents have been deleted from
the index and the changes finalized with a call to commit, then as more
documents are added, the index will eventually be implicitly "optimized" and
the deleted documents purged, even without explicitly issuing an optimize
statement?


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Announcing the Apache Solr extension in PHP - 0.9.0

2009-11-23 Thread Israel Ekpo
Hi Mike,

Thanks to Pierre, Windows builds of the extension, compiled from trunk
r 291135, are available here:

http://downloads.php.net/pierre/

I am planning to have 0.9.8 compiled for Windows as soon as it is out,
sometime later this week.

The 1.0 release should be out sometime before mid December after the API is
finalized and tested.

You can always check the project home page for news about upcoming releases

http://pecl.php.net/package/solr

The documentation is available here
http://www.php.net/manual/en/book.solr.php

Cheers


On Mon, Nov 23, 2009 at 3:28 PM, Michael Lugassy  wrote:

> Thanks Israel, exactly what I was looking for, but how would one get a
> pre-compiled dll for windows? using PHP 5.3 VS9 TS.
>
> On Mon, Oct 5, 2009 at 7:03 AM, Israel Ekpo  wrote:
> > Fellow Apache Solr users,
> >
> > I have been working on a PHP extension for Apache Solr in C for quite
> > sometime now.
> >
> > I just finished testing it and I have completed the initial user level
> > documentation of the API
> >
> > Version 0.9.0-beta has just been released.
> >
> > It already has built-in readiness for Solr 1.4
> >
> > If you are using Solr 1.3 or later in PHP, I would appreciate if you
> could
> > check it out and give me some feedback.
> >
> > It is very easy to install on UNIX systems. I am still working on the
> build
> > for windows. It should be available for Windows soon.
> >
> > http://solr.israelekpo.com/manual/en/solr.installation.php
> >
> > A quick list of some of the features of the API include :
> > - Built in serialization of Solr Parameter objects.
> > - Reuse of HTTP connections across repeated requests.
> > - Ability to obtain input documents for possible resubmission from query
> > responses.
> > - Simplified interface to access server response data (SolrObject)
> > - Ability to connect to Solr server instances secured behind HTTP
> > Authentication and proxy servers
> >
> > The following components are also supported
> > - Facets
> > - MoreLikeThis
> > - TermsComponent
> > - Stats
> > - Highlighting
> >
> > Solr PECL Extension Homepage
> > http://pecl.php.net/package/solr
> >
> > Some examples are available here
> > http://solr.israelekpo.com/manual/en/solr.examples.php
> >
> > Interim Documentation Page until refresh of official PHP documentation
> > http://solr.israelekpo.com/manual/en/book.solr.php
> >
> > The C source is available here
> > http://svn.php.net/viewvc/pecl/solr/
> >
> > --
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the gift.
> > Quality First. Measure Twice. Cut Once.
> >
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: Stopping & Starting

2009-12-03 Thread Israel Ekpo
On Thu, Dec 3, 2009 at 5:01 PM, Yonik Seeley wrote:

> On Thu, Dec 3, 2009 at 4:57 PM, Lee Smith  wrote:
> > Hello All
> >
> > I am just starting out today with solr and looking for some advice but I
> > first have a problem.
> >
> > I ran the start command ie.
> >
> > user:~/solr/example$ java -jar start.jar
> >
> > Which worked perfect and started to explore the interface.
> >
> > But my terminal window dropped and I it has stopped working. If i try and
> > restart it Im getting errors and its still not working.
> >
> > error like:
> > 2009-12-03 21:55:41.785::WARN:  EXCEPTION
> > java.net.BindException: Address already in use
> >
> > So how can I stop and restart the service ?
>
> Try and find the java process and kill it?
> ps -elf | grep java
> kill 
>
> If no other Java processes are running under "user", then "killall
> java" is a quick way to do it (Linux has killall... not sure about
> other systems).
>
> -Yonik
> http://www.lucidimagination.com
>


On Ubuntu, CentOS and some other linux distros, you can run this command:

pkill -f start.jar

OR

pkill -f java

If there are no other java processes running


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


[PECL-DEV] [ANNOUNCEMENT] solr-0.9.8 (beta) Released

2009-12-03 Thread Israel Ekpo
The new PECL package solr-0.9.8 (beta) has been released at
http://pecl.php.net/.

Release notes
-
- Fixed config.w32 for Windows build support. (Pierre, Pierrick)
- Windows .dll now available at http://downloads.php.net/pierre (Pierre)
- Fixed Bug #16943 Segmentation Fault from solr_encode_string() during
attempt to retrieve solrXmlNode->children->content when
solrXmlNode->children is NULL (Israel)
- Disabled Expect header in libcurl (Israel)
- Disabled Memory Debugging when normal debug is enabled (Israel)
- Added list of contributors to the project (README.CONTRIBUTORS)

Package Info
-
It effectively simplifies the process of interacting with Apache Solr using
PHP5 and it already comes with built-in readiness for the latest features
available in Solr 1.4. The extension has features such as built-in,
serializable query string builder objects which effectively simplifies the
manipulation of name-value pair request parameters across repeated requests.
The response from the Solr server is also automatically parsed into native
php objects whose properties can be accessed as array keys or object
properties without any additional configuration on the client-side. Its
advanced HTTP client reuses the same connection across multiple requests and
provides built-in support for connecting to Solr servers secured behind HTTP
Authentication or HTTP proxy servers. It is also able to connect to
SSL-enabled containers. Please consult the documentation for more details on
features.

Related Links
-
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-0.9.8.tgz
Documentation: http://www.php.net/manual/en/book.solr.php

Authors
-----
Israel Ekpo  (lead)

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: parsing the raw query string?

2009-12-06 Thread Israel Ekpo
Hi

If you are planning to use Solr via PHP, you can take a look at the Solr
PECL extension.

http://www.php.net/manual/en/book.solr.php

which you can download from here

http://pecl.php.net/package/solr

There is a SolrQuery class that allows you to build and manage the
name-value pair parameters, which you can then pass on to the SolrClient
object for onward transmission to the Solr server. It is also serializable,
so you can cache it in the $_SESSION variable to propagate the parameters
from page to page across requests.

The SolrQuery class has built-in methods to add, update, remove and manage
the Facets, Highlighting, MoreLikeThis, Stats, TermsComponent, etc.
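To illustrate the kind of name-value pairs a SolrQuery object manages, the same query string can be sketched by hand with PHP's standard http_build_query(); the field names and keyword below are placeholders, not part of the extension's API:

```php
<?php
// The request parameters a SolrQuery object would accumulate for you,
// assembled by hand here for illustration only.
$params = array(
    'q'           => 'title:lucene',
    'start'       => 0,
    'rows'        => 10,
    'fl'          => 'id,title,score',
    'facet'       => 'true',
    'facet.field' => 'cat',
    'wt'          => 'json',
);

// Serialize into the query string sent to /solr/select on the server.
$queryString = http_build_query($params);

echo $queryString, "\n";
```

The SolrQuery class builds and serializes these pairs for you, so you rarely need to do this by hand.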

I hope this helps.

On Sun, Dec 6, 2009 at 1:25 AM, regany  wrote:

>
> I've just found solr and am looking at what's involved to work with it. All
> the examples I've seen only ever use 1 word search terms being implemented
> as examples, which doesn't help me trying to see how multiple word queries
> work. It also looks like a hell of a lot of processing needs to be done on
> the raw query string even before you can pass it to solr (in PHP) - is
> everyone processing the query string first and building a custom call to
> solr, or is there a query string parser I've missed somewhere? I can't even
> find what operators (if any) are able to be used in the raw query string in
> the online docs (maybe there aren't any??). Any help or points in the right
> direction would be appreciated.
> --
> View this message in context:
> http://old.nabble.com/parsing-the-raw-query-string--tp26662578p26662578.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


SolrQuerySyntax : Types of Range Queries in Solr 1.4

2009-12-09 Thread Israel Ekpo
Hi Guys,

In Lucene 2.9 and Solr 1.4, it is possible to perform inclusive and
exclusive range searches with square and curly brackets respectively.

However, when I looked at the SolrQuerySyntax wiki page, only the inclusive
range search is illustrated.

The examples seem to cover only inclusive range searches.

http://wiki.apache.org/solr/SolrQuerySyntax

Illustrative example:

There is a field in the index named 'year', and it contains the following
values:

2000, 2004, 2005, 2006, 2007, 2008, 2009, 2010

year:[2005 TO 2009] will match 2005, 2006, 2007, 2008, 2009 [inclusive with
square brackets]
year:{2005 TO 2009} will only match 2006, 2007, 2008 {exclusive with curly
brackets}. The bounds are not included.

Is there any other page on the wiki where there are examples of exclusive
range searches with curly brackets?

If not I would like to know so that I can add some examples to the wiki.

Thanks.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: SolrQuerySyntax : Types of Range Queries in Solr 1.4

2009-12-09 Thread Israel Ekpo
On Wed, Dec 9, 2009 at 1:13 PM, Yonik Seeley wrote:

> Solr standard query syntax is an extension of Lucene query syntax, and
> we reference that on the page:
> http://lucene.apache.org/java/2_4_0/queryparsersyntax.html
>
> -Yonik
> http://www.lucidimagination.com
>
> On Wed, Dec 9, 2009 at 1:08 PM, Israel Ekpo  wrote:
> > Hi Guys,
> >
> > In Lucene 2.9 and Solr 1.4, it is possible to perform inclusive and
> > exclusive range searches with square and curly brackets respectively.
> >
> > However, when I looked at the SolrQuerySyntax, only the the include range
> > search is illustrated.
> >
> > It seems like the examples only talk about the inclusive range searches.
> >
> > http://wiki.apache.org/solr/SolrQuerySyntax
> >
> > Illustrative example:
> >
> > There is a field in the index name 'year' and it contains the following
> > values :
> >
> > 2000, 2004, 2005, 2006, 2007, 2008, 2009, 2010
> >
> > year:[2005 TO 2009] will match 2005, 2006, 2007, 2008, 2009 [inclusive
> with
> > square brackets]
> > year:{2005 TO 2009} will only match 2006, 2007, 2008 {exclusive with
> curly
> > brackets}. The bounds are not included.
> >
> > Is there any other page on the wiki where there are examples of exclusive
> > range searches with curly brackets?
> >
> > If not I would like to know so that I can add some examples to the wiki.
> >
> > Thanks.
> >
> > --
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the gift.
> > Quality First. Measure Twice. Cut Once.
> > http://www.israelekpo.com/
> >
>


Hi Yonik,

I saw that.

I posted the question because someone asked me how to do the exclusive
search where the bounds are excluded.

Initially they started with field:[lower-1 TO upper-1], and I just told
them to use curly brackets instead. When I came to the Solr wiki to search, I
did not see any examples with the curly brackets.

For me this was very obvious, but I think it would be nice to add a few
examples with curly brackets to the SolrQuerySyntax examples, because many
people using Solr for the very first time may not have heard of or used
Lucene before.

Just a thought.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Indexing content on Windows file shares?

2009-12-10 Thread Israel Ekpo
If you are looking to index websites, Nutch would be a better alternative.

However, Solr could be useful for indexing text files.

There is documentation here for how to add data to the index

http://lucene.apache.org/solr/tutorial.html#Indexing+Data

http://wiki.apache.org/solr/#Search_and_Indexing

There are some clients here for adding data to the index programmatically.

http://wiki.apache.org/solr/IntegratingSolr



On Thu, Dec 10, 2009 at 3:06 PM, Matt Wilkie  wrote:

> Hello,
>
> I'm new to Solr, I know nothing about it other than it's been touted in a
> couple of places as a possible competitor to Google Search Appliance, which
> is what brought me here. I'm looking for a search engine which can index
> files on windows shares and websites, and, hopefully, integrate with Active
> Directory to ensure results are not returned to users who don't have access
> to those files(s).
>
> Can Solr do this? If so where is the documentation for it? Reconnaisance
> searches of the mailing list and wiki have not turned up anything, so far.
>
> thanks,
>
> --
> matt wilkie
> 
> Geomatics Analyst
> Information Management and Technology
> Yukon Department of Environment
> 10 Burns Road * Whitehorse, Yukon * Y1A 4Y9
> 867-667-8133 Tel * 867-393-7003 Fax
> http://environmentyukon.gov.yk.ca/geomatics/
> 
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Sol server is not set up ??

2009-12-11 Thread Israel Ekpo
On Fri, Dec 11, 2009 at 7:54 AM, regany  wrote:

>
> Hello!
>
> I'm trying to successfully build/install the PHP Solr Extension, but am
> running into an error when doing a "make test" - the following 4 tests
> fail,
> the other 17 pass. The Solr server is definately running because I can
> access it via the admin URL. Anyone know what else may be causing the make
> test to think teh solr server is not set up???
>
> regan
>
> =
> Running selected tests.
> TEST 1/21 [tests/solrclient_001.phpt]
> SKIP SolrClient::addDocument() - Sending a single document to the Solr
> server [tests/solrclient_001.phpt] reason: Solr server is not set up
> TEST 2/21 [tests/solrclient_002.phpt]
> SKIP SolrClient::addDocuments() - sending multiple documents to the Solr
> server [tests/solrclient_002.phpt] reason: Solr server is not set up
> TEST 3/21 [tests/solrclient_003.phpt]
> SKIP SolrClient::addDocuments() - sending a cloned document
> [tests/solrclient_003.phpt] reason: Solr server is not set up
> TEST 4/21 [tests/solrclient_004.phpt]
> SKIP SolrClient::query() - Sending a chained query request
> [tests/solrclient_004.phpt] reason: Solr server is not set up
> --
> View this message in context:
> http://old.nabble.com/Sol-server-is-not-set-uptp26743824p26743824.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
Hi Regan,

This is Israel, the author of the PHP extension.

There is nothing wrong with your Solr server, it is just a configuration
that you have to change in the test_config.php file before running the "make
test" command.

In the tests/test_config.php file you will have to change the value of *
SOLR_SERVER_CONFIGURED* from *false* to *true*.

You can view the contents of the file here in the repository:

http://svn.php.net/viewvc/pecl/solr/trunk/tests/test.config.php?revision=290120&view=markup

You also have to specify the correct values for the host name and port
numbers.
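Assuming the constant names used in that file (see the repository link above), the edit is roughly the following config fragment; the hostname and port values are examples for a default install, not values from this thread:

```php
<?php
/* sketch of tests/test_config.php edits; adjust host/port to your environment */
define('SOLR_SERVER_CONFIGURED', true);      // was false -- enables the live-server tests
define('SOLR_SERVER_HOSTNAME', 'localhost'); // example value
define('SOLR_SERVER_PORT', 8983);            // example value
```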

I am going to make some changes to the README files, the test scripts and
other documentation to make sure that this part is clear (why some tests may
be skipped). These changes should be available in the next update release
early next week.

So, please make these changes and try again. It should not be skipped this
time.

Also, I would like to know the version of the Solr extension, the PHP
version and the operating system you are using.

Please let me know if you need any help.

Sincerely,
Israel Ekpo

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: SolrClient::query(): Solr HTTP Error : 'Couldn't connect to server'

2009-12-11 Thread Israel Ekpo
On Fri, Dec 11, 2009 at 6:49 AM, regany  wrote:

>
> hi, I've (hopefully correctly) install the solr php extension.
>
> But I'm receiving the following error when trying to run my test script:
>
> SolrClient::query(): Solr HTTP Error : 'Couldn't connect to server'
>
> Any ideas how to figure out why it's giving the error??
>
> regan
>
>
> 
> /* Domain name of the Solr server */
> define('SOLR_SERVER_HOSTNAME', 'localhost');
>
> define('SOLR_SERVER_PATH', '/solr/core0');
>
> /* Whether or not to run in secure mode */
> define('SOLR_SECURE', false );
>
> /* HTTP Port to connection */
> define('SOLR_SERVER_PORT', ((SOLR_SECURE) ? 8443 : 8983));
>
> $options = array(
>'hostname' => SOLR_SERVER_HOSTNAME
>,'port' => SOLR_SERVER_PORT
>,'path' => SOLR_SERVER_PATH
>
> );
>
> $client = new SolrClient($options);
> $query = new SolrQuery();
> $query->setQuery('apple');
> $query->setStart(0);
> $query->setRows(50);
> $query_response = $client->Query($query);
> print_r($query_response);
> $respose = $query_response->getResponse();
> print_r($response);
>
> ?>
>
>
> --
> View this message in context:
> http://old.nabble.com/SolrClient%3A%3Aquery%28%29%3A-Solr-HTTP-Error-%3A-%27Couldn%27t-connect-to-server%27-tp26742899p26742899.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Hi Regan,

I have the following questions:

0. What version of Apache Solr are you using? 1.3, 1.4, nightly builds?

1. What version of PHP are you using and on what operating system?

2. What version of the Solr extension are you using?

3. Which servlet container are you using for Solr? (Jetty, Tomcat, Glass
fish etc)

4. What are the hostname, port number and path to Solr? Is your port
number 8080 or 8983?

Also, please let me know what the output of $client->getDebug() is. This
usually contains a very detailed account of what happened during the
connection.

I would be happy to help you troubleshoot any errors you are having.


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: apache-solr-common.jar

2009-12-14 Thread Israel Ekpo
2009/12/14 Noble Paul നോബിള്‍ नोब्ळ् 

> there is no solrcommon jar anymore. you may use the solrj jar which
> contains all the classes which were there in the comon jar.
>
> On Mon, Dec 14, 2009 at 9:22 PM, gudumba l 
> wrote:
> > Hello All,
> >   I have been using apache-solr-common-1.3.0.jar in my
> module.
> > I am planning to shift to the latest version, because of course it has
> more
> > flexibility. But it is really strange that I dont find any corresponding
> jar
> > of the latest version. I have serached in total apachae solr 1.4 folder
> > (which is downloaded from site), but have not found any. , I am sorry,
> its
> > really silly to request for a jar, but have no option.
> > Thanks.
> >
>
>
>
> --
> -
> Noble Paul | Systems Architect| AOL | http://aol.com
>


I had the same question too earlier last week and I found out after some
research where the packages are bundled.

The specific jar is in the dist folder as
apache-solr-1.4.0/dist/apache-solr-solrj-1.4.0.jar

This was where I found the classes in the org.apache.solr.common.* packages

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: solr php client vs file_get_contents?

2009-12-15 Thread Israel Ekpo
On Tue, Dec 15, 2009 at 8:49 AM, Faire Mii  wrote:

> i am using php to access solr and i wonder one thing.
>
> why should i use solr php client when i can use
>
> $serializedResult = file_get_contents('http://localhost:8983/solr/
> select?q=niklas&wt=phps');
>
> to get the result in arrays and then print them out?
>
> i dont really get the difference. is there any richer features with the php
> client?
>
>
> regards
>
> fayer



Hi Faire,

Have you actually used this library before? I think the library is pretty
well thought out.

From a simple glance at the source code you can see that one can use it for
the following purposes:

1. Adding documents to the index (which you cannot do with
file_get_contents alone). So that's one difference.

2. Updating existing documents

3. Deleting existing documents.

4. Balancing requests across multiple backend servers

There are other operations with the Solr server that the library can also
perform.

Some example of what I am referring to is illustrated here

http://code.google.com/p/solr-php-client/wiki/FAQ

http://code.google.com/p/solr-php-client/wiki/ExampleUsage

IBM also has an interesting article illustrating how to add documents to the
Solr index and issue commit and optimize calls using this library.

http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/

The author of the library can probably give you more details on what the
library has to offer.

I think you should download the source code and spend some time looking at
all the features it has to offer.

In my opinion, it is not fair to compare a well-thought-out library like
that with a simple PHP function.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Can solr web site have multiple versions of online API doc?

2009-12-15 Thread Israel Ekpo
2009/12/15 Teruhiko Kurosaka 

> Lucene keeps multiple versions of its API doc online at
> http://lucene.apache.org/java/X_Y_Z/api/all/index.html
> for version X.Y.Z.  I am finding this very useful when
> comparing different versions.  This is also good because
> the javadoc comments that I write for my software can
> reference the API comments of the exact version of
> Lucene that I am using.
>
> At Solr site, I can only find the API doc of the trunk
> build.  I cannot find 1.3.0 API doc, for example.
>
> Can Solr site also maintain the API docs for the past
> stable versions ?
>
> -kuro


Hi Teruhiko

If you downloaded the 1.3.0 release, you should find a "docs" folder inside
the zip file.

This contains the javadoc for that particular release.

You may also re-download the 1.3.0 release to get the docs for Solr 1.3.

I hope this helps.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: How use implement Lucene for perl.

2009-12-28 Thread Israel Ekpo
I think you need to send a message to the lucene mailing list instead if you
want to use Lucene directly.

java-u...@lucene.apache.org

The core API Javadoc page has a very simple example which you can use to get
started with a few modifications:

http://lucene.apache.org/java/3_0_0/api/core/index.html

Use the documentation to select the appropriate constructor and method
signatures.

On the other hand, I think Solr can do everything that you need without
having to interact directly with the Lucene API.

On Mon, Dec 28, 2009 at 11:42 PM, Maheshwar  wrote:

>
> I am new for Lucene.
> I haven't any idea about Lucene.
> I want to implement Lucene in my search script.
> Please guide me what I needs to be do for Lucene implementation.
>
> Actually, I want to integrate lucene search with message board system where
> people come to post new topic, edit that topic and delete that on needs. I
> want, to update search index at every action.
> So I need some valuable help.
>
>
>
> --
> View this message in context:
> http://old.nabble.com/How-use-implement-Lucene-for-perl.-tp26951130p26951130.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: NOT highlighting synonym

2009-12-28 Thread Israel Ekpo
I think what Erik was referring to was creating a separate copy field with a
different analyzer chain: copy the original value to that copy field and
index it differently.

That way you can use one field for search and another one to display the
highlighting results.
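A minimal schema.xml sketch of that arrangement; the field and type names here are illustrative, not from the thread:

```xml
<!-- illustrative only: one field analyzed with synonyms for matching,
     one synonym-free copy for highlighting -->
<field name="text"    type="text_with_synonyms" indexed="true" stored="false"/>
<field name="text_hl" type="text_no_synonyms"   indexed="true" stored="true"/>
<copyField source="text" dest="text_hl"/>
```

At query time you would search against `text` but ask the highlighter for `text_hl` via the `hl.fl` parameter.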



On Mon, Dec 28, 2009 at 1:00 PM, darniz  wrote:

>
> Thanks
> Unfortunately thats not the case.
> We are using the same field to do search on and display that text.
> So looks like in this case this is not possible
> Am i correct
>
>
> We have a custom field type with synonyms defined at query time.
>
> Erik Hatcher-4 wrote:
> >
> >
> > On Dec 23, 2009, at 2:26 PM, darniz wrote:
> >> i have a requirement where we dont want to hightlight synonym matches.
> >> for example i search for caddy and i dont want to highlight matched
> >> synonym
> >> like cadillac.
> >> Looking at highlighting parameters i didn't find any support for this.
> >> anyone can offer any advice.
> >
> > You can control what gets highlighted by which analyzer is used.  You
> > may need a different field for highlighting than you use for searching
> > in this case - but you can just create another field type without the
> > synonym filter in it and use that for highlighting.
> >
> >   Erik
> >
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/NOT-highlighting-synonym-tp26906321p26945921.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Help with creating a solr schema

2010-01-01 Thread Israel Ekpo
On Thu, Dec 31, 2009 at 10:26 AM, JaredM  wrote:

>
> Hi,
>
> I'm new to Solr but so far I think its great.  I've spent 2 weeks reading
> through the wiki and mailing list info.
>
> I have a use case and I'm not sure what the best way is to implement it.  I
> am keeping track of peoples calendar schedules in a really simple way: each
> user can login and input a number of date ranges where they are available
> (so for example - User Alice might be available between 1-Jan-2010 -
> 15-Jan-2010 and 20-Feb-2010 - 22-Feb-2010 and 1-Mar-2010-5-Mar-2010.
>
> In my data model I have this modelled as a one-to-many with a User table
> (consisting of username, some metadata) and an Availability table
> (consisting of start date and end date).
>
> Now I need to search which users are available between a given date range.
> The bit I'm having trouble with is how to store multiple start & end date
> pairs.  Can someone provide some guidance?
> --
> View this message in context:
> http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26979319.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

I have done something similar to this before.

You will have to store the username, firstname, and lastname as single-valued
fields.

However, the start and end dates should be multivalued tint (Trie integer) types.

I decided to store the dates as UNIX timestamps. The start dates are stored
as the unix timestamps at 12 midnight of that date (00:00:00)

The end dates are stored as the unix time stamps at 11:59:59 PM on the end
date 23:59:59

This (storing the dates as Trie integers) gave me faster range query
results.

When searching, you will also have to convert the dates to Unix timestamps
using similar logic before using them in the Solr search query.
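That conversion can be sketched as follows (UTC is assumed here for determinism; the original post does not specify a timezone):

```python
from datetime import datetime, timezone

# Start bound: midnight (00:00:00) of the start date.
# End bound: 23:59:59 of the end date. Both as Unix timestamps.
def start_bound(year, month, day):
    return int(datetime(year, month, day, 0, 0, 0, tzinfo=timezone.utc).timestamp())

def end_bound(year, month, day):
    return int(datetime(year, month, day, 23, 59, 59, tzinfo=timezone.utc).timestamp())
```

A date-range query then reduces to a numeric comparison over these integers, which is what makes the Trie integer fields fast.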

You should use the username of the user as the uniqueKey.

If a user has multiple dates of availability you will enter it like so:



exampleun
examplefn
exampleln
137865661
137865662
137865663
137865681
137865682
137865683




-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: solr 1.4 csv import -- Document missing required field: id

2010-01-01 Thread Israel Ekpo
On Fri, Jan 1, 2010 at 9:13 PM, evana  wrote:

>
> Hi,
>
> I am trying to import a csv file (without "id" field) on solr 1.4
> In schema.xml "id" field is set with required="false".
> But I am getting "org.apache.solr.common.SolrException: Document missing
> required field: id"
>
> Following is the schema.xml fields section
>  
>required="false" />
>   
>multiValued="true"/>
>
>   
>   
>   
>  
>
>  id
>
>
> Following is the csv file
>company_id,customer_name,active
>58,Apache,Y
>58,Solr,Y
>58,Lucene,Y
>60,IBM,Y
>
> Following is the solrj import client
>SolrServer server = new
> CommonsHttpSolrServer("http://localhost:8080/solr";);
>ContentStreamUpdateRequest req = new
> ContentStreamUpdateRequest("/update/csv");
>req.addFile(new File(filename));
>req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
>NamedList result = server.request(req);
>System.out.println("Result: " + result);
>
>
> Could any of you help out please.
>
> Thanks
> --
> View this message in context:
> http://old.nabble.com/solr-1.4-csv-import-Document-missing-required-field%3A-id-tp26990048p26990048.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
The presence of the uniqueKey definition implies that the id
field is a required field in the document, even though the required attribute
is set to false on the field definition.

Try removing the uniqueKey definition for the id field in the schema.xml
file, and then run the update script or application again.

The uniqueKey definition is not needed if you are going to build the index
from scratch each time you do the import.

However, if you are doing incremental updates, this field is required and
the uniqueKey definition is needed too, to specify what the "primary
key" for the document is.

http://wiki.apache.org/solr/UniqueKey


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Help with creating a solr schema

2010-01-01 Thread Israel Ekpo
On Fri, Jan 1, 2010 at 9:47 PM, JaredM  wrote:

>
> Thanks Ahmet and Israel.  I prefer Israel's approach since the amount of
> metadata for the user is quite high but I'm not clear how to get around one
> problem:
>
> If I had 2 availabilities (I've left it in human-readable form instead of
> as
> a UNIX timestamp only for ease of understanding):
>
> 10-Jan-2010
> 20-Jan-2010
> 25-Jan-2010
> 28-Jan-2010
>
> and I wanted to query for availability between 12-Jan-2010 to 26-Jan-2010
> then then wouldn't the above document be returned (even though the use
> would
> not be available 20-25 Jan?
> --
> View this message in context:
> http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26990178.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Unfortunately,

For this particular use case, if you are using the out-of-the-box features
available in Solr 1.4 (without a custom Solr plugin that uses a custom Lucene
filter and some special value storage arrangement for the fields), you will
have to store each start/end date pair as a separate document. So, there will
be N separate documents for each username if that user has N distinct
periods of availability. The start date and end date fields would also have
to be single-valued, instead of multi-valued as I specified in the earlier
post.

Sorry.
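The coverage check that this one-document-per-period layout supports can be sketched like this (day numbers stand in for real timestamps; names are illustrative):

```python
# A user matches the query range only if a SINGLE availability period
# covers the whole range -- which is why each (start, end) pair must be
# its own document rather than parallel multivalued fields.
def covers_range(periods, q_start, q_end):
    return any(start <= q_start and q_end <= end for (start, end) in periods)

alice = [(10, 20), (25, 28)]            # 10-20 Jan and 25-28 Jan
print(covers_range(alice, 12, 26))      # False: the 20-25 Jan gap breaks it
print(covers_range(alice, 11, 19))      # True: fully inside one period
```

With one document per period and single-valued fields, this check maps onto two Solr range clauses of the form start:[* TO QS] AND end:[QE TO *].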
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


[PECL-DEV] [ANNOUNCEMENT] solr-0.9.9 (beta) Released

2010-01-10 Thread Israel Ekpo
The new PECL package solr-0.9.9 (beta) has been released at
http://pecl.php.net/.

Release notes
-
- Fixed Bug #17009 Creating two SolrQuery objects leads to wrong query value
- Reset the buffer for the request data from the previous request in
SolrClient
- Added new internal static function solr_set_initial_curl_handle_options()
- Moved the initialization of CURL handle options to the
solr_set_initial_curl_handle_options() function
- Resetting the CURL options on the (CURL *) handle after each request is
completed
- Added more explicit error message to indicate that cloning SolrParams
objects and its descendants is currently not yet supported

Package Info
-
It effectively simplifies the process of interacting with Apache Solr using
PHP5 and it already comes with built-in readiness for the latest features
available in Solr 1.4. The extension has features such as built-in,
serializable query string builder objects which effectively simplifies the
manipulation of name-value pair request parameters across repeated requests.
The response from the Solr server is also automatically parsed into native
php objects whose properties can be accessed as array keys or object
properties without any additional configuration on the client-side. Its
advanced HTTP client reuses the same connection across multiple requests and
provides built-in support for connecting to Solr servers secured behind HTTP
Authentication or HTTP proxy servers. It is also able to connect to
SSL-enabled containers. Please consult the documentation for more details on
features.

Related Links
-
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-0.9.9.tgz
Documentation: http://us.php.net/solr

Authors
-
Israel Ekpo  (lead)


Re: What is this error means?

2010-01-12 Thread Israel Ekpo
Ellery,

A preliminary look at the source code indicates that the error happens
because the Solr server is taking longer than expected to respond to the
client:

http://code.google.com/p/solr-php-client/source/browse/trunk/Apache/Solr/Service.php

The default timeout handed down to Apache_Solr_Service::_sendRawPost() is 60
seconds, since you were calling the addDocument() method.

So if it took longer than that (1 minute), then it will exit with that error
message.

You will have to increase the default value to something much higher, like 10
minutes, on line 252 of the source code, since there is no way to specify
it in the constructor or the addDocument() method.

Another alternative is to update default_socket_timeout in the
php.ini file, or in the code using ini_set().
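The php.ini route can be sketched as the following config fragment (600 seconds is an arbitrary example value, not a recommendation):

```ini
; php.ini -- raise the timeout used by socket-based HTTP reads
default_socket_timeout = 600
```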

I hope that helps



On Tue, Jan 12, 2010 at 9:33 PM, Ellery Leung  wrote:

>
> Hi, here is the stack trace:
>
> 
> Fatal error:  Uncaught exception 'Exception' with message '"0"
> Status: Communication Error' in
> C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php:385
> Stack trace:
> #0 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(652):
> Apache_Solr_Service->_sendRawPost('http://127.0.0', '
> #1 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(676):
> Apache_Solr_Service->add('
> #2 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(221):
> Apache_Solr_Service->addDocument(Object(Apache_Solr_Document))
> #3 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(262):
> SolrSearchEngine->buildIndex(Array, 'key')
> #4 C:\nginx\html\apps\milio\lib\System\classes\Indexer\Indexer.class.php(51):
> SolrSearchEngine->createFullIndex('contacts', Array, 'key', 'www')
> #5 C:\nginx\html\apps\milio\lib\System\functions\createIndex.php(64):
> Indexer->create('www')
> #6 {main}
>  thrown in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php on line 385
>
> C:\nginx\html\apps\milio\htdocs\Contacts>pause
> Press any key to continue . . .
>
> Thanks for helping me.
>
>
> Grant Ingersoll-6 wrote:
> >
> > Do you have a stack trace?
> >
> > On Jan 12, 2010, at 2:54 AM, Ellery Leung wrote:
> >
> >> When I am building the index for around 2 ~ 25000 records, sometimes
> >> I
> >> came across with this error:
> >>
> >>
> >>
> >> Uncaught exception "Exception" with message '0' Status: Communication
> >> Error
> >>
> >>
> >>
> >> I search Google & Yahoo but no answer.
> >>
> >>
> >>
> >> I am now committing document to solr on every 10 records fetched from a
> >> SQLite Database with PHP 5.3.
> >>
> >>
> >>
> >> Platform: Windows 7 Home
> >>
> >> Web server: Nginx
> >>
> >> Solr Specification Version: 1.4.0
> >>
> >> Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06
> >> 12:33:40
> >>
> >> Lucene Specification Version: 2.9.1
> >>
> >> Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
> >>
> >> Solr hosted in jetty 6.1.3
> >>
> >>
> >>
> >> All the above are in one single test machine.
> >>
> >>
> >>
> >> The situation is that sometimes when I build the index, it can be
> created
> >> successfully.  But sometimes it will just stop with the above error.
> >>
> >>
> >>
> >> Any clue?  Please help.
> >>
> >>
> >>
> >> Thank you in advance.
> >>
> >
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/What-is-this-error-means--tp27123815p27138658.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Basic questions about Solr cost in programming time

2010-01-26 Thread Israel Ekpo
On Tue, Jan 26, 2010 at 3:00 PM, Jeff Crump wrote:

> Hi,
> I hope this message is OK for this list.
>
> I'm looking into search solutions for an intranet site built with Drupal.
> Eventually we'd like to scale to enterprise search, which would include the
> Drupal site, a document repository, and Jive SBS (collaboration software).
> I'm interested in Lucene/Solr because of its scalability, faceted search
> and
> optimization features, and because it is free. Our problem is that we are a
> non-profit organization with only three very busy programmers/sys admins
> supporting our employees around the world.
>
> To help me argue for Solr in terms of total cost, I'm hoping that members
> of
> this list can share their insights about the following:
>
> * About how many hours of programming did it take you to set up your
> instance of Lucene/Solr (not counting time spent on optimization)?
>
>
For me this generally took 30 to 70 hours to create the entire search
application, depending on the features of the web application and the
complexity of the site.


> * Are there any disadvantages of going with a certified distribution rather
> than the standard distribution?
>
>
The people at Lucid Imagination can probably provide a better answer for
this. It is not really a disadvantage to go with the certified version but
you may have to pay in order to get the certified distribution. However, you
will get dedicated support if you happen to run into any issues or need
technical assistance.

If you use the standard version you can always get help from the mailing
list if you have any issues.



> Thanks and best regards,
> Jeff
>
> Jeff Crump
> jcr...@hq.mercycorps.org
>
>
>
>
>
>
>
>
>
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Question about custom Lucene filters and Solr

2010-02-16 Thread Israel Ekpo
Hi Jon,

You will need to write a plugin.

You will need a custom query parser and an update handler, depending on what
you are doing.

Implementing a custom Update Handler or Update Request Processor is
generally not recommended because it is considered advanced.

Take a look at the following links for more information

http://wiki.apache.org/solr/SolrPlugins

http://wiki.apache.org/solr/UpdateRequestProcessor

http://lucene.apache.org/solr/api/org/apache/solr/update/UpdateHandler.html

http://lucene.apache.org/solr/api/org/apache/solr/search/QParserPlugin.html

http://lucene.apache.org/solr/api/org/apache/solr/update/processor/UpdateRequestProcessor.html

On Tue, Feb 16, 2010 at 2:43 PM, Jon Bodner  wrote:

> Hello,
>
> I'm interested in using Solr with a custom Lucene Filter (like the one
> described in section 6.4.1 of the Lucene In Action, Second Edition book).
>  I'd like to filter search results from a Lucene index against information
> stored in a relational database.  I don't want to move the relational
> database information into the search index, because it could change
> frequently.
>
> I looked at writing my own custom Solr SearchComponent, but the
> documentation for those seems slim.  Is this the correct approach?  Is there
> another way?
>
> Thanks,
>
> Jon
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: If you could have one feature in Solr...

2010-02-24 Thread Israel Ekpo
Grant,

One feature that I would like to see is the ability to do a bitwise search.

I have had to work around this with a query parser plugin that uses an
org.apache.lucene.search.Filter.

I think having this feature would be very nice, and I prefer it to searching
with multiple OR-type queries, especially when the bits are known ahead of
time.

I can submit the code as a patch once I get the approval to do so.
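The AND-style match such a bitwise filter performs can be sketched as follows (flag names are illustrative, and this models only the matching rule, not the actual plugin):

```python
# A document qualifies when every bit set in the query mask is also set
# in the document's stored integer flags field.
def matches_all_bits(stored, mask):
    return (stored & mask) == mask

FLAG_A, FLAG_B, FLAG_C = 0b001, 0b010, 0b100   # hypothetical flags
doc_flags = FLAG_A | FLAG_C
print(matches_all_bits(doc_flags, FLAG_A | FLAG_C))  # True: both bits present
print(matches_all_bits(doc_flags, FLAG_B))           # False
```

Compared with OR-ing together one term query per flag value, a single mask comparison per document is much simpler when the bits are known ahead of time.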

On Wed, Feb 24, 2010 at 2:20 PM, straup  wrote:

> I actually found the documentation pretty great especially since (my
> experience, anyway) most Java projects seem to default to generic JavaDoc
> derived documentation (and that makes me cry).
>
> That said, more cookbook-style "recipes" or stories would be helpful for
> some of the more esoteric parts of Solr.
>
> Also: real-time indexing and geo.
>
> Cheers,
>
>
> On 2/24/10 9:54 AM, Grant Ingersoll wrote:
>
>>
>> On Feb 24, 2010, at 11:08 AM, Stefano Cherchi wrote:
>>
>>  Decent documentation.
>>>
>>
>> What parts do you feel are lacking?  Or is it just across the board?
>>  Wikis are both good and bad for documentation, IMO.
>>
>> -Grant
>>
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: updating particular field

2010-03-01 Thread Israel Ekpo
Unfortunately, because of how Lucene works internally, you will not be able
to update just one or two fields. You have to resubmit the entire document.

If you send only the one or two fields you changed, the updated document will
contain nothing but the fields sent in that last update.
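In practice that means merging the changed fields into the full stored document and re-adding the whole thing. A sketch in Python (the field values come from the example below; the helper only builds the `<add>` payload that would be POSTed to the update handler, it does not send it):

```python
# Sketch: re-adding a whole document to apply a "partial" change.
# We merge the changed fields into the full stored document and rebuild
# the <add><doc> payload for the update handler.
from xml.sax.saxutils import escape

def build_add_xml(doc):
    fields = "".join(
        '<field name="%s">%s</field>' % (name, escape(str(value)))
        for name, value in sorted(doc.items())
    )
    return "<add><doc>%s</doc></add>" % fields

stored = {"id": "EN7800GTX", "inStock": "false", "price": "479.95"}
changed = {"inStock": "true"}    # the only field we want to change

full_doc = dict(stored)
full_doc.update(changed)         # every other field must be resent too

payload = build_add_xml(full_doc)
print(payload)
```

If any stored field is left out of `full_doc`, it will be gone after the update.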

On Mon, Mar 1, 2010 at 7:09 AM, Suram  wrote:

>
>
>
> Siddhant wrote:
> >
> > Yes. You can just re-add the document with your changes, and the rest of
> > the
> > fields in the document will remain unchanged.
> >
> > On Mon, Mar 1, 2010 at 5:09 PM, Suram  wrote:
> >
> >>
> >> Hi,
> >>
> >> 
> >>   EN7800GTX/2DHTV/256M
> >>  ASUS Computer Inc.
> >>  electronics
> >>  graphics card
> >>  NVIDIA GeForce 7800 GTX GPU/VPU clocked at
> >> 486MHz
> >>  256MB GDDR3 Memory clocked at 1.35GHz
> >>  479.95
> >>  7
> >>  false
> >>  2006-02-13T15:26:37Z/DAY
> >> 
> >>
> >> can i possible to update true without
> >> affect
> >> any field of my previous document
> >>
> >> Thanks in advance
> >> --
> >> View this message in context:
> >>
> http://old.nabble.com/updating-particular-field-tp27742399p27742399.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
> > --
> > - Siddhant
> >
> >
>
>
> Hi,
>   Here i don't want to reload entire data just i want u update a field
> i need to change(ie one or more field with id not whole)
>
>
> --
> View this message in context:
> http://old.nabble.com/updating-particular-field-tp27742399p27742671.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: get Server Status, TotalDocCount .... PHP !

2010-03-02 Thread Israel Ekpo
The last time I tried using SolrPHPClient for this, it did not handle the
response very well because of the JSON generated on the server side.

I am not sure if anything has changed since then.

The JSON it generated could not be parsed properly.

If you do not want to analyze the XML response each time, and you are not
using the PECL extension, you will need to send a request manually to the
Solr server using cURL, specify the response format as phps, and unserialize
the result.

On Tue, Mar 2, 2010 at 9:59 AM, stocki  wrote:

>
> Hey-
>
> No i use the SolrPHPClient http://code.google.com/p/solr-php-client/
> i not really want tu use two different php-libs. ^^
>
> what do you mean with unserialize ? XD
>
>
>
>
>
> Guillaume Rossolini-2 wrote:
> >
> > Hi
> >
> > Have you tried the php_solr extension from PECL?  It has a handy
> > SolrPingResponse class.
> > Or you could just call the CORENAME/admin/ping?wt=phps URL and
> unserialize
> > it.
> >
> > Regards,
> >
> > --
> > I N S T A N T  |  L U X E - 44 rue de Montmorency | 75003 Paris | France
> > Tél. : 01 80 50 52 51 | Mob. : 06 09 96 10 29 | web :
> www.instantluxe.com
> >
> >
> > On Tue, Mar 2, 2010 at 2:50 PM, stocki  wrote:
> >
> >>
> >> hello
> >>
> >> I use Solr in my cakePHP Framework.
> >>
> >> How can i get status information of my solr cores ??
> >>
> >> I dont want analyze everytime the responseXML.
> >>
> >> do anybody know a nice way to get status messages from solr ?
> >>
> >> thx ;) Jonas
> >> --
> >> View this message in context:
> >>
> http://old.nabble.com/get-Server-Status%2C-TotalDocCount--PHP-%21-tp27756118p27756118.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/get-Server-Status%2C-TotalDocCount--PHP-%21-tp27756118p27756852.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Search on dynamic fields which contains spaces /special characters

2010-03-08 Thread Israel Ekpo
I do not believe the Solr or Lucene query syntax allows this.

You need to get rid of all the spaces in the field name.

Otherwise, you will be searching for "short" in the default field and for
"name1" in the "name" field.

http://wiki.apache.org/solr/SolrQuerySyntax

http://lucene.apache.org/java/2_9_2/queryparsersyntax.html
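A common workaround is to sanitize dynamic field names before indexing so they never contain spaces or special characters. A minimal sketch (the lowercase-and-underscore convention here is just one reasonable choice, not a Solr requirement):

```python
# Sketch: normalize dynamic field names before sending documents to
# Solr, so queries like short_name:name1 become possible.
import re

def sanitize_field_name(name):
    # Lowercase, collapse runs of non-alphanumerics to a single
    # underscore, and trim underscores from the ends.
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

doc = {"short name": "name1", "Full Name": "John Doe"}
clean = {sanitize_field_name(k): v for k, v in doc.items()}
print(sorted(clean))  # ['full_name', 'short_name']
```

The same mapping has to be applied at query time so that user-facing labels resolve to the sanitized field names.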


On Mon, Mar 8, 2010 at 2:17 PM, JavaGuy84  wrote:

>
> Hi,
>
> We have some dynamic fields getting indexed using SOLR. Some of the dynamic
> fields contains spaces / special character (something like: short name,
> Full
> Name etc...). Is there a way to search on these fields (which contains the
> spaces etc..). Can someone let me know the filter I need to pass to do this
> type of search?
>
> I tried with short name:name1 --> this didnt work..
>
> Thanks,
> Barani
> --
> View this message in context:
> http://old.nabble.com/Search-on-dynamic-fields-which-contains-spaces--special-characters-tp27826147p27826147.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Features not present in Solr

2010-03-20 Thread Israel Ekpo
One feature you will not find in Solr is licensing fees and fine print.

You should also not expect to pay anything in order to use Solr.

On Fri, Mar 19, 2010 at 11:16 PM, Srikanth B  wrote:

> Hello
>
> We are in the process of researching on Solr features. I am looking for two
> things
>1. Features not available in Solr but present in other products like
> Endeca
>2. What one shouldn't not expect from Solr
>
> Any thoughts ?
>
> Thanks in advance
> Srikanth
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Features not present in Solr

2010-03-22 Thread Israel Ekpo
On Mon, Mar 22, 2010 at 3:16 PM, Lance Norskog  wrote:

> Web crawling.


I don't think Solr was designed with web crawling in mind. Nutch would be
better suited for that, I believe.


> Text analysis.
>

This is a bit vague.

Please elaborate further. A lot of analysis (stemming, stop-word removal,
character transformation, etc.) already takes place implicitly, based on the
field types you define and use in the schema.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters


> Distributed index management.
> A fanatical devotion to the Pope.
>
There are probably a lot of features already available in Solr out of the box
that most of those other "enterprise level" applications do not have yet.

You would also be surprised to learn that a lot of them use Lucene under the
covers and are actually trying to re-implement what is already available in
Solr.


> On Sun, Mar 21, 2010 at 11:19 PM, MitchK  wrote:
> >
> > Srikanth,
> >
> > I don't know anything about Endeca, so I can't compare Solr to it.
> > However, I know Solr is powerful. Very powerful.
> > So, maybe you should tell us more about your needs to get a good answer.
> >
> > As a response to your second question: You should not expect that Solr is
> > a database. It is an index-server. A database makes your data save. If
> there
> > goes something wrong - which is always possible - Solr gives no
> warranties.
> > Maybe someone other can tell you more about this topic.
> >
> > - Mitch
> >
> >
> > Srikanth B wrote:
> >>
> >> Hello
> >>
> >> We are in the process of researching on Solr features. I am looking for
> >> two
> >> things
> >> 1. Features not available in Solr but present in other products
> >> like
> >> Endeca
> >> 2. What one shouldn't not expect from Solr
> >>
> >> Any thoughts ?
> >>
> >> Thanks in advance
> >> Srikanth
> >>
> >>
> >
> > --
> > View this message in context:
> http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: selecting documents older than 4 hours

2010-04-01 Thread Israel Ekpo
I did something similar.

The only difference with my setup is that I have two columns: one that stores
the date the document was first created and a second that stores the date it
was last updated, both as Unix timestamps.

So my query to find documents that are older than 4 hours is very simple.

To find documents that were last updated more than four hours ago, you would
do something like this:

q=last_update_date:[* TO 1270119278]

The current timestamp is 1270133678; 4 hours ago it was 1270119278.

The column type in the schema is tint.
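The cutoff can be computed on the client at query time. A sketch (the field name assumed to match the schema above):

```python
# Sketch: build the "older than N hours" range query against a tint
# field holding Unix timestamps, computing the cutoff at query time.
import time

def older_than_query(field, hours, now=None):
    now = int(now if now is not None else time.time())
    cutoff = now - hours * 3600
    return "%s:[* TO %d]" % (field, cutoff)

# Using the timestamp from the example above as "now":
print(older_than_query("last_update_date", 4, now=1270133678))
# last_update_date:[* TO 1270119278]
```

This sidesteps the NOW-4HOURS date-math issue entirely, since the range bound is a plain integer.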



On Wed, Mar 31, 2010 at 11:18 PM, herceg_novi  wrote:

>
> Hello, I'd like to select documents older than 4 hours in my Solr 1.4
> installation.
>
> The query
>
> q=last_update_date:[NOW-7DAYS TO NOW-4HOURS]
>
> does not return a correct recordset. I would expect to get all documents
> with last_update_date in the specified range. Instead solr returns all
> documents that exist in the index which is not what I would expect.
> Last_update_date is SolrDate field.
>
> This does not work either
> q=last_update_date:[NOW/DAY-7DAYS TO NOW/HOUR-4HOURS]
>
> This works, but I manually had to calculate the 4 hour difference and
> insert
> solr date formated timestamp into my query (I prefer not to do that)
> q=last_update_date:[NOW/DAY-7DAYS TO 2010-03-31T19:40:34Z]
>
> Any ideas if I can get this to work as expected?
> q=last_update_date:[NOW-7DAYS TO NOW-4HOURS]
>
> Thanks!
> --
> View this message in context:
> http://n3.nabble.com/selecting-documents-older-than-4-hours-tp689975p689975.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: add/update document as distinct operations? Is it possible?

2010-04-05 Thread Israel Ekpo
Chris,

I don't see anything in the headers suggesting that Julian's message was a
hijack of another thread

On Thu, Apr 1, 2010 at 2:17 PM, Chris Hostetter wrote:

>
> : Subject: add/update document as distinct operations? Is it possible?
> : References:
> :
> 
> : In-Reply-To:
> :
> 
>
> http://people.apache.org/~hossman/#threadhijack
> Thread Hijacking on Mailing Lists
>
> When starting a new discussion on a mailing list, please do not reply to
> an existing message, instead start a fresh email.  Even if you change the
> subject line of your email, other mail headers still track which thread
> you replied to and your question is "hidden" in that thread and gets less
> attention.   It makes following discussions in the mailing list archives
> particularly difficult.
> See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
>
>
>
>
> -Hoss
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Tutorials for developing filter plugins.

2010-04-11 Thread Israel Ekpo
He is referring to the org.apache.lucene.search.Filter classes.

Michael,

I did a search too and I could not really find any useful tutorials on the
subject.

You can take a look at how this is implemented in the Spatial Solr Plugin by
the JTeam

http://www.jteam.nl/news/spatialsolr.html

Their code, I believe, uses the bits() method, which was deprecated in
Lucene 2.9 and removed in 3.0.

Its replacement, getDocIdSet(), returns a DocIdSet, which you can back with an
org.apache.lucene.util.OpenBitSet.

I think there is probably an example of something similar in the new version
(2nd Edition) of the *Lucene in Action* book.

You should check it out from the Manning Early Access Program page.

http://www.manning.com/hatcher3/

You should also check out the Solr 1.5 source code for how some of the
Lucene Filter classes are designed.



On Sat, Apr 10, 2010 at 5:23 AM, MitchK  wrote:

>
> Hi Michael,
>
> do you mean a TokenFilter like StopWordFilter?
>
> If you like, you could post some code, so one can help you.
> It's really easy to develop some TokenFilters, if you have a look at
> already
> implemented ones.
>
> Kind regards
> - Mitch
> --
> View this message in context:
> http://n3.nabble.com/Tutorials-for-developing-filter-plugins-tp706874p709897.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: I have a big problem with pagination using apache solr and haystack

2010-04-20 Thread Israel Ekpo
I hear this sort of complaint frequently.

Make sure you did not forget to send a commit request after deleting any
documents.

Until the commit request is made, those deletes are not finalized and the
removed documents will still show up in search results.
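For the XML update handler, the delete-then-commit sequence looks roughly like this. A sketch that only builds the two request bodies (both would be POSTed to the update handler; the localhost URL in the comment is an assumption):

```python
# Sketch: the two requests needed to finalize a delete. Both bodies
# would be POSTed to the update handler, e.g.
# http://localhost:8983/solr/update

def delete_by_query_xml(query):
    return "<delete><query>%s</query></delete>" % query

def commit_xml():
    # Deletes (and adds) only become visible to searchers after this.
    return "<commit/>"

print(delete_by_query_xml("id:EN7800GTX"))
print(commit_xml())
```

Forgetting the second request is the usual cause of "deleted" documents still appearing in paginated results.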

On Tue, Apr 20, 2010 at 2:37 PM, MitchK  wrote:

>
> Hi Isaac,
>
> how did you implement pagination in Solr? What did you do there?
> Did you ever had a look at your index with q=*:*?
> Maybe you've forgotten to delete some news while testing your application
> and so there are some duplicates.
>
> Another thing is: If you have got only 20 news and Solr seems to have 40
> you
> should be able to find those which are doubled. If not - don't change
> anything, try to find a corporation with a lot of money and declare "I've
> got an application which writes its own news - artificial intelligence?
> Here
> you are!" :).
>
> Hope this helps
> - Mitch
> --
> View this message in context:
> http://n3.nabble.com/I-have-a-big-problem-with-pagination-using-apache-solr-and-haystack-tp732572p733115.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Evangelism

2010-04-29 Thread Israel Ekpo
Checkout Lucid Imagination

http://www.lucidimagination.com/About-Search

This should convince you.

On Thu, Apr 29, 2010 at 2:10 PM, Daniel Baughman wrote:

> Hi I'm new to the list here,
>
>
>
> I'd like to steer someone in the direction of Solr, and I see the list of
> companies using solr, but none have a "power by solr" logo or anything.
>
>
>
> Does anyone have any great links with evidence to majorly successful solr
> projects?
>
>
>
> Thanks in advance,
>
>
>
> Dan B.
>
>
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Evangelism

2010-04-29 Thread Israel Ekpo
Their main search page has the "Powered by Solr" logo

http://www.lucidimagination.com/search/



On Thu, Apr 29, 2010 at 2:18 PM, Israel Ekpo  wrote:

> Checkout Lucid Imagination
>
> http://www.lucidimagination.com/About-Search
>
> This should convince you.
>
>
> On Thu, Apr 29, 2010 at 2:10 PM, Daniel Baughman wrote:
>
>> Hi I'm new to the list here,
>>
>>
>>
>> I'd like to steer someone in the direction of Solr, and I see the list of
>> companies using solr, but none have a "power by solr" logo or anything.
>>
>>
>>
>> Does anyone have any great links with evidence to majorly successful solr
>> projects?
>>
>>
>>
>> Thanks in advance,
>>
>>
>>
>> Dan B.
>>
>>
>>
>>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>



-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/

