Hello,
I have a custom component which depends on the ordering of a multi-valued
parameter. Unfortunately it looks like the values do not come back in the same
order as they were put in the URL. Here is some code to explain the behavior:
URL: /solr/my_custom_handler?q=something&myparam=foo&mypa
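For what it's worth, you can confirm what order the parameters actually have in the raw URL (before Solr parses them) by parsing the query string yourself. A quick sketch in Python, using a made-up complete URL since the one above is truncated:

```python
from urllib.parse import urlsplit, parse_qsl

# parse_qsl preserves the left-to-right order of repeated parameters,
# so it shows exactly the order the client put on the wire.
url = "/solr/my_custom_handler?q=something&myparam=foo&myparam=bar"
pairs = parse_qsl(urlsplit(url).query)
values = [v for k, v in pairs if k == "myparam"]
print(values)  # -> ['foo', 'bar'], the order as written in the URL
```

If the order is correct on the wire but wrong inside the component, the reordering is happening in Solr's parameter handling rather than in the client.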
On 17/03/2012, geeky2 wrote:
> hello all,
>
> i know this is never a fun topic for people, but our SDLC mandates that we
> have unit test cases that attempt to validate the output from specific solr
> queries.
>
> i have some ideas on how to do this, but would really appreciate feedback
> from any
I'm puzzled on whether or not Solr is the right system for solving this
problem I've got. I'm using some Solr indexes for autocompletion, and I
have a desire to rank the results by their value to the requesting user.
Essentially, I'll tally the number of times the user has chosen particular
results
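One rough way to model that, independent of Solr (every name here is made up for illustration): keep a per-user click tally and fold it into the ranking as a dampened boost on top of the base score.

```python
import math
from collections import Counter

def rerank(suggestions, clicks, weight=0.5):
    """Re-rank (term, base_score) pairs by base score plus a
    log-dampened boost from the user's past selections."""
    def boosted(item):
        term, base = item
        return base + weight * math.log1p(clicks[term])
    return sorted(suggestions, key=boosted, reverse=True)

clicks = Counter({"solr faceting": 5})
ranked = rerank([("solr fuzzy", 1.0), ("solr faceting", 1.0)], clicks)
print(ranked[0][0])  # -> 'solr faceting': the previously-chosen term wins the tie
```

In Solr itself this kind of per-user signal is usually applied via function-query boosts or an ExternalFileField holding the tallies, but the arithmetic is the same.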
Hi,
Is there a way for SOLR / SOLRJ to index files directly, bypassing HTTP
streaming?
Use case:
* Text Files to be indexed are on file server (A) (some potentially large -
several 100 MB)
* SOLRJ client is on server (B)
* SOLR server is on server (C) running with dynamically created SOLR cores
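One option that fits this layout is Solr's remote streaming: with enableRemoteStreaming="true" on the requestParsers element in solrconfig.xml, server (C) can pull the file itself via the stream.file parameter (a path the Solr server can read) or stream.url, so the bytes never have to pass through the SolrJ client on (B). A sketch that just builds such a request URL; the host, core name, path, and literal.id below are hypothetical:

```python
from urllib.parse import urlencode

def remote_stream_update_url(solr_base, core, path):
    # stream.file must point at a file the *Solr server* can read;
    # literal.id and commit are illustrative extract-handler params.
    params = urlencode({
        "stream.file": path,
        "stream.contentType": "text/plain;charset=utf-8",
        "literal.id": "doc1",
        "commit": "true",
    })
    return f"{solr_base}/{core}/update/extract?{params}"

url = remote_stream_update_url("http://solr-c:8983/solr", "core0",
                               "/mnt/fileserver/big.txt")
print(url)
```

This only helps if the file server (A) is mounted on, or reachable by URL from, the Solr host (C); otherwise the client has to stream the bytes.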
hello all,
i know this is never a fun topic for people, but our SDLC mandates that we
have unit test cases that attempt to validate the output from specific solr
queries.
i have some ideas on how to do this, but would really appreciate feedback
from anyone that has done this or is doing it now.
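One pattern that keeps such tests cheap: assert on the shape of the /select response (wt=json) rather than on scores, and run against a recorded fixture so the suite doesn't need a live Solr. A minimal sketch; the field names and ids are illustrative:

```python
import json

# Validate the shape of a Solr /select?wt=json response. In a real test
# the payload would come from an HTTP call or a recorded fixture file.
def check_select_response(payload, expected_ids):
    doc = json.loads(payload)
    assert doc["responseHeader"]["status"] == 0
    got = [d["id"] for d in doc["response"]["docs"]]
    assert got == expected_ids, f"expected {expected_ids}, got {got}"
    return got

fixture = ('{"responseHeader":{"status":0},'
           '"response":{"numFound":2,"start":0,'
           '"docs":[{"id":"1"},{"id":"2"}]}}')
print(check_select_response(fixture, ["1", "2"]))  # -> ['1', '2']
```

Testing ids and counts rather than scores keeps the assertions stable across scoring or configuration tweaks.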
Hello,
I want to do highlighting by "hand" on my indexed documents, which can be
XML, HTML, PDF, SVG, CGM...
Given a search query I want to be able to extract all the terms occurring
in this query to be able to do custom highlighting on the results. The
returned terms should be coherent with the
It seems that you are using the bbyopen data. If you have made up your mind on
using the JSON data, then simply store it in ElasticSearch instead of Solr,
as it takes any valid JSON structure. Otherwise, you can download the
xml archive from bbyopen and prepare a schema:
Here are some generic instru
I'm still having issues replicating in my work environment. Can anyone
explain how the replication mechanism works? Is it communicating across
ports or through ZooKeeper to manage the process?
On Thu, Mar 8, 2012 at 10:57 PM, Matthew Parker <
mpar...@apogeeintegration.com> wrote:
> All,
>
> I
Hello,
Frankly speaking, the computational complexity of a Lucene search depends on the
size of the search result, numFound*log(start+rows), not on the size of the index.
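The point can be illustrated outside Lucene: a top-k collector keeps a bounded priority queue of size start+rows, so the per-hit cost is log(start+rows) no matter how many documents stream past it. A sketch:

```python
import heapq
import random

# A top-k collector: the heap never grows beyond k entries, so each of
# the n candidates costs at most O(log k), independent of index size.
def top_k(scores, k):
    heap = []  # min-heap of the k best scores seen so far
    for s in scores:
        if len(heap) < k:
            heapq.heappush(heap, s)
        elif s > heap[0]:
            heapq.heapreplace(heap, s)
    return sorted(heap, reverse=True)

random.seed(0)
print(top_k((random.random() for _ in range(10000)), 3))
```

Doubling the number of candidate documents doubles the work linearly, but raising rows is what deepens the per-hit log factor.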
Regards
On Fri, Mar 16, 2012 at 9:34 PM, Jamie Johnson wrote:
> I'm curious if anyone can tell me how Solr/Lucene performs in a situation
> wh
On Fri, Mar 16, 2012 at 8:38 PM, Carlos Gonzalez-Cadenas <
c...@experienceon.com> wrote:
> On Fri, Mar 16, 2012 at 9:26 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> Hello Carlos,
>>
>
>> so, search all terms with MUST first, you've got the best result in terms
>> of precision a
bq: Shouldn't it be able to take any valid JSON structure?
No, that was never the intent. The intent here was just to provide
a JSON-compatible format for indexing data for those who
don't like/want to use XML or SolrJ. Solr doesn't index arbitrary
XML either. And I have a hard time imaginin
Ok, so my issue is that it must be a flat structure. Why isn't the JSON
parser able to deconstruct the object into a flatter structure for indexing?
Shouldn't it be able to take any valid JSON structure?
I don't believe Solr indexes arbitrary JSON, just as it
does not index arbitrary XML. You need the input
to be quite specific to how Solr expects the data,
it's a relatively flat structure. There is an example
in /solr/example/exampledocs/books.json that
will give you an idea of the expected format
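A small pre-processing step can bridge the gap: collapse the nested structure into flat fields before posting. A sketch; the underscore-joined field names are my own convention, not anything Solr mandates, and lists of scalars would need multi-valued handling this sketch skips:

```python
import json

def flatten(obj, prefix=""):
    """Collapse nested dicts/lists into flat {field: scalar} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, prefix + key + "_"))
    elif isinstance(obj, list):
        for value in obj:
            flat.update(flatten(value, prefix))
    else:
        flat[prefix.rstrip("_")] = obj
    return flat

doc = {"department": "ET", "preowned": False,
       "shipping": [{"nextDay": 10.19, "ground": 1.69}]}
print(json.dumps(flatten(doc), sort_keys=True))
```

The nested "shipping" object becomes shipping_nextDay and shipping_ground, which a schema can then declare as ordinary float fields.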
Hello all,
Yesterday was my first time using this (or any) email list and I think I did
something wrong. Anyway, I will try this again.
I have installed Solr search on my Drupal 7 installation. Currently, it works
as an 'All' search tool. I'd like to limit the scope of the search with an
I'm curious if anyone can tell me how Solr/Lucene performs in a situation
where you have 100,000 documents each with 100 tokens vs having
1,000,000 documents each with 10 tokens. Should I expect the
performance to be the same? Any information would be greatly
appreciated.
I am trying to load a json document that has the following structure:
...
    "accessoriesImage": null,
    "department": "ET",
    "shipping": [
        {
            "nextDay": 10.19,
            "secondDay": 6.45,
            "ground": 1.69
        }
    ],
    "preowned": false,
    "format": "CD",
...
When executing the curl reques
It's really up to you. All any app needs to connect to Solr is the HTTP
connection, even if you use something like SolrJ. Yes, there'll
be some latency but I suspect you'll only really notice that if you're
trying to index massive amounts of data across the wire.
Best
Erick
On Fri, Mar 16, 2012 a
On Fri, Mar 16, 2012 at 9:26 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Hello Carlos,
>
Hello Mikhail:
Thanks for your answer.
>
> I have two concerns about your approach. First-K (not top-K honestly)
> collector approach impacts recall of your search and using disjunctive
> q
Hi,
Call me crazy, but I don’t like the idea of having a single server which not
only runs my PHP site on Apache, but also runs SOLR and Nutch, inclusive of
Tomcat.
Is it a terrible idea to have one Rackspace VPS account which runs the PHP
site with MYSQL database, and another rackspace account w
On 16.03.2012 16:42, Mike Austin wrote:
It seems that the biggest real-world advantage is the ability to control
core creation and replacement with no downtime. The negative would be the
isolation; however, they are still somewhat isolated. What other benefits and
common real-world situations wo
I'm trying to understand the difference between multiple Tomcat indexes
using context fragments versus using one application with multiple cores?
Since I'm currently using tomcat context fragments to run 7 different
indexes, could I get help understanding more why I would want to use solr
cores ins
I guess I don't quite understand. If the description field
is single valued, simply specifying that field on the
fl parameter should return it.
It would help if you showed some sample documents,
because I can't tell whether you only have one descriptor
per document or several
By the way, you'
http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/
-> That's an excellent read - thanks a lot for the heads-up!
Kind regards,
Alex
On 16.03.2012 14:08, Erick Erickson wrote:
Your problem is that you're saying with the -myField:* "Remove from
the result set all documents
Solr newbie here, but this looks familiar.
Another thing to make sure of is that the plugin jars are not already
loaded from the standard java classpath.
I had a problem with this in that some jars were being loaded by the
standard java classloader,
and some other plugins were being loaded by
Hello,
I have this configuration where a single master builds the Solr index and it
replicates to two slave Solr instances. Regular queries are sent only to those
two slaves. Configurations are the same for everyone (except of replication
section, of course).
My problem: it's happened that, in
Hello,
I'm having trouble adding a pdf file to my index. It's multicored. My server
object instantiates properly (StreamingUpdateSolrServer). In my request object
(ContentStreamUpdateRequest) I add a couple of literals to populate fields in
the index that the parsed content of the PDF won't
On 16.03.2012 15:05, stockii wrote:
i have 8 cores ;-)
i thought that replication is defined in solrconfig.xml and this file is
only loaded on startup, and i cannot change master to slave and slave to master
without restarting the servlet-container?!?!?!
No, you can reload the whole core at an
i have 8 cores ;-)
i thought that replication is defined in solrconfig.xml and this file is
only loaded on startup, and i cannot change master to slave and slave to master
without restarting the servlet-container?!?!?!
Is there any analyzer out there which handles the mailto: scheme?
UAX29URLEmailTokenizer seems to split at the wrong place:
mailto:t...@example.org ->
mailto:test
example.org
As a workaround I use
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="mailto:" replacement="mailto: "/>
Regards,
Kai Gülzau
novomind AG
Hi Alejandro,
I followed your instructions step by step, but it still isn't working
HTTP Status 404 - /solr/admin
type Status report
message /solr/admin
description The requested resource (/solr/admin) is not available.
I used
Apache Tomcat/6.0.35
Xampp 1.7.7
Sun JDK 7
Flattery will get you a lot ...
Yeah, I expect you're hitting a merge issue. To test, set up autocommit
to only trigger after a lot of docs are committed. You should see the
time before the big pause change radically (perhaps disappear if
you don't commit until the run is done).
Note that it'll s
I'd go ahead and do the query time boosts. The "penalty" will
be a single multiplication per doc (I think), and probably not
noticeable. And it's much more flexible/easier...
Best
Erick
On Thu, Mar 15, 2012 at 9:21 PM, Arcadius Ahouansou
wrote:
> Hello.
>
> I have an SQL database with documents
Well, a lot depends upon the query analysis. Are you using
the *exact* same analysis chains in both? Look at the admin/analysis
page and see how your term evaluates. I'm guessing that
WordDelimiterFilterFactory is being used in the 3.5 case and not
in the 1.4.1 case so the 3.5 case is matching ever
Thanks Erick!!
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, March 16, 2012 6:58 PM
To: solr-user@lucene.apache.org
Subject: Re: Regarding Indexing Multiple Columns Best Practise
I would *guess* you won't notice much/any difference. Note that, if
Since Erick is really active answering now, posting a quick question :)
I am using:
DIH
Solr 3.5 on Windows
Building Auto Recommendation Utility
Having around 1 Billion Query Strings (3-6 words each) in database. Indexing
them using NGram.
Merge Factor = 30
Auto Commit not set.
DIH halted a
I would *guess* you won't notice much/any difference. Note that, if you use
a fieldType with the increment gap > 1 (the default is often set to 100),
phrase queries (slop) will perform differently depending upon which option
you choose.
Best
Erick
On Thu, Mar 15, 2012 at 10:49 AM, Husain, Yavar
At a guess, you don't have any paths to solr dist. Try copying all the other lib
directives from the example (not core) dir (adjusting paths as necessary). The
error message indicates you aren't getting to
/dist/apache-solr-velocity-3.5.0.jar
Best
Erick
On Thu, Mar 15, 2012 at 9:48 AM, ViruS
What you think the results of stemming should be and what they
actually are sometimes differ ...
Look at the admin/analysis page, check the "verbose" boxes
and try recharging rechargeable and you'll see, step by step,
the results of each element of the analysis chain. Since
the Porter stemmer is a
What's the use-case? Presumably you have different configs...
I'm actually not sure if you can do a reload
see: http://wiki.apache.org/solr/CoreAdmin#RELOAD
without a core, but you could try.
Best
Erick
On Thu, Mar 15, 2012 at 4:59 AM, stockii wrote:
> Hello.
>
> Is it possible to switch master
Your problem is that you're saying with the -myField:* "Remove from
the result set all documents with any value in myField", which is not
what you want. Lucene query language is not strictly boolean logic,
here's an excellent writeup:
http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-
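The standard workaround follows directly from that: give Lucene a positive set to subtract from by prepending the match-all query. A trivial client-side helper (the function name is made up):

```python
# Wrap a pure-negative clause so Lucene has a positive set (*:*, the
# match-all query) to subtract the negation from.
def fix_pure_negative(clause):
    return f"(*:* {clause})" if clause.lstrip().startswith("-") else clause

print(fix_pure_negative("-myField:*"))   # -> (*:* -myField:*)
print(fix_pure_negative("myField:foo"))  # -> myField:foo, unchanged
```

The same rewrite applies inside parenthesized subclauses, where Solr does not apply its top-level special-casing.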
I think you're using PHP to request solr.
You can ask solr to respond in several different formats (xml, json,
php, ...), see http://wiki.apache.org/solr/QueryResponseWriter .
Depending on how you connect to solr from php, you may want to use
html_entity_decode before using mb_substr.
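The same fix in Python terms, where html.unescape plays the role of PHP's html_entity_decode:

```python
import html

# Decode HTML entities back to real characters *before* any substring
# work, so multi-byte characters are never cut mid-entity.
raw = "M&auml;dchen"           # what the client received
decoded = html.unescape(raw)   # -> "Mädchen"
print(decoded[:3])             # -> "Mäd", a safe substring on real characters
```

Slicing the raw string instead could land in the middle of "&auml;" and leave a broken entity in the output.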
--
Ta
That's because of the space.
If you want to include the space in the search query (performing exact
match), then use double quotes around your search terms :
q=multiplex_name:"Agent Vinod"
Online documentation :
* http://wiki.apache.org/solr/SolrQuerySyntax
*
http://lucene.apache.org/core/ol
I am running solr 3.5 with a mysql data connector. Solr is configured to
use UTF8 as encoding:
unfortunately solr does encode special characters like "ä" into
htmlentities:
&auml;
which leads to problems when cutting strings with php mb_substr(..)
How can I configure solr to deliver UTF-8 instea
Mikhail & Ludovic,
Thanks for both your replies, very helpful indeed!
Ludovic, I was actually looking into just that and did some tests with
SolrJ, it does work well but needs some changes on the Solr server if we
want to send out individual documents a various times. This could be done
with a w
Hi,
Does an update query to solr work well when sent with a timeout
parameter? https://issues.apache.org/jira/browse/SOLR-502
For example, consider an update query was fired with a timeout of 30
seconds, and the request got aborted half way due to the timeout. Can
this corrupt the index in any wa
Hi,
I put all those jars into SOLR_HOME/lib. I do not specify them in
solrconfig.xml explicitly, and they are all found all right.
Would that be an option for you?
Chantal
On Thu, 2012-03-15 at 17:43 +0100, ViruS wrote:
> Hello,
>
> I just now try to switch from 3.4.0 to 3.5.0 ... i make ne
You could use the MappingUpdateProcessor for this, doing the mapping through a
simple synonyms-like config file at index time, indexing the description in a
String field. https://issues.apache.org/jira/browse/SOLR-2151
Or you could make a SearchComponent plugin doing the same thing "live" at que
Hello Roberto,
An exact match needs extra double-quotes surrounding the exact
thing you want to query in the id field.
Give a try to a query like this :
id:"http://127.0.0.1:/my/personal/testuser/Personal
Documents/cal9.pdf"
See this wiki page :
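When quoting isn't convenient, the alternative is escaping the reserved characters client-side. A sketch; the reserved set below approximates the Lucene query parser's list, treating & and | as single characters:

```python
# Backslash-escape characters the Lucene query parser reserves, useful
# when an id contains ':' or '/' (as URLs do).
LUCENE_SPECIAL = set('+-!(){}[]^"~*?:\\/&|')

def escape_lucene(value):
    return "".join("\\" + c if c in LUCENE_SPECIAL else c for c in value)

print('id:"%s"' % "http://127.0.0.1/my.pdf")             # quoted form
print("id:" + escape_lucene("http://127.0.0.1/my.pdf"))  # escaped form
```

Either form keeps the parser from treating the colons and slashes in the URL as field separators or syntax.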
Hello Carlos,
I have two concerns about your approach. The first-K (not top-K, honestly)
collector approach impacts the recall of your search, and using disjunctive
queries impacts precision, e.g. I want to find some fairly small and quiet,
and therefore unpopular "Lemond Hotel"; you parse my phrase into Lemo
Dear all,
I've got an issue querying for the "id" field in solr. The "id"
field is filled with document urls taken from a sharepoint library
using the manifoldcf repository connector.
In my index there are these document ids:
http: