>>Instead of indexing documents about 'sports' and searching for hits
>>based upon 'basketball', 'football' etc.. I simply want to index the
>>taxonomy and classify documents into it. This is an ancient
>>AI/Data-Mining discipline.. but the standard methods of 'indexing' the
>>taxonomy are/were
Hi,
I have downloaded Solr 1.3.0.
I need to index Chinese content. For this I have defined a new field in the
schema
as
I believe Solr 1.3 already has the CJKAnalyzer by default.
My schema in the testing stage has only 2 fields
However, when I index the Chinese text into
Did you commit after the updates?
2009/1/27 revathy arun
> Hi,
>
> I have downloaded Solr 1.3.0.
>
> I need to index Chinese content. For this I have defined a new field in the
> schema
>
> as
>
>
> positionIncrementGap="100">
>
> I believe Solr 1.3 already
Hi
I have committed. The admin page does not show any docs pending or committed
or any errors.
Regards
Sujatha
On 1/27/09, Shalin Shekhar Mangar wrote:
>
> Did you commit after the updates?
>
> 2009/1/27 revathy arun
>
> > Hi,
> >
> > I have downloaded Solr 1.3.0.
> >
> > I need to index chines
These are the stats of my update handler,
but I still don't see any index created.
stats:
commits : 7
autocommits : 0
optimizes : 2
docsPending : 0
adds : 0
deletesById : 0
deletesByQuery : 0
errors : 0
cumulative_adds : 0
cumulative_deletesById : 0
cumulative_deletesByQuery : 0
cumulative_errors : 0
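
A quick way to read counters like these is to check whether any add was ever counted before a commit. The following is only a sketch: the helper names are hypothetical, and it assumes the stats have been pasted as plain "name : value" lines like the ones above.

```python
# Sketch: parse "name : value" lines copied from the update handler stats
# and flag the case where commits were issued but no add was ever counted.
# parse_stats and adds_never_arrived are hypothetical helper names.

def parse_stats(text):
    """Return a dict of integer counters from 'name : value' lines."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip lines without a separator or without a purely numeric value.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def adds_never_arrived(stats):
    """True when commits happened but cumulative_adds is still zero."""
    return stats.get("commits", 0) > 0 and stats.get("cumulative_adds", 0) == 0


sample = """commits : 7
autocommits : 0
optimizes : 2
docsPending : 0
adds : 0
cumulative_adds : 0
cumulative_errors : 0"""

stats = parse_stats(sample)
print(adds_never_arrived(stats))
```

With the numbers posted above, the check is true: seven commits were counted, but no add ever was. That would point at the update requests rather than at the commit.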
Are you looking for it in the right place? It is very unlikely that a commit
happens and index is not created.
The index is usually created inside the data directory as configured in your
solrconfig.xml
Can you search for *:* from the solr admin page and see if documents are
returned?
On Tue, Jan
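
The *:* check suggested above can also be done outside the admin page. This is a minimal sketch, not a definitive client: the host and port are assumptions, the /lang_prototype path is taken from the Tomcat log quoted elsewhere in this thread, and the sample response is hand-written to mirror the standard XML response format.

```python
# Sketch: build a match-all query URL and read numFound from an XML response.
# Host, port and the sample response below are assumptions for illustration.
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def select_url(base_url, query="*:*", rows=10):
    """Return the /select URL for the given query string."""
    params = {"q": query, "rows": rows}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def num_found(xml_text):
    """Extract the numFound attribute from the <result> element."""
    root = ET.fromstring(xml_text)
    return int(root.find("result").get("numFound"))


url = select_url("http://localhost:8080/lang_prototype")
print(url)

# Hand-written response in the shape Solr returns for an empty index.
sample_response = (
    '<response><result name="response" numFound="0" start="0"/></response>'
)
print(num_found(sample_response))
```

A numFound of 0 for *:* means the searcher sees no documents at all, which matches the numDocs : 0 reported on the admin page.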
Hi Shalin,
The admin page stats are as follows
searcherName : searc...@1d4c3d5 main
caching : true
numDocs : 0
maxDoc : 0
name: /update
class: org.apache.solr.handler.XmlUpdateRequestHandler
version: $Revision: 690026 $
description: Add documents with XML
stats: handlerStart :
Solr 1.3
I'm trying to get highlighting working, with no luck so far.
Query with params q=cyrus&fl=*,score&qt=standard&hl=true&hl.fl=title+description
finds 182 documents in my index. All of the top 10 hits
contain the word "cyrus", but the highlights list is empty. The fields
"title" and
I turned these fields to indexed + stored but the results are exactly
the same, no matter if I search in these fields or elsewhere.
Message written on 2009-01-27 at 13:09 by Jarek Zgoda:
Solr 1.3
I'm trying to get highlighting working, with no luck so far.
Query with params
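
For reference, the highlighting request described in this thread can be assembled as below. This is only a sketch using the parameter values quoted in the message. It shows that the "+" in hl.fl is the URL-encoded space between two field names, not part of a field name.

```python
# Sketch: build the highlighting query string from the parameters quoted
# in the message and confirm how hl.fl is encoded and decoded.
from urllib.parse import parse_qs, urlencode

params = {
    "q": "cyrus",
    "fl": "*,score",
    "qt": "standard",
    "hl": "true",
    "hl.fl": "title description",  # space-separated list of fields
}

query = urlencode(params)
print(query)

# Decoding the query string gives back the two field names.
decoded = parse_qs(query)
print(decoded["hl.fl"][0].split())
```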
errors: 11
What were those?
My hunch is your indexer had issues. What did Solr output into the
console or log during indexing?
Erik
On Jan 27, 2009, at 6:56 AM, revathy arun wrote:
Hi Shalin,
The admin page stats are as follows
searcherName : searc...@1d4c3d5 main
caching : true
Finally found that the fields have to have an analyzer to be
highlighted. Neat.
Can I ask somebody to document all these requirements?
Message written on 2009-01-27 at 13:49 by Jarek Zgoda:
I turned these fields to indexed + stored but the results are
exactly the same, no m
I am also getting the same issue. Did anyone find a solution for this?
Please respond
sbutalia wrote:
>
> I'm having the same issue.. have you had any progress with this?
>
--
View this message in context:
http://www.nabble.com/Error-in-Integrating-JBoss-4.2-and-Solr-1.3.0%3A-tp2020203
Making requests in parallel, using the default connection manager,
which is multi-threaded, and we are reusing a single CommonsHttpSolrServer
for all requests.
wunder
On 1/26/09 10:59 PM, "Noble Paul നോബിള് नोब्ळ्"
wrote:
> are you making requests in parallel ?
> which ConnectionManager are y
When you query by *:*, what order does it use? Is there a chance they will
come in a different order as you page through the results (and miss/duplicate
some)? Is it best to put the order explicitly by 'id' or is that implied
already?
On Mon, Jan 26, 2009 at 12:00 PM, Ian Connor wrote:
> *:* took
*:* will default to sorting by document insertion order (Lucene's
document id, _not_ your Solr uniqueKey). And no, you won't miss any
by paging - order will be maintained.
Erik
On Jan 27, 2009, at 9:52 AM, Ian Connor wrote:
When you query by *:*, what order does it use. Is there a
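
The paging discussed here comes down to stepping the start parameter by rows, optionally with an explicit sort. A minimal sketch follows. The helper name is hypothetical, and the 'id' sort field comes from the question, so it is assumed to be an indexed, sortable field in the schema.

```python
# Sketch: query strings for paging through *:* with start/rows.
# page_params is a hypothetical helper; "id asc" assumes a sortable id field.
from urllib.parse import urlencode


def page_params(page, rows=100, sort=None):
    """Return the query string for a zero-based page of a match-all query."""
    params = {"q": "*:*", "start": page * rows, "rows": rows}
    if sort:
        params["sort"] = sort  # omit to keep the default ordering
    return urlencode(params)


first_page = page_params(0)
third_page = page_params(2, sort="id asc")
print(first_page)
print(third_page)
```

Leaving out sort keeps the default order described above. Passing "id asc" just makes the ordering explicit.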
I found how the issue arises: when Solr warms up the new searcher with
cacheLists, the issue occurs if the queryResultCache is enabled.
Note: as I mentioned before, I commit with waitFlush=false and
waitSearcher=false,
so there is a problem when the queryResultCache is on,
but I don't know
On Tue, Jan 27, 2009 at 8:51 PM, Parisa wrote:
>
> I found how the issue arises: when Solr warms up the new searcher with
> cacheLists, the issue occurs if the queryResultCache is enabled.
>
> Note: as I mentioned before, I commit with waitFlush=false and
> waitSearcher=false,
>
> so it has
So it was me defining it in schema.xml rather than solrconfig.xml.
17:17 < erikhatcher> where are you defining the qparser plugin?
17:18 < erikhatcher> it's very odd... if it isn't picking them up but
you reference them, it would certainly give an error
17:18 < karlwettin> as a first level chil
On Tue, Jan 27, 2009 at 1:36 AM, Hannes Carl Meyer wrote:
> Yeah, know it, the challenge on this method is the calculation of the score
> and parametrization of thresholds.
Not as worried about score itself as the score thresholds for prediction in/out.
> Is it really necessary to use Solr for
Hello,
I am trying to get Solr to work properly. I have set up a Solr test
server (using jetty as mentioned in the tutorial). Also I had to modify
the schema.xml so that I have different fields for different languages
(with their own stemmers) that occur in the content management system
that I am
That's interesting. SolrJ doesn't touch HTTPClient params if one is
provided in the constructor.
I guess I'd try to sniff the headers first and see if any difference
sticks out between the clients.
I normally just use netcat and pretend to be the solr server.
-Yonik
On Tue, Jan 27, 2009 at 1
if you use this constructor:
public CommonsHttpSolrServer(URL baseURL, HttpClient client)
then solrj never touches the HttpClient configuration.
I normally reuse a single CommonsHttpSolrServer as well.
On Jan 27, 2009, at 9:52 AM, Walter Underwood wrote:
Making requests in parallel, using
Hi,
I would like to know how it can be implemented.
Index1 has fields id,1,2,3 and index2 has fields id,5,6,7.
The ID in both indexes is the unique id.
Can I use "a kind of " distributed search and/or multicore to search, sort,
and facet through 2 indexes (index1 and index2)?
Thanks,
Jae joo
Oh I see, thanks for the clarification.
Unfortunately this brings me back to same problem I started with: implicit
properties aren't available when managing indexes through the REST api. I
know there is a patch in the works for this issue but I can't wait for it.
Is there any way to share the solr
On 27 Jan 2009, at 17:23, Neal Richter wrote:
Is it really necessary to use Solr for it? Things go much faster with
the Lucene low-level API, and much faster still if you're loading the
classification corpus into RAM.
Good points. At the moment I'd rather have a daemon with a service
API.. as
Hi,
Starting about one week ago, our index size gets tripled during
optimization.
The current index statistics are:
numDocs : 192702132
size: 76G
We run an optimize after every 6M document updates.
Since we keep getting new data, the index size increases every day. Before,
the index size was on
Hi, I plan to use Solr to index a large number of documents extracted
from email bodies. Such documents could be in different languages,
and a single document could be in more than one language. In the same
way, the query string could contain words in different languages.
I read that a common approac
First, I'd search the mail archive for the topic of languages, it's
been discussed often and there's a wealth of information that
might be of benefit, far more information than I can remember.
As to whether your approach will be "too big, too slow...", you
really haven't given enough information t
Hello folks!
We've been thinking about ways to improve organic search results for a
while (really, who hasn't?) and I'd like to get some ideas on ways to
implement a feedback system that uses user behavior as input.
Basically, it'd work on the premise that what the user actually
clicked o
I guess I've been called to the chalkboard...
I haven't looked specifically at putting the taxonomy in Lucene/Solr,
but it is an interesting idea. In reading the paper you mentioned,
there are some interesting ideas there and Solr could obviously just
as easily be used as Lucene, I think.
I've been thinking about the same thing. We have a set of queries
that defy straightforward linguistics and ranking, like figuring
out how to match "charlie brown" to "It's the Great Pumpkin,
Charlie Brown" in October and to "A Charlie Brown Christmas"
in December.
I don't have any solutions yet,
I'm considering building some tools for our internal non-technical staff
to write to synonyms.txt, elevate.xml, spellings.txt, and protwords.txt
so software developers don't have to maintain them. Before my team
starts building these tools, has anyone done this before? If so, are
these tools avai
They are documented in http://wiki.apache.org/solr/FieldOptionsByUseCase
and in the FAQ, but I agree that it could be
more readily accessible.
-Mike
On 27-Jan-09, at 5:26 AM, Jarek Zgoda wrote:
Finally found that the fields have to have an analyzer to be
highlighted. Neat.
Can I ask so
Could it be the framework you are using around it? I know some IOC
containers will auto pool objects underneath as a service without you really
knowing it is being done or has to be explicitly turned off. Just a
thought. I use a single server for all requests behind a Hivemind setup ...
umm not
I shall give a patch today
On Tue, Jan 27, 2009 at 11:58 PM, Mark Ferguson
wrote:
> Oh I see, thanks for the clarification.
>
> Unfortunately this brings me back to same problem I started with: implicit
> properties aren't available when managing indexes through the REST api. I
> know there is a
If you are making requests in parallel, then it is likely that you
see many connections open at a time. They will get cleaned up over
time. But if you wish to clean them up explicitly, use
httpclient.getHttpConnectionManager().closeIdleConnections()
On Tue, Jan 27, 2009 at 8:22 PM, Walter Underw
Hello, dear members.
I'm a little bit confused about dismax syntax. As far as I know (and I might
be wrong) it supports the default query language, such as +WORD -WORD.
What about parentheses?
The title of my doc consists of WORD1 WORD2 WORD3. When I try to search
+WORD1 +(WORD2 WORD4) + WORD3 it doe
I'm a bit of a newbie with Java builds, so could you please tell me what tools
are used to apply the patch SOLR-236 (field grouping)? Does it need to be
applied to the current Solr 1.3 (and nightly builds of 1.4), or is it already
included?
Which batch file in the distribution is used to compile Solr?
--
V
Hi,
This is the only info in the tomcat log at indexing
Jan 27, 2009 3:46:15 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/lang_prototype path=/update params={} status=0 QTime=191
I don't see any other errors in the logs.
When I use curl to update, I get a success message.
and commit
Hi All,
Is it possible to store only limited text in the field, say, max 1
MB? The maxFieldLength setting limits only the number of tokens to be
indexed, but stores the complete content.
Thanks,
Siddharth
Since you are asking about a 'batch file', are you using Windows?
I recommend using TortoiseSVN to apply the patch.
On Wed, Jan 28, 2009 at 10:05 AM, surfer10 wrote:
>
> I'm a bit of a newbie with Java builds, so could you please tell me what tools
> are used to apply the patch SOLR-236 (field grouping), d
There is a patch given for SOLR-883.
On Wed, Jan 28, 2009 at 9:43 AM, Noble Paul നോബിള് नोब्ळ्
wrote:
> I shall give a patch today
>
> On Tue, Jan 27, 2009 at 11:58 PM, Mark Ferguson
> wrote:
>> Oh I see, thanks for the clarification.
>>
>> Unfortunately this brings me back to same problem I
I found Hoss's explanations at
http://www.nabble.com/Dismax-and-Grouping-query-td12938168.html#a12938168
It seems I can't do this, so my question becomes the following:
can I join multiple dismax queries into one? For instance, if I'm looking for
+WORD1 +(WORD2 WORD3)
it can be translate
This is just what I needed, thank you so much for the quick response! It's
really appreciated!
Mark
On Tue, Jan 27, 2009 at 9:59 PM, Noble Paul നോബിള് नोब्ळ् <
noble.p...@gmail.com> wrote:
> There is a patch given for SOLR-883.
>
> On Wed, Jan 28, 2009 at 9:43 AM, Noble Paul നോബിള് नोब्ळ्
>
If you're using a Solr build post-r721758, then copyField has a
maxChars property you can take advantage of. I'm probably
misremembering some of the exact names of these elements/attributes,
but you can basically have this in your schema.xml:
Then anything you store in field f will get copied
On Tue, Jan 27, 2009 at 2:21 PM, Grant Ingersoll wrote:
> One of the things I am interested in is the marriage of Solr and Mahout
> (which has some Genetic Algorithms support) and other ML (Weka, etc.) tools.
[snip]
I love it, good to know you are thinking big here. Here's another big thought:
Hi,
I am getting this error in the Tomcat log file on passing Chinese text to
the content field.
The content field uses the CJK tokenizer
and is defined as
INFO: [] webapp=/lang_prototype path=/update params={} status=0 QTime=69
Jan 28, 2009 12:17:03 PM org.apache.solr.common.
OK I've implemented this before, written academic papers and patents
related to this task.
Here are some hints:
- you're on the right track with the editorial boosting elevators
- http://wiki.apache.org/solr/UserTagDesign
- be darn careful about assuming that one click is enough evidence
Hi,
Thanks Otis, Newton and everyone else for the help on this issue.
Most of the data I index are documents like PDFs, Word docs, OpenOffice
documents, etc. I store the content of the document in a field called
content and the remaining metadata of the document like name, id,
created by, modifi