2009/11/13 Noble Paul നോബിള് नोब्ळ् :
> I am unable to get the file
> http://old.nabble.com/file/p26335171/dataimport.temp.xml
>
> On Fri, Nov 13, 2009 at 4:57 PM, Andrew Clegg wrote:
OK. Is there anyone trying it out? Where is this code? I can try to help...
On Fri, Nov 13, 2009 at 8:10 PM, Mauricio Scheffer
wrote:
> I meant the standard IO libraries. They are different enough that the code
> has to be manually ported. There were some automated tools back when
> Microsoft in
I would go with polling Solr to find what is not yet there. In
production, it is better to assume that things will break, and have
backstop janitors that fix them. And then test those janitors
regularly.
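For what it's worth, an untested SolrJ sketch of such a janitor; expectedIds() and resubmit() are hypothetical hooks into your own feeding pipeline:
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
// Hedged sketch: poll Solr for each id the feeder believes it sent,
// and re-feed whatever never made it in.
void runJanitor() throws Exception {
    SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
    for (String id : expectedIds()) { // hypothetical: ids the feeder sent
        SolrQuery q = new SolrQuery("id:" + id);
        q.setRows(0); // only the count matters
        if (solr.query(q).getResults().getNumFound() == 0) {
            resubmit(id); // hypothetical: re-feed the document
        }
    }
}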
On Fri, Nov 13, 2009 at 8:02 PM, Otis Gospodnetic
wrote:
> So I think the question is really:
I am unable to get the file
http://old.nabble.com/file/p26335171/dataimport.temp.xml
On Fri, Nov 13, 2009 at 4:57 PM, Andrew Clegg wrote:
>
>
>
> Noble Paul നോബിള് नोब्ळ्-2 wrote:
>>
>> no obvious issues.
>> you may post your entire data-config.xml
>>
>
> Here it is, exactly as last attempt but w
Yeah, I ended up creating a "boosted" field for at least debugging, but I might
patch / extend / create my own FieldNormModifier using just that criterion,
plus doing the reset.
- Jon
On Nov 13, 2009, at 12:21 PM, Avlesh Singh wrote:
> AFAIK there is no way to "reset" the doc boost. You would need to re
Apparently one of my conf files was broken - odd that I didn't see any
exceptions. Anyhow - excuse my haste, I don't see the problem now.
-Peter
On Fri, Nov 13, 2009 at 11:06 PM, Peter Wolanin
wrote:
> I'm testing out the final release of Solr 1.4 as compared to the build
> I have been using fr
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/apis/mmap.htm
Normally file I/O in a program means that the data is copied between
the system I/O disk cache and the program's memory. Memory-mapping
means that the program address space points to the disk I/O cache
directly, t
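To make that concrete, an untested Lucene-side sketch, assuming the 2.9-era
MMapDirectory API -- note that nothing here pre-loads the whole index into RAM;
the OS pages data in on demand:
import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;
// Hedged sketch: open an index through mmap.
Directory dir = new MMapDirectory(new File("/path/to/index"));
IndexReader reader = IndexReader.open(dir, true); // read-only reader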
The 'maxSegments' feature is new with 1.4. I'm not sure that it will
cause any less disk I/O during optimize.
The 'mergeFactor=2' idea is not what you think: in this case the index
is always "mostly optimized", so you never need to run optimize.
Indexing is always slower, because you amortize the
Let's take a step back. Why do you need to optimize? You said: "As long as
I'm not optimizing, search and indexing times are satisfactory." :)
You don't need to optimize just because you are continuously adding and
deleting documents. On the contrary!
Otis
--
Sematext is hiring -- http://se
I'm testing out the final release of Solr 1.4 as compared to the build
I have been using from around June.
I'm using the dismax handler for searches. I'm finding that
highlighting is completely broken as compared to previously. Much
more text is returned than it should for each string in , but t
So I think the question is really:
"If I stop the servlet container, does Solr issue a commit in the shutdown hook
in order to ensure all buffered docs are persisted to disk before the JVM
exits".
I don't have the Solr source handy, but if I did, I'd look for "Shutdown",
"Hook" and "finalize" i
I thought that was the way to use it (but I've never had to use it myself) and
that it means memory through the roof, yes.
If you look at the Solr Admin statistics page, does it show you which Directory
you are using?
For example, on 1 Solr instance I'm looking at I see:
readerDir : org.apache
Unless I slept through it, you still need to explicitly commit, even with SUSS.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original Message
> From: "erikea...@yahoo.com"
> To: "solr-user@lucene.
This is one case where permanent caches are interesting. Another case
is highlighting: in some cases highlighting takes a lot of work, and
this work is not cached.
It might be a cleaner architecture to have session-maintaining code in
a separate front-end app, and leave Solr session-free.
On Fri,
This looks exactly like what I was needing ... it would be a great
tool / addition to the Solr web interface, but it looks like it only takes
(Directory d, Similarity s) (vs. a subset collection of documents) ...
Either way, great find, thanks for your help ...
- Jon
On Nov 13, 2009, a
There is no direct way.
Let's say you have a "nocopy_s" and you do not want a copy
"nocopy_str_s". This might work: declare "nocopy_str_s" as a field and
make it not indexed and not stored. I don't know if this will work.
It requires two overrides to work: 1) that declaring a field name that
matc
When does StreamingUpdateSolrServer commit?
I know there's a threshold and thread pool as params but I don't see a commit
timeout. Do I have to manage this myself?
: I tied to reproduce this in 1.4 using an index/configs created with 1.3,
: but i got a *different* NPE when loading this url...
I should have tried a simpler test ... I get NPEs just trying to execute
a simple search for *:* when i try to use the example index built
in 1.3 (with the 1.3 co
: > FWIW: I was able to reproduce this using the example setup (i picked a
: > doc id at random) suspecting it was a bug in docFreq
:
: Probably just a null being passed in the text part of the term.
: I bet Luke expects all field values to be strings, but some are binary.
I'm not sure i follow
ah, thanks, i'll tentatively set one in the future, but definitely not 2.9.x
More just to show you the idea: you can do different things depending on
different runs of writing systems in text.
But it doesn't solve everything: you only know it's Latin script, not English,
so you can't safely automati
I'm not sure this is what you are looking for,
but there is a FieldNormModifier tool in Lucene.
Koji
--
http://www.rondhuit.com/en/
Avlesh Singh wrote:
AFAIK there is no way to "reset" the doc boost. You would need to re-index.
Moreover, there is no way to "search by boost".
Cheers
Avlesh
On
Thanks for the link - there doesn't seem to be a fix version specified,
so I guess this will not officially ship with Lucene 2.9?
-Peter
On Wed, Nov 11, 2009 at 10:36 PM, Robert Muir wrote:
> Peter, here is a project that does this:
> http://issues.apache.org/jira/browse/LUCENE-1488
>
>
>> That's
Folks,
I am trying to get Lucene MMAP to work in Solr.
I am assuming that when I configure MMAP the entire index will be loaded
into RAM.
Is that the right assumption?
I have tried the following ways for using MMAP:
Option 1. Using the solr config below for MMAP configuration
-Dorg.apache.luc
On Fri, Nov 13, 2009 at 5:41 PM, Chris Hostetter
wrote:
> : I'm seeing this stack trace when I try to view a specific document, e.g.
> : /admin/luke?id=1 but luke appears to be working correctly when I just
>
> FWIW: I was able to reproduce this using the example setup (i picked a
> doc id at rand
Distributed search requires a bunch of shard names in the URL. That's all.
Note that a distributed search does not use the data of the Solr instance you call.
You can create an entry point for your distributed search by adding a
new element in solrconfig.xml. You would add the
shard list parameter to the "defaults" list. Do not have it
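For illustration, the shard list can also be passed per request rather than
via "defaults"; an untested SolrJ sketch (host names are made up):
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
// Hedged sketch: per-request distributed search.
SolrQuery q = new SolrQuery("*:*");
q.set("shards", "host1:8983/solr,host2:8983/solr");
new CommonsHttpSolrServer("http://host1:8983/solr").query(q);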
: I'm seeing this stack trace when I try to view a specific document, e.g.
: /admin/luke?id=1 but luke appears to be working correctly when I just
FWIW: I was able to reproduce this using the example setup (i picked a
doc id at random) suspecting it was a bug in docFreq when using multiple
s
ysee...@gmail.com wrote on 11/13/2009 09:06:29 AM:
> On Fri, Nov 13, 2009 at 6:27 AM, Michael McCandless
> wrote:
> > I think we sorely need a Directory impl that down-prioritizes IO
> > performed by merging.
>
> It's unclear if this case is caused by IO contention, or the OS cache
> of the hot p
: which documents have been updated before a successful commit. Now
: stopping solr is as easy as kill -9.
please don't kill -9 ... it's grossly overkill, and doesn't give your
servlet container a fair chance to clean things up. A lot of work has been
done to make Lucene indexes robust to hard
Mark Miller wrote on 11/12/2009 07:18:03 PM:
> Ah, the pains of optimization. It's kind of just how it is. One solution
> is to use two boxes and replication - optimize on the master, and then
> queries only hit the slave. Out of reach for some though, and adds many
> complications.
Yes, in my us
: On the CoreAdmin wiki page. thanks
FWIW: The only time the string "schemaName" appears on the CoreAdmin wiki
page is when it mentions that "solr.core.schemaName" is a property that is
available to cores by default.
the documentation for the tag specifically says...
>> The tag accepts the followin
tpunder wrote:
>
> Maybe I misunderstand what you are trying to do (or the facet.query
> feature). If I did an initial query on my data-set that left me with the
> following questions:
> ...
> http://localhost:8983/solr/select/?q=*%3A*&start=0&rows=0&facet=on&facet.query=brand_id:1&facet.query=
If documents are being added to and removed from an index (and commits
are being issued) while a user is searching, then the experience of
paging through search results using the obvious solr mechanism
(&start=100&rows=10) may be disorienting for the user. For one
example, by the time the user clic
On Thu, Nov 12, 2009 at 3:00 PM, Stephen Duncan Jr wrote:
> On Thu, Nov 12, 2009 at 2:54 PM, Chris Hostetter wrote:
>
>>
>> oh man, so you were parsing the Stored field values of every matching doc
>> at query time? ouch.
>>
>> Assuming i'm understanding your goal, the conventional way to solv
Peter - if you want, download the code from Lucene in Action 1 or 2, it has
index traversal and indexing. 2nd edition uses Tika.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original Message
> Fr
Great, thanks. That was helpful.
Avlesh Singh wrote:
>
>>
>> you can do it using
>> solrQuery.setFilterQueries() and build AND queries of multiple
>> parameters.
>>
> Nope. You would need to read more -
> http://wiki.apache.org/solr/FilterQueryGuidance
>
> For your impatience, here's a quick sta
The process initially completes with:
2009-11-13 09:40:46
Indexing completed. Added/Updated: 20 documents. Deleted 0 documents.
...but then it fails with:
2009-11-13 09:40:46
Indexing failed. Rolled back all changes.
2009-11-13 09:41:10
2009-11-13 09:41:10
2009-11-13 09
Hi Ian and Ryan,
Thanks for the reply.
Ian, I checked your pasted config; I am using the same one except the
values of 4 and 25.
Basically I use the setup specified at http://www.gissearch.com/localsolr.
But I still get the same error I pasted in my previous email.
Ryan, I just checked out
AFAIK there is no way to "reset" the doc boost. You would need to re-index.
Moreover, there is no way to "search by boost".
Cheers
Avlesh
On Fri, Nov 13, 2009 at 8:17 PM, Jon Baer wrote:
> Hi,
>
> Im trying to figure out if there is an easy way to basically "reset" all of
> any doc boosts which
>
> you can do it using
> solrQuery.setFilterQueries() and build AND queries of multiple parameters.
>
Nope. You would need to read more -
http://wiki.apache.org/solr/FilterQueryGuidance
For your impatience, here's a quick starter -
#and between two fields
solrQuery.setQuery("+field1:foo +field2:
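A hedged completion of that truncated starter (field names are placeholders):
import org.apache.solr.client.solrj.SolrQuery;
// AND the clauses directly in the main query...
SolrQuery solrQuery = new SolrQuery();
solrQuery.setQuery("+field1:foo +field2:bar"); // both must match
// ...or AND via filter queries, which Solr caches independently:
solrQuery.setQuery("*:*");
solrQuery.addFilterQuery("field1:foo", "field2:bar");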
I think I found the answer. I needed to read more API documentation :-)
you can do it using
solrQuery.setFilterQueries() and build AND queries of multiple parameters.
Avlesh Singh wrote:
>
> For a starting point, this might be a good read -
> http://www.lucidimagination.com/search/document/f4d9
Luke Handler? - http://wiki.apache.org/solr/LukeRequestHandler
/admin/luke?numTerms=0
Cheers
Avlesh
On Fri, Nov 13, 2009 at 10:05 PM, Eugene Dzhurinsky wrote:
> Hi there!
>
> How can we retrieve the complete list of dynamic fields, which are
> currently
> available in index?
>
> Thank you in adv
Anyone?
Original Message
> Date: Thu, 12 Nov 2009 13:29:20 +0100
> From: gistol...@gmx.de
> To: solr-user@lucene.apache.org
> Subject: Return doc if one or more query keywords occur multiple times
> Hello,
>
> I am using Dismax request handler for queries:
>
> ...select?q=fo
For a starting point, this might be a good read -
http://www.lucidimagination.com/search/document/f4d91628ced293bf/lucene_query_to_solr_query
Cheers
Avlesh
On Fri, Nov 13, 2009 at 10:02 PM, javaxmlsoapdev wrote:
>
> I already did dive in before. I am using solrj API and SolrQuery object to
> b
Hi there!
How can we retrieve the complete list of dynamic fields, which are currently
available in index?
Thank you in advance!
--
Eugene N Dzhurinsky
Heya, could it be a problem with your Solr config files? I seem to recall
needing a change from the docs as they were to get this working. I have:
[config snippet; the XML tags were stripped by the archive -- the surviving values are: lat, lng, 4, 25, and the component list: localsolr, facet, mlt, highlight, debug]
I already did dive in before. I am using the SolrJ API and the SolrQuery
object to build the query, but it's not clear/documented how to build a
boolean query ANDing a bunch of different attributes in the index. Any
samples please?
Avlesh Singh wrote:
>
> Dive in - http://wiki.apache.org/solr/Solrj
>
> Cheers
> Av
Also:
https://issues.apache.org/jira/browse/SOLR-1302
On Nov 13, 2009, at 11:12 AM, Bertie Shen wrote:
Hey,
I am interested in using LocalSolr to do Local/Geo/Spatial/Distance
search. But the wiki of LocalSolr (http://wiki.apache.org/solr/LocalSolr)
points to pretty old documentation. Is t
It looks like solr+spatial will get some attention in 1.5, check:
https://issues.apache.org/jira/browse/SOLR-1561
Depending on your needs, that may be enough. More robust/scalable
solutions will hopefully work their way into 1.5 (any help is always
appreciated!)
On Nov 13, 2009, at 11:12
Dive in - http://wiki.apache.org/solr/Solrj
Cheers
Avlesh
On Fri, Nov 13, 2009 at 9:39 PM, javaxmlsoapdev wrote:
>
> I want to build an AND search query against field1 AND field2, etc. Both these
> fields are stored in an index. I am migrating Lucene code to Solr.
> Following
> is my existing lucen
Hey,
I am interested in using LocalSolr to do Local/Geo/Spatial/Distance
search. But the wiki of LocalSolr (http://wiki.apache.org/solr/LocalSolr)
points to pretty old documentation. Is there a better document I can refer
to for setting up LocalSolr and some performance analysis?
Just syn
I want to build an AND search query against field1 AND field2, etc. Both these
fields are stored in an index. I am migrating Lucene code to Solr. Following
is my existing Lucene code:
BooleanQuery currentSearchingQuery = new BooleanQuery();
currentSearchingQuery.add(titleDescQuery, Occur.MUST);
highli
On Fri, Nov 13, 2009 at 4:32 AM, gwk wrote:
> I don't know if this is the best solution, or even if it's applicable to
> your situation but we do incremental updates from a database based on a
> timestamp, (from a simple separate SQL table filled by triggers so deletes
Thanks, gwk! This doesn't
Have one thread recursing depth-first down the directories & adding to
a queue (fixed size).
Have many threads reading off of the queue and doing the work.
-glen
http://zzzoot.blogspot.com/
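An untested, self-contained sketch of that layout (class and method names are
made up; process() is where the Tika work would go):
import java.io.File;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FolderScanner {
    private static final File POISON = new File(""); // end-of-work marker

    public static void main(String[] args) throws Exception {
        final BlockingQueue<File> queue = new ArrayBlockingQueue<File>(100); // fixed size
        final int workers = 4;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        for (File f = queue.take(); f != POISON; f = queue.take()) {
                            process(f); // e.g. parse with Tika, post to Solr
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
        walk(new File(args[0]), queue); // producer: depth-first recursion
        for (int i = 0; i < workers; i++) queue.put(POISON); // one per worker
        pool.shutdown();
    }

    static void walk(File dir, BlockingQueue<File> queue) throws InterruptedException {
        File[] entries = dir.listFiles();
        if (entries == null) return;
        for (File e : entries) {
            if (e.isDirectory()) walk(e, queue);
            else queue.put(e); // blocks while the queue is full
        }
    }

    static void process(File f) { /* hypothetical: Tika parse + Solr add */ }
}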
2009/11/13 Peter Gabriel :
> Hello.
>
> I am on work with Tika 0.5 and want to scan a folder system about 1
Hello.
I am working with Tika 0.5 and want to scan a folder tree of about 10GB.
Is there a comfortable way to scan folders recursively with an existing class,
or do I have to write it myself?
Any tips for best practice?
Greetings, Peter
I'm getting the same thing. The process runs, seemingly successfully, and I
can even go to other Solr pages pointing to the same server and pull queries
against the index with these just-added entries. But the response to the
original import says "failed" and "rollback", both through the XML resp
Chantal Ackermann wrote:
>
> your URL does not include the parameter mlt.boost. Setting that to
> "true" made a noticeable difference for my queries.
>
Hmm, I'm really not sure if this is doing the right thing either. When I add
it I get:
1.0, 0.60737264, 0.27599618, 0.2476748, 0.24487767 (bare values; the surrounding XML tags were stripped)
Hi,
I'm trying to figure out if there is an easy way to basically "reset" all of the
doc boosts which you have made (for analytical purposes) ... for example if I
run an index, gather a report, boost docs per the report, and reset the boosts
at the time of the next index ...
It would seem to be from just k
I meant the standard IO libraries. They are different enough that the code
has to be manually ported. There were some automated tools back when
Microsoft introduced .Net, but IIRC they never really worked.
Anyway it's not a big deal, it should be a straightforward job. Testing it
thoroughly cross-
Hi Andrew,
your URL does not include the parameter mlt.boost. Setting that to
"true" made a noticeable difference for my queries.
If not, there is also the parameter
mlt.minwl
"minimum word length below which words will be ignored."
All your other terms seem longer than 3, so it would help i
the included snowball filters support Hungarian, Romanian, and Russian.
On Fri, Nov 13, 2009 at 9:03 AM, Chuck Mysak wrote:
> Hello all,
>
> is there support for non-English language content indexing in Solr?
>
> I'm interested in Bulgarian, Hungarian, Romanian and Russian.
>
> Best regards,
>
> C
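In Solr this is normally wired up in schema.xml via solr.SnowballPorterFilterFactory
with a language attribute; for a bare-Lucene illustration (contrib/snowball,
untested, placeholder text):
import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;
// Hedged sketch: Russian stemming via the bundled Snowball support.
SnowballAnalyzer analyzer = new SnowballAnalyzer("Russian");
TokenStream ts = analyzer.tokenStream("body", new StringReader("..."));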
On Fri, Nov 13, 2009 at 6:27 AM, Michael McCandless
wrote:
> I think we sorely need a Directory impl that down-prioritizes IO
> performed by merging.
It's unclear if this case is caused by IO contention, or the OS cache
of the hot parts of the index being lost by that extra IO activity.
Of course
Hello all,
is there support for non-English language content indexing in Solr?
I'm interested in Bulgarian, Hungarian, Romanian and Russian.
Best regards,
Chuck
Chantal Ackermann wrote:
>
> no idea, I'm afraid - but could you send the output of
> interestingTerms=details?
> This at least would show what MoreLikeThis uses, in comparison to the
> TermVectorComponent you've already pasted.
>
I can, but I'm afraid they're not very illuminating!
http://
The javabin format does not have many dependencies. It may have 3-4
classes and that is it.
On Fri, Nov 13, 2009 at 6:05 PM, Mauricio Scheffer
wrote:
> Nope. It has to be manually ported. Not so much because of the language
> itself but because of differences in the libraries.
>
>
> 2009/11/13 Nob
On Fri, Nov 13, 2009 at 6:27 AM, Michael McCandless
wrote:
> I think we sorely need a Directory impl that down-prioritizes IO
> performed by merging.
Presumably this "prioritizing Directory impl" could wrap/decorate any
existing Directory.
Mike
Another thing to try, is reducing the maxThreadCount for
ConcurrentMergeScheduler.
It defaults to 3, which I think is too high -- we should change this
default to 1 (I'll open a Lucene issue).
Mike
On Thu, Nov 12, 2009 at 6:30 PM, Jerome L Quinn wrote:
>
> Hi, everyone, this is a problem I've h
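For raw Lucene users the change is one call (Solr users would configure the
merge scheduler in solrconfig.xml instead); a hedged sketch against the 2.9
API, where 'writer' is an already-open IndexWriter:
import org.apache.lucene.index.ConcurrentMergeScheduler;
// Cap background merge threads at 1 instead of the default 3.
ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
cms.setMaxThreadCount(1);
writer.setMergeScheduler(cms);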
Nope. It has to be manually ported. Not so much because of the language
itself but because of differences in the libraries.
2009/11/13 Noble Paul നോബിള് नोब्ळ्
> Is there any tool to directly port Java to .NET? Then we can extract
> out the client part of the javabin code and convert it.
>
> O
Hi Andrew,
no idea, I'm afraid - but could you send the output of
interestingTerms=details?
This at least would show what MoreLikeThis uses, in comparison to the
TermVectorComponent you've already pasted.
Chantal
Andrew Clegg wrote:
Any ideas on this? Is it worth sending a bug report?
Th
I think we sorely need a Directory impl that down-prioritizes IO
performed by merging.
It would be wonderful if from Java we could simply set a per-thread
"IO priority", but, it'll be a looong time until that's possible.
So I think for now we should make a Directory impl that emulates such
behavi
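Until such an impl exists, the heart of the idea is just a pacing calculation;
a minimal, library-free sketch (name and rate are made up) that a throttling
Directory could call before each merge-time write:
// Hedged sketch: pace writes to average at most maxBytesPerSec.
public class SimpleRateLimiter {
    private final double maxBytesPerSec;
    private long lastNanos = System.nanoTime();

    public SimpleRateLimiter(double maxBytesPerSec) {
        this.maxBytesPerSec = maxBytesPerSec;
    }

    public synchronized void pause(long bytes) throws InterruptedException {
        long target = lastNanos + (long) (bytes / maxBytesPerSec * 1e9);
        long now = System.nanoTime();
        if (target > now) {
            Thread.sleep((target - now) / 1000000L); // nanos -> millis
        }
        lastNanos = Math.max(now, target);
    }
}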
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> no obvious issues.
> you may post your entire data-config.xml
>
Here it is, exactly as last attempt but with usernames etc. removed.
Ignore the comments and the unused FileDataSource...
http://old.nabble.com/file/p26335171/dataimport.temp.xml dataimpo
Hi,
we are using the following entry in schema.xml to make a copy of one type of
dynamic field to another:
Is it possible to exclude some fields from copying?
We are using Solr 1.3
~Vikrant
--
View this message in context:
http://old.nabble.com/exclude-some-fields-from-copying-dynamic-fi
no obvious issues.
you may post your entire data-config.xml
do w/o CachedSqlEntityProcessor first and then apply that later
On Fri, Nov 13, 2009 at 4:38 PM, Andrew Clegg wrote:
>
> Morning all,
>
> I'm having problems with joining a child entity from one database to a
> parent from anothe
Any ideas on this? Is it worth sending a bug report?
Those links are live, by the way, in case anyone wants to verify that MLT is
returning suggestions with very low tf.idf.
Cheers,
Andrew.
Andrew Clegg wrote:
>
> Hi,
>
> If I run a MoreLikeThis query like the following:
>
> http://www.cat
For this list I usually end up at http://solr.markmail.org (which I believe also
uses Lucene under the hood).
Google is such a black box ...
Pros:
+1 Open Source (enough said :-)
There also seems to always be the notion that "crawling" lends itself to
producing the best results, but that is rarel
Lukáš Vlček wrote:
>
> When you need to search for something Lucene or Solr related, which one do
> you use:
> - generic Google
> - go to a particular mail list web site and search from here (if there is
> any search form at all)
>
Both of these (Nabble in the second case) in case any recent p
Morning all,
I'm having problems with joining a child entity from one database to a
parent from another...
My entity definitions look like this (names changed for brevity):
c is getting indexed fine (it's stored, I can see field 'c' in the search
results) but child.d isn't. I know
Hi,
thanks for the inputs so far... however, let's put it this way:
When you need to search for something Lucene or Solr related, which one do
you use:
- generic Google
- go to a particular mail list web site and search from here (if there is
any search form at all)
- go to LucidImagination.com and u
Lukáš Vlček wrote:
>
> I am looking for good arguments to justify implementing a search for
> sites
> which are available on the public internet. There are many sites in
> "powered
> by Solr" section which are indexed by Google and other search engines but
> still they decided to invest resour
Jan-Eirik B. Nævdal wrote:
Some extra for the pros list:
- Full control over which content is searchable and not.
- Possibility to make pages searchable almost instantly after publication
- Control over when the site is indexed
+1, especially the last point
you can also add a robots.txt and
I found the solution.
If somebody runs into the same problem, here is how I solved it:
- while uploading the document:
req.setParam("uprefix", "attr_");
req.setParam("fmap.content", "attr_content");
req.setParam("overwrite", "true");
req.setPara
Next to the faceting engine:
- MoreLikeThis
- Highlighting
- Spellchecker
But also more flexible querying using the DisMax handler, which is
clearly superior. Solr can also be used to store data which can be
retrieved in an instant! We have used this technique in a site and it is
obviously much fas
Some extra for the pros list:
- Full control over which content is searchable and not.
- Possibility to make pages searchable almost instantly after publication
- Control over when the site is indexed
Friendly
Jan-Eirik
On Fri, Nov 13, 2009 at 10:52 AM, Lukáš Vlček wrote:
> Hi,
>
> I am look
Michael wrote:
I've got a process external to Solr that is constantly feeding it new
documents, retrying if Solr is not responding. What's the right way to
stop Solr (running in Tomcat) so no documents are lost?
Currently I'm committing all cores and then running catalina's stop
script, but betw
Hello list,
I'm new to Solr but from what I've been experimenting with, it's awesome.
I have a small issue regarding the highlighting feature.
It finds stuff (as I see from the query analyzer), but the highlight list
looks something like this:
(the files were added using ContentStreamUpdateRequest re
you must have a corresponding getter which returns String. For example
(assuming the bean has a Calendar field named validFrom; Solr's date fields
expect ISO 8601 UTC):
public String getValidFrom() {
    // convert the Calendar to the ISO 8601 format Solr expects
    java.text.SimpleDateFormat f = new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    f.setTimeZone(java.util.TimeZone.getTimeZone("UTC"));
    return f.format(validFrom.getTime());
}
On Fri, Nov 13, 2009 at 2:01 PM, paulhyo wrote:
>
> Hi Paul,
>
> it's working for Query, but not for Updating (Add Bean). The getter me
Hi Paul,
it's working for Query, but not for Updating (Add Bean). The getter method
is returning a Calendar (GregorianCalendar instance).
On the indexer side, a toString() or something equivalent is done and an
error is thrown:
Caused by: java.text.ParseException: Unparseable date:
"java.util.Gre