The format is fixed; you can't change it -- something on the Solr end needs to
parse that XML and expects specific XML elements and structure, so it can't
handle whatever one throws at it.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
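For reference, the update handler expects roughly this structure (a sketch using Solr's example field names; substitute the fields from your own schema):

```xml
<add>
  <doc>
    <field name="id">3007WFP</field>
    <field name="name">Dell Widescreen UltraSharp 3007WFP</field>
    <field name="manu">Dell, Inc.</field>
  </doc>
</add>
```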
- Original Message
> From: co
Could you paste the XML you are posting?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: 李学健 <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, September 18, 2008 11:27:26 PM
> Subject: error when post xml data to solr
>
Why restart Solr? Reloading a core may be sufficient.
SOLR-561 already supports this
-
On Thu, Sep 18, 2008 at 5:17 PM, Jason Rutherglen
<[EMAIL PROTECTED]> wrote:
> Servlets is one thing. For SOLR the situation is different. There
> are always small changes people want to make, a new stop
It is surprising that this happens. The javabin format offers significant
perf improvements over the XML one. You can probably also try
javabin.
On Thu, Sep 18, 2008 at 10:17 PM, syoung <[EMAIL PROTECTED]> wrote:
>
> I tried setting the 'wt' parameter to both 'xml' and 'javabin
Hi guys.
Is the XML format for inputting data a standard one, or can I change it?
That is instead of :
3007WFP
Dell Widescreen UltraSharp 3007WFP
Dell, Inc.
can I enter something like,
100100
BPO
1500
100200
ITES
2500
Thanks
--
Thanks Otis for the reply! Always appreciated!
That is indeed what we are looking at implementing. But I'm running
out of time to prototype or experiment for this release.
I'm going to run the two-index thing for now, unless I find something
saying it is really easy and sensible to run one and collap
Hi all,
when I post an XML file to Solr, some errors happen, as below:
==
com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
at [row,col {unknown-source}]: [1,0]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686)
at c
Gene,
I haven't looked at Field Collapsing for a while, but if you have a single
index and collapse hits on your category field, won't the first 10 hits be
the items you are looking for -- the top item for each category x 10, using a
single query?
Otis
--
Sematext -- http://sematext.com/ -- Lucene -
On Fri, Sep 19, 2008 at 5:55 AM, oleg_gnatovskiy <
[EMAIL PROTECTED]> wrote:
>
> Hello. I am using the spellcheck component
> (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker
> index is kept in RAM, it gets erased every time the Solr server gets
> restarted. I was thinking
Hello. I am using the spellcheck component
(https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker
index is kept in RAM, it gets erased every time the Solr server gets
restarted. I was thinking of using either the firstSearcher or the
newSearcher to reload the index every time So
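One way to trigger a rebuild on startup is a firstSearcher/newSearcher listener in solrconfig.xml that fires a query with spellcheck.build=true -- a sketch that assumes the SOLR-572 component honors that parameter in your build:

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">solr</str>
      <str name="spellcheck">true</str>
      <str name="spellcheck.build">true</str>
    </lst>
  </arr>
</listener>
```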
Matthew,
Thanks, a very good point.
Andrey.
> -Original Message-
> From: Matthew Runo [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 18, 2008 11:38 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Hardware config for SOLR
>
> I can't speak to a lot of this - but regarding the
Otis,
Would it be reasonable to run a query like this
http://localhost:8280/solr/select/?q=terms_x&version=2.2&start=0&rows=0&indent=on
10 times, one for each result from an initial category query on a
different index.
So, it's still 1+10, but I'm not returning values.
This would give me the numbe
Here is what I was able to get working with your help.
(productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR
((*:* -endDate:[* TO *])))
the *:* is what I was missing.
Thanks for your help.
hossman wrote:
>
>
> : If the query starts with a negative clause Lucene return
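The reason the bare negative clause fails is that Lucene evaluates each clause against a set of matching documents, and a purely negative clause has no base set to subtract from; *:* supplies the full document set. A toy sketch of the set arithmetic (document IDs invented for illustration):

```python
# Toy model of the query (*:* -endDate:[* TO *]).
all_docs = {1, 2, 3, 4}      # *:* matches every document
has_end_date = {1, 2}        # endDate:[* TO *] matches docs with the field

# -endDate:[* TO *] alone has nothing to subtract from;
# prefixing *:* makes the subtraction explicit:
no_end_date = all_docs - has_end_date
print(sorted(no_end_date))  # [3, 4]
```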
Barry,
You are seeing the value of the field as it was saved (the original), but
perhaps something is funky with how it was analyzed/tokenized at index time
versus how it is being analyzed now at query time. Double-check your
fieldType/analysis settings for this field and make sure you are us
Daniel Papasian wrote:
Norberto Meijome wrote:
Thanks Yonik. OK, that matches what I've seen -- if I know the actual
name of the field I'm after, I can use it in a query, but I can't
use the dynamic_field_name_* (with wildcard) in the config.
Is adding support for this something that is desir
Hi Otis,
no, that does not seem to bring back the correct results either; in fact it's
still zero results.
It's also not bringing back results if I use the standard handler
http://127.0.0.1:8080/apache-solr-1.3.0/select?q=Output-Type-facet:Monochrome
but the field is visible in the documents retur
I tried setting the 'wt' parameter to both 'xml' and 'javabin'. Neither
worked. However, setting the parser on the server to XMLResponseParser did
fix the problem. Thanks for the help.
Susan
Noble Paul നോബിള് नोब्ळ् wrote:
>
> I guess the post is not sending the correct 'wt' parameter. tr
Hi
sorry, I think I've now started rsyncd properly:
[EMAIL PROTECTED]:/# ./data/solr/books/bin/rsyncd-enable
[EMAIL PROTECTED]:/# ./data/books/video/bin/rsyncd-start
but then I can't find the snapshot.current files??
How can I check I did it properly?
my rsyncd.log :
2008/09/18 18:06:0
Geoff,
In short: none of the items you listed is a problem for Solr. Indices can be
sharded, distributed search is possible, custom ranking is possible, 30 fields
is possible, etc. etc.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From:
Hi,
If all you have to do is "hide" certain fields from search results for some
users, then your application -- the application that sends search requests to
Solr -- can just use different fl=XXX parameters based on the user's
permissions. I think that's all you need, and the custom fieldType should n
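A minimal sketch of that application-layer approach, with an invented role-to-fields mapping (the roles and field names are hypothetical):

```python
from urllib.parse import urlencode

# Hypothetical mapping of user roles to the fields they may see.
FIELDS_BY_ROLE = {
    "admin": ["id", "name", "price", "supplier_cost"],
    "guest": ["id", "name", "price"],
}

def build_select_url(base, query, role):
    """Build a Solr select URL whose fl list depends on the user's role."""
    params = {"q": query, "fl": ",".join(FIELDS_BY_ROLE[role])}
    return base + "/select?" + urlencode(params)

print(build_select_url("http://localhost:8983/solr", "monitor", "guest"))
```

The search engine itself stays permission-agnostic; only the fl list changes per request.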
Hi Christian,
While I can't tell you whether the problem with "-" will be solved when you try
it on 1.3, I can tell you that you should probably trim your dates so they are
not as fine as you currently have them, unless you need such precision. We
need to add this to the FAQ. :)
Otis
--
Semat
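A sketch of what trimming can look like on the indexing side (assuming millisecond-precision UTC timestamps in Solr's date format, and that minute precision is enough for your queries -- coarser dates mean far fewer unique terms in the index):

```python
from datetime import datetime

def trim_to_minute(iso_ts):
    """Round a Solr-style UTC timestamp down to minute precision."""
    dt = datetime.strptime(iso_ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return dt.strftime("%Y-%m-%dT%H:%M:00Z")

print(trim_to_minute("2008-09-18T11:27:26.123Z"))  # 2008-09-18T11:27:00Z
```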
Barry, does this return the correct hits:
http://127.0.0.1:8080/apache-solr-1.3.0/IvolutionSearch?q=Output-Type-facet:Monochrome
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Barry Harding <[EMAIL PROTECTED]>
> To: "solr-user@lucene.apach
Hi Geoff,
I cannot vouch for Autonomy; however, earlier this year we did evaluate
Endeca & Solr and we went with Solr. Some of the reasons were:
1. Freedom of open source with Solr
2. Very good & active Solr open source community
3. Features pretty much overlap between Solr & Endeca
4. Endeca how
I can't speak to a lot of this - but regarding the servers, I'd go with
the more powerful ones, if only for the amount of RAM. Your index will
likely be larger than 1 GB, and with only two you'll have a lot of
your index not stored in RAM, which will slow down your QPS.
Thanks for your time
this is my log file :
[EMAIL PROTECTED]:/home# tail -f /var/log/tomcat5.5/catalina.$(date
+%Y-%m-%d).log
Sep 18, 2008 5:25:02 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
call
INFO: Creating a connection for entity books with URL:
jdbc:mysql://master-spare.vip.books.com/books
Sep 18, 2
I would do the field visibility one layer up from the search engine.
That layer already knows about the user and can request the appropriate
fields. Or request them all (better HTTP caching) and only show the
appropriate ones.
As I understand your application, putting access control in Solr
doesn'
On Thu, Sep 18, 2008 at 8:45 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
> I agree about that, but the last time, 4 hours later, the number wasn't
> different
> :
Do you mean that the number doesn't change at all on refreshing the page?
Can you check the solr log file for exceptions?
I suspect that yo
I agree about that, but the last time, 4 hours later, the number wasn't
different:
and if I check now, nothing has changed. Does it have to go across all the data
like a full import? I thought it would bring back just the IDs which need to be
modified ...?
0:39:36.943
3447914
9054602
492558
0
2008-09-18 16:29
Well, it shows the number of documents that have changed; you can't expect
1603970 documents to be indexed instantly.
On Thu, Sep 18, 2008 at 8:24 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
> It is exactly what I've done, but it doesn't work like that ...
>
> - what would that mean ... cron job c
On Sep 18, 2008, at 3:23 AM, Geoff Hopson wrote:
As per other thread
1) security down to field level
How complex a security model do you need?
Is each user's field visibility totally distinct? Are there a few
basic groups?
If you are willing to write (or hire someone to write) a cus
It is exactly what I've done, but it doesn't work like that ...
- what would that mean ... the cron job can't hit it properly?
- I've browsed to /dataimport but it was like nothing was running, so I
finally went back to /dataimport?command=delta-import and then to
/dataimport and I refresh it of
Hit /dataimport again from a browser and refresh periodically to see the
progress (number of documents indexed).
On Thu, Sep 18, 2008 at 7:55 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
> It was too long so I finally restarted tomcat .. then 5 min later my cron job
> started :
> but it looks like nothin
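The status page is plain XML, so it can also be polled from a script rather than a browser. A sketch of parsing the count out of it (the element names follow the status output pasted elsewhere in this thread and may differ in your DIH version):

```python
import xml.etree.ElementTree as ET

# A cut-down sample of a DIH status response, for illustration.
SAMPLE = """<response>
  <str name="status">busy</str>
  <lst name="statusMessages">
    <str name="Total Documents Processed">3451431</str>
  </lst>
</response>"""

def docs_processed(status_xml):
    """Pull the processed-document count out of a DIH status response."""
    root = ET.fromstring(status_xml)
    node = root.find(".//str[@name='Total Documents Processed']")
    return int(node.text) if node is not None else 0

print(docs_processed(SAMPLE))  # 3451431
```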
Otis Gospodnetic wrote:
Perhaps the container logs explain what happened?
How about just throttling to the point where the failure rate is 0%?
Too slow?
Otis's questions regarding dropped inserts sent me back to the drawing
board. The system had been tuned to a slower database to optimiz
My project is looking to index 10s of millions of documents, providing
search across a live-live environment (hence index
distribution/replication is important). Most searches have to be done
(i.e., to the end user) in 5 seconds or less. The index has about 30 fields,
and I reckon that the security access
It depends entirely on the needs of the project. For some things,
Solr is superior to Autonomy, for other things, not.
I used to work at Autonomy (and Verity and Inktomi and Infoseek),
and I chose Solr for Netflix. It is working great for us.
wunder
==
Walter Underwood
Former Ultraseek Architect
It was too long, so I finally restarted tomcat .. then 5 min later my cron job
started:
but it looks like nothing is happening via the cron job:
This is my OUTPUT file : tot.txt
00data-config.xmldelta-import,idleThis
response format is experimental. It is likely to change in the
future.
This is my C
From the XML 1.0 spec.: "Legal characters are tab, carriage return,
line feed, and the legal graphic characters of Unicode and ISO/IEC
10646." So, \005 is not a legal XML character. It appears the old StAX
implementation was more lenient than it should have been and Woodstox is
doing the corr
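A small sketch of sanitizing text before building the update XML, keeping only characters legal in XML 1.0 (tab, CR, LF, and the listed graphic ranges):

```python
import re

# XML 1.0 legal characters: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD,
# #x10000-#x10FFFF.  Everything else (e.g. \x05) must be removed.
ILLEGAL_XML = re.compile(
    "[^\t\n\r\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_illegal(text):
    """Drop characters that are not legal in an XML 1.0 document."""
    return ILLEGAL_XML.sub("", text)

print(strip_illegal("bad\x05value"))  # badvalue
```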
Yes, so it's probably best to make the changes through a remote
interface so that the app will be able to make the appropriate
internal changes. File-based system changes are less than ideal,
agreed; however, I suppose with an open source project such as SOLR the
kitchen-sink effect happens and it
This XML file does not appear to have any style information
associated with it. The document tree is shown below.
0
0
data-config.xml
idle
4:26:16.934
3451431
9165885
493061
0
2008-09-18 10:01:01
2008-09-18 10:01:01
2008-09-18 10:01:43
2008-09-18 10:01:43
1587889
Dynamic changes are not what I'm against... I'm against dynamic changes
that are triggered by the app noticing that the config has changed.
Jason Rutherglen wrote:
Servlets is one thing. For SOLR the situation is different. There
are always small changes people want to make, a new stop word,
Servlets is one thing. For SOLR the situation is different. There
are always small changes people want to make, a new stop word, a small
tweak to an analyzer. Rebooting the server for these should not be
necessary. Ideally this is handled via a centralized console and
deployed over the network
> multi-core allows you to instantiate a completely
> new core and swap it for the old one, but it's a bit of a heavyweight
> approach.
Multi-core seems like more of a hack to get around running multiple
JVMs. It doesn't seem like the most elegant solution for most
problems because usually the s
Isn't this done in servlet containers for debugging-type work? Maybe an
option, but I disagree that this should drive anything in Solr. It
should really be turned off in production in servlet containers, imo, as
well.
This can really be such a pain in the ass on a live site... someone
touches we
> That would allow a single request to see a stable view of the
> schema, while preventing having to make every aspect of the schema
> thread-safe.
Yes that is the best approach.
> Nothing will stop one from using java serialization for config
> persistence,
Persistence should not be serialized.
This should be done. Great idea.
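That stable-view approach can be sketched as an immutable snapshot each request grabs once, with reloads swapping in a whole new object rather than mutating the live one (class and field names invented for illustration):

```python
import threading

class ConfigHolder:
    """Holds an immutable config snapshot; reloads replace it atomically."""
    def __init__(self, config):
        self._config = config
        self._lock = threading.Lock()

    def current(self):
        # A request calls this once and uses the returned snapshot for
        # its whole lifetime -- later reloads can't affect it mid-request.
        return self._config

    def reload(self, new_config):
        with self._lock:
            self._config = new_config

holder = ConfigHolder({"stopwords": ("a", "the")})
snapshot = holder.current()
holder.reload({"stopwords": ("a", "the", "of")})
print(snapshot["stopwords"])          # ('a', 'the')
print(holder.current()["stopwords"])  # ('a', 'the', 'of')
```

Only the snapshot object needs to be immutable; nothing else in the schema has to be made thread-safe.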
On Wed, Sep 17, 2008 at 3:41 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> My vote is for dynamically scanning a directory of configuration files. When
> a new one appears, or an existing file is touched, load it. When a
> configuration disappears, unload it. Th
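A sketch of that scanning rule (load on appear or touch, unload on disappear), keyed on modification times; the function name and structure are invented for illustration:

```python
import os

def scan(directory, known):
    """Return (load, unload) sets by comparing mtimes against a known map.

    `known` maps filename -> last seen mtime; it is updated in place.
    """
    seen = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            seen[name] = os.path.getmtime(path)
    # New or touched files get (re)loaded; vanished files get unloaded.
    load = {n for n, m in seen.items() if known.get(n) != m}
    unload = set(known) - set(seen)
    known.clear()
    known.update(seen)
    return load, unload
```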
Hi Yonik,
One approach I have been working on that I will integrate into SOLR is
the ability to use serialized objects for the analyzers so that the
schema can be defined on the client side if need be. The analyzer
classes will be dynamically loaded. Or there is no need for a schema
and plain Ja
Hi,
I have a fairly simple Solr setup with several predefined fields that are
indexed and stored, and depending on the type of product I also add various
dynamic fields of type string to a record. I should mention that I am using
the
solr.DisMaxRequestHandler request handler called "/
Thanks Akshay and Norberto,
I am still trying to make it work. I know the solution is what you pointed
me to, but it is just taking me some time to make it work.
thanks,
-Sanjay
On Thu, Sep 18, 2008 at 12:34 PM, Norberto Meijome <[EMAIL PROTECTED]>wrote:
> On Thu, 18 Sep 2008 10:53:39 +0530
> "Sanja
I don't think it can work at index time, because when somebody looks
for a book I want to boost the search in relation to the user's language
... so I don't think it can work, unless I didn't get it.
Thanks for your answer,
hossman wrote:
>
>
> : Is there a way to convert to integer
I guess the post is not sending the correct 'wt' parameter. Try
setting wt=javabin explicitly.
wt=xml may not work because the parser is still binary.
check this http://wiki.apache.org/solr/Solrj#xmlparser
On Thu, Sep 18, 2008 at 11:49 AM, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> A qui
OK, thanks, it's very clear.
Just, do you know why my cron job doesn't work:
# m h dom mon dow command
*/5 * * * * /usr/bin/wget
http://solr-test.books.com:8080/solr/books/dataimport?command=delta-import
When I go to check the date in conf/dataimport.properties, the date and hour
don't change
Hi Chris,
it was a long night for our Solr server today because we rebuilt the complete
index using "well formed" date strings. And the date field is stored now, so
we can see if something went wrong :-)
But our problems are completely solved. Now I can give you a very exact
descripti
As per other thread
1) security down to field level
Otherwise I am mostly happy that Solr gives me everything that Autonomy does.
2008/9/18 Otis Gospodnetic <[EMAIL PROTECTED]>:
> Geoff,
>
> Perhaps you can find out the list of features/functionalities that your
> project requires and we can gi
Hi Otis,
Thanks for the response. I'll try and inline some clarity...
2008/9/18 Otis Gospodnetic <[EMAIL PROTECTED]>:
>> I am trying to put together a security model around fields in my
>> index. My requirement is that a user may not have permission to view
>> certain fields in the index when he
On Thu, 18 Sep 2008 10:53:39 +0530
"Sanjay Suri" <[EMAIL PROTECTED]> wrote:
> One of my field values has the name "Räikkönen" which contains special
> characters.
>
> Strangely, as I see it anyway, it matches on the search query 'x' ?
>
> Can someone explain or point me to the solution/doc