I have a couple of questions from some online newspaper folks who are
interested in Solr and are trying to understand how and why it came to be. I
think inherent in these questions is the underlying theme I hear all the
time and that is "Solr is not a content management system. It's a search
engin
On 9/22/06, Michael Imbeault <[EMAIL PROTECTED]> wrote:
I upgraded to the most recent Solr build (9-22) and sadly it's still
really slow. An 800-second query with a single facet on first_author, 15
million documents total; the query returns 180. Maybe I'm doing
something wrong? Also, this is on my
Chris,
I think what I am trying to do is actually much simpler than what you
are talking about here.
I do plan on returning document ids and retrieving full entity data from
the database; Solr would just be used for the search, not for results display.
The problem is that some data cannot be
On 9/22/06, Tim Archambault <[EMAIL PROTECTED]> wrote:
I have a couple of questions from some online newspaper folks who are
interested in Solr and are trying to understand how and why it came to be. I
think inherent in these questions is the underlying theme I hear all the
time and that is "Solr
On 9/21/06 5:37 PM, "James liu" <[EMAIL PROTECTED]> wrote:
> Yes, it's working. The root of my problem is that the XML must be encoded as UTF-8.
> If you use PHP, it's not about the web browser; just note that
> the curl header information must be UTF-8.
> If you use post.sh, the XML must be encoded as UTF-8. (my editplus default e
On 9/22/06, Walter Underwood <[EMAIL PROTECTED]> wrote:
This might be a Solr bug. Solr should be able to accept XML in any
of the required encodings (ASCII, Latin 1, UTF-8, and UTF-16).
Getting XML content types exactly right is tricky, see RFC 3023.
Right now Solr pays attention to Content-typ
On 9/22/06 10:22 AM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> What I think might be ideal: If there is a charset definition, then
> let the servlet handle it by requesting a Writer. If there isn't
> a charset definition, request a byte-oriented InputStream from the
> container and let the XML
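A minimal sketch of the rule proposed above, written in Python for illustration rather than as servlet code, so it is not Solr's implementation. The function names and sample payloads are hypothetical. If the Content-Type header names a charset, that charset is used; otherwise the raw bytes go to the XML parser, which detects the encoding from the byte order mark or the XML declaration.

```python
import xml.etree.ElementTree as ET
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type header, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def parse_update(body, content_type):
    """Parse an XML request body (bytes) according to its Content-Type."""
    charset = charset_from_content_type(content_type)
    if charset:
        # The header declares a charset: it overrides the XML declaration.
        parser = ET.XMLParser(encoding=charset)
        parser.feed(body)
        return parser.close()
    # No charset in the header: hand the raw bytes to the parser and let it
    # detect the encoding from the BOM or the XML declaration.
    return ET.fromstring(body)
```

With this split, a Latin-1 body sent as "text/xml; charset=ISO-8859-1" and a body that only declares its encoding in the XML declaration both parse to the same text.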
: I may need to add functionality to Solr's admin pages. The
: functionality that I'm looking to add is the ability to trigger certain
: indexing functions and monitor their progress. I'm wondering if people
: have thoughts about the best way to do this. Here are my initial ideas:
:
: 1. Add ad
Obvious datasources: MSSQL, MySQL, etc. I'm under the impression that I have
to send an XML request to SOLR for every add, update, delete, etc. in my
database.
I believe there's a way to access MSSQL, MySQL etc. directly with Lucene,
but not sure how to do this with SOLR.
Thanks for all your fee
On Sep 22, 2006, at 2:45 PM, Tim Archambault wrote:
I believe there's a way to access MSSQL, MySQL etc. directly with
Lucene,
but not sure how to do this with SOLR.
Nope. Lucene is a pure search engine, with no hooks to databases, or
document parsers, etc. Lots of folks have built these
Okay, I'll use an example.
A recruitment (jobs) customer goes onto our website and posts an online job
posting to our newspaper website. Upon insert into the database, I need to
generate an xml file to be sent to SOLR to ADD as a record to the search
engine. Same goes for an edit, my database u
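A rough sketch of what generating those messages could look like, assuming Solr's XML update format with add/doc/field elements and a delete-by-id message. The field names (id, title, description) and the helper names are hypothetical and would have to match the actual schema. An edit can be sent as another add with the same unique key, assuming the schema defines one.

```python
import xml.etree.ElementTree as ET


def build_add(doc):
    """Build a Solr <add> message (UTF-8 bytes) from a dict of field -> value."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="utf-8")


def build_delete(doc_id):
    """Build a Solr <delete> message (UTF-8 bytes) for one document id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="utf-8")


# Hypothetical job posting, as it might come back from the database insert.
job = {"id": 101, "title": "Registered Nurse", "description": "Night shift, ICU"}
add_message = build_add(job)
delete_message = build_delete(job["id"])
```

The resulting bytes would then be POSTed to Solr's update URL by whatever code handles the database insert, edit, or delete.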
On 9/22/06 12:25 PM, "Tim Archambault" <[EMAIL PROTECTED]>
wrote:
> A recruitment (jobs) customer goes onto our website and posts an online job
> posting to our newspaper website. Upon insert into the database, I need to
> generate an xml file to be sent to SOLR to ADD as a record to the search
>
I'm really confused. I don't mean "store" the data figuratively as in a
lucene/solr command. Storing an ID number in a solr index isn't going to
help a user find "nurse". I think part of this is that some people feel that
databases like MSSQL, MYSQL should be able to provide quality search
experie
On 9/22/06, Tim Archambault <[EMAIL PROTECTED]> wrote:
I've been talking with other papers about Solr and I think what bothers many
is that there is a deposit of information in a structured database here
[named A], then we have another set of basically the same data over here
[named B] and they
Sorry, I was not being exact with "store". Lucene has separate
control over whether the value of a field is stored and whether
it is indexed. The term "nurse" might be searchable, but the
only value that is stored in the index for retrieval is the
database key for each matching job.
It seems like
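To make the indexed-versus-stored distinction concrete, here is a toy sketch in Python. It is not how Lucene is implemented; the class, the sample rows, and the JOBS_DB dictionary standing in for the real database are all hypothetical. The job text is searchable, but the only thing the index hands back is the database key.

```python
from collections import defaultdict


class KeyOnlyIndex:
    """Toy inverted index: text is searchable, only the database key is kept."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of database keys

    def add(self, key, text):
        # Index each lowercased term, but do not keep the text itself.
        for term in text.lower().split():
            self._postings[term].add(key)

    def search(self, term):
        # Return the matching database keys in sorted order.
        return sorted(self._postings.get(term.lower(), ()))


# Hypothetical stand-in for the jobs table in the real database.
JOBS_DB = {
    101: {"title": "Registered Nurse", "employer": "General Hospital"},
    102: {"title": "Line Cook", "employer": "Harbor Diner"},
}

index = KeyOnlyIndex()
for key, row in JOBS_DB.items():
    index.add(key, row["title"])

# Search the index for keys, then fetch the full rows from the database.
matches = [JOBS_DB[key] for key in index.search("nurse")]
```

The search for "nurse" yields only key 101; everything shown to the user comes from the database row for that key.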
I think you will find that this architecture is quite common. What
commercial packages provide (remember you are getting this for free!)
are the tools for managing the dynamic export of data out of your
database into the full-text search engine.
Solr provides a very easy way to do this, but ye
Okay. We are all on the same page. I just don't express myself as well in
"programming speak" yet.
I'm going to read up on Otis' "Lucene in Action" tonight. I'd swear he had
an example of how to inject records into a lucene index using java and sql.
Maybe I'm wrong though.
On 9/22/06, Walter U
Excellent news; as you guessed, my schema was (for some reason) set to
version 1.0. This also caused some of the problems I had with the
original SolrPHP (parsing the wrong response).
But better yet, the 800-second query is now running in 0.5-2 seconds!
Amazing optimization! I can now do face
On 9/22/06, Michael Imbeault <[EMAIL PROTECTED]> wrote:
Excellent news; as you guessed, my schema was (for some reason) set to
version 1.0.
Yeah, I just realized that having "version" right next to "name" would
lead people to think it's "their" version number, when it's really
Solr's version nu
Regarding XML databases, there is an excellent open-source XML database 'eXist'
which currently uses indexes to speed up both structure-based and content-based
retrieval via XQuery; there are plans on their development roadmap to replace
parts of the indexing mechanism, particularly fulltext ana
: I've been talking with other papers about Solr and I think what bothers many
: is that there is a deposit of information in a structured database here
: [named A], then we have another set of basically the same data over here
: [named B] and they don't understand why they have to manage to dif
: The best example I can think of is a resume database. You could
: certainly just put the whole resume document into the text index and
: do full text searches. But to answer the question of what people
: received a Harvard MBA in the last 10 years and have worked at Intel in
: the last 5 yea
Amen Hoss. I appreciated you explaining in terms of what I can understand,
"jobs." Makes it easier for me to learn.
What you are saying is right-on with what I'm trying to understand. Right
now I have simple Lucene indexes that are basically re-created once daily and
that simply isn't doing the job
2006/9/23, Walter Underwood <[EMAIL PROTECTED]>:
On 9/21/06 5:37 PM, "James liu" <[EMAIL PROTECTED]> wrote:
> Yes, it's working. The root of my problem is that the XML must be encoded as UTF-8.
> If you use PHP, it's not about the web browser; just note that
> the curl header information must be UTF-8.
> If you use post.sh, the XML mu