Problems:
1) If you get the schema wrong, it is painful to live with. You may need to
extract all the data and reindex with your new schema. To ease this, I wrote an
XSL script that massaged the default Solr XML output into the Solr XML input
format. Extracting is really slow, and this process took days.
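The poster's XSL script is not shown. As a rough illustration of the same idea, here is a minimal Python sketch that turns a Solr XML query response into an `<add>` update message. The function name `response_to_add` is hypothetical, and the sketch assumes the standard response layout (`<doc>` elements with typed children such as `<str name="...">`, and `<arr>` for multi-valued fields). Only stored fields come back in a response, so anything indexed but not stored cannot be recovered this way.

```python
import xml.etree.ElementTree as ET


def response_to_add(response_xml: str) -> str:
    """Convert a Solr XML query response into an <add> update message.

    Hypothetical helper: assumes every field of interest is stored and
    appears in the response as <str>, <int>, <date>, ... or <arr>.
    """
    root = ET.fromstring(response_xml)
    add = ET.Element("add")
    for doc in root.iter("doc"):
        out_doc = ET.SubElement(add, "doc")
        for child in doc:
            name = child.get("name")
            if child.tag == "arr":
                # Multi-valued field: emit one <field> per value.
                values = [v.text or "" for v in child]
            else:
                values = [child.text or ""]
            for value in values:
                field = ET.SubElement(out_doc, "field", name=name)
                field.text = value
    return ET.tostring(add, encoding="unicode")
```

The result can be posted to the update handler in batches. Paging through the old index is still the slow part, as noted above.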
We've been using a Lucene index as the main data-store for ActiveMath,
the indexing process of which takes the XML fragments apart and stores
them in an organized way, including storage of the relationships both
ways.
The difference between SQL and Lucene in this case? Pure java was the
m
The other option was actually CouchDB. It was very nice, but the benefits
were not compelling compared to the pure simplicity of just having Solr.
With replication now so simple to set up, it really does seem to
solve all the problems we are looking at in a redundant distributed storage
s
You might examine what the Apache CouchDB people have done.
It's a document oriented DB that is able to use JSON structured
documents combined with Lucene indexing of the documents with a
RESTful HTTP interface.
It's a stretch, and it's written in Erlang, but perhaps there is some
inspiration to be h
> > From: Ian Connor
> > To: solr-user@lucene.apache.org
> > Sent: Wednesday, January 28, 2009 4:59:28 PM
> > Subject: Re: solr as the data store
> >
> > I am planning with backups, the recovery will only be incremental.
> >
> > Is there an internal fie
There is no existing internal field like that.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
> From: Ian Connor
> To: solr-user@lucene.apache.org
> Sent: Wednesday, January 28, 2009 4:59:28 PM
> Subject: Re: solr as the data sto
I am planning with backups, the recovery will only be incremental.
Is there an internal field to know when the last document hit the index or
is this best to build your own "created_at" type field to know when you need
to rebuild from?
After the backup is restored, this field could be read and th
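One way the "created_at" idea above might look in practice, as a minimal sketch: ask the restored index for its single newest document, sorted on that field. The function name and the base URL are assumptions for illustration, and `created_at` must be a field you define and populate yourself. The `q`, `sort`, `rows`, and `fl` parameters are standard Solr query parameters.

```python
from urllib.parse import urlencode


def latest_timestamp_query(base_url: str, field: str = "created_at") -> str:
    """Build a select URL that returns only the newest value of `field`.

    Hypothetical helper: `base_url` and `field` depend on your own
    deployment and schema.
    """
    params = {
        "q": "*:*",               # match every document
        "sort": f"{field} desc",  # newest first
        "rows": 1,                # only the most recent document is needed
        "fl": field,              # return just the timestamp field
    }
    return f"{base_url}/select?{urlencode(params)}"


url = latest_timestamp_query("http://localhost:8983/solr")
```

Fetching that URL after a restore gives the timestamp of the last document the backup contains, which is the point from which newer documents would need to be re-added.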
Although it is unlikely that you will need to rebuild from scratch,
you might want to fully understand the cost of recovery if you
*do* have to.
If it's incredibly expensive (in time or money), you need to keep that in
mind.
-Todd
-----Original Message-----
From: Ian Connor [mailto:ian.con...
This is perfectly fine. Of course, you lose any relational model. If you
don't have or don't need one, why not.
It used to be the case that backups of live Lucene indices were hard, so people
preferred having an RDBMS be the primary data source, the one they know how to
back up and maintain we
One thing to keep in mind is that things like joins are impossible in
Solr, but easy in a database. So if you ever need to do stuff like run
reports, you're probably better off with a database to query on,
unless you cover your bases very well in the Solr index.
Thanks for your time!
Matt
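"Covering your bases" in the index usually means denormalizing: doing the join once, at indexing time, and storing the result in each document. A minimal sketch, assuming two hypothetical record sets (orders and customers) and invented field names:

```python
def denormalize(orders, customers):
    """Flatten an orders-to-customers relationship into one document per order.

    All record shapes and field names here are hypothetical examples.
    """
    customers_by_id = {c["id"]: c for c in customers}
    docs = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"], {})
        docs.append({
            "id": f"order-{order['id']}",
            "order_total": order["total"],
            "customer_id": order["customer_id"],
            # Copied onto the document so no join is needed at query time.
            "customer_name": customer.get("name"),
        })
    return docs


docs = denormalize(
    orders=[{"id": 1, "customer_id": 7, "total": 30}],
    customers=[{"id": 7, "name": "Ada"}],
)
```

The trade-off is that a change to a customer record means reindexing every document that carries a copy of it, which is part of why a database stays attractive for reporting.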