Hi,
Has anyone gotten Solr to schedule data imports at a certain time interval
through configuring Solr?
I tried setting interval=1, which should mean an import every minute, but I
don't see it happening.
I'm trying to avoid cron jobs.
Thanks,
Tri
Does anyone have any idea how to do this?
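As far as I know, stock Solr does not ship a built-in import scheduler, so if configuring an interval doesn't take effect, one cron-free workaround is a small JDK scheduler that periodically hits the DataImportHandler endpoint. A minimal sketch, assuming Java 8+, the default DIH path, and a one-minute interval (the URL, command, and interval are assumptions):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DihScheduler {
  public static void main(String[] args) {
    Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
      try {
        // Fire a delta-import against the DataImportHandler every minute.
        URL url = new URL("http://localhost:8983/solr/dataimport?command=delta-import");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.getResponseCode(); // trigger the request; the response body is ignored
        con.disconnect();
      } catch (Exception e) {
        e.printStackTrace();
      }
    }, 0, 1, TimeUnit.MINUTES);
  }
}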
Jonathan,
thanks for your statement. In fact, you are quite right: a lot of people
have developed great caching mechanisms.
However, the solution I had in mind was something like an HTTP cache - in
most cases on the same box.
I talked to some experts who told me that Squid would be a relatively large
Hi! Sorry for such a break, but I was moving house... anyway:
1. I took the
~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
file and modified it (renamed to StempelFilterFactory.java) in Vim as
follows:
package org.getopt.solr.analysis;
import org.apache.lucene.analysis.T
Hi,
I am trying to index documents (PDF, Doc, XLS, RTF) using the
ExtractingRequestHandler.
I am following the tutorial at
http://wiki.apache.org/solr/ExtractingRequestHandler
But when I run the following command:
curl
"http://localhost:8983/solr/update/extract?literal.id=mydoc.doc&uprefix=
Hello,
I'd like to use solr to index some documents coming from an rss feed,
like the example at [1], but it seems that the configuration used
there is just for a one-time indexing, trying to get all the articles
exposed in the rss feed of the website.
Is it possible to manage and index just the n
Does anyone know what technology they are using: http://www.indextank.com/
Is it Lucene under the hood?
Thanks, and apologies for cross-posting.
-Glen
http://zzzoot.blogspot.com
Hi,
I use Solr 1.3 with the patch for parsing rich documents, and when uploading,
for example, a PDF file, the only thing I see in solr.log is the following:
INFO: [] webapp=/solr path=/update/rich
params={id=250&stream.type=pdf&fieldnames=id,name&commit=true&stream.fieldname=body&name=iphone+user+guide+pdf+
@Jerry Li
What version of Solr were you using? And was there any
data in the new field? I have no problems here with a quick
test I ran on trunk...
Best
Erick
On Thu, Nov 11, 2010 at 1:37 AM, Jerry Li | 李宗杰 wrote:
> but if I use this field to do sorting, there will be an error
> and th
Hi, nizan. I didn't realize that just replying to a thread from my email
client wouldn't get back to you. Here's some info on this thread since your
original post:
On Nov 10, 2010, at 12:30pm, Bob Sandiford wrote:
> Why not use replication? Call it inexperience...
>
> We're really early into
I am facing this weird issue with facet fields.
Within the config XML I have defined the fl as
file_id folder_id display_name file_name priority_text content_type
last_upload upload_by business indexed
But my output XML doesn't contain the elements upload_by and business.
But i
Hi,
I have a question about boosting.
I have the following fields in my schema.xml:
1. title
2. description
3. ISBN
etc
I want to boost the field title. I tried index time boosting but it did not
work. I also tried Query time boosting but with no luck.
Can someone help me on how to implement
Hi,
Thanks for the offers, I'll take a deeper look into them.
In the offers you showed me, if I understand correctly, the call for
creation is done on the client side. I need the mechanism to work on the
server side.
I know it sounds stupid, but I need the client side not to know about
which
Hi All,
I'm having some trouble with a query using a wildcard and I was wondering if
anyone could tell me why these two
similar queries do not return the same number of results. Basically, the query
I'm making should return all docs whose title starts
with (or contains) the string "lowe'". I suspe
Hmmm. Maybe you need to define what you mean by 'server' and what you mean
by 'client'.
Hi,
Maybe I just don't understand all the concepts there and I mix up server and
client...
Client - the place where I make the HTTP calls (for index, search, etc.) -
where I use the CommonsHttpSolrServer as the Solr server. This machine isn't
defined as master or slave, it just uses Solr as a search en
I'm going down the route of patching nutch so I can use this ParseMetaTags
plugin:
https://issues.apache.org/jira/browse/NUTCH-809
Also wondering whether I will be able to use the XMLParser to allow me to
parse well-formed XHTML; using XPath would be a bonus:
https://issues.apache.org/jira/browse/N
Hi,
consider the following fieldtype (used for autocompletion):
This works fine as long as the query string is a single word. For multiple
words, the ranking is weird though.
Example:
Que
No - in reading what you just wrote, and what you originally wrote, I think
the misunderstanding was mine, based on the architecture of my code. In my
code, it is our 'server' level that does the SolrJ indexing calls, but you
meant 'server' to be the Solr instance, and what you mean by 'client' i
I've noticed that using camelCase in field names causes problems.
On 11/5/2010 11:02 AM, Will Milspec wrote:
Hi all,
we're moving from an old lucene version to solr and plan to use the "Copy
Field" functionality. Previously we had "rolled our own" implementation,
sticking title, description,
Hi Robert, All,
I have a similar problem, here is my fieldType,
http://paste.pocoo.org/show/289910/
I want to include stopword removal and lowercase the incoming terms. The idea
being to take "Foo Bar Baz Ltd" and turn it into "foobarbaz" for the EdgeNGram
filter factory.
If anyone can tell me
Hi, all
I have a question about Solr and SolrJ's rollback.
I try to rollback like below:
try {
  server.addBean(dto);
  server.commit();
} catch (Exception e) {
  if (server != null) { server.rollback(); }
}
I wonder whether, if any Exception is thrown, the "rollback" process runs, so
that all the data would not be updated.
but
What you say is true. Solr is not an rdbms.
Kouta Osabe wrote:
Hi, all
I have a question about Solr and SolrJ's rollback.
I try to rollback like below
try{
server.addBean(dto);
server.commit;
}catch(Exception e){
if (server != null) { server.rollback();}
}
I wonder if any Exception thrown,
I am exploring support for Japanese language in solr.
Solr seems to provide CJKTokenizerFactory.
How useful is this module? Has anyone been using this in production for
Japanese language?
One shortfall it seems to have, from what I have been able to read up on, is
that it can generate a lot of false m
You can add an additional field, using KeywordTokenizerFactory instead of
WhitespaceTokenizerFactory, and query both these fields with an OR operator:
edgytext:(Bill Cl) OR edgytext2:"Bill Cl"
You can even apply a boost so that begins-with matches come first.
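For reference, a minimal SolrJ sketch of that two-field query with the boost (this assumes a recent SolrJ client rather than the 1.4-era CommonsHttpSolrServer, and the core URL is an assumption):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class AutocompleteQuery {
  public static void main(String[] args) throws Exception {
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/core1").build();
    // Boost the keyword-tokenized field so whole-prefix matches rank first.
    SolrQuery q = new SolrQuery("edgytext:(Bill Cl) OR edgytext2:\"Bill Cl\"^2.0");
    QueryResponse rsp = client.query(q);
    rsp.getResults().forEach(doc -> System.out.println(doc));
    client.close();
  }
}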
--- On Thu, 11/11/10, Robert G
Are you storing the upload_by and business fields? You will not be able to
retrieve a field from your index if it is not stored. Check that you have
stored="true" for both of those fields.
- Paige
On Thu, Nov 11, 2010 at 10:23 AM, gauravshetti wrote:
>
> I am facing this weird issue in facet fie
Thanks a lot, that setup works pretty well now.
The only problem now is that the StopWords do not work that well anymore. I'll
provide an example, but first the 2 fieldtypes:
I've posted a ConcatFilter in my previous mail which does concatenate tokens.
This works fine, but I
realized that what I wanted to achieve is easier to implement in another way (by
using 2 separate field types).
Have a look at a previous mail i wrote to the list and the reply from Ahmet
Arslan (
Hello All.
My first time post, so be kind. Developing a document store with lots and lots
of very small documents (200 million at the moment. The final size will probably
be double this at 400 million documents). This is proof-of-concept development,
so we are seeing what a single core can do for us
> This setup now makes troubles regarding StopWords, here's
> an example:
>
> Let's say the index contains 2 Strings: "Mr Martin
> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
> list.
>
> Query: edgytext:Mr Scorsese OR edgytext2:Mr Scorsese^2.0
>
> This way, the only result i get is
Hi,
I cannot find out how this is occurring:
Nolosearch/com/search/apachesolr_search/law
You can see that the John Paul Stevens result yields more description in the
search result because of the keyword relevancy, whereas, the other results
just give you a snippet of the title ba
My Solr corpus is currently created by indexing metadata from a
relational database as well as content pointed to by URLs from the
database. I'm using a pretty generic out of the box Solr schema. The
search results are presented via an AJAX enabled HTML page.
When I perform a search the docu
Thanks Robert, I had been trying to get your ConcatFilter to work, but I'm not
sure what I need in the classpath and where Token comes from.
Will check the thread you mention.
Best
Nick
On 11 Nov 2010, at 18:13, Robert Gründler wrote:
> I've posted a ConcatFilter in my previous mail which does
this is the full source code, but be warned, I'm not a Java developer, and I
have no background in Lucene/Solr development:
// ConcatFilter
import java.io.IOException;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenS
On 12 Nov 2010, at 01:46, Ahmet Arslan wrote:
>> This setup now makes troubles regarding StopWords, here's
>> an example:
>>
>> Let's say the index contains 2 Strings: "Mr Martin
>> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
>> list.
>>
>> Query: edgytext:Mr Scorsese OR edgytext2
I look forward to the answers to this one.
Dennis Gearon
Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a
better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com
Could anyone help me understand why "Clyde Phillips" appears in the
results for "Bill Cl"?
"Clyde Phillips" doesn't produce any EdgeNGram that would match "Bill Cl", so
why is it even in the results?
Thanks.
--- On Thu, 11/11/10, Ahmet Arslan wrote:
> You can add an additional field, w
> I'm having some trouble with a query using some wildcard
> and I was wondering if anyone could tell me why these two
> similar queries do not return the same number of results.
> Basically, the query I'm making should return all docs whose
> title starts
> (or contain) the string "lowe'". I suspe
according to the fieldtype I posted previously, I think it's because of:
1. WhiteSpaceTokenizer splits the String "Clyde Phillips" into 2 tokens:
"Clyde" and "Phillips"
2. EdgeNGramFilter gets the 2 tokens, and creates an EdgeNGram for each token:
"C" "Cl" "Cly" ... AND "P" "Ph" "Phi" ...
Th
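For anyone who wants to see those grams directly, here is a small standalone sketch of the whitespace + edge-ngram chain (it assumes a recent Lucene release, whose constructors differ from the 1.4-era classes discussed in this thread; the min/max gram sizes are assumptions):

import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class EdgeNGramDemo {
  public static void main(String[] args) throws Exception {
    WhitespaceTokenizer tokenizer = new WhitespaceTokenizer();
    tokenizer.setReader(new StringReader("Clyde Phillips"));
    // minGram=1, maxGram=20, do not preserve the original token
    TokenStream stream = new EdgeNGramTokenFilter(tokenizer, 1, 20, false);
    CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
    stream.reset();
    while (stream.incrementToken()) {
      System.out.println(term.toString()); // C, Cl, Cly, ..., P, Ph, Phi, ...
    }
    stream.end();
    stream.close();
  }
}

The "Cl" gram emitted for "Clyde" is exactly what matches the "Cl" in "Bill Cl" when the two terms are ORed.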
We're holding a free webinar on migration from FAST to Solr. Details below.
-Yonik
http://www.lucidimagination.com
=
Solr To The Rescue: Successful Migration From FAST ESP to Open Source
Search Based on Apache Solr
Thur
On 2010-11-11, at 3:45 PM, Ahmet Arslan wrote:
>> I'm having some trouble with a query using some wildcard
>> and I was wondering if anyone could tell me why these two
>> similar queries do not return the same number of results.
>> Basically, the query I'm making should return all docs whose
>> t
Hi,
I am using a facet.prefix search with shingles in my autosuggest:
Now I would like to prevent stop words from appearing in the suggestions:
52
6
6
5
25
7
Here I would like to filter out the last 4 suggestions really. Is there a way I
> select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0
> >
> > "wildcard queries are not analyzed" http://search-lucene.com/m/pnmlH14o6eM1/
> >
>
> Yeah I found out about this a couple of minutes after I
> posted my problem. If there is no analyzer then
> why is Solr not finding any documents whe
Ah I see. Thanks for the explanation.
Could you set the defaultOperator to "AND"? That way both "Bill" and "Cl" must
match, and that would exclude "Clyde Phillips".
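If changing schema.xml is inconvenient, the same effect can be had per request. A minimal sketch, reusing the SolrJ imports from the earlier sketch in this thread and assuming an existing SolrClient named client:

SolrQuery q = new SolrQuery("edgytext:(Bill Cl)");
q.set("q.op", "AND"); // both "Bill" and "Cl" must match, which excludes "Clyde Phillips"
QueryResponse rsp = client.query(q);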
--- On Thu, 11/11/10, Robert Gründler wrote:
> From: Robert Gründler
> Subject: Re: EdgeNGram relevancy
> To: solr-user@luc
There's not much to go on here. Boosting works,
and index-time boosting as opposed to query-time boosting
addresses two different needs. Could you add some
detail? All you've really said is "it didn't work", which
doesn't allow a very constructive response.
Perhaps you could review:
http://wiki.apache.org/
Erick,
Thank you so much for the reply, and I apologize for not providing all the
details.
The following are the field definitions in my schema.xml:
Copy Fields:
searchFields
Before creating the indexes I feed XML
There are several mistakes in your approach:
copyField just copies data. Index-time boost is not copied.
There is no such boosting syntax: /select?q=Each&title^9&fl=score
You are searching on your default field.
This is not the cause of your problem, but omitNorms="true" disables index-time
b
I don't know all the implications here, but can't you just
insert the StopwordFilterFactory before the ShingleFilterFactory
and turn it loose?
Best
Erick
On Thu, Nov 11, 2010 at 4:02 PM, Lukas Kahwe Smith wrote:
> Hi,
>
> I am using a facet.prefix search with shingle's in my autosuggest:
> p
(10/11/12 1:49), Kumar Pandey wrote:
I am exploring support for Japanese language in solr.
Solr seems to provide CJKTokenizerFactory.
How useful is this module? Has anyone been using this in production for
Japanese language?
CJKTokenizer is used in a lot of places in Japan.
One shortfall it s
On 11.11.2010, at 17:42, Erick Erickson wrote:
> I don't know all the implications here, but can't you just
> insert the StopwordFilterFactory before the ShingleFilterFactory
> and turn it loose?
I haven't tried this, but I would suspect that I would then get in trouble with
stuff like "united st
>
> Did you run your query without using () and "" operators? If yes can you try
> this?
> &q=edgytext:(Mr Scorsese) OR edgytext2:"Mr Scorsese"^2.0
I didn't use () and "" in my query before. Using the query with those operators
works now; stopwords are thrown out as they should be, thanks.
However,
Without the parens, the "edgytext:" only applied to "Mr"; the default
field still applied to "Scorcese".
The double quotes are necessary in the second case (rather than parens)
because, on a non-tokenized field, the standard query parser will
"pre-tokenize" on whitespace before sending
Hi,
If you are looking for query time boosting on title field you can do
the following:
/select?q=title:android^10
Also unless you have a very good reason to use string for date data
(in your case pubdate and reldate), you should be using
solr.DateField.
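A hedged SolrJ equivalent, in case it helps (the core URL is an assumption, and the commented edismax variant is just one common way to boost a field at query time, not the only one):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TitleBoostQuery {
  public static void main(String[] args) throws Exception {
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/books").build();
    // Equivalent of /select?q=title:android^10
    SolrQuery q = new SolrQuery("title:android^10");
    // Alternative: boost the title field via edismax
    // q = new SolrQuery("android");
    // q.set("defType", "edismax");
    // q.set("qf", "title^10 description");
    System.out.println(client.query(q).getResults().getNumFound());
    client.close();
  }
}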
regards,
Ram
On Fri, Nov 12, 2010 at 3:41
Hi again,
we're coming closer to the rollout of our newly created Solr/Lucene-based
search, and I'm wondering
how people handle changes to their schema on live systems.
In our case, we have 3 cores (ie. A,B,C), where the largest one takes about 1.5
hours for a full dataimport from the relation
You can do a similar thing to your case #1 with Solr replication,
handling a lot of the details for you instead of you manually switching
cores and such. Index to a new core, then tell your production solr to
be a slave replicating from that master new core. It still may have some
of the same d
If by "corrupt index" you mean an index that's just not quite
up to date, could you do a delta import? In other words, how
do you make our Solr index reflect changes to the DB even
without a schema change? Could you extend that method
to handle your use case?
So the scenario is something like this
I just upgraded to a later version of the trunk and noticed my
geofilter queries stopped working, apparently because the sfilt
function was renamed to geofilt.
I realize trunk is not stable, but other than looking at every change,
is there an easy way to find changes that are not backward compatib
On Thu, Nov 11, 2010 at 8:21 AM, Matteo Moci wrote:
> Hello,
> I'd like to use solr to index some documents coming from an rss feed,
> like the example at [1], but it seems that the configuration used
> there is just for a one-time indexing, trying to get all the articles
> exposed in the rss feed
On Thu, Nov 11, 2010 at 10:35 AM, Solr User wrote:
> Hi,
>
> I have a question about boosting.
>
> I have the following fields in my schema.xml:
>
> 1. title
> 2. description
> 3. ISBN
>
> etc
>
> I want to boost the field title. I tried index time boosting but it did not
> work. I also tried Quer
Hi,
Not sure if this is the correct place to post but I'm looking for someone to
help finish a Solr install on our LAMP based website. This would be a paid
project.
The programmer that started the project got too busy with his full-time job to
finish the project. Solr has been installed
Hello,
Does anyone know where to download the Solr 4.0 source?
I tried downloading from this page:
http://wiki.apache.org/solr/FrontPage#solr_development
but the link is not working...
Best,
Deche
Hi,
I'm restricted to the following with regard to importing.
I have access to a list (Iterator) of Java objects I need to import into Solr.
Can I import the Java objects as part of Solr's data import interface (whenever
an HTTP request to Solr triggers a dataimport, it'll call my Java class to get
Hi, Kouta:
No data store supports rollback AFTER commit; rollback works only
BEFORE.
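A minimal SolrJ sketch of that point (client and dto here are assumed to be an existing SolrClient/SolrServer instance and the bean being indexed):

try {
  client.addBean(dto);   // buffered on the Solr side, not yet committed
  client.commit();       // once this succeeds, rollback can no longer undo it
} catch (Exception e) {
  // rollback() only discards documents added or deleted since the last commit
  if (client != null) client.rollback();
}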
On Friday, November 12, 2010 12:34:18 am Kouta Osabe wrote:
> Hi, all
>
> I have a question about Solr and SolrJ's rollback.
>
> I try to rollback like below
>
> try{
> server.addBean(dto);
> server.c
In some cases you can rollback to a named checkpoint. I am not too sure but
I think I read in the lucene documentation that it supported named
checkpointing.
On Thu, Nov 11, 2010 at 7:12 PM, gengshaoguang wrote:
> Hi, Kouta:
> Any data store does not support rollback AFTER commit, rollback works
Hi,
Pardon me if this sounds very elementary, but I have a very basic question
regarding Solr search. I have about 10 storage devices running Solaris with
hundreds of thousands of text files (there are other files, as well, but my
target is these text files). The directories on the Solaris boxes a
Another question is: can I write my own DataImportHandler class?
thanks,
Tri
From: Tri Nguyen
To: solr user
Sent: Thu, November 11, 2010 7:01:25 PM
Subject: importing from java
Hi,
I'm restricted to the following in regards to importing.
I have access to
http://wiki.apache.org/solr/DIHQuickStart
http://wiki.apache.org/solr/DataImportHandlerFaq
http://wiki.apache.org/solr/DataImportHandler
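Those links cover the DIH route. As a hedged alternative (the class name and URL below are hypothetical), the same Iterator of Java objects can also be pushed straight through SolrJ without touching DIH at all:

import java.util.Iterator;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class ObjectImporter {
  // Pushes any iterator of @Field-annotated beans into Solr.
  public static <T> void importAll(Iterator<T> objects) throws Exception {
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/core1").build();
    try {
      while (objects.hasNext()) {
        client.addBean(objects.next());
      }
      client.commit();
    } finally {
      client.close();
    }
  }
}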
-Original Message-
From: Tri Nguyen [mailto:tringuye...@yahoo.com]
Sent: Thursday, November 11, 2010 9:34 PM
To: solr-user@lucene.apache.org
Subject: R
Oh, Pardeep:
I don't think Lucene is an advanced storage app that supports rollback to a
historical checkpoint (which would be supported only in a distributed system, such
as two-phase commit or transactional web services)
yours
On Friday, November 12, 2010
On 11/11/2010 4:45 PM, Robert Gründler wrote:
So far, I can only think of 2 scenarios for rebuilding the index, if we need to
update the schema after the rollout:
1. Create 3 more cores (A1,B1,C1) - Import the data from the database - After
importing, switch the application to cores A1, B1, C
On 11/11/2010 7:44 PM, Deche Pangestu wrote:
Hello,
Does anyone know where to download solr4.0 source?
I tried downloading from this page:
http://wiki.apache.org/solr/FrontPage#solr_development
but the link is not working...
Your best bet is to use svn.
http://lucene.apache.org/solr/version_con