On 11/11/2010 7:44 PM, Deche Pangestu wrote:
Hello,
Does anyone know where to download solr4.0 source?
I tried downloading from this page:
http://wiki.apache.org/solr/FrontPage#solr_development
but the link is not working...
Your best bet is to use svn.
http://lucene.apache.org/solr/version_con
On 11/11/2010 4:45 PM, Robert Gründler wrote:
So far, i can only think of 2 scenarios for rebuilding the index, if we need to
update the schema after the rollout:
1. Create 3 more cores (A1,B1,C1) - Import the data from the database - After
importing, switch the application to cores A1, B1, C1
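For the switch in step 1, the CoreAdmin SWAP action can exchange a rebuilt core
with the live one atomically, without a restart. A minimal sketch in Python
(host, port, and core names are placeholders):

```python
from urllib.parse import urlencode

def swap_core_url(solr_base, live_core, rebuilt_core):
    """Build a CoreAdmin SWAP request; SWAP atomically exchanges the
    two cores, so the live name now points at the rebuilt index."""
    params = urlencode({"action": "SWAP", "core": live_core, "other": rebuilt_core})
    return f"{solr_base}/admin/cores?{params}"

# e.g. make core name "A" serve the freshly imported index of "A1":
url = swap_core_url("http://localhost:8983/solr", "A", "A1")
# urllib.request.urlopen(url)  # run against a live Solr to perform the swap
```

After the swap the old index is still there under the other name, so you can
swap back if the new schema misbehaves.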
Hi,
Not sure if this is the correct place to post but I'm looking for someone to
help finish a Solr install on our LAMP based website. This would be a paid
project.
The programmer that started the project got too busy with his full-time job to
finish the project. Solr has been installed a
Oh, Pardeep:
I don't think lucene is an advanced storage app that supports rollback to a
history checkpoint (which would be supported only in a distributed system, such
as two-phase commit or transactional web services)
yours
On Friday, November 12, 2010
http://wiki.apache.org/solr/DIHQuickStart
http://wiki.apache.org/solr/DataImportHandlerFaq
http://wiki.apache.org/solr/DataImportHandler
-----Original Message-----
From: Tri Nguyen [mailto:tringuye...@yahoo.com]
Sent: Thursday, November 11, 2010 9:34 PM
To: solr-user@lucene.apache.org
Subject: R
another question is, can I write my own DataImportHandler class?
thanks,
Tri
From: Tri Nguyen
To: solr user
Sent: Thu, November 11, 2010 7:01:25 PM
Subject: importing from java
Hi,
I'm restricted to the following in regards to importing.
I have access to
Hi,
Pardon me if this sounds very elementary, but I have a very basic question
regarding Solr search. I have about 10 storage devices running Solaris with
hundreds of thousands of text files (there are other files, as well, but my
target is these text files). The directories on the Solaris boxes a
In some cases you can rollback to a named checkpoint. I am not too sure but
I think I read in the lucene documentation that it supported named
checkpointing.
On Thu, Nov 11, 2010 at 7:12 PM, gengshaoguang wrote:
> Hi, Kouta:
> No data store supports rollback AFTER a commit; rollback works
Hi, Kouta:
No data store supports rollback AFTER a commit; rollback works only
BEFORE.
On Friday, November 12, 2010 12:34:18 am Kouta Osabe wrote:
> Hi, all
>
> I have a question about Solr and SolrJ's rollback.
>
> I try to rollback like below
>
> try{
> server.addBean(dto);
> server.c
Hi,
I'm restricted to the following in regards to importing.
I have access to a list (Iterator) of Java objects I need to import into solr.
Can I import the java objects as part of solr's data import interface (whenever
an http request to solr to do a dataimport, it'll call my java class to get
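If DIH turns out to be a poor fit, one fallback is to render the objects
yourself and POST them to Solr's /update endpoint. A rough sketch in Python of
the XML message format (the field names here are invented for illustration):

```python
from xml.sax.saxutils import escape

def to_add_xml(objects):
    """Render an iterable of dicts (one per Java object) as the XML
    message Solr's /update handler accepts."""
    docs = []
    for obj in objects:
        fields = "".join(
            f'<field name="{name}">{escape(str(value))}</field>'
            for name, value in obj.items()
        )
        docs.append(f"<doc>{fields}</doc>")
    return "<add>" + "".join(docs) + "</add>"

payload = to_add_xml([{"id": "1", "title": "iphone user guide"}])
# POST payload to http://localhost:8983/solr/update with
# Content-Type: text/xml, then send <commit/> the same way.
```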
Hello,
Does anyone know where to download solr4.0 source?
I tried downloading from this page:
http://wiki.apache.org/solr/FrontPage#solr_development
but the link is not working...
Best,
Deche
Hi,
Not sure if this is the correct place to post but I'm looking for someone to
help finish a Solr install on our LAMP based website. This would be a paid
project.
The programmer that started the project got too busy with his full-time job to
finish the project. Solr has been installed
On Thu, Nov 11, 2010 at 10:35 AM, Solr User wrote:
> Hi,
>
> I have a question about boosting.
>
> I have the following fields in my schema.xml:
>
> 1. title
> 2. description
> 3. ISBN
>
> etc
>
> I want to boost the field title. I tried index time boosting but it did not
> work. I also tried Quer
On Thu, Nov 11, 2010 at 8:21 AM, Matteo Moci wrote:
> Hello,
> I'd like to use solr to index some documents coming from an rss feed,
> like the example at [1], but it seems that the configuration used
> there is just for a one-time indexing, trying to get all the articles
> exposed in the rss feed
I just upgraded to a later version of the trunk and noticed my
geofilter queries stopped working, apparently because the sfilt
function was renamed to geofilt.
I realize trunk is not stable, but other than looking at every change,
is there an easy way to find changes that are not backward compatib
If by "corrupt index" you mean an index that's just not quite
up to date, could you do a delta import? In other words, how
do you make our Solr index reflect changes to the DB even
without a schema change? Could you extend that method
to handle your use case?
So the scenario is something like this
You can do a similar thing to your case #1 with Solr replication,
handling a lot of the details for you instead of you manually switching
cores and such. Index to a new core, then tell your production solr to
be a slave replicating from that master new core. It still may have some
of the same d
Hi again,
we're coming closer to the rollout of our newly created solr/lucene based
search, and i'm wondering
how people handle changes to their schema on live systems.
In our case, we have 3 cores (ie. A,B,C), where the largest one takes about 1.5
hours for a full dataimport from the relation
Hi,
If you are looking for query time boosting on title field you can do
the following:
/select?q=title:android^10
Also unless you have a very good reason to use string for date data
(in your case pubdate and reldate), you should be using
solr.DateField.
regards,
Ram
On Fri, Nov 12, 2010 at 3:41
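If the boost should apply across several fields rather than as a one-off in q,
the dismax parser's qf parameter is the usual route. A sketch of the request
parameters in Python (the weights are made up; tune them against your own
relevance tests):

```python
from urllib.parse import urlencode

# per-field query-time boosts via the dismax query parser
params = urlencode({
    "defType": "dismax",
    "q": "android",
    "qf": "title^10 description^2 ISBN",
})
# append to http://<host>:8983/solr/select? on your instance
```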
Without the parens, the "edgytext:" only applied to "Mr"; the default
field still applied to "Scorsese".
The double quotes are necessary in the second case (rather than parens)
because, on a non-tokenized field, the standard query parser will
"pre-tokenize" on whitespace before sending
>
> Did you run your query without using () and "" operators? If yes can you try
> this?
> &q=edgytext:(Mr Scorsese) OR edgytext2:"Mr Scorsese"^2.0
I didn't use () and "" in my query before. Using the query with those operators
works now, stopwords are thrown out as they should be, thanks.
However,
On 11.11.2010, at 17:42, Erick Erickson wrote:
> I don't know all the implications here, but can't you just
> insert the StopwordFilterFactory before the ShingleFilterFactory
> and turn it loose?
haven't tried this, but i suspect that i would then get in trouble with
stuff like "united st
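That suspicion is easy to reproduce with a toy model of StopFilter followed by
a two-word ShingleFilter (plain Python, illustrative only; Solr's real
ShingleFilter also tracks position gaps, so the details differ):

```python
def shingles_after_stopwords(text, stopwords):
    """Toy model of StopFilter then a 2-word ShingleFilter: drop
    stopwords, then pair up the surviving neighbouring tokens."""
    tokens = [t for t in text.lower().split() if t not in stopwords]
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]

# removing "of" fabricates the shingle "states america", which never
# occurs in the original phrase:
shingles_after_stopwords("united states of america", {"of"})
```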
(10/11/12 1:49), Kumar Pandey wrote:
I am exploring support for Japanese language in solr.
Solr seems to provide CJKTokenizerFactory.
How useful is this module? Has anyone been using this in production for
Japanese language?
CJKTokenizer is used in a lot of places in Japan.
One shortfall it s
I don't know all the implications here, but can't you just
insert the StopwordFilterFactory before the ShingleFilterFactory
and turn it loose?
Best
Erick
On Thu, Nov 11, 2010 at 4:02 PM, Lukas Kahwe Smith wrote:
> Hi,
>
> I am using a facet.prefix search with shingles in my autosuggest:
> p
There are several mistakes in your approach:
copyField just copies data. Index time boost is not copied.
There is no such boosting syntax. /select?q=Each&title^9&fl=score
You are searching on your default field.
This is not the cause of your problem, but omitNorms="true" disables index time
b
Eric,
Thank you so much for the reply and apologize for not providing all the
details.
The following are the field definitons in my schema.xml:
Copy Fields:
searchFields
Before creating the indexes I feed XML
There's not much to go on here. Boosting works,
and index time as opposed to query time boosting
addresses two different needs. Could you add some
detail? All you've really said is "it didn't work", which
doesn't allow a very constructive response.
Perhaps you could review:
http://wiki.apache.org/
Ah I see. Thanks for the explanation.
Could you set the defaultOperator to "AND"? That way both "Bill" and "Cl" must
be a match and that would exclude "Clyde Phillips".
--- On Thu, 11/11/10, Robert Gründler wrote:
> From: Robert Gründler
> Subject: Re: EdgeNGram relevancy
> To: solr-user@luc
> select?q=*:*&fq=title:(+lowe')&debugQuery=on&rows=0
> >
> > "wildcard queries are not analyzed" http://search-lucene.com/m/pnmlH14o6eM1/
> >
>
> Yeah I found out about this a couple of minutes after I
> posted my problem. If there is no analyzer then
> why is Solr not finding any documents whe
Hi,
I am using a facet.prefix search with shingles in my autosuggest:
Now I would like to prevent stop words from appearing in the suggestions:
[suggestion terms stripped by the archive; the six facet counts were 52, 6, 6, 5, 25, 7]
Here I would like to filter out the last 4 suggestions really. Is there a way I
On 2010-11-11, at 3:45 PM, Ahmet Arslan wrote:
>> I'm having some trouble with a query using some wildcard
>> and I was wondering if anyone could tell me why these two
>> similar queries do not return the same number of results.
>> Basically, the query I'm making should return all docs whose
>> t
We're holding a free webinar on migration from FAST to Solr. Details below.
-Yonik
http://www.lucidimagination.com
Solr To The Rescue: Successful Migration From FAST ESP to Open Source
Search Based on Apache Solr
Thur
according to the fieldtype i posted previously, i think it's because of:
1. WhiteSpaceTokenizer splits the String "Clyde Phillips" into 2 tokens:
"Clyde" and "Phillips"
2. EdgeNGramFilter gets the 2 tokens, and creates an EdgeNGram for each token:
"C" "Cl" "Cly" ... AND "P" "Ph" "Phi" ...
Th
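This mechanism can be checked with a toy re-implementation of the two steps
(plain Python, illustrative only; it ignores lowercasing and position details):

```python
def edge_ngrams(token, min_len=1, max_len=10):
    """Front-edge n-grams of one token, like EdgeNGramFilter."""
    return [token[:n] for n in range(min_len, min(len(token), max_len) + 1)]

def index_grams(text):
    """Whitespace-tokenize, then expand every token to its edge n-grams."""
    grams = set()
    for tok in text.split():
        grams.update(edge_ngrams(tok))
    return grams

# "Clyde" contributes the gram "Cl", which equals the second query
# term of "Bill Cl" -- hence the surprising hit:
"Cl" in index_grams("Clyde Phillips")
```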
> I'm having some trouble with a query using some wildcard
> and I was wondering if anyone could tell me why these two
> similar queries do not return the same number of results.
> Basically, the query I'm making should return all docs whose
> title starts
> (or contain) the string "lowe'". I suspe
Could anyone help me understand why "Clyde Phillips" appears in the
results for "Bill Cl"?
"Clyde Phillips" doesn't produce any EdgeNGram that would match "Bill Cl", so
why is it even in the results?
Thanks.
--- On Thu, 11/11/10, Ahmet Arslan wrote:
> You can add an additional field, w
I look forward to the answers to this one.
Dennis Gearon
Signature Warning
It is always a good idea to learn from your own mistakes. It is usually a
better
idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com
On 12 Nov 2010, at 01:46, Ahmet Arslan wrote:
>> This setup now makes troubles regarding StopWords, here's
>> an example:
>>
>> Let's say the index contains 2 Strings: "Mr Martin
>> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
>> list.
>>
>> Query: edgytext:Mr Scorsese OR edgytext2
this is the full source code, but be warned, i'm not a java developer, and i
have no background in lucene/solr development:
// ConcatFilter
import java.io.IOException;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
Thanks Robert, I had been trying to get your ConcatFilter to work, but I'm not
sure what i need in the classpath and where Token comes from.
Will check the thread you mention.
Best
Nick
On 11 Nov 2010, at 18:13, Robert Gründler wrote:
> I've posted a ConcatFilter in my previous mail which does
My Solr corpus is currently created by indexing metadata from a
relational database as well as content pointed to by URLs from the
database. I'm using a pretty generic out of the box Solr schema. The
search results are presented via an AJAX enabled HTML page.
When I perform a search the docu
Hi,
I cannot find out how this is occurring:
Nolosearch/com/search/apachesolr_search/law
You can see that the John Paul Stevens result yields more description in the
search result because of the keyword relevancy, whereas, the other results
just give you a snippet of the title ba
> This setup now makes troubles regarding StopWords, here's
> an example:
>
> Let's say the index contains 2 Strings: "Mr Martin
> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
> list.
>
> Query: edgytext:Mr Scorsese OR edgytext2:Mr Scorsese^2.0
>
> This way, the only result i get is
Hello All.
My first time post so be kind. Developing a document store with lots and lots
of very small documents. (200 million at the moment. Final size will probably
be double this at 400 million documents). This is Proof of concept development
so we are seeing what a single node can do for us
I've posted a ConcatFilter in my previous mail which does concatenate tokens.
This works fine, but i
realized that what i wanted to achieve is implemented easier in another way (by
using 2 separate field types).
Have a look at a previous mail i wrote to the list and the reply from Ahmet
Arslan (
thanks a lot, that setup works pretty well now.
the only problem now is that the StopWords do not work that well anymore. I'll
provide an example, but first the 2 fieldtypes:
Are you storing the upload_by and business fields? You will not be able to
retrieve a field from your index if it is not stored. Check that you have
stored="true" for both of those fields.
- Paige
On Thu, Nov 11, 2010 at 10:23 AM, gauravshetti wrote:
>
> I am facing this weird issue in facet fie
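For reference, the declarations Paige describes would look something like this
in schema.xml (the string field type here is an assumption):

```xml
<field name="upload_by" type="string" indexed="true" stored="true"/>
<field name="business"  type="string" indexed="true" stored="true"/>
```

Without stored="true" the values remain searchable but are never returned in
results, and a reindex is needed after changing the attribute.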
You can add an additional field, with using KeywordTokenizerFactory instead of
WhitespaceTokenizerFactory. And query both these fields with an OR operator.
edgytext:(Bill Cl) OR edgytext2:"Bill Cl"
You can even apply boost so that begins with matches comes first.
--- On Thu, 11/11/10, Robert G
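A tiny helper showing how the combined query from this suggestion could be
assembled (illustrative Python string-building, not a Solr API):

```python
def autocomplete_query(user_input, boost=2.0):
    """edgytext matches per-token prefixes; the quoted edgytext2 clause
    (KeywordTokenizer field) is boosted so begins-with hits rank first."""
    return f'edgytext:({user_input}) OR edgytext2:"{user_input}"^{boost}'

autocomplete_query("Bill Cl")
# 'edgytext:(Bill Cl) OR edgytext2:"Bill Cl"^2.0'
```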
I am exploring support for Japanese language in solr.
Solr seems to provide CJKTokenizerFactory.
How useful is this module? Has anyone been using this in production for
Japanese language?
One shortfall it seems to have from what I have been able to read up on is
that it can generate a lot of false m
What you say is true. Solr is not an RDBMS.
Kouta Osabe wrote:
Hi, all
I have a question about Solr and SolrJ's rollback.
I try to rollback like below
try{
server.addBean(dto);
server.commit();
}catch(Exception e){
if (server != null) { server.rollback();}
}
I wonder if any Exception thrown,
Hi, all
I have a question about Solr and SolrJ's rollback.
I try to rollback like below
try{
server.addBean(dto);
server.commit();
}catch(Exception e){
if (server != null) { server.rollback();}
}
I assume that if any Exception is thrown, the "rollback" process is run, so no
data would be updated.
but
Hi Robert, All,
I have a similar problem, here is my fieldType,
http://paste.pocoo.org/show/289910/
I want to include stopword removal and lowercase the incoming terms. The idea
being to take, "Foo Bar Baz Ltd" and turn it into "foobarbaz" for the EdgeNgram
filter factory.
If anyone can tell me
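The intended analysis chain, mocked in plain Python (treating "Ltd" as a
stopword is an assumption taken from the example):

```python
def concat_for_edgengram(text, stopwords):
    """Lowercase, drop stopwords, then glue the remaining tokens into a
    single term to feed the EdgeNGram filter."""
    kept = [t for t in text.lower().split() if t not in stopwords]
    return "".join(kept)

concat_for_edgengram("Foo Bar Baz Ltd", {"ltd"})  # "foobarbaz"
```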
I've noticed that using camelCase in field names causes problems.
On 11/5/2010 11:02 AM, Will Milspec wrote:
Hi all,
we're moving from an old lucene version to solr and plan to use the "Copy
Field" functionality. Previously we had "rolled our own" implementation,
sticking title, description,
No - in reading what you just wrote, and what you originally wrote, I think
the misunderstanding was mine, based on the architecture of my code. In my
code, it is our 'server' level that does the SolrJ indexing calls, but you
meant 'server' to be the Solr instance, and what you mean by 'client' i
Hi,
consider the following fieldtype (used for autocompletion):
This works fine as long as the query string is a single word. For multiple
words, the ranking is weird though.
Example:
Que
I'm going down the route of patching nutch so I can use this ParseMetaTags
plugin:
https://issues.apache.org/jira/browse/NUTCH-809
Also wondering whether I will be able to use the XMLParser to allow me to
parse well formed XHTML, using xpath would be bonus:
https://issues.apache.org/jira/browse/N
Hi,
Maybe I just don't understand the whole concept there and I mix up server and
client...
Client - The place where I make the http calls (for index, search etc.) -
where I use the CommonsHttpSolrServer as the solr server. This machine isn't
defined as master or slave, it just uses solr as a search en
Hmmm. Maybe you need to define what you mean by 'server' and what you mean
by 'client'.
--
View this message in context:
http://lucene.472066.n3.nabble.com/solr-dynamic-core-creation-tp1867705p1883238.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi All,
I'm having some trouble with a query using some wildcard and I was wondering if
anyone could tell me why these two
similar queries do not return the same number of results. Basically, the query
I'm making should return all docs whose title starts
(or contain) the string "lowe'". I suspe
Hi,
Thanks for the offers, I'll take deeper look into them.
In the offers you showed me, if I understand correctly, the call for
creation is done on the client side. I need the mechanism to work on the
server side.
I know it sounds stupid, but I need the client side not to know about
which
Hi,
I have a question about boosting.
I have the following fields in my schema.xml:
1. title
2. description
3. ISBN
etc
I want to boost the field title. I tried index time boosting but it did not
work. I also tried Query time boosting but with no luck.
Can someone help me on how to implement
I am facing this weird issue in facet fields.
Within the config xml, I have defined the fl as:
file_id folder_id display_name file_name priority_text content_type
last_upload upload_by business indexed
But my output xml doesn't contain the elements upload_by and business.
But i
Hi, nizan. I didn't realize that just replying to a thread from my email
client wouldn't get back to you. Here's some info on this thread since your
original post:
On Nov 10, 2010, at 12:30pm, Bob Sandiford wrote:
> Why not use replication? Call it inexperience...
>
> We're really early into
@Jerry Li
What version of Solr were you using? And was there any
data in the new field? I have no problems here with a quick
test I ran on trunk...
Best
Erick
On Thu, Nov 11, 2010 at 1:37 AM, Jerry Li | 李宗杰 wrote:
> but if I use this field to do sorting, an error occurs
> and th
Hi,
I use solr 1.3 with patch for parsing rich documents, and when uploading
for example pdf file, only thing I see in solr.log is following:
INFO: [] webapp=/solr path=/update/rich
params={id=250&stream.type=pdf&fieldnames=id,name&commit=true&stream.fieldname=body&name=iphone+user+guide+pdf+
Does anyone know what technology they are using: http://www.indextank.com/
Is it Lucene under the hood?
Thanks, and apologies for cross-posting.
-Glen
http://zzzoot.blogspot.com
--
-
Hello,
I'd like to use solr to index some documents coming from an rss feed,
like the example at [1], but it seems that the configuration used
there is just for a one-time indexing, trying to get all the articles
exposed in the rss feed of the website.
Is it possible to manage and index just the n
Hi,
I am trying to index documents (PDF, Doc, XLS, RTF) using the
ExtractingRequestHandler.
I am following the tutorial at
http://wiki.apache.org/solr/ExtractingRequestHandler
But when i run the following command
*curl
"http://localhost:8983/solr/update/extract?literal.id=mydoc.doc&uprefix=
Hi! Sorry for such a break, but I was moving house... anyway:
1. I took the
~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
file and modified it (named as StempelFilterFactory.java) in Vim that
way:
package org.getopt.solr.analysis;
import org.apache.lucene.analysis.T
Jonathan,
thanks for your statement. In fact, you are quite right: A lot of people
developed great caching mechanisms.
However, the solution I got in mind was something like an HTTP-Cache - in
most cases on the same box.
I talked to some experts who told me that Squid would be a relatively large
Does anyone have any idea on how to do this?
--
View this message in context:
http://lucene.472066.n3.nabble.com/solr-dynamic-core-creation-tp1867705p1881374.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
Has anyone gotten solr to schedule data imports at a certain time interval
through configuring solr?
I tried setting interval=1, which should import every minute, but I don't see
it happening.
I'm trying to avoid cron jobs.
Thanks,
Tri
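As far as I know DIH has no built-in scheduler in released versions, so
something outside Solr has to poll the command URL. A minimal polling sketch
in Python (the URL layout assumes a handler registered at /dataimport):

```python
import time
from urllib.parse import urlencode

def import_url(solr_base, command="delta-import"):
    """Build the DataImportHandler command URL."""
    return f"{solr_base}/dataimport?" + urlencode({"command": command})

def poll_imports(solr_base, minutes):
    """Fire a delta-import, then sleep, forever; run as its own process."""
    while True:
        # urllib.request.urlopen(import_url(solr_base))  # needs a live Solr
        time.sleep(minutes * 60)
```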