Thanks.
If I set uniqueKey on the field, then I can save duplicates?
I need to remove duplicates only from search results; the ability to save
duplicates should remain.
2010/7/23 Erick Erickson
> If the field is a single token, just define the uniqueKey on it in your
> schema.
>
> Otherwise, th
Thanks for the info Peter, I think I ran into the same issue some time ago
and could not find out why the backup stopped and also got deleted by Solr.
I decided to stop updates running against Solr while a backup is in progress, and
wrote my own backup handler that simply copies the index files.
Thanks, I saw the article.
As far as I can tell the trunk archives only go back to the middle of March
and the 2 patches are from the beginning of the year.
Thus:
*These approaches can be tried out easily using a single set of sample data
and the Solr example application (assumes current trunk cod
Another possibility could be the well known 'field collapse' ;-)
http://wiki.apache.org/solr/FieldCollapsing
Regards,
Peter.
> Thanks.
>
> If I set uniqueKey on the field, then I can save duplicates?
> I need to remove duplicates only from search results. The ability to save
> duplicates are sho
Thanks.
Does it work with Solr 1.4 (Solr 4.0 mentioned in article)?
What about performance? I only need to delete duplicates (I don't need a count
of duplicates or to select a certain duplicate).
2010/7/23 Peter Karich
> Another possibility could be the well known 'field collapse' ;-)
>
> http://wiki.a
Hi Stephan,
On the iPad, as with the iPhone, I'm afraid you're stuck with using
SQLite if you want any form of database in your app.
I suppose if you wanted to get really ambitious and had a lot of time
on your hands you could use Xcode to try and compile one of the open-
source C-based DB
Hi Pavel!
The patch can be applied to 1.4.
The performance is ok, but for some situations it could be worse than
without the patch.
For us it works well, but others have reported some exceptions
(see the patch site: https://issues.apache.org/jira/browse/SOLR-236)
> I need only to delete duplicates
Co
Hi,
One of the things that we were thinking of doing in order to
speed up results from Solr search is to convert fixed-text fields
(such as values from a drop-down) into numeric fields. The thinking
behind this was that searching through numeric values would be
faster than searching through text
Hi all,
as I saw in this discussion [1], there were many issues with PDF indexing in
Solr 1.4 due to the Tika library (version 0.4).
In Solr 1.4.1 the Tika library is the same, so I guess the issues are the
same.
Could anyone, who contributed to the previous thread, help me in resolving
these issues?
I
Hi,
unfortunately for iPad developers, it seems that it is not possible to
use the Spotlight engine through the SDK:
http://stackoverflow.com/questions/3133678/spotlight-search-in-the-application
Chantal
On Fri, 2010-07-23 at 10:16 +0200, Mark Allan wrote:
> Hi Stephan,
>
> On the iPad, as wit
Thanks, Peter!
I'll try collapsing today.
Example (sorry if the table is unformatted):
id | type   | prop_1 |  | prop_N | folderId
0  | folder |        |  |        |
1  | file   | val1   |  | valN1  | 0
2  | file   | val3   |  |
Hi Everyone
I have a few questions :-)
a) Will the next release of solr be 3.0 (instead of 1.5)?
b) How stable/mature is the current 3x version?
c) Is LocalSolr implemented? where can I find a list of new features?
d) Is this the correct method to download the latest stable version?
svn co htt
I found my problem! It was a bad custom EntityProcessor I wrote.
My EntityProcessor wasn't checking for hasNext() on the Iterator from my
FileImportDataImportHandler, it was just returning next(). The second bug
was that when the Iterator ran out of records it was returning an empty
Map (it now r
On Fri, 23 Jul 2010 14:44:32 +0530
Gora Mohanty wrote:
[...]
> From some experiments, I see only a small difference between a
> text search on a field, and a numeric search on the corresponding
> numeric field.
[...]
Well, I take that back. Running more rigorous tests with Apache
Bench shows a
I don't specify any sort order, and I do request the score, so it is
ordered based on that.
My schema consists of these fields:
(changing now to tdate)
and a typical query would be:
fl=id,type,timestamp,score&start=0&q="Coca+Cola"+pepsi+-"dr+pepper"&fq=timestamp:[2010-07-07T00:00:00Z+T
Pavel,
hopefully I now understand your use case :-) but one question:
> I need to select always *one* file per folder or
> select *only* folders than contains matched files (without files).
What do you mean by 'or' here? Do you have two use cases, or would one of them be
sufficient?
Because the se
Gora,
just out of interest:
does Apache Bench send different queries, or queries from the logs, or always
the same query?
If it were always the same query, the Solr cache would kick in and
make the response time super small.
I would like to find a tool or script where I can send my logfile to Solr
an
On Fri, Jul 23, 2010 at 6:09 AM, Eric Grobler wrote:
> I have a few questions :-)
>
> a) Will the next release of solr be 3.0 (instead of 1.5)?
The next release will be 3.1 (matching the next lucene version off of
the 3x branch).
Trunk is 4.0-dev
> b) How stable/mature is the current 3x version?
Hi,
is there any wiki/url of the proposed changes or new features that we should
expect with this new release?
On Fri, Jul 23, 2010 at 9:20 AM, Yonik Seeley wrote:
> On Fri, Jul 23, 2010 at 6:09 AM, Eric Grobler
> wrote:
> > I have a few questions :-)
> >
> > a) Will the next release of solr be
I've updated the SOLR-792 patch to apply to trunk (using the solr/
directory as the root still, not the higher-level trunk/).
This one I think is an important one that I'd love to see eventually
part of Solr built-in, but the TODO's in TreeFacetComponent ought to
be taken care of first, to g
Hi Erik,
Thanks for the fast update :-)
I will try it soon.
Regards
Eric
On Fri, Jul 23, 2010 at 2:37 PM, Erik Hatcher wrote:
> I've updated the SOLR-792 patch to apply to trunk (using the solr/ directory
> as the root still, not the higher-level trunk/).
>
> This one I think is an important one
On Fri, Jul 23, 2010 at 9:33 AM, robert mena wrote:
> Hi,
> is there any wiki/url of the proposed changes or new features that we should
> expect with this new release?
You can see what has already gone in by looking at the appropriate
CHANGES.txt in subversion.
http://svn.apache.org/viewvc/luce
Hey,
I recently moved a solr app from a testing environment into a production
environment, and I'm seeing a brand new error which never occurred during
testing. I'm seeing this in the SolrJ-based app logs:
org.apache.solr.common.SolrException: com.caucho.vfs.SocketTimeoutException:
client tim
>If I am doing
>facet=on & facet.field={!ex=State}State & fq={!tag=State}State:Karnataka
> All it gives me is facets on State excluding only that filter query. But I
> was not able to do the same at the third level, e.g. facet.field= give me the
> counts of cities in the state Karnataka.
> Let me know s
Hi Erik,
I must be doing something wrong :-(
I took:
svn co https://svn.apache.org/repos/asf/lucene/dev/trunk mytest
then I copied SOLR-792.patch to the folder /mytest/solr
then I ran:
patch -p1 < SOLR-792.patch
but I get "can't find file to patch at input line 5"
Is this the correct trunk and pa
I mean two use cases.
I can't index only folders, because I have other queries on files. Or I
would have to build another index that contains only folders, but then I'd have to
take care of synchronizing folders across the two indexes.
Are range, spatial, etc. queries supported on multivalued fields?
2010/7/23 P
Thanks Mark!
I'm subscribing to the cocoa-dev list.
On Jul 23, 2010, at 10:17 AM, Mark Allan [via Lucene] wrote:
> Hi Stephan,
>
> On the iPad, as with the iPhone, I'm afraid you're stuck with using
> SQLite if you want any form of database in your app.
>
> I suppose if you wanted to get
Hi! I'm a Solr newbie, and I don't understand why autocommits aren't happening
in my Solr installation.
My one server running Solr:
- Ubuntu 10.04 (Lucid Lynx), with all the latest updates.
- Solr 1.4.0 running on Tomcat6
- Installation was done via "apt-get install solr-common solr-tomcat
tomc
> and a typical query would be:
> fl=id,type,timestamp,score&start=0&q="Coca+Cola"+pepsi+-"dr+pepper"&fq=timestamp:[2010-07-07T00:00:00Z+TO+NOW]+AND+(type:x+OR+type:y)&
> rows=2000
My understanding is that this is essentially what the solr 1.4 trie date fields
are made for, I'd use them, should s
I'm in the process of indexing my demo data to test that; I'll have more
valid data on whether or not it made a difference in a few days.
Thanks
On 23/07/2010 at 19:42, "Jonathan Rochkind [via Lucene]" <
ml-node+990234-2085494904-316...@n3.nabble.com> wrote:
> and a typical query would be:
>
On Jul 23, 2010, at 9:37 AM, John DeRosa wrote:
> Hi! I'm a Solr newbie, and I don't understand why autocommits aren't
> happening in my Solr installation.
>
> My one server running Solr:
>
> - Ubuntu 10.04 (Lucid Lynx), with all the latest updates.
> - Solr 1.4.0 running on Tomcat6
> - Install
I need to implement a search engine that will allow users to override
pieces of data and then search against or view that data. For example, a
doc that has the following values:
DocId | Fulltext            | Meta1 | Meta2 | Meta3
1     | The quick brown fox | foo   | foo   | foo
> and a typical query would be:
>
fl=id,type,timestamp,score&start=0&q="Coca+Cola"+pepsi+-"dr+pepper"&fq=timestamp:[2010-07-07T00:00:00Z+TO+NOW]+AND+(type:x+OR+type:y)&
> rows=2000
On top of using trie dates, you might consider separating the timestamp portion
and the type portion of the fq into
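The filter-cache point being made here can be sketched in miniature. This is a plain-Python stand-in, not Solr's real filterCache implementation: it only models the fact that Solr caches one computed DocSet per distinct fq string.

```python
# Toy model of the filterCache: one cached entry per distinct fq string.
cache = {}
computations = []  # records every filter we had to compute from scratch

def filter_for(fq):
    if fq not in cache:
        computations.append(fq)          # simulate computing a DocSet
        cache[fq] = f"docset({fq})"
    return cache[fq]

# Combined fq: every new date range recomputes the type clause too.
filter_for("timestamp:[A TO NOW] AND (type:x OR type:y)")
filter_for("timestamp:[B TO NOW] AND (type:x OR type:y)")
print(len(computations))  # 2 full filters computed, nothing reused

# Separate fq params: the type filter is computed once and then reused.
cache.clear(); computations.clear()
for q in ("timestamp:[A TO NOW]", "type:x OR type:y",
          "timestamp:[B TO NOW]", "type:x OR type:y"):
    filter_for(q)
print(len(computations))  # 3 - the type filter was a cache hit the 2nd time
```

The design point: a filter that appears verbatim as its own fq parameter can be reused across many different queries, while one fused into a larger fq string can only ever match itself.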
Hi HSingh,
Maybe the mapping file I attached to
https://issues.apache.org/jira/browse/SOLR-2013 will help?
Steve
> -Original Message-
> From: HSingh [mailto:hsin...@gmail.com]
> Sent: Thursday, July 22, 2010 11:30 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Novice seeking help t
In org.apache.solr.spelling.SpellingQueryConverter, find the line (#84):
final static String PATTERN = "(?:(?!(" + NMTOKEN + ":|\\d+)))[\\p{L}_\\-0-9]+";
and remove the |\\d+ to make it:
final static String PATTERN = "(?:(?!" + NMTOKEN + ":))[\\p{L}_\\-0-9]+";
My testing shows this solves your
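The effect of dropping `|\d+` can be reproduced with a simplified stand-in for the pattern. Python is used here for brevity, and the `NMTOKEN` below is an ASCII-only approximation (an assumption; the real sub-pattern in SpellingQueryConverter also covers Unicode name characters):

```python
import re

# ASCII-only approximation of the NMTOKEN sub-pattern (assumption:
# the real one also allows Unicode letters).
NMTOKEN = r"[A-Za-z_][A-Za-z_0-9.\-]*"

OLD = r"(?:(?!(" + NMTOKEN + r":|\d+)))[A-Za-z_\-0-9]+"   # skips numeric tokens
NEW = r"(?:(?!" + NMTOKEN + r":))[A-Za-z_\-0-9]+"         # keeps them

def tokens(pattern, text):
    # group(0) is the whole match, ignoring the inner capture group
    return [m.group(0) for m in re.finditer(pattern, text)]

print(tokens(OLD, "holden 1997"))  # ['holden'] - the year is dropped
print(tokens(NEW, "holden 1997"))  # ['holden', '1997']
```

With the `|\d+` alternative in the negative lookahead, the converter never emits purely numeric tokens, so they can't be spell-checked; removing it lets them through while still skipping `field:` prefixes.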
On Fri, 23 Jul 2010 14:33:54 +0200
Peter Karich wrote:
> Gora,
>
> just for my interests:
> does apache bench sends different queries, or from the logs, or
> always the same query?
> If it would be always the same query the cache of solr will come
> and make the response time super small.
Yes,
Hi,
Let's say I have a table with 3 columns: Document Id, Party Value, and Party Type.
In this table I have 3 rows:
Document Id: 1, Party Value: Pramod, Party Type: Client
Document Id: 1, Party Value: Raj, Party Type: Supplier
Document Id: 2, Party Value: Pramod, Party Type: Supplier
I think you just want something like:
p_value:"Pramod" AND p_type:"Supplier"
no?
-Kallin Nagelberg
-Original Message-
From: Pramod Goyal [mailto:pramod.go...@gmail.com]
Sent: Friday, July 23, 2010 2:17 PM
To: solr-user@lucene.apache.org
Subject: help with a schema design problem
Hi,
L
Any pointers on how to sort by reverse index order?
http://search.lucidimagination.com/search/document/4a59ded3966271ca/sort_by_index_order_desc
it seems like it should be easy to do with the function query stuff,
but i'm not sure what to sort by (unless I add a new field for indexed
time)
Any p
Yonik,
why don't we send the output of TermsComponent from every node in the
cluster to a Hadoop instance?
Since TermsComponent does the map part of the map-reduce concept, Hadoop
only needs to do the reduce. Maybe we don't even need Hadoop for this.
After reducing, every node in the cluste
I want to do that. But if I understand correctly, Solr would store the
fields like this:
p_value: "Pramod" "Raj"
p_type: "Client" "Supplier"
When I search
p_value:"Pramod" AND p_type:"Supplier"
it would give me document 1 as a result. That is incorrect, since in document
1 Pramod is a Clien
On Fri, Jul 23, 2010 at 2:23 PM, MitchK wrote:
> why don't we send the output of TermsComponent from every node in the
> cluster to a Hadoop instance?
> Since TermsComponent does the map part of the map-reduce concept, Hadoop
> only needs to do the reduce. Maybe we don't even need Hadoop for
On Jul 23, 2010, at 9:37 AM, John DeRosa wrote:
> Hi! I'm a Solr newbie, and I don't understand why autocommits aren't
> happening in my Solr installation.
>
[snip]
"Never mind"... I have discovered my boneheaded mistake. It's so silly, I wish
I could retract my question from the archives.
... In addition to my previous posting:
To keep this in sync, we could do two things:
wait for every server, to make sure that everyone uses the same values to
compute the score, and then apply them.
Or: let's say we collect the new values every 15 minutes. To merge and
send them over the netwo
That only works if the docs are exactly the same - they may not be.
Ahm, what? Why? If the uniqueID is the same, the docs *should* be the same,
shouldn't they?
--
View this message in context:
http://lucene.472066.n3.nabble.com/a-bug-of-solr-distributed-search-tp983533p990563.html
Sent from the So
With the use case you specified, it should work to just index each "row" as
described in your initial post as a separate document.
This way p_value and p_type each become single-valued, and you get a correct
combination of p_value and p_type.
However, this may not go so well with other use cases yo
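The cross-matching problem above, and why row-per-document fixes it, can be sketched with a toy matcher. This is plain Python, not Solr; the field and value names follow the thread's example:

```python
# A doc matches p_value:X AND p_type:Y whenever X and Y each appear
# *somewhere* in the doc's multivalued fields - not necessarily in the
# same original row. That is the false-positive being discussed.
def matches(doc, value, ptype):
    return value in doc["p_value"] and ptype in doc["p_type"]

# One combined doc with parallel multivalued fields:
combined = {"id": 1, "p_value": ["Pramod", "Raj"], "p_type": ["Client", "Supplier"]}
print(matches(combined, "Pramod", "Supplier"))  # True - a false positive

# Indexing each row as its own document avoids it:
rows = [
    {"doc_id": 1, "p_value": ["Pramod"], "p_type": ["Client"]},
    {"doc_id": 1, "p_value": ["Raj"], "p_type": ["Supplier"]},
]
print(any(matches(r, "Pramod", "Supplier") for r in rows))  # False
```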
Hello,
I have an index with lots of different types of documents. One of those
types basically contains extracts of PDF docs. Some of those PDFs can have
1000+ pages, so there would be a lot of stuff to search through.
I am experiencing really terrible performance when querying. My whole index
h
In my case the document id is the unique key (each row is not a unique
document). So a single document has multiple Party Values and Party Types.
Hence I need to define both Party Value and Party Type as multi-valued. Is
there any way in Solr to say p_value[someIndex]="pramod" and
p_type[someIn
On Fri, Jul 23, 2010 at 2:40 PM, MitchK wrote:
> That only works if the docs are exactly the same - they may not be.
> Ahm, what? Why? If the uniqueID is the same, the docs *should* be the same,
> don't they?
Documents aren't supposed to be duplicated across shards... so the
presence of multiple
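What the distributed merge step does with a duplicated uniqueKey can be sketched like this. It is a simplified stand-in for Solr's actual shard-merging logic, which only models the "one doc per key survives" behaviour:

```python
# Distributed search assumes the uniqueKey is unique across shards.
# When it isn't, the merger keeps one doc per key, so "duplicates" with
# the same id but different bodies silently lose all but one version.
def merge(shard_results):
    seen = {}
    for doc in shard_results:
        seen.setdefault(doc["id"], doc)  # first occurrence wins here
    return list(seen.values())

merged = merge([
    {"id": 1, "body": "a"},   # same id from shard 1...
    {"id": 1, "body": "b"},   # ...and shard 2: body "b" is dropped
    {"id": 2, "body": "c"},
])
print([d["id"] for d in merged])  # [1, 2]
```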
Are you using the same instance of CommonsHttpSolrServer for all the
requests?
I was.
I also tried creating a new instance every x requests, also resetting
the credentials on the new instances, to see if it would make a
difference.
Doing that, I get an exception after several instances of
Looks like you can sort by _docid_ to get things in index order or
reverse index order.
?sort=_docid_ asc
thank you solr!
On Fri, Jul 23, 2010 at 2:23 PM, Ryan McKinley wrote:
> Any pointers on how to sort by reverse index order?
> http://search.lucidimagination.com/search/document/4a59ded3966
Hi, I have an autocomplete that is currently working with an
NGramTokenizer so if I search for "Yo" both "New York" and "Toyota"
are valid results. However I'm trying to figure out how to best
implement the search so that from a score perspective if the string
matches the beginning of an entire fi
: On top of using trie dates, you might consider separating the timestamp
: portion and the type portion of the fq into separate fq parameters --
: that will allow them to be stored in the filter cache separately. So
: for instance, if you include "type:x OR type:y" in queries a lot, but
: w
> Is there any way in Solr to say p_value[someIndex]="pramod"
> and p_type[someIndex]="client"?
No, I'm 99% sure there is not.
> One way would be to define a single field in the schema as p_value_type =
> "client pramod", i.e. combine the values from both fields and store them in a
> single field.
yep, f
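A quick sketch of why the combined field keeps the pairing atomic. Plain Python again; the "type value" token format is just the convention suggested above, not a Solr built-in:

```python
# Each entry glues one row's type and value into a single token, so a
# query can only match pairs that actually occurred together in a row.
doc = {"p_value_type": ["client pramod", "supplier raj"]}

print("client pramod" in doc["p_value_type"])    # True  - a real pairing
print("supplier pramod" in doc["p_value_type"])  # False - never occurred
```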
> > > When i search
> > > p_value:"Pramod" AND p_type:"Supplier"
> > >
> > > it would give me result as document 1. Which is incorrect, since in
> > > document
> > > 1 Pramod is a Client and not a Supplier.
Would it? I would expect it to give you nothing.
-Kal
-Original Message-
From:
just wanted to mention another possible route, which might be entirely
hypothetical :-)
*If* you could query on the internal docid (I'm not sure whether it's available
out-of-the-box, or if you can at all),
your original problem, quoted below, could IMO be simplified to asking for
the last docid inserted
Hi Steve, This is extremely helpful! What is the best way to also
preserve/append the diacritics in the index in case someone searches using
them? I deeply appreciate your help!
Multiple rows in the OP's example are combined to form one Solr document (e.g.
rows 1 and 2 both have documentid=1).
Because of this combining, it would match p_value from row 1 with p_type from
row 2 (or vice versa).
2010/7/23 Nagelberg, Kallin
> > > > When i search
> > > > p_value:"Pramod" AND p_type:"
> I am not sure why some commits take very long time.
Hmm... Because it merges index segments... How large is your index?
> Also is there a way to reduce the time it takes?
You can disable commit in the DIH call and use autoCommit instead. It's
kind of a hack because you postpone the commit operation and ma
> having multiple Request Handlers will not degrade the performance
IMO you shouldn't worry unless you have hundreds of them
On 7/23/10 5:59 PM, Alexey Serba wrote:
> Another option is to set optimize=false in DIH call ( it's true by
> default ).
Ouch - that should really be changed then.
- Mark
Do you use highlighting? ( http://wiki.apache.org/solr/HighlightingParameters )
Try to disable it and compare performance.
On Fri, Jul 23, 2010 at 10:52 PM, ahammad wrote:
>
> Hello,
>
> I have an index with lots of different types of documents. One of those
> types basically contains extracts o
For the sake of any future googlers I'll report my own clueless but
thankfully brief struggle with autocommit.
There are two parts to the story: Part One is where I realize my
autocommit config was not contained within my <updateHandler> element. In
Part Two I realized I had typed "<autocommit>" rather than
"<autoCommit>".
--jay
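For those same future googlers, a minimal sketch of where the element belongs in solrconfig.xml (Solr 1.4-era syntax; the threshold numbers are placeholders, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- note the camelCase: autoCommit, not autocommit -->
  <autoCommit>
    <maxDocs>1000</maxDocs>   <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```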
--jay
On Fri, Jul 23, 2010 a
: > Is there any way in solr to say p_value[someIndex]="pramod"
: And p_type[someIndex]="client".
: No, I'm 99% sure there is not.
it's possible in code, by utilizing positions and FieldMaskingSpanQuery...
http://lucene.apache.org/java/2_9_0/api/all/org/apache/lucene/search/spans/FieldMaskingSpan
We have been having problems with SOLR on one project lately. Forgive
me for writing a novel here but it's really important that we identify
the root cause of this issue. It is becoming unavailable at random
intervals, and the problem appears to be memory related. There are
basically two
I'll see you, and raise. My solrconfig.xml wasn't being copied to the server by
the deployment script.
On Jul 23, 2010, at 3:26 PM, Jay Luker wrote:
> For the sake of any future googlers I'll report my own clueless but
> thankfully brief struggle with autocommit.
>
> There are two parts to the
Hello,
I am new to Solr/Lucene and I am evaluating whether they suit my needs and
can replace our in-house system.
Our requirements:
1. I have many documents (1M)
2. Each document contains text ranging from a few KB to a few MB
3. I want to search for a keyword, search through all these documents,
and it r