Is startTimeISO single- or multi-valued? (In other words, do you add a new
value to startTimeISO every time a document is observed, or just the first
time?)
The idea is to clarify the data you store in the index. So what does your
document look like:
id:XXX-YYY-ZZZ
name: theName
startTimeISO:[2015-01
Every entry in the document has a username, a startTimeISO and a uuid (which
is not the startTimeISO).
So every record has a startTimeISO, which is the time when the username was
seen.
The document looks like this:
{
  Uuid: xxx
  StartTimeISO: 2015-01-18T00:00:00.000Z
  Username: abc
}
There are multiple recor
Maybe this is because of the "<" sign.
Encode it and try again.
len must be <= 32767; got 35680
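If a stray "<" in the request really is the suspect, percent-encoding the value before it goes into the URL is a quick check; a minimal sketch with Python's standard library (the query fragment is made up):

```python
from urllib.parse import quote

# Percent-encode a raw query value so characters like "<" survive the URL.
raw = "price:<100"               # hypothetical query fragment containing "<"
encoded = quote(raw, safe="")    # every reserved character gets escaped
print(encoded)                   # price%3A%3C100
```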
On Tue, Jan 13, 2015 at 7:51 PM, Dan Davis wrote:
> The suggester is not working for me with Solr 4.10.2
>
> Can anyone shed light over why I might be getting the exception below when
> I build the
Ok,
Thus, as commented before, in case your startTimeISO is single-valued you
only need to add the range clause:
startTimeISO:["2015-01-19T00:00:00.000Z" TO "2015-01-20T00:00:00.000Z"].
There is no need to add both NOT A AND B, as the documents that satisfy B
will automatically satisfy A.
If you q
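A sketch of how that range-only filter query can be assembled and URL-encoded with Python's standard library (the surrounding select parameters are assumptions, not taken from the thread):

```python
from urllib.parse import urlencode

# The range-only filter query; no NOT clause is needed because any document
# in the narrower range also satisfies the wider one.
fq = 'startTimeISO:["2015-01-19T00:00:00.000Z" TO "2015-01-20T00:00:00.000Z"]'
params = urlencode({"q": "*:*", "fq": fq, "wt": "json"})
# Append the result to .../solr/<core>/select?
print(params)
```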
I am not querying for specific usernames.
Each day, there will be many usernames observed at different times.
But there might be some usernames that were never seen in the last 30 days,
but they were observed today.
That is the main challenge I am having.
How to identify which usernames from tod
Hi Harish,
What I was asking in my previous mail was that you try (yourself) to
understand your data using specific queries. Apart from that, remember that
faceting is done over indexed data, thus if you have two documents with nameA
as "user A" and nameB as "user B", and they are tokenized you
http://stackoverflow.com/questions/10805117/solr-transaction-management-using-solrj
Is it true, that a SolrServer-instance denotes a "transaction context"?
Say I have two concurrent threads, each having a SolrServer-instance "pointing"
to the same core. Then each thread can add/update/delete do
Does the TermsComponent (/terms) have something like buildOnCommit ? Or is it
always up-to-date (<- my unit tests deny this)?
Hi,
Deleted terms could confuse you. A commit with expungeDeletes, or an
optimize, will purge deleted terms.
Ahmet
On Tuesday, January 20, 2015 1:03 PM, Clemens Wyss DEV
wrote:
Does the TermsComponent (/terms) have something like buildOnCommit ? Or is it
always up-to-date (<- my unit tests deny t
> Deleted terms could confuse you
they do ;)
>commit with expunge deletes
How is this done?
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID]
Sent: Tuesday, 20 January 2015 12:22
To: solr-user@lucene.apache.org
Subject: Re: TermsComponent, buildOnCommit?
Hi,
I am working on Solr 4.10.2. I have run into a *performance issue*: I have
indexed 600MB of data on 4 shards with a single replica each. I have defined
2 fields (ngram and frequency). I have removed the ID field and replaced it
with the ngram field. Therefore, search perfor
I already replied to you on Stack Overflow, but your response there and the
schema.xml definition here contradict each other.
You are using a textSpell field, which is tokenized, as a unique key. As I
mentioned on Stack Overflow, that is a bad idea. Yes, it will impact
performance as well as lead
Hi,
curl 'http://localhost:8983/solr/core/update?commit=true&expungeDeletes=true'
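The same request can be assembled from code; a minimal sketch (the core name is a placeholder) that builds the update URL, which also shows why the "&" needs quoting when the URL is typed into a shell:

```python
from urllib.parse import urlencode

# Build the update-handler URL; in an unquoted shell command the "&"
# would split the command instead of reaching Solr.
base = "http://localhost:8983/solr/core/update"  # placeholder core name
params = urlencode({"commit": "true", "expungeDeletes": "true"})
url = base + "?" + params
print(url)
```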
ahmet
On Tuesday, January 20, 2015 1:51 PM, Clemens Wyss DEV
wrote:
> Deleted terms could confuse you
they do ;)
>commit with expunge deletes
How is this done?
-Original Message-
From: Ahmet Arsla
Thx, but sorry for asking:
what is the SolrJ corresponding command?
SolrServer#commit()
SolrServer#commit(boolean waitFlush, boolean waitSearcher)
SolrServer#commit(boolean waitFlush, boolean waitSearcher, boolean softCommit)
-Original Message-
From: Ahmet Arslan [mailto:iori
Hi Clemes,
Please see https://issues.apache.org/jira/browse/SOLR-1487 for a solrJ
workaround.
Ahmet
On Tuesday, January 20, 2015 2:22 PM, Clemens Wyss DEV
wrote:
Thx, but sorry for asking:
what is the SolrJ corresponding command?
SolrServer#commit()
SolrServer#commit(boolean waitFlush, b
Great!
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID]
Sent: Tuesday, 20 January 2015 13:25
To: solr-user@lucene.apache.org
Subject: Re: Re: Re: TermsComponent, buildOnCommit?
Hi Clemes,
Please see https://issues.apache.org/jira/browse/SOLR-1487 for a
On 1/20/2015 5:18 AM, Clemens Wyss DEV wrote:
http://stackoverflow.com/questions/10805117/solr-transaction-management-using-solrj
Is it true, that a SolrServer-instance denotes a "transaction context"?
Say I have two concurrent threads, each having a SolrServer-instance "pointing" to the
same c
Hi all,
We've been discussing a way of implementing a federated search by
leveraging the distributed query parts of SolrCloud. I've written this up
at
http://www.flax.co.uk/blog/2015/01/20/solr-superclusters-for-improved-federated-search/
and would welcome any comments or feedback. So far, two com
Thanks Mike,
> but a key difference is that when one client commits, all clients will see
> the updates
That's ok.
What about the <autoCommit> setting(s) in solrconfig.xml? Doesn't this mean
that after adding x documents (or after a certain timeframe), the changes are
committed and hence can no longer be rolled back?
Thanks, and sorry about Stack Overflow. You are saying to use the "string"
type. But I have used filter = solr.ShingleFilterFactory to break a string
into ngrams.
I want to build query correction just like google is doing - "Did you
mean".
i) I am storing ngrams into gram field and have only single this
Hi
I did some performance tests, and I wanted to know if anyone has seen the
same behavior.
We need to get 1K documents out of 100M documents each time we query Solr and
send them to text analysis.
The first configuration had 8 shards on one RAID (Disk F); we got the 1K in
around 15 seconds.
Second conf
It sounds like your app needs a lot more RAM so that it is not doing so
much I/O.
-- Jack Krupansky
On Tue, Jan 20, 2015 at 9:24 AM, Nimrod Cohen wrote:
> Hi
>
> I done some performance test, and I wanted to know if any one saw the same
> behavior.
>
>
>
> We need to get 1K documents out of 100
Hey Nimrod,
Nice try. I just want to know whether these 8 shards are each on a different
system, or did you implement sharding on a single system with each shard on a
different port?
On Tue, Jan 20, 2015 at 7:54 PM, Nimrod Cohen wrote:
> Hi
>
> I done some performance test, and I wanted to know if any o
Hello Charlie,
theoretically, things may work as you describe them. A few big
HOWEVERs exist as far as I can see:
1. Attributes: as different organisations may use different schemata
(document attributes), the consolidation of results from multiple
sources may present a problem. This may not ari
Hi
All shards are on the same system; each one uses a different port.
BTW
Data size is about 1T, memory is 192G.
NIMROD COHEN
Software Engineer
RTI
(T) +972 (9) 775-3668
(M) +972 (0) 52-5522901
nimrod.co...@nice.com
www.nice.com
-Original Message-
From: Nitin Solanki [mailto:nitinml...@
Hi all,
I have a cluster with 36 shards and 3 replicas per shard. I had to recently
restart the entire cluster - most of the shards & replicas are back up - but
a few shards have not had any leaders for a long long time (close to 18 hours
now) - I tried reloading these cores and even the servlet conta
Yes -- autoCommit works just the same as if you had a timer in your app
committing. You have to turn it off if you want to maintain the ability
to roll back predictably.
-Mike
On 01/20/2015 09:19 AM, Clemens Wyss DEV wrote:
Thanks Mike,
but a key difference is that when one client commits,
Yes, I got that. But I am still stuck at this point. Consider it like this:
I do not know what the usernames in all the documents are.
I only know there is a time associated with each record.
So say I have usernames "a", "b", "c", "d" present in my data for the 18th
of January.
And for the 19th, I h
On 1/20/2015 7:18 AM, Nitin Solanki wrote:
> Thanks and sorry for Stackoverflow. You are saying that use "string"
> type. But I have used filter = solr.ShingleFilterFactory to break a
> string into ngrams.
> I want to build query correction just like google is doing - "Did you
> mean".
Shalin is s
On 1/20/2015 8:52 AM, harish singh wrote:
> Yes I got that. But I am still stuck at this point. Consider it like this:
> I do not know what are the usernames in all the documents.
> I only know there is time associated with each record.
>
> So Say, I have usernames "a", "b", "c", "d" present in my
On 1/20/2015 7:45 AM, Nimrod Cohen wrote:
> All shards are on the same system each one use different port.
> BTW
> Data size is about 1T, memory is 192G.
If Solr has to actually go to the disk to satisfy a query, it's going to
be slow. This will always be true, no matter how many disks you use.
Hello,
(sorry for my English, I use a translator)
I used synonyms in Solr.
My question is the following:
How do I order the results list according to the order of the synonyms?
My synonyms are written as follows in my mysynonyms.txt file:
ipad => apple, Darty, Boulanger
I want that when you sea
Well, that is the problem I am facing. Just checking if there is a way to
compute the diff from the 18th for the 19th.
One option is:
Get all the facets for the 19th.
Get all the facets for the 18th.
Do a diff and eliminate the intersection.
But this isn't optimal, as the number of facets returned by a solr query can
b
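The diff step itself is cheap once both facet lists are in hand; a toy sketch (the usernames are made up) of eliminating the intersection:

```python
# Hypothetical facet values (usernames) returned for each day.
day_19 = {"a", "b", "c", "e"}   # usernames seen on the 19th
day_18 = {"a", "b", "c", "d"}   # usernames seen on the 18th

# Usernames that appear on the 19th but never on the 18th.
new_today = day_19 - day_18
print(sorted(new_today))  # ['e']
```

The expensive part, as noted above, is fetching both complete facet lists in the first place.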
Hi all,
I'm using SolrCloud 4.10.0 and trying to incorporate
PostingsSolrHighlighter. One issue that I'm having is that I cannot have the
functionality of "hl.fragsize" in PostingsSolrHighlighter. How can I limit
the size of the highlighted text? I get highlighted results but their
snippet size va
Hi,
I'm afraid you are not using the right component.
In your example, you will match "apple", "darty" and "boulanger"
documents, sorted by the default Solr scoring mechanism (TF-IDF), which
won't take the order you specified in your synonyms.txt file into
account for the scoring.
If you want to o
Hi folks!
I have a multiphrase query, for example, from units:
Directory indexStore = newDirectory();
RandomIndexWriter writer = new RandomIndexWriter(random(), indexStore);
add("blueberry chocolate pie", writer);
add("blueberry chocolate tart", writer);
IndexReader r = writer.getReader();
Hello, and thank you for your reply,
(sorry for my English, I use a translator)
I tried using the elevate.xml file, adding to it:
where 271 is the unique identifier of the Apple merchant.
To try to get it taken into account, I pass the following parameters:
&force
I have the Solr XML response below using this query:
http://localhost:8983/solr/tt/select/?indent=off&facet=false&wt=xml&fl=title,overallscore,service,reviewdate&q=*:*&fq=id:315&start=0&rows=4&sort=reviewdate%20desc
I want to add paging on the multivalued fields, but the above query throws
the err
Hello All,
We are running Solr Cloud 4.4 with 30 shards and 3 replicas, with real-time
indexing, on RHEL 6.5. The indexing rate is 3K TPS now. We are running into
an issue with replicas going into recovery mode due to connection reset
errors. Soft commit time is 2 min and auto commit is set as 5 minut
Nimrod Cohen [nimrod.co...@nice.com] wrote:
> We need to get 1K documents out of 100M documents each
> time we query solr and send them to text Analysis.
> First configuration had 8 shards on one RAD (Disk F) we
> got the 1K in around 15 seconds.
> Second configuration we removed the RAD and work o
Are we sure this isn't SOLR-6931?
On Tue, Jan 20, 2015 at 11:39 AM, Nishanth S
wrote:
> Hello All,
>
> We are running solr cloud 4.4 with 30 shards and 3 replicas with real time
> indexing on rhel 6.5.The indexing rate is 3K Tps now.We are running into an
> issue with replicas going into recover
Hi,
In case your data looks like:
{ "id": "1", "userName": "one", "startTimeISO": "2015-01-20T17:24:32.888Z" }
{ "id": "2", "userName": "one", "startTimeISO": "2015-01-16T17:24:50.208Z" }
{ "id": "3", "userName": "two", "startTimeISO": "2015-01-20T17:25:06.109Z" }
you could use the following query combination
I think this makes sense too (i.e. the setup), since the search is getting 1K
documents each time (for textual analysis, i.e. they are probably large
docs) and uses Solr as storage (which is totally fine); the parallel
multiple-drive I/O of the shards speeds things up. The index is probably
large, so it
ok. So I am trying this query:
http://cluster1.com:8983/solr/my_collection_shard4_replica1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userName&fq=startTimeISO:[NOW-1DAY%20TO%20NOW]&fq=-_query_:%22{!join%20from=
userName%20to=userName}startTimeISO:[NOW-30DAYS%20TO%20NOW-1DA
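Decoded, the intent of the two filter queries is easier to check; a hedged sketch (field and collection names taken from the thread, the truncated tail of the join query reconstructed as an assumption) that builds and URL-encodes them with Python's standard library:

```python
from urllib.parse import urlencode

# Today's documents only.
fq_today = "startTimeISO:[NOW-1DAY TO NOW]"
# Exclude usernames also seen during the preceding 29 days via a self-join
# (assumed reconstruction of the truncated query above).
fq_not_before = ('-_query_:"{!join from=userName to=userName}'
                 'startTimeISO:[NOW-30DAYS TO NOW-1DAY]"')
params = urlencode([
    ("q", "*:*"), ("rows", "0"), ("wt", "json"),
    ("facet", "true"), ("facet.field", "userName"),
    ("fq", fq_today), ("fq", fq_not_before),
])
print(params)  # append to .../solr/<collection>/select?
```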
What version of Solr?
On Tue, Jan 20, 2015 at 7:07 AM, anand.mahajan wrote:
> Hi all,
>
>
> I have a cluster with 36 Shards and 3 replica per shard. I had to recently
> restart the entire cluster - most of the shards & replica are back up - but
> a few shards have not had any leaders for a long
Thanks a lot Shawn. Is there any way to reduce the time it takes to retrieve
suggestions?
On Tue, Jan 20, 2015 at 9:33 PM, Shawn Heisey wrote:
> On 1/20/2015 7:18 AM, Nitin Solanki wrote:
> > Thanks and sorry for Stackoverflow. You are saying that use "string"
> > type. But I have used filter = solr.Shi
I am also facing the same issue. My solr version is 4.10.2
On Tue, Jan 20, 2015 at 11:33 PM, Erick Erickson
wrote:
> What version of Solr?
>
>
> On Tue, Jan 20, 2015 at 7:07 AM, anand.mahajan
> wrote:
> > Hi all,
> >
> >
> > I have a cluster with 36 Shards and 3 replica per shard. I had to
> re
On 1/20/2015 11:11 AM, Nitin Solanki wrote:
> Thanks a lot Shawn. There is any way to reduce time to retrieve suggestions
> fast.
I know almost nothing about how to use the suggester and spellcheck
features of Solr. I do know that the suggester is based on spellcheck.
I have a spellcheck config
Okay, no problem. Could somebody please check my question, which I mailed on
20th Jan 2015 at 7:48 PM along with 2 attachments. I am also waiting for
Shalin, if he is able to answer.
On Tue, Jan 20, 2015 at 11:49 PM, Shawn Heisey wrote:
> On 1/20/2015 11:11 AM, Niti
Thank you Mike. Sure enough, we are running into the same issue you
mentioned. Is there a quick fix for this other than the patch? I do not see
the tlogs getting replayed at all. It is doing a full index recovery from
the leader, and our index size is around 200G. Would lowering the autocommit
settings he
Joel,
Thank you for the links. The AnalyticsQuery is just the thing I need to
return custom stats in the response.
What I'm struggling with now, is how to read the doc field values. I've been
following the CollapsingQParserPlugin model of accessing the field cache in
the Query class getAnalyticsC
Hi,
Currently, there is no way to sort by a multi-valued field within Solr
(first the system would have to sort the content of the field, then sort the
documents...). Anyway, if you have a clear idea of how the sort should be
done, try to accommodate your data to your needs (in case it is possible).
One option
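Since the message is cut off, one common workaround, offered here as an assumption rather than what the author wrote, is to precompute a single-valued sort key at index time; a toy sketch:

```python
# Hypothetical documents with a multi-valued "scores" field.
docs = [
    {"id": "1", "scores": [3, 9, 4]},
    {"id": "2", "scores": [7, 1]},
]

# Derive a single-valued field (here: the max) before indexing,
# then sort on that field instead of the multi-valued one.
for d in docs:
    d["scores_max"] = max(d["scores"])
docs.sort(key=lambda d: d["scores_max"], reverse=True)
print([d["id"] for d in docs])  # ['1', '2']
```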
Thanks Alvaro. That worked.
On Tue, Jan 20, 2015 at 9:59 AM, harish singh
wrote:
> ok. So I am trying this query:
>
>
> http://cluster1.com:8983/solr/my_collection_shard4_replica1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userName&fq=startTimeISO:[NOW-1DAY%20TO%20NOW]&fq=-
I'm trying to index certain data from a table and documents located on disk
using jdbc and tika. I can derive the file locations from the table and
using that data I want to also import documents into Solr. However I'm
having trouble with my configuration.
Got it working with the updated config:
I am cool with that. Just wanted to check that there was not one hiding around.
Also, ElasticSearch has a couple of language-specific groups, and at
least the Russian one gets some traffic every couple of weeks or so.
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
Hi,
Maybe this is basic, but I am trying to understand which tokenizer and
filter to use. I followed some examples mentioned in the Solr wiki, but
type-ahead does not show the expected suggestions.
Example itemName data can be:
- "ABC12DE": It does not work as soon as I type 1.. i.e. ABC1
- "ABC_12DE
Were you actually trying to "...divides text at non-letters and
converts them to lower case"? Or were you trying to make it
case-insensitive, which would be KeywordTokenizer and
LowerCaseFilter?
Also, normally we do not use an NGram filter on both index and query.
That just makes things match on
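To see why n-gramming on both sides over-matches, it helps to enumerate the grams; a toy sketch (not Solr's implementation) of edge n-grams:

```python
def edge_ngrams(term: str, min_n: int = 2, max_n: int = 5) -> list:
    """Leading substrings of term, lengths min_n..max_n."""
    return [term[:n] for n in range(min_n, min(max_n, len(term)) + 1)]

# Index side: many grams per term is fine.
print(edge_ngrams("abc12de"))  # ['ab', 'abc', 'abc1', 'abc12']
# Query side: gramming the query too would make a short gram like 'ab'
# match every indexed term starting with 'ab', hence the usual advice to
# apply the n-gram filter only at index time.
```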
Hi
I'm using Apache Solr 4.9.0 and ManifoldCF 1.6.1.
I can't generate an index of XML files including tags and attributes.
Is it possible to achieve this by setting values in schema.xml or
solrconfig.xml?
Can anyone help me?
Regards,
Aki
Thanks for the response.
a) I am trying to make it case-insensitive... itemName data is indexed in
upper case.
b) I am looking to display the result as a type-ahead suggestion, which might
include spaces, underscores, numbers...
- "ABC12DE": It does not work as soon as I type 1.. i.e. ABC1
Output ex
So, try the suggested tokenizers and dump the ngrams from query. See
what happens. Ask a separate question with corrected config/output if
you still have issues.
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
On 20 January 2015 at 23:08, Vishal Swar
Dear Solr community,
I am diving into Solr recently and I need help with the following usage
scenario.
I am working on a project to extract and search bibliographic metadata from
PDF files. Firstly, my PDF files are processed to extract bibliographic
metadata such as title, authors, affilia
But then what happens if:
Autocommit is set to 10 docs
and
I add 11 docs and then decide (due to an exception?) to roll back.
Will only one (i.e. the last added) document be rolled back?
-Original Message-
From: Michael Sokolov [mailto:msoko...@safaribooksonline.com]
Sent: Tue
Hi,
You can find several examples of configuring tika+dih to index pdf in
internet (e.g.
https://tuxdna.wordpress.com/2013/02/04/indexing-the-documents-stored-in-a-database-using-apache-solr-and-apache-tika/
)
Regards.
On Jan 21, 2015 6:54 AM, "Yusniel Hidalgo Delgado" wrote:
>
>
> Dear Solr co
Hi Toke,
Thanks for your answer.
We are using RAID 0 with 8 disks; I don't understand why it should give the
same performance as one disk per shard.
Below is an explanation as I see it; please correct me if I'm wrong.
RAID configuration:
each shard has data on each one of the 8 disks in the RAID,
Does anyone have an answer to the question I asked on 20th Jan 2015 at 7:48
PM?
wrote:
> Okay. No Problem. Please somebody check my question which I have mailed on
> 20th Jan 2015 at 7:48 PM where I have posted my question along with 2
> attachments.