Hi, all,
I am going to implement multilingual search in multicore Solr. Specifically,
the design of the Solr server is as follows: I have several cores corresponding
to different languages, where each core has its own configuration files and data.
I have the following questions:
1. While indexing a document, I use
Hi, all,
In this thread, I would like to ask some technical questions about how the
schema should be defined to get a language-specific "text" field.
Say I currently have the field "text" defined as follows:
text*" type="text_general" indexed="true"
stored="true" multiValued="true"/>
After index
, returned with a set of scores. Is it
safe to conclude that the highest score marks the result we can be most
confident in?
Thanks.
Best Regards,
Ni Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3702041.html
Sent from the
as far as I know, for a field such as "title", people can create
"title_en" and "title_fr" to apply different analyzers within the same schema.
Even so, I am not seeing that happen. Thus, I wonder whether I am
overlooking some obvious point?
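
The per-language field idea above can be sketched on the client side. This is only a minimal sketch, not how Solr itself routes fields: the class name `LanguageFieldNames`, the method `fieldFor`, and the supported set {"en", "fr"} are hypothetical, and it assumes the schema really declares fields such as "title_en" and "title_fr", each with its own analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: picks a per-language field name such as "title_en".
final class LanguageFieldNames {
    // Assumption: the schema declares a "<base>_<lang>" field for exactly these languages.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList("en", "fr"));

    // Returns "<base>_<lang>" for a supported language, otherwise the plain base name.
    static String fieldFor(String baseName, String languageCode) {
        String lang = languageCode == null ? "" : languageCode.trim().toLowerCase(Locale.ROOT);
        if (SUPPORTED.contains(lang)) {
            return baseName + "_" + lang;
        }
        return baseName; // fall back to the generic field
    }

    public static void main(String[] args) {
        System.out.println(fieldFor("title", "EN"));
        System.out.println(fieldFor("title", "de"));
    }
}
```

The same mapping would be applied both when building documents for indexing and when choosing the field to query, so that a given language always goes through the same analyzer.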
"Bing" is very com
Hi, all,
I am using the following jar to index files in xml format, and I want to
look into the source code. Where can I find it? Thanks.
\apache-solr-3.5.0\example\exampledocs>java -jar post.jar *.xml
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.
to index XML files with
many self-defined fields, probably with embedded fields, which one makes
more sense?
Thanks.
Best
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-content-in-XML-files-tp3702795p3702795.html
Sent from the Solr - User mailing list ar
Hi, iorixxx, Thanks.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Closed-Source-code-of-post-in-example-package-of-Solr-tp3702100p3705333.html
Sent from the Solr - User mailing list archive at Nabble.com.
. How would I
build and run it? Where should I put the sc in the package? Is an IDE a must
for that?
I cannot find many start-up tutorials about that, so I would be grateful for
any suggestions and hints.
Best
Bing
--
View this message in context:
http://lucene.472066.n3.nabbl
Hi, all,
Thanks for the comment. Then I will abandon post.jar, and try to learn SolrJ
instead.
Best
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-content-in-XML-files-tp3702795p3705563.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi, Erick,
Thanks for commenting on this thread, and I think my problem has been
solved. I might start another thread raising technical questions about using
SolrJ.
Thank you again.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Multilingual
rver
location: class MySolrjTest
server = new
CommonsHttpSolrServer("http://localhost:8983/solr/");
^
3 errors
Best
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Fail-to-compile-Java-code-trying-to-use-SolrJ-with-Solr-tp37
Hi, all,
Following the previous topic, I abandoned my own code and tried to build a
project with the original package apache-solr-3.5.0-src, but I failed again.
Below is a description of some technical details, and I hope someone
can help point out my mistakes.
What I Have
Besides the tool
about the
advantages and disadvantages of the two approaches? Any other alternatives? Thank
you.
Best
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Development-inside-or-outside-of-Solr-tp3759680p3759680.html
Sent from the Solr - User mailing list archive at Nabble.com.
I have looked into the TikaCLI with -language option, and learned that Tika
can output only the language metadata. It cannot help me to solve my problem
though, as my main concern is whether to change Solr or not. Thank you all
the same.
--
View this message in context:
http://lucene.472066.n3.
Hi, François Schiettecatte
Thank you for the reply all the same, but I chose to stick with Solr
(wrapped with the Tika language API) and make changes outside Solr.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Development-inside-or-outside-of-Solr
index/query. Thus, we might do revisions in dspace.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Development-inside-or-outside-of-Solr-tp3759680p3768977.html
Sent from the Solr - User mailing list archive at Nabble.com.
solve this?
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/TikaLanguageIdentifierUpdateProcessorFactory-since-Solr3-5-0-to-be-used-in-Solr3-3-0-tp3771620p3771620.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi, Suneel,
There is a configuration in solrconfig.xml that you might need to look at.
Below, I set the limit to 2GB.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-increase-Size-of-Document-in-solr-tp3771813p3771931.html
Sent from
Hi, Dmitry
Thank you. It solved my problem.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Fail-to-compile-Java-code-trying-to-use-SolrJ-with-Solr-tp3708902p3772017.html
Sent from the Solr - User mailing list archive at Nabble.com.
ed similar things before. Please advise. Thank you.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-tp2570526p3772177.html
Sent from the Solr - User mailing list archive at Nabble.com.
y'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389)
at
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Failed-to-upgrade-tika0-8-to-tika0-10-in-solr3-3-0-tp3772180p3772180.html
Sent from the Solr - User mailing list archive at Nabble.com.
the following link, people have tried to upgrade Tika0.8 to Tika0.9.
http://lucene.472066.n3.nabble.com/upgrading-to-Tika-0-9-on-Solr-1-4-1-td2570526.html
I was thinking, if both the above two steps can be achieved, then maybe I
can get it done. What is your suggestion?
Thank you.
Best Regards,
esult would look like
language_s="en,zh_tw". However, I failed to see the result.
text,attr_stream_name
language_s
true
I will be grateful if anyone can point out my mistake or give some hints on how
to do this correctly. Th
Solr. Do you think
it is reasonable?
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/TikaLanguageIdentifierUpdateProcessorFactory-since-Solr3-5-0-to-be-used-in-Solr3-3-0-tp3771620p3782793.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi, Erick,
I get your point. Thank you so much.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/TikaLanguageIdentifierUpdateProcessorFactory-since-Solr3-5-0-to-be-used-in-Solr3-3-0-tp3771620p3782938.html
Sent from the Solr - User mailing list archive
h-tw" while the other one "text_en". Can I do something like
that?
Thank you.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Can-solr-langid-Solr3-5-0-detect-multiple-languages-in-one-text-tp3821210p3821210.html
Sent from the Solr - User mailing list archive at Nabble.com.
the existing identifier into
Solr?
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Can-solr-langid-Solr3-5-0-detect-multiple-languages-in-one-text-tp3821210p3821764.html
Sent from the Solr - User mailing list archive at Nabble.com.
dicates a different language range.
Thank you for the thoughtful comments.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Can-solr-langid-Solr3-5-0-detect-multiple-languages-in-one-text-tp3821210p3824365.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi, all,
I am working on Solr3.3. Recently I found a new feature (Field
Aliasing/Renaming) in Solr3.6, and I want to use it in Solr3.3. Can I do
that, and if so, how?
Thank you.
Best Regards,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Can-I-use-Field-Aliasing
Hi,
I am going to evaluate some Lucene/Solr capabilities for handling faceted
queries, in particular with a single facet field that contains a large number
(say up to 1 million) of distinct values. Does anyone have experience
with how Lucene performs in this scenario?
e.g.
Doc1 has tags A B C D
, such as time, how should the sorting be done?
If I just need the top ones, is it appropriate to just add rows?
If I want to add new sorting methods, how do I do that?
Thanks so much!
Bing
more complicated than PageRank. Now I have to load all
of the matched data from Solr first by keyword and rank it again in my own way
before showing it to users. Is that correct?
Thanks so much!
Bing
Hi, Kai,
Thanks so much for your reply!
If the retrieving is done on a string field, not a text field, a complete
matching approach should be used according to my understanding, right? If
so, how does Lucene rank the retrieved data?
Best regards,
Bing
On Sun, Jan 22, 2012 at 5:56 AM, Kai Lu
Dear Shashi,
Thanks so much for your reply!
However, I think the value of PageRank is not static. It must be updated
on the fly. As far as I know, a Lucene index is not suited to very frequent
updates. If so, how should I deal with that?
Best regards,
Bing
On Sun, Jan 22, 2012 at 12:43 PM, Shashi
Hi,
While using ContentStreamUpdateRequest up = new
ContentStreamUpdateRequest("/update/extract");
The two ways of adding a file are
up.addFile(File)
up.addContentStream(ContentStream)
However my raw files are stored on some remote storage devices. I am able to
get an InputStream object for the
Hello,
I have a field text with type text_general here.
Thanks for the quick reply. Seems like you are suggesting to add explicitly
AND operator. I don't think this solves my problem.
I found it somewhere, and this
works.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Search-match-all-tokens-in-Query-Text-tp4037758p4037762.h
Hi,
I don't want the field to be tokenized because Solr doesn't support sorting
on a tokenized field. In order to do case insensitive sorting I need to copy
a field to a lowercased but untokenized field. How do I define this?
I did the below, but it says I need to specify a tokenizer or a class for
ana
Works perfectly. Thank you. I didn't know before that this tokenizer does nothing
:)
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-define-a-lowercase-fieldtype-without-tokenizer-tp4040500p4040507.html
Sent from the Solr - User mailing list archive at Nabble.com.
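
The effect discussed in this thread, lowercasing without tokenizing so that sorting becomes case-insensitive, can be illustrated outside Solr. A minimal sketch, assuming the tokenizer meant in the reply is one that emits the whole value as a single token (in Solr that is typically KeywordTokenizerFactory followed by LowerCaseFilterFactory; the thread snippet does not name it). The class `LowercaseSortKey` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of a "lowercase, not tokenized" sort field.
final class LowercaseSortKey {
    // The whole value stays one token; only its case is normalized.
    static String sortKey(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Sorts a copy of the values by their lowercased key, leaving the originals untouched.
    static List<String> sortCaseInsensitive(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing(LowercaseSortKey::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortKey("Apache Solr"));
        System.out.println(sortCaseInsensitive(Arrays.asList("banana", "Apple", "cherry")));
    }
}
```

Note that the key for "Apache Solr" keeps the space: the value is never split into words, which is what makes it usable for sorting.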
read-many
environment, right? That's what HDFS (Hadoop Distributed File System)
provides. According to my experience, it is really slow when updating a
Lucene Index.
Why did you say I could update Lucene index frequently?
Thanks so much!
Bing
On Mon, Jan 23, 2012 at 11:02 PM, Shashi Kant wrot
,
Bing
earch.
Best regards,
Bing
On Thu, Feb 23, 2012 at 1:35 AM, T Vinod Gupta wrote:
> Bing,
> Its a classic battle on whether to use solr or hbase or a combination of
> both. both systems are very different but there is some overlap in the
> utility. they also differ vastly when it compar
Dear Mr Gupta,
Your understanding about my solution is correct. Now both HBase and Solr
are used in my system. I hope it could work.
Thanks so much for your reply!
Best regards,
Bing
On Fri, Feb 24, 2012 at 3:30 AM, T Vinod Gupta wrote:
> regarding your question on hbase support for h
According to my knowledge, Solr cannot support this.
In my case, I get data by keyword-matching from Solr and then rank the data
by PageRank after that.
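
The two-step approach described here (keyword match in Solr, then re-ranking by an external score) can be sketched as follows. This is only an illustration under stated assumptions: `ExternalRerank`, `rerank`, the document ids, and the score map are all hypothetical, and ids missing from the map are treated as having score 0.0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical post-processing step applied to ids already matched by keyword.
final class ExternalRerank {
    // Orders ids by descending external score; ties keep their original (Solr) order.
    static List<String> rerank(List<String> matchedIds, Map<String, Double> scores) {
        List<String> ranked = new ArrayList<>(matchedIds);
        ranked.sort(Comparator
                .comparingDouble((String id) -> scores.getOrDefault(id, 0.0))
                .reversed());
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Double> pageRank = new HashMap<>();
        pageRank.put("a", 0.1);
        pageRank.put("b", 0.9);
        System.out.println(rerank(Arrays.asList("a", "b", "c"), pageRank));
    }
}
```

Because the sort is stable, documents with equal external scores stay in the order Solr returned them, so the keyword relevance still acts as a tie-breaker.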
Thanks,
Bing
On Wed, Apr 4, 2012 at 6:37 AM, Manuel Antonio Novoa Proenza <
mano...@estudiantes.uci.cu> wrote:
> Hello,
>
> I
which I can
transmit. After transmission, how to append them to the old indexes? Does
the appending block searching?
Thanks so much for your help!
Bing Li
o existing indexes?
Does the appending affect the querying?
I am learning Solr. But it seems that Solr does that for me. However, I have
to set up Tomcat to use Solr. I think it is a little bit heavy.
Thanks!
Bing Li
queries must be answered instantly. That's
what I mean by "appending". Does it happen in Solr?
Best,
Bing
On Sat, Nov 20, 2010 at 1:58 AM, Gora Mohanty wrote:
> On Fri, Nov 19, 2010 at 10:53 PM, Bing Li wrote:
> > Hi, all,
> >
> > Since I didn't find that L
Dear Erick,
Thanks so much for your help! I am new to Solr, so I have no idea about the
version.
But I wonder what the differences between Solr and Hadoop are. It seems that
Solr already does what Hadoop promises.
Best,
Bing
On Sat, Nov 20, 2010 at 2:28 AM, Erick Erickson wrote:
>
ching large indexes in a large scale distributed environment, right?
Thanks!
Bing
On Sat, Nov 20, 2010 at 3:01 AM, Gora Mohanty wrote:
> On Sat, Nov 20, 2010 at 12:05 AM, Bing Li wrote:
> > Dear Erick,
> >
> > Thanks so much for your help! I am new in Solr. So I have no id
wish to import the Lucene indexes into Solr, are there any other
approaches? I know that Solr is Lucene wrapped as a server.
Thanks,
Bing Li
using the initial schema.xml
from Solr.
Why can't I change the schema.xml?
Thanks so much!
Bing
Dec 5, 2010 4:52:49 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173
indexes are put under
$TOMCAT_HOME/bin. This is NOT what I expect. I hope indexes are under
SolrHome.
Could you please give me a hand?
Best,
Bing Li
Hi, all,
Now I cannot search the index when querying with Chinese keywords.
Before using Solr, I used Lucene for some time. Since I need to crawl
some Chinese sites, I use ChineseAnalyzer in the code to run Lucene.
I know Solr is a server for Lucene. However, I have no idea how to
conf
Dear all,
After reading some pages on the Web, I created the index with the following
schema.
..
..
It must be correct, right? However, when sending a query through SolrNet
Dear Jelsma,
My servlet container is Tomcat 7. I think it should accept Chinese
characters. But I am not sure how to configure it. From the console of
Tomcat, I saw that the Chinese characters in the query are not displayed
normally. However, it is fine in the Solr Admin page.
I am not sure eithe
Dear Jelsma,
After configuring the Tomcat URIEncoding, Chinese characters are
processed correctly. I appreciate your help so much!
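
Why the URIEncoding setting matters can be shown with a small round trip. A minimal sketch, assuming the client percent-encodes the query as UTF-8 and that an unconfigured Tomcat 7 connector decodes the URI as ISO-8859-1 (its default unless `URIEncoding="UTF-8"` is set on the connector in server.xml). The class `UriEncodingDemo` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo: encode as UTF-8, then decode with a given charset.
final class UriEncodingDemo {
    static String roundTrip(String text, String decodeCharset) {
        try {
            String encoded = URLEncoder.encode(text, "UTF-8");
            return URLDecoder.decode(encoded, decodeCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String query = "\u4e2d\u6587"; // two Chinese characters
        // Decoding with the charset used for encoding restores the text.
        System.out.println(roundTrip(query, "UTF-8").equals(query));
        // Decoding as ISO-8859-1 turns each UTF-8 byte into a separate character.
        System.out.println(roundTrip(query, "ISO-8859-1").equals(query));
    }
}
```

With the mismatched charset the server sees garbage characters instead of the Chinese keyword, which matches the symptom of the console showing the query incorrectly.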
Best,
LB
On Wed, Jan 19, 2011 at 3:02 AM, Markus Jelsma
wrote:
> Hi,
>
> Yes but Tomcat might need to be configured to accept, see the wiki for more
> inform
Hi, all,
In the past, I always used SolrNet to interact with Solr. It works great.
Now, I need to use SolrJ. I think it should be easier to do that than
SolrNet since Solr and SolrJ should be homogeneous. But I cannot find a
tutorial that is easy to follow. No tutorials explain the SolrJ programmi
n Sat, Jan 22, 2011 at 3:58 PM, Lance Norskog wrote:
> The unit tests are simple and show the steps.
>
> Lance
>
> On Fri, Jan 21, 2011 at 10:41 PM, Bing Li wrote:
> > Hi, all,
> >
> > In the past, I always used SolrNet to interact with Solr. It works great.
>
Dear all,
I got a weird problem. The number of matching documents is much more than
10. However, the size of the SolrDocumentList is 10, while getNumFound()
returns the exact count of results. When I iterate over the results as follows,
only 10 are displayed. How do I get the rest?
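
Solr returns only 10 rows per request by default, so the remaining documents have to be fetched page by page. A minimal sketch of the paging arithmetic, using only the JDK: the class `PageOffsets` is hypothetical, and with SolrJ each offset would be passed to `SolrQuery.setStart` together with `SolrQuery.setRows` before re-running the query (not shown here, to keep the example free of dependencies).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes the "start" offset of every page of results.
final class PageOffsets {
    static List<Long> startOffsets(long numFound, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        List<Long> offsets = new ArrayList<>();
        for (long start = 0; start < numFound; start += rows) {
            offsets.add(start);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // 25 hits at 10 rows per page need requests starting at 0, 10 and 20.
        System.out.println(startOffsets(25, 10));
    }
}
```

Raising rows to a very large value also works for small result sets, but paging keeps the memory use per request bounded.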
Dear all,
I got an exception when querying the index within Solr. It told me that too
many files are open. How do I handle this problem?
Thanks so much!
LB
[java] org.apache.solr.client.solrj.
SolrServerException: java.net.SocketException: Too many open files
[java] at
org.apache.solr.c
Dear Adam,
I also got the OutOfMemory exception. I changed the JAVA_OPTS in catalina.sh
as follows.
...
if [ -z "$LOGGING_MANAGER" ]; then
JAVA_OPTS="$JAVA_OPTS
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager"
else
JAVA_OPTS="$JAVA_OPTS -server -Xms8096m -Xmx80
Dear all,
I need to build a site that supports searching a large index. I
think scaling Solr is required. However, I have not found a tutorial that helps
me do that step by step. I only have two resources as references, but neither
of them tells me the exact operations.
1)
http://www.luc
Dear all,
I started to learn how to use Solr three months ago. My experience is
still limited.
Now I crawl Web pages with my crawler and send the data to a single Solr
server. It runs fine.
Since the number of potential users is large, I decided to scale Solr. After
configuring replication, a single ind
The filtered documents are the final results I need.
I guess the performance of this operation is lower than in a relational database, right?
Could you please explain why?
Best regards,
Li Bing
Dear Lance,
Could you tell me where I can find the unit test code?
I appreciate your help so much!
Best regards,
LB
On Sat, Jan 22, 2011 at 3:58 PM, Lance Norskog wrote:
> The unit tests are simple and show the steps.
>
> Lance
>
> On Fri, Jan 21, 2011 at 10:41 PM,
Dear all,
In my experience, when a Lucene index is updated frequently, its
performance becomes low. Is that correct?
In my system, most data crawled from the Web is indexed and the
corresponding index will NOT be updated any more.
However, some indexes should be updated frequently li
, in most Internet systems, the amount of mutable data is much
less than that of immutable one.
What do you think about my solution?
Best,
LB
On Sat, Mar 5, 2011 at 2:45 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> On Fri, Mar 4, 2011 at 10:09 AM, Bing Li wrote:
>
>
I find that if I do not restart the master's Tomcat for some days,
the load average keeps rising to a high level and Solr becomes slow
and unstable, so I added a crontab entry to restart Tomcat every day.
Do you restart your Tomcat? And is there any way to avoid restarting Tomcat?
I want to let the system do the job instead of the system admin, because I'm lazy ~ ^__^
But I just want a better way to fix the problem. Restarting the server
causes other problems; for example, I need to reapply the changes that happened
during the restart.
2011/7/27 Dave Hall :
> On 27/07/11 11:42, Bing
I just went to apache-solr-3.3.0/solr and ran 'ant test'.
I find that the JUnit tests always fail and report 'BUILD FAILED',
but if I type 'ant dist', I get an apache-solr-3.3-SNAPSHOT.war
with no warnings.
Is this a problem only for me?
my server: CentOS 5.6 64bit/apache-ant-1.8.2 /junit and jdk (
I'm thinking there must be a way of releasing
write lock so other servers may pick up. Is there an API that does so?
Any inputs are appreciated.
Bing
Thanks Lance. The use case is to have a cluster of nodes which runs the same
application with EmbeddedSolrServer on each of them, and they all point to
the same index on NFS. Every application is designed equal, meaning that
everyone may index and/or search.
In such way, after every commit the wr
Hi folks,
Just wondering if there is a query handler that simply takes a query string
and searches all or some of the fields for matching field values?
e.g.
q=*admin*
Response may look like
author: [admin, system_admin, sub_admin]
last_modifier: [admin, system_admin, sub_admin]
doctitle: [AdminGuide, AdminMan
there.
My question is, are the separation / merging strategies configurable?
Basically I want to add a size limit for any individual file. Is it feasible
without changing solr core code?
Thanks!
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-index-storage-strategy-on-
Thanks for the response but wait... Is it related to my question searching
for field values? I was not asking how to use wildcards though.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Does-Solr-support-Value-Search-tp3999654p3999817.html
Sent from the Solr - User mailing
down menu may contain matches on authors, on doctitles, and potentially on
other fields.
Still thanks for your response and hopefully I'm making it clearer.
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Does-Solr-support-Value-Search-tp3999654p327.html
Sen
Makes sense. Thank you.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Multiple-Embedded-Servers-Pointing-to-single-solrhome-index-tp3999451p4000180.html
Sent from the Solr - User mailing list archive at Nabble.com.
s as using SpellCheckComponent. CopyField won't help
since I want the original field name.
Any suggestions?
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Does-Solr-support-Value-Search-tp3999654p4000267.html
Sent from the Solr - User mailing list archive at Nabble.com.
I agree. We chose embedded to minimize the maintenance cost of http solr
servers.
One more concern. Even if I have only one node doing indexing, other nodes
need to reopen index reader periodically to catch up with new changes,
right? Is there a solr request that does this?
Thanks,
Bing
Hello,
Background is that I want to use both Suggest and SpellCheck features in a
single query to have alternatives returned at one time. Right now I can only
specify one of them using spellcheck.dictionary at query time.
default
..
suggest
Hello,
From the spell check component I'm able to get the collation query and its # of
hits. Is it possible to have solr execute the collated query automatically
and return doc search results without resending it on client side?
Thanks,
Bing
--
View this message in context:
http://lucen
ld be appreciated.
Thanks,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Tlog-vs-buffer-softcommit-tp4000330.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thanks for the information. It definitely helps a lot. There're
numDeletesToKeep = 1000; numRecordsToKeep = 100; in UpdateLog so this should
probably be what you're referring to.
However when I was doing indexing the total size of TLogs kept on
increasing. It doesn't sound like the case where the
se to keep an amount of Tlogs for peers to sync up.
Thanks,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Tlog-vs-buffer-softcommit-tp4000330p4000509.html
Sent from the Solr - User mailing list archive at Nabble.com.
"new_value"}
Just trying to figure out what's the solrj client code that does this.
Thanks for any help on this,
Bing
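
The shape of such a partial update can be sketched with plain JDK maps. This is a hedged illustration, not the SolrJ code from the thread: the class `AtomicUpdateSketch`, the unique key name "id", and the field name "title" are assumptions. With SolrJ, I believe the inner `{"set": ...}` map is passed as the field value to `SolrInputDocument.addField`, but that call is not compiled here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the structure {"id": ..., "<field>": {"set": <newValue>}}.
final class AtomicUpdateSketch {
    static Map<String, Object> setFieldUpdate(String id, String field, Object newValue) {
        // The modifier map tells Solr to replace the field value rather than the whole document.
        Map<String, Object> operation = new HashMap<>();
        operation.put("set", newValue);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id); // assumption: the unique key field is named "id"
        doc.put(field, operation);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(setFieldUpdate("doc1", "title", "new_value"));
    }
}
```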
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr4-0-Partially-update-document-tp4000875.html
Sent from the Solr - User mailing list archive at Nabble.com.
Got it at
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/SolrExampleTests.java
Problem solved.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr4-0-Partially-update-document-tp4000875p4000878.html
Sent from the Solr -
ions of a query without getting the normal search results? I may need
to create a new handler for this. Can anyone please give me some ideas on
that?
Thanks,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Getting-Suggestions-without-Search-Results-tp4000968.html
Sen
Great comments. Thanks to you all.
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Getting-Suggestions-without-Search-Results-tp4000968p4001192.html
Sent from the Solr - User mailing list archive at Nabble.com.
iteral.id", str);
...
up.setParams(p);
server.request(up);
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-thousands-file-on-solr-tp4001050p4001196.html
Sent from the Solr - User mailing list archive at Nabble.com.
y to go
as long as you have a preference on better scalability or better stability &
online supports.
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Are-there-any-comparisons-of-Elastic-Search-specifically-with-SOLR-4-tp4000889p4001237.html
Sent from the Solr - Use
plain txt
file to solr to simply index that as a fulltext field without doing
extraction on that file?
Thanks,
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Send-plain-text-file-to-solr-for-indexing-tp4004515.html
Sent from the Solr - User mailing list archive at
So in order to use solrcell I'll have to add a number of dependent libraries,
which is one of what I'm trying to avoid. The second thing is, solrcell
still parses the plain text files and I don't want it to make any change to
those of my exported files.
Any ideas?
Bing
--
View
Thanks Mr.Yagami. I'll look into that.
Jack, for the latter two options, they both require reading the entire text
file into memory, right?
Bing
--
View this message in context:
http://lucene.472066.n3.nabble.com/Send-plain-text-file-to-solr-for-indexing-tp4004515p4004772.html
Sent fro