> I also tried boundary scanner
> &q=iphone&hl.boundaryScanner=simple&hl.fragsize=200&hl.fragmenter=regex&hl.fl=body
The hl.boundaryScanner parameter makes sense for FastVectorHighlighter only.
To activate it, you need to add &hl.useFastVectorHighlighter=true to the request.
"FastVectorHighlighter requires the fiel
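FastVectorHighlighter needs term vectors with positions and offsets on the highlighted field. A minimal sketch of the schema change, assuming the field is named "body" (the field and type names here are illustrative, not from the thread):

```xml
<!-- schema.xml: FastVectorHighlighter requires term vectors
     with positions and offsets on the highlighted field -->
<field name="body" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

After reindexing, the request from the original question would then look something like:

```
...&q=iphone&hl=true&hl.fl=body&hl.useFastVectorHighlighter=true&hl.fragsize=200&hl.boundaryScanner=simple
```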
> The title sort works in a strange manner because the Solr
> server treats the
> title string as case-sensitive. Thus,
> if we sort in
> ascending order, first the titles with numerics show up, then
> the titles in
> alphabetical order which start with upper case, & after
> that th
How are you hoping that Sort will work on a multivalued field? Normally,
trying to do this makes no sense.
For example, if you have two authors for a document:
Smith, John
Jones, Joe
Then would you expect the document to sort under 'S' for Smith, or 'J' for
Jones? There's prob
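As for the case-sensitivity half of the question: if the title field is single-valued, the usual approach is to sort on a lowercased copy of it. A sketch along the lines of the "alphaOnlySort" type from the Solr example schema (the field names here are hypothetical):

```xml
<!-- schema.xml: a single-valued, lowercased copy of title for sorting -->
<fieldType name="alphaOnlySort" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer keeps the whole title as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_sort" type="alphaOnlySort" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort"/>
```

You would then sort on title_sort instead of title. Note this only works if the source field is single-valued; it does not answer the multivalued case above.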
Hello,
I am using Solr 3.5.0 with an IndexBasedSpellChecker configured, and I
noticed that during rebuild, suggestions are not available.
After looking at the source code, I saw that
IndexBasedSpellChecker.build(...) calls spellchecker.clearIndex() before
spellchecker.indexDirectory(...) and I t
Hi,
Thank you for your reply .
That error was resolved, but now I'm not able to build the Solr project using
"ant dist" to generate the war file. It is resulting in the following error.
Hi James,
Thank you for your reply.
That issue got resolved; but now, when I'm trying to build Solr using the "ant
dist" command, it's resulting in the following error.
[ivy:retrieve] :: resolving dependencies ::
org.apache.lucene#analyzers-phonetic;working@XXXYYN
[ivy:retrieve] confs: [default]
[
You will first have to map your xml files into Solr xml format. You will
have to do that yourself outside of Solr. At the same time, you should map
any DCMI metadata field names to the corresponding field names, such as
"dc:title" to "title". A number of the DC field names are already in the
So
Hi, is there any way to preserve newlines or line breaks when submitting
content to a Solr string field?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-string-field-stripping-new-lines-line-breaks-tp3984384.html
Sent from the Solr - User mailing list archive at Nabble.co
> Hi, is there any way to preserve
> newlines or line breaks when submitting
> content to a Solr string field?
String is indexed verbatim. Are you using wt=xml in a browser? Try using wt=php
> A search with a keyword in Hindi retrieves an empty result
> set. Also, a
> retrieved Hindi record displays junk characters.
Could it be URIEncoding setting of your servlet container?
http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
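Per that wiki page, the usual fix is setting URIEncoding="UTF-8" on Tomcat's HTTP connector so query strings are decoded as UTF-8 instead of ISO-8859-1. A sketch of the relevant server.xml fragment (the other attributes are stock Tomcat defaults):

```xml
<!-- server.xml: decode request URIs (including the q parameter) as UTF-8 -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8"/>
```

Tomcat needs a restart after this change, and already-indexed documents are unaffected; only query decoding changes.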
Thank you, but I actually just forgot to reload the core0 when I changed
the field type. oops.
On Thu, May 17, 2012 at 3:52 PM, iorixxx [via Lucene] <
ml-node+s472066n3984405...@n3.nabble.com> wrote:
> > Hi, is there any way to preserve
> > newlines or line breaks when submitting
> > content to a
Recently Lucene/Solr went to a new build process using Ivy. Simply put,
dependent .jar files are no longer checked in with Lucene/Solr sources.
Instead, while building, Ivy now downloads them from 'repo1.maven.org'. From the
error you sent, it seems like you do not have access to the Maven repos
I'm new to this list, so... Hello everybody.
I'm trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other doesn't get
any response.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=fal
hi
Added hl.useFastVectorHighlighter=true to query. I was already doing term
vectors.
This worked like a charm.
Yes. From the doc:
"Multiple 'entity' parameters can be passed on to run multiple entities at
once. If nothing is passed, all entities are executed."
See:
http://wiki.apache.org/solr/DataImportHandler
But that is one invocation of DIH, not two separate updates as you tried.
-- Jack Krupansky
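For what the doc describes, the repeated entity parameter goes on a single full-import request, along the lines of (entity names here are placeholders):

```
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=entityA&entity=entityB
```

Both entities then run inside that one DIH invocation, rather than as two concurrent imports.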
Thanks Jack, but that's not what I want.
I don't want multiple entities in one invocation, but two
simultaneous invocations of the DIH with different entities.
Thanks.
Hi,
I am using Solr 3.4 on Tomcat 6 and using DIH to index data from a MS SQL
Server 2008 database.
In case my database is down, or is refusing connections due to any reason,
DIH throws an exception as mentioned below
"org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
exec
Okay, the answer is “Yes, sort of, but...”
“One annoyance is because of how DIH is designed, you need a separate handler
set up in solrconfig.xml for each DIH you plan to run. So you have to plan in
advance how many DIH instances you want to run, which config files they'll use,
etc.”
See:
htt
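Concretely, the "separate handler per DIH" setup would look something like this in solrconfig.xml (handler names and config file names are illustrative):

```xml
<!-- solrconfig.xml: one request handler per DIH instance you want to
     be able to run concurrently, each with its own config file -->
<requestHandler name="/dataimport-a"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-config-a.xml</str>
  </lst>
</requestHandler>
<requestHandler name="/dataimport-b"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-config-b.xml</str>
  </lst>
</requestHandler>
```

Each handler keeps its own status and can be invoked independently, which is what makes two simultaneous imports possible.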
The wiki here indicates that you can specify "entity" more than once on the
request and it will run multiple entities at the same time, in the same
handler: http://wiki.apache.org/solr/DataImportHandler#Commands
But I can't say for sure that this actually works! Having been in the DIH
code, I
My configuration
> words="stopwords.txt" enablePositionIncrements="true" />
> maxGramSize="15" side="front"/>
> ignoreCase="true" expand="true"/>
> maxGramSize="15" side="front"/>
i'm looking at using Tika to index a bunch of documents. the wiki page seems to
be a little bit out of date ("// TODO: this is out of date as of Solr 1.4 -
dist/apache-solr-cell-1.4.jar and all of contrib/extraction/lib are needed")
and it also looks a little incomplete.
is there an actual list
Okay, first, now that we can see your data, it looks to me like you should
keep it in two fields: 1) a "string" field for exact match, faceting, and
precise wildcarding, and 2) copy to a "text" field for searching by keyword.
For the latter, use a field type/analyzer comparable to "text_en_split
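A sketch of that two-field arrangement in schema.xml, assuming "text_en_splitting" (the example-schema type the reply appears to allude to) is defined:

```xml
<!-- schema.xml: exact-match/facet field plus a tokenized copy for
     keyword search; field names follow the thread, type is assumed -->
<field name="SKU" type="string" indexed="true" stored="true"/>
<field name="SKU_text" type="text_en_splitting" indexed="true" stored="false"/>
<copyField source="SKU" dest="SKU_text"/>
```

Queries would then use SKU for faceting and precise wildcards, and SKU_text for keyword search.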
Can I do this in the configuration, or do I have to change my code?
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 2:23 PM
To: solr-user@lucene.apache.org; Prachi Phatak
Subject: Re: org.apache.solr.common.SolrException:
org.apache.lucen
Code? I'm not sure what you're referring to. These changes are in schema.xml
and solrconfig.xml.
In your query, you need to change:
SKU:soft(*^1.0
to
SKU:soft\(*^1.0
-- Jack Krupansky
-Original Message-
From: Prachi Phatak
Sent: Thursday, May 17, 2012 3:25 PM
To: solr-user@lucene.
> i'm looking at using Tika to index a
> bunch of documents. the wiki page seems to be a little bit
> out of date ("// TODO: this is out of date as of Solr 1.4 -
> dist/apache-solr-cell-1.4.jar and all of
> contrib/extraction/lib are needed") and it also looks a
> little incomplete.
>
> is there a
Sorry, my suggestion for the escaped left parenthesis is if you change SKU
to be a string field. And then have SKU_text as a copy of that field (add a
copyField to your schema.xml for SKU to SKU_text) but with some "text"
type - then you could simply say SKU_text:soft .
-- Jack Krupansky
So do you mean I should change it from class="solr.TextField" to
class="solr.StrField"?
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 3:00 PM
To: solr-user@lucene.apache.org
Subject: Re: org.apache.solr.common.SolrException:
org.apa
SKU should be type "string" and then SKU_text would be your text type.
Or, you can do it the opposite: SKU would be text and SKU_string for the raw
string value for precise wildcards and faceting.
The Solr example does have "sku" as a text field. You can do it that way or
the opposite. Whiche
Hello all,
In Solr 3.4, the boost factor is explicitly shown in debugQueries:
0.37087926 = (MATCH) sum of:
0.3708323 = (MATCH) weight(ocr:dog^1000.0 in 215624), product of:
0.995 = queryWeight(ocr:dog^1000.0), product of:
1000.0 = boost
2.32497 = idf(docFreq=237626, maxDocs
On 5/15/2012 3:42 PM, Jon Drukman wrote:
I fixed it for now by upping the wait_timeout on the mysql server.
Apparently Solr doesn't like having its connection yanked out from under
it and/or isn't smart enough to reconnect if the server goes away. I'll
set it back the way it was and try yo
Shawn,
Do you think this behavior is because, while the indexing is paused, you reach
some type of timeout so either your db or the jdbc cuts the connection? Or, are
you thinking something in the DIH/JDBCDataSource code is causing the connection
to drop under these circumstances?
James Dyer
E-
On Thu, May 17, 2012 at 4:51 PM, Tom Burton-West wrote:
> But in Solr 3.6 I am not seeing the boost factor called out.
>
> On the other hand it looks like it may now be incorporated in the
> queryNorm (Please see example below).
>
> Is there a bug in Solr 3.6 debugQueries? Is there some new be
On 5/17/2012 3:01 PM, Dyer, James wrote:
Do you think this behavior is because, while the indexing is paused, you reach
some type of timeout so either your db or the jdbc cuts the connection? Or, are
you thinking something in the DIH/JDBCDataSource code is causing the connection
to drop under
hi
It did work in many cases, but now I see many cases where it is not working.
Is this something to do with analysis? I'm using WordDelimiterFilterFactory on
the field which is being used in hl.fl.
Should this field not be tokenized? Use one field for search and a copy of it
for hl.fl?
copyField is a literal operation that happens at index time -- but it
really has no bearing what so ever on highlighting done at query time.
there is no "memory" of what source fields any values came from, so it
doesn't affect things in any way.
You haven't provided any details about your sch
: I am trying to post JSON Data to Solr using XHR / JQuery and it doesn't seem
You are not POSTing any JSON data. In this method...
: var jqxhr = $.post(url, { "id" : "978-0545139700",
: "cat" : "book",
: "name" : "Harry Potter and the Deathly Hallows",
:
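The underlying issue is that $.post(url, object) sends the data as application/x-www-form-urlencoded, not as a JSON body. A hedged sketch of how the request could be sent as real JSON with $.ajax (the URL and field values are from the thread; the snippet assumes jQuery is loaded on the page):

```javascript
// Build the JSON body explicitly -- passing a plain object to $.post()
// would have it form-encoded, which Solr's JSON update handler rejects.
var docs = [{
  "id": "978-0545139700",
  "cat": "book",
  "name": "Harry Potter and the Deathly Hallows"
}];
var body = JSON.stringify(docs);

// Guarded so the snippet is inert when jQuery is not present.
if (typeof $ !== "undefined") {
  $.ajax({
    url: "http://localhost:8983/solr/update/json?commit=true",
    type: "POST",
    contentType: "application/json",  // the part $.post() cannot set
    data: body
  });
}
```

The commit=true parameter makes the document visible immediately; in production you would normally rely on autoCommit instead.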
: No. fq queries are standard syntax queries. But they can be arbitrarily
: complex, i.e. fq=model:(member OR new_member)
using param references, you can also do some interesting things like...
fq={!term f=model v=$model}&model=member
...which can come in handy for hardcoding certain rules
Yes. The trick is to use a hash value on each document. The
SignatureUpdateProcessor provides a tool for this. Store the hash
value in a hex string field.
Now, do wildcard queries on the hash string: hash:a* will randomly
choose 1/16 of the documents. hash:00* will pick 1/256 of the
documents.
On
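The hash field is populated at index time by wiring SignatureUpdateProcessorFactory into an update chain, roughly as follows (the chain name and the choice of "hash" as the field are illustrative; the hash field must also exist in schema.xml as an indexed string field):

```xml
<!-- solrconfig.xml: compute a hash per document at index time so that
     hash-prefix wildcards can be used for random sampling -->
<updateRequestProcessorChain name="add-hash">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">hash</str>
    <str name="fields">id</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Since the signature is hex, each digit of prefix narrows the sample by a factor of 16: hash:a* selects roughly 1/16 of the documents, hash:00* roughly 1/256, and so on.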
Hi,
Is there a way to know what fields to add to schema.xml prior to crawling with
Nutch, rather than crawling over and over again and fixing the fields
one by one?
Regards,