> Could you give an example?
> E.g. let's say I have a field 'title' and a field 'fulltext', and my
> search term is 'solr'. What would be the right set of parameters to get
> back the whole title field but only a snippet of 50 words (or three
> sentences, or whatever the unit) from the fulltext?
On 28.06.2010 23:00 Ahmet Arslan wrote:
>> 1) I can get my docs in the index, but when I search, it
>> returns the entire document. I'd love to have it only
>> return the line (or two) around the search term.
>
> Solr can generate Google-like snippets as you describe.
> http://wiki.apache.org/solr/HighlightingParameters
Hi,
I've read a bit about autosuggest, and I would like to know if the following is
possible with my current configuration.
I'm using Solr 1.4.
I'd like to reopen bug SOLR-1960
https://issues.apache.org/jira/browse/SOLR-1960
"http://wiki.apache.org/solr/ : non-English users get generic MoinMoin page
instead of the desired information"
since I submitted a patch, but Jira won't let me do it.
Do I have to clone the issue?
Teruhiko "Kuro" Kuro
Try adding &hl.fl=text to specify your highlight field. I don't understand
why you're only getting the ID field back, though. Do note that the
highlighting section comes after the docs in the response, keyed by ID.
Try a (non-highlighting) query of just * to verify that you're
pointing at the index you think you are.
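For reference, a minimal highlighting request could look like this (host, core,
and field names here are assumptions; hl, hl.fl, and hl.snippets are the
standard highlighting parameters):

http://localhost:8983/solr/select?q=solr&fl=id,title&hl=true&hl.fl=text&hl.snippets=2

The highlighted fragments come back in a separate "highlighting" section of the
response, keyed by each document's unique ID, which is why you still see the
full stored fields in the docs section.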
On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>> 1) I can get my docs in the index, but when I search, it
>> returns the entire document. I'd love to have it only
>> return the line (or two) around the search term.
>
> Solr can generate Google-like snippets as you describe.
> http://wiki.apache.org/solr/HighlightingParameters
Another question:
why doesn't SolrJ close the StringWriter and OutputStreamWriter?
thanks
2010/6/28 Anderson vasconcelos
> Thanks for the responses.
> I instantiate one instance per request (per delete query, in my case).
> I have a lot of concurrent processes. If I reuse the same instance (to send,
> delete and remove data) in Solr, will I have trouble?
> It seems that ${ncdat.feature} is not being set.
Try ${dataTable.feature} instead.
On Tue, Jun 29, 2010 at 1:22 AM, Shawn Heisey wrote:
> I am trying to do some denormalizing with DIH from a MySQL source. Here's
> part of my data-config.xml:
>
> query="SELECT *,FROM_UNIXTIME(post_date)
On 6/28/2010 3:28 PM, caman wrote:
In your query 'query="SELECT webtable as wt FROM ncdat_wt WHERE
featurecode='${ncdat.feature}' .. instead of ${ncdat.feature} use
${dataTable.feature} where dataTable is your parent entity name.
I knew it would be something stupid like that. I thought I
Hi,
I am trying to get DB indexing up and running, but I am having trouble
getting it to work.
In the solrconfig.xml file, I added the DataImportHandler request handler,
pointing it at data-config.xml.
I defined a couple of fields in schema.xml.
media_id is defined as the unique key.
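The handler registration in solrconfig.xml typically looks like this (the
handler name "/dataimport" is conventional; adjust the config file name to
yours):

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>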
In your query query="SELECT webtable as wt FROM ncdat_wt WHERE
featurecode='${ncdat.feature}'", instead of ${ncdat.feature} use
${dataTable.feature}, where dataTable is your parent entity name.
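As a sketch (entity, table, and column names are taken from this thread; the
parent column list is abbreviated), the nesting would look like:

<entity name="dataTable" query="SELECT feature, ... FROM ncdat">
  <entity name="wt"
          query="SELECT webtable AS wt FROM ncdat_wt
                 WHERE featurecode='${dataTable.feature}'"/>
</entity>

DIH resolves ${...} variables as ${parentEntityName.columnName}, so the
variable has to use the parent entity's name, not the table name.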
Here is a screen shot for our cache from New Relic.
http://s4.postimage.org/mmuji-31d55d69362066630eea17ad7782419c.png
Query cache: 55-65%
Filter cache: 100%
Document cache: 63%
The cache size is 512 for the three caches above.
How do I interpret this data? What are some optimal configuration changes
give
Thanks for the responses.
I instantiate one instance per request (per delete query, in my case).
I have a lot of concurrent processes. If I reuse the same instance (to send,
delete and remove data) in Solr, will I have trouble?
My concern is that if I do this, Solr will commit documents with data from oth
I am trying to do some denormalizing with DIH from a MySQL source.
Here's part of my data-config.xml:
query="SELECT *,FROM_UNIXTIME(post_date) as pd FROM ncdat WHERE
did > ${dataimporter.request.minDid} AND did <=
${dataimporter.request.maxDid} AND (did %
${dataimporter.request.numShar
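The ${dataimporter.request.*} variables resolve to parameters passed on the
dataimport request itself, so an import for one slice of the data can be
kicked off like this (hypothetical host and values):

http://localhost:8983/solr/dataimport?command=full-import&minDid=0&maxDid=1000000

plus whatever sharding parameters the modulo clause above references.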
Great, thanks for the pointers.
Thanks,
Peter
On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>> 1) I can get my docs in the index, but when I search, it
>> returns the entire document. I'd love to have it only
>> return the line (or two) around the search term.
>
> Solr can generate Google-like snippets as you describe.
> http://wiki.apache.org/solr/HighlightingParameters
> 1) I can get my docs in the index, but when I search, it
> returns the entire document. I'd love to have it only
> return the line (or two) around the search term.
Solr can generate Google-like snippets as you describe.
http://wiki.apache.org/solr/HighlightingParameters
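To control how much context comes back, hl.snippets (fragments per field) and
hl.fragsize (fragment size in characters) are the relevant knobs, e.g.
(hypothetical host and field name):

http://localhost:8983/solr/select?q=solr&hl=true&hl.fl=fulltext&hl.snippets=3&hl.fragsize=300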
> 2) There are one or
Hi everyone,
I'm looking for a way to index a bunch of (potentially large) text files. I
would love to see results like Google, so I went through a few tutorials, but
I've still got questions:
1) I can get my docs in the index, but when I search, it returns the entire
document. I'd love to have it only return the line (or two) around the search term.
Hi,
You can add an additional commentreplyjoin entity to the story entity, i.e.
...
Thus, you will have a multivalued field commentreply that contains the list
of related "comment_id, reply_id" pairs ("comment_id," if you don't have any
related replies for this entry
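A minimal sketch of that sub-entity (MySQL-flavored SQL; table and column
names beyond story_id/comment_id/reply_id are hypothetical):

<entity name="story" query="SELECT story_id, title FROM story">
  <entity name="commentreplyjoin"
          query="SELECT CONCAT(c.comment_id, ',', IFNULL(r.reply_id, '')) AS commentreply
                 FROM comment c LEFT JOIN reply r ON r.comment_id = c.comment_id
                 WHERE c.story_id = '${story.story_id}'"/>
</entity>

with commentreply declared multiValued="true" in schema.xml.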
Hi Anderson,
If you are using SolrJ, it's recommended to reuse the same instance per solr
server.
http://wiki.apache.org/solr/Solrj#CommonsHttpSolrServer
But there are other scenarios which may cause this situation:
1. Another application running in the same JVM as Solr that doesn't close
properly
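A minimal sketch of the reuse pattern (class name and URL are assumptions;
CommonsHttpSolrServer is thread-safe, so one shared instance per Solr server
is enough):

import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class SolrServerHolder {
    private static SolrServer server;

    // All threads share one instance instead of creating one per request,
    // which avoids leaking sockets ("Too many open files").
    public static synchronized SolrServer getInstance() throws MalformedURLException {
        if (server == null) {
            server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        }
        return server;
    }
}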
Hi,
I'm adding the SpellCheckComponent to my current configuration of Solr, and I
was wondering if there is a way to set a minimum frequency threshold for the
IndexBasedSpellChecker through Solr, like there is in the deprecated Spell
Check Request Handler. I know that you can fix most problems
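If your Solr release supports it, the thresholdTokenFrequency option on the
spellchecker definition expresses exactly this (the surrounding config is a
standard SpellCheckComponent skeleton; whether the option exists in your
version is an assumption to verify):

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- only suggest terms appearing in at least 1% of documents -->
    <float name="thresholdTokenFrequency">.01</float>
  </lst>
</searchComponent>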
iorixxx wrote:
>
> CustomSimilarityFactory that extends
> org.apache.solr.schema.SimilarityFactory should do it. There is an example
> CustomSimilarityFactory.java under src/test/org...
>
This is exactly what I was looking for... this is very similar (no pun
intended ;) ) to the updateProcess
Yes. For now, I've gone back to Lucene 1.4 and installed Local Lucene. I just
couldn't get the sfilt to work. I'm sure I was probably missing something, but
I think I'll just wait until 1.5 is ready to be shipped.
On Jun 28, 2010, at 12:02 PM, Grant Ingersoll wrote:
>
> On Jun 24, 2010, at
On Jun 24, 2010, at 12:32 AM, Eric Angel wrote:
> I'm using solr 4.0-2010-06-23_08-05-33 and can't figure out how to add the
> spatial types (LatLon, Point, GeoHash or SpatialTile) using
> dataimporthandler. My lat/lngs from the database are in separate fields.
> Does anyone know how to do h
> How would you configure the tfBaselineTfFactors and
> LengthNormFactors when
> configuring via schema.xml?
CustomSimilarityFactory that extends org.apache.solr.schema.SimilarityFactory
should do it. There is an example CustomSimilarityFactory.java under
src/test/org...
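A minimal sketch of such a factory wired to SweetSpotSimilarity (the setter
names come from Lucene's SweetSpotSimilarity, but exact signatures can differ
between Lucene versions, and the init-parameter names below are illustrative):

import org.apache.lucene.misc.SweetSpotSimilarity;
import org.apache.lucene.search.Similarity;
import org.apache.solr.schema.SimilarityFactory;

public class CustomSimilarityFactory extends SimilarityFactory {
    @Override
    public Similarity getSimilarity() {
        SweetSpotSimilarity sim = new SweetSpotSimilarity();
        // "params" holds the init args from the <similarity> element in schema.xml
        sim.setBaselineTfFactors(params.getFloat("baselineTfBase", 0.0f),
                                 params.getFloat("baselineTfMin", 0.0f));
        sim.setLengthNormFactors(params.getInt("lengthNormMin", 1),
                                 params.getInt("lengthNormMax", 1),
                                 params.getFloat("lengthNormSteepness", 0.5f),
                                 false); // discountOverlaps
        return sim;
    }
}

That way the factors live in schema.xml as init params instead of being
hardcoded in a Similarity subclass.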
iorixxx wrote:
>
> it is in schema.xml:
>
> <similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>
>
How would you configure the tfBaselineTfFactors and LengthNormFactors when
configuring via schema.xml? Do I have to create a subclass that hardcodes
these values?
Hi,
You might also want to check out the new Lucene-Hunspell stemmer at
http://code.google.com/p/lucene-hunspell/
It uses OpenOffice dictionaries with known stems in combination with a large
set of language specific rules.
It handles your example, but it is an early release, so test it thoroughly.
Hi All,
I am a new user of Solr.
We are now trying to enable searching over the Digg dataset.
It has story_id as the primary key; comment_id identifies the comments
made on that story_id, so story_id to comment_id is a one-to-many
relationship.
These comment_ids can in turn be replied to by repliers, so
This probably means you're opening new readers without closing
old ones. But that's just a guess. I'm guessing that this really
has nothing to do with the delete itself, but the delete is what's
finally pushing you over the limit.
I know this has been discussed before, try searching the mail
archives.
Hi all,
When I send a delete query to Solr using SolrJ, I receive this exception:
org.apache.solr.client.solrj.SolrServerException: java.net.SocketException:
Too many open files
11:53:06,964 INFO [HttpMethodDirector] I/O exception
(java.net.SocketException) caught when processing request: Too many open files
There is a first-pass query to retrieve all matching document ids from
every shard, along with the relevant sorting information; the document ids
are then sorted and limited to the amount needed, and a second query
is sent for the rest of the documents' metadata.
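Both passes happen behind a single user-facing request, something like
(hypothetical hosts and cores):

http://host1:8983/solr/select?q=foo&sort=score+desc&rows=10&shards=host1:8983/solr,host2:8983/solr

Solr fans the request out to each shard for ids plus sort values, merges and
truncates to rows=10, then fetches stored fields for only those ten documents.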
On Sun, Jun 27, 2010 at 7:32 PM, Babak
splitOnCaseChange is creating multiple tokens from 3dsMax; disable it
or enable catenateAll. Use the analysis page in the admin tool to see
exactly how your text will be indexed by the analyzers without having to
reindex your documents. Once you have it right, you can do a full
reindex.
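A sketch of the relevant piece of schema.xml (the fieldType name and the rest
of the analyzer chain are illustrative; the WordDelimiterFilterFactory
attributes are the standard ones):

<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- catenateAll="1" also indexes "3dsMax" as the single token "3dsmax" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            splitOnCaseChange="0" catenateAll="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>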
On Mon, Jun 28,
The general consensus among people who run into the problem you have
is to use a plurals-only stemmer, a synonyms file, or a combination of
both (for irregular nouns, etc.).
If you search the archives you can find info on a plurals stemmer.
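For example, later Lucene/Solr releases ship a minimal English stemmer that
only strips plurals; where available (an assumption about your version, not
Solr 1.4 stock) it can replace the Porter filter in the analyzer chain:

<filter class="solr.EnglishMinimalStemFilterFactory"/>

in place of

<filter class="solr.PorterStemFilterFactory"/>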
On Mon, Jun 28, 2010 at 6:49 AM, wrote:
> Thanks for the tip.
And note that those are tokens, not characters; not that it should make any
difference if you've bumped it up that far.
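The limit in question is presumably maxFieldLength in solrconfig.xml; it
counts tokens, and the default of 10,000 tokens lines up with roughly "the
first 50,000 characters" reported in this thread:

<!-- in solrconfig.xml, inside <indexDefaults> (and <mainIndex> if overridden there) -->
<maxFieldLength>2147483647</maxFieldLength>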
Best
Erick
On Mon, Jun 28, 2010 at 9:18 AM, judauphant wrote:
>
> Ok thanks, it works.
>
> Best regards,
> Julien
What if Chinese is mixed with English?
I have text that is entered by users and it could be a mix of Chinese, English,
etc.
What's the best way to handle that?
Thanks.
--- On Mon, 6/28/10, Ahmet Arslan wrote:
> From: Ahmet Arslan
> Subject: Re: Chinese chars are not indexed ?
> To: solr-use
Hello,
I have a little question about faceting.
When I use "facet.prefix=mau", I get this result:
49
23
That's fine, but I miss the last word for the result "bluetooth laser"; I
want "bluetooth laser maus".
The problem is, when I use this as a suggestion and the user searches for
"bluetooth lase
Hi all,
I have been using Solr for quite a while, but I never really got into
looking at the code. Last week that all changed: I decided to write a
custom core admin handler. I've posted something on my blog about it,
along with a Drupal-centric howto. I'd be interested to know what
people think.
Hi all.
I'm trying to get $deleteDocById working, but no documents are being deleted
from my index.
I'm using full-import (withOUT cleaning) and a script with:
row.put('$deleteDocById', row.get('codAnuncio'));
The script passes through this line for every document it processes (for
testing purposes
Thanks for the tip. Yeah, I think the stemming confounds search results as
it stands (Porter stemmer).
I was also thinking of using my dictionary of 500,000 words with their
complete morphologies and conjugations to create a synonyms.txt that
provides accurate English morphology.
Is this a good idea?
Hi Darren,
You might want to look at the KStemmer
(http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem) instead of
the standard PorterStemmer. It essentially has a 'dictionary' of exception
words where stemming stops if found, so in your case president won't be stemmed
any further.
Ok thanks, it works.
Best regards,
Julien
Hello,
I have a title that says "3DVIA Studio & Virtools Maya and 3dsMax
Exporters". The analysis tool for this field gives me these tokens:
3dvia, dvia, studio, virtool, maya, 3dsmax, ds, systèm, max, export
However, when I search for "3dsmax", I get no results :( Furthermore, if I
search for "dsmax" I get t
> Ok, I'm trying to integrate the TikaEntityProcessor as suggested. I'm using
> Solr Version: 1.4.0 and getting the following error:
>
> java.lang.ClassNotFoundException: Unable to load BinURLDataSource or
> org.apache.solr.handler.dataimport.BinURLDataSource
It seems that DIH-Tika integration is not included in the Solr 1.4 release.
> I use Solr 1.4 to search the contents of documents (pdf, doc,
> odt ...). I use
> the "/update/extract" module.
> When I search, I am limited to the first 50,000
> characters
> (approximately).
> Any word or sentence after that is not found (but the field has
> more than 50,000
> characters when I
Hi,
I use Solr 1.4 to search the contents of documents (pdf, doc, odt ...). I use
the "/update/extract" module.
When I search, I am limited to the first 50,000 characters
(approximately).
Any word or sentence after that is not found (but the field has more than
50,000 characters when I retrieve it
Hi,
It seems to me that because stemming does not produce grammatically
correct stems in many cases, search anomalies can occur like the one I am
seeing, where I have a document with "president" in it and it is returned
when I search for "preside", a different word entirely.
Is this co
Hi Li,
Yes, you can issue a delete-all by:
curl http://your_solr_server:your_solr_port/solr/update -H
"Content-Type: text/xml" --data-binary
'<delete><query>*:*</query></delete>'
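The delete only becomes visible after a commit, e.g.:

curl http://your_solr_server:your_solr_port/solr/update -H "Content-Type: text/xml" --data-binary '<commit/>'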
Hope it helps.
Cheers,
Daniel
-Original Message-
From: Li Li [mailto:fancye...@gmail.com]
Sent: 28 June 2010 03:41
To: solr-user@lucene.a
Hello community,
for a few days now I have been receiving daily mails with suspicious content.
They say that some of my mails were rejected because of the file types of the
mails' attachments, among other things.
This puzzles me a lot, because I didn't send any mails with attachments and
even the eMail-
I think that this is the best way to use Solr. I've used EmbeddedSolrServer,
keeping it in a singleton manner (using the Spring framework).
Solr is also thread-safe, so you should not have any issue using it
directly from an EJB.
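A rough sketch of the singleton idea with the Solr 1.4-era embedded API (the
core name and a configured solr.solr.home are assumptions; Spring users would
express this as a singleton bean instead):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedSolrHolder {
    private static SolrServer server;

    // Builds one embedded server from solr.solr.home and reuses it everywhere.
    public static synchronized SolrServer get() throws Exception {
        if (server == null) {
            CoreContainer container = new CoreContainer.Initializer().initialize();
            server = new EmbeddedSolrServer(container, "core0"); // assumed core name
        }
        return server;
    }
}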
Antonio
2010/6/27 Robert Naczinski
> Hello,
>
> there is a recom
Hi,
I have an architectural question about using apache solr/lucene.
I'm building a solr index for searching a CV database. Basically every CV on
there will have some fields like:
rate of pay, address, title
these fields are straightforward. The area I need advice on is skills and
job history
> oh yes, *...* works. thanks.
>
> I saw tokenizer is defined in schema.xml. There are a few
> places that define the tokenizer. Wondering if it is enough
> to define one for:
It is better to define a brand new field type specific to Chinese.
http://wiki.apache.org/solr/LanguageAnalysis?highlig
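For example, Solr 1.4 ships a CJK tokenizer that copes with mixed
Chinese/English text by bigramming CJK runs while keeping Latin words whole
(the fieldType name is illustrative):

<fieldType name="text_cjk" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>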
oh yes, *...* works. thanks.
I saw tokenizer is defined in schema.xml. There are a few places that define
the tokenizer. Wondering if it is enough to define one for: