Personalized search parameters

2018-01-05 Thread marco
Hi, first of all I want to say that I'm a beginner with the whole Lucene/Solr
environment.
I'm trying to create a simple personalized search engine, and to do so I was
thinking about adding a user= parameter to the URI of the query
requests, which I would need during the scoring phase to rerank the results
based on the user profile (stored as a normal document).

My question is: how can I create a custom Similarity class that is able to
retrieve a parameter passed during the request phase? I "know" from this
https://medium.com/@wkaichan/custom-query-parser-in-apache-solr-4634504bc5da
that by extending QParserPlugin I can access the request parameters, but how can
I pass them down the whole chain of search operations so that they are
accessible during the scoring phase?
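For what it's worth, a minimal sketch of the first half of that chain: the point where a hypothetical user= request parameter becomes visible inside a QParserPlugin. Class and parameter names are illustrative, delegating to the standard Lucene parser is just one possible choice, and carrying the value through to scoring would still need a custom Query/RankQuery wrapper (not shown).

import org.apache.lucene.search.Query;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.LuceneQParserPlugin;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import org.apache.solr.search.SyntaxError;

public class UserAwareQParserPlugin extends QParserPlugin {
  @Override
  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      @Override
      public Query parse() throws SyntaxError {
        // the request parameter is available here
        String user = params.get("user");
        // delegate the actual parsing to the standard Lucene parser
        Query main = new LuceneQParserPlugin()
            .createParser(qstr, localParams, params, req).parse();
        // 'user' could now be attached to a custom Query wrapper so that it
        // is still visible at scoring time
        return main;
      }
    };
  }
}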

Thank you for your help.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Personalized search parameters

2018-01-05 Thread marco
First of all, thank you for the reply.
I understand your idea, and that would make things a lot easier; the problem
is that this system is being created as a university project, and we were
specifically asked to develop a personalized search system based on result
reranking.
In particular we have to retrieve the documents with a normal search,
followed by a result-reranking phase where we calculate the cosine
similarity between the retrieved documents and the user profile.

I'm still looking around on the web, and it seems like I have to deal with a
search component, is that right?
An alternative would be to work with plain Lucene: having the ability to
directly instantiate and call QueryParser, Similarity and everything else
would simplify everything, but it wouldn't be nearly as cool :)



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Personalized search parameters

2018-01-05 Thread marco
This looks like a very good solution, actually.
In the meantime I started working in a different way: I created a custom
query component and from there I accessed the list of results of the query,
and I was looking for a way to reorder that list, but I'd better look at
RankQuery, it surely looks like a more standard and elegant solution.
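For reference, Solr's built-in re-ranking query parser can also be driven purely from request parameters; a hedged SolrJ sketch (the query strings and parameter values below are placeholders, not part of the thread):

import org.apache.solr.client.solrj.SolrQuery;

public class ReRankSketch {
  public static SolrQuery build(String userProfileQuery) {
    SolrQuery q = new SolrQuery("main query here");
    // re-score the top 100 results of the main query using the query in $rqq
    q.add("rq", "{!rerank reRankQuery=$rqq reRankDocs=100 reRankWeight=3}");
    q.add("rqq", userProfileQuery);  // e.g. a query built from the user profile terms
    return q;
  }
}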

Thank you, I'll let you know how it goes with both methods.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Personalized search parameters

2018-01-05 Thread marco
At the moment I have another problem: is there an efficient way to calculate
the cosine similarity between documents?
I'm following (with the required modifications) THIS
code, which calculates the cosine
similarity between 2 documents, but it doesn't look too efficient when it
comes to repeating everything between the user profile and every document
retrieved by the query.
This is because the term vectors returned by the IndexReader method
getTermVector(...) only contain the terms present in the associated
document, forcing you to create the full vectors manually.
Isn't there a possibility to obtain full-size vectors? (Or are they way
too big?)
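Side note: since the dot product only involves terms that appear in both vectors, the cosine can be computed directly on the sparse term vectors without ever materializing full-size vectors. A sketch (not from the thread) using raw term frequencies as weights; the field name passed in is whatever field was indexed with term vectors:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

public class SparseCosine {

  // term -> frequency map built from the document's (sparse) term vector
  static Map<String, Long> termFreqs(IndexReader reader, int docId, String field)
      throws IOException {
    Map<String, Long> freqs = new HashMap<>();
    Terms terms = reader.getTermVector(docId, field);
    if (terms == null) return freqs;           // field was not indexed with term vectors
    TermsEnum te = terms.iterator();
    BytesRef term;
    while ((term = te.next()) != null) {
      freqs.put(term.utf8ToString(), te.totalTermFreq());
    }
    return freqs;
  }

  static double cosine(Map<String, Long> a, Map<String, Long> b) {
    double dot = 0, normA = 0, normB = 0;
    for (Map.Entry<String, Long> e : a.entrySet()) {
      Long other = b.get(e.getKey());
      if (other != null) dot += e.getValue() * (double) other;  // only shared terms contribute
      normA += e.getValue() * (double) e.getValue();
    }
    for (long v : b.values()) normB += v * (double) v;
    return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }
}

Swapping the raw frequencies for tf-idf weights only changes how the map values are computed; the sparse dot product stays the same.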



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Personalized search parameters

2018-01-06 Thread marco
Don't we need vectors of the same size to calculate the cosine similarity?
Maybe I missed something, but following that example it looks like I have to
manually recreate the sparse vectors, because the term vector of a document
should (I may be wrong) contain only the terms that appear in that document.
Am I wrong?

Given that, I assumed (and that example goes in that direction) that we have
to manually create the sparse vectors by first collecting all the terms and
then calculating the tf-idf weight for each term in each document.
That's what I did, and I obtained vectors of the same dimension for each
document; I was just wondering if there was a better-optimized way to obtain
those sparse vectors.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SOLR + Nutch set up (UNCLASSIFIED)

2016-08-03 Thread Marco Scalone
Nutch also has adaptive strategy:

This class implements an adaptive re-fetch algorithm. This works as
> follows:
>
>    - for pages that have changed since the last fetchTime, decrease their
>    fetchInterval by a factor of DEC_FACTOR (default value is 0.2f).
>    - for pages that haven't changed since the last fetchTime, increase
>    their fetchInterval by a factor of INC_FACTOR (default value is 0.2f).
>    If the SYNC_DELTA property is true, then:
>       - calculate delta = fetchTime - modifiedTime
>       - try to synchronize with the time of change, by shifting the next
>       fetchTime by a fraction of the difference between the last modification
>       time and the last fetch time, i.e. the next fetch time will be set to
>       fetchTime + fetchInterval - delta * SYNC_DELTA_RATE
>       - if the adjusted fetch interval is bigger than the delta, then
>       fetchInterval = delta.
>    - the minimum value of fetchInterval may not be smaller than
>    MIN_INTERVAL (default is 1 minute).
>    - the maximum value of fetchInterval may not be bigger than
>    MAX_INTERVAL (default is 365 days).
>
> NOTE: values of DEC_FACTOR and INC_FACTOR higher than 0.4f may destabilize
> the algorithm, so that the fetch interval either increases or decreases
> infinitely, with little relevance to the page changes. Please use the
> main(String[]) method to test the values before applying them in a
> production system.
>

From:
https://nutch.apache.org/apidocs/apidocs-1.2/org/apache/nutch/crawl/AdaptiveFetchSchedule.html
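A toy illustration (my own sketch, not Nutch source) of the interval arithmetic described above: shrink the interval for pages that changed, grow it for pages that didn't, and clamp the result to [MIN_INTERVAL, MAX_INTERVAL].

public class AdaptiveIntervalSketch {
  static final float INC_FACTOR = 0.2f, DEC_FACTOR = 0.2f;
  static final long MIN_INTERVAL = 60L;               // 1 minute, in seconds
  static final long MAX_INTERVAL = 365L * 24 * 3600;  // 365 days, in seconds

  static long nextInterval(long interval, boolean pageChanged) {
    long next = pageChanged
        ? (long) (interval * (1.0f - DEC_FACTOR))   // changed: re-fetch sooner
        : (long) (interval * (1.0f + INC_FACTOR));  // unchanged: back off
    return Math.max(MIN_INTERVAL, Math.min(MAX_INTERVAL, next));
  }

  public static void main(String[] args) {
    long day = 24 * 3600;
    System.out.println(nextInterval(day, true));   // ~0.8 days
    System.out.println(nextInterval(day, false));  // ~1.2 days
  }
}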


2016-08-03 14:45 GMT-03:00 Walter Underwood :

> I’m pretty sure Nutch uses a batch crawler instead of the adaptive crawler
> in Ultraseek.
>
> I think we were the only people who built an adaptive crawler for
> enterprise use. I tried to get Ultraseek open-sourced. I made the argument
> to Mike Lynch. He looked at me like I had three heads and didn’t even
> answer me.
>
> Ultraseek also has great support for sites that need login. If you use
> that, you’ll need to find a way to do that with another crawler.
>
> wunder
> Walter Underwood
> Former Ultraseek Principal Engineer
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Aug 3, 2016, at 10:12 AM, Musshorn, Kris T CTR USARMY RDECOM ARL (US)
>  wrote:
> >
> > CLASSIFICATION: UNCLASSIFIED
> >
> > We are currently using ultraseek and looking to deprecate it in favor of
> solr/nutch.
> > Ultraseek runs all the time and auto detects when pages have changed and
> automatically reindexes them.
> > Is this possible with SOLR/nutch?
> >
> > Thanks,
> > Kris
> >
> > ~~
> > Kris T. Musshorn
> > FileMaker Developer - Contractor - Catapult Technology Inc.
> > US Army Research Lab
> > Aberdeen Proving Ground
> > Application Management & Development Branch
> > 410-278-7251
> > kris.t.musshorn@mail.mil
> > ~~
> >
> >
> >
> > CLASSIFICATION: UNCLASSIFIED
>
>


combined boolean operators

2017-06-28 Thread Marco Staub
Hi there,

I am a little confused about combined boolean operators in the query parser. For
example, if I search for

myfield:a AND myfield:b OR myfield:c

this will be parsed internally to the query

+myfield:a +myfield:b myfield:c

But if I change the default operator to AND (q.op=AND) I get, for the same
original query:

+myfield:a myfield:b myfield:c

which is, by the way, in both cases not pure boolean algebra: A AND B OR C should be (A
AND B) OR C.

Does anybody know why the query is handled that way?

Version is Solr 6.5.0, and the handler is almost the default /select handler:


<lst name="defaults">
  <str name="echoParams">explicit</str>
  <str name="wt">json</str>
  <str name="indent">true</str>
</lst>



Queries:
/select?debugQuery=on&indent=on&q=myfield:a AND myfield:b OR myfield:c&wt=json
/select?debugQuery=on&indent=on&q.op=AND&q=myfield:a AND myfield:b OR 
myfield:c&wt=json

See parsedquery in the result. myfield should be replaceable by any field name
in the index, like id.
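(For what it's worth, grouping the clauses with explicit parentheses removes the dependency on q.op; a small SolrJ sketch, with core and field names as placeholders:)

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class GroupedBooleanQuery {
  public static void main(String[] args) throws Exception {
    try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
      // the intended (A AND B) OR C, stated explicitly
      SolrQuery q = new SolrQuery("(myfield:a AND myfield:b) OR myfield:c");
      q.set("debugQuery", "on");
      System.out.println(solr.query(q).getResults().getNumFound());
    }
  }
}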

Best
Marco

Re: Solr scraping: Nutch and other alternatives.

2011-10-18 Thread Marco Martinez
Hi Luis,

Have you tried the copyField function with custom analyzers and tokenizers?

bye,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 Luis Cappa Banda 

> Hello everyone.
>
> I've been thinking about a way to retrieve information from a domain (for
> example, http://www.ign.com) to process and index. My idea is to use Solr
> as
> a searcher. I'm familiarized with Apache Nutch and I know that the latest
> version has a gateway to Solr to retrieve and index information with it. I
> tried it and it worked fine, but it's a little bit complex to develop
> plugins to process info and index it in a new field desired. Perhaps one of
> you have tried another (and better) alternative to data mine web
> information. Which is your recommendation? Can you give me any scraping
> suggestion?
>
> Thank you very much.
>
> Luis Cappa.
>


Re: Controlling the order of partial matches based on the position

2011-10-18 Thread Marco Martinez
Hi,

I would use a custom function query that uses termPositions to calculate the
order of the values in the field to accomplish your requirements.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 aronitin 

> Guys,
>
> It's been almost a week but there are no replies to the question that I
> posted.
>
> If its a small problem and already answered somewhere, please point me to
> that post. Otherwise please suggest any pointer to handle the requirement
> mentioned in the question,
>
> Nitin
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3429823.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Error Instantiating QParserPlugin

2011-10-20 Thread Marco Martinez
It seems that the problem is the QParserPlugin2 class.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/20 

> hi,
> while trying to create a customized query parser plugin for solr 3.2, I got the
> Instantiating error. As mentioned at various places I created two
> classes 1) MyQParserPlugin extends QParserPlugin2) MyQParser extends
> QParser
> org.apache.solr.common.SolrException: Error Instantiating QParserPlugin,
> MyQParserPlugin is not a org.apache.solr.search.QParserPlugin
>at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:428)
>at
> org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:448)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1548)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1542)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1575)
>at org.apache.solr.core.SolrCore.initQParsers(SolrCore.java:1492)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:558)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
>at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
>at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
>at
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
>at
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
>at
> org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
>at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
>at
> org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
>at
> org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
>at
> org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
>at
> org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
>at org.mortbay.jetty.Server.doStart(Server.java:224)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>at java.lang.reflect.Method.invoke(Unknown Source)
>at org.mortbay.start.Main.invokeMain(Main.java:194)
>at org.mortbay.start.Main.start(Main.java:534)
>at org.mortbay.start.Main.start(Main.java:441)
>at org.mortbay.start.Main.main(Main.java:119)
> Any idea about whats going on??
> Thanks Karan


mysolr python client

2011-11-30 Thread Marco Martinez
Hi all,

For anyone interested, recently I've been using a new Solr client for
Python. It's easy and pretty well documented. If you're interested its site
is: http://mysolr.redtuna.org/

bye!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: mysolr python client

2011-12-01 Thread Marco Martinez
Done!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/12/1 Marc SCHNEIDER 

> Hi Marco,
>
> Great! Maybe you can add it on the Solr wiki? (
> http://wiki.apache.org/solr/IntegratingSolr).
>
> Regards,
> Marc.
>
> On Thu, Dec 1, 2011 at 10:42 AM, Jens Grivolla  wrote:
>
> > On 11/30/2011 05:40 PM, Marco Martinez wrote:
> >
> >> For anyone interested, recently I've been using a new Solr client for
> >> Python. It's easy and pretty well documented. If you're interested its
> >> site
> >> is: http://mysolr.redtuna.org/
> >>
> >
> > Do you know what advantages it has over pysolr or solrpy? On the page it
> > only says "mysolr was born to be a fast and easy-to-use client for Apache
> > Solr’s API and because existing Python clients didn’t fulfill these
> > conditions."
> >
> > Thanks,
> > Jens
> >
> >
>


custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi,

I want to know if it's possible to get a higher score in a phrase query when
the match is on the left side of the field. For example:


doc1=name:stores peter john
doc2=name:peter john stores
doc3=name:peter john something

If you do a search with name="peter john", the result set I want to get is:

doc2
doc3
doc1

because the terms "peter john" are on the left side of the field and they get
a higher score.

Thanks in advance,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi Otis,

Finally I constructed my own function query that gives more score if the value
is at the start of the field. But is it possible to tell Solr to use
SpanFirstQuery without coding? I think I have read that it's not possible.

Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/18 Otis Gospodnetic 

> Marco,
>
> I don't think there is anything in Solr to do that (is there?), but you
> could do it with some coding if you combined the "regular query" with
> SpanFirstQuery with bigger boost:
>
>
> http://search-lucene.com/jd/lucene/org/apache/lucene/search/spans/SpanFirstQuery.html
>
> Oh, here are some examples and at the bottom you will see exactly what I
> suggested above:
>
>
> http://search-lucene.com/c/Lucene:/src/java/org/apache/lucene/search/spans/package.html||SpanFirstQuery<http://search-lucene.com/c/Lucene:/src/java/org/apache/lucene/search/spans/package.html%7C%7CSpanFirstQuery>
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Marco Martinez 
> > To: solr-user@lucene.apache.org
> > Sent: Fri, June 18, 2010 4:34:45 AM
> > Subject: custom scoring phrase queries
> >
> > Hi,
> >
> > I want to know if it's possible to get a higher score in a phrase query when
> > the match is on the left side of the field. For example:
> >
> > doc1=name:stores peter john
> > doc2=name:peter john stores
> > doc3=name:peter john something
> >
> > If you do a search with name="peter john", the result set I want to get is:
> >
> > doc2
> > doc3
> > doc1
> >
> > because the terms "peter john" are on the left side of the field and they get
> > a higher score.
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
>
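As a concrete illustration of the SpanFirstQuery idea suggested above, here is a sketch using a current Lucene API (the 2010-era API differs); the field name, boost and position cutoff are all illustrative:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.BoostQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class LeftAnchoredBoost {
  public static Query build() {
    // the regular phrase query that selects the documents
    Query phrase = new PhraseQuery("name", "peter", "john");
    // the same phrase as a span query...
    SpanQuery near = new SpanNearQuery(
        new SpanQuery[] {new SpanTermQuery(new Term("name", "peter")),
                         new SpanTermQuery(new Term("name", "john"))},
        0, true);
    // ...required to end within the first 2 positions, with a bigger boost
    Query first = new BoostQuery(new SpanFirstQuery(near, 2), 5.0f);
    return new BooleanQuery.Builder()
        .add(phrase, BooleanClause.Occur.MUST)
        .add(first, BooleanClause.Occur.SHOULD)
        .build();
  }
}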


Re: index pdf files

2010-08-12 Thread Marco Martinez
To help you we need the description of your fields in your schema.xml and
the query that you use when you search for a single word.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C] 

> I wrote a simple java program to import a pdf file. I can get a result when
> I do search *:* from admin page. I get nothing if I search a word. I wonder
> if I did something wrong or miss set something.
>
> Here is part of result I get when do *:* search:
>
>   Hristovski D
>   application/pdf
>   microarray analysis, literature-based discovery, semantic
>   predications, natural language processing
>   Thu Aug 12 10:58:37 EDT 2010
>   Combining Semantic Relations and DNA Microarray Data for Novel
>   Hypotheses Generation Combining Semantic Relations and DNA Microarray Data
>   for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej
>   Kastrin,2...
>
> Please help me out if anyone has experience with pdf files. I really
> appreciate it!
>
> Thanks so much,
>
>


Re: Search Results optimization

2010-08-13 Thread Marco Martinez
You can use a higher boost for "stapler" to accomplish your requirement.
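A small sketch of one way to express that boost (the field name "name" is an assumption and the ^10 boost is illustrative): documents matching the boosted term should generally score, and therefore sort, above plain "hammer" matches.

import org.apache.solr.client.solrj.SolrQuery;

public class BoostedTermsQuery {
  public static SolrQuery build() {
    // "stapler" matches get a much larger boost than "hammer" matches
    return new SolrQuery("name:stapler^10 OR name:hammer");
  }
}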

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/13 Hasnain 

>
> Hi All,
>
> My question is related to search results, I want to customize my query so
> that for query "stapler hammer", I should get results for all items
> containing word "stapler" first and then results containing hammer, right
> now results are mixing up, I want them sorted, i.e. all results of stapler
> on top and hammer on bottom not mixed, I havent changed any configuration
> files...
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Search-Results-optimization-tp1129374p1129374.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr search speed very low

2010-08-25 Thread Marco Martinez
You should use the tokenizer solr.WhitespaceTokenizerFactory in your field
type to get your terms indexed; once you have indexed the data, you don't
need to use the * in your queries, which is a heavy query for Solr.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/25 Andrey Sapegin 

> Dear ladies and gentlemen.
>
> I'm newbie with Solr, I didn't find an aswer in wiki, so I'm writing here.
>
> I'm analysing Solr performance and have 1 problem. *Search time is about
> 7-10 seconds per query.*
>
> I have a *.csv 5Gb-database with about 15 fields and 1 key field (record
> number). I uploaded it to Solr without any problem using curl. This database
> contains information about books and I'm intrested in keyword search using
> one of the fields (not a key field). I mean that if I search, for example,
> for word "Hello", I expect response with sentences containing "Hello":
> "Hello all"
> "Hello World"
> "I say Hello to all"
> etc.
>
> I tested it from console using time command and curl:
>
> /usr/bin/time -o test_results/time_solr -a curl "
> http://localhost:8983/solr/select/?q=itemname:*$query*&version=2.2&start=0&rows=10&indent=on";
> -6 2>&1 >> test_results/response_solr
>
> So, my query is *itemname:*$query**. 'Itemname' - is the name of field.
> $query - is a bash variable containing only 1 word. All works fine.
> *But unfortunately, search time is about 7-10 seconds per query.* For
> example, Sphinx spent only about 0.3 second per query.
> If I use only $query, without stars (*), I receive answer pretty fast, but
> only exact matches.
> And I want to see any sentence containing my $query in the response. Thats
> why I'm using stars.
>
> NOW THE QUESTION.
> Is my query syntax correct (*field:*word**) for keyword search)? Why
> response time is so big? Can I reduce search time?
>
> Thank You in advance,
> Kind Regards,
>
> Andrey Sapegin,
> Software Developer,
>
> Unister GmbH
> Barfußgässchen 11 | 04109 Leipzig
>
> andrey.sape...@unister-gmbh.de <mailto:%20andreas.b...@unister-gmbh.de>
> www.unister.de <http://www.unister.de>
>
>


What is the maximum number of documents that can be indexed ?

2010-10-14 Thread Marco Ciaramella
Hi all,
I am working on a performance specification document for a Solr/Lucene-based
application; this document is intended for the final customer. My question
is: what is the maximum number of documents I can index, assuming 10 or
20 kbytes for each document?

I could not find a precise answer to this question, and I tend to consider
that a Solr index can be virtually limited only by the JVM, the operating
system (limits to large-file support), or by hardware constraints (mainly
RAM, etc.).

Thanks
Marco


commit question

2012-05-16 Thread marco crivellaro
Hi all,
this might be a silly question, but I've found different opinions on the
subject.

When a search is run after a commit is performed, will the result include all
documents committed up to the last commit?

use case (sync):
1- add document
2- commit
3- search (faceted)

will the faceted search at point 3 include the document added at point 1?

thank you,
Marco Crivellaro

--
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-question-tp3984044.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrJ 4, soft commit

2012-05-16 Thread marco crivellaro
Hi all,
I am evaluating Solr 4.0 for its NRT capabilities.
How can you perform a soft commit with SolrJ 4.0?

The HttpSolrServer.commit method doesn't have a softCommit option, which appears to
be an option available for the commit command:
http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-4-soft-commit-tp3984057.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different Results..

2010-12-22 Thread Marco Martinez
We need more information about the analyzers and tokenizers of the
default field of your search.
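In the meantime, if characters like '!' and '/' are meant literally rather than as query syntax, one thing worth trying is SolrJ's escaping helper; a small sketch using the term from the question quoted below:

import org.apache.solr.client.solrj.util.ClientUtils;

public class EscapeExample {
  public static void main(String[] args) {
    String raw = "erlang!ericson";
    // escapes Lucene/Solr special characters such as '!' and '/'
    String escaped = ClientUtils.escapeQueryChars(raw);
    System.out.println("q=" + escaped);  // q=erlang\!ericson
  }
}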

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 satya swaroop 

> Hi All,
> i am getting different results when i used with some escape keys..
> for example:::
> 1) when i use this request
>http://localhost:8080/solr/select?q=erlang!ericson
>   the result obtained is
>   
>
> 2) when the request is
> http://localhost:8080/solr/select?q=erlang/ericson
>the result is
>  
>
>
> My query here is, do solr consider both the queries differently and what do
> it consider for !,/ and all other escape characters.
>
>
> Regards,
> satya
>


Re: White space in facet values

2010-12-22 Thread Marco Martinez
Try copying the values (with copyField) to a string field.
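A small SolrJ sketch combining that with the quoted filter query suggested below (field and value names come from the question):

import org.apache.solr.client.solrj.SolrQuery;

public class FacetFilterExample {
  public static SolrQuery build() {
    SolrQuery q = new SolrQuery("*:*");
    q.setFacet(true);
    q.addFacetField("Product");                       // ideally a string-typed copy of the field
    q.addFilterQuery("Product:\"Electric Guitar\"");  // quotes keep the whitespace literal
    return q;
  }
}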

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 Peter Karich 

>
>
> you should try fq=Product:"Electric Guitar"
>
>
> > How do I handle facet values that contain whitespace? Say I have a field
> "Product" that I want to facet on. A value for "Product" could be "Electric
> Guitar". How should I handle the white space in "Electric Guitar" during
> indexing? What about when I apply the constraint fq=Product:Electric Guitar?
>
> --
> http://jetwick.com open twitter search
>
>


function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Hi everyone,

My situation is the following: I need to sum the value of a field to the score of
the docs returned by the query, but not for all the docs. Example:

q=car returns 3 docs

1-
name=car ford
marketValue=1
score=1.3

2-
name=car citroen
marketValue=2
score=1.3

3-
name=car mercedes
marketValue=0.5
score=1.3

but if I want to sum the marketValue to the score, my returned list is the
following:

q=car+_val_:marketValue

1-
name=bus
marketValue=5
score=5

2-
name=car citroen
marketValue=2
score=3.3

3-
name=car ford
marketValue=1
score=2.3

4-
name=car mercedes
marketValue=0.5
score=1.8


Is it possible to apply the function query only to the documents returned by
the first query?


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Thanks, but I tried this and I saw that it works in a standard scenario; the
problem is that in my query I use my own query parser and it seems that it doesn't
apply the AND and returns all the docs in the index:

My query:
_query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned


Solr query parser
car AND _val_:marketValue -> 300 docs returned


Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Erik Hatcher 

> Try using AND (or set q.op):
>
>   q=car+AND+_val_:marketValue
>
> On Apr 12, 2011, at 07:11 , Marco Martinez wrote:
>
> > Hi everyone,
> >
> > My situation is the next, I need to sum the value of a field to the score
> to
> > the docs returned in the query, but not to all the docs, example:
> >
> > q=car returns 3 docs
> >
> > 1-
> > name=car ford
> > marketValue=1
> > score=1.3
> >
> > 2-
> > name=car citroen
> > marketValue=2
> > score=1.3
> >
> > 3-
> > name=car mercedes
> > marketValue=0.5
> > score=1.3
> >
> > but if want to sum the marketValue to the score, my returned list is the
> > next:
> >
> > q=car+_val_:marketValue
> >
> > 1-
> > name=bus
> > marketValue=5
> > score=5
> >
> > 2-
> > name=car citroen
> > marketValue=2
> > score=3.3
> >
> > 3-
> > name=car ford
> > marketValue=1
> > score=2.3
> >
> > 4-
> > name=car mercedes
> > marketValue=0.5
> > score=1.8
> >
> >
> > Its possible to apply the function query only to the documents returned
> in
> > the first query?
> >
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
>
>


Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
No, this query returns a few more documents than if I do it with the Lucene query
parser. I'm going to write another query parser that sends a simple term
query and see what the output is; when I have it, I will report back on the list.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Yonik Seeley 

> On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
>  wrote:
> > Thanks but I tried this and I saw that this work in a standard scenario,
> but
> > in my query i use a my own query parser and it seems that they dont doing
> > the AND and returns all the docs in the index:
> >
> > My query:
> > _query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned
>
> This would seem to point to your generated query {!bm25}car
> matching all docs for some reason?
>
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
>


Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
It seems that it is a problem with my own query; now I need to investigate whether
there is something different between a normal query and my implementation of
the query, because if you use it alone, it works properly.

Thanks,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/13 Marco Martinez 

> No, this query returns a few more documents than if a do it by lucene query
> parser. I'm going to generate another query parser that send a simple term
> query and see what is the output, when i have it, i will inform in the mail.
>
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2011/4/12 Yonik Seeley 
>
>> On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
>>  wrote:
>> > Thanks but I tried this and I saw that this work in a standard scenario,
>> but
>> > in my query i use a my own query parser and it seems that they dont
>> doing
>> > the AND and returns all the docs in the index:
>> >
>> > My query:
>> > _query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned
>>
>> This would seem to point to your generated query {!bm25}car
>> matching all docs for some reason?
>>
>> -Yonik
>> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
>> 25-26, San Francisco
>>
>
>


function queries scope

2011-06-07 Thread Marco Martinez
Hi,

I need to use the function query operations with the score of a given
query, but only on the docset that I get from the query, and I don't know if
this is possible.

Example:

q=shops in madrid  returns 1 docs with a specific score for each doc

but now I need to do some stuff like

q=sum(product(2,query(shops in madrid),productValueField)) but this will
return all the docs in my index.


I know that I can do it via filter queries, e.g. q=sum(product(2,query(shops
in madrid),productValueField))&fq=shops in madrid, but this will run the query
two times and I don't want that because performance is important for our
application.


Is there another approach to accomplish that?


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function queries scope

2011-06-07 Thread Marco Martinez
Thanks, but it's not what I'm looking for, because the BoostQParserPlugin
multiplies the score of the query by the function queries defined in the b
param of the BoostQParserPlugin, and I can't use edismax because we have
our own qparser. It seems that I have to code another qparser.


Thanks Yonik anyway,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/6/7 Yonik Seeley 

> One way is to use the boost qparser:
>
> http://search-lucene.com/jd/solr/org/apache/solr/search/BoostQParserPlugin.html
> q={!boost b=productValueField}shops in madrid
>
> Or you can use the edismax parser which as a "boost" parameter that
> does the same thing:
> defType=edismax&q=shops in madrid&boost=productValueField
>
>
> -Yonik
> http://www.lucidimagination.com
>
>
> On Tue, Jun 7, 2011 at 6:53 AM, Marco Martinez
>  wrote:
> > Hi,
> >
> > I need to use the function queries operations with the score of a given
> > query, but only in the docset that i get from the query and i dont know
> if
> > this is possible.
> >
> > Example:
> >
> > q=shops in madridreturns  1 docs  with a specific score for each
> doc
> >
> > but now i need to do some stuff like
> >
> > q=sum(product(2,query(shops in madrid),productValueField) but this will
> be
> > return all the docs in my index.
> >
> >
> > I know that i can do it via filter queries, ex,
> q=sum(product(2,query(shops
> > in madrid),productValueField)&fq=shops in madrid but this will do the
> query
> > two times and i dont want this because the performance is important to
> our
> > application.
> >
> >
> > Is there other approach to accomplished that=
> >
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
>


term positions performance

2011-07-20 Thread Marco Martinez
Hi,

I am developing a new query for term proximity and I am using the term positions
to get the positions of each term. I want to know if there are any clues to
increase the performance of using term positions, at index time or at query
time; all the fields to which I am applying the term positions are indexed.

Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: term positions performance

2011-07-20 Thread Marco Martinez
Also, I developed this query via a function query; I wonder if doing it via a
normal query would increase the performance.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/7/20 Marco Martinez 

> Hi,
>
> I am developing a new query term proximity and i am using the term
> positions to get the positions of each term. I want to know if there is any
> clues to increase the perfomance of using term positions, in index time o in
> query time, all my fields that i am applying the term positions are indexed.
>
> Thanks in advance,
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>


Re: embeded solrj doesn't refresh index

2011-07-20 Thread Marco Martinez
You should send a commit to your embedded Solr.
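A minimal sketch (assuming a SolrJ 3.x/4.x-style SolrServer handle for the embedded instance; field values are illustrative):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class EmbeddedCommitSketch {
  // 'solr' is your already-constructed EmbeddedSolrServer instance
  static void addAndCommit(SolrServer solr) throws Exception {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    solr.add(doc);
    solr.commit();  // without this, the embedded searcher keeps serving the old index
  }
}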

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/7/20 Jianbin Dai 

> Hi,
>
>
>
> I am using embedded solrj. After I add new doc to the index, I can see the
> changes through solr web, but not from embedded solrj. But after I restart
> the embedded solrj, I do see the changes. It works as if there was a cache.
> Anyone knows the problem? Thanks.
>
>
>
> Jianbin
>
>


Re: PositionIncrement gap and multi-valued fields.

2011-08-09 Thread Marco Martinez
Hi Luis,

As far as I know, the position increment gap only affects some queries,
like phrase queries if you use the slop. The position increment gap does
not affect the similarity scoring formula of Lucene:
score(q,d) = coord(q,d) · queryNorm(q) · Σ_{t in q} ( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) )

(the Lucene Practical Scoring Function; see
http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Similarity.html)
The first two factors normalize the query. In the summation, the first two
factors are related to the frequency of the term, in the document and in the
index, the third one is the boost of the term in the query, and the final one
encapsulates a few (indexing-time) boost and length factors; but the length
factors are calculated from the number of terms, and the position increment
gap doesn't create more tokens, so this factor doesn't affect the score either.

But if you use, for example, a multivalued field with a position increment
gap of 100, and you do a phrase query with a slop of less than 100, you prevent
matches across two separate values of this field, e.g.:

q=test:"A B"~99

doc1

A
B

You don't get any match for this doc, but if you do the query q=test:"A
B"~101 you will get doc1 as a match.


Bye!


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/8/8 Luis Cappa Banda 

> Hello!
>
> I have a doubt about the behaviour of searching over field types that have
> positionIncrementGap defined. For example, supose that:
>
>
>   1. We have a field called "test" defined as multi-valued and white space
>   tokenized.
>   2. The index has an single document with a "test" value:
>
> 
> TEST1
> 
> 
> AAA BBB
> 
> 
> CCC DDD
> 
> 
> EEE FFF
> 
> 
> TEST2
> 
>
>
> I read that positionIncrementGap defines the virtual space between the last
> token of one field instance and the first token of the next instance
> (source:
>
> http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html
> ).
> When it says "last token of one field instance" means that is the last
> token
> of the first entry from the multi-valued content? In our example before it
> will be "TEST1".
>
> Anyway, I've been doing some tests modifying the positionIncrementGap value
> with high values and low values. Can anybody explain me with detail which
> implications has in Solr scoring algorythm an upper and a lower value? I
> would like to understand how this value affects matching results in fields
> and also calculating the final score (maybe more gap implies more spaces
> and
> a worst score when the value matches, etc.).
>
> Thank you for reading so far!
>


Re: Problems with strange data appended to body field [SOLVED]

2012-07-09 Thread Marco Scalone
Problem solved. The problem was on the Drupal side: the Drupal core
search interfered with the apachesolr module and added extra information.


After deleting the core search tables and setting to 0 the number of items to
index on cron, I reindexed the site and the problem was solved.


Thanks.
Marco

On 09/07/12 11:59, Marco Scalone wrote:

Hello,
I'm new on this list; I have been using Solr for many months and
I'm trying to install and use it widely across the organization. But
doing some tests I noticed a problem in the result snippet it generates.


When I make a search, the result snippet shows a "stemmed" version of
the title repeated several times. I made the search from the Solr admin query page
and saw that the body was full of these strings. You will
understand with this example:


--- 


Contralor de la Edificación

4110
Espacios Públicos Habitat y Edificaciones
SERVICIO
Otorgamiento de permisos de construcción, de locales industriales y 
comerciales. Recepción de denuncias de obras sin permiso. Coordinación 
con Bomberos sobre temas de seguridad edilicia.Visite nuestra página: 
www.montevideo.gub.uy/ciudadania/contralor-de-la-edificacion

Interna

(contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edificacion, contralor edif, contralor 
edificacion, contralor edif, contralor edif, contralor edif, contralor 
edif, contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edificacion, contralor edif, contralor edificacion, 
contralor edif, contralor edificacion, contralor edificacion, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif, contralor edif, contralor edif, contralor edif, 
contralor edif)


- 



As you can see, the last part of the body has the stemmed version of the
title many times; the same happens with the spell field (because it's a
copy). The problem is that this is visible in the results and
highlighted.


I need help because I don't know why this is happening; it seems to be something
automatic in Solr, maybe some configuration.

I'm using Drupal 6 with standard modules; I'm attaching the config files.

Thanks a lot
Marco



--
Ing. Marco Scalone
División Tecnología de la Información
Intendencia de Montevideo
Tel.: 1950 int. 4426





Marco Scalone is out of the office.

2012-09-07 Thread Marco Scalone

I will be out of the office from Fri 07/09/2012 and will not return until
Thu 20/09/2012.

I will respond to your message when I return.



Re: Cannot get solr 1.3.0 to run properly with plesk 9.2.1 on CentOS

2009-08-18 Thread Marco Westermann

Hi,

I would guess that the problem is that you are using a JRE, not a JDK. I mean,
I have read that Solr requires a JDK.


with best regards,

Marco

Aaron Aberg schrieb:

Constantijn,

First of all, I want you to know how much I appreciate you not giving
up on me. Second of all, your instructions were really great. I think
that I am getting closer to solving this issue. I am STILL get that
error but after a full tomcat reboot it picked up my solr.home
environment variable from my web.xml and its pointing to the new
location. (Good idea)

Here is the FULL log from start up of Tomcat. It might be excessive,
but I want to give you all of the information that I can:

Aug 17, 2009 11:16:08 PM org.apache.catalina.core.AprLifecycleListener
lifecycleEvent
INFO: The Apache Tomcat Native library which allows optimal
performance in production environments was not found on the
java.library.path:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
Aug 17, 2009 11:16:09 PM org.apache.coyote.http11.Http11BaseProtocol init
INFO: Initializing Coyote HTTP/1.1 on http-9080
Aug 17, 2009 11:16:09 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 3382 ms
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
Aug 17, 2009 11:16:09 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: ContextListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: SessionListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: ContextListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: SessionListener: contextInitialized()
Aug 17, 2009 11:16:12 PM org.apache.catalina.core.ApplicationContext log
INFO: org.apache.webapp.balancer.BalancerFilter: init(): ruleChain:
[org.apache.webapp.balancer.RuleChain:
[org.apache.webapp.balancer.rules.URLStringMatchRule: Target string:
News / Redirect URL: http://www.cnn.com],
[org.apache.webapp.balancer.rules.RequestParameterRule: Target param
name: paramName / Target param value: paramValue / Redirect URL:
http://www.yahoo.com],
[org.apache.webapp.balancer.rules.AcceptEverythingRule: Redirect URL:
http://jakarta.apache.org]]
Aug 17, 2009 11:16:13 PM org.apache.coyote.http11.Http11BaseProtocol start
INFO: Starting Coyote HTTP/1.1 on http-8080
Aug 17, 2009 11:16:13 PM org.apache.jk.common.ChannelSocket init
INFO: JK: ajp13 listening on /0.0.0.0:8009
Aug 17, 2009 11:16:13 PM org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=0/57  config=null
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardService start
INFO: Starting service PSA
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.23
Aug 17, 2009 11:16:13 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
locateInstanceDir
INFO: Using JNDI solr.home: /usr/share/solr
Aug 17, 2009 11:16:15 PM
org.apache.solr.core.CoreContainer$Initializer initialize
INFO: looking for solr.xml: /usr/share/solr/solr.xml
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/usr/share/solr/'
Aug 17, 2009 11:16:15 PM org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Reusing parent classloader
Aug 17, 2009 11:16:15 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.ExceptionInInitializerError
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at 
org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:78)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)

Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Marco Westermann

Hi Paul,

I would say you should use the copyField tag in the schema, e.g.:



The text field has to be defined as multiValued=true. When you now do an
unqualified search, it will search every field which is copied to the
text field.


with best regards,

Marco Westermann

Paul Tomblin schrieb:

I've got "text" and so if I
do an unqualified search it only finds in the field text.  If I want
to search title, I can do "title:foo", but what if I want to find if
the search term is in any field, or if it's in "text" or "title" or
"concept" or "keywords"?  I already tried "*:foo", but that throws an
exception:

Caused by: org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.client.solrj.SolrServerException:
org.apache.solr.common.SolrException: undefined field *
 [java] at
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:161)


  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Re: Can I search for a term in any field or a list of fields?

2009-08-18 Thread Marco Westermann
Exactly! For example, you could create a field called "all" and copy to it
the fields which should be searched when all fields are searched.


Then you have two possibilities: either you make this field the
defaultSearchField for unqualified searches, or you qualify the
field in the query (all:foo) and all fields are searched which have been
copied to the all-field.


best
Marco

Paul Tomblin schrieb:

So if I want to make it so that the default search always searches
three specific fields, I can make another field multi-valued that they
are all copied into?

On Tue, Aug 18, 2009 at 10:46 AM, Marco Westermann wrote:
  

I would say, you should use the copyField tag in the schema. eg:



the text-field has to be difined as multivalued=true. When you now do an
unqualified search, it will search every field, which is copied to the
text-field.





  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



dynamic changes to schema

2009-08-18 Thread Marco Westermann

Hi there,

is there a possibility to change the Solr schema from PHP dynamically?
The web application I want to index at the moment has the feature to add
fields to entities, and you can tell these fields that they are searchable.
To realize this with Solr, the schema has to change when a searchable
field is added or removed.


Any suggestions,

Thanks a lot,

Marco Westermann

--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Re: dynamic changes to schema

2009-08-18 Thread Marco Westermann

hi,

thanks for the advice, but the problem with dynamic fields is that I
cannot restrict how the user names the field in the application, so
there isn't a pattern I can use. But I thought about using multivalued
fields for the dynamically added fields. Good idea?


thanks,
Marco

Constantijn Visinescu schrieb:

use a dynamic field ?

On Tue, Aug 18, 2009 at 5:09 PM, Marco Westermann  wrote:

  

Hi there,

is there a possibility to change the solr-schema over php dynamically. The
web-application I want to index at the moment has the feature to add fields
to entitys and you can tell this fields that they are searchable. To realize
this with solr the schema has to change when a searchable field is added or
removed.

Any suggestions,

Thanks a lot,

Marco Westermann

--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander






__ Hinweis von ESET NOD32 Antivirus, Signaturdatenbank-Version 4346 
(20090818) __

E-Mail wurde geprüft mit ESET NOD32 Antivirus.

http://www.eset.com


  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Re: dynamic changes to schema

2009-08-19 Thread Marco Westermann

Hi, thanks for your answers; I think I have to go into more detail.

We are talking about a shop application which has products I want to
search for. These products normally have standard attributes like
SKU, a name, a price and so on. But the user can add attributes to the
product. So, for example, if he sells books, he could add the author as an
attribute. Let's say he names this field my_author (but he is free to name
it as he wants) and he tells this field, via the configuration, that it
is searchable. So I need a field in Solr for the author. Since I can't
restrict the user to prefix every field with something like my_, dynamic
fields don't work, do they?


best,
Marco

Constantijn Visinescu schrieb:

huh? I think I lost you :)
You want to use a multivalued field to list what dynamic fields you have in
your document?

Also if you program your application correctly you should be able to
restrict your users from doing anything you please (or don't please in this
case).


On Tue, Aug 18, 2009 at 11:38 PM, Marco Westermann  wrote:

  

hi,

thanks for the advise but the problem with dynamic fields is, that i cannot
restrict how the user calls the field in the application. So there isn't a
pattern I can use. But I thought about using mulitvalued fields for the
dynamically added fields. Good Idea?

thanks,
Marco

Constantijn Visinescu schrieb:



use a dynamic field ?

On Tue, Aug 18, 2009 at 5:09 PM, Marco Westermann 
wrote:



  

Hi there,

is there a possibility to change the solr-schema over php dynamically.
The
web-application I want to index at the moment has the feature to add
fields
to entitys and you can tell this fields that they are searchable. To
realize
this with solr the schema has to change when a searchable field is added
or
removed.

Any suggestions,

Thanks a lot,

Marco Westermann

--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander






__ Hinweis von ESET NOD32 Antivirus, Signaturdatenbank-Version
4346 (20090818) __

E-Mail wurde geprüft mit ESET NOD32 Antivirus.

http://www.eset.com




  

--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander






__ Hinweis von ESET NOD32 Antivirus, Signaturdatenbank-Version 4346 
(20090818) __

E-Mail wurde geprüft mit ESET NOD32 Antivirus.

http://www.eset.com


  



--
++ Business-Software aus einer Hand ++
++ Internet, Warenwirtschaft, Linux, Virtualisierung ++
http://www.intersales.de
http://www.eisxen.org
http://www.tarantella-partner.de
http://www.medisales.de
http://www.eisfair.net

interSales AG Internet Commerce
Subbelrather Str. 247
50825 Köln

Tel  02 21 - 27 90 50
Fax  02 21 - 27 90 517
Mail i...@intersales.de
Mail m...@intersales.de
Web  www.intersales.de

Handelsregister Köln HR B 30904
Ust.-Id.: DE199672015
Finanzamt Köln-Nord. UstID: nicht vergeben
Aufsichtsratsvorsitzender: Michael Morgenstern
Vorstand: Andrej Radonic, Peter Zander 



Lock on old index files

2009-11-26 Thread Branca Marco
Hi everybody,
I'm experiencing a problem with my Solr-based web application running on a Sun 
Solaris OS.

It seems that the application still holds file descriptors to index files even 
after those files have been removed. It can be observed mainly when the 
snapinstaller script is executed, but we see a similar problem on boxes 
where this script is not used and only import and optimization commands are 
executed.

In particular, if we look at all the file descriptors opened by the application, 
we see that, most of the time, there are duplicate entries. For example, there 
are two different inodes, pointed to by two different file descriptors, but if 
we look more closely at them they are clearly pointing to the same file (the 
size is exactly the same). The strange thing is that only one of them is 
actually in the index folder, while the other one is no longer there.

When the application is restarted, duplicate entries disappear and the open 
file descriptors correctly point to existing files in the index folder.

Has anyone encountered a similar behavior on their installation?

Thanks in advance,

Marco



Faceting - grouping results

2009-04-28 Thread Branca Marco
Hi,
I have a question about faceting.
I'm trying to retrieve results grouped by a certain field. I know that I can 
specify the parameter "rows" in order to ask Solr to return only "rows" 
documents.
What I would like to do is to ask Solr to return a certain number of documents 
for each category found in the faceting info.
For example, calling the URL

 
http://[server-ip]:[server-port]/select?q=*:*&facet=on&facet.field=xxx[&SOMETHING_ELSE]

on a set of indexed documents where xxx can assume the following values:
 - A
 - B
 - C

I would like to know what to set in the Solr URL in order to obtain, for 
instance:
 - at most 10 docs with xxx=A
 - at most 10 docs with xxx=B
 - at most 10 docs with xxx=C

Thank you for your help,

Marco Branca
Consultant Sytel Reply S.r.l.
Via Ripamonti,  89 - 20139 Milano
Mobile: (+39) 348 2298186
e-mail: m.bra...@reply.it
Website: www.reply.eu



Filtering query terms

2009-05-22 Thread Branca Marco
Hi,

I am experiencing problems using filters.

I'm using the following version of Solr:
  solr/nightly of 2009-04-12

The part of the schema.xml I'm using for setting filters is the following:


  





  
  





  


and the field I'm querying is a field called "all" declared as follows:



When I try testing the filter "solr.LowerCaseFilterFactory" I get different 
results calling the following urls:

 1. 
http://[server-ip]:[server-port]/solr/[core-name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
 2. 
http://[server-ip]:[server-port]/solr/[core-name]/select/?q=all%3APaPa&version=2.2&start=0&rows=10&indent=on

Besides, when trying to test the "solr.ISOLatin1AccentFilterFactory" I get 
different results calling the following urls:

 1. 
http://[server-ip]:[server-port]/solr/[core-name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
 2. 
http://[server-ip]:[server-port]/solr/[core-name]/select/?q=all%3Apapà&version=2.2&start=0&rows=10&indent=on

Is this the expected behavior or is it a (known) bug? I would like to apply a 
filter that converts all searched words to their lowercase version without 
accents.

Thanks for your help,

Marco




R: Filtering query terms

2009-05-22 Thread Branca Marco
Thank you very much for the instantaneous support.
I couldn't find the conflict for hours :(

When I have an answer about the ISOLatin1AccentFilterFactory I will post it to 
the mailing list.

Thanks again,

Marco

Da: Ensdorf Ken [ensd...@zoominfo.com]
Inviato: venerdì 22 maggio 2009 18.16
A: 'solr-user@lucene.apache.org'
Oggetto: RE: Filtering query terms

> When I try testing the filter "solr.LowerCaseFilterFactory" I get
> different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3APaPa&version=2.2&start=0&rows=10&indent=on

In this case, the WordDelimiterFilterFactory is kicking in on your second 
search, so "PaPa" is split into "Pa" and "Pa".  You can double-check this by 
using the analysis tool in the admin UI - 
http://localhost:8983/solr/admin/analysis.jsp

>
> Besides, when trying to test the "solr.ISOLatin1AccentFilterFactory" I
> get different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapà&version=2.2&start=0&rows=10&indent=on

Not sure what is happening here, but again I would check it with the analysis 
tool



R: Filtering query terms

2009-05-25 Thread Branca Marco
Hi,
I tested the new filters' configuration and it works fine.


  





  
  





  


The problem with the ISOLatin1AccentFilterFactory was not due to Solr, but to a 
core-dependent configuration in a Solr multi-core environment. It was only 
necessary to set the 'splitOnCaseChange' property to 0 in 
solr.WordDelimiterFilterFactory.
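
For reference, the relevant filter definition then looks roughly like this 
(the other attributes are the ones from our earlier schema and are only 
illustrative here):

<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="0"
        splitOnCaseChange="0"/>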

Thanks for your support,

Marco


Marco Branca
Consultant Sytel Reply S.r.l.
Via Ripamonti,  89 - 20139 Milano
Mobile: (+39) 348 2298186
e-mail: m.bra...@reply.it
Website: www.reply.eu

Da: Ensdorf Ken [ensd...@zoominfo.com]
Inviato: venerdì 22 maggio 2009 18.16
A: 'solr-user@lucene.apache.org'
Oggetto: RE: Filtering query terms

> When I try testing the filter "solr.LowerCaseFilterFactory" I get
> different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3APaPa&version=2.2&start=0&rows=10&indent=on

In this case, the WordDelimiterFilterFactory is kicking in on your second 
search, so "PaPa" is split into "Pa" and "Pa".  You can double-check this by 
using the analysis tool in the admin UI - 
http://localhost:8983/solr/admin/analysis.jsp

>
> Besides, when trying to test the "solr.ISOLatin1AccentFilterFactory" I
> get different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapà&version=2.2&start=0&rows=10&indent=on

Not sure what is happening here, but again I would check it with the analysis 
tool



Re: Solr query parser doesn't invoke analyzer for simple term query?

2010-03-17 Thread Marco Martinez
Hello,

You can see what happens (which analyzers are used for this field and what
output they produce) for this search using the analysis page of the Solr
admin web interface. I assume you are using the same analyzers and
tokenizers for indexing and searching on this field in your schema.

Regards,


Marco Martínez Bautista



2010/3/17 Teruhiko Kurosaka 

> It seems that Solr's query parser doesn't pass a single term query
> to the Analyzer for the field. For example, if I give it
> 2001年 (year 2001 in Japanese), the searcher returns 0 hits
> but if I quote them with double-quotes, it returns hits.
> In this experiment, I configured schema.xml so that
> the field in question will use the morphological Analyzer
> my company makes that is capable of splitting 2001年
> into two tokens 2001 and 年.  I am guessing that this
> Analyzer is called ONLY IF the term is a phrase.
> Is my observation correct?
>
> If so, is there any configuration parameter that I can tweak
> to force any query for the text fields be processed by
> the Analyzer?
>
> One might ask why users won't put space between 2001 and 年.
> Well if they are clearly two separate words, people do that.
> But 年 works more like a suffix in this case, and in many
> Japanese speaker's mind, 2001年 seems like one token, so
> many people won't.  (Remember Japanese don't use spaces
> in normal writing.)  Forcing to use Analyzer would also
> be useful for compound word handling often desirable
> for languages like German.
>
> 
> Teruhiko "Kuro" Kurosaka
> RLP + Lucene & Solr = powerful search for global contents
>
>


Re: Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Marco Martinez
Hi Marcin,

This is because when you do the replication, all the caches are rebuilt
because the index has changed, so search performance decreases. You can
change your architecture to a multicore one to reduce the impact of the
replication: use two cores, one to do the replication and the other to
search, and when the replication is done, swap the cores so the caches
stay warm all the time.
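
Once the replicated core is up to date, the swap itself can be done through
the CoreAdmin handler, along these lines (host, port and core names are
illustrative):

http://localhost:8983/solr/admin/cores?action=SWAP&core=core0&other=core1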

Regards


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/9 Marcin 

> Hi guys,
>
> I have noticed that Master/Slave replication process is slowing down slave
> read/search performance during replication being done.
>
>
> please help
> cheers
>


Re: Facet count problem

2010-04-19 Thread Marco Martinez
Hi Ranveer,

The error in the facet counts is caused by the tokenized field that you are
using. If you want to facet on the whole string, use a fieldType that doesn't
split the field into tokens, such as the string field.
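
A minimal sketch of what that could look like in schema.xml (field names are
illustrative; keep the tokenized field for searching and facet on the string
copy):

<field name="category" type="text" indexed="true" stored="true"/>
<field name="category_facet" type="string" indexed="true" stored="false"/>
<copyField source="category" dest="category_facet"/>

Then facet with facet.field=category_facet.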

Regards,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/19 Ranveer Kumar 

> Hi Erick,
>
> My schema configuration is following.
>
>
>  
>  
>
>
>
>
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>  
>  
>  
>  
>
>
>
>   
> ignoreCase="true" expand="true"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>  
>
>
>
> 
>
> 
>  
>
>
>
>
>
> On Mon, Apr 19, 2010 at 6:22 AM, Erick Erickson  >wrote:
>
> > Can we see the actual field definitions from your schema file.
> > Ahmet's question is vital and is best answered if you'll
> > copy/paste the relevant configuration entries. But based
> > on what you *have* posted, I'd guess you're trying to
> > facet on tokenized fields, which is not recommended.
> >
> > You might take a look at:
> > http://wiki.apache.org/solr/UsingMailingLists, it'll help you
> > frame your questions in a manner that gets you your
> > answers as fast as possible.
> >
> > Best
> > Erick
> >
> > On Sun, Apr 18, 2010 at 12:59 PM, Ranveer Kumar  > >wrote:
> >
> > > I am using text for the type, which is static. For example: type is a field
> > and
> > > I am using type for categorization. For news type I am using news and
> for
> > > blog using blog.. type is a text field.
> > >
> > > On Apr 17, 2010 8:38 PM, "Ahmet Arslan"  wrote:
> > >
> > > > I am facing problem to get facet result count. I must be > wrong
> > > somewhere. > I am getting proper ...
> > > Are you faceting on a tokenized field? What is the fieldType of your
> > field?
> > >
> >
>


Re: synonym filter problem for string or phrase

2010-04-29 Thread Marco Martinez
Hi Ranveer,

If you don't specify a field in the q parameter, the search will be done
against your default search field defined in the solrconfig.xml. Is your
default field a text_sync field?

Regards,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/29 Ranveer 

> Hi,
>
> I am trying to configure synonym filter.
> my requirement is:
> when user searching by phrase like "what is solr user?" then it should be
> replace with "solr user".
> something like : what is solr user? => solr user
>
> My schema for particular field is:
>
>  positionIncrementGap="100">
> 
> 
> 
>
> 
> 
> 
> 
> 
>  ignoreCase="true" expand="true" tokenizerFactory="KeywordTokenizerFactory"/>
> 
> 
>
> it seems working fine while trying by analysis.jsp but not by url
> http://localhost:8080/solr/core0/select?q="what is solr user?"
> or
> http://localhost:8080/solr/core0/select?q=what is solr user?
>
> Please guide me for achieve desire result.
>
>


Re: synonym filter problem for string or phrase

2010-05-03 Thread Marco Martinez
Hi Ranveer,

I don't see any stemming analyzer in your configuration of the field
'text_sync'. Also, you have the synonym filter applied at query time and not
at index time; maybe that is your problem.
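
As a rough sketch (field type, analyzer chain and file names are illustrative),
moving the synonyms to the index-time analyzer would look like:

<fieldType name="text_sync" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>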


Regards,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/30 Jonty Rhods 

> On 4/29/10 8:50 PM, Marco Martinez wrote:
>
> Hi Ranveer,
>
> If you don't specify a field type in the q parameter, the search will be
> done searching in your default search field defined in the solrconfig.xml,
> its your default field a text_sync field?
>
> Regards,
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2010/4/29 Ranveer 
>
>
>
> Hi,
>
> I am trying to configure synonym filter.
> my requirement is:
> when user searching by phrase like "what is solr user?" then it should be
> replace with "solr user".
> something like : what is solr user? =>  solr user
>
> My schema for particular field is:
>
>  positionIncrementGap="100">
> 
> 
> 
>
> 
> 
> 
> 
> 
>  ignoreCase="true" expand="true"
> tokenizerFactory="KeywordTokenizerFactory"/>
>
> 
> 
>
> it seems working fine while trying by analysis.jsp but not by url
> http://localhost:8080/solr/core0/select?q="what is solr user?"
> or
> http://localhost:8080/solr/core0/select?q=what is solr user?
>
> Please guide me for achieve desire result.
>
>
>
>
>
>
> Hi Marco,
> thanks.
> yes my default search field is text_sync.
> I am getting result now but not as I expect.
> following is my synonym.txt
>
> what is bone cancer=>bone cancer
> what is bone cancer?=>bone cancer
> what is of bone cancer=>bone cancer
> what is symptom of bone cancer=>bone cancer
> what is symptoms of bone cancer=>bone cancer
>
> in above I am getting result of all synonym but not the last one "what is
> symptoms of bone cancer=>bone cancer".
> I think due to stemming I am not getting expected result. However when I am
> checking result from the analysis.jsp,
> its giving expected result. I am confused..
> Also I want to know best approach to configure synonym for my requirement.
>
> thanks
> with regards
>
> Hi,
>
> I am also facing same type of problem..
> I am Newbie please help.
>
> thanks
> Jonty
>


Re: multivalue fields logic required

2010-05-06 Thread Marco Martinez
Hi Jonty,

I think you have three possible solutions:


   1. Use the collapse component with your name field so you don't get any
   duplicate documents.
   2. Create some simple logic in your index with flags, like one flag to
   mark the first element of the same document (in your example you will
   have three different documents and the first one will have this flag=true).
   If the search only has the name, you will have to set this flag to true; if
   not, the dept or the student will be defined and you will have one document
   returned.
   3. Do a post-processing of your data.

There may be more solutions, but these are the ones I have thought of right
now.

Regards,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Jonty Rhods 

> thanks
>
> :General solution is to index 3 different SolrDocument in your example. id
> and name fields will repeat themselves. All fields will be single-valued.
>
> if I am indexing 3 different field then if user is searching by name + dept
> then it will return duplicate value.. is there any other best possible
> way..?
>
> thanks
> On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan  wrote:
>
> >
> > > recently I start to work on solr, So I am still very new to
> > > use solr. Sorry
> > > if I am logically wrong.
> > > I have two table, parent and referenced (child).
> > >
> > > for that I set multivalue field following is my schema
> > > details
> > >   > > stored="true" required="true"
> > > />
> > >
> > >
> > > > > indexed="true" stored="true"/>
> > >
> > > > > indexed="true" stored="true"
> > > multiValued="true"/>
> > > > > indexed="true" stored="true"
> > > multiValued="true"/>
> > >
> > > indexed data details:
> > >
> > > 
> > >
> > >   
> > > student1
> > > student2
> > > student3
> > >   
> > >
> > >   
> > > city1
> > > city2
> > > city3
> > >   
> > >  1
> > >
> > >  
> > >name of emp
> > >   
> > >
> > > 
> > >
> > > now my question is :
> > > When user is searching by city2 then I want to return
> > > employee2 and their id
> > > (for multi value field).
> > > something like:
> > >
> > > 
> > >
> > >   
> > >
> > > student2
> > >
> > >   
> > >
> > >   
> > >
> > > city2
> > >
> > >   
> > >  1
> > >
> > >  
> > >name of emp
> > >   
> > >
> > > 
> > >
> >
> > I had a similar need before. AFAIK you cannot do it with multivalued
> > fields. The indexing order is preserved in multivalued field. May be you
> can
> > post-process returned fields and capture correct position of matched city
> > field, and use this index to display correct dept value. But this is easy
> if
> > you are using string or integer type for city and dept.
> >
> > General solution is to index 3 different SolrDocument in your example. id
> > and name fields will repeat themselves. All fields will be single-valued.
> >
> >
> >
> >
> >
>


Re: hi to everyone

2010-05-06 Thread Marco Martinez
You should specify the core in your request, like
http://localhost:8080/solr/core0/update?...  where /solr/ is your
webapp and 'core0' is the name of the core.
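
For example, an XML update message posted to that core's /update handler would
look roughly like this (field names are illustrative; remember to send a
<commit/> afterwards):

<add>
  <doc>
    <field name="id">1</field>
    <field name="name">example document</field>
  </doc>
</add>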

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Antonello Mangone 

> Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
> (this is the 4th day :D).
> I'm just a novice and I would like to ask a question ...
>
> I'm using solr in multicore mode but I don't understand how to add xml
> documents to a particular core ...
> Can someone help me ???
>
> Antonello
>


Re: hi to everyone

2010-05-06 Thread Marco Martinez
See this page
http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl
and the Solr tutorial
http://lucene.apache.org/solr/tutorial.html (maybe you can use the
post.jar).

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Antonello Mangone 

> Ok, you're right :D
>
> I'll explain my situation ...
>
> I have solr locally on my machine
>
> */home/antonello/solrtest*
>
> inside the folder solrtest I have:
>
> |_ build
> |_ build.xml
> |_ CHANGES.txt
> |_ client
> |_ common-build.xml
> |_ contrib
> |_ dist
> |_ docs
> |_ etc
> |_ lib
> |_ LICENSE.txt
> |_ logs
> |_ multicore
>|_ bandb
>|_ conf
>|_ schema.xml
>|_ solrconfig.xml
>|_ data
>|_ index
>|_ segments_1
>|_ segments.gen
>|_ solr.xml
> |_ NOTICE.txt
> |_ README.txt
> |_ src
> |_ start.jar
> |_ start_multicore.sh
> |_ webapps
>
>
> I also have xml files in another place and I would like to add these xml
> files to the bandb core.
> Is there a command to add an xml file to a particular core, imagining we
> can
> have an indefinite number of cores ?
>
>
>
>
>
> 2010/5/6 Marco Martinez 
>
> > You should specify the core in your request, like
> > http://localhost:8080/solr/*core0*/update?...  where /solr/ is your
> > webapp and 'core0' is the name of the core.
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/6 Antonello Mangone 
> >
> > > Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
> > > (this is the 4th day :D).
> > > I'm just a novice and I would like to ask a question ...
> > >
> > > I'm using solr in multicore mode but I don't understand how to add xml
> > > documents to a particular core ...
> > > Can someone help me ???
> > >
> > > Antonello
> > >
> >
>


Re: JTeam Spatial Plugin

2010-05-12 Thread Marco Martinez
Hi,


You can use localsolr (http://www.gissearch.com/localsolr), which supports
sharding, if you need this feature.



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/11 Jean-Sebastien Vachon 

> Hi,
>
> Thanks for your suggestion but I received more information about this issue
> from one of the JTeam's developer and he told me that
> my problem was caused by the plugin not supporting sharding at this time.
>
> In my case, I noticed that individual shards were computing the distance
> through the geo_distance field.
> However, the "master" Solr instance controlling the shards was kind of
> losing this information due to the lack of support for shards.
>
> For now there is no quick work around that I know of.
>
> Later,
>
> On 2010-05-11, at 2:54 PM, Michael wrote:
>
> > Try using "geo_distance" in the return fields.
> >
> > On Thu, Apr 29, 2010 at 9:26 AM, Jean-Sebastien Vachon
> >  wrote:
> >> Hi All,
> >>
> >> I am using JTeam's Spatial Plugin RC3 to perform spatial searches on my
> index and it works great. However, I can't seem to get it to return the
> computed distances.
> >>
> >> My query component is run before the geoDistanceComponent and the
> distanceField is set to "distance"
> >> Fields for lat/long are defined as well and the different tiers field
> are in the results. Increasing the radius cause the number of matches to
> increase so I guess that my setup is working...
> >>
> >> Here is sample query and its output (I removed some of the fields to
> keep it short):
> >>
> >>
> /select?passkey=sample&q={!spatial%20lat=40.27%20long=-76.29%20radius=22%20calc=arc}title:engineer&wt=json&indent=on&fl=*,distance
> >>
> >> 
> >>
> >> {
> >>  "responseHeader":{
> >>  "status":0,
> >>  "QTime":69,
> >>  "params":{
> >>"fl":"*,distance",
> >>"indent":"on",
> >>"q":"{!spatial lat=40.27 long=-76.29 radius=22
> calc=arc}title:engineer",
> >>"wt":"json"}},
> >>  "response":{"numFound":223,"start":0,"docs":[
> >>{
> >>
> >> "title":"Electrical Engineer",
> >>"long":-76.3054962158203,
> >> "lat":40.037899017334,
> >> "_tier_9":-3.004,
> >> "_tier_10":-6.0008,
> >> "_tier_11":-12.0016,
> >> "_tier_12":-24.0031,
> >> "_tier_13":-47.0061,
> >> "_tier_14":-93.00122,
> >> "_tier_15":-186.00243,
> >> "_tier_16":-372.00485},
> >> }}
> >>
> >> This output suggests to me that everything is in place. Anyone knows how
> to fetch the computed distance? I tried adding the field 'distance' to my
> list of fields but it didn't work
> >>
> >> Thanks
> >>
>
>


Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
Hi,

2º solution:

Don't use multiValued fields; instead use two single-valued fields. In your
example this would be:

doc1:
dept: student1
city: city1
principalFlag:T
doc2:
dept: student2
city: city2
principalFlag:F

So, if you search without specifying any city or dept, you should add
principalFlag:T so you don't get duplicates in your response. And if you
specify a city or a dept, there is no need to specify the principalFlag,
because you will only get the results that match your fields (you don't get
duplicates).
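
As indexed documents that would look roughly like this (the id values are
made up for the example):

<add>
  <doc>
    <field name="id">1-1</field>
    <field name="name">name of emp</field>
    <field name="dept">student1</field>
    <field name="city">city1</field>
    <field name="principalFlag">T</field>
  </doc>
  <doc>
    <field name="id">1-2</field>
    <field name="name">name of emp</field>
    <field name="dept">student2</field>
    <field name="city">city2</field>
    <field name="principalFlag">F</field>
  </doc>
</add>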

3º solution:

Do a post-processing step to eliminate the fields in your response that you
don't need, that is, keep only the city and the dept that should be in the
query response.

Hope this will help



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/12 Jonty Rhods 

> Hi Marco,
>
> I am trying to patch for collapse component support (till now no luck)..
> In mean time I would like to know the 2nd and 3rd option you mentioned
> (logic in solrj)..
>
> with regards
>
> On Thu, May 6, 2010 at 2:36 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi Jonty,
> >
> > I think you have three possible solutions:
> >
> >
> >   1. Use the collapse component with your name field for not have any
> >   duplicates documents.
> >   2. Create a simple logic in your index with flags, like one flag to
> >   determine the first element of the same document (in your example you
> > will
> >   have three differents documents and the fist one wiill have this
> > flag=true).
> >   If the search only have name, you will have to set this flag to true,
> if
> >   not, the dept or the student will be defined and you will have one
> > document
> >   returned.
> >   3. Do a post-processing of your data.
> >
> > Maybe you will have more solutions but these are what i have thought
> right
> > now.
> >
> > Regards,
> >
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/6 Jonty Rhods 
> >
> > > thanks
> > >
> > > :General solution is to index 3 different SolrDocument in your example.
> > id
> > > and name fields will repeat themselves. All fields will be
> single-valued.
> > >
> > > if I am indexing 3 different field then if user is searching by name +
> > dept
> > > then it will return duplicate value.. is there any other best possible
> > > way..?
> > >
> > > thanks
> > > On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan 
> wrote:
> > >
> > > >
> > > > > recently I start to work on solr, So I am still very new to
> > > > > use solr. Sorry
> > > > > if I am logically wrong.
> > > > > I have two table, parent and referenced (child).
> > > > >
> > > > > for that I set multivalue field following is my schema
> > > > > details
> > > > >   > > > > stored="true" required="true"
> > > > > />
> > > > >
> > > > >
> > > > > > > > > indexed="true" stored="true"/>
> > > > >
> > > > > > > > > indexed="true" stored="true"
> > > > > multiValued="true"/>
> > > > > > > > > indexed="true" stored="true"
> > > > > multiValued="true"/>
> > > > >
> > > > > indexed data details:
> > > > >
> > > > > 
> > > > >
> > > > >   
> > > > > student1
> > > > > student2
> > > > > student3
> > > > >   
> > > > >
> > > > >   
> > > > > city1
> > > > > city2
> > > > > city3
> > > > >   
> > > > >  1
> > > > >
> > > > >  
> > > > >name of emp
> > > > >   
> > > > >
> > > > > 
> > > > >
> > > > > now my question is :
> > > > > When user is searching by city2 then I want to return
> > > > > employee2 and their id
> > > > > (for multi value field).
> > > > > something like:
> > > > >
> > > > > 
> > > > >
> > > > >   
> > > > >
> > > > > student2
> > > > >
> > > > >   
> > > > >
> > > > >   
> > > > >
> > > > > city2
> > > > >
> > > > >   
> > > > >  1
> > > > >
> > > > >  
> > > > >name of emp
> > > > >   
> > > > >
> > > > > 
> > > > >
> > > >
> > > > I had a similar need before. AFAIK you cannot do it with multivalued
> > > > fields. The indexing order is preserved in multivalued field. May be
> > you
> > > can
> > > > post-process returned fields and capture correct position of matched
> > city
> > > > field, and use this index to display correct dept value. But this is
> > easy
> > > if
> > > > you are using string or integer type for city and dept.
> > > >
> > > > General solution is to index 3 different SolrDocument in your
> example.
> > id
> > > > and name fields will repeat themselves. All fields will be
> > single-valued.
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
You should do some preprocessing (multiply your document into as many
documents as you have values in your multivalued field, with principalFlag:T
on the first document) before indexing the data with that logic.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/12 Jonty Rhods 

> hi Marco,
>
> Thanks for quick reply..
> I have another doubt: In 2nd solution: How to set flag for duplicate value.
> because I am not sure about the no fo duplicate rows (it could be random
> no..)
> so how can I set the flag..
> thank
>
> On Wed, May 12, 2010 at 12:59 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi,
> >
> > 2º solution:
> >
> > Not use multiValue fields, instead use two single fields, in your example
> > will be:
> >
> > doc1:
> > dept: student1
> > city: city1
> > principalFlag:T
> > doc2:
> > dept: student2
> > city: city2
> > principalFlag:F
> >
> > So, if you search without specify any city or dept, you should put
> > princiaplFlag:T for no get duplicate on your response. And if you specify
> a
> > city or a dept, there is no need to specify the principalFlag because you
> > will only get the result that match with your fields (you dont get
> > duplicates).
> >
> > 3º solution:
> >
> > Do a postprocessing to eleminate the fields in your response that you
> dont
> > need, i mean, get only the city and the dept that should be in the query
> > response.
> >
> > Hope this will help
> >
> >
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/12 Jonty Rhods 
> >
> > > Hi Marco,
> > >
> > > I am trying to patch for collapse component support (till now no
> luck)..
> > > In mean time I would like to know the 2nd and 3rd option you mentioned
> > > (logic in solrj)..
> > >
> > > with regards
> > >
> > > On Thu, May 6, 2010 at 2:36 PM, Marco Martinez <
> > > mmarti...@paradigmatecnologico.com> wrote:
> > >
> > > > Hi Jonty,
> > > >
> > > > I think you have three possible solutions:
> > > >
> > > >
> > > >   1. Use the collapse component with your name field for not have any
> > > >   duplicates documents.
> > > >   2. Create a simple logic in your index with flags, like one flag to
> > > >   determine the first element of the same document (in your example
> you
> > > > will
> > > >   have three differents documents and the fist one wiill have this
> > > > flag=true).
> > > >   If the search only have name, you will have to set this flag to
> true,
> > > if
> > > >   not, the dept or the student will be defined and you will have one
> > > > document
> > > >   returned.
> > > >   3. Do a post-processing of your data.
> > > >
> > > > Maybe you will have more solutions but these are what i have thought
> > > right
> > > > now.
> > > >
> > > > Regards,
> > > >
> > > >
> > > > Marco Martínez Bautista
> > > > http://www.paradigmatecnologico.com
> > > > Avenida de Europa, 26. Ática 5. 3ª Planta
> > > > 28224 Pozuelo de Alarcón
> > > > Tel.: 91 352 59 42
> > > >
> > > >
> > > > 2010/5/6 Jonty Rhods 
> > > >
> > > > > thanks
> > > > >
> > > > > :General solution is to index 3 different SolrDocument in your
> > example.
> > > > id
> > > > > and name fields will repeat themselves. All fields will be
> > > single-valued.
> > > > >
> > > > > if I am indexing 3 different field then if user is searching by
> name
> > +
> > > > dept
> > > > > then it will return duplicate value.. is there any other best
> > possible
> > > > > way..?
> > > > >
> > > > > thanks
> > > > > On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan 
> > > wrote:
> > > > >
> > > > > >
> > > > > > > recently I start to work on solr, So I am still very new to
> > > > > > > use solr. Sorry
> > >

Re: Question on pf (Phrase Fields)

2010-05-13 Thread Marco Martinez
I don't know if this solution meets your requirements, but you can use fq to
do the query when it is only "foo", and q when you search with more terms.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/13 Blargy 

>
> Is there any way to configure this so it only takes effect if you match more
> than one word?
>
> For example if I search for: "foo" it should have no effect on scoring, but
> if I search for "foo bar" then it should.
>
> Is this possible? Thanks
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Question-on-pf-Phrase-Fields-tp815095p815095.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


disable caches in real time

2010-05-14 Thread Marco Martinez
Hi,

I want to know if there is any approach to disable caches in a specific core
from a multicore server.

My situation is the following:

I have a multicore server where core0 will listen to the queries and another
core (core1) will be replicated from a master server. Once the replication
has been done, I will swap the cores. My point is that I want to disable the
caches in the core that is in charge of the replication, to save memory on
the machine.

Any suggestions will be appreciated.

Thanks in advance,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: disable caches in real time

2010-05-17 Thread Marco Martinez
Any suggestions?

I have thought of having two configurations per server and reloading each one
with the appropriate config file, but I would prefer another solution if it's
possible.

Thanks,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/14 Marco Martinez 

> Hi,
>
> I want to know if there is any approach to disable caches in a specific
> core from a multicore server.
>
> My situation is the next:
>
> I have a multicore server where the core0 will be listen to the queries and
> other core (core1) that will be replicated from a master server. Once the
> replication has been done, i will swap the cores. My point is that i want to
> disable the caches in the core that is in charge of the replication to save
> memory in the machine.
>
> Any suggestions will be appreciated.
>
> Thanks in advance,
>
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>


Re: Targeting two fields with the same query or one field gathering contents from both ?

2010-05-17 Thread Marco Martinez
No, the equivalent for this will be:

- A: (the lazy fox) *OR* B: (the lazy fox)
- C: (the lazy fox)


Imagine the situation where B does not contain 'the lazy fox': with the AND
you get 0 results, although you have 'the lazy fox' in A and C.
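
For reference, the copyField setup being discussed would look something like
this in schema.xml (the field types are just an assumption):

<field name="A" type="text" indexed="true" stored="true"/>
<field name="B" type="text" indexed="true" stored="true"/>
<field name="C" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="A" dest="C"/>
<copyField source="B" dest="C"/>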

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/17 Xavier Schepler 

> Hey,
>
> let's say  I have :
>
> - a field named A with specific contents
>
> - a field named B with specific contents
>
> - a field named C witch contents only from A and B added with copyField.
>
> Are those queries equivalents in terms of performance :
>
> - A: (the lazy fox) AND B: (the lazy fox)
> - C: (the lazy fox)
>
> ??
>
> Thanks,
>
> Xavier
>
>
>
>


Re: Multifaceting on multivalued field

2010-05-18 Thread Marco Martinez
Hi,

This exception is fired when you don't have this field in your index, but
here it comes from an error in your query syntax: !{ex=cars}cars should be
{!ex=cars}cars, with the exclamation mark inside the brackets.
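
For completeness, the matching tag/exclusion pair would look something like
this (parameter values are illustrative):

q=*:*&fq={!tag=carsTag}cars:bmw&facet=true&facet.field={!ex=carsTag}cars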



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Peter Karich 

> Hi all,
>
> I read about multifaceting [1] and tried it for myself. With
> multifaceting I would like to conserve the number of documents for the
> 'un-facetted case'. This works nice with normal fields, but I get an
> exception [2] if I apply this on a multivalued field.
> Is this a bug or logical :-) ? If the latter one is the case, would
> anybody help me to understand this?
>
> Regards,
> Peter.
>
> [1]
>
> http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html
>
> [2]
> org.apache.solr.common.SolrException: undefined field !{ex=cars}cars
>at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
>at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
>at
>
> org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
>at
> org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
>at
>
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
>at
>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>at
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
>
>


Re: disable caches in real time

2010-05-19 Thread Marco Martinez
Hi Chris,

Thank you for your answer.

I've always understood that if you do a commit (and replication does one), a
new searcher is opened and you lose performance (queries per second) while
the caches are regenerated. I think I didn't explain my situation correctly
before: with my setup I want to avoid this loss of performance in an
environment with frequent updates.
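
For what it's worth, the usual way to soften that hit is cache autowarming in
solrconfig.xml, roughly like this (cache classes and sizes are only
illustrative):

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="64"/>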

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Chris Hostetter 

> : I want to know if there is any approach to disable caches in a specific
> core
> : from a multicore server.
>
> only via the config.
>
> : I have a multicore server where the core0 will be listen to the queries
> and
> : other core (core1) that will be replicated from a master server. Once the
> : replication has been done, i will swap the cores. My point is that i want
> to
> : disable the caches in the core that is in charge of the replication to
> save
> : memory in the machine.
>
> that seems bizarrely complicated -- replication can work against a "live"
> core, no need to do the swap yourself, the replicationHandler takes care
> of this for your transparently (ie: you have one core, replicating from a
> master -- the old index will be searched by users, and have caches, and
> when the new version of the index is ready, the replication handler will
> swap the *index* in that core (but the core itself never changes) ... it
> can even autowarm the caches on the new index for you before the swap if
> you configure it that way.
>
> -Hoss
>
>


Re: Storing RandomSortField

2010-05-19 Thread Marco Martinez
Hi Alexandre,

I am not totally sure about this, but the random sort field is only used to
do a random sort on your searches, and you have to pass different values to
get different sorts. This only applies at search time, so no value is
actually stored in the index. You will find more information here:
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
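
A typical setup (similar to the one in the Solr example schema) looks like
this:

<fieldType name="random" class="solr.RandomSortField" indexed="true"/>
<dynamicField name="random_*" type="random"/>

and then you sort with something like sort=random_1234 asc, changing the
number in the field name to get a different ordering.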

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Alexandre Rocco 

> Hi guys,
>
> Is there any way to make a RandomSortField be stored?
> I'm trying to do it for debugging purposes,
> My intention is to take a look at the values that are stored there to
> determine the sorting that is being applied to the results.
>
> I tried to make it a stored field as:
> 
>
> And also tried to create another text field, copying the result from the
> random field like this:
> 
> 
>
> Neither of the approaches worked.
> Is there any restriction on this kind of field that prevents it from being
> displayed in the results?
>
> Thanks,
> Alexandre
>


Re: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Marco Martinez
Maybe this will help you

http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/26 bbarani 

>
> Hi,
>
> Sorry if I am asking this question again in this forum..
>
> Is there any plugin which I can use to do a realtime indexing?
>
> I have a requirement where we have an application which sits on top of SQL
> server DB and updates happen on day to day basis. Users would like to see
> the changes made to the DB immediately in the search results. I am thinking
> of using JMS queue for achieving this, but before that I just want to check
> if anyone has implemented similar kind of requirement before?
>
> Any help / suggestions would be greatly appreciated.
>
> Thanks,
> bb
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Any-realtime-indexing-plugin-available-for-SOLR-tp845026p845026.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: solr.solr.home

2010-05-27 Thread Marco Martinez
Hi,

When you start Tomcat, you can specify the property; it will be something
like -Dsolr.solr.home=path/to/your/solr/home. For example, on Linux you can
export JAVA_OPTS="-Dsolr.solr.home=path/to/your/solr/home" before running
./startup.sh



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/27 Antonello Mangone 

> But where I have to write this command ???
>
> System.setProperty("solr.solr.home",
> > "whateverpathyou'dliketosetonyourfilesystem");
> >
> > Claudio
> >
>


Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Hi Scott,

We need more information about your request; can you post the query that you
are sending to the servers?

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang 

> Hi. All.
>   I am trying to use solr to search over 2 lucene indexes.  I am following
> the solr tutorial and test the distributed search example. It works.
>   Then I am using my own lucene indexes. Search in each solr instance works
> and return the expected result. But when I do distributed search using
> "shards". It only return the "numFound"=14. But the result contain nothing.
>Don't know why. Can Any one help? Thanks.
>


Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Try putting the rows parameter in your request; I guess that in your
solrconfig you have configured the default rows to 0 in your default request
handler.
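
That default usually lives in the request handler's defaults section of
solrconfig.xml, roughly like this (a sketch, not your actual config):

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>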

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang 

> Thanks for replying.
>
> Here is the part of my schema.xml:
> I only have 4 fields in my document.
>
> 
>
>required="true" />
>required="true"/>
>   
>   
>
>
>
>
>   
>   
>   
>   
>   
>   
>   
>   
>
>   
>   
>   
>   
>   
>   
>
>   
>
>   
>multiValued="true"/>
>
>   
>
>
>
>  
>
>  id
>
>
> I am running 2 instances as tutorial shows: one on 8983. Another one is on
> 7574.
> When I search on 8983:
> URL:
>
> http://localhost:8983/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
> I got:
>
> 
> -
> 
> 89
> product
> 
> -
> 
> 90
> product
> 
> ..
>
>
> when I search on 7574:
> URL:
>
> http://localhost:7574/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
> I got:
> 
> -
> 
> 89
> product
> 
> -
> 
> 90
> product
> 
> -
> 
> 91
> product
> 
> 
>
> As they are using 2 copies of same lucene indexes. the result is same.
> Then I use
> URL:
>
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=marship
> I got:
> 
> -
> 
> 0
> 31
> -
> 
> true
> marship
> localhost:8983/solr,localhost:7574/solr
> 
> 
> 
> 
>
> Note the numFound is 14.
> When I try URL:
>
> http://localhost:8983/solr/select?shards=localhost:8983/solr/&indent=true&q=marship
> The numFound="7" but still nothing returned.
>
> URL:
>
> http://localhost:8983/solr/select?shards=localhost:7574/solr/&indent=true&q=marship
> return numFound="7" too. And the result has nothing.
>
> Please help.
>
> Thanks.
> Regards.
> Scott
>
>
> On Mon, Jun 7, 2010 at 3:47 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi Scott,
> >
> > We need more information about your request, can you put the query that
> you
> > are doing to the servers.
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/6/7 Scott Zhang 
> >
> > > Hi. All.
> > >   I am trying to use solr to search over 2 lucene indexes.  I am
> > following
> > > the solr tutorial and test the distributed search example. It works.
> > >   Then I am using my own lucene indexes. Search in each solr instance
> > works
> > > and return the expected result. But when I do distributed search using
> > > "shards". It only return the "numFound"=14. But the result contain
> > nothing.
> > >Don't know why. Can Any one help? Thanks.
> > >
> >
>


Re: Distributed Search doesn't response the result set

2010-06-08 Thread Marco Martinez
Is there a way to let "ID" not be "indexed" in solr?


If I am not wrong, this is not possible if you want distributed searches,
because Solr internally uses the ids to get the pagination right in a
distributed search. That is, when you do a distributed search (e.g. across two
shards), two searches are fired in parallel and their results are merged to
get the correct sort; after that step, Solr fetches the documents (doing a
search by id) from the corresponding shards to retrieve the other fields of
the documents you have requested in the search.
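
So the uniqueKey field has to be indexed; in schema.xml that is the usual
declaration along these lines:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>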

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/8 Scott Zhang 

> Hi. Markus.
>
> Thanks for replying.
>
> I figured out the reason this afternoon. Sorry for not following up on this
> list. I posted it onto dev list because I think it is a BUG.
>
>
>
> 
> I finally know why it doesn't return the result.
>
> When I created the index, I set "id" field as "Stored" but not "indexed"
> because I don't see the reason to index the id.
> Then in schema.xml, I found I have to set "ID" as "indexed" but actually it
> is not.
>
> Not sure how solr is implemented internally. But without set id as
> "indexed", the distributed search doesn't work. I tried rebuild a test
> index
> with set ID as Indexed. Then let solr use that index and distributed search
> works.
>
> Is there a way to let "ID" not be "indexed" in solr?
>
> ===
>
>
>
>
> On Tue, Jun 8, 2010 at 7:38 PM,  wrote:
>
> > did  you send a commit after the last doc posted to solr?
> >
> > > -Ursprüngliche Nachricht-
> > > Von: Scott Zhang [mailto:macromars...@gmail.com]
> > > Gesendet: Dienstag, 8. Juni 2010 08:30
> > > An: solr-user@lucene.apache.org
> > > Betreff: Re: Distributed Search doesn't response the result set
> > >
> > > Hi. All.
> > > I am still testing. I think I am approaching the truth.
> > > Now confirmed:
> > > the doc in my existing lucene indexes, when search with
> > > distributed search,
> > > none of them are returned. But the docs inserted from solr
> > > post.jar are
> > > returned successfully.
> > > Don't know why. looks the lucene docs has some difference
> > > from solr's
> > > lucene.
> > > And my situation is, I already have 72 indexes folders
> > > which occupy lots
> > > of disk and repost them to solr will take very long time, so
> > > I have to stick
> > > with my existing index. Is there a solution for this?
> > >
> > > Thanks.
> > > Regards.
> > >
> > > On Tue, Jun 8, 2010 at 2:02 PM, Scott Zhang
> > >  wrote:
> > >
> > > > Hi. All.
> > > >   I tried with the default solr example plus my own
> > > config/schema file. I
> > > > post test document into solr manually. Then test the
> > > distributed search and
> > > > it works. Then I switch to my existing l*ucene index, and
> > > it d*oesn't
> > > > work.  So I am wondering is that the reason, when solr use
> > > lucene index,
> > > > then it can't be distributed searched?
> > > >
> > > >Welcome anyone help.
> > > >
> > > > Thanks.
> > > > Regards.
> > > > Scott
> > > >
> > > >
> > > > On Mon, Jun 7, 2010 at 4:48 PM, Scott Zhang
> > > wrote:
> > > >
> > > >> Is there a possibility caused by I am using my own lucene indexes.
> > > >> Not the one created by solr itself?
> > > >>
> > > >>
> > > >> Regards
> > > >> Scott
> > > >>
> > > >>
> > > >> On Mon, Jun 7, 2010 at 4:24 PM, Scott Zhang
> > > wrote:
> > > >>
> > > >>> Hi.
> > > >>> I tried URL:
> > > >>>
> > > http://localhost:8983/solr/select?shards=localhost:8983/solr,l
> > > ocalhost:7574/solr&indent=true&q=marship&rows=10
> > > >>>  Got:
> > > >>> 
> > > >>> -
> > > >>> 
> > > >>> 0
> > > >>> 16
> > > >>&g

Re: solr usage reporting

2018-01-25 Thread Marco Reis
One way is to collect the logs from your server and then use another tool to
generate your report.


On Thu, Jan 25, 2018 at 2:59 PM Becky Bonner  wrote:

> Hi all,
> We are in the process of replacing our Google Search Appliance with SOLR
> 7.1 and are needing one last piece of our requirements.  We provide a
> monthly report to our business that shows the top 1000 query terms
> requested during the date range as well as the query terms requested that
> contained no results.  Is there a way to log the requests and later query
> solr for these results? Or is there a plugin to add this functionality?
>
> Your help appreciated.
> Bcubed
>
>
> --
Marco Reis
Software Engineer
http://marcoreis.net
https://github.com/masreis
+55 61 9 81194620


Solr endpoint on the public internet

2020-10-08 Thread Marco Aurélio
Hi!

We're looking into the option of setting up search with Solr without an
intermediary application. This would mean our backend would index data into
Solr and we would have a public Solr endpoint on the internet that would
receive search requests directly.

Since I couldn't find an existing solution similar to ours, I would like to
know whether it's possible to secure Solr in a way that allows anyone
read-only access to collections, and how to achieve that. Specifically
because of this part of the documentation
<https://lucene.apache.org/solr/guide/8_5/securing-solr.html>:

*No Solr API, including the Admin UI, is designed to be exposed to
non-trusted parties. Tune your firewall so that only trusted computers and
people are allowed access. Because of this, the project will not regard
e.g., Admin UI XSS issues as security vulnerabilities. However, we still
ask you to report such issues in JIRA.*
Is there a way we can restrict read-only access to Solr collections so as
to allow users to make search requests directly to it or should we always
keep our Solr instances completely private?

Thanks in advance!

Best regards,
Marco Godinho


Searching a nested structure. Can not retrieve parents with all corresponding childs

2019-12-04 Thread Marco Ibscher
Hi there,

I have problems retrieving data in the nested structure in which it is indexed 
in Solr 8.2:

I have a product database with products as the parent element and size/color 
combinations as the child elements. The data is imported with the data import 
handler:
























I can get all children for a certain parent or all parents for a certain 
child, using the Block Join Query Parser (so the nested structure is working). 
But I cannot retrieve parents together with their corresponding children.

I tried the following query:

q={!parent which="id:1"}&fl=*,[child]&rows=200
It returns the parent document but not the corresponding child documents. I 
don't get any error message. I also checked the log file.

I also tried adding a childFilter or a parentFilter:

q={!parent which=doc_type:parent}&fl=id,[child parentFilter=doc_type:parent 
childFilter=doc_type:child]&rows=200

Using the parentFilter ends with the error message "Parent filter should not be 
sent when the schema is nested". The childFilter does not change the result 
(all parents, no children).

Important schema fields:

  

  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  

Can anyone help? I also posted this problem on stackoverflow: 
https://stackoverflow.com/questions/59162038/searching-a-nested-structure-can-not-retirve-parents-with-all-corresponding-chi

Thank you.

Marco


Aw: Searching a nested structure. Can not retrieve parents with all corresponding childs

2019-12-09 Thread Marco Ibscher
Hi there,
 
on stackoverflow I got the advice to delete the _nest_path_ field. Without it I 
can use the parent filter without getting the "Parent filter should not be sent 
when the schema is nested" error message. For example:
 
q={!parent which=doc_type:parent}&fl=id,[child parentFilter=doc_type:parent 
childFilter=doc_type:child]&rows=200
 

I am still confused why it does not work with the _nest_path_ field and would 
be thankful for any advice, but right now I can work with this "solution".
 
Best regards
Marco

Sent: Wednesday, 04 December 2019, 16:42
From: "Marco Ibscher" 
To: solr-user@lucene.apache.org
Subject: Searching a nested structure. Can not retrieve parents with all 
corresponding childs
Hi there,

I have problems retrieving data in the nested structure in that it is indexed 
in solr 8.2:

I have a product database with products as the parent element and size/color 
combinations as the child elements. The data is imported with the data import 
handler:
























I can get all childs for a certain parent or all parents for a certain child, 
using the Block Join Query Parser (so the nested structure is working). But I 
cannot retrive parents with the corresponding childs.

I tried the following query:

q={!parent which="id:1"}&fl=*,[child]&rows=200
It returns the parent document but not the corresponding child documents. I 
dont't get any error message. I also checked the log file.

I also tried adding a childFilter or a parentFilter:

q={!parent which=doc_type:parent}&fl=id,[child parentFilter=doc_type:parent 
childFilter=doc_type:child]&rows=200

Using the parentFilter ends with the error message "Parent filter should not be
sent when the schema is nested". The childFilter does not change the result
(all parents, no children).

Important schema fields:




















Can anyone help? I also posted this problem on stackoverflow: 
https://stackoverflow.com/questions/59162038/searching-a-nested-structure-can-not-retirve-parents-with-all-corresponding-chi

Thank you.

Marco


Re: regarding Extracting text from Images

2020-01-17 Thread Marco Reis
Are you intending to use this solution in production? If so, combining Tika
and Tesseract on the same server may not be a good choice.
Tika and Tesseract are heavy processing consumers and can harm the main service
of the solution, in your case the Solr service.
I had the same situation here, and the Tika/Tesseract combination on the
production server did not scale, since I have many text documents and
images.
An alternative is to use one microservice for text preprocessing and another
one for OCR. You can take some ideas from https://github.com/tleyden/open-ocr.
I have a separate Kubernetes cluster just for this, to extract and OCR
text from binary documents. Now I can scale to a world-class solution.
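
To illustrate the general idea (run OCR outside Solr, then index only the
extracted text), here is a rough, untested sketch. It is not the
open-ocr/Kubernetes pipeline I actually run; Tess4J and SolrJ are used here
purely for illustration, and the collection and field names ("mycollection",
"content_txt") are placeholders:

// Rough sketch: OCR an image outside Solr, then index only the extracted text.
// Assumes Tess4J and SolrJ on the classpath.
import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class OcrIndexer {
  public static void main(String[] args) throws Exception {
    // Run the OCR step in its own process/service, not inside Solr.
    Tesseract tesseract = new Tesseract();
    tesseract.setDatapath("/usr/share/tesseract/4/tessdata");
    String text = tesseract.doOCR(new File(args[0]));

    // Send only the extracted plain text to Solr, so Solr never parses the image.
    try (HttpSolrClient solr = new HttpSolrClient.Builder(
        "http://localhost:8983/solr/mycollection").build()) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", args[0]);
      doc.addField("content_txt", text);
      solr.add(doc);
      solr.commit();
    }
  }
}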

Marco Reis
Software Engineer
http://marcoreis.net
+55 61 981194620



On Fri, 17 Jan 2020 at 07:17, Jörn Franke  wrote:

> Have you checked this?
>
> https://cwiki.apache.org/confluence/display/TIKA/TikaOCR
>
> > Am 17.01.2020 um 10:54 schrieb Retro :
> >
> > Hello, can you please advise me how to configure Solr so that the embedded
> > Tika is able to use Tesseract to do the OCR of images? I have installed the
> > following software -
> > SOLR  - 7.4.0
> > Tesseract - 4.1.1-rc2-20-g01fb
> > TIKA   - TIKA 1.18
> > Tesseract is installed into the following directory:
> > /usr/share/tesseract/4/tessdata/
> > echo $TESSDATA_PREFIX - > /usr/share/tesseract/4/tessdata/
> > tesseract -v
> > tesseract 4.1.1-rc2-20-g01fb
> > leptonica-1.76.0
> >
> > The command “tesseract test.jpg test.txt” produces an accurate txt file with
> > OCRed content from test.jpg.
> > The current setup allows us to index attachments such as structured text files
> > (txt, word, pdf, etc.), but it does not react in any way to attachments like
> > png or jpg. Nor does it work if they are uploaded directly to Solr using its
> > web interface.
> >
> > Necessary modifications were made to the following files:
> > solrconfig.xml; TesseractOCRConfig.properties; parsecontent.xml;
> > PDFparser.properties.
> >
> > Would appreciate if someone helped me with this configuration.
> >
> >
> >
> > --
> > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Solr Windows service: Access is denied

2015-04-23 Thread Marco De Rossi
 Hi all,
 
my installation of Apache Solr on Windows Server 2008 R2 SP1 always stops with
the error "Access is denied", "Incorrect Function", or "The data area passed to a
system call is too small".
Does anyone know how to fix this issue?
 
Thx
 
Marco
 
  

Re: HBase Datasource

2011-12-02 Thread Gian Marco Tagliani
Hi,
In my company we have the same need: importing from HBase into Solr.
We just started a project here:

http://code.google.com/p/hbase-solr-dataimport/

We want to provide an easy way to import data from HBase, similar to the
SqlEntityProcessor.
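
To give an idea of the direction, this is the kind of data-config.xml we are
aiming for. Nothing here exists yet: HBaseDataSource and HBaseEntityProcessor
are purely hypothetical names for what the project plans to provide, and the
table and column names are invented:

<dataConfig>
  <!-- hypothetical: a DIH data source that knows how to connect to HBase -->
  <dataSource type="HBaseDataSource" name="hbase" zookeeperHost="localhost:2181"/>
  <document>
    <!-- hypothetical: an entity processor that scans an HBase table,
         analogous to what SqlEntityProcessor does for JDBC -->
    <entity name="product" processor="HBaseEntityProcessor"
            dataSource="hbase" table="products" columnFamily="info">
      <field column="name"  name="name"/>
      <field column="price" name="price"/>
    </entity>
  </document>
</dataConfig>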


Gian Marco



On Fri, Nov 11, 2011 at 4:37 AM, Fuad Efendi  wrote:

> I am using Lily for atomic index updates (implemented very nicely:
> transactionally, plus MapReduce, plus auto-denormalizing)
>
> http://www.lilyproject.org
>
> It slows down the "mean time" 7-10 times, but TPS stays the same
>
>
>
> - Fuad
> http://www.tokenizer.ca
>
>
>
> Sent from my iPad
>
> On 2011-11-10, at 9:59 PM, Mark  wrote:
>
> > Has anyone had any success/experience with building a HBase datasource
> for DIH? Are there any solutions available on the web?
> >
> > Thanks.
>


docBoost with "fq" search

2012-03-07 Thread Gian Marco Tagliani

Hi All,
I'm seeing strange behavior with my Solr (version 3.4).

For searching I'm using the "q" and the "fq" params.
At index-time I'm adding a docBoost to each document.

When I perform a search with both "q" and "fq" params everything works.
For a search with "q=*:*" and something in the "fq", it seems to me
that the docBoost is not taken into consideration.


Is that possible?

Thanks


Re: docBoost with "fq" search

2012-03-09 Thread Gian Marco Tagliani
Hi Ahmet,
thanks for the answer.

I'm really surprised, because I have always thought of docBoost as a kind of
sorting tool.
And I used it that way: I give a big boost to the documents I want to come back
first in the search results.



Do you think there is a trick to force the usage of docBoost in my special
case?
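
For instance, would moving the restriction from "fq" into "q" be enough to bring
the index-time boost back into play? Something along these lines (category:shoes
is just a made-up clause standing in for my real filter, and I guess the field
would also need omitNorms="false" for the boost to survive):

Current query (fq does not contribute to the score, so every doc gets 1.0):

  q=*:*&fq=category:shoes&fl=*,score&debugQuery=on

Possible alternative (the clause is part of q, so scores are computed per
document and the index-time boost baked into the norms should matter again):

  q=category:shoes&fl=*,score&debugQuery=on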


Gian Marco


On Wed, Mar 7, 2012 at 2:51 PM, Ahmet Arslan  wrote:

>
>
> --- On Wed, 3/7/12, Gian Marco Tagliani  wrote:
>
> > From: Gian Marco Tagliani 
> > Subject: docBoost with "fq" search
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, March 7, 2012, 3:11 PM
> > Hi All,
> > I'm seeing strange behavior with my Solr (version 3.4).
> >
> > For searching I'm using the "q" and the "fq" params.
> > At index-time I'm adding a docBoost to each document.
> >
> > When I perform a search with both "q" and "fq" params
> > everything works.
> > For the search with "q=*:*" and something in the "fq", it
> > seems to me that the docBoost is not taken into
> > consideration.
> >
> > Is that possible?
>
> Yes possible.
>
> FilterQuery (fq) does not contribute to score. It is not used in score
> calculation.
>
> MatchAllDocsQuery (*:*) is a fast way to return all docs. Adding
> &fl=score&debugQuery=on will show that all docs will get constant score of
> 1.0.
>


DIH problem

2012-09-21 Thread Gian Marco Tagliani

Hi,
I'm updating my Solr from version 3.4 to version 3.6.1 and I'm facing a 
little problem with the DIH.


In the delta-import I'm using the parentDeltaQuery feature of the DIH
to update the parent entity.

I don't think this is working properly.

I realized that it's just executing the parentDeltaQuery with the
first record of the deltaQuery result.
Comparing the code with the previous versions I noticed that the 
rowIterator was never set to null.


To solve this I wrote a simple patch:

-
Index: solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
===================================================================
--- solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (revision 31454)
+++ solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (working copy)
@@ -121,6 +121,7 @@
       if (rowIterator.hasNext())
         return rowIterator.next();
       query = null;
+      rowIterator = null;
       return null;
     } catch (Exception e) {
       SolrException.log(log, "getNext() failed for query '" + query + "'", e);
-


Do you think this is correct?

Thanks for your help

--
Gian Marco Tagliani





Re: DIH problem

2012-09-25 Thread Gian Marco Tagliani
Ok,
I'll try to verify whether the same issue exists in 4.0, and I'll open an
issue in Jira.

thanks

--
Gian Marco



On Sat, Sep 22, 2012 at 9:34 PM, Dyer, James
wrote:

> Gian,
>
> Even if you can't write a failing unit test (if you did it would be
> awesome), please open a JIRA issue on this and attach your patch.  Also,
> you may want to try 4.0 as opposed to 3.6 as some of the 3.6 issues with
> DIH are resolved in 4.0.
>
> https://issues.apache.org/jira/secure/Dashboard.jspa
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -Original Message-
> From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
> Sent: Friday, September 21, 2012 12:03 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DIH problem
>
> Gian,
>
> The only way to handle it is to provide a test case and attach to jira.
>
> Thanks
>
> On Fri, Sep 21, 2012 at 6:03 PM, Gian Marco Tagliani
> wrote:
>
> > Hi,
> > I'm updating my Solr from version 3.4 to version 3.6.1 and I'm facing a
> > little problem with the DIH.
> >
> > In the delta-import I'm using the /parentDeltaQuery/ feature of the DIH
> to
> > update the parent entity.
> > I don't think this is working properly.
> >
> > I realized that it's just executing the /parentDeltaQuery/ with the first
> > record of the /deltaQuery /result.
> > Comparing the code with the previous versions I noticed that the
> > rowIterator was never set to null.
> >
> > To solve this I wrote a simple patch:
> >
> > -
> > Index: solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
> > ===================================================================
> > --- solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (revision 31454)
> > +++ solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (working copy)
> > @@ -121,6 +121,7 @@
> >  if (rowIterator.hasNext())
> >return rowIterator.next();
> >  query = null;
> > +    rowIterator = null;
> >  return null;
> >} catch (Exception e) {
> >  SolrException.log(log, "getNext() failed for query '" + query +
> > "'", e);
> > -
> >
> >
> > Do you think this is correct?
> >
> > Thanks for your help
> >
> > --
> > Gian Marco Tagliani
> >
> >
> >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Tech Lead
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  
>
>


Re: DIH problem

2012-09-26 Thread Gian Marco Tagliani
Here is the issue:

https://issues.apache.org/jira/browse/SOLR-3896




On Tue, Sep 25, 2012 at 1:41 PM, Gian Marco Tagliani
wrote:

> Ok,
> I'll try to verify if there is the same issue in the 4.0 and I'll open the
> issue in Jira.
>
> thanks
>
> --
> Gian Marco
>
>
>
>
> On Sat, Sep 22, 2012 at 9:34 PM, Dyer, James  > wrote:
>
>> Gian,
>>
>> Even if you can't write a failing unit test (if you did it would be
>> awesome), please open a JIRA issue on this and attach your patch.  Also,
>> you may want to try 4.0 as opposed to 3.6 as some of the 3.6 issues with
>> DIH are resolved in 4.0.
>>
>> https://issues.apache.org/jira/secure/Dashboard.jspa
>>
>> James Dyer
>> E-Commerce Systems
>> Ingram Content Group
>> (615) 213-4311
>>
>>
>> -Original Message-
>> From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
>> Sent: Friday, September 21, 2012 12:03 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: DIH problem
>>
>> Gian,
>>
>> The only way to handle it is to provide a test case and attach to jira.
>>
>> Thanks
>>
>> On Fri, Sep 21, 2012 at 6:03 PM, Gian Marco Tagliani
>> wrote:
>>
>> > Hi,
>> > I'm updating my Solr from version 3.4 to version 3.6.1 and I'm facing a
>> > little problem with the DIH.
>> >
>> > In the delta-import I'm using the /parentDeltaQuery/ feature of the DIH
>> to
>> > update the parent entity.
>> > I don't think this is working properly.
>> >
>> > I realized that it's just executing the /parentDeltaQuery/ with the
>> first
>> > record of the /deltaQuery /result.
>> > Comparing the code with the previous versions I noticed that the
>> > rowIterator was never set to null.
>> >
>> > To solve this I wrote a simple patch:
>> >
>> > -
>> > Index: solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
>> > ===================================================================
>> > --- solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (revision 31454)
>> > +++ solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java (working copy)
>> > @@ -121,6 +121,7 @@
>> >  if (rowIterator.hasNext())
>> >return rowIterator.next();
>> >  query = null;
>> > +rowIterator = null;
>> >  return null;
>> >} catch (Exception e) {
>> >  SolrException.log(log, "getNext() failed for query '" + query +
>> > "'", e);
>> > -
>> >
>> >
>> > Do you think this is correct?
>> >
>> > Thanks for your help
>> >
>> > --
>> > Gian Marco Tagliani
>> >
>> >
>> >
>> >
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Tech Lead
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  
>>
>>
>


My problem with T-shirts and nested documents

2019-05-24 Thread Gian Marco Tagliani
Hi all,
I'm facing a problem with Nested Documents.

To illustrate my problem I'll use an example with T-shirts in stock.
For every model of a T-shirt we can have different colors and sizes, and for
each combination we have the number of items in stock.

In Solr, for every model we have a document, and for every combination of color
and size we have a nested child document.


model A
- color : red, size M, quantity 8
- color : blue, size L, quantity 4
- color : white, size M, quantity 1

model B
- color yellow, size S, quantity 7
- color yellow, size M, quantity 3

model C
- color red, size M, quantity 5
- color black, size L, quantity 6


I'm interested in size M only, and I want to know our stock ordered by
quantity.

model A, color red, quantity 8
model C, color red, quantity 5
model B, color yellow, quantity 3
model A, color white, quantity 1



My first idea was to use the JSON Nested Facet (
https://lucene.apache.org/solr/guide/json-facet-api.html#nested-facet-example
)
In that case I'm not able to sort by quantity, nor to discriminate between the
"color red" and "color white" lines for model A.

My second idea was to use the Analytics Component (
https://lucene.apache.org/solr/guide/analytics.html)
In this case I'm not able to get data from the parent and the child documents to
build a facet.

Has any of you encountered a similar problem? Do you have any idea on how
to address my case?


Thanks in advance
Gian Marco Tagliani


Re: My problem with T-shirts and nested documents

2019-05-24 Thread Gian Marco Tagliani
Hi Mikhail,
thanks for the quick response!

I'll try it following your suggestion.
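
Roughly, the query I have in mind based on your pointer is something like the
following (untested; doc_type, size, quantity and model_id are made-up field
names, and it assumes each child document stores its parent's id in model_id,
since as you say there is no [parent] transformer):

q=doc_type:sku AND size:M
&sort=quantity desc
&fl=color,size,quantity,model:[subquery]
&model.q={!terms f=id v=$row.model_id}
&model.fl=id,name
&model.rows=1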

Many thanks :)

On Fri, May 24, 2019 at 3:12 PM Mikhail Khludnev  wrote:

> It's possible to search for stocks filtering by M, sorting by qty, and then
> parent docs need to be added to result list.
> Unfortunately, there is no [parent] result transformer. Thus it can be done
> with generic
>
> https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
> .
>
>
> On Fri, May 24, 2019 at 3:16 PM Gian Marco Tagliani  >
> wrote:
>
> > Hi all,
> > I'm facing a problem with Nested Documents.
> >
> > To illustrate my problem I'll use the example with T-shirts in stock.
> > For every model of a T-shirt, we can have different colors and sizes, for
> > each combination we have the number of items in stock.
> >
> > In Solr, for every model we have a document, for every combination of
> color
> > and size we have a nested child document.
> >
> >
> > model A
> > - color : red, size M, quantity 8
> > - color : blue, size L, quantity 4
> > - color : white, size M, quantity 1
> >
> > model B
> > - color yellow, size S, quantity 7
> > - color yellow, size M, quantity 3
> >
> > model C
> > - color red, size M, quantity 5
> > - color black, size L, quantity 6
> >
> >
> > I'm interested in size M only, and I want to know our stock ordered by
> > quantity.
> >
> > model A, color red, quantity 8
> > model C, color red, quantity 5
> > model B, color yellow, quantity 3
> > model A, color white, quantity 1
> >
> >
> >
> > My first idea was using the Json Nested Facet (
> >
> >
> https://lucene.apache.org/solr/guide/json-facet-api.html#nested-facet-example
> > )
> > In that case I'm not able to sort by quantity nor discriminate between
> the
> > "color red" and "color white" lines for model A.
> >
> > My second idea was to use the Analytics Component (
> > https://lucene.apache.org/solr/guide/analytics.html)
> > In this case I'm not able to get data from father and child document to
> > build a facet.
> >
> > Has any of you encountered a similar problem? Do you have any idea on how
> > to address my case?
> >
> >
> > Thanks in advance
> > Gian Marco Tagliani
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: My problem with T-shirts and nested documents

2019-05-27 Thread Gian Marco Tagliani
Hi Walter,
It was just an example; I thought it would be simpler to explain than my real
problem.

thanks,
GM

On Fri, May 24, 2019 at 4:47 PM Walter Underwood 
wrote:

> If you are really keeping inventory, use a relational database. Solr is a
> really poor choice for this kind of application.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On May 24, 2019, at 6:12 AM, Mikhail Khludnev  wrote:
> >
> > It's possible to search for stocks filtering by M, sorting by qty, and
> then
> > parent docs need to be added to result list.
> > Unfortunately, there is no [parent] result transformer. Thus it can be
> done
> > with generic
> >
> https://lucene.apache.org/solr/guide/6_6/transforming-result-documents.html#TransformingResultDocuments-_subquery_
> > .
> >
> >
> > On Fri, May 24, 2019 at 3:16 PM Gian Marco Tagliani <
> gm.tagli...@gmail.com>
> > wrote:
> >
> >> Hi all,
> >> I'm facing a problem with Nested Documents.
> >>
> >> To illustrate my problem I'll use the example with T-shirts in stock.
> >> For every model of a T-shirt, we can have different colors and sizes,
> for
> >> each combination we have the number of items in stock.
> >>
> >> In Solr, for every model we have a document, for every combination of
> color
> >> and size we have a nested child document.
> >>
> >>
> >> model A
> >>- color : red, size M, quantity 8
> >>- color : blue, size L, quantity 4
> >>- color : white, size M, quantity 1
> >>
> >> model B
> >>- color yellow, size S, quantity 7
> >>- color yellow, size M, quantity 3
> >>
> >> model C
> >>- color red, size M, quantity 5
> >>- color black, size L, quantity 6
> >>
> >>
> >> I'm interested in size M only, and I want to know our stock ordered by
> >> quantity.
> >>
> >> model A, color red, quantity 8
> >> model C, color red, quantity 5
> >> model B, color yellow, quantity 3
> >> model A, color white, quantity 1
> >>
> >>
> >>
> >> My first idea was using the Json Nested Facet (
> >>
> >>
> https://lucene.apache.org/solr/guide/json-facet-api.html#nested-facet-example
> >> )
> >> In that case I'm not able to sort by quantity nor discriminate between
> the
> >> "color red" and "color white" lines for model A.
> >>
> >> My second idea was to use the Analytics Component (
> >> https://lucene.apache.org/solr/guide/analytics.html)
> >> In this case I'm not able to get data from father and child document to
> >> build a facet.
> >>
> >> Has any of you encountered a similar problem? Do you have any idea on
> how
> >> to address my case?
> >>
> >>
> >> Thanks in advance
> >> Gian Marco Tagliani
> >>
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>
>