Re: JSON from Term Vectors Component

2020-02-06 Thread Doug Turnbull
> > > > > > > The issue is json.nl produces non-standard json with duplicate keys. > > > Solr > > > > generates the following, which json lint fails given multiple keys > > > > > > > > { > > > > "positions": { > > >

Re: JSON from Term Vectors Component

2020-02-06 Thread Edward Ribeiro
> > > > > On Thu, Feb 6, 2020 at 11:36 AM Munendra S N > > > wrote: > > > > > >>> > > >>> Notice the lists, within lists, within lists. Where the keys are > > adjacent > > >>> items in the list. Is there a reason this i

Re: JSON from Term Vectors Component

2020-02-06 Thread Walter Underwood
>>> Notice the lists, within lists, within lists. Where the keys are >> adjacent >>>>> items in the list. Is there a reason this isn't a JSON dictionary? >>>>> >>>> I think this is because of NamedList. Have you tried using json.nl=map >&

Re: JSON from Term Vectors Component

2020-02-06 Thread Doug Turnbull
>> items in the list. Is there a reason this isn't a JSON dictionary? > >>> > >> I think this is because of NamedList. Have you tried using json.nl=map > as > >> a > >> query parameter for this case? > >> > >> Regards, > >>

Re: JSON from Term Vectors Component

2020-02-06 Thread Walter Underwood
ON dictionary? >>> >> I think this is because of NamedList. Have you tried using json.nl=map as >> a >> query parameter for this case? >> >> Regards, >> Munendra S N >> >> >> >> On Thu, Feb 6, 2020 at 10:01 PM Doug Turnbull <

Re: JSON from Term Vectors Component

2020-02-06 Thread Doug Turnbull
> a > query parameter for this case? > > Regards, > Munendra S N > > > > On Thu, Feb 6, 2020 at 10:01 PM Doug Turnbull < > dturnb...@opensourceconnections.com> wrote: > > > Hi all, > > > > I was curious if anyone had any tips on parsing the JSO

Re: JSON from Term Vectors Component

2020-02-06 Thread Munendra S N
On Thu, Feb 6, 2020 at 10:01 PM Doug Turnbull < dturnb...@opensourceconnections.com> wrote: > Hi all, > > I was curious if anyone had any tips on parsing the JSON response of the > term vectors component? Or anyway to force it to be more standard JSON? It > appears to be very

JSON from Term Vectors Component

2020-02-06 Thread Doug Turnbull
Hi all, I was curious if anyone had any tips on parsing the JSON response of the term vectors component? Or anyway to force it to be more standard JSON? It appears to be very heavily nested and idiosyncratic JSON, such as below. Notice the lists, within lists, within lists. Where the keys are
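
One client-side way to handle the "keys are adjacent items in the list" shape described above is to fold each flat list into a dictionary after parsing. What follows is a minimal Python sketch, not the component's documented behaviour: the helper name `pairs_to_dict` and the sample payload are made up for illustration, and the "even length with string keys" test is a heuristic that could misread an ordinary list of strings. Repeated keys are collected into a list rather than overwritten.

```python
import json


def pairs_to_dict(node):
    """Recursively fold [k1, v1, k2, v2, ...] lists into dicts."""
    if not isinstance(node, list):
        return node  # scalars and dicts pass through unchanged

    keys = node[0::2]
    # Heuristic (an assumption): a non-empty, even-length list whose
    # even-indexed items are all strings is treated as key/value pairs.
    looks_like_pairs = (
        bool(node)
        and len(node) % 2 == 0
        and all(isinstance(k, str) for k in keys)
    )
    if not looks_like_pairs:
        return [pairs_to_dict(item) for item in node]

    result = {}
    repeated = set()
    for key, value in zip(keys, node[1::2]):
        value = pairs_to_dict(value)
        if key in repeated:
            result[key].append(value)
        elif key in result:
            # Second occurrence of a key: switch to a list of values.
            result[key] = [result[key], value]
            repeated.add(key)
        else:
            result[key] = value
    return result


# Hypothetical payload in the nested-list style, not real Solr output.
raw = '["doc1", ["body", ["solr", ["tf", 2, "positions", ["position", 3, "position", 7]]]]]'
converted = pairs_to_dict(json.loads(raw))
print(converted)
```

Collecting repeated keys into a list sidesteps the duplicate-key problem raised elsewhere in this thread, at the cost of values that are sometimes scalars and sometimes lists.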

Re: Solr is very slow with term vectors

2019-08-16 Thread Walter Underwood
>> wun...@wunderwood.org> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> tf.idf was invented because cosine similarity is too much >>>> computation. >>>>>>>>> tf.idf gives similar results mu

Re: Solr is very slow with term vectors

2019-08-16 Thread Jan Høydahl
tance. >>>>>>>> >>>>>>>> I would expect cosine similarity to be slow. I would also expect >>>>>>>> retrieving 1 million records to be slow. Doing both of those in one >>>>> minute >>>>>>

Re: Solr is very slow with term vectors

2019-08-16 Thread Jörn Franke
than cosine distance. >>>>>>> >>>>>>> I would expect cosine similarity to be slow. I would also expect >>>>>>> retrieving 1 million records to be slow. Doing both of those in one >>>> minute >>>>>>> is

Re: Solr is very slow with term vectors

2019-08-16 Thread Vignan Malyala
o be slow. I would also expect > >>>>> retrieving 1 million records to be slow. Doing both of those in one > >> minute > >>>>> is pretty good. > >>>>> > >>>>> As Kernighan and Plauger said in 1978, "Don’t diddle

Re: Solr is very slow with term vectors

2019-08-16 Thread Jörn Franke
>> retrieving 1 million records to be slow. Doing both of those in one >> minute >>>>> is pretty good. >>>>> >>>>> As Kernighan and Plauger said in 1978, "Don’t diddle code to make it >>>>> faster—find a better algorithm.” >>

Re: Solr is very slow with term vectors

2019-08-16 Thread Vignan Malyala
> > >>> https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style > >>> > >>> wunder > >>> Walter Underwood > >>> wun...@wunderwood.org > >>> http://observer.wunderwood.org/ (my blog) > >>> > >>>>

Re: Solr is very slow with term vectors

2019-08-15 Thread Jörn Franke
Walter Underwood >>> wun...@wunderwood.org >>> http://observer.wunderwood.org/ (my blog) >>> >>>> On Aug 11, 2019, at 10:40 AM, Doug Turnbull < >>> dturnb...@opensourceconnections.com> wrote: >>>> >>>> Hi Vignan, >>

Re: Solr is very slow with term vectors

2019-08-15 Thread Vignan Malyala
> dturnb...@opensourceconnections.com> wrote: >> > >> > Hi Vignan, >> > >> > We need to see more details / code of what your query parser plugin does >> > exactly with term vectors, we can't really help you without more >> details. >>

Re: Solr is very slow with term vectors

2019-08-12 Thread Vignan Malyala
> > > On Aug 11, 2019, at 10:40 AM, Doug Turnbull < > dturnb...@opensourceconnections.com> wrote: > > > > Hi Vignan, > > > > We need to see more details / code of what your query parser plugin does > > exactly with term vectors, we can't really help you without

Re: Solr is very slow with term vectors

2019-08-11 Thread Walter Underwood
AM, Doug Turnbull > wrote: > > Hi Vignan, > > We need to see more details / code of what your query parser plugin does > exactly with term vectors, we can't really help you without more details. > Is it open source? Can you share a minimal example that recreates the >

Re: Solr is very slow with term vectors

2019-08-11 Thread Doug Turnbull
Hi Vignan, We need to see more details / code of what your query parser plugin does exactly with term vectors, we can't really help you without more details. Is it open source? Can you share a minimal example that recreates the problem? On Sun, Aug 11, 2019 at 1:19 PM Vignan Malyala

Solr is very slow with term vectors

2019-08-11 Thread Vignan Malyala
Hi guys, I made my custom qparser plugin in Solr for scoring. The plugin only does cosine similarity of vectors for each record. I use term vectors here. Results are fine! BUT, Solr response is very slow with term vectors. It takes around 55 seconds for each request for 100 records. How do I
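
For readers unfamiliar with the per-record computation described above, here is a minimal Python sketch of cosine similarity over sparse term-weight vectors. It is not the poster's plugin (whose code is not shown in the thread); the dictionary representation and the function name are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with an empty vector as 0
    return dot / (norm_a * norm_b)


query_vec = {"solr": 1.0, "lucene": 1.0}
doc_vec = {"solr": 1.0}
score = cosine_similarity(query_vec, doc_vec)
print(score)
```

Each call walks every term of both vectors, so the cost grows with the number of records scored times the size of their term vectors, which is one plausible reason scoring this way is slow.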

Solr is very slow with term vectors

2019-08-11 Thread Vignan Malyala
Hi, I made my custom qparser plugin in Solr for scoring. The plugin only does cosine similarity of vectors for each record. Results are fine! *BUT, Solr response is very slow. It takes around 55 seconds for each request.* *How do I make it faster to get my results in ms ?* *Please respond soon as

Re: Delete term vectors from existing index

2015-10-05 Thread Alessandro Benedetti
What makes you believe there is a good way to remove the term vectors without re-indexing? It does make sense that a simple optimise did not do the job. It is what I would expect. I agree with you that term vectors are a separate data structure in the index, but I doubt there is a way to

Delete term vectors from existing index

2015-10-04 Thread Norgorn
I'm looking for a way to delete term vectors from an existing index. The schema was changed to 'termVectors="false"' and optimization was performed after that, but the index size remains the same (I'm totally sure that optimization was successful). I've also tried to add s

problems retrieving term vectors using RealTimeGetHandler

2015-02-25 Thread Scott C. Cote
I’m working with term vectors via solr. Is there a way to configure the RealTimeGetHandler to return tv info? Here is my environment info: Scotts-MacBook-Air-2:solr_jetty scottccote$ java -version java version "1.8.0_31" Java(TM) SE Runtime Environment (build 1.8.0_31-b13) Java HotS

Re: more like this and term vectors

2015-02-23 Thread Jack Krupansky
It's never helpful when you merely say that it "did not work" - detail the symptom, please. Post both the query and the response. As well as the field and type definitions for the fields for which you expected term vectors - no term vectors are enabled by default. -- Jack Krupans

more like this and term vectors

2015-02-23 Thread Scott C. Cote
Is there a way to configure the more like this query handler and also receive the corresponding term vectors? (tf-idf) ? I tried by creating a “search component” for the term vectors and adding it to the mlt handler, but that did not work. Here is what I tried: filteredText

SolrCloud Term Vectors NullPointerException

2013-11-28 Thread Stanislav Sandalnikov
Hello everyone, I have problems with setting up SolrCloud to work with term vectors, failing with error: java.lang.NullPointerException at org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437) at

Re: Retrieving Term vectors

2013-03-20 Thread Sarita Nair
From: Koji Sekiguchi To: solr-user@lucene.apache.org Sent: Tuesday, March 19, 2013 7:02 PM Subject: Re: Retrieving Term vectors Hi Sarita, I've not dug into your code detail but my first impression is that you are missing store term positions? > FieldType fieldTy

Re: Retrieving Term vectors

2013-03-19 Thread Koji Sekiguchi
Hi Sarita, I've not dug into your code in detail but my first impression is that you are missing store term positions? > FieldType fieldType = new FieldType(); > IndexOptions indexOptions = IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS; > fieldType.setIndexOptions(indexOptions); > fieldTyp

Retrieving Term vectors

2013-03-19 Thread Sarita Nair
have stored term vectors. Any ideas on what is it that I am doing incorrectly, will be greatly appreciated. public class LuceneUtilTest { private final RAMDirectory ramDirectory = new RAMDirectory(); private static final String TERM_POSITION_PROVIDER = "term position provider"

Re: why search time increases without term vectors?

2013-01-29 Thread Artyom
View this message in context: http://lucene.472066.n3.nabble.com/why-search-time-increases-without-term-vectors-tp4035900p4037010.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: why search time increases without term vectors?

2013-01-29 Thread Upayavira
No, not at all. Presence or not of term vectors won't impact replication in that way. For SolrCloud, it is up to each node to create term vectors when it receives a document for indexing. Using 3.x style replication, the slave will pull all changed files making up changed segments on replic

Re: why search time increases without term vectors?

2013-01-28 Thread Artyom
message in context: http://lucene.472066.n3.nabble.com/why-search-time-increases-without-term-vectors-tp4035900p4036962.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Question about Term Vectors

2011-03-14 Thread Markus Jelsma
You need to reindex. On Monday 14 March 2011 14:04:00 Ahsan |qbal wrote: > Hi All > > Is there any way to drop term vectors from already built index file. > > Regards > Ahsan Iqbal -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350

Question about Term Vectors

2011-03-14 Thread Ahsan |qbal
Hi All Is there any way to drop term vectors from already built index file. Regards Ahsan Iqbal

Re: Highlighting with/without Term Vectors

2011-02-05 Thread Salman Akram
mostly that is not an issue (in some queries that's slow too but its size is the main issue). On average more than 90% of the query time is taken by 1st index file in searching (and total count as well). The confusion that I had was on the 1st index file which didn't have Term Vectors in

Re: Highlighting with/without Term Vectors

2011-02-04 Thread Otis Gospodnetic
al Message > From: Salman Akram > To: solr-user@lucene.apache.org > Sent: Fri, February 4, 2011 8:03:06 AM > Subject: Re: Highlighting with/without Term Vectors > > Basically Term Vectors are only on one main field i.e. Contents. Average > size of each document would be fe

Re: Highlighting with/without Term Vectors

2011-02-04 Thread Salman Akram
Basically Term Vectors are only on one main field i.e. Contents. Average size of each document would be few KB's but there are around 130 million documents so what do you suggest now? On Fri, Feb 4, 2011 at 5:24 PM, Otis Gospodnetic wrote: > Salman, > > It also depends on th

Re: Highlighting with/without Term Vectors

2011-02-04 Thread Otis Gospodnetic
/ - Original Message > From: Grant Ingersoll > To: solr-user@lucene.apache.org > Sent: Wed, January 26, 2011 10:44:09 AM > Subject: Re: Highlighting with/without Term Vectors > > > On Jan 24, 2011, at 2:42 PM, Salman Akram wrote: > > > Hi, > > > >

Re: Highlighting with/without Term Vectors

2011-01-26 Thread Grant Ingersoll
On Jan 24, 2011, at 2:42 PM, Salman Akram wrote: > Hi, > > Does anyone have any benchmarks how much highlighting speeds up with Term > Vectors (compared to without it)? e.g. if highlighting on 20 documents take > 1 sec with Term Vectors any idea how long it will take without th

Re: Highlighting with/without Term Vectors

2011-01-25 Thread Salman Akram
compressed so should be much smaller. > Total documents are more than 100 million. > > > On Tue, Jan 25, 2011 at 12:42 AM, Salman Akram < > salman.ak...@northbaysolutions.net> wrote: > >> Hi, >> >> Does anyone have any benchmarks how much highlighting

Re: Highlighting with/without Term Vectors

2011-01-24 Thread Salman Akram
Akram < salman.ak...@northbaysolutions.net> wrote: > Hi, > > Does anyone have any benchmarks how much highlighting speeds up with Term > Vectors (compared to without it)? e.g. if highlighting on 20 documents take > 1 sec with Term Vectors any idea how long it will take without them

Highlighting with/without Term Vectors

2011-01-24 Thread Salman Akram
Hi, Does anyone have any benchmarks how much highlighting speeds up with Term Vectors (compared to without it)? e.g. if highlighting on 20 documents take 1 sec with Term Vectors any idea how long it will take without them? I need to know since the index used for highlighting has a TVF file of

Highlighter problem when using WordDelimiterFilter and term vectors

2010-12-30 Thread Oliver Messner
t=on&hl=true some text WarmWarmWasserSpeicher here As you can see, the highlighter does not work like expected (at least for me). If the term vectors are not stored into the index, I get the expected result some text WarmWasserSpeicher here. I'm using Solr version 1.

Re: access term vectors in lucene

2010-06-16 Thread Grant Ingersoll
ctually i > making a project where i need tf idf values of all the terms in the > documents.. but i m unable to get any reference eg where it shows how to use > these term vectors to get the tf idf values of ALL the terms in my > documents... > > Plz help > > -Sarfaraz > >

access term vectors in lucene

2010-06-16 Thread sarfaraz masood
Hello all, I want to know how we can access term vectors in Lucene. I am making a project where I need the tf-idf values of all the terms in the documents, but I am unable to find any reference example that shows how to use these term vectors to get the tf-idf values of ALL the terms in
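
To make the question concrete, the sketch below shows the arithmetic of turning per-document term frequencies and corpus-wide document frequencies into tf-idf weights. It is plain Python, not the Lucene API, and it uses the textbook weight tf * ln(N / df), which is an assumption and not Lucene's own scoring formula; the function name and the toy corpus are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps doc id -> list of terms; returns doc id -> {term: weight}."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))

    weights = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)  # raw term frequency within this document
        weights[doc_id] = {
            term: count * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return weights


corpus = {
    "d1": ["solr", "solr", "lucene"],
    "d2": ["lucene"],
}
weights = tf_idf(corpus)
print(weights)
```

With this weighting a term that occurs in every document gets weight 0, since ln(1) = 0.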

Re: Bigram term vectors and weights possible with Solr?

2010-02-09 Thread Mike Hughes
Thank you Ahmet, this is exactly what I was looking for. Looks like the shingle filter can produce 3+-gram terms as well, that's great. I'm going to try this with both western and CJK language tokenizers and see how it turns out. On Tue, Feb 9, 2010 at 5:07 PM, Ahmet Arslan wrote: >> I've been l

Re: Bigram term vectors and weights possible with Solr?

2010-02-09 Thread Ahmet Arslan
> I've been looking at the Solr TermVectorComponent > (http://wiki.apache.org/solr/TermVectorComponent) and it > seems to have > something similar to this, but it looks to me like this is > a component > that is processed at query time (?) and is limited to > 1-gram terms. If you use it can give

Bigram term vectors and weights possible with Solr?

2010-02-09 Thread Mike Hughes
Hello, One of the commercial search platforms I work with has the concept of 'document vectors', which are 1-gram and 2-gram phrases and their associated tf/idf weights on a 0-1 scale, i.e. ["banana pie", 0.99] means banana pie is very relevant for this document. During the ingest/indexing proces
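
As an illustration of the 1-gram and 2-gram terms described above, here is a small Python sketch that builds word n-grams ("shingles") from a token list. It only shows the idea; it is not Solr configuration and does not reproduce the commercial platform's weighting. The function name and sample tokens are assumptions.

```python
def shingles(tokens, max_size=2):
    """Return all word n-grams of length 1..max_size, shortest first."""
    out = []
    for size in range(1, max_size + 1):
        # Slide a window of `size` tokens across the input.
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


terms = shingles(["banana", "pie", "recipe"])
print(terms)
```

Raising `max_size` to 3 would also yield "banana pie recipe", which matches the observation in the reply that shingling can produce 3-gram terms and longer.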

Re: using term vectors

2010-01-21 Thread Koji Sekiguchi
Tim, You should define the search component in solrconfig.xml, not schema.xml. Koji -- http://www.rondhuit.com/en/ Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote: Hi, I am trying termVectorComponents in SOLR. Per wiki I am trying to define component and handler. I define it so:

using term vectors

2010-01-21 Thread Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS]
Hi, I am trying termVectorComponents in SOLR. Per wiki I am trying to define component and handler. I define it so: true tvComponent ... But when I qu

Re: Dealing with term vectors

2009-09-15 Thread Grant Ingersoll
On Sep 15, 2009, at 5:31 AM, Licinio Fernández Maurelo wrote: Hi there, i want to recover the term vectors from indexes not calculating then but just only recovering instead. http://wiki.apache.org/solr/TermVectorComponent Some questions about this topic: 1. When i put the

Dealing with term vectors

2009-09-15 Thread Licinio Fernández Maurelo
Hi there, I want to recover the term vectors from indexes, not calculating them but just recovering them instead. Some questions about this topic: 1. When I put the option ... what's happening behind? 1. Is Lucene storing the tv in the index? 2. Is Lucene storing addit

Re: term vectors

2009-05-28 Thread Erik Hatcher
Nutch - Original Message From: Walter Underwood To: solr-user@lucene.apache.org Sent: Wednesday, May 27, 2009 10:53:16 PM Subject: Re: term vectors If you really, really need to do XML-smart queries, go ahead and buy MarkLogic. I've worked with the principal folk there and they are re

Re: term vectors

2009-05-27 Thread Otis Gospodnetic
> Sent: Wednesday, May 27, 2009 10:53:16 PM > Subject: Re: term vectors > > If you really, really need to do XML-smart queries, go ahead and buy > MarkLogic. I've worked with the principle folk there and they are > really sharp. Their engine is awesome. XML search is hard, a

Re: term vectors

2009-05-27 Thread Walter Underwood
If you really, really need to do XML-smart queries, go ahead and buy MarkLogic. I've worked with the principal folk there and they are really sharp. Their engine is awesome. XML search is hard, and you can't take a regular search engine, even a really good one, and make it do full XML without tons

Re: term vectors

2009-05-27 Thread Matt Mitchell
I've been experimenting with the XML + Solr combo too. What I've found to be a good working solution is to: pick out the nodes you want as solr documents (every div1 or div2 etc.) index the text only (with lots of metadata fields) add a field for either the xpath to that node, or save the indivi

Re: term vectors

2009-05-27 Thread Erik Hatcher
On May 27, 2009, at 4:56 PM, Yosvanys Aponte wrote: i understand what you say but the problem i have is user can make query like this: //tei.2//p"[quijote"] A couple of problems with this... for one, there's no query parser that'll interpret that syntax as you mean it in Solr. And also,

Re: term vectors

2009-05-27 Thread Yosvanys Aponte
are after is > flattening your structure so that it fits within Solr/Lucene's > document & field capabilities and then doing fielded search from > there. I'm not sure term vectors relates to what you're doing, but > we'll know more if you post some more det

Re: term vectors

2009-05-26 Thread Erik Hatcher
ld capabilities and then doing fielded search from there. I'm not sure term vectors relates to what you're doing, but we'll know more if you post some more details. Thanks, Erik On May 26, 2009, at 5:35 AM, Yosvanys Aponte wrote: Hello!! I´m working with solr, index

term vectors

2009-05-26 Thread Yosvanys Aponte
part of the query, that is the structure part, but the content is in Solr. I need to search only in some fields, not in the whole Solr database. Could term vectors help me do this, or is there another way to do it? Thanks, Aponte -- View this message in context: http://www.nabble.com/term-vectors

Re: MoreLikeThis and term vectors - documentation suggestion

2007-02-27 Thread Ken Krugler
ut from what I can tell the analyzer (either the default StandardAnalyzer or whatever gets set explicitly) will still get used in that case, if there's no term vector. : Also the performance really stunk when I didn't use stored term vectors. well .. i'd still rather be able to s

Re: MoreLikeThis and term vectors - documentation suggestion

2007-02-27 Thread Chris Hostetter
k fine. what other problems did you run into when you looked into this Ken? : Also the performance really stunk when I didn't use stored term vectors. well .. i'd still rather be able to say "using termVectors to make MLT faster" than: "if you don't use termVectors MLT doesn't work at all" -Hoss

Re: MoreLikeThis and term vectors - documentation suggestion

2007-02-26 Thread Ken Krugler
s a lot more than just what analyzer is used, given all of the filters that are also in play. Also the performance really stunk when I didn't use stored term vectors. It would be nice for as many features as possible to work without term vectors. I sometimes wonder whether schema.xml

Re: MoreLikeThis and term vectors - documentation suggestion

2007-02-26 Thread Mike Klaas
ira/browse/SOLR-69 Is it possible to modify MoreLikeThis to use the schema.xml-defined analyzer? That's the way the highlighting code currently works (it picks the index-time analyzer). It would be nice for as many features as possible to work without term vectors. I sometimes wonder whethe

Re: MoreLikeThis and term vectors - documentation suggestion

2007-02-26 Thread Bertrand Delacretaz
On 2/26/07, Ken Krugler <[EMAIL PROTECTED]> wrote: ...I was trying out the MoreLikeThis support, and getting some odd results... Thanks for the info, I have added a link to your message at https://issues.apache.org/jira/browse/SOLR-69 -Bertrand

MoreLikeThis and term vectors - documentation suggestion

2007-02-26 Thread Ken Krugler
Hi all, I was trying out the MoreLikeThis support, and getting some odd results. I realized that unless the fields being used for similarity calculation have a stored term vector, the MoreLikeThis code from Lucene will re-analyze the field using the StandardAnalyzer. Which, in my case, is qui