Ah, OK, this is news to me and makes a lot more sense. Let me run this back past you to make sure I understand.
If I move my fulltext document from my SQL database into "keyword_document", the field will keep the original fulltext as its source value, but the index will have the stopword filter, lowercase filter etc. applied. Then, when I copy this to "truncated_document", it is the original source value that gets copied?

*This is my definition for keyword_description, using the stopwords.txt*

<fieldType name="keyword_description" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

*Then this to do the copying across. Is there somewhere specific to put this within the schema.xml?*

<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>

*Then do I need to have definitions for "truncated_description" in the same way that I did for "keyword_description"?*

<fieldType name="truncated_description" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="30" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Jack Krupansky-2 wrote
> You said "it has been copied from the keyword_document [field]", but the
> reality is that Solr is not copying from the indexed value of the field, but
> from the source value for the field. The idea is that multiple fields can be
> based on the same source value even if they analyze and index the value in
> different ways.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Spadez
> Sent: Monday, September 17, 2012 12:29 PM
> To: solr-user@.apache
> Subject: Re: Taking a full text, then truncate and duplicate with stopwords
>
> I'm really confused here. I have a document which is say 4000 words long. I
> want to get this put into two fields in Solr without having to save the
> original document in its entirety within Solr.
>
> When I import my fulltext (4000 word) document to Solr I was going to put it
> straight into keyword_document which uses stopwords to remove words like
> "and" "it" "this". Now I only have 3000 words for example.
>
> Then if I do copy command to move it into truncate_document then even though
> I can reduce it down to say 100 words, it is lacking words like "and" "it"
> and "this" because it has been copied from the keyword_document.
>
> I want the following scenario:
>
> truncate_document to have 100 words including words like "and" "it" and "this"
> keyword_docment to have only stop words removed
> And finally only have the fulltext document, full length and all stop words,
> exist in my SQL database.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008380.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
View this message in context: http://lucene.472066.n3.nabble.com/Taking-a-full-text-then-truncate-and-duplicate-with-stopwords-tp4008269p4008459.html
Sent from the Solr - User mailing list archive at Nabble.com.
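
P.S. To make the question concrete, here is a rough sketch of how I imagine the pieces fitting together in schema.xml. As far as I understand, copyField works on field names rather than fieldType names, so I have declared one field per type. Reusing the type names as field names, and the indexed/stored values, are my own guesses and not something confirmed in this thread.

```xml
<!-- Sketch only. The field declarations and their indexed/stored settings
     are assumptions; the fieldType definitions are the two shown above. -->
<fields>
  <!-- The full text is sent here: indexed with stopwords removed, not stored -->
  <field name="keyword_description" type="keyword_description" indexed="true" stored="false"/>
  <!-- Receives the start of the same source value, stopwords included -->
  <field name="truncated_description" type="truncated_description" indexed="true" stored="true"/>
</fields>

<!-- Copies the original source value (not the analyzed tokens),
     cut off after maxChars characters -->
<copyField source="keyword_description" dest="truncated_description" maxChars="3000"/>
```

If this is right, only the truncated text would be stored in Solr, and the full-length document would stay in my SQL database.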