How to retrieve tokens?
Hi everybody,

My name is Thiago and I'm new to Apache Solr and NoSQL databases. At the moment I'm using Solr for document indexing.

My question is: is there any way to retrieve the tokens in place of the original data?

For example, I have a field using the fieldtype text_general from the original schema.xml. If I insert a document with the string "All you need is love" in this field, the tokens that I get are: all, you, need, love. When I search this index, I want to get the tokens (all, you, need, love) in place of the indexed string.

I searched for this on the web and in this forum too, and I saw some people suggesting TermVectorsComponent. Is there an easier way to do it? From what I saw, TermVectorsComponent is more difficult and uses more memory.

Thanks to everybody.

Thiago

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-retrieve-tokens-tp3770007p3770007.html
Sent from the Solr - User mailing list archive at Nabble.com.
How to facet data from a multivalued field?
Hello everybody,

I've already searched the forum for this topic, but I didn't find any case like this. I apologize if it has already been discussed.

I'm having a problem faceting a multivalued field. My field is called series, and it holds names of TV series like the big bang theory, two and a half men ... One document can have many TV series names in this field. For example:

Two and a Half Men
How I Met Your Mother
The Big Bang Theory

What I want to do is search and count how many documents are related to each series. I'm doing it with a facet search on this field, but it returns each word separately. Like this:

91 91 21 45 45 21 45 45 91 21 45

And what I want is something like:

21 45 91

Is there any way to do this with facet search? I don't want the terms, I just want each whole string including the white spaces. Do I have to change my fieldtype to do this?

Thanks to everybody.

Thiago
Re: How to facet data from a multivalued field?
Thank you very much, Erik.

I just changed the fieldtype to String and it worked as I expected. Now I can select the count of the series.

Thanks again, and thanks to the others too.

Thiago

Erik Hatcher-4 wrote
> Thiago -
>
> You'll want your series field to be of type "string". If you also need
> that field searchable by the words within them, you can copyField to a
> separate "text" (or other analyzed) field type where you search on the
> tokenized field but facet on the "string" one.
>
> Erik
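Erik's distinction between an analyzed field and a "string" field can be illustrated without a running Solr. The sketch below is plain Ruby, not Solr code: it only imitates how facet counts come out when values are split into lowercase words (roughly what an analyzed text field does) versus kept whole (what a "string" field does). The sample documents and the helper name `facet_counts` are made up for illustration.

```ruby
# Sample documents with a multivalued `series` field (made-up data).
DOCS = [
  { 'series' => ['Two and a Half Men', 'The Big Bang Theory'] },
  { 'series' => ['The Big Bang Theory'] },
  { 'series' => ['How I Met Your Mother', 'Two and a Half Men'] }
].freeze

# Count how many documents contain each facet term.
# tokenize: true  -> split values into lowercase words (analyzed-field-like)
# tokenize: false -> keep each value whole (string-field-like)
def facet_counts(docs, field, tokenize:)
  counts = Hash.new(0)
  docs.each do |doc|
    terms = doc.fetch(field, []).flat_map do |value|
      tokenize ? value.downcase.split(/\s+/) : [value]
    end
    # A document counts once per term, however often the term repeats in it.
    terms.uniq.each { |term| counts[term] += 1 }
  end
  counts
end

whole = facet_counts(DOCS, 'series', tokenize: false)
words = facet_counts(DOCS, 'series', tokenize: true)

puts whole.inspect
puts words.inspect
```

With whole values, each series name is a single facet entry; with word splitting, entries such as "the" and "and" appear instead, which is the behaviour described in the original question.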
Problems with Memory
I'm having problems with memory when using Solr. I have an application that crawls the web for documents and does a lot of consecutive indexing. After some days of crawling, my Java process is consuming a lot of memory, and that doesn't seem right. My computer is starting to swap and my crawler is running very slowly.

My professor told me that Solr is using the cache. What can I do? Is there any option I should set to solve this problem?

Thanks in advance

Thiago
How to update one field without losing the others?
Hi people,

I'm trying to update one field of my Solr database, but the update replaces all the other fields. For example, if I have a record with the fields id, name, address and phone and I try to update just id and address, the name and the phone vanish. Is there any way to keep those fields in an update command?

I've already searched for this and I found this thread:

http://lucene.472066.n3.nabble.com/Update-Index-Updating-Specific-Fields-td506165.html

It says that I can't do this without losing my fields, but it was posted in 2010. Is this functionality present in Solr nowadays?

Thanks to everybody,

Thiago
Re: How to update one field without losing the others?
I'm already downloading the document and updating it with all the changes. I thought there was an easier way to do it.

Thanks for the information, Michael Della Bitta.

Thiago de Sousa Silveira
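The workaround described in this thread (fetch the full document, apply the changes, send the whole document back) can be sketched as a small merge step. This is an illustration only: the helper name `merge_for_reindex` and the sample values are hypothetical, and it assumes every field is stored in Solr so that the fetched document is complete.

```ruby
# Merge changed fields into the full stored document so that untouched
# fields survive when the whole document is re-indexed.
def merge_for_reindex(stored_doc, changes)
  unless changes['id'].nil? || changes['id'] == stored_doc['id']
    raise ArgumentError, 'changes must target the same document id'
  end
  stored_doc.merge(changes)
end

# Hypothetical document as fetched from the index, and the fields to change.
stored  = { 'id' => '42', 'name' => 'Ann', 'address' => 'Old St 1', 'phone' => '555-0100' }
changes = { 'id' => '42', 'address' => 'New Ave 9' }

full_doc = merge_for_reindex(stored, changes)
puts full_doc.inspect
```

The merged hash is what would be sent back as a complete document; name and phone are carried over from the fetched copy instead of vanishing.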
Returning a list of matching words
This may be obvious but I can't get my head straight. Is there a way to return a list of matching words that a record got matched against?

For instance:

record_a: ruby, solr, mysql, rails
record_b: solr, java

Then ?q=solr+OR+rails would return the matched words for the records:

record_a: solr, rails
record_b: solr

I'm not looking into using the highlight feature for that.

Thanks,

--
Thiago Jackiw
Solr on trunk throwing 404 errors
I've just downloaded the trunk version of Solr (great changes by the way, kudos!) and all I get after the server starts are 404 errors whenever I send requests. Any ideas why this could be happening? Thanks, -- Thiago Jackiw
Re: Solr on trunk throwing 404 errors
Grant,

Yes, I'm just starting it out from the examples directory flat out of the trunk repository.

This is the output when I run "java -jar start.jar":

2007-11-15 14:33:23.884::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2007-11-15 14:33:24.173::INFO: jetty-6.1.3
2007-11-15 14:33:24.263::INFO: Started SocketConnector @ 0.0.0.0:8983

There are no exceptions in the log, except 404's:

127.0.0.1 - - [15/11/2007:22:34:49 +] "GET /solr/admin/ HTTP/1.1" 404 1298
127.0.0.1 - - [15/11/2007:22:34:55 +] "GET / HTTP/1.1" 404 618
127.0.0.1 - - [15/11/2007:22:34:58 +] "GET /solr HTTP/1.1" 404 1291
127.0.0.1 - - [15/11/2007:22:34:05 +] "POST /solr/update HTTP/1.1" 404 1298

Thanks.

--
Thiago Jackiw

On Nov 15, 2007 1:12 PM, Grant Ingersoll wrote:
> Are there any exceptions in the logs? Are you trying the Jetty
> example? Can you give us more info?
>
> -Grant
Re: Solr on trunk throwing 404 errors
Ha! That did it. Thanks.

Is that because I'm using the trunk and not a released version?

--
Thiago Jackiw

On Nov 15, 2007 2:49 PM, Mike Klaas wrote:
> Have you built the project ('$ ant example')?
>
> -Mike
Returning all rows from a query
Is there a way to retrieve all rows found without having to specify a value for it (?q=sales&rows=HUGE_NUMBER)? For instance, what I'd like to do would be something like "rows=*" or "rows=all" and that would return all the records found, without any limits. Thanks.
[acts_as_solr] Release v.0.8 is out
The new release v.0.8 of acts_as_solr is out and includes:

NEW - New video tutorial
NEW - Faceted search has been implemented and it's possible to 'drill-down' on the facets
NEW - New rake tasks you can use to start/stop the solr server in test, development and production environments: (thanks Matt Clark)

    rake solr:start|stop RAILS_ENV=test|development|production (defaults to development if none given)

NEW - Changes to the plugin's test framework and it now supports Sqlite as well (thanks Matt Clark)
FIX - Patch applied (thanks Micah) that allows one to have multiple solr instances in the same servlet
FIX - Patch applied (thanks Micah) that allows indexing of STIs
FIX - Patch applied (thanks Gordon) that allows the plugin to use a table's primary key different than 'id'
FIX - Returning empty array instead of empty strings when no records are found
FIX - Problem with unit tests failing due to order of the tests and speed of the commits

== About ==
This plugin adds full text search capabilities and many other nifty features from Apache's Solr to any Rails model

== Installation ==
In your Rails root directory, just type:

    script/plugin install http://opensvn.csie.org/acts_as_solr/trunk

== Very Basic Usage ==
Just add the line below to any of your ActiveRecord models:

    acts_as_solr

Or if you want, you can specify only the fields that should be indexed:

    acts_as_solr :fields => [:name, :author]

Then to find instances of your model, just do:

    Model.find_by_solr(query)

or

    Model.find_id_by_solr(query)

Or if you want to specify the starting row and the number of rows per page:

    Model.find_by_solr(query, :start => 0, :rows => 10)

Get it while it's hot => http://acts-as-solr.rubyforge.org

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.rubyforge.org
Sitealizer => http://sitealizer.rubyforge.org
[ANN] acts_as_solr has a new home, please update
The acts_as_solr plugin has a new home, so please make sure you update your bookmarks:

web: http://acts_as_solr.railsfreaks.com
trac: http://trac.railsfreaks.com/projects/acts_as_solr
svn: svn://svn.railsfreaks.com/projects/acts_as_solr
api: http://api.railsfreaks.com/projects/acts_as_solr

The current address (http://acts-as-solr.rubyforge.org) will be obsolete by release version 1.0

--
Thiago Jackiw
[ANN] acts_as_solr v.0.8.5 has been released
The acts_as_solr plugin v.0.8.5 has been released and this short release includes:

FIX: There's no need to specify the :field_types anymore when doing a search in a model that specifies a field type for a field. The field types are automatically traced back when they're included

    #Indexing
    class Electronic < ActiveRecord::Base
      acts_as_solr :fields => [{:price => :range_float}]
    end

    #Searching
    Electronic.find_by_solr "ipod AND price:[* TO 59.99]"

FIX: Better handling of nil values from indexed fields. Solr complained when indexing fields with field type and the field values being passed as nils.

NEW: Adding Solr sort (order by) option to the search query (thanks Kevin Hunt)

    #This will return the records in ascending order based on the price
    Electronic.find_by_solr "ipod AND price:[* TO 59.99]", :order => 'price asc'

FIX: Applying patch suggested for increasing the Solr commit speed (thanks Mourad Hammiche)

FIX: Updated documentation

web => http://acts_as_solr.railsfreaks.com
svn => svn://svn.railsfreaks.com/projects/acts_as_solr/trunk

*Note: the old address (http://acts-as-solr.rubyforge.org) and repository (http://opensvn.csie.org/acts_as_solr/trunk) aren't being updated anymore and will become obsolete by release version 1.0. Please use the addresses mentioned above.

Have fun!

--
Thiago Jackiw
Re: Delete entire index
Matt,

I could be wrong, but I think you can send a "delete by query" request that matches everything:

    <delete><query>*:*</query></delete>

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com

On 6/13/07, Matt Mitchell wrote:
> Hi,
>
> Is there a way to have Solr completely remove the current index?
>
> We're still in development and so our schema is wavering. Anytime we make a
> change and want to re-index we first have to:
>
>   stop tomcat (or the solr webapp)
>   manually remove the data/index
>   restart tomcat (or the solr webapp)
>
> The removing of the data/index directory is where we have the most trouble,
> because of the file permissions. The data/index directory is owned by
> tomcat/tomcat so in order to remove it, we have to issue sudo rm, which we'd
> like to avoid. Ideally if we could just tell Solr to delete all data without
> having to do any more manual work, it'd be great! : )
>
> Something else that would help is if we could tell Tomcat/Solr which
> user/group and/or permission to use on the data/index directory when it's
> created.
>
> Any thoughts on this?
>
> Matt
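A delete-by-query that matches everything (`*:*`) is sent as an XML message to Solr's update handler, followed by a commit. The sketch below only builds the two requests with Ruby's standard Net::HTTP; the URL http://localhost:8983/solr/update is an assumption and should be replaced with the address of your own installation, and the helper names `update_request` and `send_update` are made up for this example.

```ruby
require 'net/http'
require 'uri'

# Assumed update endpoint; adjust host, port and path to your install.
UPDATE_URL = URI('http://localhost:8983/solr/update')

# Build (but do not send) an XML update request.
def update_request(xml)
  req = Net::HTTP::Post.new(UPDATE_URL)
  req['Content-Type'] = 'text/xml; charset=utf-8'
  req.body = xml
  req
end

# Sending needs a running Solr, so it is a separate method to call explicitly.
def send_update(req)
  Net::HTTP.start(UPDATE_URL.host, UPDATE_URL.port) { |http| http.request(req) }
end

delete_all = update_request('<delete><query>*:*</query></delete>')
commit     = update_request('<commit/>')

puts delete_all.body
puts commit.body
```

Calling `send_update(delete_all)` and then `send_update(commit)` against a live server would empty the index without touching the data/index directory on disk.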
[ANN] acts_as_solr v.0.9 has been released
It's with great pleasure that I announce this great milestone for the acts_as_solr plugin. Thanks to all who contributed with ideas, patches, etc.

= About =
This plugin adds full text search capabilities and many other nifty features from Apache's Solr to any Rails model.

= IMPORTANT: Before you Upgrade from v.0.8.5 =
If you are currently using the embedded Solr in a production environment, please make sure you back up the data directory before upgrading to version 0.9, because the directory where Solr lives is now acts_as_solr/solr instead of acts_as_solr/test/solr.

= Changes

NEW: Added the option :scores when doing a search. If set to true this will return the score as a 'solr_score' attribute for each one of the instances found

    books = Book.find_by_solr 'ruby OR splinter', :scores => true
    books.records.first.solr_score
    => 1.21321397
    books.records.last.solr_score
    => 0.12321548

NEW: Major change on the way the results returned are accessed.

    books = Book.find_by_solr 'ruby'
    # the above will return a SearchResults class with 4 methods:
    #
    # docs|results|records: will return an array of records found
    #
    #   books.records.is_a?(Array)
    #   # => true
    #
    # total|num_found|total_hits: will return the total number of records found
    #
    #   books.total
    #   # => 2
    #
    # facets: will return the facets when doing a faceted search
    #
    # max_score|highest_score: returns the highest score found
    #
    #   books.max_score
    #   # => 1.3213213

NEW: Integrating acts_as_solr to use solr-ruby as the 'backend'. Integration based on the patch submitted by Erik Hatcher

NEW: Re-factoring rebuild_solr_index to allow adds to be done in batch; and if a finder block is given, it will be called to retrieve the items to index. (thanks Daniel E.)

NEW: Adding the option to specify the port Solr should start on when using rake solr:start

    rake solr:start RAILS_ENV=your_env PORT=XX

NEW: Adding deprecation warning for the :background configuration option. It will no longer be updated.

NEW: Adding support for models that use a primary key other than integer

    class Posting < ActiveRecord::Base
      set_primary_key 'guid' #string
      #make sure you set the :primary_key_field => 'pk_s' if you wish to use a string field as the primary key
      acts_as_solr({},{:primary_key_field => 'pk_s'})
    end

FIX: Disabling of storing most fields. Storage isn't useful for acts_as_solr in any field other than the pk and id fields. It just takes up space and time. (thanks Daniel E.)

FIX: Re-factoring code submitted by Daniel E.

NEW: Adding an :auto_commit option that will only send the commit command to Solr if it is set to true

    class Author < ActiveRecord::Base
      acts_as_solr :auto_commit => false
    end

FIX: Fixing bug on rake's test task

FIX: Making acts_as_solr's Post class compatible with Solr 1.2 (thanks Si)

NEW: Adding Solr 1.2

FIX: Removing Solr 1.1

NEW: Adding a conditional :if option to the acts_as_solr call. It behaves the same way ActiveRecord's :if argument option does.

    class Electronic < ActiveRecord::Base
      acts_as_solr :if => proc{|record| record.is_active?}
    end

NEW: Adding fixtures to Solr index when using rake db:fixtures:load

FIX: Fixing boost warning messages

FIX: Fixing bug when adding a facet to a field that contains boost

NEW: Deprecating find_with_facet and combining functionality with find_by_solr

NEW: Adding the option to :exclude_fields when indexing a model

    class User < ActiveRecord::Base
      acts_as_solr :exclude_fields => [:password, :login, :credit_card_number]
    end

FIX: Fixing branch bug on older ruby version

NEW: Adding boost support for fields and documents being indexed:

    class Electronic < ActiveRecord::Base
      # You can add boosting on a per-field basis or on the entire document
      acts_as_solr :fields => [{:price => {:boost => 5.0}}], :boost => 5.0
    end

FIX: Fixed the acts_as_solr limitation to only accept test|development|production environments.

===== /Changes

For more info: http://acts_as_solr.railsfreaks.com
OR if your browser/isp can't render: http://acts-as-solr.railsfreaks.com

--
Thiago Jackiw
Rejecting fields with null values
I'm not sure if this is possible or not, but is there a way to do a search and reject fields that are empty or have null values, like the pseudo code below?

?q=test+AND+(NOT+field_b:NULL)

If this is not currently supported, does anyone think it would not be a good idea to implement it?

Thanks,

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com
Re: Rejecting fields with null values
Hoss,

I see what you mean. I guess searching for fields that are required to have a value, the way you explained, is a good way to go.

Thanks!

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com

On 6/20/07, Chris Hostetter wrote:
> : I'm not sure if this is possible or not, but, is there a way to do a
> : search and reject fields that are empty or have null values like the
> : pseudo code below?
>
> As an inverted index, the Lucene index Solr uses doesn't know when
> documents have an "empty" value ... it stores the inverted mapping of
> value=>documents, so there is no way to query for field_b:NULL, let
> alone "NOT field_b:NULL"
>
> You can, however, query for things like:
>
>     field_b:[* TO *]
>
> which requires field_b to have some value (that seems to be the use
> case you are after).
>
> As a general rule, if you really want to be able to support searches for
> things like "find all docs where there is no value in field X", the
> easiest way to achieve something like that in Solr is to configure the
> field with a default value in the schema ... something that would never
> normally appear in your data (a placeholder for 'null' so to speak) and
> query on that.
>
> -Hoss
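The open-ended range query Hoss suggests, field_b:[* TO *], can be appended to a user query and URL-encoded as the q parameter. A minimal sketch follows; the helper name `with_value_required` is made up for this example, and only the query syntax itself comes from the thread.

```ruby
require 'uri'

# Append a clause that requires `field` to hold some value.
def with_value_required(query, field)
  "#{query} AND #{field}:[* TO *]"
end

q = with_value_required('test', 'field_b')
encoded = URI.encode_www_form('q' => q)

puts q
puts encoded
```

The first line printed is the raw query, `test AND field_b:[* TO *]`; the second is the form-encoded parameter to append after the `?` of a select URL.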
Same record belonging to multiple facets
Is there a way for a record to belong to multiple facets? If so, how would one go about implementing it?

What I'd like to accomplish would be something like:

record A:
  name="John Doe"
  category_facet="Cars"
  category_facet="Electronics"

And when searching for "John Doe" his record would appear under both "Cars" and "Electronics" facet categories.

Thanks.

--
Thiago Jackiw
Re: Same record belonging to multiple facets
Is it that simple? Cool, I'll give it a try.

--
Thiago Jackiw

On 7/5/07, Martin Grotzke wrote:
> On Thu, 2007-07-05 at 12:39 -0700, Thiago Jackiw wrote:
> > Is there a way for a record to belong to multiple facets? If so, how
> > would one go about implementing it?
> >
> > What I'd like to accomplish would be something like:
> >
> > record A:
> >   name="John Doe"
> >   category_facet="Cars"
> >   category_facet="Electronics"
>
> Isn't this the multiValued="true" property in your field definition
> for category_facet?
>
> Cheers,
> Martin
>
> > And when searching for "John Doe" his record would appear under both
> > "Cars" and "Electronics" facet categories.
> >
> > Thanks.
> >
> > --
> > Thiago Jackiw
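With a multivalued field, the document sent to Solr simply repeats the field once per value. The sketch below builds such an XML add message in plain Ruby. It assumes the schema declares category_facet with multiValued="true", as Martin suggests; the helper name `add_xml` is made up for this example.

```ruby
require 'cgi'

# Build a Solr XML <add> message; array values become repeated <field> elements.
def add_xml(doc)
  fields = doc.flat_map do |name, value|
    Array(value).map do |v|
      %(<field name="#{CGI.escapeHTML(name)}">#{CGI.escapeHTML(v.to_s)}</field>)
    end
  end
  "<add><doc>#{fields.join}</doc></add>"
end

xml = add_xml('name' => 'John Doe', 'category_facet' => %w[Cars Electronics])
puts xml
```

Faceting on category_facet would then count this one record under both "Cars" and "Electronics".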