How to retrieve tokens?

2012-02-23 Thread Thiago
Hi to everybody,

My name is Thiago and I'm new to Apache Solr and NoSQL databases. At the
moment, I'm using Solr for document indexing. My question is: is there any
way to retrieve the tokens instead of the original data?

For example:
I have a field using the fieldtype text_general from the original
schema.xml. If I insert a document with the following string in this field:
"All you need is love", the tokens that I get are: all, you, need, love.
When I search this index, I want to get the tokens (all, you, need, love)
instead of the indexed string.
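
To make the example concrete, here is a rough Ruby sketch of why the analyzed form differs from the stored string. It only imitates an analysis chain (split into words, lowercase, drop stopwords); it is not Solr's text_general analyzer, and the stopword list is a hypothetical stand-in for whatever stopwords.txt contains in a given setup.

```ruby
# Rough imitation of an analysis chain: tokenize, lowercase, drop stopwords.
# The default stopword list is hypothetical; Solr reads the real one from
# the stopwords file configured for the field type.
def approximate_tokens(text, stopwords = %w[a an and are is of the to])
  text.downcase.scan(/[a-z0-9]+/).reject { |token| stopwords.include?(token) }
end

puts approximate_tokens("All you need is love").inspect
# prints ["all", "you", "need", "love"]
```

The stored value stays "All you need is love"; only the index sees the tokens, which is why a normal search response returns the original string.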

I searched for this on the web and in this forum too, and I saw some people
suggesting the TermVectorComponent. Is there an easier way to do it? As far
as I can tell, the TermVectorComponent is harder to use and uses more memory.

Thanks to everybody.

Thiago


--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-retrieve-tokens-tp3770007p3770007.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to facet data from a multivalued field?

2012-04-09 Thread Thiago
Hello everybody,

I've already searched the forum for this topic, but I didn't find any
case like this. I apologize if this topic has already been discussed.

I'm having a problem faceting a multivalued field. My field is called
series, and it holds names of TV series such as The Big Bang Theory, Two and
a Half Men ...

In this field I can have a lot of TV series names. For example:


   Two and a Half Men
   How I Met Your Mother
   The Big Bang Theory


What I want to do is count how many documents are related to each series.
I'm doing this with a facet search on this field, but it's returning each
word separately, like this:





   91
   91
   21
   45
   45
   21
   45
   45
   91
   21
   45






And what I want is something like:





   21
   45
   91






Is there any way to do this with facet search? I don't want the individual
terms; I just want each whole string, including the white space. Do I have
to change my fieldtype to do this?

Thanks to everybody.

Thiago 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-facet-data-from-a-multivalued-field-tp3897853p3897853.html


Re: How to facet data from a multivalued field?

2012-04-11 Thread Thiago
Thank you very much, Erik. I just changed the fieldtype to string and it
worked as I expected. Now I can get the count for each series. Thanks again,
and thanks to the others too.

Thiago


Erik Hatcher-4 wrote
> 
> Thiago -
> 
> You'll want your series field to be of type "string". If you also need
> that field searchable by the words within it, you can copyField to a
> separate "text" (or other analyzed) field type, so that you search on the
> tokenized field but facet on the "string" one.
> 
>   Erik
> 
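
For readers following along, the setup Erik describes can be sketched roughly as below. The schema lines in the comments are an assumption about how the two fields might be declared (the name series_text is hypothetical), and the base URL is the default from the Solr example server; only the field name series comes from the thread.

```ruby
require "uri"

# schema.xml side, as described in the reply (series_text is a hypothetical name):
#   <field name="series" type="string" indexed="true" stored="true" multiValued="true"/>
#   <field name="series_text" type="text_general" indexed="true" stored="false" multiValued="true"/>
#   <copyField source="series" dest="series_text"/>

# Builds a facet-only request on the untokenized field.
# The base URL is an assumption; adjust it to the actual Solr location.
def series_facet_url(base = "http://localhost:8983/solr/select")
  params = {
    "q" => "*:*",
    "rows" => "0",           # only the facet counts are wanted, not the documents
    "facet" => "true",
    "facet.field" => "series"
  }
  "#{base}?#{URI.encode_www_form(params)}"
end

puts series_facet_url
```

Because a string field is indexed as a single term per value, each facet entry is the whole series name rather than one entry per word.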


--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-facet-data-from-a-multivalued-field-tp3897853p3902621.html


Problems with Memory

2012-05-11 Thread Thiago
I'm having problems with memory when I'm using Solr. I have an application
that crawls the web for some documents. It does a lot of consecutive
indexing. But after some days of crawling, I'm having problems with memory.
My Java process is consuming a lot of memory and that doesn't seem OK. My
computer is starting to swap and my crawler is running very slowly. My
professor told me that Solr is using the cache. What can I do? Is there any
option that I should set to solve this problem?

Thanks in advance

Thiago

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-with-Memory-tp3980765.html


How to update one field without losing the others?

2012-06-16 Thread Thiago
Hi people,

I'm trying to update one field in my Solr database, but the update
overwrites all the other fields. For example, if I have a record with the
fields id, name, address and phone and I try to update just id and address,
the name and the phone vanish. Is there any way to keep those fields in an
update command? I've already searched for this and I found this thread:
http://lucene.472066.n3.nabble.com/Update-Index-Updating-Specific-Fields-td506165.html

It says that I can't do this without losing my fields, but it was
posted in 2010. Is this functionality present in Solr nowadays?

Thanks to everybody,

Thiago



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-update-one-field-without-losing-the-others-tp3989959.html


Re: How to update one field without losing the others?

2012-06-16 Thread Thiago
I'm already downloading the document and updating it with all the changes. I
thought there was an easier way to do it.
Thanks for the information, Michael Della Bitta.
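
For anyone landing on this thread, the read-modify-write approach mentioned above can be sketched in Ruby as follows. The field names come from the question; the values are made up, and the sketch assumes every field is stored, since a field that is not stored cannot be read back from Solr to be re-sent.

```ruby
# Read-modify-write: start from the document as currently stored, overlay
# the changed fields, and re-send the complete result so nothing is dropped.
def merged_document(existing, changes)
  existing.merge(changes)
end

# Example values are invented for illustration.
existing = { "id" => "42", "name" => "Ana", "address" => "Old Street 1", "phone" => "555-0100" }
updated  = merged_document(existing, { "address" => "New Avenue 9" })

puts updated.keys.inspect
puts updated["address"]
```

The merged hash still carries name and phone, so re-adding it under the same id replaces the old document without losing those fields.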

Thiago de Sousa Silveira

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-update-one-field-without-losing-the-others-tp3989959p3989962.html


Returning a list of matching words

2007-08-09 Thread Thiago Jackiw
This may be obvious but I can't get my head straight. Is there a way
to return a list of matching words that a record got matched against?
For instance:

record_a: ruby, solr, mysql, rails
record_b: solr, java

Then ?q=solr+OR+rails would return the matched words for the records

record_a: solr, rails
record_b: solr

I'm not looking into using the highlight feature for that.
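
To pin down the output I have in mind, here is a small Ruby sketch that computes it on the client side from data already in hand. It is only an illustration of the desired result, not something Solr returns; the method name and the hash layout are made up for the example.

```ruby
# Returns, per record, the query terms that also occur among that record's
# words. Records without any matching word are left out.
def matched_terms(records, query_terms)
  wanted = query_terms.map(&:downcase)
  records
    .transform_values { |words| words.map(&:downcase) & wanted }
    .reject { |_name, hits| hits.empty? }
end

records = {
  "record_a" => %w[ruby solr mysql rails],
  "record_b" => %w[solr java]
}

matched_terms(records, %w[solr rails]).each do |name, hits|
  puts "#{name}: #{hits.join(', ')}"
end
```

With the two records above this lists solr and rails for record_a and solr for record_b.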

Thanks,

--
Thiago Jackiw


Solr on trunk throwing 404 errors

2007-11-15 Thread Thiago Jackiw
I've just downloaded the trunk version of Solr (great changes by the
way, kudos!) and all I get after the server starts are 404 errors
whenever I send requests.

Any ideas why this could be happening?

Thanks,

--
Thiago Jackiw


Re: Solr on trunk throwing 404 errors

2007-11-15 Thread Thiago Jackiw
Grant,

Yes, I'm just starting it out from the examples directory flat out of
the trunk repository.

This is the output when I run "java -jar start.jar"
2007-11-15 14:33:23.884::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
2007-11-15 14:33:24.173::INFO:  jetty-6.1.3
2007-11-15 14:33:24.263::INFO:  Started SocketConnector @ 0.0.0.0:8983

There are no exceptions in the log, except 404's:
127.0.0.1 -  -  [15/11/2007:22:34:49 +] "GET /solr/admin/ HTTP/1.1" 404 1298
127.0.0.1 -  -  [15/11/2007:22:34:55 +] "GET / HTTP/1.1" 404 618
127.0.0.1 -  -  [15/11/2007:22:34:58 +] "GET /solr HTTP/1.1" 404 1291
127.0.0.1 -  -  [15/11/2007:22:34:05 +] "POST /solr/update HTTP/1.1" 404 1298

Thanks.

--
Thiago Jackiw


On Nov 15, 2007 1:12 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Are there any exceptions in the logs?  Are you trying the Jetty
> example?  Can you give us more info?
>
> -Grant
>
>
> On Nov 15, 2007, at 3:37 PM, Thiago Jackiw wrote:
>
> > I've just downloaded the trunk version of Solr (great changes by the
> > way, kudos!) and all I get after the server starts are 404 errors
> > whenever I send requests.
> >
> > Any ideas why this could be happening?
> >
> > Thanks,
> >
> > --
> > Thiago Jackiw
>
>


Re: Solr on trunk throwing 404 errors

2007-11-15 Thread Thiago Jackiw
Ha! That did it. Thanks. Is that because I'm using the trunk and not a
released version?

--
Thiago Jackiw


On Nov 15, 2007 2:49 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> Have you built the project ('$ ant example')?
>
> -Mike
>
>
> On 15-Nov-07, at 2:41 PM, Thiago Jackiw wrote:
>
> > Grant,
> >
> > Yes, I'm just starting it out from the examples directory flat out of
> > the trunk repository.
> >
> > This is the output when I run "java -jar start.jar"
> > 2007-11-15 14:33:23.884::INFO:  Logging to STDERR via
> > org.mortbay.log.StdErrLog
> > 2007-11-15 14:33:24.173::INFO:  jetty-6.1.3
> > 2007-11-15 14:33:24.263::INFO:  Started SocketConnector @ 0.0.0.0:8983
> >
> > There are no exceptions in the log, except 404's:
> > 127.0.0.1 -  -  [15/11/2007:22:34:49 +] "GET /solr/admin/
> > HTTP/1.1" 404 1298
> > 127.0.0.1 -  -  [15/11/2007:22:34:55 +] "GET / HTTP/1.1" 404 618
> > 127.0.0.1 -  -  [15/11/2007:22:34:58 +] "GET /solr HTTP/1.1"
> > 404 1291
> > 127.0.0.1 -  -  [15/11/2007:22:34:05 +] "POST /solr/update
> > HTTP/1.1" 404 1298
> >
> > Thanks.
> >
> > --
> > Thiago Jackiw
> >
> >
> > On Nov 15, 2007 1:12 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> >> Are there any exceptions in the logs?  Are you trying the Jetty
> >> example?  Can you give us more info?
> >>
> >> -Grant
> >>
> >>
> >> On Nov 15, 2007, at 3:37 PM, Thiago Jackiw wrote:
> >>
> >>> I've just downloaded the trunk version of Solr (great changes by the
> >>> way, kudos!) and all I get after the server starts are 404 errors
> >>> whenever I send requests.
> >>>
> >>> Any ideas why this could be happening?
> >>>
> >>> Thanks,
> >>>
> >>> --
> >>> Thiago Jackiw
> >>
> >>
>
>


Returning all rows from a query

2007-05-08 Thread Thiago Jackiw
Is there a way to retrieve all rows found without having to specify a value for 
it (?q=sales&rows=HUGE_NUMBER)? For instance, what I'd like to do would be 
something like "rows=*" or "rows=all" and that would return all the records 
found, without any limits.

Thanks.
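
As far as I can tell, rows has to be a number, so one workaround is to ask once with rows=0 to read the total hit count and then walk the results in pages. A minimal Ruby sketch of the paging arithmetic, assuming the total has already been read from the response (the method name is made up):

```ruby
# Returns the start/rows pairs needed to walk a result set of `total` hits
# in pages of `page_size` (page_size must be greater than zero).
def page_params(total, page_size)
  (0...total).step(page_size).map do |start|
    { "start" => start, "rows" => page_size }
  end
end

page_params(25, 10).each do |params|
  puts "start=#{params['start']}&rows=#{params['rows']}"
end
# prints:
# start=0&rows=10
# start=10&rows=10
# start=20&rows=10
```

Paging keeps each response small, which matters more as the result set grows than a single request with a huge rows value.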


[acts_as_solr] Release v.0.8 is out

2007-05-11 Thread Thiago Jackiw


The new release v.0.8 of acts_as_solr is out and includes:

NEW - New video tutorial
NEW - Faceted search has been implemented and it's possible to 'drill-down' on 
the facets
NEW - New rake tasks you can use to start/stop the solr server in test, 
development and production environments: (thanks Matt Clark)
rake solr:start|stop RAILS_ENV=test|development|production (defaults to 
development if none given)

NEW - Changes to the plugin's test framework, which now supports SQLite as well 
(thanks Matt Clark)
FIX - Patch applied (thanks Micah) that allows one to have multiple solr 
instances in the same servlet
FIX - Patch applied (thanks Micah) that allows indexing of STIs
FIX - Patch applied (thanks Gordon) that allows the plugin to use a table's 
primary key other than 'id'
FIX - Returning empty array instead of empty strings when no records are found
FIX - Problem with unit tests failing due to order of the tests and speed of 
the commits

== About ==
This plugin adds full text search capabilities and many other nifty features 
from Apache's Solr to any Rails model

== Installation ==
In your Rails root directory, just type
 script/plugin install http://opensvn.csie.org/acts_as_solr/trunk

== Very Basic Usage ==
Just add the line below to any of your ActiveRecord models:
 acts_as_solr

Or if you want, you can specify only the fields that should be indexed:
 acts_as_solr :fields => [:name, :author]

Then to find instances of your model, just do:
 Model.find_by_solr(query) or Model.find_id_by_solr(query)

Or if you want to specify the starting row and the number of rows per page:
 Model.find_by_solr(query, :start => 0, :rows => 10)


Get it while it's hot => http://acts-as-solr.rubyforge.org

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.rubyforge.org
Sitealizer => http://sitealizer.rubyforge.org



[ANN] acts_as_solr has a new home, please update

2007-05-13 Thread Thiago Jackiw

The acts_as_solr plugin has a new home, so please make sure you update
your bookmarks:

web:
http://acts_as_solr.railsfreaks.com

trac:
http://trac.railsfreaks.com/projects/acts_as_solr

svn:
svn://svn.railsfreaks.com/projects/acts_as_solr

api:
http://api.railsfreaks.com/projects/acts_as_solr

The current address (http://acts-as-solr.rubyforge.org) will be
obsolete by release version 1.0

--
Thiago Jackiw


[ANN] acts_as_solr v.0.8.5 has been released

2007-05-16 Thread Thiago Jackiw

The acts_as_solr plugin v.0.8.5 has been released and this short
release includes:

FIX: There's no need to specify the :field_types anymore when doing a
search in a model that specifies a field type for a field. The field
types are automatically traced back when they're included

#Indexing
class Electronic < ActiveRecord::Base
  acts_as_solr :fields => [{:price => :range_float}]
end

#Searching
Electronic.find_by_solr "ipod AND price:[* TO 59.99]"

FIX: Better handling of nil values from indexed fields. Solr
complained when indexing fields with field type and the field values
being passed as nils.

NEW: Adding Solr sort (order by) option to the search query (thanks Kevin Hunt)

 #This will return the records in ascending order based on the price
 Electronic.find_by_solr "ipod AND price:[* TO 59.99]", :order => 'price asc'

FIX: Applying patch suggested for increasing the Solr commit speed
(thanks Mourad Hammiche)
FIX: Updated documentation

web => http://acts_as_solr.railsfreaks.com
svn  => svn://svn.railsfreaks.com/projects/acts_as_solr/trunk

*Note: the old address (http://acts-as-solr.rubyforge.org)  and
repository (http://opensvn.csie.org/acts_as_solr/trunk) aren't being
updated anymore and will become obsolete by release version 1.0.
Please use the addresses mentioned above.

Have fun!

--
Thiago Jackiw


Re: Delete entire index

2007-06-13 Thread Thiago Jackiw

Matt,

I could be wrong, but I think you can send a "delete by query" request:
<delete><query>*:*</query></delete>
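
Spelled out, the idea is to send a delete-by-query update message that matches every document, followed by a commit. A small Ruby sketch that only builds the two XML messages (it does not send them; the helper names are made up, and where to POST them depends on the setup):

```ruby
# Escapes the characters that are special in XML text.
def xml_escape(text)
  replacements = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }
  text.to_s.gsub(/[&<>"]/, replacements)
end

# Builds the update message that deletes every document matching `query`.
# The default "*:*" matches all documents.
def delete_by_query_xml(query = "*:*")
  "<delete><query>#{xml_escape(query)}</query></delete>"
end

puts delete_by_query_xml
# prints <delete><query>*:*</query></delete>
puts "<commit/>"
```

Both messages go to the update handler, the delete first and the commit second; the deletion only becomes visible to searches after the commit.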

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com


On 6/13/07, Matt Mitchell <[EMAIL PROTECTED]> wrote:

Hi,
Is there a way to have Solr completely remove the current index?
 ?

We're still in development and so our schema is wavering. Anytime we
make a change and want to re-index we first have to:

stop tomcat (or the solr webapp)
manually remove the data/index
restart tomcat (or the solr webapp)

The removing of the data/index directory is where we have the most
trouble, because of the file permissions. The data/index directory is
owned by tomcat/tomcat so in order to remove it, we have to issue
sudo rm which we'd like to avoid.

Ideally if we could just tell Solr to delete all data without having
to do anymore manual work, it'd be great! : )

Something else that would help is if we tell Tomcat/Solr which user/
group and/or permission to use on the data/index directory when it's
created.

Any thoughts on this?

Matt



[ANN] acts_as_solr v.0.9 has been released

2007-06-18 Thread Thiago Jackiw

It's with great pleasure that I announce this great milestone for the
acts_as_solr plugin. Thanks to all who contributed ideas, patches, etc.

= About =
This plugin adds full text search capabilities and many other nifty
features from Apache's Solr to any Rails model.

= IMPORTANT: Before you Upgrade from v.0.8.5 =
If you are currently using the embedded Solr in a production
environment, please make sure you back up the data directory before
upgrading to version 0.9, because the directory where Solr lives is now
acts_as_solr/solr instead of acts_as_solr/test/solr.

= Changes =
NEW: Added the option :scores when doing a search. If set to true this
will return the score as a 'solr_score' attribute of each one of the
instances found
 books = Book.find_by_solr 'ruby OR splinter', :scores => true
 books.records.first.solr_score
 => 1.21321397
 books.records.last.solr_score
 => 0.12321548

NEW: Major change on the way the results returned are accessed.
 books = Book.find_by_solr 'ruby'
 # the above will return a SearchResults class with 4 methods:
 # docs|results|records: will return an array of records found
 #
 #   books.records.is_a?(Array)
 #   => true
 #
 # total|num_found|total_hits: will return the total number of records found
 #
 #   books.total
 #   => 2
 #
 # facets: will return the facets when doing a faceted search
 #
 # max_score|highest_score: returns the highest score found
 #
 #   books.max_score
 #   => 1.3213213

NEW: Integrating acts_as_solr to use solr-ruby as the 'backend'.
Integration based on the patch submitted by Erik Hatcher

NEW: Re-factoring rebuild_solr_index to allow adds to be done in
batch; and if a finder block is given, it will be called to retrieve
the items to index. (thanks Daniel E.)

NEW: Adding the option to specify the port Solr should start on when
using rake solr:start
 rake solr:start RAILS_ENV=your_env PORT=XX

NEW: Adding deprecation warning for the :background configuration
option. It will no longer be updated.

NEW: Adding support for models that use a primary key other than integer
 class Posting < ActiveRecord::Base
   set_primary_key 'guid' #string
   # make sure you set :primary_key_field => 'pk_s' if you wish to
   # use a string field as the primary key
   acts_as_solr({},{:primary_key_field => 'pk_s'})
 end

FIX: Disabling of storing most fields. Storage isn't useful for
acts_as_solr in any field other than the pk and id fields. It just
takes up space and time. (thanks Daniel E.)

FIX: Re-factoring code submitted by Daniel E.

NEW: Adding an :auto_commit option that will only send the commit
command to Solr if it is set to true
 class Author < ActiveRecord::Base
   acts_as_solr :auto_commit => false
 end

FIX: Fixing bug on rake's test task

FIX: Making acts_as_solr's Post class compatible with Solr 1.2 (thanks Si)

NEW: Adding Solr 1.2

FIX: Removing Solr 1.1

NEW: Adding a conditional :if option to the acts_as_solr call. It
behaves the same way ActiveRecord's :if argument option does.
 class Electronic < ActiveRecord::Base
   acts_as_solr :if => proc{|record| record.is_active?}
 end

NEW: Adding fixtures to Solr index when using rake db:fixtures:load

FIX: Fixing boost warning messages

FIX: Fixing bug when adding a facet to a field that contains boost

NEW: Deprecating find_with_facet and combining functionality with find_by_solr

NEW: Adding the option to :exclude_fields when indexing a model
 class User < ActiveRecord::Base
   acts_as_solr :exclude_fields => [:password, :login, :credit_card_number]
 end

FIX: Fixing branch bug on older ruby version

NEW: Adding boost support for fields and documents being indexed:
 class Electronic < ActiveRecord::Base
   # You can add boosting on a per-field basis or on the entire document
   acts_as_solr :fields => [{:price => {:boost => 5.0}}], :boost => 5.0
 end

FIX: Fixed the acts_as_solr limitation to only accept
test|development|production environments.
= /Changes =

For more info:
http://acts_as_solr.railsfreaks.com
OR if your browser/isp can't render:
http://acts-as-solr.railsfreaks.com

--
Thiago Jackiw


Rejecting fields with null values

2007-06-20 Thread Thiago Jackiw

I'm not sure if this is possible or not, but, is there a way to do a
search and reject fields that are empty or have null values like the
pseudo code below?

?q=test+AND+(NOT+field_b:NULL)

If this is not currently supported, does anyone think it would be a
good idea to implement it?

Thanks,

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com


Re: Rejecting fields with null values

2007-06-20 Thread Thiago Jackiw

Hoss,


As an inverted index, the Lucene index Solr uses doesn't know when
documents have an "empty" value ... it stores the inverted mapping of
value=>documents, so there is no way to query for field_b:NULL, let alone
"NOT field_b:NULL"


I see what you mean.  I guess searching for fields that are required to
have a value, the way you explained, is a good way to go.

Thanks!

--
Thiago Jackiw
acts_as_solr => http://acts-as-solr.railsfreaks.com


On 6/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: I'm not sure if this is possible or not, but, is there a way to do a
: search and reject fields that are empty or have null values like the
: pseudo code below?

As an inverted index, the Lucene index Solr uses doesn't know when
documents have an "empty" value ... it stores the inverted mapping of
value=>documents, so there is no way to query for field_b:NULL, let alone
"NOT field_b:NULL"

you can however query for things like:  field_b:[* TO *] which requires
field_b to have some value (that seems to be the use case you are after)

as a general rule, if you really want to be able to support searches for
things like "find all docs where there is no value in field X" the easiest
way to achieve something like that in Solr is to configure the field with
a default value in the schema ... something that would never normally
appear in your data (a placeholder for 'null' so to speak) and query on
that.


-Hoss
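
A short Ruby sketch of the open-ended range query suggested above, applied to the original pseudo-query. The helper name is made up; the field name and the range syntax are the ones from this thread.

```ruby
require "uri"

# Restricts a query to documents that have some value in `field`,
# using the open-ended range syntax field:[* TO *].
def with_value_in(query, field)
  "#{query} AND #{field}:[* TO *]"
end

q = with_value_in("test", "field_b")
puts q
# prints test AND field_b:[* TO *]

# URL-encoded form, ready to append after "?" in a select request
puts URI.encode_www_form({ "q" => q })
```

Documents with no value in field_b simply have no terms in that field, so the range clause cannot match them and they drop out of the results.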




Same record belonging to multiple facets

2007-07-05 Thread Thiago Jackiw

Is there a way for a record to belong to multiple facets? If so, how
would one go about implementing it?

What I'd like to accomplish would be something like:

record A:
name="John Doe"
category_facet="Cars"
category_facet="Electronics"

And when searching for "John Doe" his record would appear under both
"Cars" and "Electronics" facet categories.

Thanks.

--
Thiago Jackiw


Re: Same record belonging to multiple facets

2007-07-05 Thread Thiago Jackiw

Is it that simple? Cool, I'll give it a try.

--
Thiago Jackiw


On 7/5/07, Martin Grotzke <[EMAIL PROTECTED]> wrote:

On Thu, 2007-07-05 at 12:39 -0700, Thiago Jackiw wrote:
> Is there a way for a record to belong to multiple facets? If so, how
> would one go about implementing it?
>
> What I'd like to accomplish would be something like:
>
> record A:
> name="John Doe"
> category_facet="Cars"
> category_facet="Electronics"
Isn't this the multiValued="true" property in your field definition for
category_facet?

Cheers,
Martin

>
> And when searching for "John Doe" his record would appear under both
> "Cars" and "Electronics" facet categories.
>
> Thanks.
>
> --
> Thiago Jackiw
>
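
For reference, a rough Ruby sketch of what indexing such a record could look like once category_facet is declared with multiValued="true", as Martin suggests. It only builds the XML add message; the helper names are made up, and the record values are the ones from the question.

```ruby
# Escapes the characters that are special in XML text and attribute values.
def xml_escape(text)
  replacements = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }
  text.to_s.gsub(/[&<>"]/, replacements)
end

# Builds an <add> message. Array values become repeated <field> elements,
# which is how a multiValued field receives several values.
def add_doc_xml(fields)
  body = fields.flat_map do |name, value|
    Array(value).map do |v|
      %(<field name="#{xml_escape(name)}">#{xml_escape(v)}</field>)
    end
  end
  "<add><doc>#{body.join}</doc></add>"
end

puts add_doc_xml({
  "id" => "A",
  "name" => "John Doe",
  "category_facet" => %w[Cars Electronics]
})
```

Faceting on category_facet then counts the record once under Cars and once under Electronics.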