document categorization using solr?

2010-03-25 Thread Joel Nylund
Hi, Does solr have something built in, or recommended add-on that does document categorization? ( I found a thread about a year ago, but not exact same topic) For example, here is a commercial categorization product that will take a website and categorize it http://grapeshot.co.uk/onlin

Re: weird sorting behavior

2009-12-31 Thread Joel Nylund
schema.xml examples have additional information that you really should scan at least HTH Erick On Thu, Dec 31, 2009 at 8:53 AM, Joel Nylund wrote: Hi, After some further investigation, it turns out that null fields were sorting first, so if the title was null it was coming up first. Thi

Re: weird sorting behavior

2009-12-31 Thread Joel Nylund
last? thanks Joel On Dec 30, 2009, at 3:11 PM, Joel Nylund wrote: Hi, so this is only available in 1.5? I tried in 1.4 and got : org.apache.solr.common.SolrException: Error loading class 'solr.CollationKeyFilterFactory' Is there a way to do this in 1.4? The link Shalin sent is a

Re: weird sorting behavior

2009-12-30 Thread Joel Nylund
alin Shekhar Mangar wrote: On Thu, Dec 24, 2009 at 11:51 PM, Joel Nylund wrote: update, I tried changing to datatype string, and it sorts the numerics better, but the other sorts are not as good. Is there a way to control sorting for special chars, for example, I want blanks to sor

Re: weird sorting behavior

2009-12-24 Thread Joel Nylund
don't work string - sorts nicely for numbers and letters, but special chars like blanks show up first in the list thanks Joel On Dec 24, 2009, at 11:20 AM, Joel Nylund wrote: I have a field: stored="true" required="false"/> sortMissingLast

weird sorting behavior

2009-12-24 Thread Joel Nylund
I have a field: required="false"/> sortMissingLast="true" omitNorms="true"> When I sort it using titles that are alphanumeric it works great, but if the titles start with numbers, it almost seems

suggestions for DIH batchSize

2009-12-22 Thread Joel Nylund
Hi, from looking at the code, it looks like the default is 500. Is that the recommended setting for this? Has anyone noticed any significant performance/memory tradeoffs from making this much bigger? thanks Joel

Re: Request Assistance with DIH

2009-12-14 Thread Joel Nylund
Development console, it does not appear that the connection to Oracle is being made. So if someone could offer some configuration/connection setup directions I would very much appreciate it. Thanks Robbin -Original Message- From: Joel Nylund [mailto:jnyl...@yahoo.com] Sent: Friday

Re: Auto update with deltaimport

2009-12-12 Thread Joel Nylund
windows or unix? unix - make a shell script and call it from cron. windows - make a .bat or .cmd file and call it from the scheduler. Within the shell scripts/bat files, use wget or curl to call the right import: wget -q -O /dev/null http://localhost:8983/solr/dataimport?command=delta-import Joe
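The same call can also be made from a small Python script run by cron or the Windows scheduler, instead of wget or curl. This is only a minimal sketch: it assumes the handler lives at /solr/dataimport on localhost:8983, as in the wget line, and the names `dih_url` and `trigger` are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL, copied from the wget example in the thread.
DIH_URL = "http://localhost:8983/solr/dataimport"


def dih_url(command, base=DIH_URL):
    """Build the DataImportHandler URL for a command such as delta-import."""
    return base + "?" + urlencode({"command": command})


def trigger(command="delta-import", timeout=30):
    """Send the request and return the HTTP status (needs a running Solr)."""
    with urlopen(dih_url(command), timeout=timeout) as response:
        return response.status
```

A scheduled job would call `trigger()` and could log the returned status. Nothing is sent until `trigger()` is called, so the URL builder can be checked without a server.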

Re: Request Assistance with DIH

2009-12-11 Thread Joel Nylund
add ?command=full-import to your url http://localhost:8983/solr/dataimport?command=full-import thanks Joel On Dec 11, 2009, at 7:45 PM, Robbin wrote: I've been trying to use the DIH with oracle and would love it if someone could give me some pointers. I put the ojdbc14.jar in both the Tom

Re: # in query

2009-12-08 Thread Joel Nylund
es should look like... *Assuming* that the ### are getting indexed and *assuming* your tokenizer tokenized on, whitespace, and *assuming* that by text_rev you are talking about ReversedWildcardFilterFactory, I wouldn't expect a search to match if it wasn't exactly: s'

Re: # in query

2009-12-08 Thread Joel Nylund
therwise the browser considers it as a separator between the URL for the server (on the left) and the fragment identifier (on the right), which is not sent to the server. You might want to read about "URL-encoding"; escaping with backslash is a shell-thing, not a thing for URLs! pau

Re: # in query

2009-12-07 Thread Joel Nylund
as a separator between the URL for the server (on the left) and the fragment identifier (on the right), which is not sent to the server. You might want to read about "URL-encoding"; escaping with backslash is a shell-thing, not a thing for URLs! paul On 07-Dec-09 at 21:16, Joel Nylund
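The fragment behaviour described in this reply can be checked with Python's standard library. This is a sketch rather than anything from the thread: the `textTitle` field and the localhost URL come from the original question, and `select_url` is a hypothetical helper. Whether the encoded query then matches anything still depends on how the field is analyzed.

```python
from urllib.parse import urlencode, urlsplit

BASE = "http://localhost:8983/solr/select"


def select_url(query, base=BASE):
    """Return a select URL with the q parameter percent-encoded."""
    return base + "?" + urlencode({"q": query})


raw = BASE + '?q=textTitle:"#"'
encoded = select_url('textTitle:"#"')

# Unencoded: the text after '#' is parsed as the fragment, which a client
# does not send to the server, so the query is cut short.
print(urlsplit(raw).query)
print(urlsplit(raw).fragment)

# Encoded: '#' travels as %23 inside the query string and no fragment is left.
print(encoded)
print(urlsplit(encoded).fragment == "")
```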

# in query

2009-12-07 Thread Joel Nylund
Hi, How can I put a # sign in a query, do I need to escape it? For example I want to query books with title that contain # None work so far: http://localhost:8983/solr/select?q=textTitle:"#" http://localhost:8983/solr/select?q=textTitle:# http://localhost:8983/solr/select?q=textTitle:"\#" Gett

how to get list of unique terms for a field

2009-12-04 Thread Joel Nylund
Hi, let's say I have a field called countryName, is there a way to get a list of all the countries for this field? Trying to figure out a nice way to keep my categories and the solr results in sync; it would be nice to get these from solr instead of the database. thanks Joel

deleteById without solrj?

2009-12-03 Thread Joel Nylund
Is there a url based approach to delete a document? thanks Joel

Re: how to do partial word searches?

2009-12-03 Thread Joel Nylund
your substring matches will be faster. On Tue, Nov 24, 2009 at 7:51 PM, Joel Nylund wrote: Hi, I saw some older postings on this, but didn't see a resolution. I have a field called title, I would like to be able to find partial word matches within the title. For example: http://loca

debugging javascript DIH

2009-12-03 Thread Joel Nylund
is there a way to print to std out or anything from my javascript DIH transformer? thanks Joel

Re: weird behavior between 2 environments

2009-12-03 Thread Joel Nylund
t 11:00 AM, Joel Nylund wrote: same client, here are the debug results, something interesting is going on, I dont understand solr/lucene well enough to understand, see below not working env (linux) - 0 2 - true countryName:"Bosnia and Herzegovina" - countryName:&qu

Re: weird behavior between 2 environments

2009-12-03 Thread Joel Nylund
.0 − 27.0 − 0.0 − 0.0 − 0.0 − 0.0 − 0.0 − 27.0 On Dec 3, 2009, at 10:20 AM, Yonik Seeley wrote: Are you querying both systems from the same browser / client? Try adding debugQuery=true and see if the query parses the same for both (could be the browser/client doing extra escap

weird behavior between 2 environments

2009-12-03 Thread Joel Nylund
I have 2 environments one works great for this query: my osx environment: http://localhost:8983/solr/select?q=countryName:%22Bosnia%20and%20Herzegovina%22 - returns 2 results my linux environment: http://localhost:8983/solr/select?q=countryName:%22Bosnia%20and%20Herzegovina%22 - returns

Re: NOT combined with OR is not getting expected results

2009-12-02 Thread Joel Nylund
thanks that worked! and yes I have some with no categoryType thanks Joel On Dec 2, 2009, at 2:24 PM, AHMET ARSLAN wrote: Hi, thanks, but still get 530 results for this new query you proposed. Maybe you have some documents that have an empty categoryType field. Can you try this: q = ((*:* -cat

Re: NOT combined with OR is not getting expected results

2009-12-02 Thread Joel Nylund
Hi, thanks, but still get 530 results for this new query you proposed. thanks Joel On Dec 2, 2009, at 12:00 PM, AHMET ARSLAN wrote: http://localhost:8983/solr/select?q=%28NOT%20categoryType:%22MEDIATYPE%22%29 :gives 292289 results http://localhost:8983/solr/select?q=fmMediaType:%22text%

NOT combined with OR is not getting expected results

2009-12-02 Thread Joel Nylund
http://localhost:8983/solr/select?q=%28NOT%20categoryType:%22MEDIATYPE%22%29 :gives 292289 results http://localhost:8983/solr/select?q=fmMediaType:%22text%22 :gives 530 results http://localhost:8983/solr/select?q=%28NOT%20categoryType:%22MEDIATYPE%22%29%20OR%20fmMediaType:%22text%22

getting value from parent query in subquery transformer

2009-12-02 Thread Joel Nylund
Hi, I have an entity that has an entity within it that executes a query for each row and calls a transformer. Is there a way to pass a value from the parent query into the transformer? For example, I have an entity called document, and it has an ID and sometimes it has a category. I hav

Re: getting total index size & last update date/time from query

2009-12-01 Thread Joel Nylund
ata/index 2009-11-19T16:44:45Z See http://wiki.apache.org/solr/LukeRequestHandler Peter -Original Message----- From: Joel Nylund [mailto:jnyl...@yahoo.com] Sent: Thursday, November 19, 2009 8:31 AM To: solr-user@lucene.apache.org Subject: getting total index size & last update d

Re: solr/jetty not working for anything other than localhost

2009-11-25 Thread Joel Nylund
yes says: 2009-11-25 18:08:59.967::INFO: Started SocketConnector @ 0.0.0.0:8983 running on osx thanks Joel On Nov 25, 2009, at 6:00 PM, simon wrote: On Wed, Nov 25, 2009 at 5:27 PM, Joel Nylund wrote: I see: tcp46 0 0 *.8983 *.* LISTEN

Re: solr/jetty not working for anything other than localhost

2009-11-25 Thread Joel Nylund
interfaces netstat -an |grep 8983 You should see tcp0 0 0.0.0.0:8983 0.0.0.0:* LISTEN -Simon On Wed, Nov 25, 2009 at 3:55 PM, Joel Nylund wrote: Hi, if I try to use any other hostname jetty doesn't work, gives a blank page, if I telnet to the server/port it

solr/jetty not working for anything other than localhost

2009-11-25 Thread Joel Nylund
Hi, if I try to use any other hostname jetty doesn't work, gives a blank page, if I telnet to the server/port it just disconnects. I tried editing the scripts.conf to change the hostname, that didn't seem to help. For example I tried editing my etc/hosts file and added: 127.0.0.1 solriscool

Re: how to do partial word searches?

2009-11-25 Thread Joel Nylund
nd trailing wildcard query" Best Erick On Tue, Nov 24, 2009 at 7:51 PM, Joel Nylund wrote: Hi, I saw some older postings on this, but didn't see a resolution. I have a field called title, I would like to be able to find partial word matches within the title. For example: http://loc

Re: configure solr

2009-11-24 Thread Joel Nylund
for #1, under example, is there a webapps folder, does it contain solr.war ? are there any errors in your startup log for jetty, does it say anything about setting up solr, and solr home etc. Joel On Nov 24, 2009, at 4:55 PM, Jill Han wrote: Hi, I just downloaded solr -1.4.0 to my compute

how to do partial word searches?

2009-11-24 Thread Joel Nylund
Hi, I saw some older postings on this, but didn't see a resolution. I have a field called title, I would like to be able to find partial word matches within the title. For example: http://localhost:8983/solr/select?q=textTitle:%22*sulli*%22 I would expect it to find: the daily dish | by andr

Re: help with dataimport delta query

2009-11-24 Thread Joel Nylund
a.job_jobs_id} > I guess it should be ${dataimporter.delta.id} > > On Tue, Nov 24, 2009 at 1:19 AM, Joel Nylund > wrote: > > Hi, I have solr all working nicely, except im trying > to get deltas to work > > on my data import handler > > > > Here is a simplifi

Re: help with dataimport delta query

2009-11-23 Thread Joel Nylund
Nov 23, 2009, at 2:49 PM, Joel Nylund wrote: Hi, I have solr all working nicely, except I'm trying to get deltas to work on my data import handler. Here is a simplification of my data import config: I have a table called "Book" which has categories, I'm doing subqueries for the cat

help with dataimport delta query

2009-11-23 Thread Joel Nylund
Hi, I have solr all working nicely, except I'm trying to get deltas to work on my data import handler. Here is a simplification of my data import config: I have a table called "Book" which has categories, I'm doing subqueries for the category info and calling a javascript helper. This all works

getting total index size & last update date/time from query

2009-11-19 Thread Joel Nylund
Hi, Looking for total number of documents in my index and the last updated date/time of the index. Is there a way to get this through the standard query q=? if not, what is the best way to get this info from solr. thanks Joel

indexing on different server

2009-11-11 Thread Joel Nylund
is it possible to index on one server and copy the files over? thanks Joel

Re: deployment questions

2009-11-11 Thread Joel Nylund
better off running solr as a server on its own and using network security? thanks Joel On Nov 9, 2009, at 5:04 PM, Joel Nylund wrote: Hi, I have a java app that is deployed in a jboss/tomcat container. I would like to add my solr index to it. I have read about this and it seems fairly

deployment questions

2009-11-09 Thread Joel Nylund
Hi, I have a java app that is deployed in a jboss/tomcat container. I would like to add my solr index to it. I have read about this and it seems fairly straightforward, but I'm curious about the best way to secure it. I require my users to login to my app to use it, so I want the search functions

Re: solr query help alpha numeric and not

2009-11-05 Thread Joel Nylund
Avlesh, thanks those worked, for some reason I never got your mail, found it in one of the list archives though. thanks again Joel On Nov 5, 2009, at 9:08 PM, Avlesh Singh wrote: Didn't the queries in my reply work? Cheers Avlesh On Fri, Nov 6, 2009 at 4:16 AM, Joel Nylund wrote

Re: solr query help alpha numeric and not

2009-11-05 Thread Joel Nylund
, The ID is sent back as a string (instead of as an integer) in your example. Could this be the cause? - Jonathan On Nov 4, 2009, at 9:08 AM, Joel Nylund wrote: Hi, I have a field called firstLetterTitle, this field has 1 char, it can be anything, I need help with a few queries on this char

Re: how to use ajax-solr - example?

2009-11-04 Thread Joel Nylund
and format the data. I figured this is something I can throw together in a few hours, but I also figured someone would have already done the work. thanks Joel On Nov 4, 2009, at 2:02 PM, Israel Ekpo wrote: On Wed, Nov 4, 2009 at 10:48 AM, Joel Nylund wrote: Hi, I looked at the

Re: exact match lookup

2009-11-04 Thread Joel Nylund
that worked, thanks! had to negate the score. thanks Joel On Nov 4, 2009, at 1:57 PM, Jérôme Etévé wrote: If feedClass acts as an identifier, better use string :) use sort=title asc,score desc (not sort:) J. 2009/11/4 Joel Nylund : that worked for me, changed to: http://localhost:8983
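The point of "sort=title asc,score desc (not sort:)" is that sort is a separate request parameter, not part of the q string. A minimal sketch of building such a URL follows. The field names `feedClass` and `title` are taken from this thread, `sorted_select_url` is a hypothetical helper, and whether `title` is actually sortable depends on the schema.

```python
from urllib.parse import urlencode


def sorted_select_url(query, sort, base="http://localhost:8983/solr/select"):
    """Build a select URL with q and sort as separate, encoded parameters."""
    return base + "?" + urlencode({"q": query, "sort": sort})


# Exact-match query on feedClass, sorted by title, then by score.
url = sorted_select_url('feedClass:"Social News"', "title asc,score desc")
print(url)
```

The space and comma in the sort value are encoded by `urlencode`, so the value can be written in the same readable form as in the reply.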

Re: exact match lookup

2009-11-04 Thread Joel Nylund
ser) feedClass:Social defaultField:News . Well that's the idea. It should then work using the type string. Cheers! J. 2009/11/4 Joel Nylund : Hi, I have a field that I want to do exact match lookups using. (when I say exact match, I'm looking for the equivalent of a SQL query where with no l

exact match lookup

2009-11-04 Thread Joel Nylund
Hi, I have a field that I want to do exact match lookups using. (when I say exact match, I'm looking for the equivalent of a SQL query with a WHERE and no LIKE clause, so where feedClass = "Social News") For example the field is called feedClass and I'm doing: http://localhost:8983/solr/select?q=feedClas

how to use ajax-solr - example?

2009-11-04 Thread Joel Nylund
Hi, I looked at the documentation and I have no idea how to get started. Can someone point me to or show me an example of how to send a query to a solr server and paginate through the results using ajax-solr? I would gladly write a blog tutorial on how to do this if someone can get me star

solr query help alpha numeric and not

2009-11-04 Thread Joel Nylund
Hi, I have a field called firstLetterTitle, this field has 1 char, it can be anything, I need help with a few queries on this char: 1.) I want all NON ALPHA and NON numbers, so any char that is not A-Z or 0-9 I tried: http://localhost:8983/solr/select?q=NOT%20firstLetterTitle:0%20TO%209%20

Re: best way to model 1-N

2009-10-30 Thread Joel Nylund
I'm using apache-solr-1.3.0. I got it to work using a javascript function instead. thanks Joel On Oct 30, 2009, at 12:44 PM, Chantal Ackermann wrote: This looks all right to me, but I might be missing something. Which version/build of SOLR are you using? Chantal Joel Nylund wrote: Thanks

Re: best way to model 1-N

2009-10-30 Thread Joel Nylund
n DIH's data-config file - . 2. If you "add" documents to Solr yourself, multiple values for the field can be specified as an array or list of values in the SolrInputDocument. A multivalued field provides the same faceting and searching capabilities as regular fields. There is no spe

Re: best way to model 1-N

2009-10-30 Thread Joel Nylund
uments to Solr yourself, multiple values for the field can be specified as an array or list of values in the SolrInputDocument. A multivalued field provides the same faceting and searching capabilities as regular fields. There is no special syntax. Cheers Avlesh On Fri, Oct 30, 2009 at

best way to model 1-N

2009-10-29 Thread Joel Nylund
Hi, I have one index so far which contains feeds. I have been able to denormalize several tables and map this data onto the feed entity. There is one tricky problem that I need help on. Feeds have 1 - many categories. So let's say we have Category1, Category2 and Category3 Feed 1 - is in

multiple sql queries for one index?

2009-10-29 Thread Joel Nylund
Hi, It's been hurting my brain all day to try to build 1 query for my index (joins upon joins upon joins). Is there a way I can do multiple queries to populate the same index? I have one main table that I can join everything back to via ID, so it should be theoretically possible. If this can be

data import with transformer

2009-10-29 Thread Joel Nylund
Hi, I have been reading the solr book and wiki, but I can't find any examples similar to what I'm looking for. I have a database field called category, this field needs some text manipulation before it goes in the index. Here is the java code for what I'm trying to do: // categories look like

Re: weird problem with letters S and T

2009-10-29 Thread Joel Nylund
t 1:23 AM, Norberto Meijome wrote: On Wed, 28 Oct 2009 19:20:37 -0400 Joel Nylund wrote: Well I tried removing those 2 letters from stopwords, didn't seem to help, I also tried changing the field type to "text_ws", didn't seem to work. Any other ideas? Hi Joel, if your stop word

Re: weird problem with letters S and T

2009-10-28 Thread Joel Nylund
r are only storing one character per field. There are other text field types that do not have the stop word filter, so give your first letter field that field type. In this way stopword filter analyser is only disabled for searches on the first letter field. Cheers, Martijn 2009/10/28 Joel Nylund : T

Re: weird problem with letters S and T

2009-10-28 Thread Joel Nylund
similar issue the other day; in my case the solution turned out to be that the letters were stopwords. Don't know if this is your answer, but worth checking. Bern -Original Message- From: Joel Nylund [mailto:jnyl...@yahoo.com] Sent: Thursday, 29 October 2009 9:17 AM To: solr

weird problem with letters S and T

2009-10-28 Thread Joel Nylund
(I am super new to solr, sorry if this is an easy one) Hi, I want to support an A-Z type view of my data. I have a DataImportHandler that uses sql (my query is complex, but the part that matters is: SELECT f.id, f.title, LEFT(f.title,1) as firstLetterTitle FROM Foo f) I can create this index