Custom Field Type

2009-03-04 Thread Fouad Mardini
Hello,

I have a multivalued field in my schema of type text_ws; its values are of the
form #int #int.
I need to be able to query on the first number and sort on the second, which
does not seem to be supported out of the box.
I looked around for a while, and it seems the recommended approach is to
create a custom field type and implement this logic in the getSortField
method.
But since the field is multivalued, I need to sort by the value I am
searching for, so I need access to the current query being executed.
Also, I can't seem to figure out the correct -classpath to give to javac for
it to find the packages needed to create the class file (my Java is a bit
rusty, to say the least).

Thanks,
Fouad


Re: Custom Field Type

2009-03-05 Thread Fouad Mardini
Hello Yonik,

Thanks for your help, but I am not really sure I follow.
It is possible to use the PatternTokenizerFactory with pattern = (\d+) and
group = 0 to tokenize the input correctly,
but I don't see how to use copyField to achieve sorting.

I read the documentation and this does not seem to be possible.

Are there any performance implications of using dynamic fields?
Could you please elaborate on your idea?

Thanks again
/Fouad


On Wed, Mar 4, 2009 at 8:12 PM, Yonik Seeley wrote:

> On Wed, Mar 4, 2009 at 12:24 PM, Fouad Mardini wrote:
> > I have a multivalued field in my schema of type text_ws, values are of the
> > form #int #int
> > I need to be able to query on the first and sort on the second, this does
> > not seem to be enabled out of the box
>
> Can you put the two numbers in separate fields for this purpose?
> If you can't do it from the indexer, a schema with copyField in
> conjunction with PatternTokenizerFactory could do it.
>
> -Yonik
> http://www.lucidimagination.com
>
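
A minimal sketch of the "separate fields" idea from Yonik's reply, done on the indexer side before documents are posted. It splits each "#int #int" value into a multivalued field holding the first numbers and one single-valued field per first number holding the matching second number, which is one way dynamic fields could come into play. The field names first_ids and second_for_<n>_i are hypothetical and would need matching field and dynamicField definitions in the schema; nothing in the thread confirms this is the layout Yonik had in mind.

```python
import re

# One "#int #int" value: two unsigned integers separated by whitespace.
PAIR = re.compile(r"^\s*(\d+)\s+(\d+)\s*$")


def split_pairs(values):
    """Split "#int #int" strings into separate, sortable fields.

    Field names are hypothetical: "first_ids" is the multivalued field to
    query on, and "second_for_<first>_i" carries the value to sort on when
    searching for <first>.
    """
    doc = {"first_ids": []}
    for value in values:
        match = PAIR.match(value)
        if match is None:
            raise ValueError(f"not an '#int #int' pair: {value!r}")
        first, second = int(match.group(1)), int(match.group(2))
        doc["first_ids"].append(first)
        doc[f"second_for_{first}_i"] = second
    return doc


if __name__ == "__main__":
    print(split_pairs(["12 7", "15 3"]))
```

With fields like these, a search for first number 12 could filter on first_ids:12 and sort on second_for_12_i, so the sort field is chosen from the query term rather than looked up inside a custom field type.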


Dismax request handler and highlighting

2009-06-07 Thread Fouad Mardini
Hello,

I am having problems with the dismax request handler and highlighting.
The following query works as intended

http://localhost:8983/solr/select?indent=on&q=myquery&start=0&rows=10&fl=id%2Cscore&qt=standard&wt=standard&hl=true&hl.fl=myfield

whereas this one does not:

http://localhost:8983/solr/select?indent=on&q.alt=myquery&start=0&rows=10&fl=id%2Cscore&qt=dismax&wt=standard&hl=true&hl.fl=myfield

I am using dismax since I need boost functions.
Furthermore, using the q parameter with dismax doesn't seem to work for me;
debug gives the following output:

myquery
+() ()

Is there a setting somewhere that I need to set?

I am building Solr straight from svn.

Thanks,
Fouad
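
A small sketch of how the dismax request above could be built with an explicit qf parameter. The empty parsed query "+() ()" would be consistent with dismax having no usable fields to search, but that is an assumption and is not confirmed anywhere in this thread. The helper name dismax_url and the choice of myfield as the query field are illustrative only; qf, hl, hl.fl and debugQuery are standard Solr request parameters.

```python
from urllib.parse import urlencode


def dismax_url(base, query, query_fields, highlight_field):
    """Build a dismax select URL that passes the user query in q
    and names the fields to search in qf."""
    params = {
        "q": query,
        "qt": "dismax",
        "qf": " ".join(query_fields),  # assumption: fields that exist in the schema
        "fl": "id,score",
        "start": 0,
        "rows": 10,
        "hl": "true",
        "hl.fl": highlight_field,
        "debugQuery": "on",  # shows the parsed query for comparison
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(dismax_url("http://localhost:8983/solr/select",
                     "myquery", ["myfield"], "myfield"))
```

Comparing the parsed query in the debug output with and without qf would show whether the missing field list is what produces the empty clauses.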


Indexing large documents

2007-08-20 Thread Fouad Mardini
Hello,

I am using Solr to index text extracted from Word documents, and it is
working really well.
Recently I started noticing that some documents are not indexed; that is, I
know that the word foobar is in a document, but when I search for foobar the
id of that document is not returned.
I suspect that this has to do with the size of the document, and that
documents with a lot of text are not being indexed.
Please advise.

thanks,
fmardini


Re: Indexing large documents

2007-08-20 Thread Fouad Mardini
Well, I am using the Java textmining library to extract text from documents,
then I do a POST to Solr.
I do not have an error log; I only have *.request.log files in the logs
directory.

Thanks

On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote:
>
> Fouad,
>
> I would check the error log or console for any possible errors first.
> They may not show up, it really depends on how you are processing the
> word document (custom solr, feeding the text to it, etc).  We are
> using a custom version of solr with PDF, DOC, XLS, etc text extraction
> and I have successfully indexed 40mb documents.  I did have indexing
> problems with a large document or two and simply increasing the heap
> size fixed the problem.
>
> - Pete
>


Re: Indexing large documents

2007-08-20 Thread Fouad Mardini
Thanks, I reindexed the documents and now it works; there was an issue with
text extraction, it seems.
I also changed the maxFieldLength and it must have helped.

thanks

On 8/20/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
>
> You will probably need to increase the value of maxFieldLength in your
> solrconfig.xml.  The default value is 10000, which might explain why your
> documents are not being completely indexed.
>
> Piete
>
>
> On 20/08/07, Peter Manis <[EMAIL PROTECTED]> wrote:
> >
> > That should show some errors if something goes wrong; if not, the
> > console usually will.  The errors will look like a Java stack trace.
> > Did increasing the heap do anything for you?  Changing mine
> > to 256mb max worked fine for all of our files.
>
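
A rough illustration of why a maxFieldLength limit can make a word in a long document unsearchable: only the first N tokens of a field are indexed, so anything after that cutoff is never seen by the index. This sketch assumes plain whitespace tokenization and a limit of 10000 tokens; the real token count depends on the analyzer configured for the field, and the function names here are illustrative rather than part of Solr.

```python
def indexed_tokens(text, max_field_length=10000):
    """Return the tokens that would survive a maxFieldLength-style cutoff.

    Assumption: whitespace splitting stands in for the field's analyzer.
    """
    return text.split()[:max_field_length]


def is_searchable(text, term, max_field_length=10000):
    """True if term falls within the first max_field_length tokens."""
    return term in indexed_tokens(text, max_field_length)


if __name__ == "__main__":
    # 10000 filler words followed by the word we want to find.
    long_doc = " ".join(["filler"] * 10000 + ["foobar"])
    print(is_searchable(long_doc, "foobar"))         # False: token 10001 is cut off
    print(is_searchable(long_doc, "foobar", 20000))  # True once the limit is raised
```

Counting tokens this way before posting gives a quick hint as to which documents are large enough to be truncated, and therefore which ones need reindexing after the limit is raised.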