indexing java byte code in classes / jars

2015-05-08 Thread Mark
I'm looking to use Solr to search over the byte code in Classes and Jars. Does anyone know of, or have experience with, Analyzers, Tokenizers, and Token Filters for such a task? Regards Mark

Re: indexing java byte code in classes / jars

2015-05-08 Thread Mark
ging.LogRecord); > > public java.lang.String _format(java.util.logging.LogRecord); > > public java.lang.String getHead(java.util.logging.Handler); > > public java.lang.String getTail(java.util.logging.Handler); > > public java.lang.String formatMessage(java.util.logging.LogRec

Re: indexing java byte code in classes / jars

2015-05-08 Thread Mark
https://searchcode.com/ looks really interesting, however I want to extract as many searchable aspects as possible out of jars sitting on a classpath or under a project structure... Really early days, so I'm open to any suggestions On 8 May 2015 at 22:09, Mark wrote: > To answer why bytecode -

Re: indexing java byte code in classes / jars

2015-05-08 Thread Mark
Erik, Thanks for the pretty much OOTB approach. I think I'm going to just try a range of approaches and see how far I get. The "IDE does this" suggestion would be worth looking into as well. On 8 May 2015 at 22:14, Mark wrote: > > https://searchcode.com/

Re: indexing java byte code in classes / jars

2015-05-09 Thread Mark
Hi Alexandre, Solr & ASM is the exact problem I'm looking to hack about with, so I'm keen to consider any code no matter how ugly or broken. Regards Mark On 9 May 2015 at 10:21, Alexandre Rafalovitch wrote: > If you only have classes/jars, use ASM. I have done this before,

Configuring number of shards

2013-11-05 Thread Mark
Can you configure the number of shards per collection, or is this a system-wide setting affecting all collections/indexes? Thanks

Sharding and replicas (Solr Cloud)

2013-11-07 Thread Mark
If I create my collection via the ZkCLI (https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities) how do I configure the number of shards and replicas? Thanks

SimplePostTool with extracted Outlook messages

2015-01-26 Thread Mark
I'm looking to index some extracted Outlook messages (*.msg). I notice that msg isn't one of the default types, so I tried the following: java -classpath dist/solr-core-4.10.3.jar -Dtype=application/vnd.ms-outlook org.apache.solr.util.SimplePostTool C:/temp/samplemsg/*.msg That didn't work. However

Re: SimplePostTool with extracted Outlook messages

2015-01-26 Thread Mark
-F "myfile=@6252671B765A1748992DF1A6403BDF81A4A22C00.msg" Regards Mark On 26 January 2015 at 21:47, Alexandre Rafalovitch wrote: > Seems like apple to oranges comparison here. > > I would try giving an explicit end point (.../extract), a single > message, and a literal id for the SimplePostTool and

Re: SimplePostTool with extracted Outlook messages

2015-01-26 Thread Mark
I think I may just extend SimplePostTool or look to use Solr Cell perhaps? On 26 January 2015 at 22:14, Alexandre Rafalovitch wrote: > Well, you are NOT posting to the same URL. > > > On 26 January 2015 at 17:00, Mark wrote: > > http://localhost:8983/solr/update > > > > --

Re: SimplePostTool with extracted Outlook messages

2015-01-27 Thread Mark
rse a folder means that it requires an ID strategy - which I believe is lacking. Regards Mark On 27 January 2015 at 10:57, Erik Hatcher wrote: > Try adding -Dauto=true and take away setting url. The type probably isn't > needed then either. > > With the new Solr 5 bin/

Re: SimplePostTool with extracted Outlook messages

2015-01-27 Thread Mark
ested. Thanks for everyone's suggestions. Regards Mark On 27 January 2015 at 18:01, Alexandre Rafalovitch wrote: > Your IDs seem to be the file names, which you are probably also getting > from your parsing the file. Can't you just set (or copyField) that as an ID > on the Solr side

Re: SimplePostTool with extracted Outlook messages

2015-01-27 Thread Mark
imeMap.put("msg", "application/vnd.ms-outlook"); Regards Mark On 27 January 2015 at 18:39, Mark wrote: > Hi Alex, > > On an individual file basis that would work, since you could set the ID on > an individual basis. > > However recursing a folder it doesn

extract and add fields on the fly

2015-01-28 Thread Mark
Is it possible to use curl to upload a document (for extract & indexing) and specify some fields on the fly? Sort of: 1) index this document 2) by the way, here are some important facets whilst you're at it Regards Mark

Re: extract and add fields on the fly

2015-01-28 Thread Mark
ushed > to solr. Create the SID from the existing doc, add any additional fields, > then add to solr. > > On Wed, Jan 28, 2015 at 11:56 AM, Mark wrote: > > > Is it possible to use curl to upload a document (for extract & indexing) > > and specify some fields on the fly?

Re: extract and add fields on the fly

2015-01-28 Thread Mark
On second thoughts, SID is purely i/p (input) as its name suggests :) I think a better approach would be 1) curl to upload/extract passing docID 2) curl to update additional fields for that docID On 28 January 2015 at 17:30, Mark wrote: > > "Create the SID from the existing doc" implies

Re: extract and add fields on the fly

2015-01-28 Thread Mark
I'm looking to 1) upload a binary document using curl 2) add some additional facets. Specifically, my question is: can this be achieved in one curl operation or does it need two? On 28 January 2015 at 17:43, Mark wrote: > > Second thoughts SID is purely i/p as its name suggests :)

Re: extract and add fields on the fly

2015-01-28 Thread Mark
Use case is use curl to upload/extract/index document passing in additional facets not present in the document e.g. literal.source="old system" In this way some fields come from the uploaded extracted content and some fields as specified in the curl URL Hope that's clearer? Reg

Re: extract and add fields on the fly

2015-01-28 Thread Mark
field 'stuff'","code":400}} ..getting closer.. On 28 January 2015 at 18:03, Mark wrote: > > Use case is > > use curl to upload/extract/index document passing in additional facets not > present in the document e.g. literal.source="old system"

Re: extract and add fields on the fly

2015-01-28 Thread Mark
ture=div&fmap.div=foo_txt&boost.foo_txt=3&literal.blah_s=Bah"; -F "tutorial=@"help.pdf and therefore I learned that you can't update a field that isn't in the original which is what I was trying to do before. Regards Mark On 28 January 2015 at 18:38, Alexandr
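The working invocation in this thread passes extra facets as literal.* parameters on the extract handler. As a minimal sketch (the host, core, and field names below are placeholder assumptions, not taken from the thread), the request URL can be built like this:

```python
from urllib.parse import urlencode

# Placeholder host/core and field names -- adjust for your own setup.
base = "http://localhost:8983/solr/update/extract"
params = {
    "literal.id": "doc1",              # explicit ID for the uploaded file
    "literal.source_s": "old system",  # extra facet not present in the document
    "commit": "true",
}
url = base + "?" + urlencode(params)
print(url)
```

The file itself would then be posted to this URL (e.g. with curl's -F), and each literal.* value is stored alongside the extracted content.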

Duplicate documents based on attribute

2013-07-25 Thread Mark
How would I go about doing something like this? Not sure if this is something that can be accomplished on the index side or if it's something that should be done in our application. Say we are an online store for shoes and we are selling Product A in red, blue and green. Is there a way when we sea

Re: Duplicate documents based on attribute

2013-07-25 Thread Mark
e on it outside of > solr. > > > On Thu, Jul 25, 2013 at 10:12 PM, Mark wrote: > >> How would I go about doing something like this. Not sure if this is >> something that can be accomplished on the index side or its something that >> should be done in our application.

Alternative searches

2013-07-31 Thread Mark
Can someone explain how one would go about providing alternative searches for a query… similar to Amazon. For example say I search for "Red Dump Truck" - 0 results for "Red Dump Truck" - 500 results for " Red Truck" - 350 results for "Dump Truck" Does this require multiple searches? Thanks
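One client-side way to get Amazon-style fallbacks is to relax the query by dropping terms and issuing a separate search for each candidate (so yes, this does take multiple searches). A sketch of generating the candidate queries, assuming simple whitespace terms:

```python
from itertools import combinations

def fallback_queries(query):
    """Generate shorter queries by dropping terms, longest first."""
    terms = query.split()
    out = []
    for n in range(len(terms) - 1, 0, -1):
        for combo in combinations(terms, n):
            out.append(" ".join(combo))
    return out

print(fallback_queries("Red Dump Truck"))
# ['Red Dump', 'Red Truck', 'Dump Truck', 'Red', 'Dump', 'Truck']
```

Each candidate would be run as its own Solr query, stopping once enough results come back.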

Percolate feature?

2013-08-02 Thread Mark
We have a set number of known terms we want to match against. In Index: "term one" "term two" "term three" I know how to match all terms of a user query against the index but we would like to know how/if we can match a user's query against all the terms in the index? Search Queries: "my search

Re: Problems matching delimited field

2013-08-05 Thread Mark
That was it… thanks On Aug 2, 2013, at 3:27 PM, Shawn Heisey wrote: > On 8/2/2013 4:16 PM, Robert Zotter wrote: >> The problem is the query gets expanded to "1 Foo" not ( "1" OR "Foo") >> >> 1Foo >> 1Foo >> +DisjunctionMaxQuery((name_textsv:"1 foo")) () >> +(name_textsv:"1 foo") () >> >> DisM

Re: Percolate feature?

2013-08-05 Thread Mark
;t match against indexed documents. > > Solr does support Lucene's "min should match" feature so that you can > specify, say, four query terms and return if at least two match. This is the > "mm" parameter. > > See: > http://wiki.apache.org/solr/ExtendedDisMax#mm
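The integer form of the "mm" (min-should-match) behavior described above can be sketched client-side. This is a toy model only: real edismax mm also accepts percentages and conditional clauses, which this ignores.

```python
def matches_mm(query_terms, doc_terms, mm):
    """True if at least mm of the query terms occur in the document."""
    doc = set(doc_terms)
    hits = sum(1 for t in query_terms if t in doc)
    return hits >= mm

doc = "sony galaxy phone case".split()
print(matches_mm("sony samsung galaxy tablet".split(), doc, mm=2))  # True
print(matches_mm("red dump truck".split(), doc, mm=2))              # False
```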

Re: Percolate feature?

2013-08-05 Thread Mark
y" wrote: > Fine, then write the query that way: +foo +bar baz > > But it still doesn't sound as if any of this relates to prospective > search/percolate. > > -- Jack Krupansky > > -Original Message- From: Mark > Sent: Monday, August 05, 2013 2:1

Re: Percolate feature?

2013-08-08 Thread Mark
Ok forget the mention of percolate. We have a large list of known keywords we would like to match against. Product keyword: "Sony" Product keyword: "Samsung Galaxy" We would like to be able to detect given a product title whether or not it matches any known keywords. For a keyword to be mat
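The matching rule stated here -- a keyword matches a title only when all of the keyword's terms appear in it -- can be sketched directly. Whitespace tokenization and lowercasing are assumptions; in Solr the analysis chain would normally do this.

```python
def matching_keywords(title, keywords):
    """Return the keywords all of whose terms appear in the title."""
    title_terms = set(title.lower().split())
    return [kw for kw in keywords
            if set(kw.lower().split()) <= title_terms]

keywords = ["Sony", "Samsung Galaxy"]
print(matching_keywords("Samsung Galaxy S4 16GB", keywords))  # ['Samsung Galaxy']
print(matching_keywords("Sony Playstation 4", keywords))      # ['Sony']
```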

Re: Percolate feature?

2013-08-09 Thread Mark
idworks.com > > On Fri, Aug 9, 2013 at 8:19 AM, Erick Erickson > wrote: >> This _looks_ like simple phrase matching (no slop) and highlighting... >> >> But whenever I think the answer is really simple, it usually means >> that I'm missing something..

Re: Percolate feature?

2013-08-09 Thread Mark
I'll look into this. Thanks for the concrete example as I don't even know which classes to start to look at to implement such a feature. On Aug 9, 2013, at 9:49 AM, Roman Chyla wrote: > On Fri, Aug 9, 2013 at 11:29 AM, Mark wrote: > >>> *All* of the terms in the fie

Re: Percolate feature?

2013-08-10 Thread Mark
> So to reiteratve your examples from before, but change the "labels" a > bit and add some more converse examples (and ignore the "highlighting" > aspect for a moment... > > doc1 = "Sony" > doc2 = "Samsung Galaxy" > doc3 = "Sony Playstation" > > queryA = "Sony Experia" ... matches only do

Re: Percolate feature?

2013-08-10 Thread Mark
e since you literally > do mean "if I index this document, will it match any of these queries" (but > doesn't score a hit on your direct check for whether it is a clean keyword > match.) > > In your previous examples you only gave clean product titles, not examples o

Re: Percolate feature?

2013-08-13 Thread Mark
Any ideas? On Aug 10, 2013, at 6:28 PM, Mark wrote: > Our schema is pretty basic.. nothing fancy going on here > > > > > > protected="protected.txt"/> > generateNumberParts="1" catenateWords="0" c

App server?

2013-10-02 Thread Mark
Is Jetty sufficient for running Solr, or should I go with something a little more enterprise-grade like Tomcat? Any others?

SolrJ best pratices

2013-10-07 Thread Mark
Are there any links describing best practices for interacting with SolrJ? I've checked the wiki and it seems woefully incomplete: (http://wiki.apache.org/solr/Solrj) Some specific questions: - When working with HttpSolrServer should we keep instances around forever or should we create a single

Bootstrapping / Full Importing using Solr Cloud

2013-10-08 Thread Mark
We are in the process of upgrading our Solr cluster to the latest and greatest Solr Cloud. I have some questions regarding full indexing though. We're currently running a long job (~30 hours) using DIH to do a full index on over 10M products. This process consumes a lot of memory and while updat

Re: SolrJ best pratices

2013-10-09 Thread Mark
Thanks for the clarification. In Solr Cloud just use 1 connection. In non-cloud environments you will need one per core. On Oct 8, 2013, at 5:58 PM, Shawn Heisey wrote: > On 10/7/2013 3:08 PM, Mark wrote: >> Some specific questions: >> - When working with HttpSolrServer

Setting SolrCloudServer collection

2013-10-11 Thread Mark
If using one static SolrCloudServer, how can I add a bean to a certain collection? Do I need to update setDefaultCollection() each time? I doubt that's thread safe? Thanks

Re: DIH importing

2011-08-29 Thread Mark
Thanks, I'll give that a try. On 8/26/11 9:54 AM, simon wrote: It sounds as though you are optimizing the index after the delta import. If you don't do that, then only new segments will be replicated and syncing will be much faster. On Fri, Aug 26, 2011 at 12:08 PM, Mark wrote: W

Searching multiple fields

2011-09-26 Thread Mark
I have a use case where I would like to search across two fields but I do not want to weight a document that has a match in both fields higher than a document that has a match in only 1 field. For example. Document 1 - Field A: "Foo Bar" - Field B: "Foo Baz" Document 2 - Field A: "Foo Blar
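What is being asked for here is essentially how Lucene's DisjunctionMaxQuery (the dismax idea) scores: take the best single-field score rather than the sum, so a match in both fields doesn't outrank a match in one. A toy sketch of the two policies (scores are made-up illustration values):

```python
def dismax_score(field_scores, tie=0.0):
    """Best field score, plus tie * the rest (DisjunctionMaxQuery-style)."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

doc1 = [1.0, 1.0]  # matches in Field A and Field B
doc2 = [1.0]       # matches in one field only
print(sum(doc1), sum(doc2))                    # summing favors doc1
print(dismax_score(doc1), dismax_score(doc2))  # dismax treats them equally
```

With tie=0 the second field contributes nothing, which is the "don't weight double matches higher" behavior wanted here.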

Re: Searching multiple fields

2011-09-27 Thread Mark
I thought that a similarity class will only affect the scoring of a single field.. not across multiple fields? Can anyone else chime in with some input? Thanks. On 9/26/11 9:02 PM, Otis Gospodnetic wrote: Hi Mark, Eh, I don't have Lucene/Solr source code handy, but I *think* for that

HBase Datasource

2011-11-10 Thread Mark
Has anyone had any success/experience with building a HBase datasource for DIH? Are there any solutions available on the web? Thanks.

CachedSqlEntityProcessor

2011-11-15 Thread Mark
I am trying to use the CachedSqlEntityProcessor with Solr 1.4.2 however I am not seeing any performance gains. I've read some other posts that reference cacheKey and cacheLookup however I don't see any reference to them in the wiki http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityPr

Re: CachedSqlEntityProcessor

2011-11-15 Thread Mark
FYI my sub-entity looks like the following On 11/15/11 10:42 AM, Mark wrote: I am trying to use the CachedSqlEntityProcessor with Solr 1.4.2 however I am not seeing any performance gains. I've read some other posts that reference cacheKey and cacheLookup however I don't see any re

Multithreaded DIH bug

2011-12-01 Thread Mark
I'm trying to use multiple threads with DIH but I keep receiving the following error.. "Operation not allowed after ResultSet closed" Is there anyway I can fix this? Dec 1, 2011 4:38:47 PM org.apache.solr.common.SolrException log SEVERE: Full Import failed:java.lang.RuntimeException: Error in

Re: Multithreaded DIH bug

2011-12-02 Thread Mark
ser/201110.mbox/browser I plan (but only plan, sorry) to address it at 4.0 where SOLR-2382 refactoring has been applied recently. Regards On Fri, Dec 2, 2011 at 4:57 AM, Mark wrote: I'm trying to use multiple threads with DIH but I keep receiving the following error.. "Operation not allow

Question on DIH delta imports

2011-12-05 Thread Mark
*pk*: The primary key for the entity. It is *optional* and only needed when using delta-imports. It has no relation to the uniqueKey defined in schema.xml but they both can be the same. When using in a nested entity, is the PK the primary key column of the join table or the key used for joining?

Re: Question on DIH delta imports

2011-12-06 Thread Mark
Anyone? On 12/5/11 11:04 AM, Mark wrote: *pk*: The primary key for the entity. It is *optional* and only needed when using delta-imports. It has no relation to the uniqueKey defined in schema.xml but they both can be the same. When using in a nested entity is the PK the primary key column of

Design questions/Schema Help

2010-07-26 Thread Mark
We are thinking about using Cassandra to store our search logs. Can someone point me in the right direction/lend some guidance on design? I am new to Cassandra and I am having trouble wrapping my head around some of these new concepts. My brain keeps wanting to go back to a RDBMS design. We wi

Re: Design questions/Schema Help

2010-07-26 Thread Mark
On 7/26/10 4:43 PM, Mark wrote: We are thinking about using Cassandra to store our search logs. Can someone point me in the right direction/lend some guidance on design? I am new to Cassandra and I am having trouble wrapping my head around some of these new concepts. My brain keeps wanting to

Solr crawls during replication

2010-07-26 Thread Mark
We have an index around 25-30G w/ 1 master and 5 slaves. We perform replication every 30 mins. During replication the disk I/O obviously shoots up on the slaves to the point where all requests routed to that slave take a really long time... sometimes to the point of timing out. Is there any lo

DIH and Cassandra

2010-08-04 Thread Mark
Is it possible to use DIH with Cassandra either out of the box or with something more custom? Thanks

Throttling replication

2010-09-02 Thread Mark
Is there any way or forthcoming patch that would allow configuration of how much network bandwidth (and ultimately disk I/O) a slave is allowed during replication? We currently have the problem that while replicating, our disk I/O goes through the roof. I would much rather have the replication take

Re: Solr crawls during replication

2010-09-02 Thread Mark
On 8/6/10 5:03 PM, Chris Hostetter wrote: : We have an index around 25-30G w/ 1 master and 5 slaves. We perform : replication every 30 mins. During replication the disk I/O obviously shoots up : on the slaves to the point where all requests routed to that slave take a : really long time... somet

Re: Throttling replication

2010-09-02 Thread Mark
On 9/2/10 8:27 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: There is no way to currently throttle replication. It consumes the whole bandwidth available. It is a nice to have feature On Thu, Sep 2, 2010 at 8:11 PM, Mark wrote: Is there any way or forthcoming patch that would allow configuration of

Re: Throttling replication

2010-09-02 Thread Mark
bandwidth would be nice. -brandon On 9/2/10 7:41 AM, Mark wrote: Is there any way or forthcoming patch that would allow configuration of how much network bandwith (and ultimately disk I/O) a slave is allowed during replication? We have the current problem of while replicating our disk I/O goes through

Re: Solr crawls during replication

2010-09-03 Thread Mark
. From: Shawn Heisey [s...@elyograg.org] Sent: Friday, September 03, 2010 1:46 PM To: solr-user@lucene.apache.org Subject: Re: Solr crawls during replication On 9/2/2010 9:31 AM, Mark wrote: Thanks for the suggestions. Our slaves have 12G with 10G dedicated to the JVM.. too much

Solr Cloud Architecture and DIH

2012-12-19 Thread Mark
We're currently running Solr 3.5 and our indexing process works as follows: We have a master that has a cron job to run a delta import via DIH every 5 minutes. The delta-import takes around 75 minutes to fully complete; most of that is due to optimization after each delta, and then the slaves s

Re: Need help with graphing function (MATH)

2012-02-14 Thread Mark
x%2B5%29%2F38%29*42%2B105%2C+x%3D50..175 Regards, Kent Fitch On Tue, Feb 14, 2012 at 12:29 PM, Mark <mailto:static.void@gmail.com>> wrote: I need some help with one of my boost functions. I would like the function to look something like the following mockup below. Starts

Re: Need help with graphing function (MATH)

2012-02-14 Thread Mark
construct using sums of basic sigmoidal functions. The logistic and probit functions are commonly used for this. Sent from my iPhone On Feb 14, 2012, at 10:05, Mark wrote: Thanks I'll have a look at this. I should have mentioned that the actual values on the graph aren't important ra

Re: Need help with graphing function (MATH)

2012-02-14 Thread Mark
Or better yet an example in solr would be best :) Thanks! On 2/14/12 11:05 AM, Mark wrote: Would you mind throwing out an example of these types of functions. Looking at Wikipedia (http://en.wikipedia.org/wiki/Probit) its seems like the Probit function is very similar to what I want. Thanks

Question on replication

2010-11-22 Thread Mark
After I perform a delta-import on my master the slave replicates the whole index which can be quite time consuming. Is there any way for the slave to replicate only partials that have changed? Do I need to change some setting on master not to commit/optimize to get this to work? Thanks

Solr DataImportHandler (DIH) and Cassandra

2010-11-29 Thread Mark
Is there any way to use DIH to import from Cassandra? Thanks

Re: Solr DataImportHandler (DIH) and Cassandra

2010-11-29 Thread Mark
es or indexes to know what has changed. There is also the Lucandra project, not exactly what you're after but may be of interest anyway https://github.com/tjake/Lucandra Hope that helps. Aaron On 30 Nov, 2010, at 05:04 AM, Mark wrote: Is there any way to use DIH to import from Cassandra? Thanks

Limit number of characters returned

2010-12-02 Thread Mark
Is there way to limit the number of characters returned from a stored field? For example: Say I have a document (~2K words) and I search for a word that's somewhere in the middle. I would like the document to match the search query but the stored field should only return the first 200 characte
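One server-side option is a copyField with a maxChars limit, so the full text is indexed for matching but only a short stored copy is returned. The equivalent client-side truncation is simple; a sketch (word-boundary handling is my own addition):

```python
def stored_preview(text, limit=200):
    """First `limit` characters of a stored value, cut at a word boundary."""
    if len(text) <= limit:
        return text
    return text[:limit].rsplit(" ", 1)[0] + "..."

body = "lorem ipsum " * 300   # stand-in for a ~2K-word stored field
preview = stored_preview(body)
print(len(preview), preview.endswith("..."))
```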

Re: Limit number of characters returned

2010-12-03 Thread Mark
Correct me if I am wrong but I would like to return highlighted excerpts from the document so I would still need to index and store the whole document right (ie.. highlighting only works on stored fields)? On 12/3/10 3:51 AM, Ahmet Arslan wrote: --- On Fri, 12/3/10, Mark wrote: From: Mark

Negative fl param

2010-12-03 Thread Mark
When returning results is there a way I can say to return all fields except a certain one? So say I have stored fields foo, bar and baz but I only want to return foo and bar. Is it possible to do this without specifically listing out the fields I do want?

Re: Limit number of characters returned

2010-12-03 Thread Mark
ld be your own response writer, but unless and until you index gets cumbersome, I'd avoid that. Plus, storing the copied contents only shouldn't impact search much, since this doesn't add any terms... Best Erick On Fri, Dec 3, 2010 at 10:32 AM, Mark wrote: Correct me if I am

Re: Negative fl param

2010-12-03 Thread Mark
Ok simple enough. I just created a SearchComponent that removes values from the fl param. On 12/3/10 9:32 AM, Ahmet Arslan wrote: When returning results is there a way I can say to return all fields except a certain one? So say I have stored fields foo, bar and baz but I only want to return fo
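The SearchComponent workaround described here boils down to dropping the unwanted keys from each returned document. A sketch of that logic (the field names are just examples from the question):

```python
def exclude_fields(doc, excluded):
    """Return a copy of the stored document minus the excluded fields."""
    return {k: v for k, v in doc.items() if k not in excluded}

doc = {"foo": 1, "bar": 2, "baz": 3}
print(exclude_fields(doc, {"baz"}))  # {'foo': 1, 'bar': 2}
```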

Highlighting parameters

2010-12-03 Thread Mark
Is there a way I can specify separate configuration for 2 different fields. For field 1 I wan to display only 100 chars, Field 2 200 chars

Solr Newbie - need a point in the right direction

2010-12-06 Thread Mark
e solution and this has confused me. Basically, if you guys could point me in the right direction for resources (even as much as saying, you need X, it's over there) that would be a huge help. Cheers Mark

Re: Solr Newbie - need a point in the right direction

2010-12-07 Thread Mark
Thanks to everyone who responded, no wonder I was getting confused, I was completely focusing on the wrong half of the equation. I had a cursory look through some of the Nutch documentation available and it is looking promising. Thanks everyone. Mark On Tue, Dec 7, 2010 at 10:19 PM, webdev1977

Warming searchers/Caching

2010-12-07 Thread Mark
Is there any plugin or easy way to auto-warm/cache a new searcher with a bunch of searches read from a file? I know this can be accomplished using the EventListeners (newSearcher, firstSearcher) but I'd rather not add 100+ queries to my solrconfig.xml. If there is no hook/listener available, is

Re: Warming searchers/Caching

2010-12-07 Thread Mark
y, but Xinclude looks like what you're after, see: http://wiki.apache.org/solr/SolrConfigXml#XInclude Best Erick On Tue, Dec 7, 2010 at 6:33 PM, Mark wrote: Is there any plugin or easy way to auto-warm/cache a new searcher with a bunch of searches read from a file? I know this can be ac

Re: Warming searchers/Caching

2010-12-08 Thread Mark
plete statement of your setup is in order, since we seem to be talking past each other. Best Erick On Tue, Dec 7, 2010 at 10:24 PM, Mark wrote: Maybe I should explain my problem a little more in detail. The problem we are experiencing is after a delta-import we notice a extremely high load time o

Re: Warming searchers/Caching

2010-12-08 Thread Mark
otherwise block on CPU with lots of new indexes being warmed at once. Solr is not very good at providing 'real time indexing' for this reason, although I believe there are some features in post-1.4 trunk meant to support 'near real time search' better. _

Re: Warming searchers/Caching

2010-12-08 Thread Mark
our call. Best Erick On Wed, Dec 8, 2010 at 12:25 PM, Mark wrote: We only replicate twice an hour so we are far from real-time indexing. Our application never writes to master rather we just pick up all changes using updated_at timestamps when delta-importing using DIH. We don't have any wa

Re: Warming searchers/Caching

2010-12-09 Thread Mark
Our machines have around 8gb of ram and our index is 25gb. What are some good values for those cache settings. Looks like we have the defaults in place... size="16384" initialSize="4096" autowarmCount="1024" You are correct, I am just removing the health-check file and our loadbalancer preve

Very high load after replicating

2010-12-12 Thread Mark
After replicating an index of around 20g my slaves experience very high load (50+!!) Is there anything I can do to alleviate this problem? Would solr cloud be of any help? thanks

Re: Very high load after replicating

2010-12-13 Thread Mark
Markus, My configuration is as follows... ... false 2 ... false 64 10 false true No cache warming queries and our machines have 8g of memory in them with about 5120m of ram dedicated to so Solr. When our index is around 10-11g in size everything runs smoothly. At around 20g+ it just fall

Re: Very high load

2010-12-13 Thread Mark
Changing the subject. Its not related to after replication. It only appeared after indexing an extra field which increased our index size from 12g to 20g+ On 12/13/10 7:57 AM, Mark wrote: Markus, My configuration is as follows... ... false 2 ... false 64 10 false true No cache warming

Need some guidance on solr-config settings

2010-12-14 Thread Mark
Can anyone offer some advice on what some good settings would be for an index of around 6 million documents totaling around 20-25gb? It seems like when our index gets to this size our CPU load spikes tremendously. What would be some appropriate settings for ramBufferSize and mergeFactor? We cu

Re: Need some guidance on solr-config settings

2010-12-14 Thread Mark
Excellent reply. You mentioned: "I've been experimenting with FastLRUCache versus LRUCache, because I read that below a certain hitratio, the latter is better." Do you happen to remember what that threshold is? Thanks On 12/14/10 7:59 AM, Shawn Heisey wrote: On 12/14/201

DIH and UTF-8

2010-12-27 Thread Mark
Seems like I am missing some configuration when trying to use DIH to import documents with chinese characters. All the documents save crazy nonsense like "这是测试" instead of actual chinese characters. I think its at the JDBC level because if I hardcode one of the fields within data-confi

Re: DIH and UTF-8

2010-12-27 Thread Mark
05 PM, Mark wrote: Seems like I am missing some configuration when trying to use DIH to import documents with chinese characters. All the documents save crazy nonsense like "这是测试" instead of actual chinese characters. I think its at the JDBC level because if I hardcode on

Re: DIH and UTF-8

2010-12-27 Thread Mark
Glen http://zzzoot.blogspot.com/ On Mon, Dec 27, 2010 at 5:15 PM, Mark wrote: Solr: 1.4.1 JDBC driver: Connector/J 5.1.14 Looks like its the JDBC driver because It doesn't even work with a simple java program. I know this is a little off subject now, but do you have any clues? Thanks again

Re: DIH and UTF-8

2010-12-27 Thread Mark
Just like the user of that thread... I have my database, table, columns and system variables all set but it still doesn't work as expected. Server version: 5.0.67 Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> SHOW VARIABLES LIKE 'collation%'; +

Re: DIH and UTF-8

2010-12-28 Thread Mark
besides your browser? Yes, I am running out of ideas! :-) -Glen On Mon, Dec 27, 2010 at 7:22 PM, Mark wrote: Just like the user of that thread... i have my database, table, columns and system variables all set but it still doesnt work as expected. Server version: 5.0.67 Source distribution

Dynamic column names using DIH

2010-12-28 Thread Mark
Is there a way to create dynamic column names using the values returned from the query? For example:

Re: DIH and UTF-8

2010-12-29 Thread Mark
. However when using the mysql client all the characters would show up as all mangled or as ''. This was resolved by running the following query "set names utf8;". On 12/28/10 10:17 PM, Glen Newton wrote: Hi Mark, Could you offer a more technical explanation of the Ra

Query multiple cores

2010-12-29 Thread Mark
Is it possible to query across multiple cores and combine the results? If not available out-of-the-box could this be accomplished using some sort of custom request handler? Thanks for any suggestions.
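One out-of-the-box option, assuming the cores share a compatible schema, is Solr's distributed search: list the cores in the shards parameter and Solr merges the results. A sketch of building such a request (hosts and core names are placeholders):

```python
from urllib.parse import urlencode

# Placeholder hosts/cores; schemas must be compatible for merged results.
shards = ",".join([
    "localhost:8983/solr/core0",
    "localhost:8983/solr/core1",
])
params = {"q": "foo", "shards": shards}
url = "http://localhost:8983/solr/core0/select?" + urlencode(params)
print(url)
```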

Re: Query multiple cores

2010-12-29 Thread Mark
On Dec 29, 2010, at 3:24 PM, Mark wrote: Is it possible to query across multiple cores and combine the results? If not available out-of-the-box could this be accomplished using some sort of custom request handler? Thanks for any suggestions.

Question on long delta import

2010-12-30 Thread Mark
When using DIH my delta imports appear to finish quickly.. ie it says "Indexing completed. Added/Updated: 95491 documents. Deleted 11148 documents." in a relatively short amount of time (~30mins). However the importMessage says "A command is still running..." for a really long time (~60mins).

DIH MySQLNonTransientConnectionException

2011-01-01 Thread Mark
I have recently been receiving the following errors during my DIH importing. Has anyone ran into this issue before? Know how to resolve it? Thanks! Jan 1, 2011 4:51:06 PM org.apache.solr.handler.dataimport.JdbcDataSource closeConnection SEVERE: Ignoring Error when closing connection com.mysql

DIH keeps felling during full-import

2011-02-07 Thread Mark
I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1 large import? Thanks. Feb 7, 2011 5:52:33 AM org.apa

Re: DIH keeps failing during full-import

2011-02-07 Thread Mark
Typo in subject On 2/7/11 7:59 AM, Mark wrote: I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1

Re: DIH keeps felling during full-import

2011-02-07 Thread Mark
Mon, Feb 7, 2011 at 9:29 PM, Mark wrote: I'm receiving the following exception when trying to perform a full-import (~30 hours). Any idea on ways I could fix this? Is there an easy way to use DIH to break apart a full-import into multiple pieces? IE 3 mini-imports instead of 1 large i

DIH threads

2011-02-18 Thread Mark
Has anyone applied the DIH threads patch on 1.4.1 (https://issues.apache.org/jira/browse/SOLR-1352)? Does anyone know if this works and/or does it improve performance? Thanks

Removing duplicates

2011-02-18 Thread Mark
I know that I can use the SignatureUpdateProcessorFactory to remove duplicates but I would like the duplicates in the index but remove them conditionally at query time. Is there any easy way I could accomplish this?
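Keeping duplicates in the index and removing them at query time is what field collapsing/grouping on the signature field amounts to. A client-side sketch of collapsing on a stored signature (field name and doc shape are illustrative):

```python
def collapse(docs, key="signature"):
    """Keep only the first (highest-ranked) doc per signature value."""
    seen, out = set(), []
    for d in docs:  # docs assumed already sorted by score
        if d[key] not in seen:
            seen.add(d[key])
            out.append(d)
    return out

docs = [{"id": 1, "signature": "a"},
        {"id": 2, "signature": "a"},
        {"id": 3, "signature": "b"}]
print(collapse(docs))  # keeps ids 1 and 3
```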

Field Collapsing on 1.4.1

2011-02-19 Thread Mark
Is there a seamless field collapsing patch for 1.4.1? I see it has been merged into trunk, but when I tried downloading it to give it a whirl it appeared that many things have changed and our application would need some considerable work to get it up and running. Thanks
