POSTing the individual parameters (literal.id, literal.mycategory, literal.mycategory) as name/value pairs in the request body to Solr 1.4's /update/extract does work. I just realized that my POST's content type hadn't been set to 'application/x-www-form-urlencoded'. Once I set it to that, Solr accepted all the parameters.
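For anyone who finds this later, here is a minimal sketch of what such a POST might look like in Python, using only the standard library. The function name build_extract_request is hypothetical, and the host names and field names are simply the placeholders from this thread. It only builds the request, so you can inspect it before sending anything to a real server.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(solr_url, doc_id, categories, stream_url):
    """Build a form-encoded POST for /update/extract (hypothetical helper).

    The parameters travel in the request body instead of the query string,
    so the URL stays short no matter how many categories are sent.
    """
    # A list of pairs (not a dict) lets the same key repeat once per value.
    params = [("literal.id", doc_id)]
    params += [("literal.mycategory", c) for c in categories]
    params.append(("stream.url", stream_url))

    body = urlencode(params).encode("ascii")
    return Request(
        solr_url,
        data=body,
        # Without this header the body parameters were not picked up.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder hosts and values taken from the thread.
req = build_extract_request(
    "http://host/solr/update/extract",
    "abc",
    ["blah", "foo", "bar"],
    "http://otherhost/some/file.pdf",
)

# To actually send it:
#   from urllib.request import urlopen
#   urlopen(req)
```

The same list-of-pairs approach should scale to thousands of literal.mycategory values, though the container may still enforce its own limit on POST body size.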
-dKt

________________________________
From: David Thompson <guinyardli...@yahoo.com>
To: solr-user@lucene.apache.org
Sent: Fri, July 9, 2010 12:17:59 PM
Subject: PDF remote streaming extract with lots of multiValues

How would I go about setting a large number of literal values in a call to index a remote PDF? I'm currently calling:

http://host/solr/update/extract?literal.id=abc&literal.mycategory=blah&stream.url=http://otherhost/some/file.pdf

And that works great, except now I'm coming across use cases where I need to send in hundreds, up to thousands, of different values for 'mycategory'. So with mycategory defined as a multiValued string, I can call:

http://host/solr/update/extract?literal.id=abc&literal.mycategory=blah&literal.mycategory=foo&literal.mycategory=bar&stream.url=http://otherhost/some/file.pdf

and that works as expected. But when I try to embed thousands of literal.mycategory parameters in the call, eventually my container says 'look, I've been forgiving about letting you GET URLs far longer than 1500 characters, but this is ridiculous' and barfs on it.

I've tried POSTing an <add><doc>...</doc></add> command, but it only pays attention to parameters in the URL query string, ignoring everything in the document. I've seen some other threads that seem related, but now I'm just confused.

What's the best way to tackle this?

-dKt