Entity textbox is empty

2015-12-26 Thread adib
Hi,
I just started using Solr 5.2, and I was able to install it, add a document, and
run some queries.
It was a pleasant journey until I got stuck trying to import data from a MySQL
database. I followed this tutorial, and there are no
errors in the logs, but the strange thing is that the "entity" combo box in the
data-import GUI is empty.

Did I miss any steps?

I have also attached a screenshot of the admin GUI. Thank you.

 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Entity-textbox-is-empty-tp4247387.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Entity textbox is empty

2015-12-26 Thread Ahmet Arslan
hi Adib,

Please remove the following line from the dataconfig.xml file:
# define data source

Ahmet
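
A quick way to spot this kind of problem is to parse the config and list both the entities and any stray text between tags. The sketch below is only an illustration: the sample config, driver, JDBC URL, table, and field names are assumptions and are not taken from the original post, and it does not reproduce how the data import handler itself reads the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical config for illustration only; the driver, URL, table, and
# field names are assumptions, not taken from the original post.
SAMPLE_CONFIG = """\
<dataConfig>
  # define data source
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb" user="solr" password="secret"/>
  <document>
    <entity name="product" query="SELECT id, name FROM products">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
"""


def inspect_dih_config(xml_text):
    """Return (entity_names, stray_text) for a data-config XML string."""
    root = ET.fromstring(xml_text)
    # Names of all <entity> elements, in document order.
    entities = [e.get("name") for e in root.iter("entity")]
    # Any non-whitespace text sitting between tags is suspicious here,
    # because this kind of config carries its data in attributes.
    stray = []
    for elem in root.iter():
        for chunk in (elem.text, elem.tail):
            if chunk and chunk.strip():
                stray.append(chunk.strip())
    return entities, stray


entities, stray = inspect_dih_config(SAMPLE_CONFIG)
print("entities:", entities)
print("stray text:", stray)
```

If the stray-text list is not empty, removing those lines (or turning them into proper `<!-- ... -->` XML comments) is a reasonable first thing to try.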





Re: Entity textbox is empty

2015-12-26 Thread adib
Wow, it's working.

Thank you






Re: AJAX access to Solr Server

2015-12-26 Thread GW
Yes, your proxy seems to work.

The only thing that bothers me is that anyone can query your Solr installation.

The world is not a nice place, and I can't tell you how many DoS attacks
I've fended off in the last 30 years.

If I thought you were an a-hole, I could set up a few machines and query
your server to a standstill.

About ten years ago I was working on a contract. The competitor that lost
the bid did an email DoS attack on me after they took out a whole bunch of car
ads (hot deals) in the local paper. My email was f###'d and my phone was
ringing off the hook.

Cheers,

GW


On 25 December 2015 at 21:55, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Yeah I prefer a whitelist of locked down query request handlers via a
> proxy that are reasonably well protected. I would never expose update to
> the web or allow any updating over a public interface.
>
> If you want an example, you can check out
>
>
> http://solr.quepid.com/solr/statedecoded/select?q=*:*&qt=update&stream.body=*:*&commit=true
>
> http://solr.quepid.com/solr/statedecoded/update?stream.body=*:*&commit=true
>
> But still get search results back:
> http://solr.quepid.com/solr/statedecoded/select?q=*:*
>
> Click all those all day long. And do let me know if you find holes... I'm
> sure there's room for improvement
>
> Cheers,
> -Doug
>
> On Friday, December 25, 2015, GW wrote:
>
> > If you are using Linux, a simple one-liner in iptables:
> >
> > iptables -I INPUT \! --src www.yourwebserver.com -m tcp -p tcp --dport 8983 -j DROP
> >
> >
> > If Windows, you can do something similar.
> >
> > Otherwise it is very easy for anyone to delete all your documents with
> >
> > http://yoursolrserver.com:8983/solr/your-core/update?stream.body=*:*&commit=true
> >
> >
> >
> >
> > On 25 December 2015 at 20:42, Doug Turnbull <
> > dturnb...@opensourceconnections.com > wrote:
> >
> > > Hi Shawn
> > >
> > > Maybe I should have qualified the parameters of the scenarios that make me
> > > comfortable just proxying Solr directly w/o an API.
> > >
> > > These situations include:
> > >
> > > 1. I've got no qualms about giving the whole world access to every document
> > > in the index. There's nothing protected about anything.
> > > 2. The content can be easily rebuilt; it's not very large. (I can easily
> > > push a button and make a new one.)
> > >
> > > Sure, you can denial-of-service Solr, and I might lose my search index. But
> > > you can denial-of-service anything. This includes just about anything you
> > > put in front of Solr. Moreover, the added complexity of a
> > > Drupal/Wordpress/your API might only add to the security problems with
> > > their own security issues. I'd rather keep it simple and have fewer moving
> > > parts.
> > >
> > > Cases where I would want an API in front of Solr (these are just the
> > > security ones):
> > > - I want to protect the content (i.e. based on some notion of a "user" or
> > > other permissions)
> > > - Rebuilding the content would be very hard and time consuming
> > >
> > > I would also say that to expose Solr directly to everyone, you probably
> > > should know about Solr's bugaboos:
> > > - the lovely qt parameter and the request dispatcher (the nginx proxy below
> > > disallows qt)
> > > - deep paging (prevented by the nginx proxy)
> > > - how to lock down a request handler fairly robustly, how to use invariants
> > > - mitigating intentionally malicious queries (such as the lovely "sleep"
> > > function query).
> > >
> > > I'm also curious to hear what the websolr people do, or anyone else that
> > > hosts Solr for the JavaScript app development crowd.
> > >
> > > Cheers
> > > -Doug
> > >
> > >
> > > On Friday, December 25, 2015, Shawn Heisey wrote:
> > >
> > > > On 12/25/2015 12:17 PM, Eric Dain wrote:
> > > > > Does allowing javascript direct access to SolrCloud raise security concern?
> > > > > should I build a REST service in between?
> > > > >
> > > > > I need to provide async search capability to web pages. the pages will be
> > > > > public with no authentication.
> > > >
> > > > End users should never have access to Solr.  Access to Solr from the
> > > > end-user machine is required if you want to accept Solr responses directly.
> > > >
> > > > In one of the other replies that you received, Doug has given you an
> > > > nginx config for proxying access to Solr -- indirect access.  This can
> > > > protect against *changes* to the index, and it has protection against
> > > > high start/rows values, but there are many other ways that an attacker
> > > > can construct denial of service queries, which this proxy config will
> > > > not prevent.
> > > >
> > > > I think that indirect access (through a proxy) should not be allowed
> > > > either, unless you can trust all the people that will have access.
> > > >
> > > > If Solr is open to a sufficiently wide audience (especially the
> > > > Internet), someone will find a way to abuse the se

Re: AJAX access to Solr Server

2015-12-26 Thread Doug Turnbull
True, though you could also query an API in front of Solr to a standstill
pretty easily. DoSing is a pretty easy thing to do to anything that needs
to be open to the public.

The biggest issue with the proxy approach is an attacker with Solr
knowledge who doesn't need to DoS, but just sends a handful of really slow
queries to Solr. This is something that can be mitigated, but the more
skilled the attacker, the more interesting the slow queries get.

I should note this is a problem to mitigate with any system that handles
user queries by passing them directly to edismax. Solr sort of encourages
you to talk straight to edismax, and most systems don't prepare or escape
the query. Instead they want to support the full range of query operations.
An attacker can still put nasty function queries in the query box enough
times to make a Solr server crawl.

Doug
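
As a rough illustration of "preparing or escaping" a user query before it reaches edismax, here is a minimal sketch of a front-end filter. Everything in it is an assumption made for the example (the function names, the limits, and the choice of which parameters to forward); it is not the nginx proxy discussed in this thread. It escapes the characters that Solr's query syntax treats as special, caps `start` and `rows` against deep paging, and forwards nothing else, so parameters such as `qt` or `stream.body` are dropped.

```python
# Characters that have special meaning in Solr's query syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')

# Hypothetical limits, chosen for illustration only.
MAX_ROWS = 50
MAX_START = 1000


def escape_query(text):
    """Backslash-escape every query-syntax character in user input."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def clamp(value, low, high, default):
    """Parse value as an int and force it into [low, high]."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return max(low, min(high, number))


def sanitize_params(raw):
    """Build the only parameters that will be forwarded to Solr.

    Anything not listed here (qt, stream.body, commit, ...) is dropped.
    """
    return {
        "q": escape_query(raw.get("q", "")),
        "start": clamp(raw.get("start"), 0, MAX_START, 0),
        "rows": clamp(raw.get("rows"), 0, MAX_ROWS, 10),
    }


safe = sanitize_params({
    "q": "{!func}sleep(1000)",
    "rows": "1000000",
    "qt": "/update",
    "stream.body": "anything",
})
print(safe)
```

The trade-off is the one described above: escaping everything also takes away the query operators that many applications want to offer, and it does nothing against queries that are slow without using any special syntax.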


Re: AJAX access to Solr Server

2015-12-26 Thread GW
What are you using for a client?

I generally use a REST client written in PHP or Perl and then prevent cross
scripting so only the client can do the work.

My Solr cluster is running behind OpenVPN on 172.16.0.0/24.

I use jQuery on the following site to get infinite scroll:

http://www.frogshopping.com

The cross-scripting work is not in place yet.


Solr5.X document loss in splitting shards

2015-12-26 Thread Luca Quarello
Hi,
I have a SolrCloud 5.3.1 cluster with two nodes and 8 shards per node.

Each shard holds about 35 million documents (35025882) and is about 16 GB in size.


   - I launch the SPLIT command on a shard (shard 13) asynchronously:

curl "http://x-perf-jvm5:8983/solr/admin/collections?action=SPLITSHARD&collection=sepa&shard=shard13&async=1006"



   - After a long time I get:


curl "http://x-perf-jvm5:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1006"


06020505sepa_shard13_1_replica1EMPTY_BUFFERhttp://192.168.72.55:8983/solr/sepa_shard13_replica1/048100
completed TaskId: 1006264805140687740 webapp=null path=/admin/cores params={shard=shard13_0&collection.configName=flsFragments&name=sepa_shard13_0_replica1&action=CREATE&collection=sepa&wt=javabin&qt=/admin/cores&async=1006264805140687740&version=2} status=0 QTime=2
00 completed TaskId: 1006264808287598167 webapp=null path=/admin/cores params={shard=shard13_1&collection.configName=flsFragments&name=sepa_shard13_1_replica1&action=CREATE&collection=sepa&wt=javabin&qt=/admin/cores&async=1006264808287598167&version=2} status=0 QTime=0
00 completed TaskId: 1006264810307413066 webapp=null path=/admin/cores params={coreNodeName=core_node18&state=active&nodeName=192.168.72.55:8983_solr&action=PREPRECOVERY&checkLive=true&core=sepa_shard13_1_replica1&wt=javabin&qt=/admin/cores&onlyIfLeader=true&async=1006264810307413066&version=2} status=0 QTime=0
00 completed TaskId: 1006264810317508052 webapp=null path=/admin/cores params={targetCore=sepa_shard13_0_replica1&targetCore=sepa_shard13_1_replica1&action=SPLIT&core=sepa_shard13_replica1&wt=javabin&qt=/admin/cores&async=1006264810317508052&version=2} status=0 QTime=0
00 completed TaskId: 1006266054432757899 webapp=null path=/admin/cores params={name=sepa_shard13_1_replica1&action=REQUESTAPPLYUPDATES&wt=javabin&qt=/admin/cores&async=1006266054432757899&version=2} status=0 QTime=5
completed found 1006 in completed tasks




   - I launch the commit command:

curl http://x-perf-jvm5:8983/solr/sepa/update --data-binary '<commit/>' -H 'Content-type:application/xml'



0162




The newly created shards have:
13430316 documents (5.6 GB) and 13425924 documents (5.59 GB).
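
To make the gap concrete, here is a quick check of the figures above. It is only a sketch: the variable names are illustrative, and it assumes that 35025882 is the document count of shard13 itself before the split.

```python
# Figures quoted in this message; assumes 35025882 is shard13's own count.
parent_docs = 35025882
child_docs = [13430316, 13425924]

# Documents present in the two sub-shards after the split.
total_after = sum(child_docs)
# Documents that are in the parent but in neither sub-shard.
missing = parent_docs - total_after
missing_pct = 100.0 * missing / parent_docs

print("documents after split:", total_after)
print("documents missing:", missing)
print("share missing: %.1f%%" % missing_pct)
```

Under that assumption, a little under a quarter of the parent shard's documents are unaccounted for, which is in line with the drop in index size from 16 GB to about 11.2 GB.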

What is the problem? Where am I going wrong?

Thanks,
Luca