Re: News clustering
Hi Stanislaw Osinski,

Was the picture generated using Lingo 3G algorithms?
I saw some sub-clusters inside it.
Nice pic :)

I am interested in learning it.
How long is the Lingo 3G trial period?

Is there any way to programmatically measure the performance of the Carrot2 clustering algorithm?

Thanks,
cheers
Hanjoyo

On Mon, Dec 3, 2012 at 6:13 PM, Stanislaw Osinski wrote:

> One of our clients uses Solr's search results clustering for grouping news.
> Instead of the default Carrot2 algorithm that ships with Solr they use a
> commercial one, but Carrot2 should give you decent clusters too. Here's an
> example clustering result:
>
> http://imagebin.org/238001
>
> Staszek
>
> --
> Stanislaw Osinski
> http://carrotsearch.com
>
> On Fri, Nov 30, 2012 at 4:44 PM, Jorge Luis Betancourt Gonzalez <
> jlbetanco...@uci.cu> wrote:
>
> > Hi all:
> >
> > I'm thinking of using Nutch combined with Solr to index some news sites in
> > an intranet, and I was wondering how effective the clustering component
> > would be for clustering the search results. Are there any success stories
> > of using the Solr clustering component for news clustering? Is there any
> > existing solution for clustering/classification at index time?
> >
> > Greetings!
Re: How to change Solr UI
Hi Romita,

In my opinion, if you are new to Solr, you can start by learning from Solritas.
Solritas uses Apache Velocity (a templating language), CSS and jQuery to manage its look and behavior.
Besides that, you can write a custom SearchComponent and register it in the /browse SearchHandler to add more functionality to your search application.

Kind regards,
Hanjoyo

On Mon, Dec 3, 2012 at 4:35 PM, Romita Saha wrote:

> Hi,
>
> I want to change the Solr UI. As far as I understand, Solritas is just for
> prototyping, where I can change the UI according to a predefined template
> (Velocity) and cannot add any additional functionality to that page.
> How can I change the Solr UI otherwise? Any guidance would be appreciated.
>
> Thanks and regards,
> Romita
Re: News clustering
Hi Stanislaw,

I mean measuring the similarity between the documents in each cluster, and also the difference between the documents in one cluster and those in another cluster.

I saw the sample code ClusteringQualityBenchmark.java.
However, I do not know how to use it to assess my Solr clustering performance.

Kind regards,
Hanjoyo

On Mon, Dec 3, 2012 at 8:11 PM, Stanislaw Osinski wrote:

> > Was the picture generated using Lingo 3G algorithms?
> > I saw some sub-clusters inside it.
> > Nice pic :)
>
> That is correct.
>
> > I am interested in learning it.
> > How long is the Lingo 3G trial period?
>
> I'll send you the details in a private e-mail in a second.
>
> > Is there any way to programmatically measure the performance of the
> > Carrot2 clustering algorithm?
>
> I'm not sure what you mean by performance. Measuring clustering time is
> pretty straightforward, measuring the quality of clusters is not, a lot
> depends on your specific data and application.
>
> Staszek
Re: News clustering
Hi Stanislaw,

I see. Thank you for the reference.

Kind regards,
Hanjoyo

On Tue, Dec 4, 2012 at 12:37 AM, Stanislaw Osinski wrote:

> > I mean measuring the similarity between the documents in each cluster,
> > and also the difference between the documents in one cluster and those
> > in another cluster.
> >
> > I saw the sample code ClusteringQualityBenchmark.java.
> > However, I do not know how to use it to assess my Solr clustering
> > performance.
>
> You'd need to write your own code for this, here are the most common
> clustering quality measures you mentioned:
>
> http://en.wikipedia.org/wiki/Cluster_analysis#Evaluation_of_clustering_results
>
> These are meant for the general case (numeric attributes), to apply them to
> texts, you'd need to use the vector representation of the documents.
>
> On a more general note, synthetic measures test only the document-cluster
> assignments, but none take the quality of labels into account (this is
> really hard to measure objectively).
>
> Staszek
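The two measures discussed in this thread (similarity of documents inside a cluster, and difference between documents of different clusters) could be sketched as follows. This is only a minimal illustration in Python, not the Carrot2 benchmark code: it assumes a plain bag-of-words term-frequency vector per document and cosine similarity, and the function names and the sample `clusters` data are made up for the example.

```python
import math
from collections import Counter
from itertools import combinations


def vectorize(text):
    """Bag-of-words term-frequency vector for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(values):
    """Arithmetic mean, 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def intra_cluster_similarity(cluster):
    """Average pairwise similarity of the documents inside one cluster."""
    vectors = [vectorize(doc) for doc in cluster]
    return mean(cosine(a, b) for a, b in combinations(vectors, 2))


def inter_cluster_similarity(cluster_a, cluster_b):
    """Average similarity between the documents of two different clusters."""
    vectors_a = [vectorize(doc) for doc in cluster_a]
    vectors_b = [vectorize(doc) for doc in cluster_b]
    return mean(cosine(a, b) for a in vectors_a for b in vectors_b)


# Hypothetical sample data standing in for document snippets returned by Solr.
clusters = {
    "football": ["team wins football match", "football team loses match"],
    "economy": ["central bank raises interest rates", "bank cuts interest rates"],
}
```

A good clustering should show high intra-cluster and low inter-cluster values. As noted above, such synthetic measures say nothing about the quality of the cluster labels.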
Re: How to change Solr UI
> Note that Velocity _can_ be used for user-facing code, but be very sure you
> secure your Solr. If you allow direct access, a user can easily enter
> something like http:// /update?commit=true&stream.body=*:*.
> And all your documents will be gone.

Hi Erickson,

Thank you for the input. I'll take note of it and filter out this URL:
* http:// /update?commit=true&stream.body=*:*

Kind regards,
Hanjoyo
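Filtering out one known-bad URL is fragile, so an allow-list is usually the safer direction. The following Python sketch shows the idea under stated assumptions: the `/solr/select` and `/solr/browse` paths, the host name in the usage note, and the function name `is_allowed` are hypothetical, and the check is meant to sit in a proxy or front-end in front of Solr rather than replace proper network-level protection.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical allow-list: only read-only handlers are reachable from outside.
ALLOWED_PATHS = {"/solr/select", "/solr/browse"}

# Request parameters that can redirect a query to another handler or
# feed content into Solr; reject any request that carries them.
BLOCKED_PARAMS = {"qt", "stream.body", "stream.url", "stream.file"}


def is_allowed(url):
    """Return True only for requests to an allowed path without blocked parameters."""
    parts = urlsplit(url)
    if parts.path.rstrip("/") not in ALLOWED_PATHS:
        return False
    params = parse_qs(parts.query, keep_blank_values=True)
    return BLOCKED_PARAMS.isdisjoint(params)
```

With this sketch, `is_allowed("http://localhost:8080/solr/select?q=news")` is accepted, while any `/update` request, or a `/select` request carrying `stream.body`, is rejected.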
Re: solr war -> osgi
> Has anyone had any experience repackaging the solr war for osgi? And while
> I'm at it, has anyone done this in geronimo 3.0?

Hi Marcos,

Start the GlassFish web server.
Put the Solr war file inside the autodeploy folder.
Finally, you need to find the Solr home folder location. Different operating systems will have different Solr home locations for GlassFish, so you need to find it yourself in the GlassFish log file. It is a bit difficult.

Good luck.

Kind regards,
Hanjoyo
Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse
Hello list,

On Sun, Sep 30, 2012 at 6:43 PM, Iwan Hanjoyo wrote:

> Hello all,
>
> I used the older Solr 3.6.1 version.
> I created a new web project (called SolrRedo) in NetBeans 7.1.1 running on
> the GlassFish web server.
> Then I moved the sources from the solr.war sample code (that resided inside
> apache-solr-3.6.1.zip) into the SolrRedo NetBeans 7.1.1 project.
>
> I also did some settings (e.g. put the solr.home folder into a proper
> place), then deployed and ran the project.
> I successfully ran it in the browser (including
> http://localhost:8080/SolrRedo/admin/).
> However, I got an HTTP Status 500 error when trying to browse
> http://localhost:8080/SolrRedo/browse/
> How should I fix this problem?
>
> Kind regards
> Hanjoyo
>
> Here are the details:
>
> HTTP Status 500 - lazy loading error
> org.apache.solr.common.SolrException: lazy loading error
>   at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1763)
>   at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getContentType(SolrCore.java:1778)
>   at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:338)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
>   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
>   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:217)
>   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279)
>   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
>   at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655)
>   at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595)
>   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161)
>   at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331)
>   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231)
>   at com.sun.enterprise.v3.services.impl.ContainerMapper$AdapterCallable.call(ContainerMapper.java:317)
>   at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:195)
>   at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849)
>   at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:746)
>   at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045)
>   at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228)
>   at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
>   at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
>   at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
>   at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
>   at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
>   at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
>   at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
>   at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
>   at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.VelocityResponseWriter'
>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394)
>   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:419)
>   at org.apache.solr.core.SolrCore.createQueryResponseWriter(SolrCore.java:487)
>   at org.apache.solr.core.SolrCore.access$300(SolrCore.java:72)
>   at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1758)
>   ... 28 more
> Caused by: java.lang.ClassNotFoundException: solr.VelocityResponseWriter
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>   at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:264)
>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378)
>   ... 32 more
Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse
Hello list,

I finally solved the problem.
I was missing the configuration of the Solr jar files in the solrconfig.xml file.
Thank you.

Kind regards,
Hanjoyo

On Tue, Oct 2, 2012 at 5:57 PM, Iwan Hanjoyo wrote:

> Hello list,
>
> On Sun, Sep 30, 2012 at 6:43 PM, Iwan Hanjoyo wrote:
>
>> Hello all,
>>
>> I used the older Solr 3.6.1 version.
>> I created a new web project (called SolrRedo) in NetBeans 7.1.1 running
>> on the GlassFish web server.
>> Then I moved the sources from the solr.war sample code (that resided
>> inside apache-solr-3.6.1.zip) into the SolrRedo NetBeans 7.1.1 project.
>>
>> I also did some settings (e.g. put the solr.home folder into a proper
>> place), then deployed and ran the project.
>> I successfully ran it in the browser (including
>> http://localhost:8080/SolrRedo/admin/).
>> However, I got an HTTP Status 500 error when trying to browse
>> http://localhost:8080/SolrRedo/browse/
>> How should I fix this problem?
>>
>> Kind regards
>> Hanjoyo
>>
>> Here are the details:
>>
>> HTTP Status 500 - lazy loading error
>> org.apache.solr.common.SolrException: lazy loading error
>>   at org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1763)
>>   [...]
>> Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.VelocityResponseWriter'
>>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394)
>>   [...]
>> Caused by: java.lang.ClassNotFoundException: solr.VelocityResponseWriter
>>   [...]
>>   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378)
>>   ... 32 more
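A ClassNotFoundException for solr.VelocityResponseWriter, as in this thread, typically means that no jar containing the class was found on the core's classpath. One way to check the jar configuration is the Python sketch below. It rests on assumptions: that solrconfig.xml declares jars with elements of the form `<lib dir="..." regex="..."/>`, that a relative `dir` is resolved against the core's instance directory, and that the regex must match the whole file name. The function name `missing_lib_jars` is made up for this example and is not part of Solr.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_lib_jars(solrconfig_path, instance_dir):
    """Return (dir, regex) pairs of <lib> directives that match no file on disk.

    Assumption: relative "dir" attributes are resolved against instance_dir.
    """
    missing = []
    root = ET.parse(str(solrconfig_path)).getroot()
    for lib in root.iter("lib"):
        directory = lib.get("dir")
        if directory is None:
            continue  # other <lib> forms are not handled in this sketch
        pattern = re.compile(lib.get("regex", ".*"))
        lib_dir = Path(instance_dir) / directory
        matches = []
        if lib_dir.is_dir():
            # Keep only the files whose whole name matches the regex.
            matches = [p for p in lib_dir.iterdir() if pattern.fullmatch(p.name)]
        if not matches:
            missing.append((directory, pattern.pattern))
    return missing
```

Calling `missing_lib_jars("conf/solrconfig.xml", ".")` from inside the instance directory lists the directives that resolve to nothing. After moving solr.home into a NetBeans or GlassFish project, relative paths such as `../../dist/` are a likely candidate for such a mismatch.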
Re: successfully move to glassfish but got error accessing Velocity sample code
Hello list,

I finally solved the problem.
I was missing the configuration of the Solr jar files in the solrconfig.xml file.
Thank you.

Kind regards,
Hanjoyo

On Mon, Oct 1, 2012 at 8:58 PM, Iwan Hanjoyo wrote:

> Hello all,
>
> First, after extracting the apache-solr-3.6.1.zip file, I can run and access
> http://localhost:8080/browse (the Solritas Velocity example) from Jetty.
> I also successfully moved solr.war to GlassFish and got it running.
>
> However, I got an error when accessing http://localhost:8080/browse
> (the Solritas Velocity example) from GlassFish.
> What configuration is missing here?
>
> I have copied the solr.home folder as is from apache-solr-3.6.1.zip.
> Thanks in advance.
>
> Kind regards,
> Hanjoyo
Re: solr1.4 code Example
You can download the code directly from here:

http://www.solrenterprisesearchserver.com/
http://solrenterprisesearchserver.s3-website-us-east-1.amazonaws.com/downloads/5883-solr-enterprise-search2.zip

Regards,
Hanjoyo
find a way to solr netbeans
Hi list,

Does anyone know how to integrate Solr with NetBeans?
The reasons I want to have Solr in NetBeans:
+ to avoid the long classpath configuration in the environment variables,
+ to avoid complicated steps (especially when starting and restarting the GlassFish server),
+ to help with debugging the app.
It simply integrates all the processes.

So far, it is OK. NetBeans runs the app in the browser and I can view the admin pages, but I get an error when I submit a search.
Here is the error message:

HTTP Status 400 - Missing solr core name in path

I found the GlassFish log file reporting:

[#|2012-10-11T23:19:51.468+0700|INFO|glassfish3.1.2|org.apache.solr.handler.component.HttpShardHandlerFactory|..Setting urlScheme to: http://|#]

This happened after I put the solr/home folder into the NetBeans project and hardcoded the solr/home path in the solr.xml file.

This is what I have done to fix "Setting urlScheme to: http://": I added this configuration to the solrconfig.xml file:

1000 5000 http://127.0.0.1:8080/SolrRedo

Results: the GlassFish log file now shows

[#|2012-10-11T23:19:51.796+0700|INFO|glassfish3.1.2|org.apache.solr.handler.component.HttpShardHandlerFactory|_ThreadID=68;_ThreadName=Thread-2;|Setting urlScheme to: http://127.0.0.1:8080/SolrRedo|#]

However, the problem still happens (HTTP Status 400 - Missing solr core name in path).
Can anyone help? Many thanks in advance.

Kind regards,
Hanjoyo
Re: Apache Nutch 1.5.1 + Apache Solr 4.0
Hi Steiner,

I found a video tutorial on Nutch 1.4 + Solr 3.4.0 (on Windows).
It solved my error; I hope it does the same for yours.
Here are the links:

Running Nutch and Solr on Windows Tutorial: Part 1
http://www.youtube.com/watch?v=baxhI6Wkov8
Running Nutch and Solr on Windows Tutorial: Part 2
http://www.youtube.com/watch?v=Qs-18hRRpNU
Running Nutch and Solr on Windows Tutorial: Part 3
http://www.youtube.com/watch?v=GtbDHiYrlNE

Published on Mar 15, 2012 by Dutedute2

Kind regards,
Hanjoyo

On Thu, Nov 8, 2012 at 4:52 PM, Antony Steiner wrote:

> Hello, my name is Antony and I'm new to Apache Nutch and Solr.
>
> I want to crawl my website and therefore I downloaded Nutch to do this.
> This works fine. But now I would like to integrate Nutch with Solr. I'm
> running this on my Unix system.
> I'm trying to follow this tutorial:
> http://wiki.apache.org/nutch/NutchTutorial
> But it won't work for me. Running Solr without Nutch is no problem. I can
> post documents to Solr with post.jar. But what I want to do is post my
> Nutch crawl to Solr.
> Now if I copy the schema.xml from Nutch to the
> apache-solr-4.0.0/example/solr/collection1/conf directory and restart Solr
> (java -jar start.jar), I get errors but Solr will start. (Is this
> the correct directory to copy my schema to?)
>
> Nov 8, 2012 9:40:33 AM org.apache.solr.schema.IndexSchema readSchema
> INFO: Schema name=nutch
> Nov 8, 2012 9:40:33 AM org.apache.solr.core.CoreContainer create
> SEVERE: Unable to create core: collection1
> org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
>   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
>   at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
>   ...
>
> Nov 8, 2012 9:40:33 AM org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
>   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
>   at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
>   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
>   ...
>
> Now if I don't copy the schema and push my Nutch crawl to Solr, I get the
> following error:
>
> SolrIndexer: starting at 2012-11-08 10:49:02
> Indexing 5 documents
> java.io.IOException: Job failed!
> SolrDeleteDuplicates: starting at 2012-11-08 10:49:47
> SolrDeleteDuplicates: Solr url: http://photon:8983/solr/
>
> And this is taken from the logging:
>
> org.apache.solr.common.SolrException: ERROR: [doc=http://e-docs/infrastructure/cpuload_monitor.html] unknown field 'host'
>
> What should I do, or what am I missing?
>
> I hope you can help me.
> Best regards
> Antony