Re: [Virtuoso-users] Virtuoso-users Digest, Vol 108, Issue 10

Haag, Jason Tue, 06 Oct 2015 06:38:37 -0700

Thank you, Kingsley. Where can I obtain a newer version of the Sponger
that includes the HTML (and variants) Cartridge?
-------------------------------------------------------
+1.850.266.7100(office)
+1.850.471.1300 (mobile)
jhaag75 (skype)
http://jasonhaag.com (Web)
http://twitter.com/mobilejson (Twitter)
http://linkedin.com/in/jasonhaag (LinkedIn)




On Tue, Oct 6, 2015 at 8:21 AM,
<virtuoso-users-requ...@lists.sourceforge.net> wrote:
> Today's Topics:
>
>    1. Re: Virtuoso-users Digest, Vol 108, Issue 4 (Haag, Jason)
>    2. Re: Virtuoso-users Digest, Vol 108, Issue 4 (Kingsley Idehen)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 5 Oct 2015 22:58:46 -0500
> From: "Haag, Jason" <jhaa...@gmail.com>
> Subject: Re: [Virtuoso-users] Virtuoso-users Digest, Vol 108, Issue 4
> To: virtuoso-users <virtuoso-users@lists.sourceforge.net>,
>         kide...@openlinksw.com
> Message-ID:
>         <cahjqjnj39f8kfwr9q8k0jjgt7x+nrkjibmpczn_tbe+0aqb...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Kingsley,
>
> In response to your advice, I have a few questions. I recently performed a
> clean install of VOS. I'm running Version: 07.20.3214, Build: Oct 6 2015 on
> Debian + Ubuntu. I checked the RDFa option under cartridges. I didn't see the
> "HTML (and variants)" option that you said to double check. Where do I
> configure the URIBurner options?
>
> Here is a screen capture of my import settings for the crawler job:
> https://docs.google.com/document/d/1Y0Z9b5vBftbgniwmVTp10WT0gblXQKC93ivvvnggPYE/edit?usp=sharing
>
> When I execute a SPARQL query it returns duplicate data:
> http://52.23.175.123:8890/sparql?default-graph-uri=&query=PREFIX+xapi%3A+%3Chttp%3A%2F%2Fpurl.org%2Fxapi%2Fontology%23%3E%0D%0A%0D%0ASELECT+DISTINCT+*%0D%0A%0D%0AWHERE+%7B%0D%0A%0D%0A+++%3FVerb+a+xapi%3AVerb+.%0D%0A%0D%0A%0D%0A%7D%0D%0A&should-sponge=&format=text%2Fhtml&timeout=0&debug=on
>
>
> Are these URIs with an IP address from the sponger? Did I duplicate the
> import data by selecting too many options? Thank you for the support and
> advice. It would be helpful if there were more information about these
> settings/ hatch options.
>
> Kind Regards,
>
> J Haag
>
> SPARQL example: http://52.23.175.123:8890/sparql
>
> PREFIX xapi: <http://purl.org/xapi/ontology#>
>
> SELECT DISTINCT *
>
> WHERE {
>
>    ?Verb a xapi:Verb .
>
>
> }
>
>
>
>
> Your advice was to do the following:
>
> [1] Uncheck "WebDAV" checkbox
>
> [2] Check "Sponger" checkbox -- otherwise "HTML (and variants)" Sponger
> Cartridge won't be invoked (this includes the ability to read RDFa)
> [3] Check "Show Sponger Extractor Cartridges" -- and then check the HTML
> Cartridge.
>
> Also double check the "HTML (and variants)" Cartridge options. You need
> to set: rdfa=yes, in options. Here is a dump of the options used by
> URIBurner:
>
> fallback-mode=no
> *rdfa=yes*
> reify_html5md=1
> reify_rdfa=0
> reify_jsonld=1
> reify_all_grddl=0
> passthrough_mode=yes
> loose=yes
> reify_html=0
> reify_html_misc=0
> reify_turtle=yes
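
A minimal sketch, not from the thread itself: the options dump quoted above is a plain list of key=value lines, so it can be read into a dictionary to confirm that rdfa=yes is present before pasting it into the cartridge options field. The helper name parse_options is hypothetical, and the asterisks around rdfa=yes are treated as email emphasis rather than part of the option.

```python
# Options dump as quoted in the thread. The asterisks around rdfa=yes are
# assumed to be plain-text email emphasis, not part of the option itself.
OPTIONS_DUMP = """\
fallback-mode=no
*rdfa=yes*
reify_html5md=1
reify_rdfa=0
reify_jsonld=1
reify_all_grddl=0
passthrough_mode=yes
loose=yes
reify_html=0
reify_html_misc=0
reify_turtle=yes
"""


def parse_options(dump: str) -> dict[str, str]:
    """Hypothetical helper: turn key=value lines into a dict."""
    options = {}
    for line in dump.splitlines():
        # Drop surrounding whitespace and any emphasis asterisks.
        line = line.strip().strip("*")
        if not line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


options = parse_options(OPTIONS_DUMP)
print(options["rdfa"])
```

This only checks the text of the dump; it does not talk to Virtuoso or change any Sponger setting.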
>
>
> As for the best solution for your goal: this is it, since you can
> schedule your content crawling. Your result should ultimately match:
>
> http://linkeddata.uriburner.com/about/html/http/xapi.vocab.pub/datasets/adl/verbs/index.html
> -- Using /about sponger service.
>> Message: 1
>> Date: Fri, 2 Oct 2015 12:43:18 -0400
>> From: Kingsley Idehen <kide...@openlinksw.com>
>> Subject: Re: [Virtuoso-users] Automating RDF data imports in VIrtuoso
>> To: virtuoso-users@lists.sourceforge.net
>> Message-ID: <560eb426.4060...@openlinksw.com>
>> Content-Type: text/plain; charset="windows-1252"
>>
>> On 9/29/15 10:57 AM, Haag, Jason wrote:
>>> Following up on my original inquiry: I currently have several RDF
>>> datasets available on my server. Each data set has an RDF dump
>>> available as RDF/XML, JSON-LD, and Turtle. These dumps are generated
>>> automatically without virtuoso from an HTML page marked up using RDFa.
>>>
>>> What is the best option for automating the import of this data on a
>>> regular basis into the Virtuoso DB? I would like to automatically
>>> import RDFa data ideally, but RDF/XML or Turtle files would be
>>> fine too. I tried this with the attached settings, but the data
>>> doesn't appear in the database. What do I need to enable or change in
>>> my settings in order to automatically import RDF data? See attached
>>> screen captures. Thanks for any tips or advice!
>>
>> Do the following:
>>
>> [1] Uncheck "WebDAV" checkbox
>> [2] Check "Sponger" checkbox -- otherwise "HTML (and variants)" Sponger
>> Cartridge won't be invoked (this includes the ability to read RDFa)
>> [3] Check "Show Sponger Extractor Cartridges" -- and then check the HTML
>> Cartridge.
>>
>> Also double check the "HTML (and variants)" Cartridge options. You need
>> to set: rdfa=yes, in options. Here is a dump of the options used by
>> URIBurner:
>>
>> fallback-mode=no
>> *rdfa=yes*
>> reify_html5md=1
>> reify_rdfa=0
>> reify_jsonld=1
>> reify_all_grddl=0
>> passthrough_mode=yes
>> loose=yes
>> reify_html=0
>> reify_html_misc=0
>> reify_turtle=yes
>>
>>
>> As for the best solution for your goal: this is it, since you can
>> schedule your content crawling. Your result should ultimately match:
>>
>> http://linkeddata.uriburner.com/about/html/http/xapi.vocab.pub/datasets/adl/verbs/index.html
>> -- Using /about sponger service.
>>
>>
>> --
>> Regards,
>>
>> Kingsley Idehen
>> Founder & CEO
>> OpenLink Software
>> Company Web: http://www.openlinksw.com
>> Personal Weblog 1: http://kidehen.blogspot.com
>> Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
>> Twitter Profile: https://twitter.com/kidehen
>> Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>> Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
>>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 6 Oct 2015 09:21:38 -0400
> From: Kingsley Idehen <kide...@openlinksw.com>
> Subject: Re: [Virtuoso-users] Virtuoso-users Digest, Vol 108, Issue 4
> To: virtuoso-users@lists.sourceforge.net
> Message-ID: <5613cae2.3080...@openlinksw.com>
> Content-Type: text/plain; charset="windows-1252"
>
> On 10/5/15 11:58 PM, Haag, Jason wrote:
>> Hi Kingsley,
>>
>> In response to your advice, I have a few questions. I recently performed
>> a clean install of VOS. I'm running Version: 07.20.3214, Build: Oct 6
>> 2015 on Debian + Ubuntu. I checked the RDFa option under cartridges.
>
> Was this the HTML Extractor Cartridge?
>> I didn't see the "HTML (and variants)" option that you said to double check.
>
> If you are configuring the HTML Extractor Cartridge you would see that
> option.
>
>> Where do I configure the URI burner options?
>
> You are configuring the Virtuoso Sponger. URIBurner is just a public
> facing instance of the Sponger offered as a transformation service.
>
>>
>> Here is a screen capture of my import settings for the crawler job:
>> https://docs.google.com/document/d/1Y0Z9b5vBftbgniwmVTp10WT0gblXQKC93ivvvnggPYE/edit?usp=sharing
>
> That shows your crawler jobs being configured to use 3 Sponger
> cartridges. I can also see that you are using an older version of the
> Sponger which doesn't include the HTML (and variants) Cartridge. That
> cartridge actually replaces all the RDFa variants presented in the old
> interface.
>>
>> When I execute a SPARQL query it returns duplicate
>> data: 
>> http://52.23.175.123:8890/sparql?default-graph-uri=&query=PREFIX+xapi%3A+%3Chttp%3A%2F%2Fpurl.org%2Fxapi%2Fontology%23%3E%0D%0A%0D%0ASELECT+DISTINCT+*%0D%0A%0D%0AWHERE+%7B%0D%0A%0D%0A+++%3FVerb+a+xapi%3AVerb+.%0D%0A%0D%0A%0D%0A%7D%0D%0A&should-sponge=&format=text%2Fhtml&timeout=0&debug=on
>
> Yes, because you have the same data across several internal document
> identifiers (Named Graphs). See:
> http://52.23.175.123:8890/sparql?default-graph-uri=&qtxt=PREFIX+xapi%3A+%3Chttp%3A%2F%2Fpurl.org%2Fxapi%2Fontology%23%3E%0D%0A%0D%0ASELECT+DISTINCT+%3Fg%0D%0A%0D%0AWHERE+%7B+GRAPH+%3Fg+%7B%0D%0A%0D%0A+++%3FVerb+a+xapi%3AVerb+.+%7D%0D%0A%0D%0A%0D%0A%7D%0D%0A&should-sponge=&format=text%2Fhtml&timeout=0&debug=on
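
A small sketch, added for readability, that decodes the percent-encoded link above so the query it runs can be read directly. It uses only Python's standard urllib.parse and does not contact the server. The URL is copied from the message; the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# The link from the reply above, split across string literals for readability.
URL = (
    "http://52.23.175.123:8890/sparql?default-graph-uri="
    "&qtxt=PREFIX+xapi%3A+%3Chttp%3A%2F%2Fpurl.org%2Fxapi%2Fontology%23%3E"
    "%0D%0A%0D%0ASELECT+DISTINCT+%3Fg%0D%0A%0D%0A"
    "WHERE+%7B+GRAPH+%3Fg+%7B%0D%0A%0D%0A"
    "+++%3FVerb+a+xapi%3AVerb+.+%7D%0D%0A%0D%0A%0D%0A%7D%0D%0A"
    "&should-sponge=&format=text%2Fhtml&timeout=0&debug=on"
)

# parse_qs undoes the form encoding ('+' becomes a space, %XX becomes a byte).
params = parse_qs(urlsplit(URL).query)
query_text = params["qtxt"][0]

# Drop the blank lines and indentation left over from the query form.
lines = [ln.strip() for ln in query_text.splitlines() if ln.strip()]
print("\n".join(lines))
```

The decoded text is the earlier query wrapped in a GRAPH ?g { ... } pattern with SELECT DISTINCT ?g, so it lists each named graph that holds at least one xapi:Verb. Several graphs in that list is what makes the same data show up more than once.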
>
>
>>
>>
>> Are these URIs with an IP address from the sponger?
>
> Yes, they are proxy Linked Data URIs, i.e., URIs minted by the Sponger that
> adhere to the 5-Star Linked Data principles.
>
>> Did I duplicate the import data by selecting too many options?
>
> You certainly have many named graphs being created that contain the same
> data.
>
> Kingsley
>
>> Thank you for the support and advice. It would be helpful if there
>> were more information about these settings/ hatch options.
>>
>> Kind Regards,
>>
>> J Haag
>>
>
>
> --
> Regards,
>
> Kingsley Idehen
>
>
> ------------------------------
>
> ------------------------------------------------------------------------------
>
>
> ------------------------------
>
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>
>
> End of Virtuoso-users Digest, Vol 108, Issue 10
> ***********************************************
