Hello, Please remove me from this list. It is not possible to do this at: http://developer.marklogic.com/mailman/listinfo/general
Thanks in advance,
Rob Pauly

2016-06-20 16:52 GMT+02:00 <[email protected]>:

> Hi All,
>
> We ran a test on 15000 XML documents that contain xi:include elements
> (sample given below). The required large XML (hierarchy XML) is generated
> in just *PT23.040552S*. We used the node-expand API to generate the XML,
> whereas our old recursive approach takes more than 30 minutes to perform
> the same operation. Can you please share your thoughts? Is there anything
> else we should consider?
>
> import module namespace xinc = "http://marklogic.com/xinclude" at "/MarkLogic/xinclude/xinclude.xqy";
> xinc:node-expand(fn:doc("/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8"))
>
> Where "/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8" is the URI of the root
> XML document in the hierarchy.
>
> *1- Root object which contains relationships*
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <include href="/data/c525e14d-59d5-4ada-b47d-3d62b69633c8"
>       xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     <include href="/data/12970f40-053d-4f22-8e39-073ca3a17454"
>       xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>   </relationships>
> </object>
>
> *2- Child object which contains further relationships (it is one of the
> children inside the relationships)*
>
> <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="pixelXDimension" type="int">
>       <value>645</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <include href="/data/xyzzqqka-59d5-4ada-b47d-125shydtt2bs"
>       xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>   </relationships>
> </object>
>
> *3- Further child object which contains other relationships*
>
> <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>   <properties>
>     <property name="pixelXDimension" type="int">
>       <value>645</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <include href="/data/abcgdt13-59d5-125a-b47d-425shydtt2bs"
>       xpointer="xpath(/*:object)" xmlns="http://www.w3.org/2001/XInclude"/>
>     ....
>   </relationships>
> </object>
>
> And so on. This is the final XML we want:
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>       <properties>
>         <property name="pixelXDimension" type="int">
>           <value>645</value>
>         </property>
>         .....
>         ....
>       </properties>
>
>       <relationships>
>         <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>           <properties>
>             <property name="pixelXDimension" type="int">
>               <value>645</value>
>             </property>
>             .....
>             ....
>           </properties>
>
>           <relationships>
>             ....
>           </relationships>
>         </object>
>         ....
>       </relationships>
>     </object>
>     ....
>   </relationships>
> </object>
>
> Regards,
> Abhinav
>
> *From:* Mishra, Abhinav Kumar (Cognizant)
> *Sent:* Thursday, June 16, 2016 12:55 PM
> *To:* MarkLogic Developer Discussion
> *Cc:* Singh, Vikas (Cognizant)
> *Subject:* RE: [MarkLogic Dev General] performance issue for creating large xml
>
> Hi Geert,
>
> We are creating an XML document that looks like a hierarchy.
> And once the hierarchy is prepared from small chunks, we use an XSLT to
> transform the hierarchy into another format. The small chunks contain
> metadata for different files.
>
> Currently we have more than *30000* small chunks, and we have to create a
> large XML (hierarchy XML) out of these chunks in memory. The generated
> large XML will be more than *30MB* in size, and this process takes more
> than 45 minutes to complete. So we are looking for a design change. Vikas
> suggested using *xi:include*, so we thought of having a discussion here.
>
> Let me try to explain what we are doing.
>
> *1- Root object which contains relationships*
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <value>c525e14d-59d5-4ada-b47d-3d62b69633c8</value>
>     <value>12970f40-053d-4f22-8e39-073ca3a17454</value>
>     ....
>   </relationships>
> </object>
>
> *2- Child object which contains further relationships (it is one of the
> children inside the relationships)*
>
> <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="pixelXDimension" type="int">
>       <value>645</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <value>xyzzqqka-59d5-4ada-b47d-125shydtt2bs</value>
>     ....
>   </relationships>
> </object>
>
> *3- Further child object which contains other relationships*
>
> <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>   <properties>
>     <property name="pixelXDimension" type="int">
>       <value>645</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <relationships>
>     <value>abcgdt13-59d5-125a-b47d-425shydtt2bs</value>
>     ....
>   </relationships>
> </object>
>
> and so on.
> and at the end we create a large XML that looks like this:
>
> <object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
>   <properties>
>     <property name="myPackage" type="string">
>       <value>somevalue</value>
>     </property>
>     .....
>     ....
>   </properties>
>
>   <contains>
>     <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
>       <properties>
>         <property name="pixelXDimension" type="int">
>           <value>645</value>
>         </property>
>         .....
>         ....
>       </properties>
>
>       <contains>
>         <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
>           <properties>
>             <property name="pixelXDimension" type="int">
>               <value>645</value>
>             </property>
>             .....
>             ....
>           </properties>
>
>           <contains>
>             ....
>           </contains>
>         </object>
>         ....
>       </contains>
>     </object>
>     ....
>   </contains>
> </object>
>
> We then use XSLT to transform this into another format, which we need as
> a business requirement.
>
> Regards
> Abhinav
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Geert Josten
> *Sent:* Thursday, June 16, 2016 10:29 AM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] performance issue for creating large xml
>
> Hi Vikas,
>
> XInclude processing requires building the large XML in memory too,
> regardless of where it will be going. So whether this will work well
> enough for your case depends on how large `large` is.
>
> Kind regards,
> Geert
>
> *From:* <[email protected]> on behalf of "[email protected]" <[email protected]>
> *Reply-To:* MarkLogic Developer Discussion <[email protected]>
> *Date:* Thursday, June 16, 2016 at 4:24 PM
> *To:* "[email protected]" <[email protected]>
> *Subject:* Re: [MarkLogic Dev General] performance issue for creating large xml
>
> Thanks, Geert, for the quick reply.
>
> In the current process we also create the large XML by adding all related
> fragments, but we do not commit this large XML to the database, so we are
> planning to create the XML as below.
>
> <object name="Test" >
>   <!--Some metadata properties -->
>   <relationships>
>     <relationship type="reference">
>       <value>49d7116c24d541aea73328b761cdd89f</value>
>       <xi:include href="/49d7116c24d541aea73328b761cdd89f.xml"
>         xpointer="49d7116c24d541aea73328b761cdd89f" />
>     </relationship>
>   </relationships>
> </object>
>
> As per the above XML, we plan to add one more element, <xi:include>, which
> carries the same reference as the value element but contains the exact
> XPath. So when we want the expanded form, it will be expanded automatically
> based on the XInclude. Will this approach improve our performance? Each
> xi:include will point to different content with the same structure.
>
> Regards,
> Vikas Singh
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Geert Josten
> *Sent:* Thursday, June 16, 2016 7:29 PM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] performance issue for creating large xml
>
> Hi Vikas,
>
> Keep in mind you will be buffering all related fragments in memory while
> building this large XML. It might work out, but it won’t scale well. To
> keep memory usage small and stream through the results, you are better off
> returning all XML chunks without wrapping them in a single large document
> or element node.
>
> Not very elegant, but this would probably work:
>
> "<wrapper>",
> <p>hello world</p>,
> <p>hello world</p>,
> "</wrapper>"
>
> You can replace the p elements with anything that produces results in a
> streaming manner.
>
> Cheers,
> Geert
>
> *From:* <[email protected]> on behalf of "[email protected]" <[email protected]>
> *Reply-To:* MarkLogic Developer Discussion <[email protected]>
> *Date:* Thursday, June 16, 2016 at 3:47 PM
> *To:* "[email protected]" <[email protected]>
> *Subject:* [MarkLogic Dev General] performance issue for creating large xml
>
> Hi All,
>
> In the current design of our project, we create a large XML by combining
> all the small XML chunks into a final result. To achieve this we use
> cts:search, and this search works recursively.
>
> *Example*: We have one XML document that contains metadata and all of its
> references. When we create the final result, we fetch all references and
> the metadata of all references and create one large XML. Child references
> also contain other references, and so on.
>
> This process takes around one hour to create the final result.
>
> Can we change our design and use *XInclude* in all the parent documents,
> so that when we want the final output it is automatically expanded for all
> children, with no need to search the database?
>
> Will this improve our performance when generating the final result?
>
> Regards,
> Vikas Singh
>
> This e-mail and any files transmitted with it are for the sole use of the
> intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient(s), please reply to the
> sender and destroy all copies of the original message. Any unauthorized
> review, use, disclosure, dissemination, forwarding, printing or copying of
> this email, and/or any action taken in reliance on the contents of this
> e-mail is strictly prohibited and may be unlawful.
> Where permitted by applicable law, this e-mail and other e-mail
> communications sent to and from Cognizant e-mail addresses may be monitored.
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
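The two ideas discussed in this thread, recursive XInclude expansion and returning chunks inside a string wrapper instead of one large node, can be modeled outside MarkLogic. What follows is a minimal Python sketch, not how xinc:node-expand is implemented. It assumes a hypothetical in-memory dictionary (`STORE`) in place of the database, made-up URIs, and hypothetical helper names (`expand`, `stream_chunks`). It also ignores the xpointer attribute and treats every include as selecting the root element of its target, which matches the `xpath(/*:object)` pointers in the samples. Python is used only so the sketch runs without a MarkLogic server.

```python
import xml.etree.ElementTree as ET

# Clark-notation tag of an <include> element in the XInclude namespace.
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical in-memory stand-in for the database: URI -> serialized XML.
STORE = {
    "/data/root": (
        '<object name="package" id="root">'
        "<relationships>"
        '<include href="/data/child" xmlns="http://www.w3.org/2001/XInclude"/>'
        "</relationships>"
        "</object>"
    ),
    "/data/child": (
        '<object name="myImage" id="child">'
        "<relationships>"
        '<include href="/data/leaf" xmlns="http://www.w3.org/2001/XInclude"/>'
        "</relationships>"
        "</object>"
    ),
    "/data/leaf": '<object name="thumbnail" id="leaf"><relationships/></object>',
}


def expand(uri, ancestors=()):
    """Return the root element stored at `uri` with includes expanded recursively."""
    if uri in ancestors:
        raise ValueError("include cycle at " + uri)
    root = ET.fromstring(STORE[uri])
    # Snapshot the elements first, because the tree is modified while walking it.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == XI_INCLUDE:
                target = expand(child.get("href"), ancestors + (uri,))
                target.tail = child.tail  # keep any text that followed the include
                parent[index] = target
    return root


def stream_chunks(uris):
    """Yield a string wrapper and each expanded chunk as text, one at a time."""
    yield "<wrapper>"
    for uri in uris:
        # Only one expanded chunk is held in memory at any moment.
        yield ET.tostring(expand(uri), encoding="unicode")
    yield "</wrapper>"


if __name__ == "__main__":
    tree = expand("/data/root")
    print([obj.get("name") for obj in tree.iter("object")])
    print("".join(stream_chunks(["/data/child", "/data/leaf"])))
```

`expand` holds the whole hierarchy in memory, which is the cost Geert points out. `stream_chunks` follows his wrapper suggestion: the caller receives text pieces and never holds more than one expanded chunk, at the price of giving up a single result node.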
