Hi All,

We ran a test on 15,000 XML documents that contain xi:include elements (sample
given below). The required large XML (hierarchy XML) is generated in just
PT23.040552S, i.e. about 23 seconds. We used the node-expand API to generate
the XML, whereas our old recursive approach takes more than 30 minutes to
perform the same operation. Do you have any thoughts on this? Is there anything
else we should consider?

(: Load MarkLogic's XInclude library module. :)
import module namespace xinc = "http://marklogic.com/xinclude"
    at "/MarkLogic/xinclude/xinclude.xqy";
(: Expand every xi:include reachable from the root document. :)
xinc:node-expand(fn:doc("/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8"))

Here "/data/d14d44ec-59d5-4ada-b47d-3d62b69633c8" is the URI of the root XML
document in the hierarchy.

1- Root object which contains relationships
<object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="myPackage" type="string">
            <value>somevalue</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <include href="/data/c525e14d-59d5-4ada-b47d-3d62b69633c8"
                 xpointer="xpath(/*:object)"
                 xmlns="http://www.w3.org/2001/XInclude"/>
        <include href="/data/12970f40-053d-4f22-8e39-073ca3a17454"
                 xpointer="xpath(/*:object)"
                 xmlns="http://www.w3.org/2001/XInclude"/>
        ....
    </relationships>
</object>

2- Child object, which contains further relationships (it is one of the
children inside the root object's relationships)

<object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="pixelXDimension" type="int">
            <value>645</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <include href="/data/xyzzqqka-59d5-4ada-b47d-125shydtt2bs"
                 xpointer="xpath(/*:object)"
                 xmlns="http://www.w3.org/2001/XInclude"/>
        ....
    </relationships>
</object>


3- Further Child object which contains other relationships

<object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
    <properties>
        <property name="pixelXDimension" type="int">
            <value>645</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <include href="/data/abcgdt13-59d5-125a-b47d-425shydtt2bs"
                 xpointer="xpath(/*:object)"
                 xmlns="http://www.w3.org/2001/XInclude"/>
        ....
    </relationships>
</object>

And so on. The final XML that we want:

<object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="myPackage" type="string">
            <value>somevalue</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
            <properties>
                <property name="pixelXDimension" type="int">
                    <value>645</value>
                </property>
                .....
                ....
            </properties>

            <relationships>
                <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
                    <properties>
                        <property name="pixelXDimension" type="int">
                            <value>645</value>
                        </property>
                        .....
                        ....
                    </properties>

                    <relationships>
                        ....
                    </relationships>
                </object>
                ....
            </relationships>
        </object>
        ....
    </relationships>
</object>
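
For anyone who wants to reason about the expansion without a MarkLogic instance, the recursion illustrated by the samples above can be sketched in Python. This is only an illustration of the idea, not how xinc:node-expand is implemented. The DOCS dictionary, the expand function and the shortened URIs are hypothetical, and the xpointer attribute is ignored on the assumption that each target's root element is the object to include.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Hypothetical in-memory stand-in for the database: URI -> serialized document.
DOCS = {
    "/data/root": (
        '<object name="package" id="root"><relationships>'
        '<include href="/data/child" xmlns="http://www.w3.org/2001/XInclude"/>'
        '</relationships></object>'
    ),
    "/data/child": (
        '<object name="myImage" id="child"><relationships>'
        '<include href="/data/leaf" xmlns="http://www.w3.org/2001/XInclude"/>'
        '</relationships></object>'
    ),
    "/data/leaf": '<object name="thumbnail" id="leaf"><relationships/></object>',
}


def expand(uri, docs, ancestors=()):
    """Return the document at `uri` with every include replaced by its target."""
    if uri in ancestors:
        # A document that includes one of its own ancestors would never finish.
        raise ValueError("include cycle at " + uri)
    root = ET.fromstring(docs[uri])
    _expand_children(root, docs, ancestors + (uri,))
    return root


def _expand_children(parent, docs, ancestors):
    """Walk the children of `parent`, swapping include elements in place."""
    for index, child in enumerate(list(parent)):
        if child.tag == XI_INCLUDE:
            replacement = expand(child.get("href"), docs, ancestors)
            replacement.tail = child.tail  # keep the surrounding whitespace
            parent[index] = replacement
        else:
            _expand_children(child, docs, ancestors)


expanded = expand("/data/root", DOCS)
print(ET.tostring(expanded, encoding="unicode"))
```

Each document is parsed once per place it is referenced, and the whole result is held in memory as one tree, so the size of the final document matters here too.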

Regards,
Abhinav

From: Mishra, Abhinav Kumar (Cognizant)
Sent: Thursday, June 16, 2016 12:55 PM
To: MarkLogic Developer Discussion
Cc: Singh, Vikas (Cognizant)
Subject: RE: [MarkLogic Dev General] performance issue for creating large xml

Hi Geert,
We are creating an XML document that is structured as a hierarchy. Once the
hierarchy has been assembled from small chunks, we use an XSLT to transform it
into another format. The small chunks contain metadata for various files.
Currently we have more than 30,000 small chunks, and we have to build a large
XML (hierarchy XML) out of these chunks in memory. The generated large XML
will be more than 30 MB in size, and the process takes more than 45 minutes to
complete. So we are looking for a design change. Vikas suggested using
xi:include, so we thought of having a discussion here.

Let me try to explain what we are doing.

1- Root object which contains relationships
<object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="myPackage" type="string">
            <value>somevalue</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <value>c525e14d-59d5-4ada-b47d-3d62b69633c8</value>
        <value>12970f40-053d-4f22-8e39-073ca3a17454</value>
        ....
    </relationships>
</object>

2- Child object, which contains further relationships (it is one of the
children inside the root object's relationships)

<object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="pixelXDimension" type="int">
            <value>645</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <value>xyzzqqka-59d5-4ada-b47d-125shydtt2bs</value>
        ....
    </relationships>
</object>


3- Further Child object which contains other relationships

<object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
    <properties>
        <property name="pixelXDimension" type="int">
            <value>645</value>
        </property>
        .....
        ....
    </properties>

    <relationships>
        <value>abcgdt13-59d5-125a-b47d-425shydtt2bs</value>
        ....
    </relationships>
</object>

And so on. At the end we create a large XML that looks like this:

<object name="package" id="d14d44ec-59d5-4ada-b47d-3d62b69633c8">
    <properties>
        <property name="myPackage" type="string">
            <value>somevalue</value>
        </property>
        .....
        ....
    </properties>

    <contains>
        <object name="myImage" id="c525e14d-59d5-4ada-b47d-3d62b69633c8">
            <properties>
                <property name="pixelXDimension" type="int">
                    <value>645</value>
                </property>
                .....
                ....
            </properties>

            <contains>
                <object name="thumbnail" id="xyzzqqka-59d5-4ada-b47d-125shydtt2bs">
                    <properties>
                        <property name="pixelXDimension" type="int">
                            <value>645</value>
                        </property>
                        .....
                        ....
                    </properties>

                    <contains>
                        ....
                    </contains>
                </object>
                ....
            </contains>
        </object>
        ....
    </contains>
</object>

We then use XSLT to transform this hierarchy into another format, which is a
business requirement.


Regards
Abhinav

From: [email protected] On Behalf Of Geert Josten
Sent: Thursday, June 16, 2016 10:29 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] performance issue for creating large xml

Hi Vikas,

XInclude processing requires building the large XML in memory too, regardless
of where it will be going. So whether this will work well enough for your case
depends on how large `large` is.

Kind regards,
Geert

From: <[email protected]> on behalf of "[email protected]" <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, June 16, 2016 at 4:24 PM
To: "[email protected]" <[email protected]>
Subject: Re: [MarkLogic Dev General] performance issue for creating large xml

Thanks, Geert, for the quick reply.

In the current process we also create the large XML by adding all related
fragments, but we do not commit this large XML to the database. So we are
planning to create the XML as below.

<object name="Test" xmlns:xi="http://www.w3.org/2001/XInclude">
  <!--Some metadata properties -->
  <relationships>
    <relationship type="reference">
      <value>49d7116c24d541aea73328b761cdd89f</value>
      <xi:include href="/49d7116c24d541aea73328b761cdd89f.xml"
                  xpointer="49d7116c24d541aea73328b761cdd89f" />
    </relationship>
  </relationships>
</object>

As shown in the XML above, we are planning to add an <xi:include> element next
to each value element. It refers to the same object as the value element but
carries the exact path to it, so when we want the expanded form, the includes
are expanded automatically. Will this approach improve our performance? Each
xi:include will point to different content with the same structure.
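
The plan described above, adding an include next to each existing value, could be prototyped outside the database along these lines. This is a rough Python sketch under stated assumptions, not MarkLogic code. The add_includes function and the uri_for_id callback are hypothetical names, and the xpointer value is the one used in the other include samples in this thread rather than the bare id shown in the XML above.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"
ET.register_namespace("xi", XI_NS)  # serialize the namespace with the xi prefix


def add_includes(document, uri_for_id):
    """Return `document` with an xi:include added for each relationship value.

    `uri_for_id` is a hypothetical callback that maps an object id to the URI
    of the document holding that object.
    """
    root = ET.fromstring(document)
    # Materialize the list first so the tree is not modified while iterating.
    for relationship in list(root.iter("relationship")):
        for value in relationship.findall("value"):
            include = ET.SubElement(relationship, "{%s}include" % XI_NS)
            include.set("href", uri_for_id(value.text.strip()))
            # Assumption: the target's root element is the object to include.
            include.set("xpointer", "xpath(/*:object)")
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<object name="Test"><relationships>'
    '<relationship type="reference">'
    '<value>49d7116c24d541aea73328b761cdd89f</value>'
    '</relationship></relationships></object>'
)

updated = add_includes(SAMPLE, lambda object_id: "/" + object_id + ".xml")
print(updated)
```

Keeping the value element alongside the include means existing code that reads the ids keeps working while the include is being introduced.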

Regards,
Vikas Singh
From: [email protected] On Behalf Of Geert Josten
Sent: Thursday, June 16, 2016 7:29 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] performance issue for creatign large xml

Hi Vikas,

Keep in mind you will be buffering all related fragments in memory while 
building this large XML. It might work out, but it won't scale well. To allow 
keeping memory usage small, and streaming through the results, you are better 
off returning all xml chunks without wrapping them in a single large document 
or element node.

Not very elegant, but this would probably work:

"<wrapper>",
<p>hello world</p>,
<p>hello world</p>,
"</wrapper>"

You can replace the p elements with anything that produces results in a
streaming manner.
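
The same idea, emitting the wrapper tags as plain strings around a lazily produced sequence, can be sketched in Python with a generator. This is a minimal illustration, not MarkLogic code. The names stream_wrapped and load_chunks are hypothetical, and load_chunks stands in for whatever reads the documents one at a time.

```python
from typing import Iterable, Iterator


def stream_wrapped(chunks: Iterable[str], wrapper: str = "wrapper") -> Iterator[str]:
    """Yield an opening tag, each serialized chunk, then a closing tag.

    Nothing is accumulated, so only one chunk is held at a time.
    """
    yield "<" + wrapper + ">"
    for chunk in chunks:
        yield chunk
    yield "</" + wrapper + ">"


def load_chunks(count: int) -> Iterator[str]:
    """Hypothetical chunk source; a real one would read one document at a time."""
    for number in range(count):
        yield '<object id="' + str(number) + '"/>'


# The consumer pulls pieces one by one, for example to write them to a response.
for piece in stream_wrapped(load_chunks(3)):
    print(piece)
```

The trade-off is that the output is never a single node, so anything that needs the whole tree at once, such as an XSLT over the full hierarchy, cannot run on it directly.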

Cheers,
Geert

From: <[email protected]> on behalf of "[email protected]" <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, June 16, 2016 at 3:47 PM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] performance issue for creatign large xml

Hi All,

In the current design of our project, we create a large XML by combining all
the small XML chunks into the final result. To achieve this we use cts:search,
and the search runs recursively.

Example: we have one XML document that contains metadata and all of its
references. When we create the final result, we fetch all the references and
the metadata of each reference and create one large XML. Child references also
contain other references, and so on.

This process takes around one hour to create the final result.

Can we change our design and use XInclude in all the parent documents, so that
when we want the final output it is expanded automatically for all children,
with no need to search the database?
Will this improve the performance of generating the final result?

Regards,
Vikas Singh

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful. Where permitted by applicable law, this e-mail 
and other e-mail communications sent to and from Cognizant e-mail addresses may 
be monitored.
_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
