I think basically you don't want to hold the whole file in memory; there is no reason to. Try the code I provided, but instead of outputting each line just output a counter, e.g. 1 2 3 4 5 6 7, and see if it barfs at the same line number.
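Roughly what I mean, as a minimal sketch in plain Java rather than CFML, using the same java.io.FileReader and java.io.BufferedReader classes as the snippet I posted earlier. The class name LineCounter, the method countLines and the reportEvery parameter are made up for illustration; nothing here comes from your script or from OpenBD.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

final class LineCounter {

    // Reads the file one line at a time and returns how many lines were read.
    // Each line is discarded as soon as it has been counted, so nothing is
    // accumulated across iterations.
    // reportEvery is a hypothetical knob: print the running count every N lines
    // (0 or less disables progress output).
    static long countLines(String fileName, long reportEvery) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            // readLine() returns null once the end of the file is reached
            while (reader.readLine() != null) {
                count++;
                if (reportEvery > 0 && count % reportEvery == 0) {
                    System.out.println(count);
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to scan
        System.out.println("total lines: " + countLines(args[0], 1000000L));
    }
}
```

If a loop like this gets through all ~12 million lines without the memory creeping up, the reading itself is probably fine and the growth is more likely coming from whatever is done with each line (string concatenation, xmlparse, the queries). If it dies at about the same line count, that points back at the read.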
A

On 12 January 2012 19:09, Aaron J. White wrote:

> midstring and split taken from cflib
>
> http://www.cflib.org/udf/MidString
> http://www.cflib.org/udf/split
>
> On Jan 12, 1:03 pm, "Aaron J. White" wrote:
> > Not really.
> >
> > <cfset locals.startOfTitle = "<example_node>" />
> > <cfset locals.endOfTitle = "</example_node>" />
> >
> > <cfloop index="locals.line" file="#locals.absFilePath#">
> >   <cfif locals.line DOES NOT CONTAIN locals.endOfTitle>
> >     <!--- add line to titleitem --->
> >     <cfset locals.titleItem &= locals.line />
> >     <cfset application.import.lineCount += 1 />
> >     <cfif application.import.stop>
> >       <cfabort />
> >     </cfif>
> >   <cfelse>
> >     <cfset locals.titleItem &= locals.line />
> >     <cfset application.import.lineCount += 1 />
> >     <!--- we hit the end of a title. first get extra chars from back.
> >           we'll need those later --->
> >     <cfset locals.tempArr = application.utility.split(locals.titleItem, locals.endOfTitle) />
> >     <cfset locals.tempItem = locals.tempArr[arraylen(locals.tempArr)] & "" />
> >     <!--- now get everything in middle of nodes --->
> >     <cfset locals.titleItem = locals.startOfTitle & application.utility.midstring(locals.titleItem, locals.startOfTitle, locals.endOfTitle) & locals.endOfTitle />
> >     <!--- convert title item to xml object --->
> >     <cfset locals.titleXml = xmlparse(locals.titleItem) />
> >     <!--- we have our node. prepare titleItem text for next iteration --->
> >     <cfset locals.titleItem = locals.tempItem />
> >     <cfif application.import.stop>
> >       <cfabort />
> >     <cfelse>
> >       <!--- process the title xml and add required info to the database --->
> >       <cfset processTitleItem(locals.titleXml) />
> >     </cfif>
> >   </cfif>
> > </cfloop>
> >
> > On Jan 12, 12:43 pm, Alex Skinner wrote:
> > > Seeing some code would be good. How are you doing the read?
> > >
> > > I googled and found something like this:
> > >
> > > <cfscript>
> > > // Define the file to read, use forward slashes only
> > > FileName="C:/Example/ReadMe.txt";
> > > // Initialize Java File IO
> > > FileIOClass=createObject("java","java.io.FileReader");
> > > FileIO=FileIOClass.init(FileName);
> > > LineIOClass=createObject("java","java.io.BufferedReader");
> > > LineIO=LineIOClass.init(FileIO);
> > > </cfscript>
> > >
> > > <CFSET EOF=0>
> > > <CFLOOP condition="NOT EOF">
> > >   <!--- Read in next line --->
> > >   <CFSET CurrLine=LineIO.readLine()>
> > >   <!--- If CurrLine is not defined, we have reached the end of file --->
> > >   <CFIF IsDefined("CurrLine") EQ "NO">
> > >     <CFSET EOF=1>
> > >     <CFBREAK>
> > >   </CFIF>
> > >   <CFOUTPUT>#CurrLine#<br></CFOUTPUT><CFFLUSH>
> > > </CFLOOP>
> > >
> > > Is your solution similar?
> > >
> > > A
> > >
> > > On 12 January 2012 17:57, Aaron J. White wrote:
> > > > Hey all,
> > > >
> > > > I am receiving an OutOfMemory error while running a script that is
> > > > trying to loop over a 1.2gb+ xml file (~ 12 million lines). I'm not
> > > > really sure if what I am doing is just horrible and there is a better
> > > > way or if it is a memory issue in openbd.
> > > >
> > > > I have assigned tomcat 2gb max memory. While I'm running the script I
> > > > can see the memory usage slowly creep up in task manager. With 4gb of
> > > > ram on the vps I get to about 7 million lines before tomcat gives up.
> > > > When I had 3gb of ram on the server and 1gb applied to Tomcat I could
> > > > only get to about 4 million lines.
> > > >
> > > > Here's the logic behind what I am doing.
> > > >
> > > > I am interested in one particular node in the large file so I loop
> > > > over the file line by line. As I loop, if the line does not contain the
> > > > end of the node I'm looking for then I
> > > > <cfset locals.exampleNode &= locals.line />
> > > > Once I hit a line that contains the end of the node
> > > > (</example_node>), I do a few operations to clean up any extra text
> > > > from the front and back of the node string and then convert it to xml
> > > > with xmlparse.
> > > >
> > > > Once I have the node as xml I push it to another function that does
> > > > several things.
> > > > ** uses xpath to grab particular information from the node. Seven
> > > > xpath searches are done on each node unless I decide to skip the node
> > > > after the first two xpath searches.
> > > > ** Depending on the content I either add the information to my
> > > > database, update the information, or skip it. I have about 5 tables
> > > > that are getting modified from the script. A few of the unimportant
> > > > queries use background="yes".
> > > > The whole script runs in a cfthread so it doesn't time out.
> > > >
> > > > Can anyone give any insight? Also, I could post some code example, but
> > > > my script is about 600 lines long.
> > > >
> > > > --
> > > > online documentation: http://openbd.org/manual/
> > > > google+ hints/tips: https://plus.google.com/115990347459711259462
> > > > http://groups.google.com/group/openbd?hl=en
> > > >
> > > > Join us @ http://www.OpenCFsummit.org/ Dallas, Feb 2012
> > >
> > > --
> > > Alex Skinner
> > > Managing Director
> > > Pixl8 Interactive
> > >
> > > Tel: +448452600726
> > > Web: pixl8.co.uk

--
Alex Skinner
Managing Director
Pixl8 Interactive

Tel: +448452600726
Web: pixl8.co.uk
