On 30/11/11 00:46, David Houlden wrote:
On Wednesday 30 November 2011 00:28:10 Allan wrote:
On 29/11/11 23:52, Allan wrote:
On 29/11/11 20:05, Jack wrote:
On 2011.11.29 10:35, Allan wrote:
On 29/11/11 15:03, David Houlden wrote:
I downloaded a csv bank statement from my bank today and ran it
through the new CSV importer. I found one problem which is maybe a
little unusual and is down to the way my bank formatted the file. I
have attached an edited version of the file which contains the first
record showing the headers and the data record which caused a problem.

The 5th field on the record in question is the Transaction
Description and it contains a comma. That part of the field is
however contained within double quote characters. The unusual feature
is that only part of the field is contained in double quotes. The
back end (which doesn't contain a comma) is not. Importing this file
confuses the CSV importer and results in the whole of the record from
the Transaction Description onwards being put into the Transaction
Description field even though there are other fields delimited by
comma.

I appreciate that this may be unusual data but I would expect that
each record should be broken into fields using the field delimiter
but ignoring any field delimiters which are contained within double
quotes (the text delimiter). Indeed, this file imports as I would
expect into LibreOffice Calc.
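
For illustration, the behaviour described above (split on the field delimiter, but ignore any delimiter that sits between a pair of text delimiters, even when the quotes cover only part of the field) can be sketched as a single pass over the line. This is a minimal sketch in plain standard C++, not the plugin's Qt code; the function name splitQuoteAware and the sample record are made up for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of KMyMoney): split one CSV line on `delim`,
// ignoring any delimiter found between a pair of `quote` characters.
// Quote characters are kept in the output fields.
std::vector<std::string> splitQuoteAware(const std::string& line,
                                         char delim = ',',
                                         char quote = '"')
{
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (char c : line) {
        if (c == quote) {
            inQuotes = !inQuotes;        // entering or leaving a quoted run
            current += c;
        } else if (c == delim && !inQuotes) {
            fields.push_back(current);   // a real field boundary
            current.clear();
        } else {
            current += c;                // ordinary character, or a quoted delimiter
        }
    }
    fields.push_back(current);           // last field has no trailing delimiter
    return fields;
}
```

With an invented record such as 30/11/2011,DEB,"SHOP, TOWN" REF123,12.50 this yields four fields, the third being "SHOP, TOWN" REF123 with its quotes and trailing text intact.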

If required I will raise a bug report for this.

I'll have a look at that. I do expect to deal with quotes, but the UI
has undergone a significant upheaval and it may be there's something
I've missed.

I doubt the quotes are the problem; it's likely the fact that there is
something after the close quote before the next comma. What library are
you using to parse the file, or are you doing it manually? It's probably
expecting the next comma immediately after the closing quote, and either
getting confused or generating an error at that point.

I had to write a routine a while ago to deal with the converse problem,
to detect where a "quoted" string has been erroneously split, because of
a comma, or, in a value, a 'thousand separator' being mistaken for a
field delimiter.
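
As a rough illustration of that kind of routine: split naively on the field delimiter first, then glue neighbouring pieces back together while a piece still has an unclosed quote. The sketch below uses plain standard C++ rather than Qt, and it tests for an odd number of quote characters, which is an assumption made for the example and not the condition used in csvutil.cpp. The name splitThenRejoin is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not the KMyMoney routine): naive split, then rejoin
// pieces that were wrongly separated at a delimiter inside quotes.
std::vector<std::string> splitThenRejoin(const std::string& line,
                                         char delim = ',',
                                         char quote = '"')
{
    // Step 1: split on every delimiter, quoted or not.
    std::vector<std::string> pieces;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = line.find(delim, start)) != std::string::npos) {
        pieces.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    pieces.push_back(line.substr(start));

    // Step 2: an odd number of quotes means the piece ends inside a quoted
    // run, so append the next piece (restoring the delimiter) until balanced.
    std::vector<std::string> fields;
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        std::string field = pieces[i];
        while (std::count(field.begin(), field.end(), quote) % 2 != 0
               && i + 1 < pieces.size()) {
            field += delim;
            field += pieces[++i];
        }
        fields.push_back(field);
    }
    return fields;
}
```

Counting quotes instead of checking only the first and last character means a value like "1,234.56" and a partly quoted field like "SHOP, TOWN" REF are both rejoined by the same test.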

I've now added a check to deal with this issue and it works. But, as I
have to do both checks, the first is conflicting with the new one.

So, a bit more to do yet. I don't really want to start from scratch, but
may have to.

Allan

I just needed to make a slight change to the original routine, and now
both conditions are being caught.

@Dave

I don't know if you'd like to try a patch before I commit, as you have
the original culprit file?

Sure. Send it and I'll try it out.

Dave.

I think this should work for you, Dave.

It's reassuring to know others are trying the plugin, and not just me!

Allan

From 07684d2ffee02d95b71acb56b4cd04d1fa606d86 Mon Sep 17 00:00:00 2001
From: Allan Anderson <agande...@gmail.com>
Date: Wed, 30 Nov 2011 11:02:04 +0000
Subject: [PATCH] Deal with case where a 'field separator' is within quotes and the quotes don't include the whole of the field.

---
 kmymoney/plugins/csvimport/csvutil.cpp |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/kmymoney/plugins/csvimport/csvutil.cpp b/kmymoney/plugins/csvimport/csvutil.cpp
index c6146ba..1c5829b 100644
--- a/kmymoney/plugins/csvimport/csvutil.cpp
+++ b/kmymoney/plugins/csvimport/csvutil.cpp
@@ -60,20 +60,21 @@ QStringList Parse::parseLine(const QString& data)
   listIn = m_inBuffer.split(m_fieldDelimiterCharacter);// firstly, split on m_fieldDelimiterCharacter
 
   QStringList::const_iterator constIterator;
-
-  for(constIterator = listIn.constBegin(); constIterator < listIn.constEnd();
-      ++constIterator) {
+  
+  for(constIterator = listIn.constBegin(); constIterator < listIn.constEnd(); ++constIterator) {
     txt = (*constIterator);
+    
     // detect where a "quoted" string has been erroneously split, because of a comma,
-    // or in a value, a 'thousand separator' being mistaken for a field delimitor.
+    // or in a value, a 'thousand separator' being mistaken for a field delimiter.
+    // Also, where a 'field separator' is within quotes and the quotes don't include the whole of the field.
 
-    while((txt.startsWith(m_textDelimiterCharacter)) && (!txt.endsWith(m_textDelimiterCharacter)))  {
+    while((txt.startsWith(m_textDelimiterCharacter)) && (!txt.mid(1,-1).contains(m_textDelimiterCharacter)))  {
       if(++constIterator < listIn.constEnd())  {
         txt1 = (*constIterator);//                       second part of the split string
         txt += m_fieldDelimiterCharacter + txt1;//       rejoin the string
       } else break;
     }
-    listOut += txt.remove(m_textDelimiterCharacter);
+    listOut += txt;
   }
   return listOut;
 }
-- 
1.7.4.1

_______________________________________________
KMyMoney-devel mailing list
KMyMoney-devel@kde.org
https://mail.kde.org/mailman/listinfo/kmymoney-devel
