On 30/11/11 12:29, David Houlden wrote:
On Wednesday 30 November 2011 11:08:35 Allan wrote:
On 30/11/11 00:46, David Houlden wrote:
On Wednesday 30 November 2011 00:28:10 Allan wrote:
On 29/11/11 23:52, Allan wrote:
On 29/11/11 20:05, Jack wrote:
On 2011.11.29 10:35, Allan wrote:
On 29/11/11 15:03, David Houlden wrote:
I downloaded a csv bank statement from my bank today and ran it
through the new CSV importer. I found one problem which is maybe a
little unusual and is down to the way my bank formatted the file. I
have attached an edited version of the file which contains the
first record showing the headers and the data record which caused a
problem.

The 5th field on the record in question is the Transaction
Description and it contains a comma. That part of the field is
however contained within double quote characters. The unusual
feature is that only part of the field is enclosed in double
quotes; the back end of the field (which doesn't contain a comma) is not.
Importing this file confuses the CSV importer and results in the
whole of the record from the Transaction Description onwards being
put into the Transaction Description field even though there are
other fields delimited by comma.

I appreciate that this may be unusual data but I would expect that
each record should be broken into fields using the field delimiter
but ignoring any field delimiters which are contained within double
quotes (the text delimiter). Indeed, this file imports as I would
expect into LibreOffice Calc.
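For reference, the splitting rule Dave describes ("break each record on the field delimiter, but ignore delimiters inside the text delimiter") can be sketched in a few lines. The importer itself is C++/Qt, so this is only an illustrative Python sketch with made-up sample data, and it deliberately ignores escaped quotes (""):

```python
def split_csv_line(line, delimiter=",", quote='"'):
    """Split one CSV record on the delimiter, ignoring delimiters
    inside quoted sections. Quotes may enclose only part of a
    field, as in: PAY,"SMITH, J"SALARY,123.45
    Note: does not handle doubled ("") escaped quotes."""
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == quote:
            in_quotes = not in_quotes  # toggle; quote chars are dropped
        elif ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields
```

With a partially quoted field, the comma inside the quotes is kept as data and the trailing unquoted text stays part of the same field, which matches how LibreOffice Calc reads the file.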

If required I will raise a bug report for this.

I'll have a look at that. I do expect to deal with quotes, but the UI
has undergone a significant upheaval and it may be there's something
I've missed.

I doubt the quotes are the problem; it's likely the fact that there is
something after the close quote before the next comma. What library
are you using to parse the file, or are you doing it manually? It's
probably expecting the next comma immediately after the closing
quote, and either getting confused or generating an error at that
point.
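As an aside, Python's csv module shows both reactions Jack describes when text follows the closing quote (the sample line here is hypothetical): the default lenient mode appends the trailing text to the field, while strict mode raises an error at that point.

```python
import csv

line = 'a,"b, c"d,e'  # text ("d") follows the closing quote

# Lenient (default): the trailing text is appended to the quoted field.
row = next(csv.reader([line]))
# row == ['a', 'b, cd', 'e']

# Strict: the parser expects the delimiter immediately after the
# closing quote and raises csv.Error instead.
try:
    next(csv.reader([line], strict=True))
except csv.Error as err:
    print("strict parser refused:", err)
```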

I had to write a routine a while ago to deal with the converse problem:
detecting where a "quoted" string has been erroneously split because a
comma, or a 'thousand separator' in a value, was mistaken for a field
delimiter.
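That converse fix-up, rejoining fields that a naive split broke apart at a comma inside quotes, might look something like the following. This is a hypothetical Python sketch, not Allan's actual KMyMoney routine (which is C++); it rejoins on unbalanced quote counts:

```python
def rejoin_split_fields(fields, delimiter=",", quote='"'):
    """Re-join fields that were wrongly split because a delimiter
    inside a quoted section (e.g. "SMITH, J" or "1,234.56") was
    treated as a field separator by a naive split."""
    out = []
    buf = None  # accumulates a quoted field that was split apart
    for f in fields:
        if buf is None:
            if f.count(quote) % 2 == 1:  # opening quote with no close yet
                buf = f
            else:
                out.append(f)
        else:
            buf += delimiter + f  # restore the delimiter that was data
            if f.count(quote) % 2 == 1:  # closing quote found
                out.append(buf)
                buf = None
    if buf is not None:  # unterminated quote: keep what we have
        out.append(buf)
    return out
```

For example, a naive split of 'a,"1,234.56",b' yields four pieces, and this routine puts the value back together as one field.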

I've now done an addition to deal with this issue and it works... But,
as I have to do both checks, the first conflicts with the new one.

So, a bit more to do yet. I don't really want to go from scratch, but
may have to.

Allan

I just needed to make a slight change to the original routine, and now
both conditions are being caught.

@Dave

I don't know if you'd like to try a patch, before I commit, as you have
the original culprit file?

Sure. Send it and I'll try it out.

Dave.

I think this should work for you, Dave.

It's reassuring to know others are trying the plugin, and not just me!

Allan

Thanks Allan. The patch seems to have fixed the problem I was seeing but has
introduced another problem. With the patch, quotes around a field or part of a
field are not removed from the imported data.

Dave

Hi Dave

<Big grin>

I deliberately left the quotes because they were in the middle of that field and I thought you might attach some significance to them!

It shouldn't be a problem to clear them, and all other quotes as previously. I'll send another patch later.

Allan

_______________________________________________
KMyMoney-devel mailing list
KMyMoney-devel@kde.org
https://mail.kde.org/mailman/listinfo/kmymoney-devel
