Re: [Rd] sprintf, check number of parameters

2021-02-25 Thread Tomas Kalibera

On 2/22/21 11:34 AM, Duncan Murdoch wrote:
This is ugly, but I think it's legal, and it doesn't trigger a 
warning: output unused parameters as zero-length strings:


 msnx(T0, mask = '%1$.1f (SD=%2$.1f)%3$.0s%4$.0s')

Perhaps an example using %.0s could be included to show how to skip a 
value.


Thanks, I've added a sentence to that effect.
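
For illustration, a minimal sketch of the %.0s approach suggested above. The values are made up and stand in for the statistics computed inside msnx; they are not taken from the T0 data.

```r
# Hypothetical values standing in for mean, SD, n and number of NAs
m  <- 30.7
s  <- 4.7
n  <- 104L
na <- 0L

# Full mask: all four arguments are printed
full <- sprintf('%1$.1f (SD=%2$.1f, n=%3$i, NA=%4$i)', m, s, n, na)

# Short mask: arguments 3 and 4 are consumed by %.0s, which prints
# them as zero-length strings, so every argument is referenced
short <- sprintf('%1$.1f (SD=%2$.1f)%3$.0s%4$.0s', m, s, n, na)

print(full)   # "30.7 (SD=4.7, n=104, NA=0)"
print(short)  # "30.7 (SD=4.7)"
```

Since every argument is referenced by the format, nothing is left over for the unused-argument check to report.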

Best
Tomas



Duncan Murdoch

On 22/02/2021 5:06 a.m., Tomas Kalibera wrote:

Dear Matthias,

On 2/6/21 2:11 PM, Matthias Gondan wrote:

Dear developers,

This is a follow-up from an earlier mail about warnings of unused 
arguments in sprintf:


1. This should obviously raise an error (and it does):
sprintf('%i %i', 1)
Error in sprintf("%i %i", 1) : too few arguments


2. This should, in my opinion, raise a warning about an unused 
argument (and I think it does now in R-devel):

sprintf('%i', 1, 2)

Yes, it does.

3. From the conversation below, it seems that this also raises a 
warning (in R-devel):

sprintf('%1$i', 1, 2)

Yes, it does as well.

I think that one should be suppressed. When I reported this a few 
months ago, I didn’t really have a use case for (3), but now I think 
I have found something. Suppose I have a function that calculates 
some descriptive statistics, mean, sd, available cases, missings, 
something like the one below:


msnx = function(x, mask='%1$.1f (SD=%2$.1f, n=%3$i, NA=%4$i)')
{
    m = mean(x, na.rm=TRUE)
    s = sd(x, na.rm=TRUE)
    n = sum(!is.na(x))
    na = sum(is.na(x))
    sprintf(mask, m, s, n, na)
}

The mask is meant to help formatting it a bit.

msnx(T0)
[1] "30.7 (SD=4.7, n=104, NA=0)"

Now I want a „less detailed“ summary, so I invoke the function with 
something like


msnx(T0, mask='%1$.1f (SD=%2$.1f)')
[1] "30.7 (SD=4.7)"

In my opinion, in the last example, sprintf should not raise the 
warning in (2) if all arguments in the mask are „dollared“. I am 
still a bit unsure since the example uses a function that calculates 
things that aren’t being used (n and na), and this could be 
considered bad programming style. But there might be other use 
cases, and it is, nevertheless, a deliberate choice to skip 
arguments 3$ and 4$.


Thanks for the example. I am sympathetic with your concerns about the
programming style in it: the caller needs to know exactly how "mask"
will be used, that it would be in a call to sprintf() and what would be
the indices of the arguments.

The warning was introduced a while ago and there has not been any
report yet that it would break existing good-style code (in particular,
CRAN packages have been tested extensively), which indicates that
currently the R code base does not rely on unused $-arguments.

I hence think it is wise to keep the warning, to prevent the R code base
from relying on unused $-arguments in the future, because gcc/clang
already warn on them. The gcc developers must have thought hard about the
same question before reaching this conclusion. Moreover, $-arguments are
a POSIX extension and gcc/clang are the key compilers for POSIX systems,
so it is safer to abide by their rules. In principle POSIX may mandate in
the future that all $-arguments are used explicitly (now it is rather
vague, it seems unused ones are fine only when last), and even if not,
deviations from gcc/clang could cause confusion for applications and
developers using both C/C++ and R.

Best
Tomas





Best wishes,

Matthias



Dear Matthias,

thanks for the suggestion, R-devel now warns on arguments unused by the
format (both numbered and un-numbered). It seems that the new warning is
useful: it often finds cases where arguments were accidentally passed to
sprintf but had been meant for a different function.
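
A small sketch of how one might check for this warning in one's own code. The helper name warns() is hypothetical, and the sketch assumes an R version that includes the new check (R-devel at the time of this thread; to my knowledge released versions from 4.1.0 onward).

```r
# Hypothetical helper: TRUE if sprintf(fmt, ...) signals a warning,
# FALSE otherwise. Errors (e.g. too few arguments) are not caught.
warns <- function(fmt, ...) {
  tryCatch({
    sprintf(fmt, ...)
    FALSE
  }, warning = function(w) TRUE)
}

# Every argument is used by the format: no warning expected
all_used <- warns('%i', 1L)

# One argument too many: expected to warn where the check is present
# (assumption: depends on the R version in use)
extra_arg <- warns('%i', 1L, 2L)

print(all_used)
print(extra_arg)
```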

R allows combining both numbered and un-numbered references in a single
format, even though it may be better to avoid this, and POSIX does not
allow it.

Best
Tomas

On 9/20/20 1:03 PM, Matthias Gondan wrote:

Dear R developers,

I am wondering if this should raise an error or a warning.


sprintf('%.f, %.f', 1, 2, 3)

[1] "1, 2"

I am aware that R has „numbered“ sprintf arguments 
(sprintf('%1$.f', …)), and in that case, omission of specific 
arguments may be intended. But in the usual syntax, omission of an 
argument is probably a mistake.


Thank you for your consideration.

Best wishes,

Matthias


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] read.csv, worrying behaviour?

2021-02-25 Thread TAYLOR, Benjamin (BLACKPOOL TEACHING HOSPITALS NHS FOUNDATION TRUST) via R-devel
Dear all

I've been using R for around 16 years now and I've only just become aware of a 
behaviour of read.csv that I find worrying, which is why I'm contacting this 
list. A simplified example of the behaviour is as follows.

I created a "test.csv" file containing the following lines:

a,b,c,d,e,f,g
1,2,3,4

And then read it into R using:

> d = read.csv("test.csv")
> d
  a b c d  e  f  g
1 1 2 3 4 NA NA NA

I was surprised that this did not issue a warning. I can understand why the 
following csv would not issue a warning:

a,b,c,d,e,f,g
1,2,3,4,,,

But the missing commas in the first example? Thoughts from others would be 
welcome.

Kind regards

Ben


~~

Benjamin M. Taylor, MSci, MSc, PhD
Lead Data Scientist
Blackpool Teaching Hospitals NHS Foundation Trust
Home 15
Whinney Heys Road
Blackpool
FY3 8NR

Scholar: https://scholar.google.co.uk/citations?user=6Hf0CJkJ&hl=en
Github: https://github.com/bentaylor1
Gitlab: https://gitlab.com/ben_taylor
ORCID: http://orcid.org/-0001-8667-4089








Re: [Rd] read.csv, worrying behaviour?

2021-02-25 Thread Kevin R. Coombes
I believe this is documented behavior. The 'read.csv' function is a 
front-end to 'read.table' with different default values. In this 
particular case, read.csv sets fill = TRUE, which means that it is 
supposed to fill incomplete lines with NAs. It also sets header = TRUE, 
which is presumably what it uses to determine the expected length of 
a row.
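
As a rough sketch of how short rows could be detected, assuming the two-line file from the original post is supplied inline via the text argument rather than read from "test.csv":

```r
# The example file from the original post, given as lines of text
csv_lines <- c("a,b,c,d,e,f,g", "1,2,3,4")

# Default read.csv (fill = TRUE): the short row is padded with NA
d <- read.csv(text = csv_lines)
print(d)

# count.fields() reports the number of fields on each line, so rows
# shorter than the widest line can be flagged before reading
con <- textConnection(csv_lines)
n_fields <- count.fields(con, sep = ",")
close(con)
short_rows <- which(n_fields < max(n_fields))
print(n_fields)
print(short_rows)  # line numbers in the file, counting the header

# With fill = FALSE the short row should be reported as an error
# instead of being padded; the message is captured here if so
strict <- tryCatch(read.csv(text = csv_lines, fill = FALSE),
                   error = function(e) conditionMessage(e))
print(strict)
```

Checking count.fields() first keeps the default reading behaviour but makes silently padded rows visible.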

  -- Kevin

On 2/25/2021 4:11 AM, TAYLOR, Benjamin (BLACKPOOL TEACHING HOSPITALS NHS 
FOUNDATION TRUST) via R-devel wrote:

Dear all

I've been using R for around 16 years now and I've only just become aware of a 
behaviour of read.csv that I find worrying, which is why I'm contacting this 
list. A simplified example of the behaviour is as follows.

I created a "test.csv" file containing the following lines:

a,b,c,d,e,f,g
1,2,3,4

And then read it into R using:


> d = read.csv("test.csv")
> d
  a b c d  e  f  g
1 1 2 3 4 NA NA NA

I was surprised that this did not issue a warning. I can understand why the 
following csv would not issue a warning:

a,b,c,d,e,f,g
1,2,3,4,,,

But the missing commas in the first example? Thoughts from others would be 
welcome.

Kind regards

Ben



