On Mon, 24 Oct 2011, David Winsemius wrote:
You could also have saved the subsetted data, applied `factor` to the
subsetted column and then used `xtabs`.
temp <- subset(chemdata, , )
temp$param <- factor(temp$param)
(Now only levels that exist are in the temp version.)
David,
Thank you. I
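The suggestion quoted above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the small `chemdata` frame and the subset condition are made-up stand-ins, and only the column names come from the thread.

```r
# Hypothetical stand-in for chemdata; values are invented for illustration.
chemdata <- data.frame(
  site  = c("A", "A", "B", "B", "A"),
  param = factor(c("TDS", "Cond", "TDS", "pH", "pH")),
  quant = c(120, 210, 95, 7.1, 7.3)
)

# Assumed subset condition (the original call elides it).
temp <- subset(chemdata, param %in% c("TDS", "Cond"))
levels_before <- nlevels(temp$param)   # subset() keeps the unused "pH" level

# Re-applying factor() rebuilds the levels from the values actually present.
temp$param <- factor(temp$param)
levels_after <- nlevels(temp$param)

tab <- xtabs(quant ~ site + param, data = temp)
```

Base R's `droplevels(temp)` should have the same effect on every factor column in one call.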
On Oct 24, 2011, at 12:10 PM, Rich Shepard wrote:
On Mon, 24 Oct 2011, David Winsemius wrote:
The appearance of levels with all zeroes is probably because I didn't include
drop.unused.levels = TRUE in the xtabs specification.
OK. Adding 'drop.unused.levels' does make a huge difference.
Thanks,
Rich
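A small sketch of the effect discussed above, assuming a made-up three-row data frame (only the column names follow the thread). In `xtabs` the `drop.unused.levels` argument defaults to FALSE, so it has to be set to TRUE to suppress the all-zero levels.

```r
# Hypothetical data; "pH" is filtered out but stays behind as a factor level.
chemdata <- data.frame(
  site  = c("A", "B", "A"),
  param = factor(c("TDS", "Cond", "pH")),
  quant = c(120, 210, 7.1)
)
sub <- subset(chemdata, param != "pH")

# Default behaviour: the unused "pH" level appears as a column of zeros.
with_zeros <- xtabs(quant ~ site + param, data = sub)

# Dropping unused levels removes that column.
no_zeros <- xtabs(quant ~ site + param, data = sub, drop.unused.levels = TRUE)
```

Comparing `dim(with_zeros)` and `dim(no_zeros)` shows the difference.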
On Oct 24, 2011, at 11:34 AM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
The first thing I would try would be
with(subset(chemdata, param %in% c('TDS', 'Cond', 'Mg', 'SO4', 'Cl', 'Na',
and 'Ca') , 1:4) ,
xtabs(quant ~ site + sampdate + param) )
David,
Need to remove the 'and' from the above.
The results include
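For reference, here is how the quoted call might look with the stray `and` removed (as written it is a syntax error inside `c()`). The data frame is a hypothetical four-column stand-in, so that `1:4` in the `select` position picks site, sampdate, param and quant. The values are invented.

```r
# Hypothetical stand-in for chemdata with the four columns used in the thread.
chemdata <- data.frame(
  site     = c("A", "A", "B", "B"),
  sampdate = as.Date(c("2011-01-01", "2011-01-01", "2011-01-01", "2011-01-01")),
  param    = factor(c("TDS", "Cond", "TDS", "pH")),
  quant    = c(120, 210, 95, 7.1)
)

wanted <- c("TDS", "Cond", "Mg", "SO4", "Cl", "Na", "Ca")

# Keep only the parameters of interest, then sum quant by site, date and param.
tab <- with(subset(chemdata, param %in% wanted, 1:4),
            xtabs(quant ~ site + sampdate + param))
```

Because `param` is a factor, levels that were filtered out (here "pH") still appear in the result as all-zero slices unless they are dropped.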
On Oct 21, 2011, at 8:14 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
The only variable in that dataframe with what appears to be a continuous
value (which is how I would expect "total dissolved solids" to be
measured) is "quant". Are you saying that the value of quant is measuring
something with different units depending
On Oct 21, 2011, at 6:17 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
What problem are you trying to solve?
What I need now is to compare TDS (total dissolved solids) with specific
conductivity and the ions that normally comprise TDS. Before running any
regression models I need to look at these data from three p
On Oct 21, 2011, at 4:38 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
How are we to determine which lines contain information about the
"relationships" of param=="TDS" with whatever cases or variable has values
of "Cond" and "SO4"? Are you really trying to compare two disjoint groups
on some statistic like the means and
On Oct 21, 2011, at 3:02 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
First you need to clarify whether "TDS" is the name of a column or a
possible value in a column named "param". This whole painful
multi-question process would be greatly accelerated if you offered
str(chemdata).
Yes, I did on a different thread, bu
On Oct 21, 2011, at 2:09 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, David Winsemius wrote:
The last part ("in the same column") does not make sense, since I was
interpreting the term "parameter" to mean a value in a particular column.
Assuming these are R NA's then logical indexing:
with( chemdata, chemdata[!is.na(param1) & !is.na(param2)
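The logical-indexing idea in the truncated call above can be illustrated as follows. This is only a sketch under a loud assumption: it uses a hypothetical wide-format frame with one column per parameter (`TDS`, `Cond`), which is not how `chemdata` is laid out in the thread.

```r
# Hypothetical wide-format data: one column per parameter, NA where not measured.
wide <- data.frame(
  site = c("A", "B", "C", "D"),
  TDS  = c(120, NA, 95, 130),
  Cond = c(210, 180, NA, 240)
)

# Keep only rows where both parameters are present.
both <- wide[!is.na(wide$TDS) & !is.na(wide$Cond), ]

# complete.cases() expresses the same filter for any number of columns.
same <- wide[complete.cases(wide[, c("TDS", "Cond")]), ]
```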
On Fri, 21 Oct 2011, David Winsemius wrote:
The last part ("in the same column") does not make sense, since I was
interpreting the term "parameter" to mean a value in a particular column.
David,
That's what I meant: two values from the 'param' column.
Assuming these are R NA's then logica
On Oct 21, 2011, at 1:04 PM, Rich Shepard wrote:
On Fri, 21 Oct 2011, Weidong Gu wrote:
No easy way out with missing data problems, all imputations are based on
some strong and untestable assumptions.
Thanks for the insights.
Let me rephrase my question in a way that should work: is th
On Fri, 21 Oct 2011, B77S wrote:
I know in my experience "Cond" (conductivity??) doesn't vary much within a
stream except for during high flow events, and I would imagine the same is
true for TDS.
This is generally true, but not in the streams with which we're working.
TDS values, for exampl
On Fri, 21 Oct 2011, Weidong Gu wrote:
No easy way out with missing data problems, all imputations are based on
some strong and untestable assumptions.
Thanks for the insights.
Let me rephrase my question in a way that should work: is there a way to
subset my comprehensive data frame ('ch
I know in my experience "Cond" (conductivity??) doesn't vary much within a
stream except for during high flow events, and I would imagine the same is
true for TDS. If these are all low flow values, you could possibly
determine a mean/median value to use for the missing data points. Obviously
this
Sounds like you are dealing with a missing data problem. By default, lm
or glm keep only observations with complete records (complete
case analysis). This can be problematic if you have many missing
variables and missing values occur not completely at random (i.e.,
missing values are dependent
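The complete-case behaviour described above can be checked directly. The sketch below uses invented numbers and hypothetical column names `TDS` and `Cond`, and it relies on R's default `na.action` setting (`na.omit`).

```r
# Hypothetical data with missing values in both variables.
d <- data.frame(
  TDS  = c(120, NA, 95, 130, 150),
  Cond = c(210, 180, NA, 240, 260)
)

# lm() silently drops rows with an NA in any model variable.
fit <- lm(TDS ~ Cond, data = d)

n_used     <- nobs(fit)               # rows that entered the fit
n_complete <- sum(complete.cases(d))  # rows with no missing values
```

Comparing `n_used` with `nrow(d)` is a quick way to see how much data a model is actually using.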
Because of regulatory requirement changes over several decades and weather
conditions preventing site access the variables in my data set have
different lengths. I'd like guidance on how to perform linear regressions
and other models with these variables.
For example, there are 2206 rows for