On Jun 14, 2010, at 12:32 PM, Assa Yeroslaviz wrote:
I thought unique() deleted the whole line.
I don't really need the row names, but I thought of them as a way of getting
the unique items.
Is there a way of deleting whole lines completely according to their
identifiers?
What I really need are unique values in the first column.
Assa
On Mon, Jun 14, 2010 at 18:04, jim holtman <jholt...@gmail.com> wrote:
Your process does remove all the duplicate entries based on the
content of the two columns. After you do this, there are still
duplicate entries in the first column that you are trying to use as
rownames, hence the error. Why do you want to use non-unique
entries as rownames? Do you really need the row names, or should you
only be keeping unique values for the first column?
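To illustrate the point above, here is a minimal sketch on made-up data (the data frame `df` and its column names `id` and `value` are assumptions, not the poster's file). It shows that unique() on a data frame only drops rows that match in every column, so identifiers in the first column can still repeat afterwards.

```r
# Hypothetical toy data; "id" and "value" are assumed column names.
df <- data.frame(id    = c("a", "a", "a", "b"),
                 value = c(1, 1, 2, 3),
                 stringsAsFactors = FALSE)

# unique() compares whole rows, so only the exact repeat (a, 1) is dropped.
u <- unique(df)
nrow(u)

# anyDuplicated() returns a non-zero index when a value repeats:
# "a" still occurs twice in the first column, paired with different values.
anyDuplicated(u$id)
```

Using such a column as row names fails for the same reason as in the original post, because row names must be unique.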
On Mon, Jun 14, 2010 at 8:54 AM, Assa Yeroslaviz <fry...@gmail.com>
wrote:
Hello everybody,
I have a matrix of 2 columns and over 27k rows.
Some of the rows are duplicated, so I tried to remove them with the command
unique():
Workbook5 <- read.delim(file = "Workbook5.txt")
dim(Workbook5)
[1] 27748 2
Workbook5 <- unique(Workbook5)
Jim already showed you one way in another thread and it is probably
more intuitive than this way, but just so you know...
Workbook5 <- Workbook5[ !duplicated(Workbook5[ ,1] ) , ]
... should have worked. Logical indexing on the first column, returning
both columns of the qualifying rows.
--
David.
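For reference, a small sketch of keeping one row per identifier with a logical index. The toy data frame and its column names are hypothetical. Note that unique() returns the distinct values themselves, whereas duplicated() returns a logical vector with one entry per row, which is what row indexing needs here. Which of the repeated rows should be kept is a judgment call; this sketch simply keeps the first occurrence.

```r
# Hypothetical toy data; "id" and "value" are assumed column names.
df <- data.frame(id    = c("a", "a", "b"),
                 value = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# duplicated() flags every repeat after the first occurrence of an id.
keep <- !duplicated(df$id)

# Logical row index: keeps both columns of the first row for each id.
dedup <- df[keep, ]

# The first column is now unique, so it can serve as row names.
rownames(dedup) <- dedup$id
dedup
```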
dim(Workbook5)
[1] 20101 2
It removed a lot of lines, but unfortunately not all of them. I wanted to add
the row names to the matrix and got this error message:
rownames(Workbook5) <- Workbook5[,1]
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1L, 2L, 3L, 4L, 5L, :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘A_51_P102339’,
‘A_51_P102518’, ‘A_51_P103435’, ‘A_51_P103465’,
‘A_51_P103594’, ‘A_51_P104409’, ‘A_51_P104718’,
‘A_51_P105869’, ‘A_51_P106428’, ‘A_51_P106799’,
‘A_51_P107176’, ‘A_51_P107959’, ‘A_51_P108767’,
‘A_51_P109258’, ‘A_51_P109708’, ‘A_51_P110341’,
‘A_51_P111757’, ‘A_51_P112427’, ‘A_51_P112662’,
‘A_51_P113672’, ‘A_51_P115018’, ‘A_51_P116496’,
‘A_51_P116636’, ‘A_51_P117666’, ‘A_51_P118132’,
‘A_51_P118168’, ‘A_51_P118400’, ‘A_51_P118506’,
‘A_51_P119315’, ‘A_51_P120093’, ‘A_51_P120305’,
‘A_51_P120738’, ‘A_51_P120785’, ‘A_51_P121134’,
‘A_51_P121359’, ‘A_51_P121412’, ‘A_51_P121652’,
‘A_51_P121724’, ‘A_51_P121829’, ‘A_51_P122141’,
‘A_51_P122964’, ‘A_51_P123422’, ‘A_51_P123895’,
‘A_51_P124008’, ‘A_51_P124719’, ‘A_51_P125648’,
‘A_51_P125679’, ‘A_51_P125779’ [... truncated]
Is there a better way to discard the duplications in the text file (an Excel
file is the origin)?
R.version
               _
platform       x86_64-apple-darwin9.8.0
arch           x86_64
os             darwin9.8.0
system         x86_64, darwin9.8.0
status         Patched
major          2
minor          11.1
year           2010
month          06
day            03
svn rev        52201
language       R
version.string R version 2.11.1 Patched (2010-06-03 r52201)
THX
Assa
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
David Winsemius, MD
West Hartford, CT