On Sun, Jul 20, 2014 at 6:33 AM, Ubernerdy <uberne...@gmail.com> wrote: > Hello guys! > > I have been messing around with R for a while now, but this situation has > me a bit stumped. I was unable to solve it by reading documentation. > > So I have this table (currently in Excel - could export it as csv) with > values in 3 columns. Let's call them value P (for position), value B > (beginning) and E (end). P represents the position of a mutation in the > genome, B and E are the beginnings and ends of a nearby gene that either > contains the mutation or not. > > I am trying to compute the distance between the mutation and the gene. > > If the mutation is contained in the gene, that is value P is greater than B > and lesser than E, the result is 0. > > If the mutation is "left" of the gene, the distance is negative and is > equal to P-B. > > If the mutation is "right" of the gene, the distance is positive and is > equal to P-E. > > How would i achieve this in R? > > Regards and thanks, S. >
Jim has given you the answer in R. So why am I "shooting off my mouth?". Well, good question! <grin/> But I thought it might be a bit helpful to mention a couple of things. First, you can do this same calculation in Excel by using the =IF(logical-expression,true-expression,false-expression) in another column on the spreadsheet. Just in case you wanted to do it in Excel in addition to, or instead of, R. And, like the ifelse() function in R, you can use a "nested" IF() within either, or both, of the value expressions of the =IF. Something like: =IF(P<B,P-B,IF(P>E,P-E,0)) where "P", "B", and "E" are replaced by the proper cell names. Of course, the plus in R is that you can create the distance values "all at once" because R can work with vectors. Whereas in Excel, you'll need to put the =IF expression in every cell in the appropriate column. I mention it only for completeness, just in case you need the data in the spreadsheet also. Secondly, you mentioned exporting to a csv. I would _guess_ that you are using Windows, not a Mac nor Linux. If so, then built into Windows is an ODBC driver for reading (and writing) from (to) an Excel spreadsheet. Most users don't know much about ODBC, but it is there. You might be interested: http://www.cyberfella.co.uk/2012/06/11/windows7-odbc/ http://msdn.microsoft.com/en-us/library/windows/desktop/dn259722%28v=vs.85%29.aspx I'm not where I can work my way through it because I'm at home on a Linux machine. But if you need some step-by-step, I'll be glad to _try_ on Monday when I'm back in the office. I just set up an ODBC for accessing MS-SQL and the process is similar with Excel. Or you might search YouTube, which is where I found out about this, albeit not using R. On http://Youtube.com try searching on "set up odbc connections excel" But I didn't really see too many which were relevant. Most were how to set up ODBC to allow Excel to read from some other database. Anyway, once this is done, you could then read the Excel data using the RODBC driver. Something like: library(RODBC); ch <- odbcConnect("Excel"); # The name of the ODBC connection you made above data <- sqlFetch(ch,"SheetName"); # read all the data in the Excel workbook in the tab named "SheetName" # please forgive any errors - not where I can test today Another possibility is to use an "R native" access via either openxlsx or XLConnect. I use openxlsx myself, not because it is "better", but because XLConnect uses Apache POI which means using Java which means it is not as fast. Well, at least in my assumptions XLConnect must be slower due to Java. I have not done any performance or reliability testing. I have done some functionality testing and both seem to work on my test Excel spreadsheet. In addition, these packages can also create or replace an Excel spreadsheet by rewriting the ???.xlsx file. Note this is not an "update in place" like with a real database. Honestly, for me, these latter are easier to understand. -- There is nothing more pleasant than traveling and meeting new people! Genghis Khan Maranatha! <>< John McKown ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.