Re: Numpy outlier removal
"Steven D'Aprano" wrote in message news:[email protected]... > On Sun, 06 Jan 2013 19:44:08 +, Joseph L. Casale wrote: > >> I have a dataset that consists of a dict with text descriptions and >> values that are integers. If required, I collect the values into a list >> and create a numpy array running it through a simple routine: >> >> data[abs(data - mean(data)) < m * std(data)] >> >> where m is the number of std deviations to include. > > I'm not sure that this approach is statistically robust. No, let me be > even more assertive: I'm sure that this approach is NOT statistically > robust, and may be scientifically dubious. > > The above assumes your data is normally distributed. How sure are you > that this is actually the case? > > For normally distributed data: > > Since both the mean and std calculations as effected by the presence of > outliers, your test for what counts as an outlier will miss outliers for > data from a normal distribution. For small N (sample size), it may be > mathematically impossible for any data point to be greater than m*SD from > the mean. For example, with N=5, no data point can be more than 1.789*SD > from the mean. So for N=5, m=1 may throw away good data, and m=2 will > fail to find any outliers no matter how outrageous they are. > > For large N, you will expect to find significant numbers of data points > more than m*SD from the mean. With N=10, and m=3, you will expect to > throw away 270 perfectly good data points simply because they are out on > the tails of the distribution. > > Worse, if the data is not in fact from a normal distribution, all bets > are off. You may be keeping obvious outliers; or more often, your test > will be throwing away perfectly good data that it misidentifies as > outliers. > > In other words: this approach for detecting outliers is nothing more than > a very rough, and very bad, heuristic, and should be avoided. > > Identifying outliers is fraught with problems even for experts. 
For > example, the ozone hole over the Antarctic was ignored for many years > because the software being used to analyse it misidentified the data as > outliers. > > The best general advice I have seen is: > > Never automatically remove outliers except for values that are physically > impossible (e.g. "baby's weight is 95kg", "test score of 31 out of 20"), > unless you have good, solid, physical reasons for justifying removal of > outliers. Other than that, manually remove outliers with care, or not at > all, and if you do so, always report your results twice, once with all > the data, and once with supposed outliers removed. > > You can read up more about outlier detection, and the difficulties > thereof, here: > > http://www.medcalc.org/manual/outliers.php > > https://secure.graphpad.com/guides/prism/6/statistics/index.htm > > http://www.webapps.cee.vt.edu/ewr/environmental/teach/smprimer/outlier/outlier.html > > http://stats.stackexchange.com/questions/38001/detecting-outliers-using-standard-deviations > > > > -- > Steven If you suspect that the data may not be normal you might look at exploratory data analysis, see Tukey. It's descriptive rather than analytic, treats outliers respectfully, uses median rather than mean, and is very visual. Wherever I analyzed data both gaussian and with EDA, EDA always won. Paul -- http://mail.python.org/mailman/listinfo/python-list
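
The small-N limit Steven describes can be checked numerically. The sketch below is not from the thread. It uses only the standard library (statistics) rather than numpy, the function names are made up for illustration, and the median/MAD rule with its 3.5 cutoff is one common robust convention chosen here as an example of the median-based approach Paul mentions, not Tukey's own procedure. It flags points for manual review instead of deleting them.

```python
import math
import statistics


def max_sd_distance(n):
    """Largest possible |x - mean| / SD (sample SD) for any n data points."""
    return (n - 1) / math.sqrt(n)


def sd_filter(data, m):
    """The rule from the original post: keep points within m sample SDs of the mean."""
    centre = statistics.mean(data)
    spread = statistics.stdev(data)
    return [x for x in data if abs(x - centre) < m * spread]


def flag_by_mad(data, cutoff=3.5):
    """Flag (not remove) points far from the median, scaled by the MAD.

    The 0.6745 factor and the 3.5 cutoff are a common convention, assumed here.
    """
    med = statistics.median(data)
    mad = statistics.median(abs(x - med) for x in data)
    if mad == 0:
        return []  # at least half the points equal the median; score undefined
    return [x for x in data if 0.6745 * abs(x - med) / mad > cutoff]


sample = [9, 10, 10, 11, 1000]
print(round(max_sd_distance(5), 3))  # the N=5 bound quoted in the post
print(sd_filter(sample, 2))          # what the mean/SD rule keeps with m=2
print(flag_by_mad(sample))           # what the median-based check flags
```

For this five-point sample the mean/SD rule with m=2 keeps the value 1000, while the median-based check flags it for review.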
Re: serial module
Sounds like you may be using this on a Windows machine. The code is
functional, but it is best practice to close the port before opening it.
If the program stops because of an error (usually not a syntax error), the
port can stay stuck open until the program is closed and reopened. I have
used the Python serial port module (serial.py?) with good results.

Paul Simon

"Ron Eggler" wrote in message news:[email protected]...
> Hoi,
>
> I'm trying to connect to a serial port and always get the error
> "serial.serialutil.SerialException: Port is already open." which is
> untrue.
> I have no serial port open yet, my code looks like this:
> #!/usr/bin/python
> import time
> import serial
>
> # configure the serial connections (the parameters differ on the device
> # you are connecting to)
> ser = serial.Serial(
>     port='/dev/ttyUSB0',
>     baudrate=19200,
>     parity=serial.PARITY_ODD,
>     stopbits=serial.STOPBITS_TWO,
>     bytesize=serial.SEVENBITS
> )
>
> ser.open()
>
> Why do I get this error?
>
> Thank you,
> Ron
>
> --- Posted via news://freenews.netfront.net/ - Complaints to
> [email protected] ---
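
One likely cause of the error in the quoted script, offered as a note rather than as part of the thread: pyserial opens the port inside the serial.Serial(...) constructor whenever a port argument is given, so the explicit ser.open() that follows is a second open. Below is a minimal sketch of a guard, assuming pyserial 3.x where the port object exposes an is_open attribute (older releases used isOpen()). ensure_open is a hypothetical helper name, and the serial calls appear only in comments so the block runs without the package or any hardware.

```python
def ensure_open(port):
    """Open the port only if it is not open already; return the same object."""
    if not port.is_open:
        port.open()
    return port


# Intended use with pyserial (needs the package and a real device):
#
#   import serial
#   ser = serial.Serial(port='/dev/ttyUSB0', baudrate=19200)  # already open here
#   ensure_open(ser)      # does nothing instead of raising "Port is already open."
#   try:
#       ...               # talk to the device
#   finally:
#       ser.close()       # release the port even if an error occurs
```

Closing the port in a finally block also addresses the stuck-port situation described in the reply.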
Re: How to round trip python and sqlite dates
"Mark Lawrence" wrote in message news:[email protected]... > All the references regarding the subject that I can find, e.g. > http://stackoverflow.com/questions/1829872/read-datetime-back-from-sqlite-as-a-datetime-in-python, > > talk about creating a table in memory using the timestamp type from the > Python layer. I can't see how to use that for a file on disk, so after a > bit of RTFM I came up with this. > > import sqlite3 > from datetime import datetime, date > > def datetime2date(datetimestr): > return datetime.strptime(datetimestr, '%Y-%m-%d') > > sqlite3.register_converter('DATETIME', datetime2date) > > db = sqlite3.connect(r'C:\Users\Mark\Cash\Data\test.sqlite', > detect_types=sqlite3.PARSE_DECLTYPES) > c = db.cursor() > c.execute('delete from temp') > row = 'DWP ESA', date(2013,11,18), 'Every two weeks', 143.4, date.max > c.execute('insert into temp values (?,?,?,?,?)', row) > c.execute('select * from temp') > row = c.fetchone() > nextdate = row[1] > print(nextdate, type(nextdate)) > > Run it and > > Traceback (most recent call last): > File "C:\Users\Mark\MyPython\mytest.py", line 13, in > c.execute('select * from temp') > File "C:\Users\Mark\MyPython\mytest.py", line 7, in datetime2date > return datetime.strptime(datetimestr, '%Y-%m-%d') > TypeError: must be str, not bytes > > However if I comment out the register_converter line this output is > printed > > 2013-11-18 > > Further digging in the sqlite3 file dbapi2.py I found references to > convert_date and convert_timestamp, but putting print statements in them > and they didn't appear to be called. > > So how do I achieve the round trip that I'd like, or do I simply cut my > loses and use strptime on the string that I can see returned? > > Note that I won't be checking replies, if any, for several hours as it's > now 02:15 GMT and I'm heading back to bed. > > -- > Python is the second best programming language in the world. > But the best has yet to be invented. 
Christian Tismer > > Mark Lawrence > Just a quicky, but I believe you don't have to register the datetime or timestamp converter as it is already implicit in the python to sql adaptation.This should handle the round trip conversion for you. I use some similar code but it's late here now. Paul -- https://mail.python.org/mailman/listinfo/python-list
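
A minimal sketch of the round trip, not taken from the thread. It assumes the column is declared with the type name "date" and uses an in-memory database with a made-up three-column table so that it runs anywhere; the same connect call accepts a file path. The adapter and converter are registered explicitly here (the sqlite3 module also ships default "date" and "timestamp" converters, which newer Python versions deprecate). The converter decodes its argument because sqlite3 hands converters bytes, which is what the TypeError in the traceback above reports.

```python
import sqlite3
from datetime import date


def adapt_date(value):
    """Python date -> ISO text stored in the database."""
    return value.isoformat()


def convert_date(raw):
    """Stored value -> Python date. sqlite3 passes converters bytes, not str."""
    return date.fromisoformat(raw.decode("ascii"))


sqlite3.register_adapter(date, adapt_date)
sqlite3.register_converter("date", convert_date)


def round_trip(db_path=":memory:"):
    """Insert one date and read it back through the registered converter."""
    # PARSE_DECLTYPES makes sqlite3 pick the converter from the declared column type.
    db = sqlite3.connect(db_path, detect_types=sqlite3.PARSE_DECLTYPES)
    try:
        # Hypothetical table layout, for illustration only.
        db.execute("create table temp (payee text, nextdate date, amount real)")
        db.execute("insert into temp values (?, ?, ?)",
                   ("DWP ESA", date(2013, 11, 18), 143.4))
        return db.execute("select nextdate from temp").fetchone()[0]
    finally:
        db.close()


value = round_trip()
print(value, type(value))
```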
Re: Retrieving possible list for use in a subsequent INSERT
>
> Are the enum or set column types what is needed here, as proper columns
> to store the 'download' list?
>
> Code:
>
> create table visitors
> (
>   counterID integer(5) not null,
>   host varchar(50) not null,
>   refs varchar(25) not null,
>   city varchar(20) not null,
>   userOS varchar(10) not null,
>   browser varchar(10) not null,
>   hits integer(5) not null default 1,
>   visits datetime not null,
>   downloads set('None Yet'),
>
>   foreign key (counterID) references counters(ID),
>   unique index (visits)
> )ENGINE = MYISAM;
>
>
> Is the SET column type the way to do it?
> I tried it, but the error I'm receiving is:
>
>
> pymysql.err.InternalError: (1241, 'Operand should contain 1 column(s)')
>
> Please help me pick the column type that will be able to store a
> list of values.
If you have a list of values of the same type, but different values,
you need a new table with a foreign key to the table it relates to.
This is a relational database question. You can read more here:
http://en.wikipedia.org/wiki/Database_normalization#Normal_forms
--
Joel Goldstick
http://joelgoldstick.com
He doesn't need a many-to-many table, although that would put the schema
into a classic normal form. Yes, there will be duplicated data. Sometimes
de-normalizing a schema may make things simpler and easier to use for
someone not used to database work. I would also use a many-to-many table,
being familiar with normal forms, but it is not a necessity.
Paul Simon
Re: Retrieving possible list for use in a subsequent INSERT
"Nick the Gr33k" wrote in message news:[email protected]... > 1/11/2013 7:07 ??, ?/? Paul Simon ??: > >> If you have a list of values of the same type, but different values, >> you need a new table with a foreign key to the table it relates to. >> This is a relational database question. You can read more here: >> >> http://en.wikipedia.org/wiki/Database_normalization#Normal_forms > > I already answered to that in my previous post, this answer was Joel's > there was no need to retype it since i i saw it and responded to it. Perhaps you misunderstood my response to Joel's comment. You don't a many to many table as he said above in your quote. That's required for a normal form but isn't necessary. Denormalize the many to many form, have duplicated data in your only table if that works for you. Storage is cheap and its easier to create sql stsatements, too. Paul Simon -- https://mail.python.org/mailman/listinfo/python-list
Re: Data acquisition
"spintronic" wrote in message news:[email protected]... > Dear friends, > > I have a trouble with understanding the following. I have a very short > script (shown below) which works fine if I "run" step by step (or line > by line) in Python shell (type the first line/command -> press Enter, > etc.). I can get all numbers (actually, there are no numbers but a > long string, but this is not a problem) I need from a device: > > '0.3345098119,0.01069121274,0.02111624694,0.03833379529,0.02462816409,0.0774275008,0.06554297421,0.07366750919,0.08122602002,0.004018369318,0.03508462415,0.04829900696,0.06383554085, > > ...' > > However, when I start very the same list of commands as a script, it > gives me the following, which is certainly wrong: > > [0.0, 0.0, 0.0, 0.0, 0.0,...] > > Any ideas? Why there is a difference when I run the script or do it > command by command? > > === > from visa import * > > mw = instrument("GPIB0::20::INSTR", timeout = None) > > mw.write("*RST") > mw.write("CALC1:DATA? FDATA") > > a=mw.read() > > print a > === > (That is really all!) > > > PS In this case I use Python Enthought for Windows, but I am not an > expert in Windows (I work usually in Linux but now I need to run this > data acquisition under Windows). I'm almost certain that there is a turnaround timing issue that is causing the problem. These are common problems in data aquisition systems. The simplest solution is to loop and wait for end of line from the sending end and if necessary put in a time delay. After receiving the data, check the received data for correct format, correct first and last characters, and if possible, check sum. I've worked through this problem with rs-485 data collection systems where there is no hand shaking and would not be surprised to expect the same even with rs-232. Paul Simon -- http://mail.python.org/mailman/listinfo/python-list
Re: Making ETL from Access 97 to Access 2003
"rusi" wrote in message news:ff550c58-58b0-4bf2-bf12-08986ab2b...@ka6g2000pbb.googlegroups.com... On Apr 15, 5:27 pm, Steeve wrote: > Hi, > > I need to take data from 5 differents (but similar) database in MS Access > 97 and merge them into one MS Access 2003 database. Not sure what this had to do with python. However You could write out the five as csvs and then read in those csvs. This is assuming that access 2003 cannot read in access 97. [Seems a bit surprising though] > > Is some packages exist to do this task? Dunno Have you seen http://allenbrowne.com/ser-48.html ? If there are indices and especially linked primary and foreign keys its much more complicated than that. One has to delve into Access container structures etc. As far as I know it has to be done from Access. Paul Simon -- http://mail.python.org/mailman/listinfo/python-list
Re: Making ETL from Access 97 to Access 2003
"rusi" wrote in message news:92551c63-1347-4f1a-9dca-d1bbd5e4d...@ys5g2000pbc.googlegroups.com... Its hard to distinguish what you are saying from what I said because you've lost the quotes. On Apr 15, 9:01 pm, "Paul Simon" wrote: > "rusi" wrote in message > > news:ff550c58-58b0-4bf2-bf12-08986ab2b...@ka6g2000pbb.googlegroups.com... > On Apr 15, 5:27 pm, Steeve wrote: > > > Hi, > > > I need to take data from 5 differents (but similar) database in MS > > Access > > 97 and merge them into one MS Access 2003 database. > > Not sure what this had to do with python. > However > You could write out the five as csvs and then read in those csvs. > This is assuming that access 2003 cannot read in access 97. [Seems a > bit surprising though] > > > > > Is some packages exist to do this task? > > Dunno > Have you seenhttp://allenbrowne.com/ser-48.html? > > If there are indices and especially linked primary and foreign keys its > much more complicated than that. One has to delve into Access container > structures etc. As far as I know it has to be done from Access. I assume you are saying this for my csv suggestion? Yes of course. I gave this as the last resort if direct import and other such attempts dont work out. >>>Could you please append your comments instead of splitting them? >>>Let me try to be clearer. If one only wants to merge tables, csv will >>>work fine, exporting them from Access. >>>Reconstucting keys and relationships can be done with some difficulty >>>using Access' container model. See the Developer's Handbook by Getz, >>>Litwin and Gilbert. >>>Paul Simon -- http://mail.python.org/mailman/listinfo/python-list
Re: Graphical library - charts
I suggest you look at matplotlib. It's a bit of a learning curve but will
do whatever you need. I have a similar requirement and found that gnuplot
did not work for me. The plots are impressive.

Paul Simon

wrote in message news:[email protected]...
> Hello,
>
> I have thousands of files with logs from a monitoring system. Each file
> has some important data (numbers). I'd like to create charts using those
> numbers. Could you please suggest a library which will allow creating
> such charts? The preferred chart is a line chart.
>
> Besides, is there any library which allows me to zoom in/out of such a
> chart? Sometimes I need to create a chart using long-term data (a few
> months) but then observe a few minutes - it would be good not to create
> another short-term chart but just zoom in.
>
> Those files are on one unix server and the charts will be displayed on
> another unix server so the X-Window protocol is going to be used.
>
> Any suggestions?
>
> Best regards
> przemol
>
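
A small sketch of what a matplotlib line chart for such logs might look like. The log layout ("timestamp value" on each line) and both function names are invented for the example, since the thread does not show the real file format. matplotlib is imported inside plot_log so that the parsing part runs without it.

```python
from datetime import datetime


def parse_log_line(line):
    """Split a line such as '2009-07-08T16:18:00 42.5' into (datetime, float).

    This layout is invented for the example; real monitoring logs will differ.
    """
    stamp, value = line.split()
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S"), float(value)


def plot_log(lines):
    """Draw a line chart of the parsed values in an interactive window."""
    import matplotlib.pyplot as plt  # imported here so parsing works without it

    points = [parse_log_line(line) for line in lines if line.strip()]
    times = [t for t, _ in points]
    values = [v for _, v in points]
    fig, ax = plt.subplots()
    ax.plot(times, values)
    ax.set_xlabel("time")
    ax.set_ylabel("value")
    fig.autofmt_xdate()  # slant the date labels so they do not overlap
    plt.show()           # the window toolbar provides zoom and pan
```

The interactive window's toolbar covers the zoom requirement, so one long-term chart can be inspected at the scale of minutes.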
problem installing python 2.6.2 from tarball
I have just finished going through the usual ./configure, make, etc. and
now have a load of stuff in my home directory that I think doesn't belong
there. Apparently I did the installation from my home directory (Linux)
and now have a directory there, "Python2.6.2", with subdirectories like
"build", "Demo", "Doc", "Include", "Lib", etc. There are many files under
/usr/bin/local/ which appear to be duplicates. This looks like a mess to
me and I would like some help to sort this out.

Paul Simon
tkinter problem
I have the "tkinter" problem and need some assistance to straighten it out.
From the web page "http://wiki.python.org/moin/TkInter" I tested as in
"step 1" and cannot import "_tkinter". I do not have that file on my
computer, but do have tkinter.py in /usr/local/lib/python2.6/lib-tk, as
well as the directories /usr/lib/tk8.5 and /usr/lib/tcl8.5.
This python stuff is great, but the documentation frequently feels like it
is just a bit out of my grasp. I realize that all of this is free but I
understand the instructions on the web page to repair only to the point of
confusion. I'm not an expert. How do I modify my python configuration? Is
there a file that needs to be edited? Which setup.py file do I use? Make?
or python setup.py build and python setup.py install?
Thanks. I appreciate your help.

Paul Simon
Re: tkinter problem
"Chris Rebert" wrote in message news:[email protected]... On Wed, Jul 8, 2009 at 4:18 PM, Paul Simon wrote: > I have the "tkinter" problem and need some assistance to straighten it > out. > >From the web page "http://wiki.python.org/moin/TkInter"; I tested as in > >"step > 1" and cannot import "_tkinter." I do not have that file on my computer, > but > do have tkinter.py in /usr/local/lib/python2.6/lib-tk. as well as the > directories /usr/lib/tk8.5 and /usr/lib/tcl8.5. > This python stuff is great, but the documentation frequently > feels like it is just a bit out of my grasp. I realize that all of this is > free but I understand the instructions on the web page to repair only to > the > point of confusion. I'm not an expert. How do I modify my python > configuration? Is there a file that needs to be edited? Which setup.py > file > do I use? Make? or python setup.py build and python setup.py install? > Thanks. I appreciate your help. - How did you install Python? - What Linux distro are you using? Cheers, Chris -- http://blog.rebertia.com I"m using Mandriva 2008.1. I have to tell you honestly that I'm not sure exactly how I installed Python. Originally I had installed 2.5 from RPM but 2.6 was not available for my distro (2008.1) in RPM. I downloaded something from python.org and installed. Not sure if it was tarball or zip file. Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: tkinter problem
"Peter Otten" <[email protected]> wrote in message news:[email protected]... > Paul Simon wrote: > >> "Chris Rebert" wrote in message >> news:[email protected]... >> On Wed, Jul 8, 2009 at 4:18 PM, Paul Simon wrote: >>> I have the "tkinter" problem and need some assistance to straighten it >>> out. >>> >From the web page "http://wiki.python.org/moin/TkInter"; I tested as in >>> >"step >>> 1" and cannot import "_tkinter." I do not have that file on my computer, >>> but >>> do have tkinter.py in /usr/local/lib/python2.6/lib-tk. as well as the >>> directories /usr/lib/tk8.5 and /usr/lib/tcl8.5. >>> This python stuff is great, but the documentation frequently >>> feels like it is just a bit out of my grasp. I realize that all of this >>> is free but I understand the instructions on the web page to repair only >>> to the >>> point of confusion. I'm not an expert. How do I modify my python >>> configuration? Is there a file that needs to be edited? Which setup.py >>> file >>> do I use? Make? or python setup.py build and python setup.py install? >>> Thanks. I appreciate your help. >> >> - How did you install Python? >> - What Linux distro are you using? >> >> Cheers, >> Chris >> http://blog.rebertia.com >> I"m using Mandriva 2008.1. I have to tell you honestly that I'm not sure >> exactly how I installed Python. Originally I had installed 2.5 from RPM >> but 2.6 was not available for my distro (2008.1) in RPM. I downloaded >> something from python.org and installed. Not sure if it was tarball or >> zip file. > > Zip or tar doesn't matter, you are installing "from source". > > Python has to find the necessary include files for tcl/tk. These are in > separate packages that you have to install before you invoke Python's > configure script. > > I don't know what they are called on your system -- look for tk-dev.rpm, > tcl-dev.rpm or similar. > > You may run into the same problem with other modules like readline. > > Peter > Thank you Peter. 
I understand what you are saying but don't know how to do it. Although I installed from source, I followed a "cookbook" recipe. Could you tell me what files to execute, where they might be, and file arguments? I'm just ignorant, not stupid. ;-). Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: tkinter problem
"David Smith" wrote in message news:[email protected]... > Paul Simon wrote: >> "Peter Otten" <[email protected]> wrote in message >> news:[email protected]... >>> Paul Simon wrote: >>> >>>> "Chris Rebert" wrote in message >>>> news:[email protected]... >>>> On Wed, Jul 8, 2009 at 4:18 PM, Paul Simon wrote: >>>>> I have the "tkinter" problem and need some assistance to straighten it >>>>> out. >>>>> >From the web page "http://wiki.python.org/moin/TkInter"; I tested as >>>>> >in >>>>>> "step >>>>> 1" and cannot import "_tkinter." I do not have that file on my >>>>> computer, >>>>> but >>>>> do have tkinter.py in /usr/local/lib/python2.6/lib-tk. as well as the >>>>> directories /usr/lib/tk8.5 and /usr/lib/tcl8.5. >>>>> This python stuff is great, but the documentation frequently >>>>> feels like it is just a bit out of my grasp. I realize that all of >>>>> this >>>>> is free but I understand the instructions on the web page to repair >>>>> only >>>>> to the >>>>> point of confusion. I'm not an expert. How do I modify my python >>>>> configuration? Is there a file that needs to be edited? Which setup.py >>>>> file >>>>> do I use? Make? or python setup.py build and python setup.py install? >>>>> Thanks. I appreciate your help. >>>> - How did you install Python? >>>> - What Linux distro are you using? >>>> >>>> Cheers, >>>> Chris >>>> http://blog.rebertia.com >>>> I"m using Mandriva 2008.1. I have to tell you honestly that I'm not >>>> sure >>>> exactly how I installed Python. Originally I had installed 2.5 from >>>> RPM >>>> but 2.6 was not available for my distro (2008.1) in RPM. I downloaded >>>> something from python.org and installed. Not sure if it was tarball or >>>> zip file. >>> Zip or tar doesn't matter, you are installing "from source". >>> >>> Python has to find the necessary include files for tcl/tk. These are in >>> separate packages that you have to install before you invoke Python's >>> configure script. 
>>> >>> I don't know what they are called on your system -- look for tk-dev.rpm, >>> tcl-dev.rpm or similar. >>> >>> You may run into the same problem with other modules like readline. >>> >>> Peter >>> >> >> Thank you Peter. I understand what you are saying but don't know how to >> do >> it. Although I installed from source, I followed a "cookbook" recipe. >> Could you tell me what files to execute, where they might be, and file >> arguments? I'm just ignorant, not stupid. ;-). >> >> Paul >> >> > > Just install the tkinter package from the Mandriva Linux Control > Center's Software Management system. I just did it, doing a search for > tkinter brought it right up. All done. > > --David Thanks to all for your patient help. I have made some progress, but still no success. I installed Active Tcl-8.5.7 and corrected the PATH accordingly. However I still get a "missing" message on building Python. "Failed to find the necessary bits (!) to build these modules: _tkinter (among others) To find the necessary bits, look in setup.py in detect_modules() for teh module's name." Not sure what bits are, euphemism? but am about to wipe the disk and reinstall linux, etc. Paul -- http://mail.python.org/mailman/listinfo/python-list
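
After any rebuild or package install, the "step 1" import test from the wiki page can be scripted so that it reports a result instead of a traceback. This is a small diagnostic sketch, not something from the thread; can_import is a made-up helper, and it uses __import__ so that it also runs on older interpreters such as 2.6.

```python
def can_import(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


# _tkinter is the compiled extension; Tkinter (Python 2) and tkinter
# (Python 3) are the pure-Python wrappers that depend on it.
for module in ("_tkinter", "Tkinter", "tkinter"):
    print("%s: %s" % (module, can_import(module)))
```

If _tkinter is reported as missing, the interpreter was built without Tcl/Tk support, which matches the build message quoted above.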
