os.system and subprocess odd behavior
Example of the issue, for argument's sake:
Platform: Ubuntu Server 12.04 LTS, Python 2.7
Say file1.txt has "hello world" in it.
subprocess.Popen("cat < file1.txt > file2.txt", shell=True)
subprocess.call("cat < file1.txt > file2.txt", shell=True)
os.system("cat < file1.txt > file2.txt")
I'm finding that file2 IS created, but with 0 bytes in it. This happens whenever
I try any sort of command of this nature, where I'm redirecting the output
into a file.
I've made sure it isn't a permission issue. The command runs fine from the
command line, and Python is being run with superuser privileges. Straight from
the terminal I get a "hello world" copy as file2, as expected.
I would like Python to simply exec the command and move on; I don't want to read
the stdout etc. into Python and write it to a file myself. Any thoughts as
to why this creates file2, but no data appears? Is there a better way to do
this?
Thank you!
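For reference, the shell redirection can be sidestepped entirely by letting Python own the file handles and pass them to the child process; a minimal sketch (the file names here are just placeholders, not from a real job):

```python
import subprocess

# create a small input file for the demo
with open('file1.txt', 'w') as f:
    f.write('hello world\n')

# Let Python open the files and hand them to the child as its
# stdin/stdout, instead of relying on the shell's "<" and ">".
with open('file1.txt', 'rb') as fin, open('file2.txt', 'wb') as fout:
    rc = subprocess.call(['cat'], stdin=fin, stdout=fout)
```

This avoids shell=True altogether, so there is no intermediate shell whose permissions or exit status can get in the way.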
--
http://mail.python.org/mailman/listinfo/python-list
Re: os.system and subprocess odd behavior
Thanks! I am using .txt extensions. Sorry for being a little vague.
Re: os.system and subprocess odd behavior
Thanks for verifying this for me, Steven. I'm glad you are seeing it work. It's really the strangest thing. The issue seems to be with the " > outfile.txt" portion of the command. The actual command runs a query on a columnar DB and dumps the result. The EXACT command run from the command line runs just fine. Even if I use the simple cat command and an output file as just a simple test case, the file is created with zero bytes (see below). It's as if Python moves on, or gets a 0 exit code, after the first part of the command is executed, and no data is written.

-rw-r--r-- 1 root root 0 Dec 14 15:33 QUAD_12142012203251.TXT

Any thoughts as to why this may happen on my end? Thanks again!
Re: os.system and subprocess odd behavior
Oscar, it seems you may be correct. I need to run this program as a superuser. However, after some more tests with simple commands, everything seems to work correctly from any permission level in Python EXCEPT the command that writes output from the database to a file, which runs fine if I paste it into the command line. Also, subprocess.check_call() returns clean; however, nothing is written to the output file when called from Python.

So this command runs great from the command line (sudo or not), although the output file in this case is owned by the sysadmin either way, not root:

/usr/local/Calpont/mysql/bin/mysql --defaults-file=/usr/local/Calpont/mysql/my.cnf -u root myDB < /home/myusr/jobs/APP_JOBS/JOB_XXX.SQL > /home/myusr/jobs/APP_JOBS/JOB_XXX.TXT

When run from sudo python (other files are also created and owned by root correctly), no output is written from the db command: a zero-byte file only (owned by root), and it returns to Python with no errors. I'm sort of at a loss. I'd still rather avoid having Python connect to the db directly or read the data from stdout; it's a waste of memory and time for what I need. Thanks for any more thoughts.

> Because of the root permissions on the file? What happens if you write
> to a file that doesn't need privileged access?
>
> Instead of running the "exact command", run the cat commands you
> posted (that Steven has confirmed as working) and run them somewhere
> in your user directory without root permissions.
>
> Also you may want to use subprocess.check_call as this raises a Python
> error if the command returns an error code.
>
> Oscar
Re: os.system and subprocess odd behavior
Oscar, I can confirm this behavior from the terminal.
AND this works as well, simulating exactly what I'm doing permissions-wise, and
calling sudo python test.py below:
from subprocess import Popen

f1 = open('TESTDIR/file1.txt', 'w')
f1.write('some test here\n')
f1.close()

cmd1 = 'cat < TESTDIR/file1.txt > TESTDIR/file2.txt'
P = Popen(cmd1, shell=True)
P.wait()

cmd2 = 'cat < TESTDIR/file1.txt | sudo tee TESTDIR/file3.txt'
P = Popen(cmd2, shell=True)
P.wait()
-rw-r--r-- 1 root root 15 Dec 18 12:57 file1.txt
-rw-r--r-- 1 root root 15 Dec 18 12:57 file2.txt
-rw-r--r-- 1 root root 15 Dec 18 12:57 file3.txt
HOWEVER...
when using this command from before, no dice:
/usr/local/Calpont/mysql/bin/mysql
--defaults-file=/usr/local/Calpont/mysql/my.cnf -u root myDB <
/home/myusr/jobs/APP_JOBS/JOB_XXX.SQL > /home/myusr/jobs/APP_JOBS/JOB_XXX.TXT
OR
/usr/local/Calpont/mysql/bin/mysql
--defaults-file=/usr/local/Calpont/mysql/my.cnf -u root myDB <
/home/myusr/jobs/APP_JOBS/JOB_XXX.SQL | sudo tee
/home/myusr/jobs/APP_JOBS/JOB_XXX.TXT
So it's basically as if Python gets a response instantly (perhaps from the
query) and closes the process, since we've verified it's not permissions-related.
Perhaps someone can try a mysql command line such as the above within Python, and
see if you can verify this behavior? I believe the query returning with no errors
is shutting the subshell/process.
I've tried this with all options, p.wait() etc., as well as parsing the command
and running with shell=False.
Again, the exact command runs perfectly when pasted and run from the shell. I'll
try running it a few other ways with some different db options.
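One more variation worth trying (a sketch, not something from the thread): build the pipeline in Python so the shell's redirection, and whichever user the shell runs as, drops out of the picture entirely. `echo` stands in for the real command here:

```python
import subprocess

# Run the producer with its stdout captured by Python...
producer = subprocess.Popen(['echo', 'some test here'],
                            stdout=subprocess.PIPE)
out, _ = producer.communicate()

# ...and write the result with whatever permissions this Python
# process has, instead of whatever an intermediate shell had.
with open('file3.txt', 'wb') as f:
    f.write(out)
```

If the producer writes its data but the redirected file stays empty, this form will show the data in `out`, which narrows the problem down to the redirection step.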
> Follow through the bash session below
>
> $ cd /usr
> $ ls
> bin games include lib local sbin share src
> $ touch file
> touch: cannot touch `file': Permission denied
> $ sudo touch file
> [sudo] password for oscar:
> $ ls
> bin file games include lib local sbin share src
> $ cat < file > file2
> bash: file2: Permission denied
> $ sudo cat < file > file2
> bash: file2: Permission denied
> $ sudo cat < file | tee file2
> tee: file2: Permission denied
> $ sudo cat < file | sudo tee file2
> $ ls
> bin file file2 games include lib local sbin share src
>
> The problem is that when you do
>
> $ sudo cmd > file2
>
> it is sort of like doing
>
> $ sudo cmd | this_bash_session > file2
>
> so the permissions used to write to file2 are the same as the bash
> session rather than the command cmd, which has root permissions. By
> piping my output into "sudo tee file2" I can get file2 to be written
> by a process that has root permissions.
>
> I suspect you have the same problem, although it is all complicated by the
> fact that everything is a subprocess of Python. Is it possibly the
> case that the main Python process does not have root permissions but
> you are using it to run a command with sudo that then does have root
> permissions?
>
> Does piping through something like "sudo tee" help?
>
> Oscar
Re: os.system and subprocess odd behavior
Solved the issue by injecting the query into the command line. A shell script worked fine, as if I was cutting and pasting to the prompt. There still seems to be something with the subprocess receiving an exit code before or when the query finishes, but only when I ask it to read from the .SQL file.

Example, called from within Python:

mysql < file.txt > out.txt    <- doesn't work (query is run, 0-byte output)
mysql -e "my query" > out.txt <- does work

However, this isn't standard mysql, as it's InfiniDB, so maybe this is an esoteric issue. Thanks for the help, Oscar. It's frustrating, since it seems illogical: if the command runs in the shell, it should have the exact same behavior from a subprocess shell=True command-string call. If I find anything else, I'll update this.
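The working `-e` form can be driven from Python by reading the .SQL file yourself and passing its text as an argument; a sketch with `echo` standing in for the mysql binary (the file names are placeholders):

```python
import subprocess

# Stand-in query file; in the real case this would be JOB_XXX.SQL.
with open('job.sql', 'w') as f:
    f.write('SELECT 1')

# Read the query text ourselves, then pass it as an argument
# (the working "mysql -e" form) rather than via "< job.sql".
query = open('job.sql').read()

# 'echo' stands in for the real binary here; with mysql it would be
# ['mysql', '--defaults-file=...', '-u', 'root', 'myDB', '-e', query].
with open('job_out.txt', 'wb') as out:
    rc = subprocess.call(['echo', query], stdout=out)
```

This keeps the "inject the query into the command line" fix while letting Python handle the output file, so no shell redirection is involved at all.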
pytables - best practices / mem leaks
I have an H5 file with one group (off the root) and two large main
tables, and I'm attempting to aggregate my data into 50+ new groups (off
the root) with two tables per subgroup.
sys info:
PyTables version: 1.3.2
HDF5 version: 1.6.5
numarray version: 1.5.0
Zlib version: 1.2.3
BZIP2 version: 1.0.3 (15-Feb-2005)
Python version:2.4.2 (#1, Jul 13 2006, 20:16:08)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)]
Platform: darwin-Power Macintosh (v10.4.7)
Byte-ordering: big
Ran all pytables tests included with the package and received an OK.
Using the following code I get one of three errors:
1. Illegal Instruction
2. Malloc(): trying to call free() twice
3. Bus Error
I believe all three stem from the same issue, involving a malloc()
memory problem in the PyTables C libraries. I also believe this may be
due to how I'm attempting to write my sorting script.
The script executes fine and all goes well until I'm sorting about
group 20 to 30, when I throw one of the three above errors, depending on
how/when I flush() and close() the file. When I open the file after the
error using h5ls, all tables are in perfect order up to the crash, and if
I continue from that point everything runs fine until Python throws the
same error again after another 10 sorts or so. The somewhat random
crashing is what leads me to believe I have a memory leak, or that my
method of doing this is incorrect.
Is there a better way to aggregate data using pytables/python? Is there
a better way to be doing this? This seems straightforward enough.
Thanks,
Conor
# function to agg state data from main neg/pos tables into neg/pos state tables
import string
import tables

def aggstate(state, h5file):
    print state

    class PosRecords(tables.IsDescription):
        sic = tables.IntCol(0, 1, 4, 0, None, 0)
        numsic = tables.IntCol(0, 1, 4, 0, None, 0)
        empsiz = tables.StringCol(1, '?', 1, None, 0)
        salvol = tables.StringCol(1, '?', 1, None, 0)
        popcod = tables.StringCol(1, '?', 1, None, 0)
        state = tables.StringCol(2, '?', 1, None, 0)
        zip = tables.IntCol(0, 1, 4, 0, None, 1)

    class NegRecords(tables.IsDescription):
        sic = tables.IntCol(0, 1, 4, 0, None, 0)
        numsic = tables.IntCol(0, 1, 4, 0, None, 0)
        empsiz = tables.StringCol(1, '?', 1, None, 0)
        salvol = tables.StringCol(1, '?', 1, None, 0)
        popcod = tables.StringCol(1, '?', 1, None, 0)
        state = tables.StringCol(2, '?', 1, None, 0)
        zip = tables.IntCol(0, 1, 4, 0, None, 1)

    group1 = h5file.createGroup("/", state+"_raw_records", state+" raw records")
    table1 = h5file.createTable(group1, "pos_records", PosRecords, state+" raw pos record table")
    table2 = h5file.createTable(group1, "neg_records", NegRecords, state+" raw neg record table")

    table = h5file.root.raw_records.pos_records
    point = table1.row
    for x in table.iterrows():
        if x['state'] == state:
            point['sic'] = x['sic']
            point['numsic'] = x['numsic']
            point['empsiz'] = x['empsiz']
            point['salvol'] = x['salvol']
            point['popcod'] = x['popcod']
            point['state'] = x['state']
            point['zip'] = x['zip']
            point.append()
    h5file.flush()

    table = h5file.root.raw_records.neg_records
    point = table2.row
    for x in table.iterrows():
        if x['state'] == state:
            point['sic'] = x['sic']
            point['numsic'] = x['numsic']
            point['empsiz'] = x['empsiz']
            point['salvol'] = x['salvol']
            point['popcod'] = x['popcod']
            point['state'] = x['state']
            point['zip'] = x['zip']
            point.append()
    h5file.flush()

states = ['AL','AK','AZ','AR','CA','CO','CT','DC','DE','FL','GA','HI','ID','IL','IN','IA','KS','KY','LA','ME','MD','MA','MI','MN','MS','MO','MT','NE','NV','NH','NJ','NM','NY','NC','ND','OH','OK','OR','PA','RI','SC','SD','TN','TX','UT','VT','VA','WA','WV','WI','WY']

h5file = tables.openFile("200309_data.h5", mode='a')
for i in xrange(len(states)):
    aggstate(states[i], h5file)
h5file.close()
run a string as code?
How can you make Python interpret a string (of Python code) as code? For example, if you want a Python program to modify itself as it runs. I know this is an advantage of interpreted languages; how is this done in Python? Thanks.
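For the record, Python has three levels of this: eval() for expressions, exec for statements, and compile() for reusable code objects. A quick sketch:

```python
# eval: evaluate a string containing a single expression
assert eval('2 + 3') == 5

# exec: run a string containing arbitrary statements,
# here into an explicit namespace dict
namespace = {}
exec('x = [i * i for i in range(5)]', namespace)
assert namespace['x'] == [0, 1, 4, 9, 16]

# compile: parse once, run many times
code = compile('x * 2', '<string>', 'eval')
assert eval(code, {'x': 21}) == 42
```

Both eval and exec run whatever code is in the string, so they should never be fed untrusted input.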
Re: run a string as code?
[EMAIL PROTECTED] wrote:
> py_genetic wrote:
> > How can you make python interpret a string (of py code) as code. For
> > example if you want a py program to modify itself as it runs. I know
> > this is an advantage of interpreted languages, how is this done in
> > python. Thanks.
>
> This might do it...
>
> >>> print eval.__doc__
> eval(source[, globals[, locals]]) -> value
>
> Evaluate the source in the context of globals and locals.
> The source may be a string representing a Python expression
> or a code object as returned by compile().
> The globals must be a dictionary and locals can be any mapping,
> defaulting to the current globals and locals.
> If only globals is given, locals defaults to it.
For example, each time this line is interpreted I would like to use the
new value of the state var, which is a global var. How can I force
state to be identified and used in this string?

r_table = h5file.root.state_raw_records.neg_records

r_table = eval("h5file.root.state_raw_records.neg_records") ??
r_table = h5file.root.eval("state")_raw_records.neg_records ?? (eval is
not a part of root)

I don't think either of these is very logical. Any ideas? Possibly the
parser mod?
Re: run a string as code?
py_genetic wrote:
> [earlier eval discussion snipped]
Got it!
tmp = "h5file.root."+state+"_raw_records.pos_records"
r_table = eval(tmp)
works great, thanks for the help!
Re: run a string as code?
Gary Herron wrote:
> py_genetic wrote:
> > [earlier eval discussion snipped]
> >
> > Got it!
> >
> > tmp = "h5file.root."+state+"_raw_records.pos_records"
> > r_table = eval(tmp)
> >
> > works great thanks for the help!
> >
> Yes, it works, but this is not a good place to use eval. Now that we see
> how you want to use it, we can find a *much* better way to do it.
>
> If you want to lookup an attribute of an object, but the attribute name
> is a string in a variable, then use getattr to do the lookup.
>
> If I interpret your code correctly:
>
> attrname = state + "_raw_records"
> obj = getattr(h5file.root, attrname)
> r_table = obj.pos_records
>
> These, of course, could be combined into a single (but not necessarily
> clearer) line.
>
> Gary Herron
So eval() is more appropriate when evaluating blocks of string
code, and getattr() is more efficient for dealing with objects such as
the h5file object above? Thanks.
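To make the difference concrete, here is a small sketch; the Dummy class is just a stand-in for the h5file object, not part of PyTables:

```python
class Dummy(object):
    pass

# build a stand-in for h5file.root with a CA_raw_records attribute
root = Dummy()
ca = Dummy()
ca.pos_records = ['row1', 'row2']
setattr(root, 'CA_raw_records', ca)

state = 'CA'

# getattr: attribute lookup by string name -- no parsing, and no
# chance of executing arbitrary code embedded in the string
r_table = getattr(root, state + '_raw_records').pos_records
```

Beyond efficiency, getattr also avoids the security problem: eval will happily run any expression that ends up in the string, while getattr can only look up an attribute.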
Re: Create a new class on the fly
Alex, thanks for the advice:

> > class PosRecords(tables.IsDescription):
> > class A(object):
> > self.__init__(self, args):
>
> This makes 0 sense; maybe you should learn elementary Python syntax well
> _before_ trying advanced stuff, no?

I accidentally left that erroneous snippet in. However, if you're offering a class in smart-ass, let me know where to sign up.
Efficient way of generating original alphabetic strings like unix file "split"
Hi, I'm looking to generate x alphabetic strings in a list of size x. This is exactly the same output that the unix command "split" generates as default file-name output when splitting large files.

Example: produce x original, but not random, strings from the English alphabet, all lowercase. The length of each string and the possible combinations depend on x. You don't want any repeats.

[aaa, aab, aac, aad, aax, .. bbc, bbd, bcd]

I'm assuming there is a slick, Pythonic way of doing this, besides writing out a beast of a looping function. I've looked around on the ActiveState cookbook, but have come up empty-handed. Any suggestions?

Thanks,
Conor
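itertools makes this nearly a one-liner; a sketch of split-style fixed-width names (width 3 here, like aaa, aab, ...), generated lazily so large counts don't build the full list:

```python
import itertools
import string

def split_names(width, count):
    """First `count` fixed-width lowercase names, in sorted order."""
    letters = string.ascii_lowercase
    names = (''.join(t) for t in
             itertools.product(letters, repeat=width))
    return list(itertools.islice(names, count))

names = split_names(3, 5)   # ['aaa', 'aab', 'aac', 'aad', 'aae']
```

itertools.product yields tuples in lexicographic order, which is exactly the aaa, aab, aac sequence split uses, with no repeats by construction.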
Re: Efficient way of generating original alphabetic strings like unix file "split"
> You didn't try hard enough. :)
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/190465
>
> --
> HTH,
> Rob

Thanks Rob, "permutation" was the keyword I should have used!
Re: Efficient way of generating original alphabetic strings like unix file "split"
On Jun 14, 3:02 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> See my other post to see if that is indeed what you mean.

Thanks, mensanator. I see what you are saying, and I appreciate the clarification. I modified the unique version to fit my needs; sometimes you just want the first x unique combinations, of the right "width" (A or AA or AAA...), so I reworked it a bit to be more efficient.

Isn't this a case of base^n - 1 for the number of unique combinations? Using the alphabet: 26^strlen - 1. Or, to figure out strlen from the number of combinations needed: ln(26 * #combinations needed) / ln(26). Obviously a float, but a pretty good idea of the strlen needed when rounded?
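As a sanity check on that arithmetic: there are exactly 26**n names of length n over a 26-letter alphabet, so the minimum fixed width for x names can be found directly rather than trusting the log formula's float rounding. A small sketch:

```python
def min_width(x, base=26):
    """Smallest n such that base**n >= x, i.e. the fixed width
    needed to provide x distinct names of a single length."""
    n = 1
    while base ** n < x:
        n += 1
    return n

assert min_width(26) == 1          # a..z covers 26 names
assert min_width(27) == 2          # one more forces width 2
assert min_width(26 ** 2) == 2     # aa..zz covers 676
assert min_width(26 ** 2 + 1) == 3
```

The integer comparison avoids the edge cases where ln(x)/ln(26) lands just above or below a whole number due to floating-point error.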
converting strings to most their efficient types '1' --> 1, 'A' ---> 'A', '1.2'---> 1.2
Hello, I'm importing large text files of data using csv. I would like to add some more auto-sensing abilities. I'm considering sampling the data file and doing some fuzzy-logic scoring on the attributes (columns in a database / csv file, e.g. height, weight, income, etc.) to determine the most efficient 'type' to convert each attribute column into, for further processing and efficient storage.

Example row from sampled file data:

[['8', '2.33', 'A', 'BB', 'hello there', '100,000,000,000'], [next row...]]

Aside from a missing attribute designator, we can assume that the same type of data continues through a column: for example, a string, int8, int16, float, etc.

1. What is the most efficient way in Python to test whether a string can be converted into a given numeric type, or left alone if it's really a string like 'A' or 'hello'? Speed is key. Any thoughts?

2. Is there anything out there already which deals with this issue?

Thanks,
Conor
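For question 1, the usual idiom is to just try the conversions and catch the failure; a minimal sketch (the function name is made up):

```python
def sniff(value):
    """Return value coerced to int or float when possible,
    otherwise return it unchanged as a string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

row = ['8', '2.33', 'A', 'hello there']
typed = [sniff(v) for v in row]   # [8, 2.33, 'A', 'hello there']
```

Note the order matters: int('2.33') raises ValueError, so float gets a chance; trying float first would turn every integer column into floats.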
Re: converting strings to most their efficient types '1' --> 1, 'A' ---> 'A', '1.2'---> 1.2
This is excellent advice, thank you gentlemen.

Paddy: We can't really, in this arena, make assumptions about the data source. I fully agree with your point, but if we had the luxury of really knowing the source we wouldn't be having this conversation. Files we deal with could be consumer data files, log files, financial files, all from different users, BCP-ed out or csv, Excel, etc. However, I agree that we can make one basic assumption: for each column there is a correct, and furthermore optimal, format. In many cases we may have a supplied "data dictionary" with the data, in which case you are right and we can override much of this process, except we still need to find the optimal format, like int8 vs int16.

James: Using a Bayesian method was my initial thought as well. The key to this method, I feel, is getting a solid random sample of the entire file without having to load the whole beast into memory. What are your thoughts on other techniques? For example, training a neural net and feeding it a sample; this might be nice and very fast, since after training (we would have to create a good global training set) we could just do a quick transform on a column sample and average the probabilities of the output units (one output unit for each type). The question here would be encoding; any ideas? A binary rep of the vars? Furthermore, naive Bayes, decision trees, etc.?

John:
> The approach that I've adopted is to test the values in a column for all
> types, and choose the non-text type that has the highest success rate
> (provided the rate is greater than some threshold e.g. 90%, otherwise
> it's text).
> For large files, taking a 1/N sample can save a lot of time with little
> chance of misdiagnosis.

I like your approach; this could be simple. Initially, I was thinking a loop that did exactly this: just test the sample columns for "hits" and take the best. Thanks for the sample code.

George: Thank you for offering to share your transform function. I'm very interested.
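John's highest-success-rate approach can be sketched like this (threshold and sampling are simplified; the function name is made up):

```python
def column_type(values, threshold=0.9):
    """Pick the non-text type with the highest conversion success
    rate over a sampled column; fall back to str below threshold."""
    best, best_rate = str, 0.0
    for cast in (int, float):
        hits = 0
        for v in values:
            try:
                cast(v)
                hits += 1
            except ValueError:
                pass
        rate = float(hits) / len(values)
        if rate >= threshold and rate > best_rate:
            best, best_rate = cast, rate
    return best

# 9 of 10 parse as int: one bad value doesn't force the column to text
assert column_type(['1', '2', 'x', '4', '5',
                    '6', '7', '8', '9', '10']) is int
assert column_type(['1.5', '2.25', '3']) is float
assert column_type(['a', 'b', '3']) is str
```

Trying int before float, and only replacing on a strictly better rate, means an all-integer column stays int even though every int also parses as float.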
Create a new class on the fly
Is this possible, or is there a better way? I need to create a new class during runtime, to be used inside a function. The class definition and body are dependent on unknown vars at time of exec, thus my reasoning here.

class PosRecords(tables.IsDescription):

class A(object):
    self.__init__(self, args):

    def mkClass(self, args):
        eval("class B(object): ...")  # definition of B is dependent on dynamic values in string

    ..do stuff with class

Thanks.
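For what it's worth, the usual way to build a class at runtime is the three-argument form of type(), rather than eval on a class statement; a sketch with made-up field names:

```python
def make_record_class(name, fields):
    """Create a new class at runtime: name, base classes,
    and a dict of class attributes built from runtime data."""
    return type(name, (object,), dict(fields))

# the class definition is driven entirely by values known only at runtime
B = make_record_class('B', {'state': 'CA', 'zip': 0})
rec = B()
assert rec.state == 'CA'
assert B.__name__ == 'B'
```

Methods can be added the same way, by putting plain functions in the attribute dict; no string evaluation is needed at any point.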
Re: segmentation fault in scipy?
> No! matrix objects use matrix multiplication for *. You seem to need
> elementwise multiplication.

No! When you multiply a vector with itself transposed, the diagonal of the resulting matrix is the squares of each error (albeit you do a lot of extra calc); then sum the squares, i.e. trace(). It's a nifty trick if you don't have too much data (a 25000x25000 matrix in mem) and you're using matrices, i.e. batch learning. The actual equation includes multiplying by 1/2 * (sum of the squares), but mean squared error can be more telling about error, and cross-entropy is even better, because it tells you how well you're predicting the posterior probabilities...
Re: segmentation fault in scipy?
True! It is ridiculous/insane, as I mentioned and noted and agreed with you (in all your responses), and was my problem. However, it's not wrong (same result), as I was just simply noting (not trying to be right); although, yes, insane. Thanks again.
Re: segmentation fault in scipy?
> Now I'm even more confused. What kind of array is "error" here? First you tell
> me it's a (25000, 80) array and now you are telling me it is a (25000,) array.
> Once you've defined what "error" is, then please tell me what the quantity is
> that you want to calculate. I think I told you several different wrong things,
> previously, based on wrong assumptions.

It's just that in my original post I was trying to get across the maximum size of the arrays I'm using. Sorry for the confusion; I didn't state the actual size of my output vectors. I discovered the problem when you first stated:

> If error.shape == (25000, 80), then dot(error, transpose(error)) will be
> returning an array of shape (25000, 25000)

which was exactly related to the excessive calculation I was running, and set off the red flags and made it very clear. Later I was somewhat confused, and believed that we were talking about two different things regarding SSE when you said:

SSE = sum(multiply(error, error), axis=None)

and didn't realize that multiply() was an efficient element-wise mult method for matrices, thinking it was like matrixmultiply() and gave back the old trace SSE method but with the casting (not meaning to contradict you). However, all is now very clear, and I agree with element-wise mult, or even squaring the output error, and have no real reason why I was using trace, except that I had a faint memory of using it in a class for small experiments in matlab (I guess the idea was to keep everything in linear algebra) and I spit it up for some reason when I was doing a quick error function.
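The two routes to the sum of squared errors can be compared in plain Python (no numpy needed for the sketch): the trace of the outer product equals the direct element-wise sum of squares, but the outer product builds an n x n matrix just to read n useful numbers off its diagonal.

```python
error = [1.0, -2.0, 3.0]

# direct: square and sum -- O(n) work and memory
sse_direct = sum(e * e for e in error)

# trace trick: form the outer product e e^T, then sum the
# diagonal -- same answer, but O(n^2) work and memory
outer = [[a * b for b in error] for a in error]
sse_trace = sum(outer[i][i] for i in range(len(error)))

assert sse_direct == sse_trace == 14.0
```

At n = 25000 the outer-product route allocates 625 million entries where the direct route touches 25000, which is exactly the explosion described in the thread.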
