import issues python3.4
Dear all,

I have a software package written in Python with a structure like this:

package/
    run_my_gui.py   # gui
    sub_package/
        main.py
        sub_sub_package/
            __init__.py
            A.py
            B.py
            C.py

All the imports in main.py are absolute and look like this:

(1) from package.sub_package.sub_sub_package import A

The issue is that I have an external script that tests main.py. I do not want to manually edit all the import statements in my code to look like this:

(2) from sub_sub_package import A

Is there any way I can run my external script without changing the absolute imports from (1) to (2) in my code? I use subprocess.call to run main.py within my external script, and main.py takes a few arguments.

Thanks a lot in advance for your replies,
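One possible workaround (a sketch, not from the original thread; the parent directory and the two arguments are placeholders) is to launch main.py as a module, with the working directory set to the directory that contains package/, so the absolute imports in (1) keep resolving. This assumes package/ and sub_package/ are importable as packages (e.g. each has an __init__.py):

import subprocess
import sys

# Run main.py as the module package.sub_package.main from the directory
# that CONTAINS "package/"; the absolute import in (1) then resolves.
# "/path/to/parent", "arg1" and "arg2" are placeholders.
subprocess.call(
    [sys.executable, "-m", "package.sub_package.main", "arg1", "arg2"],
    cwd="/path/to/parent",
)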
Re: PySide window does not resize to fit screen
Hi Chris and others,

I have re-designed my user interface to use layouts, and now it fits the screen perfectly. Thanks a lot for all your help on this thread.

Best,
how to check whether two binary (h5) files numerically have the same contents
Dear all,

I am trying to compare the contents of two binary files. I use Python 3.6's filecmp to compare same-named files inside two directories:

results_dummy = filecmp.cmpfiles(dir1, dir2, common, shallow=True)

The line above works for the *.bin files I have in both directories, but it does not work with h5 files. When comparing two HDF5 files that contain exactly the same groups/datasets and numerical data, filecmp.cmpfiles reports them as a mismatch. My HDF5 files are not byte-for-byte equal, but they contain exactly the same data. Is there any way to compare the contents of two HDF5 files from within a Python script, without using h5diff?

Thanks in advance,
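A possible approach (a sketch, not from the original thread; assumes the datasets hold numeric data) is to open both files with h5py and compare dataset contents numerically instead of byte-wise:

import h5py
import numpy as np

def h5_equal(path1, path2, rtol=1e-9, atol=0.0):
    """Compare two HDF5 files dataset-by-dataset, numerically."""
    with h5py.File(path1, "r") as f1, h5py.File(path2, "r") as f2:
        names1, names2 = [], []
        # visit() walks every group/dataset name in the file
        f1.visit(names1.append)
        f2.visit(names2.append)
        if sorted(names1) != sorted(names2):
            return False  # different group/dataset structure
        for name in names1:
            obj1, obj2 = f1[name], f2[name]
            if isinstance(obj1, h5py.Dataset):
                if not isinstance(obj2, h5py.Dataset):
                    return False
                # compare the actual numbers, not the on-disk bytes
                if not np.allclose(obj1[...], obj2[...], rtol=rtol, atol=atol):
                    return False
    return True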
pyinstaller not finding external fortran executable
Dear all,

My code consists of a graphical user interface written in PySide. This GUI calls a Fortran executable using the following lines in my script:

curPath = os.path.dirname(os.path.realpath(__file__))
execPath = os.path.join(curPath, "myFORTRAN.out")
returnValue = subprocess.call([execPath, self.runConfig])

The script works fine and finds the executable. The problem is when I use PyInstaller. I use the following command to create a standalone executable that bundles my script, the external Fortran executable and all necessary Python modules/libraries into one file:

pyinstaller --hidden-import=h5py.defs --hidden-import=h5py.utils --hidden-import=h5py.h5ac --hidden-import=h5py._proxy main_script.py --onefile

I get the following error when running this standalone executable:

File "subprocess.py", line 534, in call
File "subprocess.py", line 856, in __init__
File "subprocess.py", line 1464, in _execute_child
FileNotFoundError: [Errno 2] No such file or directory: '/myFORTRAN.out'

If I manually copy myFORTRAN.out into the same directory as the standalone executable created by PyInstaller, then it works. Does anybody know how to fix this?

Thanks in advance for your help,
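One possible fix (a sketch, not from the original thread) is to resolve the path against sys._MEIPASS, the temporary directory a PyInstaller --onefile bundle unpacks into at run time, falling back to the script directory when running unbundled:

import os
import subprocess
import sys

def bundled_path(name):
    # PyInstaller --onefile unpacks bundled data into sys._MEIPASS at run
    # time; outside a bundle that attribute does not exist, so fall back
    # to the directory of this script.
    base = getattr(sys, "_MEIPASS", os.path.dirname(os.path.realpath(__file__)))
    return os.path.join(base, name)

execPath = bundled_path("myFORTRAN.out")
returnValue = subprocess.call([execPath, "run.cfg"])  # "run.cfg" is a placeholder

The Fortran executable would also need to be added to the bundle, e.g. with --add-binary "myFORTRAN.out:." on Linux (PyInstaller uses ; as the separator on Windows).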
Reading Fortran Ascii output using python
Hi all,

I am trying to read an ASCII file written by Fortran 90 using Python. I open the input file and read it with:

inputfile.readline()

Each line of the ASCII file holds a few numbers, like this:

line 1: 1
line 2: 1000.834739 2000.38473 3000.349798
line 3: 1000 2000 5000.69394 99934.374638 54646.9784

The problem is that when a line holds more than 3 numbers, such as line 3, Python seems to need two reads for it, so the example above is read like this:

line 1: 1
line 2: 1000.834739 2000.38473 3000.349798
line 3: 1000 2000 5000.69394
line 4: 99934.374638 54646.9784

How can I fix this so that each Fortran line is read correctly in Python?

Thanks in advance for your help,
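If the line breaks themselves do not matter, one workaround (a sketch, not from the thread; the file name is a placeholder) is to ignore them entirely and read whitespace-separated tokens:

# Read every whitespace-separated number, regardless of where the Fortran
# writer happened to break its lines.
with open("fortranfile.txt") as f:
    numbers = [float(tok) for tok in f.read().split()]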
Re: Reading Fortran Ascii output using python
On Monday, October 31, 2016 at 6:30:12 PM UTC+1, Irmen de Jong wrote:
> On 31-10-2016 18:20, Heli wrote:
> > Hi all,
> >
> > I am trying to read an ascii file written in Fortran90 using python. I am
> > reading this file by opening the input file and then reading using:
> >
> > inputfile.readline()
> >
> > On each line of the ascii file I have a few numbers like this:
> >
> > line 1: 1
> > line 2: 1000.834739 2000.38473 3000.349798
> > line 3: 1000 2000 5000.69394 99934.374638 54646.9784
> >
> > The problem is when I have more than 3 numbers on the same line such as
> > line 3, python seems to read this using two reads. This makes the above
> > example will be read like this:
> >
> > line 1: 1
> > line 2: 1000.834739 2000.38473 3000.349798
> > line 3: 1000 2000 5000.69394
> > line 4: 99934.374638 54646.9784
> >
> > How can I fix this for each fortran line to be read correctly using python?
> >
> > Thanks in Advance for your help,
> >
> >
>
> You don't show any code so it's hard to say what is going on.
> My guess is that your file contains spurious newlines and/or CRLF
> combinations.
>
> Try opening the file in universal newline mode and see what happens?
>
> with open("fortranfile.txt", "rU") as f:
> for line in f:
> print("LINE:", line)
>
>
> Irmen
Thanks Irmen,
I tried with "rU" but that did not make a difference. The problem is a line
that with one single write statement in my fortran code :
write(UNIT=9,FMT="(99g20.8)") value
seems to be read in two python inputfile.readline().
Any ideas how I should be fixing this?
Thanks,
Re: Reading Fortran Ascii output using python
On Monday, October 31, 2016 at 8:03:53 PM UTC+1, MRAB wrote:
> On 2016-10-31 17:46, Heli wrote:
> > [full quote of the previous message snipped]
> What is actually in the file?
>
> Try opening it in binary mode and print using the ascii function:
>
> with open("fortranfile.txt", "rb") as f:
> contents = f.read()
>
> print("CONTENTS:", ascii(contents))
Thanks guys,
I solved the problem on the Fortran side. Some lines contained newline
characters, which I fixed by setting format descriptors in Fortran.
Thanks for your help,
data interpolation
Hi,

I have a question about data interpolation using Python. I have a big ASCII file containing around 200M points in the following format:

id, xcoordinate, ycoordinate, zcoordinate

Then I have a second file containing data in the following format (2M values):

id, xcoordinate, ycoordinate, zcoordinate, value1, value2, value3, ..., valueN

I need to get values at the x,y,z coordinates of file 1 from the values of file 2. I don't know whether the data in files 1 and 2 comes from a structured or an unstructured grid source. I was wondering which interpolation module, from either scipy or scikit-learn, you would recommend? I would also appreciate it if you could point me to some sample examples/references.

Thanks in advance for your help,
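For scattered (unstructured) points, one common option (a sketch with placeholder arrays, not from the thread) is scipy.interpolate.griddata:

import numpy as np
from scipy.interpolate import griddata

# known points (from file 2) and query points (from file 1); placeholders
known_xyz = np.random.rand(2000, 3)
known_val = np.random.rand(2000)
query_xyz = np.random.rand(5000, 3)

# 'nearest' works for any point cloud; 'linear' only fills queries inside
# the convex hull of the known points (NaN outside by default).
interp = griddata(known_xyz, known_val, query_xyz, method="nearest")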
best way to read a huge ascii file.
Hi,

I have a huge ASCII file (40G) with around 100M lines. I read this file using:

f = np.loadtxt(os.path.join(dir, myfile), delimiter=None, skiprows=0)
x = f[:, 1]
y = f[:, 2]
z = f[:, 3]
id = f[:, 0]

I will need the x, y, z and id arrays later for interpolations. The problem is that reading the file takes around 80 min, while the interpolation only takes 15 min. I was wondering if there is a more optimized way to read the file that would reduce the input time? I have the same problem when writing the output using np.savetxt.

Thank you in advance for your help,
Re: best way to read a huge ascii file.
Hi all,

Let me update my question. I have an ASCII file (7G) which has around 100M lines. I read this file using:

f = np.loadtxt(os.path.join(dir, myfile), delimiter=None, skiprows=0)
x = f[:, 1]
y = f[:, 2]
z = f[:, 3]
id = f[:, 0]

I will need the x, y, z and id arrays later for interpolations. The problem is that reading the file takes around 80 min, while the interpolation only takes 15 min.

I tried to get the memory increment used by each line of the script using the Python memory_profiler module. The following line, which reads the entire 7.4 GB file, increments the memory usage by 3206.898 MiB (3.36 GB). The first question is: why does it not increment the memory usage by 7.4 GB?

f = np.loadtxt(os.path.join(dir, myfile), delimiter=None, skiprows=0)

The following 4 lines do not increment the memory at all:

x = f[:, 1]
y = f[:, 2]
z = f[:, 3]
id = f[:, 0]

Finally, I would still appreciate a recommendation on the most optimized way to read/write files in Python: are numpy's np.loadtxt and np.savetxt the best?

Thanks in advance,
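As a point of comparison (a sketch, not from the thread; assumes pandas is installed and the columns are whitespace-delimited), pandas' C parser is usually much faster than np.loadtxt for large text files:

import numpy as np
import pandas as pd

# pandas.read_csv uses a C tokenizer; np.loadtxt parses in pure Python.
# "myfile.txt" is a placeholder.
df = pd.read_csv("myfile.txt", sep="\s+", header=None)
f = df.to_numpy()
id, x, y, z = f[:, 0], f[:, 1], f[:, 2], f[:, 3]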
Re: best way to read a huge ascii file.
Hi all,

Writing my ASCII file once to pickle, npy or HDF5 and then working afterwards on the resulting binary file reduced the read time from 80 min to 2 seconds. Thanks everyone for your help.
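For reference, a minimal sketch of that one-time conversion using the npy format (file names are placeholders):

import numpy as np

# One-time conversion: pay the slow text parse once...
data = np.loadtxt("myfile.txt")
np.save("myfile.npy", data)

# ...then every later run loads the binary file in seconds.
data = np.load("myfile.npy")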
Interpolation gives negative values
Dear all,

I have an ASCII file (f1) with 100M lines with columns x, y, z (new data points), and a second ASCII file (f2) with 2M lines with columns x, y, z and VALUE (v10) (known data points). I need to have values of v10 for all coordinates in f1, interpolated from f2. I am using the following script to interpolate:

from scipy.interpolate import NearestNDInterpolator

# new data points
f1 = np.loadtxt(os.path.join(dir, coord_source), delimiter=None, skiprows=0)
x_coord = f1[:, 1]
y_coord = f1[:, 2]
z_coord = f1[:, 3]

# known data points
f2 = np.loadtxt(os.path.join(dir, val_source), delimiter=None, skiprows=1)
x_val = f2[:, 1]
y_val = f2[:, 2]
z_val = f2[:, 3]
v10 = f2[:, 10]  # value to be interpolated

# my interpolation function
myf_v10 = NearestNDInterpolator((x_val, y_val, z_val), v10)
interpolated_v10 = myf_v10(x_coord, y_coord, z_coord)

I have the following questions:

1. Considering f1 is 50 times bigger than f2, can I still use the above script?
2. The variable v10 that I need to interpolate should always be >= 0 (positive). Using the above script I am getting negative values for v10. How can I fix this?

I would really appreciate your help.

Thanks in advance,
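One diagnostic sketch (not from the thread): nearest-neighbour interpolation can only ever return values that already exist in v10, so negative output points at the input data or the column slicing rather than the interpolator, and clamping is only appropriate for genuine numerical noise:

import numpy as np

# NearestNDInterpolator returns the v10 of the closest known point, so
# negative results imply negative entries in v10 itself (or a column
# mix-up when slicing f2). Check the source first:
print("min of known values:", v10.min())

# If small negatives are confirmed to be noise in the input, clamp:
interpolated_v10 = np.clip(interpolated_v10, 0.0, None)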
Re: numpy arrays
Thanks for your replies. I have a follow-up to my previous question. I have a file that contains x, y, z and a value for that coordinate on each line. Here is an example of the file as a numpy array called f:

f = np.array([[1, 1, 1, 1],
              [1, 1, 2, 2],
              [1, 1, 3, 3],
              [1, 2, 1, 4],
              [1, 2, 2, 5],
              [1, 2, 3, 6],
              [1, 3, 1, 7],
              [1, 3, 2, 8],
              [1, 3, 3, 9],
              [2, 1, 1, 10],
              [2, 1, 2, 11],
              [2, 1, 3, 12],
              [2, 2, 1, 13],
              [2, 2, 2, 14],
              [2, 2, 3, 15],
              [2, 3, 1, 16],
              [2, 3, 2, 17],
              [2, 3, 3, 18],
              [3, 1, 1, 19],
              [3, 1, 2, 20],
              [3, 1, 3, 21],
              [3, 2, 1, 22],
              [3, 2, 2, 23],
              [3, 2, 3, 24],
              [3, 3, 1, 25],
              [3, 3, 2, 26],
              [3, 3, 3, 27]])

After transposing f, I get the x, y and z coordinates:

f_transpose = f.T
x = np.sort(np.unique(f_transpose[0]))
y = np.sort(np.unique(f_transpose[1]))
z = np.sort(np.unique(f_transpose[2]))

Then I create a 3D array to put the values inside. The only way I see to do this is the following:

val2 = np.empty([3, 3, 3])
for sub_arr in f:
    idx = (np.abs(x - sub_arr[0])).argmin()
    idy = (np.abs(y - sub_arr[1])).argmin()
    idz = (np.abs(z - sub_arr[2])).argmin()
    val2[idx, idy, idz] = sub_arr[3]

I know that in the example above I could simply reshape f_transpose[3] to a three by three by three array, but in my real data the coordinates are not in order, and the only way I see is looping over the whole file, which takes a lot of time. I would appreciate any workarounds to make this quicker.

Thanks,
Re: numpy arrays
Thanks a lot Oscar, the lexsort you suggested was the way to go.

import h5py
import numpy as np

f = np.loadtxt(inputFile, delimiter=None)

xcoord = np.sort(np.unique(f[:, 0]))
ycoord = np.sort(np.unique(f[:, 1]))
zcoord = np.sort(np.unique(f[:, 2]))

x = f[:, 0]
y = f[:, 1]
z = f[:, 2]
val = f[:, 3]

# Sort by x, then by y, then by z, then by val
ind = np.lexsort((val, z, y, x))
sortedVal = np.array([val[i] for i in ind]).reshape((xcoord.size, ycoord.size, zcoord.size))
Re: numpy arrays
Thanks Oscar, in my case this did the trick:

sortedVal = np.array(val[ind]).reshape((xcoord.size, ycoord.size, zcoord.size))
Re: numpy arrays
As you said, this did the trick:

sortedVal = np.array(val[ind]).reshape((xcoord.size, ycoord.size, zcoord.size))

Only val[ind] instead of val[ind, :], as val is 1D.

Thanks Oscar,
installing scipy
Hi all,

I have Python 3.4 installed on a Windows 64-bit machine. I am using the Eclipse PyDev editor. I need to use griddata from scipy.interpolate. I installed scipy by downloading the following wheel file:

scipy-0.17.0-cp34-none-win_amd64.whl

and installing it using pip install. The install succeeds, but it seems scipy is not installed correctly, and I get the following error when trying to import and use scipy modules:

C:\Python34\Scripts>python -c "import scipy.interpolate"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python34\lib\site-packages\scipy\interpolate\__init__.py", line 158, in <module>
    from .interpolate import *
  File "C:\Python34\lib\site-packages\scipy\interpolate\interpolate.py", line 11, in <module>
    import scipy.linalg
  File "C:\Python34\lib\site-packages\scipy\linalg\__init__.py", line 174, in <module>
    from .misc import *
  File "C:\Python34\lib\site-packages\scipy\linalg\misc.py", line 5, in <module>
    from .blas import get_blas_funcs
  File "C:\Python34\lib\site-packages\scipy\linalg\blas.py", line 155, in <module>
    from scipy.linalg import _fblas
ImportError: DLL load failed

How can I fix this problem?

Thanks,
Re: installing scipy
Yes, the Python I have installed is 64-bit:

Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32

and the scipy wheel I am trying to install from is:

scipy-0.17.0-cp34-none-win_amd64.whl

At Sayth: thanks for recommending Anaconda. I am already familiar with it, but I was wondering why the scipy installed in my normal Python gives the following error:

ImportError: DLL load failed

Thanks for your comments,
reshape and keep x,y,z ordering
Hello,

I have a question regarding reshaping numpy arrays. I either have a 1D array that I need to reshape to a 3D array, or a 3D array to reshape to a 1D array. In both cases the data is assumed to follow x,y,z ordering, and I use the following to reshape (arr3d and arr1d are the 3D and 1D arrays; nx, ny, nz are the 3D dimensions):

new_1d_array = np.reshape(arr3d.transpose(), (nx * ny * nz,))
new_3d_array = np.reshape(arr1d, (nx, ny, nz)).transpose()

My question is whether there is any way reshape could keep the x,y,z ordering without requiring the transpose, and whether there is a better, more efficient way to do this.

Thanks a lot,
Re: reshape and keep x,y,z ordering
Thanks Michael,
This did the trick. array.flatten('F') works exactly as I need.
Thanks a lot,
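For reference, a minimal sketch of the Fortran-order round trip, which keeps x varying fastest in both directions without any transpose (the sizes here are arbitrary):

import numpy as np

nx, ny, nz = 2, 3, 4
a3d = np.arange(nx * ny * nz).reshape((nx, ny, nz))

# order="F" makes the first axis (x) vary fastest, i.e. x,y,z ordering.
a1d = a3d.flatten(order="F")
back = a1d.reshape((nx, ny, nz), order="F")
assert (back == a3d).all()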
fastest way to read a text file in to a numpy array
Hi,

I need to read a file with many lines into a 2D numpy array. I was wondering what the fastest way to do this is. Is reading the file into a numpy array even the best method, or are there better approaches?

Thanks for your suggestions,
Re: fastest way to read a text file in to a numpy array
Dear all,

After a few tests, I think I need to correct my question a bit. I will give an example here. I have file 1 with 250 lines:

X1,Y1,Z1
X2,Y2,Z2
...

Then I have file 2 with 3M lines:

X1,Y1,Z1,value11,value12,value13,...
X2,Y2,Z2,value21,value22,value23,...

I need to interpolate values for the coordinates in file 1 from file 2 (using nearest). I am using scipy's griddata for this:

scipy.interpolate.griddata(points, values, xi, method='linear', fill_value=nan, rescale=False)

When profiling the code, reading the files into numpy is not the culprit, but griddata is:

time to read file 2 = 2 min
time to interpolate = 48 min

I need to repeat the griddata call above to get the interpolation for each of the value columns. I was wondering if there are any ways to improve the time spent in interpolation.

Thank you very much in advance for your help,
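One way to avoid repeating the whole nearest-neighbour search per column (a sketch with placeholder arrays, not from the thread) is to compute the neighbour indices once with a KD-tree and reuse them for every value column:

import numpy as np
from scipy.spatial import cKDTree

# placeholder data: known points (n, 3), their value columns (n, m),
# and the query coordinates (q, 3) from file 1
points = np.random.rand(10000, 3)
values = np.random.rand(10000, 5)
xi = np.random.rand(250, 3)

tree = cKDTree(points)
_, idx = tree.query(xi)   # nearest-neighbour index per query point, found once
interp = values[idx]      # shape (250, 5): every value column at once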
Getting number of neighbours for a 3d numpy arrays
Hi,

I have a 3D numpy array containing true/false values for each i,j,k. The size of the array is a*b*c. For each cell with indices i,j,k, I need to check all its neighbours and count the neighbour cells with true values. A cell with index i,j,k has the following neighbours:

n1 with indices [i-1,j,k] if i>0; a cell with i=0 does not have any n1 neighbour (left neighbour)
n2 with indices [i+1,j,k] if i<a-1 (right neighbour)
n3 with indices [i,j-1,k] if j>0
n4 with indices [i,j+1,k] if j<b-1
n5 with indices [i,j,k-1] if k>0
n6 with indices [i,j,k+1] if k<c-1

Currently I check the neighbours of each cell one at a time:

n_neigh = 0
if i > 0:
    n1 = myarray[i-1, j, k]
    if n1 == True:
        n_neigh += 1
if i < 248:
    n2 = myarray[i+1, j, k]
    if n2 == True:
        n_neigh += 1
if j > 0:
    n3 = myarray[i, j-1, k]
    if n3 == True:
        n_neigh += 1
if j < 1247:
    n4 = myarray[i, j+1, k]
    if n4 == True:
        n_neigh += 1
if k > 0:
    n5 = myarray[i, j, k-1]
    if n5 == True:
        n_neigh += 1
if k < 169:
    n6 = myarray[i, j, k+1]
    if n6 == True:
        n_neigh += 1

Is there any way I can get an array containing all n1 neighbours, a second array containing all n2 neighbours, and so on, and then add the n1 to n6 arrays element-wise to count, for each cell, the neighbours that meet the condition?

Thanks in advance for your help,
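One vectorized approach (a sketch, assuming scipy is available; the array here is a small placeholder) is to convolve the boolean array with a 6-connectivity kernel, which yields the true-neighbour count of every cell in one call:

import numpy as np
from scipy import ndimage

myarray = np.random.rand(20, 30, 10) > 0.5  # placeholder boolean array

# Kernel with ones on the six face-neighbour positions only.
kernel = np.zeros((3, 3, 3), dtype=int)
kernel[0, 1, 1] = kernel[2, 1, 1] = 1  # i-1, i+1
kernel[1, 0, 1] = kernel[1, 2, 1] = 1  # j-1, j+1
kernel[1, 1, 0] = kernel[1, 1, 2] = 1  # k-1, k+1

# mode="constant", cval=0 treats out-of-bounds neighbours as False,
# matching the boundary checks in the loop above.
n_neigh = ndimage.convolve(myarray.astype(np.int8), kernel,
                           mode="constant", cval=0)

Equivalently, the six shifted arrays the post asks for can be built with zero-padded slices and summed element-wise; the convolution does exactly that in one step.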
reshape with xyz ordering
Hi,

I have a file with 4 columns (x, y, z, somevalue), and I sort it using numpy.lexsort:

ind = np.lexsort((val, z, y, x))
myval = val[ind]

myval is a 1D numpy array sorted by x, then y, then z and finally val. How can I reshape myval correctly so that I get a 3D numpy array maintaining the x,y,z ordering of the data? My data looks like the following:

x,y,z,val
0,0,0,val1
0,0,1,val2
0,0,2,val3
...

Thanks a lot for your help,
Re: reshape with xyz ordering
Thanks for your replies. Let me explain my problem a little bit more. I have the following data, which I read from a file using numpy.loadtxt and then sort using np.lexsort:

x = f[:, 0]    # X column
y = f[:, 1]    # Y column
z = f[:, 2]    # Z column
val = f[:, 3]  # val column

xcoord = np.sort(np.unique(f[:, 0]))  # X coordinates
ycoord = np.sort(np.unique(f[:, 1]))  # Y coordinates
zcoord = np.sort(np.unique(f[:, 2]))  # Z coordinates

ind = np.lexsort((val, z, y, x))
val_sorted = np.array(val[ind])

I know that the val column is sorted first by x, then by y, then by z, which means that column x changes slowest and column z changes fastest:

x,y,z,val
0,0,0,val1
0,0,1,val2
0,0,2,val3
0,0,zn,valn
...
xn,yn,zn,valfin

I want to reshape val_sorted into a 3D numpy array of (nx, ny, nz). Which of the following is the correct way, and why?

#1
val_sorted_reshaped = val_sorted.reshape((xcoord.size, ycoord.size, zcoord.size))
#2
#val_sorted_reshaped = val_sorted.reshape((xcoord.size, ycoord.size, zcoord.size)).transpose()
#3
#val_sorted_reshaped = val_sorted.reshape((zcoord.size, ycoord.size, xcoord.size))
#4
#val_sorted_reshaped = val_sorted.reshape((zcoord.size, ycoord.size, xcoord.size)).transpose()

Thanks,
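A small self-check with a synthetic grid (a sketch, not from the thread) supports option #1: with x slowest and z fastest, a plain C-order reshape to (nx, ny, nz) lines the axes up without any transpose:

import numpy as np

nx, ny, nz = 2, 3, 4
gx, gy, gz = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                         indexing="ij")
x, y, z = gx.ravel(), gy.ravel(), gz.ravel()
val = np.arange(nx * ny * nz, dtype=float)

ind = np.lexsort((val, z, y, x))
v3 = val[ind].reshape((nx, ny, nz))  # option #1

# v3[i, j, k] should be the val recorded at x==i, y==j, z==k
i, j, k = 1, 2, 3
assert v3[i, j, k] == val[(x == i) & (y == j) & (z == k)][0]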
3D numpy array subset
Dear all,

I am reading a dataset from an HDF5 file using h5py. My datasets are 3D. I then need to check whether another 3D numpy array is a subset of the 3D array I am reading from the file. In general, is there any way to check whether two 3D numpy arrays have an intersection and, if so, get the indices of the intersection area? By intersection, I mean exactly the intersection definition used in set theory.

Thanks in advance for your help,
np.searchSorted over 2D array
Dear all,

I need to check whether two 2D numpy arrays have an intersection and, if so, I need the cell indices of the intersection. By intersection, I mean exactly the intersection definition used in set theory. An example of what I need to do:

a = [[0,1,2],[3,4,5],[6,7,8]]
b = [[0,1,2],[3,4,5]]

I would like to check whether b is a subset of a and then get the indices in a where b matches:

cellindices = [[True,True,True],[True,True,True],[False,False,False]]

What is the best way to do this in numpy?

Thanks in advance for your help,
Re: np.searchSorted over 2D array
Thanks Peter, I will try to explain what I really need. I have a 3D numpy array of 100*100*100 (1M elements). Then I have another numpy array of, for example, 10*2*10 (200 elements). I want to know whether the second numpy array of 200 elements with shape 10*2*10 appears anywhere in the bigger 100*100*100 dataset. If it does, then I want the indices in the bigger dataset where this happens. I hope I'm making myself clearer :)

Thanks for your comments,

On Wednesday, December 9, 2015 at 5:37:07 PM UTC+1, Peter Otten wrote:
> Heli wrote:
>
> [Please don't open a new thread for the same problem]
>
> > I need to check whether two 2d numpy arrays have intersections and if so I
> > will need to have the cell indices of the intersection.
> >
> > By intersection, I exactly mean the intersection definition used in set
> > theory.
> >
> > I will give an example of what I need to do:
> >
> > a=[[0,1,2],[3,4,5],[6,7,8]]
> > b=[[0,1,2],[3,4,5]]
> >
> > I would like to check whether b is subset of a and then get the indices in
> > a where b matches.
> >
> > cellindices=[[True,True,True],[True,True,True],[False,False,False]]
> >
> > What is the best way to do this in numpy?
>
> Providing an example is an improvement over your previous post, but to me
> it's still not clear what you want.
>
> >>> functools.reduce(lambda x, y: x | y, (a == i for i in b.flatten()))
> array([[ True,  True,  True],
>        [ True,  True,  True],
>        [False, False, False]], dtype=bool)
>
> produces the desired result for the example input, but how do you want to
> handle repeating numbers as in
>
> >>> a = numpy.array([[0,1,2],[3,4,5],[3, 2, 1]])
> >>> functools.reduce(lambda x, y: x | y, (a == i for i in b.flatten()))
> array([[ True,  True,  True],
>        [ True,  True,  True],
>        [ True,  True,  True]], dtype=bool)
>
> ?
>
> Try to be clear about your requirement, describe it in english and provide a
> bit of context. You might even write a solution that doesn't use numpy and
> ask for help in translating it.
>
> At the very least we need more/better examples.
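For the 3D pattern search described above, one numpy-only sketch (assumes numpy >= 1.20 for sliding_window_view; the arrays are placeholders) compares every window of the big array against the small one:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

big = np.random.randint(0, 5, size=(100, 100, 100))
small = big[40:50, 30:32, 60:70].copy()  # a 10*2*10 block known to occur

# View of all 10*2*10 windows; the view itself makes no copy, but the
# comparison below materialises a large boolean array, so watch memory.
windows = sliding_window_view(big, small.shape)
hits = (windows == small).all(axis=(-3, -2, -1))
print(np.argwhere(hits))  # start index (i, j, k) of every exact match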
looping and searching in numpy array
Dear all,

I need to loop over a numpy array and then do the following search. The following takes almost 60 s for arrays (npArray1 and npArray2 in the example below) with around 300K values:

for id in np.nditer(npArray1):
    newId = (np.where(npArray2 == id))[0][0]

Is there any way I can make the above faster? I need to run the script above on much bigger arrays (50M). Please note that my two numpy arrays in the lines above, npArray1 and npArray2, are not necessarily the same size, but they are both 1D.

Thanks a lot for your help,
Re: looping and searching in numpy array
On Thursday, March 10, 2016 at 2:02:57 PM UTC+1, Peter Otten wrote:
> Heli wrote:
>
> > Dear all,
> >
> > I need to loop over a numpy array and then do the following search. The
> > following is taking almost 60(s) for an array (npArray1 and npArray2 in
> > the example below) with around 300K values.
> >
> >
> > for id in np.nditer(npArray1):
> >     newId=(np.where(npArray2==id))[0][0]
> >
> >
> > Is there anyway I can make the above faster? I need to run the script
> > above on much bigger arrays (50M). Please note that my two numpy arrays in
> > the lines above, npArray1 and npArray2 are not necessarily the same size,
> > but they are both 1d.
>
> You mean you are looking for the index of the first occurence in npArray2
> for every value of npArray1?
>
> I don't know how to do this in numpy (I'm not an expert), but even basic
> Python might be acceptable:
>
> lookup = {}
> for i, v in enumerate(npArray2):
>     if v not in lookup:
>         lookup[v] = i
>
> for v in npArray1:
>     print(lookup.get(v, ""))
>
> That way you iterate once (in Python) instead of 2*len(npArray1) times (in
> C) over npArray2.
Dear Peter,
Thanks for your reply. This really helped: it reduced the script time from
61 s to 2 s.
I am still very interested in knowing the correct numpy way to do this, but
until then your fix works great.
Thanks a lot,
Re: looping and searching in numpy array
On Thursday, March 10, 2016 at 5:49:07 PM UTC+1, Heli wrote:
> [full quote of the previous exchange snipped]
And yes, I am looking for the index of the first occurrence in npArray2
for every value of npArray1.
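For reference, one numpy-only way to get those first-occurrence indices (a sketch, assuming every value of npArray1 actually occurs in npArray2) is a stable argsort plus searchsorted:

import numpy as np

npArray1 = np.array([5, 3, 5, 9])            # placeholder data
npArray2 = np.array([9, 3, 5, 3, 5, 7])

order = np.argsort(npArray2, kind="stable")  # stable: ties keep file order
sorted2 = npArray2[order]
# leftmost insertion point == position of the first occurrence
pos = np.searchsorted(sorted2, npArray1)
newIds = order[pos]
print(newIds)  # -> [2 1 2 0]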
numpy arrays
Hi,

I have a 2D numpy array like this:

[[1,2,3,4],
 [1,2,3,4],
 [1,2,3,4],
 [1,2,3,4]]

Is there any fast way to convert this array to

[[1,1,1,1],
 [2,2,2,2],
 [3,3,3,3],
 [4,4,4,4]]

In general, I need to gather every nth element of the interior arrays into single arrays. I know I can loop over this, but I have really big arrays and I need the fastest way to do it.

Thanks for your help,
Re: numpy arrays
On Wednesday, March 23, 2016 at 11:07:27 AM UTC+1, Heli wrote:
> Hi,
>
> I have a 2D numpy array like this:
>
> [[1,2,3,4],
>  [1,2,3,4],
>  [1,2,3,4],
>  [1,2,3,4]]
>
> Is there any fast way to convert this array to
>
> [[1,1,1,1],
>  [2,2,2,2],
>  [3,3,3,3],
>  [4,4,4,4]]
>
> In general I would need to retrieve every nth element of the interior arrays
> in to single arrays. I know I can loop over and do this, but I have really
> big arrays and I need the fastest way to do this.
>
> Thanks for your help,

Thanks a lot everybody. Transposing is exactly what I want; I just was not sure it would work on the internal arrays of a 2D array. Problem solved, thanks everyone.
Optimizing if statement check over a numpy value
Dear all,
I have the following piece of code. I am reading a numpy dataset from an HDF5
file and changing values to a new value if they equal 1.
There is a 90 percent chance that (if id not in myList:) is true and a 10
percent chance that it is false.
with h5py.File(inputFile, 'r') as f1:
    with h5py.File(inputFile2, 'w') as f2:
        ds = f1["MyDataset"].value
        myList = [list of indices that must not be given the new_value]
        new_value = 1e-20
        for index, val in np.ndenumerate(ds):
            if val == 1.0:
                id = index[0] + 1
                if id not in myList:
                    ds[index] = new_value
        dset1 = f2.create_dataset("Cell Ids", data=cellID_ds)
        dset2 = f2.create_dataset("Porosity", data=poros_ds)
My numpy array has 16M elements, and the script takes 9 hrs to run. If I comment out the if
statement (if id not in myList:), it only takes 5 minutes to run.
Is there any way I can optimize this if statement?
Thank you very much in Advance for your help.
Best Regards,
Re: Optimizing if statement check over a numpy value
On Thursday, July 23, 2015 at 1:43:00 PM UTC+2, Jeremy Sanders wrote:
> Heli Nix wrote:
>
> > Is there any way that I can optimize this if statement.
>
> Array processing is much faster in numpy. Maybe this is close to what you
> want:
>
> import numpy as N
> # input data
> vals = N.array([42, 1, 5, 3.14, 53, 1, 12, 11, 1])
> # list of items to exclude
> exclude = [1]
> # convert to a boolean array
> exclbool = N.zeros(vals.shape, dtype=bool)
> exclbool[exclude] = True
> # do replacement
> ones = vals == 1.0
> # Note: ~ is numpy.logical_not
> vals[ones & (~exclbool)] = 1e-20
>
> I think you'll have to convert your HDF array into a numpy array first,
> using numpy.array().
>
> Jeremy

Dear all,

I tried the sorted Python list, but that did not really help the runtime. I haven't had time to check the sorted collections. I solved my runtime problem by using Jeremy's script above. It was a life saver, and it is amazing how powerful numpy is. Thanks a lot Jeremy for this.

By the way, I did not have to do any array conversion: the array read from an HDF5 file using h5py is already a numpy array. The runtime over an array of around 16M elements dropped from around 12 hours (previous script) to 3 seconds using numpy on the same machine.

Thanks a lot for your help,
Porting Python Application to a new linux machine
Dear all,

I have Python scripts that use several Python libraries such as h5py, PySide and numpy. On Windows, I have an installer that installs Python locally on the user's machine, so my program gets access to this local Python and runs successfully. How can I do this on Linux? (I want to install Python plus my program on the user's machine.) I do not want to use the user's existing Python or to install Python on the user's machine as root.

Thanks in advance for your help,
PyInstaller+ Python3.5 (h5py import error)
Dear all,

Thanks a lot for your replies, very helpful. I have already done some trials with virtualenv, but PyInstaller is much closer to the idea of an installer you can pass to someone. I have been using the development version of PyInstaller in order to be able to use it with my script, written with Python version 3.3.5. I started with a very simple script just to test. I use the following command to create the distribution folder:

pyinstaller test.py

My script contains the following few lines and runs fine on my own machine:

import numpy as np
import h5py

a = np.arange(10)
print(a)

inputFiles = "test.h5"
with h5py.File(inputFiles, 'w') as inputFileOpen:
    pass

I am getting the following error related to importing h5py:

test returned -1
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/usr/lib/python3.3/site-packages/PyInstaller-3.0.dev2-py3.3.egg/PyInstaller/loader/pyimod03_importers.py", line 311, in load_module
  File "/usr/lib64/python3.3/site-packages/h5py/__init__.py", line 23, in <module>
  File "/usr/lib/python3.3/site-packages/PyInstaller-3.0.dev2-py3.3.egg/PyInstaller/loader/pyimod03_importers.py", line 493, in load_module
  File "h5r.pxd", line 21, in init h5py._conv (/tmp/pip_build_root/h5py/h5py/_conv.c:6563)
  File "/usr/lib/python3.3/site-packages/PyInstaller-3.0.dev2-py3.3.egg/PyInstaller/loader/pyimod03_importers.py", line 493, in load_module
  File "_objects.pxd", line 12, in init h5py.h5r (/tmp/pip_build_root/h5py/h5py/h5r.c:2708)
  File "/usr/lib/python3.3/site-packages/PyInstaller-3.0.dev2-py3.3.egg/PyInstaller/loader/pyimod03_importers.py", line 493, in load_module
  File "_objects.pyx", line 1, in init h5py._objects (/tmp/pip_build_root/h5py/h5py/_objects.c:6407)
ImportError: No module named 'h5py.defs'

If I modify my script to

import numpy as np
import h5py

a = np.arange(10)
print(a)

then the created executable runs successfully on other Linux machines. Does anybody have any idea why I am getting this h5py import error? My spec file looks like this:

# -*- mode: python -*-

block_cipher = None

a = Analysis(['test.py'],
             pathex=['/home/albert/test'],
             binaries=None,
             datas=None,
             hiddenimports=[],
             hookspath=None,
             runtime_hooks=None,
             excludes=None,
             win_no_prefer_redirects=None,
             win_private_assemblies=None,
             cipher=block_cipher)
pyz = PYZ(a.pure, a.zipped_data,
          cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          exclude_binaries=True,
          name='test',
          debug=False,
          strip=None,
          upx=True,
          console=True)
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=None,
               upx=True,
               name='test')

Thank you very much in advance for your help,
Re: PyInstaller+ Python3.5 (h5py import error)
Thanks Christian,

It turned out that h5py.defs was not the only hidden import I needed to add. I managed to get it working with the following command, adding 4 hidden imports:

pyinstaller --hidden-import=h5py.defs --hidden-import=h5py.utils --hidden-import=h5py.h5ac --hidden-import=h5py._proxy test.py

Is there any way to add all h5py submodules in one go?

Thanks,
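One way to pull in every h5py submodule at once (a sketch, assuming a reasonably recent PyInstaller) is a tiny local hook file using collect_submodules:

# hook-h5py.py -- place in a directory passed via --additional-hooks-dir
from PyInstaller.utils.hooks import collect_submodules

# Register every h5py submodule as a hidden import.
hiddenimports = collect_submodules('h5py')

Invoked as, for example: pyinstaller --additional-hooks-dir=. test.py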
