1. (5000L, 1L) vs. (5000L,)
When an operation leaves you with only a single column or row (for example, selecting one column of a matrix), NumPy often returns a directionless 1-D array of shape (n,) instead of a 1xn or nx1 matrix. If you expect an nx1 matrix, you must reshape the result manually:
A.reshape(n,1)
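A minimal sketch of the shape difference, assuming a small made-up matrix X (the names X, col, col_2d and col_keep are illustrative only):

```python
import numpy as np

X = np.arange(12).reshape(4, 3)   # a 4x3 matrix
col = X[:, 0]                     # selecting one column drops the second axis
print(col.shape)                  # (4,)

col_2d = col.reshape(-1, 1)       # -1 lets NumPy infer n from the data
print(col_2d.shape)               # (4, 1)

col_keep = X[:, [0]]              # indexing with a list keeps the column axis
print(col_keep.shape)             # (4, 1)
```

Passing -1 to reshape avoids having to know n in advance, and indexing with a list is one way to avoid the reshape altogether.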
2. 1/m vs 1./m
Be careful: in Python 2, typing 1/m with an integer m performs integer division and returns an integer. Play it safe and always add a dot to the dividend:
1./m
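A small sketch of the difference, with m = 4 as an assumed sample size. Python 3 changed / to true division, so the Python 2 result is reproduced here with the // operator:

```python
m = 4

# Floor division: what Python 2 computes for 1/m when m is an integer.
floored = 1 // m
print(floored)      # 0

# A float dividend forces true division on every Python version.
safe = 1. / m
print(safe)         # 0.25

# Converting the divisor to float works as well.
also_safe = 1 / float(m)
print(also_safe)    # 0.25
```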
3. (y==k)*1
Boolean operations in MATLAB return 0 or 1, which is convenient for further calculation. In Python, a comparison returns True or False instead. You can cast the result to 0 or 1 by multiplying it by one:
(y==k)*1
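As a sketch, assuming a made-up label vector y and class k, the multiplication and the more explicit astype call give the same result:

```python
import numpy as np

y = np.array([1, 2, 3, 2, 1])   # illustrative labels
k = 2

mask = (y == k)                 # boolean array
print(mask.dtype)               # bool

as_int = mask * 1               # multiplying by one yields integers
as_int2 = mask.astype(int)      # explicit conversion, same values
print(as_int.tolist())          # [0, 1, 0, 1, 0]

# Booleans already count as 0/1 in arithmetic, so sums work directly.
print(int(mask.sum()))          # 2
```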
4. lambda = 3 vs lambda = 3.
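Again, always assign the real number 3. to a variable if it will be plugged into a formula whose result should be real. Otherwise, Python 2 will truncate an all-integer division to an integer. Also note that lambda is a reserved word in Python, so the variable needs another name, such as lam.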
5. Fancy indexing in Pandas DataFrame returns a copy
If you extract rows from a DataFrame by a condition using the chained indexing style, pandas may return a copy of the data. Do not assign anything to that result, because the assignment may never reach the original DataFrame. If you do need to assign, use the .loc syntax:
df['a'][df['b']>0.5]=1 # chained indexing: may write to a copy and leave df unchanged
df.loc[df.b>0.5, 'a']=1 # correct: a single .loc call assigns into df
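A minimal sketch with a made-up three-row DataFrame (the column values are illustrative), showing the .loc assignment and, for the opposite case, an explicit copy when the original should stay untouched:

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 0, 0], "b": [0.2, 0.7, 0.9]})

# One .loc call selects rows and column together, so the write hits df itself.
df.loc[df["b"] > 0.5, "a"] = 1
print(df["a"].tolist())        # [0, 1, 1]

# To work on a subset without touching df, take an explicit copy.
subset = df[df["b"] > 0.5].copy()
subset["a"] = 99
print(df["a"].tolist())        # still [0, 1, 1]
```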
6. Strange behavior of Pandas mode function
See the API documentation for details: the pandas mode function returns every value tied for most frequent, not a single value. If you simply want the single most common value in a sample, consider the mode function in the scipy.stats library:
from scipy.stats import mode
mode(y)[0][0] # older SciPy; recent releases return a scalar for 1-D input, so mode(y)[0] suffices
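The pandas behavior can be sketched with a made-up sample containing a tie. Taking the first entry of the result is one possible way to get a single value if you prefer to stay within pandas:

```python
import pandas as pd

y = pd.Series([1, 1, 2, 2, 3])   # 1 and 2 are tied for most frequent

modes = y.mode()                 # a Series holding every tied value, sorted
print(modes.tolist())            # [1, 2]

single = modes.iloc[0]           # pick the smallest tied value
print(int(single))               # 1
```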