Thursday, June 30, 2016

Advice for Matlab Users on Python

1. (5000L, 1L) vs. (5000L,)

When a matrix operation produces only a single row or column, NumPy returns a directionless 1D array instead of a 1xn or nx1 matrix. If you expect an nx1 matrix, you must reshape the result manually.

A.reshape(n,1)
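As a small sketch of the point above (the array values are made up for illustration), the snippet below shows the 1D shape that comes back from a row sum and two ways to get an nx1 column: an explicit reshape, or asking NumPy to keep the dimension.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)           # illustrative 3x2 matrix

row_sum = A.sum(axis=1)                  # directionless 1D array, shape (3,)
as_column = row_sum.reshape(-1, 1)       # explicit nx1 column, shape (3, 1)
kept = A.sum(axis=1, keepdims=True)      # stays 2D from the start, shape (3, 1)

print(row_sum.shape, as_column.shape, kept.shape)
```

Passing -1 to reshape lets NumPy work out n, so you do not need to track it yourself.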

2. 1/m vs 1./m

Be careful: in Python 2, typing 1/m returns an integer when m is an integer. Play it safe and always add a dot to the dividend. (In Python 3, / is always true division.)

1./m
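A quick sketch of the difference, assuming m is a plain Python integer (the value 4 is just an example). The float dividend behaves the same on Python 2 and 3, while // asks for integer division explicitly.

```python
m = 4                      # example integer, e.g. a sample count

safe = 1. / m              # float dividend: true division on any Python version
also_safe = float(1) / m   # same idea, spelled out
floor_div = 1 // m         # explicit integer (floor) division

print(safe, also_safe, floor_div)
```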

3. (y==k)*1

Boolean operations in MATLAB return 0 or 1, which is convenient for further calculation. In Python, comparisons return True or False; you can cast them to 0 or 1 by multiplying by one.

(y==k)*1
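Here is a minimal sketch with made-up labels y and class k. It shows the multiply-by-one trick next to astype(int), which does the same cast more explicitly.

```python
import numpy as np

y = np.array([1, 3, 2, 3])       # illustrative class labels
k = 3                            # class to compare against

mask = (y == k)                  # boolean array
as_int = mask * 1                # multiply-by-one trick
as_int2 = mask.astype(int)       # explicit cast, same values

print(mask, as_int, as_int2)
```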

4. lambda = 3 vs lambda = 3.

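Again, always assign the real number 3. to a variable if it will be plugged into a formula whose result should be real. Otherwise, integer division in Python 2 will cut the result down to an integer for you. Note also that lambda is a reserved word in Python, so the variable needs another name, such as lam.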
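A short sketch of this tip. The name lam and the formula lam / (2 * m) are only illustrative assumptions (a typical regularization term), not something from the original exercise.

```python
import keyword

# lambda is reserved in Python, so it cannot be a variable name
print(keyword.iskeyword("lambda"))

lam = 3.               # hypothetical name; float literal keeps the math real-valued
m = 10                 # example sample count
reg = lam / (2 * m)    # float result on both Python 2 and Python 3

print(reg)
```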

5. Fancy indexing in Pandas DataFrame returns a copy

If you extract rows of a DataFrame by condition using chained indexing, pandas may return a copy of the data. Do not assign anything into it; if you need to assign, use the .loc syntax.

df['a'][df['b']>0.5]=1 # unreliable: may assign to a copy
df.loc[df.b>0.5, 'a']=1 # correct
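A tiny runnable sketch of the .loc form, using a made-up three-row DataFrame with the same column names a and b.

```python
import pandas as pd

# Illustrative data: column b drives the condition, column a gets updated
df = pd.DataFrame({"a": [0, 0, 0], "b": [0.2, 0.7, 0.9]})

# Row condition and target column in a single .loc call
df.loc[df["b"] > 0.5, "a"] = 1

print(df)
```

Because the row selection and the column are given in one indexing operation, pandas writes into the original DataFrame rather than into a temporary copy.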

6. Strange behavior of Pandas mode function

Pandas' mode returns every value tied for most frequent rather than a single value; see the API for details. If you simply want to pick the single most common value in a sample, consider using the mode function in the scipy.stats library instead.

from scipy.stats import mode
mode(y)[0][0] # on recent SciPy versions, mode(y)[0] is already a scalar for 1D input
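To see the behavior in question, here is a sketch with made-up votes where two values are tied. Note that the single-value pick below uses np.unique rather than scipy.stats.mode, simply to keep the example independent of the SciPy version; it is an alternative, not the method above.

```python
import numpy as np
import pandas as pd

y = np.array([2, 1, 2, 1, 3])            # illustrative votes; 1 and 2 are tied

all_modes = pd.Series(y).mode()          # pandas returns every tied value

# Pick one winner: count each distinct value, take the most frequent.
# On ties, argmax takes the first, i.e. the smallest value.
values, counts = np.unique(y, return_counts=True)
single = values[np.argmax(counts)]

print(all_modes.tolist(), single)
```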

Thursday, June 16, 2016

MATLAB vs Python Syntax (II)

MATLAB                              Python (NumPy)

Matrix Summary
sum(A)  % vertical, 1xn             A.sum(0)  # 1D array (n)
sum(A,2)  % horizontal, mx1         A.sum(1)  # 1D array (m)
sum(sum(A))  % total                A.sum()
max(A)  % 1xn                       A.max(0)  # (n)
max(A, [], 2)  % mx1                A.max(1)  # (m)
max(max(A))                         A.max()

Shuffle Data
X(randperm(m), :)                   np.random.permutation(X)

Plot Graph
-                                   import matplotlib.pyplot as plt
figure(1)                           plt.figure(1)
r=randn(5000,1)                     r=np.random.randn(5000)
hist(r, 100)                        plt.hist(r, 100)
figure(2)                           plt.figure(2)
t=[0: 0.01: 0.98]                   t=np.arange(0, 0.99, 0.01)
y1=sin(2*pi*t)                      y1=np.sin(2*np.pi*t)
y2=cos(2*pi*t)                      y2=np.cos(2*np.pi*t)
plot(t, y1)                         plt.plot(t, y1)
hold on                             # not needed (plt.clf() clears the figure)
plot(t, y2, 'r')                    plt.plot(t, y2, 'r')
xlabel('time')                      plt.xlabel('time')
ylabel('value')                     plt.ylabel('value')
legend('sin', 'cos')                plt.legend(('sin', 'cos'))
title('my plot')                    plt.title('my plot')
close all                           plt.close('all')

Flow Control
v=zeros(10, 1)                      v=np.zeros(10)
for i=1:2:10,  % step 2             for i in range(0, 10, 2):
    v(i) = 2^i;                         v[i] = 2 ** i
end;
i=1;                                i=0
while i<=5,                         while i<5:
    v(i) = 100;                         v[i] = 100
    i = i+1;                            i = i+1
end;
v(1)=2;                             v[0]=2
if v(1)==1,                         if v[0]==1:
    disp('The value is one');           print('The value is one')
elseif v(1)==2,                     elif v[0]==2:
    disp('The value is two');           print('The value is two')
else,                               else:
    disp('The value is others');        print('The value is others')
end;
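One thing the flow-control rows gloss over is the index shift: MATLAB counts from 1, Python from 0. If you want the Python loop to fill exactly the same slots with exactly the same values as the MATLAB loop for i=1:2:10, a sketch (under that assumption) could look like this:

```python
import numpy as np

v = np.zeros(10)

# MATLAB: for i=1:2:10, v(i) = 2^i; end
# Keep MATLAB's i (1, 3, 5, 7, 9) and shift only the index to 0-based.
for i in range(1, 11, 2):
    v[i - 1] = 2 ** i

print(v)
```

Keeping the MATLAB loop variable and subtracting one only where it is used as an index is usually the least error-prone way to port such loops.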

Friday, June 10, 2016

MATLAB vs Python Syntax (I)

MATLAB                              Python (NumPy)

I/O
-                                   import numpy as np
-                                   import pandas as pd
load 'hello.txt' X                  X=pd.read_csv('hello.txt', header=None)
save 'hello.txt' X -ascii           np.savetxt('hello.txt', X)
-                                   import scipy.io as sio
-                                   test=sio.loadmat('test.mat')

Data Creation
A=[1 2; 3 4; 5 6]                   A=np.array([[1,2], [3,4], [5,6]])
v=[1 2 3]                           v=np.array([1, 2, 3])
v=[1; 2; 3]                         v=np.array([[1], [2], [3]])
v=1: 0.1: 2                         v=np.arange(1, 2.1, 0.1)
c=2*ones(2,3)                       c=2.*np.ones((2, 3))
w=zeros(1,3)                        w=np.zeros(3)
r=rand(1,3)                         r=np.random.rand(3)
I=eye(4)                            I=np.eye(4)
size(A)  % 3 2                      A.shape  # (3, 2)
length(v)  % 3                      v.size
m=size(A, 1)                        m=A.shape[0]

Data Extraction
A(3, 2)  % 6                        A[2, 1]
A(2,:)                              A[1]
A([1 3], :)                         A[[0, 2]]
A(2:end, 1)                         A[1:, 0]
R=rand(4,5)                         R=np.random.rand(4,5)
R(R(:,3)>0.5, [2,4])                R[R[:,2]>0.5][:,[1,3]]
pos=find(p>0.5)                     pos=np.where(p>0.5)
X1=X(pos, :)                        X1=X[pos]
X1=X(p>0.5, :)                      X1=X[p>0.5, :]

Concatenate Data
A=[A, [101; 102; 103]]              A=np.hstack([A, np.array([[101], [102], [103]])])
-                                   A=np.c_[A, np.array([[101], [102], [103]])]
X=[ones(m,1), X]                    X=np.c_[np.ones(m), X]

Basic Operation
a == b                              a == b
a ~= b                              a != b
a && b                              a and b
a || b                              a or b
xor(a,b)                            a ^ b
2^3                                 2**3

Matrix Operation
A * B  % matrix product             A.dot(B)
A'  % transpose                     A.T
A' + B                              A.T + B
A .* B  % element-wise              A * B
A .^ 2                              A ** 2
1 ./ A                              1. / A
log(A)                              np.log(A)
exp(A)                              np.exp(A)
A * v  % result mx1                 A.dot(v)  # result 1D array
pinv(A)  % pseudoinverse            np.linalg.pinv(A)
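To tie a few of these rows together, here is a runnable sketch with made-up data: selecting rows by a condition on one column and picking two columns, prepending a column of ones, and checking the 1D shape that A.dot(v) returns. The names theta and pred are just illustrative, and np.ix_ is shown as a one-step alternative to the two-step indexing in the table.

```python
import numpy as np

rng = np.random.RandomState(0)           # fixed seed so the run is repeatable
R = rng.rand(4, 5)

rows = R[:, 2] > 0.5                     # mask over rows, from the 3rd column
sub = R[rows][:, [1, 3]]                 # MATLAB: R(R(:,3)>0.5, [2,4])
sub2 = R[np.ix_(rows, [1, 3])]           # same selection in one indexing step

X = np.array([[1., 2.], [3., 4.], [5., 6.]])
Xb = np.c_[np.ones(X.shape[0]), X]       # MATLAB: [ones(m,1), X]

theta = np.array([1., 0., 1.])           # hypothetical weight vector
pred = Xb.dot(theta)                     # 1D array of shape (3,), not 3x1

print(sub.shape, Xb.shape, pred.shape)
```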

Principal Component Analysis

Eigenvector Decomposition. Let $A \in \mathbb{R}^{n \times n}$ be an n by n...