
Tag: NumPy

NumPy’s ndarray indexing

NumPy provides a new kind of array: the n-dimensional array, or ndarray. It’s usually fixed-size, and all of its items have the same type and size. For example, to define a 2×3 matrix:

import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]], np.int32)
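
To make "fixed-size" and "same type and size" a bit more concrete, here is a small sketch that inspects the matrix defined above. The assignment at the end is my own illustration, not something from the NumPy docs quoted in this post.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], np.int32)

shape = a.shape        # (2, 3): two rows, three columns
dtype = a.dtype        # int32: every item has the same type
itemsize = a.itemsize  # 4: bytes used by each item
print(shape, dtype, itemsize)

# Storing a float in an int32 array truncates the value;
# the array's dtype does not change.
a[0, 0] = 7.9
print(a[0, 0])  # 7
```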

Besides single-element indexing, an ndarray also supports “array indexing”. The NumPy documentation (http://docs.scipy.org/doc/numpy/user/basics.indexing.html) explains:

It is possible to index arrays with other arrays for the purposes of selecting lists of values out of arrays into new arrays. There are two different ways of accomplishing this. One uses one or more arrays of index values. The other involves giving a boolean array of the proper shape to indicate the values to be selected. Index arrays are a very powerful tool that allow one to avoid looping over individual elements in arrays and thus greatly improve performance.

So you can basically do the following:

a = np.array([1, 2, 3], np.int32)
a[np.array([0, 2])] # Fetch the first and third elements, returns array([1, 3], dtype=int32)
a[np.array([True, False, True])] # Same as the line above
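
The same two styles also work on the 2×3 matrix from the beginning of this post. The following is only a sketch; the names m, rows, picked, mask and masked are mine.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]], np.int32)

# One index array selects whole rows: row 1 first, then row 0.
rows = m[np.array([1, 0])]

# Two index arrays are paired up element-wise: m[0, 2] and m[1, 0].
picked = m[np.array([0, 1]), np.array([2, 0])]

# A boolean array with the same shape as m keeps the True positions;
# the result is always one-dimensional.
mask = np.array([[True, False, True], [False, True, False]])
masked = m[mask]

print(rows.tolist())    # [[4, 5, 6], [1, 2, 3]]
print(picked.tolist())  # [3, 4]
print(masked.tolist())  # [1, 3, 5]
```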

Besides, when you apply a comparison such as == or != to an ndarray, NumPy compares each element and returns another ndarray of booleans:

a = np.array([1, 2, 3], np.int32)
a == 2 # Returns array([False,  True, False], dtype=bool)
a != 2 # Returns array([ True, False,  True], dtype=bool)
a[a != 2] # Returns a new array without the elements equal to 2, in this case array([1, 3], dtype=int32)
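
Comparisons can also be combined, and a boolean mask can be used for assignment as well as for selection. A short sketch, using a slightly longer example array of my own:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], np.int32)

# Combine element-wise conditions with & (and), | (or), ~ (not).
# The parentheses are needed because & binds tighter than comparisons.
middle = a[(a > 1) & (a < 5)]
ends = a[(a == 1) | (a == 5)]

# A boolean mask on the left-hand side assigns to the selected items.
b = a.copy()
b[b % 2 == 0] = 0

print(middle.tolist())  # [2, 3, 4]
print(ends.tolist())    # [1, 5]
print(b.tolist())       # [1, 0, 3, 0, 5]
```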

Statistics of insurance sold on Taobao.com on Valentine’s Day

On Feb. 14th, Taobao launched a campaign to sell insurance products that promised a 7% annual interest rate. The sales data is public, so I wrote a script to crawl it and did a brief study of the data. Here are the results.

On that day the products sold out in less than two hours in total. More than 40,000 people participated, resulting in total sales of almost one billion CNY (the exact number: 980,270,000 CNY). Two companies took part in this sales campaign: Zhujiang and Tian’an. The sales statistics are:

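| | Zhujiang | Tian’an | Total |
| --- | --- | --- | --- |
| # of Customers | 13831 | 29092 | 42923 |
| Sales mean (k CNY) | 24.922059 | 21.847003 | 22.837872 |
| Sales min (k CNY) | 1 | 1 | 1 |
| Sales 25% (k CNY) | 1 | 2 | 2 |
| Sales 50% (k CNY) | 10 | 10 | 10 |
| Sales 75% (k CNY) | 20 | 25 | 22 |
| Sales max (k CNY) | 1000 | 900 | 1000 |
| Sales total (k CNY) | 344697 | 635573 | 980270 |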
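
The rows in the table above (count, mean, min, quartiles, max, total) can all be computed with plain NumPy. The sketch below is not the original script. It assumes the purchase amounts are already in a one-dimensional array, and the values in sales are made up for illustration, not the real Taobao data.

```python
import numpy as np

# Hypothetical purchase amounts in k CNY; NOT the real Taobao data.
sales = np.array([1, 1, 2, 10, 10, 20, 25, 100], dtype=np.float64)

summary = {
    "count": int(sales.size),
    "mean": float(sales.mean()),
    "min": float(sales.min()),
    "25%": float(np.percentile(sales, 25)),
    "50%": float(np.percentile(sales, 50)),
    "75%": float(np.percentile(sales, 75)),
    "max": float(sales.max()),
    "total": float(sales.sum()),
}

for name, value in summary.items():
    print(f"{name:>5}: {value}")
```

np.percentile interpolates linearly between neighbouring values by default, so a quartile does not have to be one of the observed amounts.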

The histograms show how many people paid each amount.

[Histogram images: le100k, gt100k, le100k]

Zhujiang was extremely popular: in 2 minutes and 56 seconds it reached sales of 200,212,000 CNY. That’s more than 1 million CNY in sales PER SECOND! Chinese people really are crazy about online shopping. 😀
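
As a quick check of the per-second figure, here is the arithmetic using only the two numbers quoted above:

```python
elapsed_seconds = 2 * 60 + 56  # 2 minutes 56 seconds = 176 seconds
sales_cny = 200_212_000

rate = sales_cny / elapsed_seconds
print(f"{rate:,.0f} CNY per second")  # 1,137,568 CNY per second
```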
