163

I know that if I use randn,

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))

gives me what I am looking for, but with elements from a normal distribution. But what if I just wanted random integers?

randint takes a range, but I don't see how to get an array out of it the way randn does. So how do I do this with random integers in a given range?

Pavel
TheRealFakeNews
  • And related for when we're just adding a column: [Pandas: create new column in df with random integers](http://stackoverflow.com/questions/30327417/python-create-new-column-in-pandas-df-with-random-numbers-from-range) – smci Apr 07 '17 at 09:33

2 Answers

245

numpy.random.randint accepts a third argument, size, which specifies the shape of the output array. You can use this to create your DataFrame:

df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))

Here, np.random.randint(0,100,size=(100, 4)) creates an array of shape (100, 4) filled with random integers from the half-open interval [0, 100), that is, 0 through 99.
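
To confirm the shape and the half-open range, you can inspect the array before wrapping it in a DataFrame. This is only a minimal sketch: the seed value and the variable name arr are arbitrary choices for illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only so repeated runs give the same array

# 100 rows x 4 columns of integers drawn from [0, 100)
arr = np.random.randint(0, 100, size=(100, 4))

print(arr.shape)                       # (100, 4)
print(arr.min() >= 0, arr.max() < 100) # True True: the upper bound is exclusive

df = pd.DataFrame(arr, columns=list('ABCD'))
print(df.head())
```

If you need 100 itself to be a possible value, pass 101 as the upper bound.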


Demo:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))

which produces:

     A   B   C   D
0   45  88  44  92
1   62  34   2  86
2   85  65  11  31
3   74  43  42  56
4   90  38  34  93
5    0  94  45  10
6   58  23  23  60
..  ..  ..  ..  ..
renan-eccel
Anand S Kumar
  • 1
    Could you please make a copy-pastable sample which includes the imports / does not have the line numbers? – Martin Thoma Nov 28 '17 at 15:29
  • 2
    Adding to the excellent solution. If you want to name the columns anything but a letter each in that order, you should do df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list(['AA','BB','C2','D2'])) – mzakaria Dec 02 '19 at 06:53
  • 2
    @mzakaria `[...]` is already a list so you don't need `list([...])` – jtlz2 Apr 09 '20 at 07:23
14

The recommended way to generate random integers with NumPy these days is numpy.random.Generator.integers (see the NumPy documentation).

import numpy as np
import pandas as pd

rng = np.random.default_rng()
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list('ABCD'))
df
----------------------
      A    B    C    D
 0   58   96   82   24
 1   21    3   35   36
 2   67   79   22   78
 3   81   65   77   94
 4   73    6   70   96
... ...  ...  ...  ...
95   76   32   28   51
96   33   68   54   77
97   76   43   57   43
98   34   64   12   57
99   81   77   32   50
100 rows × 4 columns
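
Two options of the Generator API are often useful here: seeding for reproducible output, and endpoint=True for an inclusive upper bound. The following is a sketch under assumptions: the seed 42, the die-roll range 1 to 6, and the name rolls are illustrative choices, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Passing a seed makes the generator produce the same values on every run.
rng = np.random.default_rng(42)  # 42 is an arbitrary seed

# endpoint=True makes the upper bound inclusive, so values fall in [1, 6].
rolls = pd.DataFrame(
    rng.integers(1, 6, size=(100, 4), endpoint=True),
    columns=list('ABCD'),
)

print(rolls.shape)  # (100, 4)
print(rolls.head())
```

Without endpoint=True, integers uses the same half-open convention as randint, so rng.integers(1, 6) would never return 6.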
Webucator