How to create a DataFrame with simulated data in Python

san*_*jha 7 python random dataframe python-3.x pandas

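I have a sample schema with 12 columns, each with its own set of categories. I now need to simulate this data as a DataFrame of about 1000 rows. How can I do that?

I am using the code below to generate the data for each column: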

      Location = ['USA','India','Prague','Berlin','Dubai','Indonesia','Vienna']
      Location = random.choice(Location)

      Age = ['Under 18','Between 18 and 64','65 and older']
      Age = random.choice(Age)

      Gender = ['Female','Male','Other']
      Gender = random.choice(Gender)

and so on.

I need output like this:

       Location        Age          Gender
       Dubai           Under 18     Female
       India           65 and older Male

...

Arn*_*aud 7

You can create each column one at a time with np.random.choice:

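import numpy as np
import pandas as pd

# Location, Age and Gender must be the full category lists from the question,
# not the single values returned by random.choice.
N = 1000  # number of rows to simulate
df = pd.DataFrame()
df["Location"] = np.random.choice(Location, size=N)
df["Age"] = np.random.choice(Age, size=N)
df["Gender"] = np.random.choice(Gender, size=N)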

Or do it with a list comprehension:

column_to_choice = {"Location": Location, "Age": Age, "Gender": Gender}

# One array of N draws per column; transpose so that each array becomes a column.
df = pd.DataFrame(
    [np.random.choice(column_to_choice[c], N) for c in column_to_choice]
).T

df.columns = list(column_to_choice.keys())

Result:

>>> print(df.head())
    Location                Age  Gender
0      India       65 and older  Female
1     Berlin  Between 18 and 64  Female
2        USA  Between 18 and 64    Male
3  Indonesia           Under 18    Male
4      Dubai           Under 18   Other
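
If the simulated data needs to be reproducible, the same idea can be written against NumPy's newer Generator API with a fixed seed. The following is only a sketch: the helper name simulate_frame, the CATEGORIES dict, and the seed value 42 are illustrative assumptions and not part of the question or the answer above. It builds the frame from a dict of columns, so no transpose is needed.

```python
import numpy as np
import pandas as pd

# Category lists copied from the question.
CATEGORIES = {
    "Location": ["USA", "India", "Prague", "Berlin", "Dubai", "Indonesia", "Vienna"],
    "Age": ["Under 18", "Between 18 and 64", "65 and older"],
    "Gender": ["Female", "Male", "Other"],
}


def simulate_frame(categories, n_rows, seed=None):
    """Hypothetical helper: draw n_rows values per column, uniformly and with replacement."""
    rng = np.random.default_rng(seed)  # seeded generator makes the draws repeatable
    return pd.DataFrame(
        {name: rng.choice(values, size=n_rows) for name, values in categories.items()}
    )


# 42 is an arbitrary seed chosen for illustration.
df = simulate_frame(CATEGORIES, n_rows=1000, seed=42)
print(df.head())
```

With the remaining columns of the 12-column schema added to the dict, the same call covers the whole table. If some categories should be more frequent than others, rng.choice also accepts a p argument with one probability per category.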