Reading an Excel file in Python

Question

I have an Excel file:

Arm_id      DSPName        DSPCode          HubCode          PinCode    PPTL
1            JaVAS            01              AGR             282001    1,2
2            JaVAS            01              AGR             282002    3,4
3            JaVAS            01              AGR             282003    5,6

I want to save a string in the form Arm_id,DSPCode,Pincode. This format is configurable, i.e. it might change to DSPCode,Arm_id,Pincode. I keep the format in a list:

FORMAT = ['Arm_id', 'DSPName', 'Pincode']

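Given that the format is configurable, how can I read the contents of only those columns whose names are listed in FORMAT?

This is what I tried. At the moment I am able to read everything in the file: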

from xlrd import open_workbook
wb = open_workbook('sample.xls')
for s in wb.sheets():
    #print 'Sheet:',s.name
    values = []
    for row in range(s.nrows):
        col_value = []
        for col in range(s.ncols):
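            value = s.cell(row, col).value
            # xlrd returns numeric cells as floats; turn them into integer strings
            try:
                value = str(int(value))
            except ValueError:
                pass  # non-numeric text such as 'JaVAS' is kept as-is
            col_value.append(value)
        values.append(col_value)
print(values)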

My output is:

[[u'Arm_id', u'DSPName', u'DSPCode', u'HubCode', u'PinCode', u'PPTL'], ['1', u'JaVAS', '1', u'AGR', '282001', u'1,2'], ['2', u'JaVAS', '1', u'AGR', '282002', u'3,4'], ['3', u'JaVAS', '1', u'AGR', '282003', u'5,6']]

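Then I loop over values[0], looking for the FORMAT entries in it to get the indexes of Arm_id, DSPName and Pincode within values[0]. In the next loop, since I know the indexes of all the FORMAT items, I know which values I need to pick.

But this is such a poor solution.

How do I get the values of specific columns, by name, from an Excel file?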

Answer 1

小智 83

A somewhat late answer, but with pandas you can get the columns of an Excel file directly:

import pandas
import xlrd  # optional: pandas imports xlrd itself when reading .xls files
df = pandas.read_excel('sample.xls')
# print the column names
print(df.columns)
# get the values for a given column
values = df['Arm_id'].values
# get a data frame with selected columns
# (names must match the sheet header exactly, hence 'PinCode')
FORMAT = ['Arm_id', 'DSPName', 'PinCode']
df_selected = df[FORMAT]

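Importing xlrd is not required; just make sure xlrd is installed, and pandas will import and use it. (3 upvotes)
Add `import xlrd` at the top to make it work. `read_excel` requires `xlrd`. If you get `ImportError: No module named 'xlrd'`, then run `pip install xlrd`. (2 upvotes)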
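
To connect this answer back to the question's goal of a configurable, comma-separated string per row, here is a minimal sketch. It assumes pandas is installed and builds the frame in memory instead of reading `sample.xls`; the sample values and the name `lines` are illustrative only, not part of the original answer.

```python
import pandas as pd

# In-memory stand-in for the sheet in the question (no Excel file needed).
df = pd.DataFrame({
    'Arm_id': [1, 2, 3],
    'DSPName': ['JaVAS', 'JaVAS', 'JaVAS'],
    'DSPCode': ['01', '01', '01'],
    'HubCode': ['AGR', 'AGR', 'AGR'],
    'PinCode': [282001, 282002, 282003],
    'PPTL': ['1,2', '3,4', '5,6'],
})

# Configurable column order; names must match the header exactly.
FORMAT = ['Arm_id', 'DSPCode', 'PinCode']

# Pick the columns in FORMAT order, convert to strings, join each row with commas.
lines = df[FORMAT].astype(str).apply(','.join, axis=1).tolist()
print(lines)
```

Changing the order of the names in `FORMAT` changes the order of the fields in each string. When reading from a real file, `pandas.read_excel` also accepts a `usecols` argument that limits which columns are loaded, which may be worth considering for wide sheets.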

Answer 2

tam*_*gal 65

Here is one approach:

from xlrd import open_workbook

class Arm(object):
    def __init__(self, id, dsp_name, dsp_code, hub_code, pin_code, pptl):
        self.id = id
        self.dsp_name = dsp_name
        self.dsp_code = dsp_code
        self.hub_code = hub_code
        self.pin_code = pin_code
        self.pptl = pptl

    def __str__(self):
        return("Arm object:\n"
               "  Arm_id = {0}\n"
               "  DSPName = {1}\n"
               "  DSPCode = {2}\n"
               "  HubCode = {3}\n"
               "  PinCode = {4} \n"
               "  PPTL = {5}"
               .format(self.id, self.dsp_name, self.dsp_code,
                       self.hub_code, self.pin_code, self.pptl))

wb = open_workbook('sample.xls')
for sheet in wb.sheets():
    number_of_rows = sheet.nrows
    number_of_columns = sheet.ncols

    items = []  # Arm objects built from this sheet's rows

    for row in range(1, number_of_rows):
        values = []
        for col in range(number_of_columns):
            value  = (sheet.cell(row,col).value)
            try:
                value = str(int(value))
            except ValueError:
                pass
            finally:
                values.append(value)
        item = Arm(*values)
        items.append(item)

for item in items:
    print(item)
    print("Accessing one single value (eg. DSPName): {0}".format(item.dsp_name))
    print()

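You don't have to use a custom class; you can simply use a dict(). If you use a class, however, you can access all values via dot notation, as shown above.

Here is the output of the script above: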

Arm object:
  Arm_id = 1
  DSPName = JaVAS
  DSPCode = 1
  HubCode = AGR
  PinCode = 282001 
  PPTL = 1
Accessing one single value (eg. DSPName): JaVAS

Arm object:
  Arm_id = 2
  DSPName = JaVAS
  DSPCode = 1
  HubCode = AGR
  PinCode = 282002 
  PPTL = 3
Accessing one single value (eg. DSPName): JaVAS

Arm object:
  Arm_id = 3
  DSPName = JaVAS
  DSPCode = 1
  HubCode = AGR
  PinCode = 282003 
  PPTL = 5
Accessing one single value (eg. DSPName): JaVAS
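
The answer mentions that a plain dict() can replace the custom class. A minimal sketch of that variant follows, assuming the rows have already been read into a list of lists as in the question's output; `format_record` is a hypothetical helper name, and no Excel file is touched here.

```python
# Rows as produced by the question's xlrd loop: header first, then data.
values = [
    ['Arm_id', 'DSPName', 'DSPCode', 'HubCode', 'PinCode', 'PPTL'],
    ['1', 'JaVAS', '1', 'AGR', '282001', '1,2'],
    ['2', 'JaVAS', '1', 'AGR', '282002', '3,4'],
    ['3', 'JaVAS', '1', 'AGR', '282003', '5,6'],
]

header, data_rows = values[0], values[1:]

# One dict per row, keyed by column name.
records = [dict(zip(header, row)) for row in data_rows]

# Configurable output order; names must match the header exactly.
FORMAT = ['Arm_id', 'DSPCode', 'PinCode']


def format_record(record, columns):
    """Join the requested columns of one record with commas."""
    return ','.join(record[name] for name in columns)


lines = [format_record(record, FORMAT) for record in records]
print(lines)
```

With dicts, a value is looked up as `record['DSPName']` rather than `item.dsp_name`, and new columns in the sheet need no code change.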

Answer 3

Noe*_*ans 11

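So the key parts are to grab the header (col_names = s.row(0)) and, when iterating through the rows, to skip the first row, which is not needed: for row in range(1, s.nrows) uses a range starting at 1 (not the implicit 0). You then use zip to step through the row, pairing each value with its column header as "name".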

from xlrd import open_workbook

wb = open_workbook('Book2.xls')
values = []
for s in wb.sheets():
    #print 'Sheet:',s.name
    for row in range(1, s.nrows):
        col_names = s.row(0)
        col_value = []
        for name, col in zip(col_names, range(s.ncols)):
            value  = (s.cell(row,col).value)
            try:
                value = str(int(value))
            except ValueError:
                pass  # keep non-numeric text unchanged
            col_value.append((name.value, value))
        values.append(col_value)
print(values)
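
The header row can also be used to build a name-to-index lookup once, which is the idea the question describes. The sketch below is an illustration on in-memory rows, not part of the original answer. It assumes case-insensitive matching is acceptable, because the question's FORMAT list spells 'Pincode' while the sheet header is 'PinCode'; `select_columns` is a hypothetical helper name.

```python
def select_columns(rows, wanted):
    """Return the wanted columns, in the wanted order, for every data row.

    rows[0] is the header row; matching ignores case (an assumption made here).
    """
    header = rows[0]
    index_by_name = {name.lower(): i for i, name in enumerate(header)}
    missing = [name for name in wanted if name.lower() not in index_by_name]
    if missing:
        raise KeyError('columns not found in header: {0}'.format(missing))
    indexes = [index_by_name[name.lower()] for name in wanted]
    return [[row[i] for i in indexes] for row in rows[1:]]


# Rows shaped like the question's output: header first, then data.
values = [
    ['Arm_id', 'DSPName', 'DSPCode', 'HubCode', 'PinCode', 'PPTL'],
    ['1', 'JaVAS', '1', 'AGR', '282001', '1,2'],
    ['2', 'JaVAS', '1', 'AGR', '282002', '3,4'],
    ['3', 'JaVAS', '1', 'AGR', '282003', '5,6'],
]

FORMAT = ['Arm_id', 'DSPName', 'Pincode']
selected = select_columns(values, FORMAT)
print(selected)
```

Raising an error for unknown names is a design choice made here so that a typo in the configuration fails loudly instead of silently dropping a column.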

Answer 4

小智 5

Using pandas, we can read Excel files easily.

import pandas as pd
import xlrd as xl  # not used directly; pandas picks its Excel reader itself

DataF = pd.read_excel("Test.xlsx", sheet_name='Sheet1')

print("Column headings:")
print(DataF.columns)

Tested on: https://repl.it Reference: https://pythonspot.com/read-excel-with-pandas/

Why are you importing `xlrd`? (2 upvotes)

Asked:	11 years, 11 months ago
Viewed:	331616 times
Last active:	6 years, 3 months ago