How do I read a SharePoint Online (Office 365) Excel file in Python using a Work or School account?

Yan*_*Luo 5 python excel sharepoint office365 sharepoint-online

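I am a university student, and I registered as an Office 365 Education user with my university email address. I usually sign in at https://www.office.com with my email account, alice@abc.edu. The path to my profile looks like this: https://abcedu-my.sharepoint.com/personal/alice_abc_edu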

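I have an Excel (.xlsx) file in my Office 365, and I want to access (or download) it programmatically with Python. I have searched for solutions, but most of them require an NTLM credential. I only have my email account and password, and I don't know my NTLM credential. Is it alice@abc.edu or alice_abc_edu? Or are the email username and NTLM entirely different ways of authenticating, so that I can't use NTLM at all?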

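It seems that the email address I sign in with is officially called a Work or School Account, or an Azure Active Directory credential. But I don't know how to use such an account to do what I need. I also need to do this in Python, although a RESTful approach would be fine too. I am stuck at the very first step, authentication. Thanks!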

I followed the Microsoft Graph tutorial here, which told me to register a Python app. I then got an application ID and an application secret. But when I ran the official python-sample-send-mail:

"""send-email sample for Microsoft Graph"""
# Copyright (c) Microsoft. All rights reserved. Licensed under the MIT license.
# See LICENSE in the project root for license information.
import base64
import mimetypes
import os
import pprint
import uuid

import flask
from flask_oauthlib.client import OAuth

import config

APP = flask.Flask(__name__, template_folder='static/templates')
APP.debug = True
APP.secret_key = 'development'
OAUTH = OAuth(APP)
MSGRAPH = OAUTH.remote_app(
    'microsoft',
    consumer_key=config.CLIENT_ID,
    consumer_secret=config.CLIENT_SECRET,
    request_token_params={'scope': config.SCOPES},
    base_url=config.RESOURCE + config.API_VERSION + '/',
    request_token_url=None,
    access_token_method='POST',
    access_token_url=config.AUTHORITY_URL + config.TOKEN_ENDPOINT,
    authorize_url=config.AUTHORITY_URL + config.AUTH_ENDPOINT)

@APP.route('/')
def homepage():
    """Render the home page."""
    return flask.render_template('homepage.html')

@APP.route('/login')
def login():
    """Prompt user to authenticate."""
    flask.session['state'] = str(uuid.uuid4())
    return MSGRAPH.authorize(callback=config.REDIRECT_URI, state=flask.session['state'])

@APP.route('/login/authorized')
def authorized():
    """Handler for the application's Redirect Uri."""
    if str(flask.session['state']) != str(flask.request.args['state']):
        raise Exception('state returned to redirect URL does not match!')
    response = MSGRAPH.authorized_response()
    flask.session['access_token'] = response['access_token']
    return flask.redirect('/mailform')

@APP.route('/mailform')
def mailform():
    """Sample form for sending email via Microsoft Graph."""

    # read user profile data
    user_profile = MSGRAPH.get('me', headers=request_headers()).data
    user_name = user_profile['displayName']

    # get profile photo
    photo_data, _, profile_pic = profile_photo(client=MSGRAPH, save_as='me')
    # save photo data as config.photo for use in mailform.html/mailsent.html
    if profile_pic:
        config.photo = base64.b64encode(photo_data).decode()
    else:
        profile_pic = 'static/images/no-profile-photo.png'
        with open(profile_pic, 'rb') as fhandle:
            config.photo = base64.b64encode(fhandle.read()).decode()

    # upload profile photo to OneDrive
    upload_response = upload_file(client=MSGRAPH, filename=profile_pic)
    if str(upload_response.status).startswith('2'):
        # create a sharing link for the uploaded photo
        link_url = sharing_link(client=MSGRAPH, item_id=upload_response.data['id'])
    else:
        link_url = ''

    body = flask.render_template('email.html', name=user_name, link_url=link_url)
    return flask.render_template('mailform.html',
                                 name=user_name,
                                 email=user_profile['userPrincipalName'],
                                 profile_pic=profile_pic,
                                 photo_data=config.photo,
                                 link_url=link_url,
                                 body=body)

@APP.route('/send_mail')
def send_mail():
    """Handler for send_mail route."""
    profile_pic = flask.request.args['profile_pic']

    response = sendmail(client=MSGRAPH,
                        subject=flask.request.args['subject'],
                        recipients=flask.request.args['email'].split(';'),
                        body=flask.request.args['body'],
                        attachments=[flask.request.args['profile_pic']])

    # show results in the mailsent form
    response_json = pprint.pformat(response.data)
    response_json = None if response_json == "b''" else response_json
    return flask.render_template('mailsent.html',
                                 sender=flask.request.args['sender'],
                                 email=flask.request.args['email'],
                                 profile_pic=profile_pic,
                                 photo_data=config.photo,
                                 subject=flask.request.args['subject'],
                                 body_length=len(flask.request.args['body']),
                                 response_status=response.status,
                                 response_json=response_json)

@MSGRAPH.tokengetter
def get_token():
    """Called by flask_oauthlib.client to retrieve current access token."""
    return (flask.session.get('access_token'), '')

def request_headers(headers=None):
    """Return dictionary of default HTTP headers for Graph API calls.
    Optional argument is other headers to merge/override defaults."""
    default_headers = {'SdkVersion': 'sample-python-flask',
                       'x-client-SKU': 'sample-python-flask',
                       'client-request-id': str(uuid.uuid4()),
                       'return-client-request-id': 'true'}
    if headers:
        default_headers.update(headers)
    return default_headers

def profile_photo(*, client=None, user_id='me', save_as=None):
    """Get profile photo.

    client  = user-authenticated flask-oauthlib client instance
    user_id = Graph id value for the user, or 'me' (default) for current user
    save_as = optional filename to save the photo locally. Should not include an
              extension - the extension is determined by photo's content type.

    Returns a tuple of the photo (raw data), content type, saved filename.
    """
    endpoint = 'me/photo/$value' if user_id == 'me' else f'users/{user_id}/photo/$value'
    photo_response = client.get(endpoint)
    if str(photo_response.status).startswith('2'):
        # HTTP status code is 2XX, so photo was returned successfully
        photo = photo_response.raw_data
        metadata_response = client.get(endpoint[:-7]) # remove /$value to get metadata
        content_type = metadata_response.data.get('@odata.mediaContentType', '')
    else:
        photo = ''
        content_type = ''

    if photo and save_as:
        extension = content_type.split('/')[1]
        if extension == 'pjpeg':
            extension = 'jpeg' # to correct known issue with content type
        filename = save_as + '.' + extension
        with open(filename, 'wb') as fhandle:
            fhandle.write(photo)
    else:
        filename = ''

    return (photo, content_type, filename)

def sendmail(*, client, subject=None, recipients=None, body='',
             content_type='HTML', attachments=None):
    """Helper to send email from current user.

    client       = user-authenticated flask-oauthlib client instance
    subject      = email subject (required)
    recipients   = list of recipient email addresses (required)
    body         = body of the message
    content_type = content type (default is 'HTML')
    attachments  = list of file attachments (local filenames)

    Returns the response from the POST to the sendmail API.
    """

    # Verify that required arguments have been passed.
    if not all([client, subject, recipients]):
        raise ValueError('sendmail(): required arguments missing')

    # Create recipient list in required format.
    recipient_list = [{'EmailAddress': {'Address': address}}
                      for address in recipients]

    # Create list of attachments in required format.
    attached_files = []
    if attachments:
        for filename in attachments:
            b64_content = base64.b64encode(open(filename, 'rb').read())
            mime_type = mimetypes.guess_type(filename)[0]
            mime_type = mime_type if mime_type else ''
            attached_files.append( \
                {'@odata.type': '#microsoft.graph.fileAttachment',
                 'ContentBytes': b64_content.decode('utf-8'),
                 'ContentType': mime_type,
                 'Name': filename})

    # Create email message in required format.
    email_msg = {'Message': {'Subject': subject,
                             'Body': {'ContentType': content_type, 'Content': body},
                             'ToRecipients': recipient_list,
                             'Attachments': attached_files},
                 'SaveToSentItems': 'true'}

    # Do a POST to Graph's sendMail API and return the response.
    return client.post('me/microsoft.graph.sendMail',
                       headers=request_headers(),
                       data=email_msg,
                       format='json')

def sharing_link(*, client, item_id, link_type='view'):
    """Get a sharing link for an item in OneDrive.

    client    = user-authenticated flask-oauthlib client instance
    item_id   = the id of the DriveItem (the target of the link)
    link_type = 'view' (default), 'edit', or 'embed' (OneDrive Personal only)

    Returns the sharing link.
    """
    endpoint = f'me/drive/items/{item_id}/createLink'
    response = client.post(endpoint,
                           headers=request_headers(),
                           data={'type': link_type},
                           format='json')

    if str(response.status).startswith('2'):
        # status 201 = link created, status 200 = existing link returned
        return response.data['link']['webUrl']

def upload_file(*, client, filename, folder=None):
    """Upload a file to OneDrive for Business.

    client  = user-authenticated flask-oauthlib client instance
    filename = local filename; may include a path
    folder = destination subfolder/path in OneDrive for Business
             None (default) = root folder

    File is uploaded and the response object is returned.
    If file already exists, it is overwritten.
    If folder does not exist, it is created.

    API documentation:
    https://developer.microsoft.com/en-us/graph/docs/api-reference/v1.0/api/driveitem_put_content
    """
    fname_only = os.path.basename(filename)

    # create the Graph endpoint to be used
    if folder:
        # create endpoint for upload to a subfolder
        endpoint = f'me/drive/root:/{folder}/{fname_only}:/content'
    else:
        # create endpoint for upload to drive root folder
        endpoint = f'me/drive/root/children/{fname_only}/content'

    content_type, _ = mimetypes.guess_type(fname_only)
    with open(filename, 'rb') as fhandle:
        file_content = fhandle.read()

    return client.put(endpoint,
                      headers=request_headers({'content-type': content_type}),
                      data=file_content,
                      content_type=content_type)

if __name__ == '__main__':
    APP.run()

it gave me an error:

AADSTS65005: Using application 'My Python App' is currently not supported for your organization abc.edu because it is in an unmanaged state. An administrator needs to claim ownership of the company by DNS validation of abc.edu before the application My Python App can be provisioned. Request ID: 9a4874e0-7f8f-4eff-b6f9-9834765d8780, Timestamp: 01/25/2018 13:51:10 Trace ID: 8d1cc38e-3b5e-4bf1-a003-bda164e00b00 Correlation ID: 2033267e-98ec-4eb1-91e9-c0530ef97fb1 Timestamp: 2018-01-25 13:51:10Z&state=d94af98c-92d9-4016-b3da-afd8e8974f4b HTTP/1.1

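So it seems my university's IT administrator has not enabled connecting applications to Microsoft Graph. But is that the only way? I already have a valid email account and password. I think there must be a way to sign in to Office 365 programmatically, directly with my credentials?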

Dan*_*Dan 13

As Niels V suggested, try the Office365-REST-Python-Client.

The client implements the SharePoint REST API. Here is an example of what you are trying to do:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File 

url = 'https://yoursharepointsite.com/sites/documentsite'
username = 'yourusername'
password = 'yourpassword'
relative_url = '/sites/documentsite/Documents/filename.xlsx'
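For the OneDrive for Business location in the question, the `url` and `relative_url` values might be built as in the sketch below. It assumes that the personal site name is the sign-in address with `@` and `.` replaced by `_` (as in the question, where alice@abc.edu maps to alice_abc_edu) and that the default document library is named `Documents`. `onedrive_paths` is a hypothetical helper, not part of the library, so check the result against the address you see in the browser.

```python
def onedrive_paths(tenant_host, sign_in_address, file_path):
    """Build (site_url, server_relative_url) for a file in a personal site.

    Assumptions: the site name is the sign-in address with '@' and '.'
    replaced by '_', and the document library is called 'Documents'.
    """
    site_name = sign_in_address.replace('@', '_').replace('.', '_')
    site_path = f'/personal/{site_name}'
    site_url = f'https://{tenant_host}{site_path}'
    relative_url = f'{site_path}/Documents/{file_path.lstrip("/")}'
    return site_url, relative_url


# Values taken from the question; 'filename.xlsx' is a placeholder.
url, relative_url = onedrive_paths(
    'abcedu-my.sharepoint.com', 'alice@abc.edu', 'filename.xlsx')
print(url)
print(relative_url)
```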

This next part is taken straight from the GitHub README.md. It uses the ClientContext approach and authenticates you against your SharePoint server:

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
    ctx = ClientContext(url, ctx_auth)
    web = ctx.web
    ctx.load(web)
    ctx.execute_query()
    print("Web title: {0}".format(web.properties['Title']))
else:
    print(ctx_auth.get_last_error())

If you just want to download the file, File.open_binary() is all you need:

filename = 'filename.xlsx'
with open(filename, 'wb') as output_file:
    response = File.open_binary(ctx, relative_url)
    output_file.write(response.content)

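However, if you want to analyze the contents of the file, you can download it into memory and then use Pandas, or the Python '.xlsx' tool of your choice, on it directly: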

import io
import pandas as pd

response = File.open_binary(ctx, relative_url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read file into pandas dataframe
df = pd.read_excel(bytes_file_obj)
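One thing that may go wrong at this step: if the request was not authorized, `response.content` may hold an error page or message instead of the workbook, and the Excel reader then fails with an error that is hard to interpret. Because an .xlsx file is a ZIP container, a quick check of the bytes before parsing can help. The sketch below uses only the standard library. `looks_like_xlsx` is a hypothetical helper, and both payloads are made up for the demonstration.

```python
import io
import zipfile


def looks_like_xlsx(content):
    """Return True if the bytes form a ZIP container, as every .xlsx file does."""
    return zipfile.is_zipfile(io.BytesIO(content))


# Made-up payloads: a tiny ZIP archive and an HTML sign-in page.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('xl/workbook.xml', '<workbook/>')
zip_bytes = buffer.getvalue()
html_bytes = b'<html><body>Sign in to your account</body></html>'

print(looks_like_xlsx(zip_bytes))
print(looks_like_xlsx(html_bytes))
```

With the snippet above you would call `looks_like_xlsx(response.content)` before handing the stream to `pd.read_excel`. Note that the check only confirms the container format, not that the archive is a valid workbook.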

...and you can take it from there. I hope this helps!

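  • @Dan, this looks great! I have a quick question about "url = 'https://yoursharepointsite.com/sites/documentsite'": when I enter my company's SharePoint URL, I get the error "AADSTS53003: Access has been blocked by Conditional Access policies. The access policy does not allow token issuance". Does this mean I need to contact my company's IT department to get it unblocked? Thanks!!! (3 upvotes)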

cas*_*t42 5

To read the file from the command line, you can do the following:

curl -O -L --ntlm  --user username:password "https://yoursharepointsite.com/sites/documentsite/sites/documentsite/Documents/filename.xlsx"

The simplest way to automate this with Python is based on requests_ntlm:

conda install requests_ntlm --channel conda-forge

Code to download filename.xlsx from SharePoint (Python 3):

import getpass
from pathlib import Path
from urllib.parse import unquote

import requests
from requests_ntlm import HttpNtlmAuth

# Paste the path to your file on SharePoint here
url = 'https://yoursharepointsite.com/sites/documentsite/sites/documentsite/Documents/filename.xlsx'

domain = 'ADMIN'  # adapt to the domain in which the user exists
user = getpass.getuser()
pwd = getpass.getpass(prompt='What is your Windows AD password? ')

# Use the last path segment of the URL as the local file name
filename = unquote(Path(url).name)

resp = requests.get(url, auth=HttpNtlmAuth(f'{domain}\\{user}', pwd))
resp.raise_for_status()  # stop here rather than saving an error page
with open(filename, 'wb') as output_file:
    output_file.write(resp.content)
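One caveat about deriving the file name: if the URL carries a query string (for example `?web=1`), `Path(url).name` keeps it as part of the name. A minimal sketch of a stricter variant, using only the standard library, splits the URL first and takes the last segment of its path. `filename_from_url` is a hypothetical helper name and the URL below is a placeholder.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the decoded last path segment of a URL, ignoring any query string."""
    path = urlsplit(url).path          # drops the scheme, host, query and fragment
    return unquote(PurePosixPath(path).name)


example = 'https://yoursharepointsite.com/sites/documentsite/Documents/My%20Report.xlsx?web=1'
print(filename_from_url(example))
```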