How do I dynamically pass parameters to the map function in GAE mapreduce?

Joh*_*ong 7 google-app-engine mapreduce

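I need to run a dynamic mapreduce job, in the sense that parameters need to be passed to the map and reduce functions each time the job is run (for example, in response to a user request).

How do I accomplish this? I can't see anywhere in the documentation how to do dynamic processing at runtime for map and reduce.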

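import webapp2
from mapreduce import base_handler
from mapreduce import mapreduce_pipeline


class MatchProcessing(webapp2.RequestHandler):

    def get(self):
        # Values taken from the user's request; these should reach map and reduce.
        requestKeyID = int(self.request.get('riderbeeRequestID'))
        userKey = self.request.get('userKey')
        pipeline = MatchingPipeline(requestKeyID, userKey)
        pipeline.start()
        self.redirect(pipeline.base_path + "/status?root=" + pipeline.pipeline_id)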


class MatchingPipeline(base_handler.PipelineBase):
    def run(self, requestKeyID, userKey):
        yield mapreduce_pipeline.MapreducePipeline(
            "riderbee_matching",
            "tasks.matchingMR.riderbee_map",
            "tasks.matchingMR.riderbee_reduce",
            "mapreduce.input_readers.DatastoreInputReader",
            "mapreduce.output_writers.BlobstoreOutputWriter",
            mapper_params={
                "entity_kind": "models.rides.RiderbeeRequest",
                "requestKeyID": requestKeyID,
                "userKey": userKey,
            },
            reducer_params={
                "mime_type": "text/plain",
            },
            shards=16)


def riderbee_map(riderbeeRequest):
    # would like to access the requestKeyID and userKey parameters that were passed in mapper_params
    # so that we can do some processing based on that

    yield (riderbeeRequest.user.email, riderbeeRequest.key().id())


def riderbee_reduce(key, values):
    # would like to access the requestKeyID and userKey parameters that were passed earlier, perhaps through reducer_params
    # so that we can do some processing based on that

    yield "%s: %s\n" % (key, len(values))

Any help, please?

Moi*_*vin 5

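I'm pretty sure you can specify the parameters in the mapper parameters (mapper_params in your pipeline) and read them back from the context module. See http://code.google.com/p/appengine-mapreduce/wiki/UserGuidePython#Mapper_parameters for details.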