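This is a general question for anyone who understands the internals of function definitions better than I do.
In general, is there a performance trade-off in doing something like this: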
def my_function():
    def other_function():
        pass
    # do some stuff
    other_function()
versus:
def other_function():
    pass
def my_function():
    # do some stuff
    other_function()
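I have seen developers define functions inline like this before, to keep a small, single-use function close to the code that actually uses it, but I have always wondered whether doing so carries a memory (or computational) performance penalty.
Thoughts?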
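One way to get a rough feel for the cost is to measure it. The sketch below assumes CPython and uses hypothetical names (helper, uses_module_level, uses_nested, make_nested) that are not part of the question. In CPython the inner def statement is executed on every call of the outer function, which builds a new function object from a code object that was compiled only once, so the overhead is usually small but not zero. Timings vary by machine and interpreter version, so no figures are quoted here.

```python
import timeit


def helper():
    return 1


def uses_module_level():
    # Looks up the already-created module-level function on each call.
    return helper()


def uses_nested():
    # The inner `def` statement runs on every call and builds a new function object.
    def helper_inner():
        return 1
    return helper_inner()


def make_nested():
    # Returns the inner function so that its identity can be inspected.
    def helper_inner():
        return 1
    return helper_inner


if __name__ == "__main__":
    for fn in (uses_module_level, uses_nested):
        seconds = timeit.timeit(fn, number=200_000)
        print(f"{fn.__name__}: {seconds:.4f} s for 200,000 calls")

    first, second = make_nested(), make_nested()
    print("new function object per call:", first is not second)
    print("shared compiled code object:", first.__code__ is second.__code__)
```

If the helper is only ever called from one place and the outer function is not in a hot loop, the difference is unlikely to matter; measuring the real code path is the safer guide.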
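I have been working with mapGroupsWithState in Spark Structured Streaming (in particular, following the StructuredSessionization example in the Spark source code). I want to confirm some limitations of mapGroupsWithState for my use case.
In my case, a session is an uninterrupted group of a user's activity, such that the gap between two chronologically consecutive events (ordered by event time, not processing time) never exceeds a developer-defined duration (typically 30 minutes).
Before getting into the code, an example will help: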
{"event_time": "2018-01-01T00:00:00", "user_id": "mike"}
{"event_time": "2018-01-01T00:01:00", "user_id": "mike"}
{"event_time": "2018-01-01T00:05:00", "user_id": "mike"}
{"event_time": "2018-01-01T00:45:00", "user_id": "mike"}
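As a point of reference for the session definition, here is a minimal batch sketch in plain Python (not Spark, and not mapGroupsWithState) that groups the example events by user and starts a new session whenever the event-time gap exceeds 30 minutes. The function name sessionize and the constant SESSION_GAP are hypothetical; the output field names mirror the session record used in this question. A batch pass has no watermark, so unlike a streaming query it also reports the trailing session that a stream would still treat as open.

```python
from datetime import datetime, timedelta
from itertools import groupby

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity threshold


def sessionize(events, gap=SESSION_GAP):
    """Group events into per-user sessions by event time (batch, not streaming)."""
    sessions = []
    # ISO-8601 timestamps in one format sort correctly as plain strings.
    keyed = sorted(events, key=lambda e: (e["user_id"], e["event_time"]))
    for user_id, user_events in groupby(keyed, key=lambda e: e["user_id"]):
        start = end = None
        for event in user_events:
            ts = datetime.fromisoformat(event["event_time"])
            if start is None:
                start = end = ts
            elif ts - end > gap:
                # Gap too large: close the current session and open a new one.
                sessions.append(_session(user_id, start, end))
                start = end = ts
            else:
                end = ts
        sessions.append(_session(user_id, start, end))
    return sessions


def _session(user_id, start, end):
    return {
        "user_id": user_id,
        "startTimestamp": start.isoformat(),
        "endTimestamp": end.isoformat(),
    }


events = [
    {"event_time": "2018-01-01T00:00:00", "user_id": "mike"},
    {"event_time": "2018-01-01T00:01:00", "user_id": "mike"},
    {"event_time": "2018-01-01T00:05:00", "user_id": "mike"},
    {"event_time": "2018-01-01T00:45:00", "user_id": "mike"},
]

if __name__ == "__main__":
    for session in sessionize(events):
        print(session)
```

The first three events fall into one session because each gap is under 30 minutes; the 00:45 event is 40 minutes after 00:05, so it opens a second session.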
For the stream above, a session is defined by a 30-minute period of inactivity. In a streaming context, we should end up with one session (the second session has not completed yet):
[
  {
    "user_id": "mike",
    "startTimestamp": "2018-01-01T00:00:00",
    "endTimestamp": "2018-01-01T00:05:00"
  }
]
Now consider the following Spark driver program: …
The output of that program is:
root
|-- event_time: timestamp (nullable = true)
|-- user_id: string (nullable = true)
state update for user mike (current watermark: 1969-12-31 19:00:00.0)
User mike has new …
I am trying to write simple unit tests using shoulda and Rails 3.
test/unit/user_test.rb
class UserTest < Test::Unit::TestCase
  should validate_presence_of(:password, :on => :create)
  should validate_presence_of(:handle, :email)
  should validate_confirmation_of(:password)
  should validate_length_of(:handle, :within => 6..15)
  should validate_uniqueness_of(:handle)
  should validate_format_of(:handle, :with => /\A\w+\z/i)
  should validate_length_of(:email, :within => 6..100)
end
Relevant part of the Gemfile:
group :test do
  gem 'shoulda'
  gem 'rspec-rails', '2.0.0.beta.12'
end
When I run the tests with rake test --trace, I get the following error:
** Execute test:units
/Users/removed/removed/removed/app_name/test/unit/user_test.rb:5: superclass mismatch for class UserTest (TypeError)
from /Library/Ruby/Gems/1.8/gems/activesupport-3.0.7/lib/active_support/dependencies.rb:239:in `require'
from /Library/Ruby/Gems/1.8/gems/activesupport-3.0.7/lib/active_support/dependencies.rb:239:in `require'
from /Library/Ruby/Gems/1.8/gems/activesupport-3.0.7/lib/active_support/dependencies.rb:227:in `load_dependency'
from /Library/Ruby/Gems/1.8/gems/activesupport-3.0.7/lib/active_support/dependencies.rb:239:in `require'
from /Library/Ruby/Gems/1.8/gems/rake-0.9.2/lib/rake/rake_test_loader.rb:9
from /Library/Ruby/Gems/1.8/gems/rake-0.9.2/lib/rake/rake_test_loader.rb:9:in `each'
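from …
Similar to this question.
I am using AFNetworking 2.0.3 and trying to upload an image with AFHTTPSessionManager's POST + constructingBodyWithBlock. For reasons unknown, the HTTP message body always appears to be empty when the request is made to the server.
The code below is in a subclass of AFHTTPSessionManager (hence the use of [self POST ...]).
I have tried constructing the request in two ways.
Method 1: I simply pass the params and then add the image data only if it exists.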
- (void) createNewAccount:(NSString *)nickname accountType:(NSInteger)accountType primaryPhoto:(UIImage *)primaryPhoto
{
    NSString *accessToken = self.accessToken;
    // Ensure none of the params are nil, otherwise it'll mess up our dictionary
    if (!nickname) nickname = @"";
    if (!accessToken) accessToken = @"";
    NSDictionary *params = @{@"nickname": nickname,
                             @"type": [[NSNumber alloc] initWithInteger:accountType],
                             @"access_token": accessToken};
    NSLog(@"Creating new account %@", params);
    [self POST:@"accounts" parameters:params constructingBodyWithBlock:^(id<AFMultipartFormData> formData) {
        if (primaryPhoto) {
            [formData appendPartWithFileData:UIImageJPEGRepresentation(primaryPhoto, 1.0)
                                        name:@"primary_photo" …
This may seem obvious, but after reviewing the documentation and examples, I am not sure I can find a way to apply a transformation to a Structured Streaming query using PySpark.
For example:
from pyspark.sql import SparkSession
spark = (
    SparkSession
    .builder
    .appName('StreamingWordCount')
    .getOrCreate()
)
raw_records = (
    spark
    .readStream
    .format('socket')
    .option('host', 'localhost')
    .option('port', 9999)
    .load()
)
# I realize there's a SQL function for upper-case, just illustrating a sample
# use of an arbitrary map function
records = raw_records.rdd.map(lambda w: w.upper()).toDF()
counts = (
    records
    .groupBy(records.value)
    .count()
)
query = (
    counts
    .writeStream
    .outputMode('complete')
    .format('console')
    .start()
)
query.awaitTermination()
This throws the following exception:
Queries with streaming sources must be executed with writeStream.start
However, if I remove the call to rdd.map(...).toDF() …
I have a model that is linked to two sub-models, like so:
var SubModel = Backbone.Model.extend({
    defaults: {
        headline: null,
        image_url: null,
        url: null
    }
});
var MainModel = Backbone.Model.extend({
    defaults: {
        subModelA: null,
        subModelB: null,
        title: null
    },
    urlRoot: function() {
        if (this.isNew()) {
            return '/mainmodel/new';
        }
        return '/mainmodel';
    },
    initialize: function() {
        this.fetch();
    },
    parse: function(data) {
        var response = {};
        response.subModelA = new SubModel(data.subModelA);
        response.subModelB = new SubModel(data.subModelB);
        response.title = data.title;
        return response;
    }
});
The problem I am currently having is that calling var mainModelInstance = new MainModel() does correctly fetch /mainmodel/new, but mainModelInstance.attributes is always an empty object {}. …