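I have a script that opens each file in a list and then does something with the text in that file. I'm using Python multiprocessing and Pool to try to parallelize this operation. An abstraction of the script is below: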
import os
from multiprocessing import Pool
results = []
def testFunc(files):
    for file in files:
        print "Working in Process #%d" % (os.getpid())
        #This is just an illustration of some logic. This is not what I'm actually doing.
        for line in file:
            if 'dog' in line:
                results.append(line)
if __name__=="__main__":
    p = Pool(processes=2)
    files = ['/path/to/file1.txt', '/path/to/file2.txt']
    results = p.apply_async(testFunc, args = (files,))
    results2 = results.get()
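When I run this, the process ID printed on every iteration is the same. Basically, what I'm trying to do is take each element of the input list and fork it out to a separate process, but it seems that one process is doing all of the work.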
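One likely reading of the behaviour described above: `apply_async` submits a single task, and that task receives the whole list, so a single worker iterates over every file. A minimal sketch of the per-file alternative follows, assuming Python 3 and that each file can be processed independently. The names `find_matches` and `search_files` are hypothetical, and the `'dog'` filter simply mirrors the illustration in the question.

```python
import os
import sys
from multiprocessing import Pool


def find_matches(path, needle="dog"):
    """Worker task: scan ONE file and return (worker pid, matching lines)."""
    matches = []
    with open(path) as handle:
        for line in handle:
            if needle in line:
                matches.append(line.rstrip("\n"))
    # Results are returned rather than appended to a module-level list,
    # because each worker process has its own copy of module globals.
    return os.getpid(), matches


def search_files(paths, processes=2):
    """Submit one task per path; results come back in input order."""
    if not paths:
        return []  # nothing to do, so no worker processes are started
    with Pool(processes=processes) as pool:
        return pool.map(find_matches, paths)


if __name__ == "__main__":
    # Taking the paths from the command line is an assumption of this sketch.
    for pid, matches in search_files(sys.argv[1:]):
        print("Process #%d found %d matching line(s)" % (pid, len(matches)))
```

With very short tasks a single worker may still pick up several files, so seeing the same PID more than once does not by itself mean the pool is idle.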
I have a huge JSON file, a small part of which looks like this:
{
  "socialNews": [{
    "adminTagIds": "",
    "fileIds": "",
    "departmentTagIds": "",
    ........
    ........
    "comments": [{
      "commentId": "",
      "newsId": "",
      "entityId": "",
      ....
      ....
    }]
  }]
  .....
}
I applied a lateral view with explode on socialNews as follows:
val rdd = sqlContext.jsonFile("file:///home/ashish/test")
rdd.registerTempTable("social")
val result = sqlContext.sql("select * from social LATERAL VIEW explode(socialNews) social AS comment")
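Now I want to convert this result (a DataFrame) back to JSON and save it to a file, but I can't find any Scala API that does the conversion. Is there a standard library for this, or some way to work around it?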
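I'm wrapping my head around the State monad. Trivial examples are easy to understand. I'm now moving on to a real-world case where the domain objects are composite. For example, take the following domain objects (they don't make much sense, they are purely an example):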
case class Master(workers: Map[String, Worker])
case class Worker(elapsed: Long, result: Vector[String])
case class Message(workerId: String, work: String, elapsed: Long)
Taking Worker as the S type of the State[S, +A] monad, it is easy to write a few combinators like these:
type WorkerState[+A] = State[Worker, A]
def update(message: Message): WorkerState[Unit] = State.modify { w =>
  w.copy(elapsed = w.elapsed + message.elapsed,
         result = w.result :+ message.work)
}
def getWork: WorkerState[Vector[String]] = State { w => (w.result, w) }
def getElapsed: WorkerState[Long] = State { w => (w.elapsed, w) }
def updateAndGetElapsed(message: Message): WorkerState[Long] = …
I just started working through the Rust tutorial and ended up with code like this, using recursion:
extern crate rand;
use std::io;
use rand::Rng;
use std::cmp::Ordering;
use std::str::FromStr;
use std::fmt::{Display, Debug};
fn try_guess<T: Ord>(guess: T, actual: T) -> bool {
    match guess.cmp(&actual) {
        Ordering::Less => {
            println!("Too small");
            false
        }
        Ordering::Greater => {
            println!("Too big");
            false
        }
        Ordering::Equal => {
            println!("You win!");
            true
        }
    }
}
fn guess_loop<T: Ord + FromStr + Display + Copy>(actual: T)
    where <T as FromStr>::Err: Debug
{
    println!("PLease input your guess.");
    let mut guess = String::new();
    io::stdin()
        .read_line(&mut guess)
        .expect("Failed to …
I'm testing this code on my VirtualBox'ed Ubuntu 11.4:
package main
import ("fmt";"time";"big")
var c chan *big.Int
func sum( start,stop,step int64) {
    bigStop := big.NewInt(stop)
    bigStep := big.NewInt(step)
    bigSum := big.NewInt(0)
    for i := big.NewInt(start);i.Cmp(bigStop)<0 ;i.Add(i,bigStep){
        bigSum.Add(bigSum,i)
    }
    c<-bigSum
}
func main() {
    s := big.NewInt( 0 )
    n := time.Nanoseconds()
    step := int64(4)
    c = make( chan *big.Int , int(step))
    stop := int64(100000000)
    for j:=int64(0);j<step;j++{
        go sum(j,stop,step)
    }
    for j:=int64(0);j<step;j++{
        s.Add(s,<-c)
    }
    n = time.Nanoseconds() - n
    fmt.Println(s,float64(n)/1000000000.)
}
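Ubuntu has access to all 4 of my cores. I checked this by running several executables at the same time and watching System Monitor. But when I run this code, it uses only one core and gains nothing from parallel processing.
What am I doing wrong?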
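Suppose I want to traverse the generic representation of a case class, as described here.
I've defined a type class to describe fields: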
trait Described[X] extends (X => String)
object Described{
  def apply[X](x: X)(implicit desc: Described[X]) = desc(x)
}
Defined an instance:
implicit object DoubleDescribed extends Described[Double]{
  def apply(x: Double) = x.formatted("%01.3f")
}
And a generic user of it:
import shapeless._
import shapeless.labelled.FieldType
import shapeless.ops.hlist.LeftFolder
object DescrFolder extends Poly2{
  implicit def field[X, S <: Symbol](implicit desc: Described[X],
                                     witness: Witness.Aux[S]):
  Case.Aux[Seq[String], FieldType[S, X], Seq[String]] =
    at[Seq[String], FieldType[S, X]](
      (descrs, value) => descrs :+ f"${witness.value.name}: ${desc(value)}")
}
def describe[T <: Product, Repr <: HList](struct: T)
    (implicit lgen: LabelledGeneric.Aux[T,Repr],
     folder: …
I'm new to Haskell and have only been playing around with it for a while.
I wrote a lightweight OOP simulation:
--OOP.hs
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, UndecidableInstances, ScopedTypeVariables, FunctionalDependencies #-}
module OOP where
class Provides obj iface where
    provide::obj->iface
    (#>)::obj->(iface->a)->a
    o #> meth = meth $ provide o
class Instance cls obj | obj -> cls where
    classOf::obj->cls
class Implements cls iface where
    implement::(Instance cls obj)=>cls->obj->iface
instance (Instance cls obj, Implements cls iface)=>Provides obj iface where
    provide x = implement (classOf x::cls) x
Using it like this:
--main.hs
{-# LANGUAGE MultiParamTypeClasses #-}
import OOP
data I1 = I1
getI1::I1->String
getI1 i1 = …
I'm trying to define a case class with 1000 fields in the Scala REPL 2.11.8. The case class is defined as follows:
case class Step2_Class(
  `Response` : String,
  `D1` : String,
  `D2` : String,
  `D3` : String,
  `D4` : String,
  //......,
  `D999` : String,
  `D1000` : String)
The REPL hangs without responding. After about an hour, the following stack overflow exception is thrown:
java.lang.StackOverflowError
at scala.reflect.internal.Trees$class.traverseComponents$1(Trees.scala:1294)
at scala.reflect.internal.Trees$class.itraverse(Trees.scala:1330)
at scala.reflect.internal.SymbolTable.itraverse(SymbolTable.scala:16)
at scala.reflect.internal.SymbolTable.itraverse(SymbolTable.scala:16)
at scala.reflect.api.Trees$Traverser.traverse(Trees.scala:2475)
at scala.reflect.internal.Positions$DefaultPosAssigner.traverse(Positions.scala:288)
at scala.reflect.internal.Positions$DefaultPosAssigner.traverse(Positions.scala:282)
at scala.reflect.internal.Trees$class.traverseComponents$1(Trees.scala:1283)
at scala.reflect.internal.Trees$class.itraverse(Trees.scala:1330)
Any ideas? Does Scala not support this case? Is there a workaround?
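I want to try the Monocle library, but I can't find helpful resources on its basic syntax.
In short, I need an optic Map[K,V] -> A, given an optic V -> A. How can I define it?
Suppose I have some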
import monocle.macros.GenLens
case class DirState(opened: Boolean)
object DirState {
  val opened = GenLens[DirState](_.opened)
}
type Path = List[String]
type StateStore = Map[Path, DirState]
Next I ran into a place where I need a simple StateStore => StateStore, so I'm importing
import monocle._
import monocle.std._
import monocle.syntax._
import monocle.function._
and trying to define first:
def setOpened(path: Path): StateStore => StateStore =
  at(path) composeLens DirState.opened set true
and I get this:
ambiguous implicit values: both method atMap in trait MapInstances of type [K, V]=> monocle.function.At[Map[K,V],K,V] and method atSet in trait SetInstances of type [A]=> monocle.function.At[Set[A],A,Unit] …
In the following example
import shapeless._
import shapeless.syntax.singleton._
val concat = "right".narrow
def extract[s <: String](x: s)(implicit witness: Witness.Aux[s]): String = witness.value
extract(concat)
I get an error:
error: could not find implicit value for parameter witness: shapeless.Witness.Aux[String("right")]
What I'm trying to build is a type-level DSL that relies heavily on singleton types.
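Since literal singleton types are not supported outside Typelevel's fork, I'd like to develop a value-based DSL in addition to the type-literal one, and keeping the singleton type available in the value's type is essential for this task.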
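So I'm looking for any workaround that would let me later extract a singleton string witness from the type of a value.
Using .witness instead of .narrow works perfectly, but I'm still looking for a purely type-based solution that does not need the Witness wrapper.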