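Please find the code below and let me know how to change the column names to lowercase. I tried withColumnRenamed, but I have to call it once per column and type out every column name. I want to do this across all columns without listing each name, because there are too many of them.
Scala version: 2.11, Spark: 2.2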
import org.apache.spark.sql.SparkSession
import org.apache.log4j.{Level, Logger}
import com.datastax
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import com.datastax.spark.connector._
import org.apache.spark.sql._
object dataframeset {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Sample1").setMaster("local[*]")
    val sc = new SparkContext(conf)
    sc.setLogLevel("ERROR")
    val rdd1 = sc.cassandraTable("tdata", "map3")
    Logger.getLogger("org").setLevel(Level.ERROR)
    Logger.getLogger("akka").setLevel(Level.ERROR)
    val spark1 = org.apache.spark.sql.SparkSession.builder().master("local").config("spark.cassandra.connection.host", "127.0.0.1")
      .appName("Spark SQL basic example").getOrCreate()
    val df = spark1.read.format("csv").option("header", "true").option("inferschema", "true").load("/Users/Desktop/del2.csv")
    import spark1.implicits._
    println("\nTop Records are:")
    df.show(1)
    // Select three columns by name
    val dfprev1 = df.select("sno", "year", "StateAbbr")
    dfprev1.show(1)
  }
}
Required output:
import org.apache.spark.sql.SparkSession
import org.apache.log4j.{Level, …

Can we call a method based on an if/elif/else condition? For example:
if line2[1] == '1':
    a(line2)
elif line2[1] == '2':
    b(line2)
elif line2[1] == '3':
    c(line2)
And the list goes on. Can we use a map (dictionary) and call the function from it? Say:
Example line input:
line = ['1','say','Hi']
line = ['2','How','Are']
Code:
def g(line):
    my_map = { '1': a(line),
               '2': b(line),
               '3': c(line, b),
               # ...... and the list goes on
             }
Here, if line[0] == '1', call a(line);
elif line[0] == '2', call b(line).
How can I call a function based on the input?
If possible, please send sample code.
Thanks, Rakesh
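One common way to do what the question describes is a dispatch dictionary: store the function objects themselves as values (no parentheses), then look one up and call it. The sketch below is only an illustration under assumptions: the bodies of a, b and c are hypothetical placeholders, the key is taken from line[0] as in the pseudocode above, and the name HANDLERS is made up for this example.

```python
# Placeholder handlers (hypothetical bodies; the question does not show them).
def a(line):
    return "a handled " + line[1]

def b(line):
    return "b handled " + line[1]

def c(line):
    return "c handled " + line[1]

# Map each key to the function object itself. Writing a(line) here instead
# would call every function while the dictionary is being built.
HANDLERS = {"1": a, "2": b, "3": c}

def g(line):
    # Look up the handler for the first field, then call it with the line.
    handler = HANDLERS.get(line[0])
    if handler is None:
        raise ValueError("no handler for key %r" % line[0])
    return handler(line)

result_1 = g(['1', 'say', 'Hi'])
result_2 = g(['2', 'How', 'Are'])
print(result_1)
print(result_2)
```

Adding a new case then means adding one entry to the dictionary rather than another elif branch. If some handlers need extra arguments, one option is to store a small wrapper function or a functools.partial object as the value.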
line = ['1','say','Hi','','','','','','',.... continues 5-15 more times]
In the example above, what if I also need to assign other variables? How would I do that?
if line2[1] == '8':
    p.plast = line2[3]
    p.pfirst = line2[4]
    p.pid = line2[9]
elif line2[1] == …

Test file:
my_name = ''
my_home_address = ''
my_home_phone = ''
my_office_address = ''
my_office_phone = ''
Here is another file; let's call it test2:
import test
line1 = ['address', 'home', 'tom', 'downtown', '12345']
line1 = ['address', 'office', 'tom', 'uptown', '4567']
my_map = { 'home': ['my_name', 'my_home_address', 'my_home_phone'],
           'office1': ['my_name', 'my_office_address', 'my_office_phone'],
           'office2': ['my_name', 'my_office_address', 'my_office_phone'],
         }
if line1[1] in my_map.keys():
    for position, c_name in enumerate(line1):
        setattr(test, my_map[line1[1]][pos], my_map[line1[1]][pos])
Question: I have a test file containing all the fields that hold the data, e.g. my_name = '', my_home_address = '', my_home_phone = '', my_office_address = '', my_office_phone = '', etc.
My input is …
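The question text is cut off here, so the following is only a guess at the intent: copy the values of an input line into the attributes named in the map. It is a minimal sketch under several assumptions: a types.SimpleNamespace stands in for the test module so the example is self-contained, the values are assumed to start at index 2 of the line and to follow the order of the field names, and the key 'office' (rather than 'office1' or 'office2') is used so that it matches the sample input. FIELD_MAP and apply_line are hypothetical names.

```python
from types import SimpleNamespace

# Stand-in for the `test` module from the question (assumption: setattr
# works the same way on a namespace object as on an imported module).
test = SimpleNamespace(my_name='', my_home_address='', my_home_phone='',
                       my_office_address='', my_office_phone='')

# Attribute names to fill for each record type (hypothetical name FIELD_MAP).
FIELD_MAP = {
    'home': ['my_name', 'my_home_address', 'my_home_phone'],
    'office': ['my_name', 'my_office_address', 'my_office_phone'],
}

def apply_line(target, line):
    """Set the attributes listed for line[1] from the values in line[2:]."""
    field_names = FIELD_MAP.get(line[1])
    if field_names is None:
        return False  # unknown record type, nothing assigned
    # Assumption: values start at index 2 and follow the field order.
    for name, value in zip(field_names, line[2:]):
        setattr(target, name, value)
    return True

apply_line(test, ['address', 'home', 'tom', 'downtown', '12345'])
apply_line(test, ['address', 'office', 'tom', 'uptown', '4567'])
print(test.my_home_address, test.my_office_phone)
```

The key difference from the loop in the question is that setattr receives the attribute name from the map and the value from the input line, instead of taking both from the map.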