Lae*_*Oss 0 python mysql pyspark google-colaboratory
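I have been trying to write to and read from a table in MySQL Server 8.0.19, running on localhost on Windows 10, using pyspark from Google Colab, without success. There are many similar questions with suggested answers, but none of those solutions seem to work here. Here is my code: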
<...installations ...>
from pyspark.sql import SparkSession
spark = SparkSession\
    .builder\
    .appName("Word Count")\
    .config("spark.driver.extraClassPath", "/content/spark-2.4.5-bin-hadoop2.7/jars/mysql-connector-java-8.0.19.jar")\
    .getOrCreate()
Here is the connection string:
MyjdbcDF = spark.read.format("jdbc")\
    .option("url", "jdbc:mysql://127.0.0.1:3306/mydb?user=testuser&password=pwtest")\
    .option("dbtable", "collisions")\
    .option("driver","com.mysql.cj.jdbc.Driver")\
    .load()
I also tried .option("driver","com.mysql.jdbc.Driver"), but I keep getting this error:
Py4JJavaError: An error occurred while calling o154.load.
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
...
...
...
Caused by: java.net.ConnectException: Connection refused (Connection refused)
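From this, I gather that the MySQL Server is unreachable. I telnetted to port 3306, which confirmed that the MySQL server is accepting connections from the client machine. I read that running netsh advfirewall firewall add rule name="MySQL Server" action=allow protocol=TCP dir=in localport=3306 would add a firewall rule allowing the MySQL server through, in case it was being blocked, but nothing changed.
Can someone help out?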
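One thing worth checking, assuming the notebook runs on Colab's hosted runtime (a remote VM) rather than a local runtime: in that case 127.0.0.1 inside the notebook refers to the Colab VM itself, not to the Windows 10 machine, so a telnet test run from Windows does not show what the Spark driver sees. Below is a minimal sketch of the same reachability check, meant to be run from a notebook cell. The helper name `can_connect` is hypothetical and only uses the Python standard library.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Run inside the notebook: this tests the machine the notebook runs on,
# which is where the JDBC driver opens its connection from.
print(can_connect("127.0.0.1", 3306))
```

If this prints False in the notebook while telnet succeeds on the Windows machine, that would be consistent with the "Connection refused" in the stack trace: the two checks are probing different machines.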
Here is how I install and set up MySQL on Colab:
# install, set connection
!apt-get install mysql-server > /dev/null
!service mysql start
!mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'root'"
!pip -q install PyMySQL
%load_ext sql
%config SqlMagic.feedback=False
%config SqlMagic.autopandas=True
%sql mysql+pymysql://root:root@/
# query using %sql or %%sql
df = %sql SELECT Host, User, authentication_string FROM mysql.user
df
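With MySQL running inside the Colab VM as set up above, the JDBC read from the question could then point at that local server. The following is only a sketch under several assumptions: the helper names `mysql_jdbc_options` and `read_mysql_table` are hypothetical, the database `mydb` and table `collisions` are taken from the question and would have to exist in the Colab MySQL instance, and the Connector/J jar must be on the driver classpath as in the original SparkSession config. It passes credentials through Spark's `user` and `password` JDBC options instead of the URL query string.

```python
def mysql_jdbc_options(database, table, user, password,
                       host="127.0.0.1", port=3306):
    """Build the option dict for Spark's JDBC reader (illustrative helper)."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class shipped in mysql-connector-java 8.x
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def read_mysql_table(spark, options):
    """Load a table as a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Usage in the notebook, with the `spark` session created earlier and the
# root/root credentials set by the ALTER USER statement above:
# opts = mysql_jdbc_options("mydb", "collisions", "root", "root")
# df = read_mysql_table(spark, opts)
# df.show(5)
```

Keeping the option dict separate from the read call makes it easy to print and inspect the exact URL being used before Spark attempts the connection.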