I have a data frame that I want to group by Name and Acquired, keeping only the groups that contain Position 10.
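The selection described here amounts to a grouped filter: form groups by (Name, Acquired) and keep a group only if at least one of its rows has Position 10. A minimal sketch of that logic in plain Python follows. The rows are a hand-typed subset of the sample data, and the helper name `groups_with_position` is hypothetical, not part of any library.

```python
from collections import defaultdict

# Hand-typed subset of the sample rows: (Name, Acquired, Position, Salary).
ROWS = [
    ("Adam Dunn*", "Amateur Draft", 7, 250000),
    ("Adam Dunn*", "Amateur Draft", 7, 400000),
    ("Adam Dunn*", "Free Agency", 3, 8000000),
    ("Adam Dunn*", "Free Agency", 3, 12000000),
    ("Adam Dunn*", "Free Agency", 10, 12000000),
]


def groups_with_position(rows, position):
    """Group rows by (Name, Acquired); keep groups where any row has `position`."""
    groups = defaultdict(list)
    for row in rows:
        name, acquired = row[0], row[1]
        groups[(name, acquired)].append(row)
    # A group survives only if at least one of its rows matches the position.
    return {
        key: group
        for key, group in groups.items()
        if any(r[2] == position for r in group)
    }


kept = groups_with_position(ROWS, 10)
for key, group in kept.items():
    print(key, len(group))
```

Data frame libraries usually express the same idea as a group-by followed by a filter on an "any" condition, so the whole group is kept or dropped rather than individual rows.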
Here is a sample of my data frame.
         Name      Acquired Position   Salary
1  Adam Dunn* Amateur Draft        7   250000
2  Adam Dunn* Amateur Draft        7   400000
3  Adam Dunn* Amateur Draft        7   445000
4  Adam Dunn* Amateur Draft        7  4600000
5  Adam Dunn* Amateur Draft        7  7500000
6  Adam Dunn* Amateur Draft        7 10500000
7  Adam Dunn* Amateur Draft        7 13000000
8  Adam Dunn*   Free Agency        3  8000000
9  Adam Dunn*   Free Agency        3 12000000
10 Adam Dunn*   Free Agency       10 12000000
11 Adam Dunn* …

I am trying to read a shapefile into a GeoDataFrame.
Normally I just do this, and it works:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
df = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")
But this time it gives me the error: b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".'
Full error:
---------------------------------------------------------------------------
CPLE_AppDefinedError Traceback (most recent call last)
<ipython-input-14-adcad0275d30> in <module>()
----> 1 df_wildfires_2016 = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")
/usr/local/lib/python3.6/site-packages/geopandas/io/file.py in read_file(filename, **kwargs)
19 """
20 bbox = kwargs.pop('bbox', None)
---> 21 with fiona.open(filename, **kwargs) as f:
22 crs = f.crs
23 if bbox is not None:
/usr/local/lib/python3.6/site-packages/fiona/__init__.py in open(path, mode, driver, schema, …

I wrote this code to create a map.
ggplot(data = Canada2015_Import_3) +
  borders(database = "world",
          colour = "grey60",
          fill="grey90") +
  geom_polygon(aes(x=long, y=lat, group = group, fill = Trade_Value_mean),
               color = "grey60") +
  scale_fill_gradient(low = "blue", high = "red", name = "Trade Value") +
  ggtitle("Canadien Imports in 2015") +
  xlab("") + ylab("") +
  theme(panel.background = element_blank(),
        plot.title = element_text(face = "bold"),
        axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank())
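This map gives me a legend in scientific notation, and I would like to change it to plain numbers or numbers with comma separators.
Does anyone know how to do this?
Here is the basic structure of my data frame.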
    Country Trade_Value_mean     long      lat group order subregion
Afghanistan          2359461 74.89131 37.23164     2    12      <NA>
All help is appreciated.
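The request above comes down to how a large number is turned into label text. As an illustration in Python (not the R used in the question), the sketch below formats the sample value 2359461 three ways: scientific notation, plain digits, and digits with thousands separators. The function name `comma_labels` and the list of break values are hypothetical.

```python
def comma_labels(values):
    """Format each number with thousands separators, e.g. 2359461 -> '2,359,461'."""
    return [f"{v:,}" for v in values]


value = 2359461  # Trade_Value_mean from the sample row

sci = f"{value:.2e}"    # scientific notation, the style the legend shows now
plain = f"{value:d}"    # plain digits
commas = f"{value:,}"   # digits with comma separators

print(sci, plain, commas)

# Hypothetical legend breaks, formatted the way the question asks for.
print(comma_labels([500000, 1000000, 2359461]))
```

In ggplot2 the analogous step would be handing a formatting function to the fill scale through its `labels` argument; `scales::comma` is commonly used for this, though that is a suggestion and not something this sketch runs.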

Using the code below, I created a dumbbell chart.
f <- ggplot(Brewers_PA, aes(x=PA.2015, xend=PA.2016, y=Name))
f + geom_dumbbell(colour = "darkblue", point.colour.l = "darkred", point.colour.r = "darkBlue", point.size.l = 2.5, point.size.r = 2.5) +
  theme(plot.background=element_rect(fill = "grey93", colour = "grey93")) +
  theme(plot.title=element_text(size = 11, face = "bold", hjust = 0)) +
  theme(axis.text.x=element_text(size = 8)) +
  theme(axis.text.y=element_text(size = 8)) +
  theme(axis.title.x=element_text(size = 9)) +
  theme(axis.title.y=element_text(size=9)) + ylab("") + xlab("Plate Appearance") +
  ggtitle("Brewers Change in Plate Appearance 2015-2016")
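I was able to do this thanks to this tutorial: https://www.r-bloggers.com/beating-lollipops-into-dumbbells/
The only problem is that I would like to add a legend to the chart, but I don't know how. Does anyone know? All help is appreciated.
Basically, I would like the legend to show the colors: "darkblue" = 2016 (PA.2016) and "darkred" = 2015 (PA.2015). I wanted to attach a picture, but for some reason it would not work.
Here is the data frame I created: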
Name PA.2015 PA.2016
1 Jonathan Villar …

I ran into a problem while trying to publish my Shiny app.
Here is the code for the app I published:
UI:
library(shiny)
library(ggplot2)
library(dplyr)
ui <- fluidPage(
  titlePanel("Visualizing Pitcher Statistics"),
  sidebarLayout(
    sidebarPanel(
      helpText("Data from Baseball Prospectus"),
      helpText("by Julien Assouline"),
      sliderInput("yearinput", "YEAR",
                  min = 1970, max = 2016, value = c(2000, 2016),
                  animate = TRUE),
      selectInput("xcol", "X Axis",
                  choices = c("YEAR","AGE","NAME","G","GS","PITCHES","IP","IP.Start","IP.Relief","W","L","SV","BS","PA","AB","R","ER","H","X1B","X2B","X3B","HR","TB","BB","UBB","IBB","SO","HBP","SF","SH","PPF","FIP","cFIP","ERA","DRA","PWARP","TEAMS","ROOKIE","League")),
      selectInput("ycol", "y Axis",
                  choices = c("PWARP","YEAR","NAME","AGE","G","GS","PITCHES","IP","IP.Start","IP.Relief","W","L","SV","BS","PA","AB","R","ER","H","X1B","X2B","X3B","HR","TB","BB","UBB","IBB","SO","HBP","SF","SH","PPF","FIP","cFIP","ERA","DRA","TEAMS","ROOKIE","League")),
      checkboxInput(inputId = "smoother",
                    label = "show smoother",
                    value = FALSE),
      downloadButton("downloadPNG", "Download as a PNG file")
    ),
    mainPanel(
      tabsetPanel(
        tabPanel("Scatterplot", plotOutput("plot1"),
                 verbatimTextOutput("descriptionTab1"), value = "Tab1"),
        tabPanel("Line Chart", plotOutput("plot2"),
                 verbatimTextOutput("descriptionTab2"), value = "Tab2"), …

I am new to web scraping, and I am trying to scrape tables across multiple web pages. This is the site: http://www.baseball-reference.com/teams/MIL/2016.shtml
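I was able to scrape a table from a single page easily with rvest. The page has several tables, but I only want the first one. Here is my code: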
library(rvest)
url4 <- "http://www.baseball-reference.com/teams/MIL/2016.shtml"
Brewers2016 <- url4 %>% read_html() %>%
html_nodes(xpath = '//*[@id="div_team_batting"]/table[1]') %>%
html_table()
Brewers2016 <- as.data.frame(Brewers2016)
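The problem is that I want to scrape the first table on each page going back to 1970. In the top-left corner, above the table, there is a link to the previous year. Does anyone know how I can do this?
I am also open to different approaches; for example, something other than rvest might work better. I used rvest because it is the one I started learning with.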
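One approach that avoids following the "previous year" link is to generate the page address for each season directly. The sketch below, in Python and not rvest, assumes that every season's page follows the same pattern as the 2016 URL with only the year changed; that pattern is inferred from the single URL in the question and is not verified. The names `season_urls` and `scrape_seasons` are hypothetical, and the function that actually downloads and parses a page is left to the caller.

```python
# URL pattern inferred from the 2016 page; assumed to hold for other seasons.
BASE = "http://www.baseball-reference.com/teams/MIL/{year}.shtml"


def season_urls(first=1970, last=2016):
    """Map each season to its assumed page URL, both end years included."""
    return {year: BASE.format(year=year) for year in range(first, last + 1)}


def scrape_seasons(fetch_first_table, first=1970, last=2016):
    """Apply a caller-supplied fetch_first_table(url) to every season's URL.

    The fetch function is where the single-page scraping logic would go;
    it is injected here so this sketch itself makes no network requests.
    """
    return {
        year: fetch_first_table(url)
        for year, url in season_urls(first, last).items()
    }


urls = season_urls()
print(len(urls), urls[1970])

# Stub fetcher for demonstration: returns the URL instead of a table.
demo = scrape_seasons(lambda url: url, first=2015, last=2016)
print(demo)
```

The per-season results can then be combined into one table, with the season stored as an extra column. Pausing briefly between requests is a reasonable courtesy when looping over this many pages.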