我使用以下代码绘制条形图.需要以相反的顺序呈现图例.我该怎么做?
colorsArr = plt.cm.BuPu(np.linspace(0, 0.5, len(C2)))
p = numpy.empty(len(C2), dtype=object)
plt.figure(figsize=(11,11))
prevBar = 0
for index in range(len(C2)):
plt.bar(ind, C2[index], width, bottom=prevBar, color=colorsArr[index],
label=C0[index])
prevBar = prevBar + C2[index]
# positions of the x-axis ticks (center of the bars as bar labels)
tick_pos = [i+(width/2) for i in ind]
plt.ylabel('Home Category')
plt.title('Affinity - Retail Details(Home category)')
# set the x ticks with names
plt.xticks(tick_pos, C1)
plt.yticks(np.arange(0,70000,3000))
plt.legend(title="Line", loc='upper left' )
# Set a buffer around the edge
plt.xlim(-width*2, width*2)
plt.show()
Run Code Online (Sandbox Code Playgroud) 什么是Pythonic/pandas在pandas中的列中排序"级别"以在条形图中给出特定的条形排序.
例如,给定:
import pandas as pd
df = pd.DataFrame({
'group': ['a', 'a', 'a', 'a', 'a', 'a', 'a',
'b', 'b', 'b', 'b', 'b', 'b', 'b'],
'day': ['Mon', 'Tues', 'Fri', 'Thurs', 'Sat', 'Sun', 'Weds',
'Fri', 'Sun', 'Thurs', 'Sat', 'Weds', 'Mon', 'Tues'],
'amount': [1, 2, 4, 2, 1, 1, 2, 4, 5, 3, 4, 2, 1, 3]})
dfx = df.groupby(['group'])
dfx.plot(kind='bar', x='day')
Run Code Online (Sandbox Code Playgroud)
我可以生成以下一对图:

条形的顺序遵循行顺序.
重新排序数据的最佳方法是什么,以便条形图有Mon-Sun订购的条形码?
更新:这种垃圾解决方案有效 - 但它使用额外排序列的方式远非优雅:
df2 = pd.DataFrame({
'day': ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun'],
'num': [0, 1, 2, …Run Code Online (Sandbox Code Playgroud) 鉴于以下热图,我将如何删除轴标题('月'和'年')?
import seaborn as sns
# Load the example flights dataset and conver to long-form
flights_long = sns.load_dataset("flights")
flights = flights_long.pivot("month", "year", "passengers")
# Draw a heatmap with the numeric values in each cell
sns.heatmap(flights, annot=True, fmt="d", linewidths=.5)
Run Code Online (Sandbox Code Playgroud)
我正在寻找下载一个gzip压缩的csv并将其作为R对象加载而不先将其保存到磁盘.我可以使用压缩文件执行此操作,但似乎无法使用gzfile或使用它gzcon.
例:
grabRemote <- function(url) {
temp <- tempfile()
download.file(url, temp)
aap.file <- read.csv(gzfile(temp), as.is = TRUE)
unlink(temp)
return(aap.file)
}
grabRemote("http://dumps.wikimedia.org/other/articlefeedback/aa_combined-20110321.csv.gz")
Run Code Online (Sandbox Code Playgroud)
下载包含维基百科文章反馈数据的(小)gz压缩文件(不重要,但只是表明它不是巨大的或邪恶的).
我的代码工作正常,但我觉得我错过了一些非常明显的东西,通过创建和销毁临时文件.
我有一个带有正值和负值的Pandas DataFrame作为条形图.我想绘制正面颜色'绿色'和负面值'红色'(非常原始...大声笑).如果> 0'green'else <0'red',我不知道如何通过?
data = pd.DataFrame([[-15], [10], [8], [-4.5]],
index=['a', 'b', 'c', 'd'],
columns=['values'])
data.plot(kind='barh')
Run Code Online (Sandbox Code Playgroud)

我geom_segment用来绘制活动的时间表.这一切都在同一条线上,因为我想将它与其他图形一起呈现,我宁愿使y轴更小.似乎ggplot2中灰色图形区域的大小总是正方形,无论我是将其缩放还是更大.有没有办法说我想要x = 500 y = 50或类似的东西?
df2 <- structure(list(Activities =
structure(c(2L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 2L, 2L, 2L, 5L,
4L, 3L, 2L, 2L),
.Label = c("authoring", "hacking", "learning",
"surfing", "tasks"),
class = "factor"),
Start = c(14895L, 15005L, 16066L, 16226L, 16387L, 16394L,
27030L,27532L, 27600L, 27687L, 28660L, 28713L,
29154L, 30264L, 30345L, 32245L),
End = c(15005L, 16066L, 16226L, 16387L,16394L, 16509L,
27491L, 27591L, 27628L, 28450L, 28704L, 29109L,
30250L, 30345L, 31235L, 33794L)),
.Names = c("Activities", "Start", "End"),
class = "data.frame", …Run Code Online (Sandbox Code Playgroud) 我正在使用caret包来调整glmnet逻辑回归模型.虽然lambda最佳曲调的值是我在其中指定tuneGrid的lambda值之一,但最终模型的值完全不同:
require(caret)
set.seed(1)
x <- matrix(runif(1000), nrow = 100)
y <- factor(rbinom(100, 1, 0.5))
lambda.seq <- exp(seq(log(1e-5), log(1e0), length.out = 20))
model <- train(x, y,
method ="glmnet",
family = "binomial",
tuneGrid = expand.grid(alpha = 1,
lambda = lambda.seq))
model$bestTune
# alpha lambda
# 13 1 0.0143845
model$finalModel$lambdaOpt
# [1] 0.0143845
model$finalModel$lambda
# [1] 0.1236344527 0.1126511087 0.1026434947 0.0935249295 0.0852164325 0.0776460395
# [7] 0.0707481794 0.0644631061 0.0587363814 0.0535184032 0.0487639757 0.0444319185
# [13] 0.0404847094 0.0368881594 0.0336111170 …Run Code Online (Sandbox Code Playgroud) 我正在运行逻辑回归,并在文本列上运行tf-idf.这是我在逻辑回归中使用的唯一列.如何确保尽可能调整参数?
我希望能够通过一系列步骤来最终允许我说我的Logistic回归分类器尽可能地运行.
from sklearn import metrics,preprocessing,cross_validation
from sklearn.feature_extraction.text import TfidfVectorizer
import sklearn.linear_model as lm
import pandas as p
loadData = lambda f: np.genfromtxt(open(f, 'r'), delimiter=' ')
print "loading data.."
traindata = list(np.array(p.read_table('train.tsv'))[:, 2])
testdata = list(np.array(p.read_table('test.tsv'))[:, 2])
y = np.array(p.read_table('train.tsv'))[:, -1]
tfv = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode',
analyzer='word', token_pattern=r'\w{1,}',
ngram_range=(1, 2), use_idf=1, smooth_idf=1,
sublinear_tf=1)
rd = lm.LogisticRegression(penalty='l2', dual=True, tol=0.0001,
C=1, fit_intercept=True, intercept_scaling=1.0,
class_weight=None, random_state=None)
X_all = traindata + testdata
lentrain = len(traindata)
print "fitting pipeline"
tfv.fit(X_all)
print "transforming data"
X_all = …Run Code Online (Sandbox Code Playgroud) python artificial-intelligence numpy machine-learning scikit-learn
有没有办法在R中分割驼峰案例字符串?
我试过了:
string.to.split = "thisIsSomeCamelCase"
unlist(strsplit(string.to.split, split="[A-Z]") )
# [1] "this" "s" "ome" "amel" "ase"
Run Code Online (Sandbox Code Playgroud) 是否有任何R套餐/功能可以实时获得汇率,例如Google财经?如果某些东西已经存在,我宁愿避免使用RCurl或其他解析器.
具体来说,给定"从"和"到"货币符号的向量,我想知道费率.就像是:
IdealFunction(c("CAD", "JPY", "USD"), c("USD", "USD", "EUR"))
Run Code Online (Sandbox Code Playgroud) r ×5
python ×4
matplotlib ×3
pandas ×2
camelcasing ×1
currency ×1
ggplot2 ×1
gzip ×1
legend ×1
numpy ×1
package ×1
python-3.x ×1
r-caret ×1
reverse ×1
scikit-learn ×1
seaborn ×1
split ×1
timeline ×1