I am new to Elasticsearch and have been entering data manually up to this point. For example, I've done something like this:
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}'
I now have a .json file that I would like to index into Elasticsearch. I tried something like the following, but without success:
curl -XPOST 'http://jfblouvmlxecs01:9200/test/test/1' -d lane.json
How do I import a .json file? Do I need to take any steps first to make sure the mapping is correct?
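For what it's worth, a minimal sketch of one common approach, assuming lane.json contains a single JSON document and using the host/index names from the question: curl needs an `@` prefix to read the request body from a file, and `--data-binary` (rather than `-d`) preserves newlines in the file.

```shell
# Read the request body from lane.json (note the @ prefix).
# The Content-Type header is required by newer Elasticsearch versions.
curl -XPOST 'http://jfblouvmlxecs01:9200/test/test/1' \
     -H 'Content-Type: application/json' \
     --data-binary @lane.json
```

If the file holds many documents rather than one, the `_bulk` endpoint with newline-delimited JSON is the usual route instead. Elasticsearch will infer a mapping dynamically on first index, so an explicit mapping is only needed if the inferred field types are wrong for your data.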
I am new to Python and Pandas. I am trying to convert a Pandas DataFrame into nested JSON. The .to_json() function does not give me enough flexibility to achieve my goal.
Here are some data points from the dataframe (in csv, comma separated):
,ID,Location,Country,Latitude,Longitude,timestamp,tide
0,1,BREST,FRA,48.383,-4.495,1807-01-01,6905.0
1,1,BREST,FRA,48.383,-4.495,1807-02-01,6931.0
2,1,BREST,FRA,48.383,-4.495,1807-03-01,6896.0
3,1,BREST,FRA,48.383,-4.495,1807-04-01,6953.0
4,1,BREST,FRA,48.383,-4.495,1807-05-01,7043.0
2508,7,CUXHAVEN 2,DEU,53.867,8.717,1843-01-01,7093.0
2509,7,CUXHAVEN 2,DEU,53.867,8.717,1843-02-01,6688.0
2510,7,CUXHAVEN 2,DEU,53.867,8.717,1843-03-01,6493.0
2511,7,CUXHAVEN 2,DEU,53.867,8.717,1843-04-01,6723.0
2512,7,CUXHAVEN 2,DEU,53.867,8.717,1843-05-01,6533.0
4525,9,MAASSLUIS,NLD,51.918,4.25,1848-02-01,6880.0
4526,9,MAASSLUIS,NLD,51.918,4.25,1848-03-01,6700.0
4527,9,MAASSLUIS,NLD,51.918,4.25,1848-04-01,6775.0
4528,9,MAASSLUIS,NLD,51.918,4.25,1848-05-01,6580.0
4529,9,MAASSLUIS,NLD,51.918,4.25,1848-06-01,6685.0
6540,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-07-01,6957.0
6541,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-08-01,6944.0
6542,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-09-01,7084.0
6543,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-10-01,6898.0
6544,8,WISMAR 2,DEU,53.898999999999994,11.458,1848-11-01,6859.0
8538,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-07-01,6909.0
8539,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-08-01,6940.0
8540,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-09-01,6961.0
8541,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-10-01,6952.0
8542,10,SAN FRANCISCO,USA,37.806999999999995,-122.465,1854-11-01,6952.0
There is a lot of duplicated information, and I would like to end up with JSON like this:
[
{
"ID": 1,
"Location": "BREST",
"Latitude": 48.383,
"Longitude": -4.495,
"Country": "FRA",
"Tide-Data": {
"1807-02-01": 6931,
"1807-03-01": 6896,
"1807-04-01": 6953,
"1807-05-01": 7043
}
},
{
"ID": 5,
"Location": "HOLYHEAD",
"Latitude": 53.31399999999999,
"Longitude": -4.62,
"Country": "GBR", …
I ran into the following problem while writing Python: I have a Pandas dataframe containing words that need to be stemmed (using SnowballStemmer). I want to investigate classification results on stemmed versus unstemmed text, for which I will use a classifier. I use the following code for the stemming setup:
import pandas as pd
from nltk.stem.snowball import SnowballStemmer
# Use English stemmer.
stemmer = SnowballStemmer("english")
# Sentences to be stemmed.
data = ["programers program with programing languages", "my code is working so there must be a bug in the optimizer"]
# Create the Pandas dataFrame.
df = pd.DataFrame(data, columns = ['unstemmed'])
# Split the sentences to lists of words.
df['unstemmed'] = df['unstemmed'].str.split()
# Make sure we see the full column.
pd.set_option('display.max_colwidth', None)  # -1 is deprecated in pandas >= 1.0
# Print dataframe.
df
+----+--------------------------------------------------------------+
| | …
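Building on the setup above, the stemming step itself can be sketched by applying the stemmer to every word in each row's list; the `stemmed` column name is my own choice, and keeping the `unstemmed` column alongside it allows the stemmed/unstemmed comparison the question describes:

```python
import pandas as pd
from nltk.stem.snowball import SnowballStemmer

stemmer = SnowballStemmer("english")
data = ["programers program with programing languages",
        "my code is working so there must be a bug in the optimizer"]
df = pd.DataFrame(data, columns=["unstemmed"])
df["unstemmed"] = df["unstemmed"].str.split()

# Stem each word of every list; the original column is kept for comparison.
df["stemmed"] = df["unstemmed"].apply(
    lambda words: [stemmer.stem(w) for w in words])

print(df["stemmed"].iloc[0])
# → ['program', 'program', 'with', 'program', 'languag']
```

Both columns can then be joined back into strings (`" ".join(...)`) before being fed to a vectorizer and classifier.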