using InMemoryDatasets

classA = Dataset(id = ["id1", "id2", "id3", "id4", "id5"],
                 mark = [50, 69.5, 45.5, 88.0, 98.5]);
grades = Dataset(mark = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5],
                 grade = ["F", "P", "C", "B", "A-", "A", "A+"]);
We can use the InMemoryDatasets package to perform a closejoin.
How can we do the same thing with the DataFrames package?
closejoin(classA, grades, on = :mark)
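To make clear what the call above is expected to return, here is a minimal sketch in plain Julia (Base only, no packages). It assumes that `closejoin` defaults to a backward match, meaning each mark is paired with the largest cutoff that is less than or equal to it. The names `marks`, `cutoffs`, `letters`, and `matched` are illustrative and do not come from either package.

```julia
# Same data as classA.mark and grades, held in plain vectors.
marks   = [50, 69.5, 45.5, 88.0, 98.5]
cutoffs = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5]   # must be sorted ascending
letters = ["F", "P", "C", "B", "A-", "A", "A+"]

# searchsortedlast returns the index of the last cutoff <= m, or 0 if there is none.
idx = [searchsortedlast(cutoffs, m) for m in marks]

# Assumption: a mark below every cutoff gets `missing`, mirroring a left join.
matched = [i == 0 ? missing : letters[i] for i in idx]

println(matched)
```

Any DataFrames.jl or R solution should reproduce this pairing of marks to grades.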
closejoin(classA, grades, on = :mark, direction=:forward, border=:nearest)
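The forward variant can be sketched the same way in plain Julia. This rests on two assumptions about `closejoin` that should be checked against the InMemoryDatasets documentation: that `direction = :forward` pairs each mark with the smallest cutoff greater than or equal to it, and that `border = :nearest` falls back to the nearest available cutoff when a mark lies beyond the last one. All variable names are illustrative.

```julia
marks   = [50, 69.5, 45.5, 88.0, 98.5]
cutoffs = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5]   # must be sorted ascending
letters = ["F", "P", "C", "B", "A-", "A", "A+"]

# searchsortedfirst returns the index of the first cutoff >= m,
# or length(cutoffs) + 1 if every cutoff is smaller than m.
# Clamping with min() emulates the assumed "nearest" border behaviour.
idx = [min(searchsortedfirst(cutoffs, m), length(cutoffs)) for m in marks]

matched_forward = letters[idx]

println(matched_forward)
```

Only the mark 98.5 exceeds every cutoff, so it is the only row where the border rule matters.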
And how can this be done in R?
ds = Dataset([[1, 1, 1, 2, 2, 2],
              ["foo", "bar", "monty", "foo", "bar", "monty"],
              ["a", "b", "c", "d", "e", "f"],
              [1, 2, 3, 4, 5, 6]], [:g, :key, :foo, :bar])
In InMemoryDatasets, the transpose function can be passed a Tuple of column selectors.
transpose(groupby(ds, :g), (:foo, :bar), id = :key)
Result:
Row  g         foo       bar       monty     foo_1     bar_1     monty_1
     identity  identity  identity  identity  identity  identity  identity
     Int64?    String?   String?   String?   Int64?    Int64?    Int64?
1    1         a         b         c         1         2         3
2    2         d         e         f         4         5         6
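The reshaping behind this result can be spelled out by hand. The following is a rough sketch in plain Julia (Base only), not the InMemoryDatasets implementation: it produces one output row per value of `g`, and for every `key` it stores the `foo` value under the key name and the `bar` value under the key name with a `_1` suffix. The nested-dictionary layout and the name `wide` are illustrative choices.

```julia
# Same columns as ds, held in plain vectors.
g   = [1, 1, 1, 2, 2, 2]
key = ["foo", "bar", "monty", "foo", "bar", "monty"]
foo = ["a", "b", "c", "d", "e", "f"]
bar = [1, 2, 3, 4, 5, 6]

# group value => (output column name => cell value)
wide = Dict{Int, Dict{String, Any}}()
for i in eachindex(g)
    row = get!(wide, g[i], Dict{String, Any}())
    row[key[i]] = foo[i]           # columns produced by the first selector, :foo
    row[key[i] * "_1"] = bar[i]    # columns produced by the second selector, :bar
end

for grp in sort(collect(keys(wide)))
    println(grp, " => ", wide[grp])
end
```

An equivalent DataFrames.jl solution would need to transpose both value columns within each group and keep the two sets of output names distinct.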
Question:
How can I do this in DataFrames.jl? …