Generating an XML document in R

GCG*_*CGM 8 xml r

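In the project I'm working on, I need to create XML documents automatically from user input. Modifying parts of an existing xml document with user input is no problem for me, but I'm new to creating xml documents from scratch in R.

I'd like to know whether a document like the one below can be generated in R with the XML or xml2 packages. So far I have explored the newXMLDoc, xml_new_document, and xml_new_root functions, but I'm not familiar with all the syntax needed to create such an xml file (once finished, it should be saved to a local path).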

<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <product>
      <refNo>1</refNo>
      <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
      <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
    </product>
    <product>
      <refNo>2</refNo>
      <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
      <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
    </product>
  </products>
  <views/>
</session>

Par*_*ait 7

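Consider building the XML with a DOM approach using one of the libraries you mention, such as XML, so that no string concatenation or interpolation is needed: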

library(XML)

# DATA
df <- data.frame(refNo = c(1, 2),
                 uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip', 
                         'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
                 plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn', 
                            'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn')
                )

# CREATE XML FILE
doc = newXMLDoc()
root = newXMLNode("session", doc = doc)

# WRITE XML NODES AND DATA
mvNode = newXMLNode("modelVersion", "1.0.0", parent = root)

for (i in 1:nrow(df)){
  prodNode = newXMLNode("products", parent = root)

  # APPEND TO PRODUCT NODE
  newXMLNode("refNo", df$refNo[i], parent = prodNode)
  newXMLNode("uri", df$uri[i], parent = prodNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = prodNode)
}

vwNode = newXMLNode("views", parent = root)

# OUTPUT XML CONTENT TO CONSOLE
print(doc)

# OUTPUT XML CONTENT TO FILE
saveXML(doc, file="Output.xml")

Output

<?xml version="1.0"?>
<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>
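Note that the output above repeats <products> for every row, whereas the document in the question has a single <products> element wrapping one <product> per row. Here is a minimal sketch of that nested layout, using the same XML package functions as above. The variable name productsNode and the file name Output_nested.xml are made up for illustration, and stringsAsFactors = FALSE is added only so the text columns stay character on older R versions:

```r
library(XML)

# Same example data as above
df <- data.frame(
  refNo  = c(1, 2),
  uri    = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
             "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
  plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
             "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"),
  stringsAsFactors = FALSE
)

doc  <- newXMLDoc()
root <- newXMLNode("session", doc = doc)
newXMLNode("modelVersion", "1.0.0", parent = root)

# Create the single <products> container once, outside the loop
productsNode <- newXMLNode("products", parent = root)

for (i in seq_len(nrow(df))) {
  # One <product> element per data frame row
  prodNode <- newXMLNode("product", parent = productsNode)
  newXMLNode("refNo", df$refNo[i], parent = prodNode)
  newXMLNode("uri", df$uri[i], parent = prodNode)
  newXMLNode("productReaderPlugin", df$plugin[i], parent = prodNode)
}

newXMLNode("views", parent = root)

# Hypothetical output file name
saveXML(doc, file = "Output_nested.xml")
```

The only structural change is that the container node is created before the loop and each row is attached to it, rather than to the root.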


jos*_*rrà 7

The xml2 package (CRAN) offers an alternative solution from the Hadleyverse.

library(xml2)
library(tidyverse)

df <- data.frame(number = c(1, 2),
  uri = c('S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip', 
    'S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip'),
  plugin = c('class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn', 
    'class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn'),
  stringsAsFactors = FALSE)

We first create the xml document containing the overall xml structure:

doc <- xml_new_root("session") 
xml_add_child(doc, "modelVersion", "1.0.0")  
xml_add_child(doc, "products") 
xml_add_child(doc, "products") 
xml_add_child(doc, "views")
doc
#> {xml_document}
#> <session>
#> [1] <modelVersion>1.0.0</modelVersion>
#> [2] <products/>
#> [3] <products/>
#> [4] <views/>

We now add the components inside each products node. Because xml_add_child is vectorised, no loop is needed.

products_nodes <- xml_find_all(doc, "//products")
xml_add_child(products_nodes, "refNo", df$number)
xml_add_child(products_nodes, "uri", df$uri)
xml_add_child(products_nodes, "productReaderPlugin", df$plugin)

Finally, we save the xml tree to a file and display its contents:

write_xml(doc, file = "output.xml", options =c("format", "no_declaration"))
cat(paste0(readLines("output.xml"), collapse = "\n"))

Here is the content of the "output.xml" file:

<session>
  <modelVersion>1.0.0</modelVersion>
  <products>
    <refNo>1</refNo>
    <uri>S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip</uri>
    <productReaderPlugin>class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn</productReaderPlugin>
  </products>
  <products>
    <refNo>2</refNo>
    <uri>S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip</uri>
    <productReaderPlugin>class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn</productReaderPlugin>
  </products>
  <views/>
</session>

Created on 2021-05-06 by the reprex package (v0.3.0)
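The result above uses one <products> element per row, while the question asks for a single <products> element holding several <product> children. A rough sketch of that variant follows. It assumes that xml_add_child() invisibly returns the node it has just added (which is how I understand it to behave), and the file name output_nested.xml is only a placeholder:

```r
library(xml2)

# Same example data as above
df <- data.frame(
  number = c(1, 2),
  uri    = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
             "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
  plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
             "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"),
  stringsAsFactors = FALSE
)

doc <- xml_new_root("session")
xml_add_child(doc, "modelVersion", "1.0.0")
# Keep a handle on the single <products> container
products <- xml_add_child(doc, "products")
xml_add_child(doc, "views")

for (i in seq_len(nrow(df))) {
  # One <product> per row, appended to the shared container
  product <- xml_add_child(products, "product")
  xml_add_child(product, "refNo", as.character(df$number[i]))
  xml_add_child(product, "uri", df$uri[i])
  xml_add_child(product, "productReaderPlugin", df$plugin[i])
}

# Hypothetical output file name
write_xml(doc, file = "output_nested.xml", options = "format")
```

A loop is used here because each <product> has to be created first and then filled. The vectorised calls shown earlier work when the parent nodes already exist.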


Ami*_*hli 1

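This can probably be solved quite easily without any of those packages... If your structure is fairly static, I would use https://github.com/tidyverse/glue and then just cat() the output to a file. Something like this: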



library(magrittr)  # provides the %>% pipe used below

## I guess your data looks like this?
df <- data.frame(number = c(1,2),
                 uri = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
                         "S2A_MSIL1C_20190823T061631_N0208_R034_T42TXS_20190823T081730.zip"),
                 plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
                            "class org.esa.s2tbx.dataio.s2.ortho.plugins.Sentinel2L1CProduct_Multi_UTM42N_ReaderPlugIn"))
df

## build a function that outputs every block in xml format
thingieBuilder <- function(number, uri, plugin){
  glue::glue("<product>
           <refNo>{number}</refNo>
           <uri>{uri}</uri>
           <productReaderPlugin>{plugin}</productReaderPlugin>
           </product>")
}

## now run that for each entry in your df, unlist it, and collapse it into one string separated by newlines
xmlProducts <- df %>% purrr::pmap(thingieBuilder) %>% unlist %>% paste(collapse = "\n")

## Now stick on top and bottom, and cat it to a file!
glue::glue("<session>
  <modelVersion>1.0.0</modelVersion>
  <products>\n",
           xmlProducts,
           "\n</products>
             <views/>
           </session>") %>% 
  cat(file = "boom.xml")
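One caveat with building XML from string templates is that nothing escapes special characters, so a value containing & or < would produce a malformed file. Below is a small sketch of one way to guard against that. It swaps glue for base R paste0() to stay dependency-free. escape_xml and build_product are hypothetical helper names, session_escaped.xml is a placeholder file name, and the "a & b.zip" value is invented purely to exercise the escaping:

```r
# Hypothetical helper: escape the characters that are special in XML text content
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)  # must run first, before other entities are added
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

# Hypothetical helper: build one <product> block with escaped values
build_product <- function(number, uri, plugin) {
  paste0(
    "    <product>\n",
    "      <refNo>", escape_xml(number), "</refNo>\n",
    "      <uri>", escape_xml(uri), "</uri>\n",
    "      <productReaderPlugin>", escape_xml(plugin), "</productReaderPlugin>\n",
    "    </product>"
  )
}

# Invented example data; the second uri contains a character that needs escaping
df <- data.frame(
  number = c(1, 2),
  uri    = c("S1A_IW_GRDH_1SDV_20190818T175529_20190818T175554_028627_033D25_22ED.zip",
             "a & b.zip"),
  plugin = c("class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn",
             "class org.esa.s1tbx.io.sentinel1.Sentinel1ProductReaderPlugIn"),
  stringsAsFactors = FALSE
)

# One block per row
products <- mapply(build_product, df$number, df$uri, df$plugin, USE.NAMES = FALSE)

xml_lines <- c(
  "<session>",
  "  <modelVersion>1.0.0</modelVersion>",
  "  <products>",
  products,
  "  </products>",
  "  <views/>",
  "</session>"
)

writeLines(xml_lines, "session_escaped.xml")
```

If the values can contain arbitrary user input, the DOM-based answers above are probably the safer choice, since those packages handle escaping for you.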