Problem description
I have an XML file that I am trying to parse. The XML was created in Excel using Save As XML.
Because the XML file was created from Microsoft Excel, it has this header:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
The data I am trying to extract is set up in blocks like this:
<Row ss:AutoFitHeight="0" ss:Height="30">
<Cell ss:StyleID="s22"/>
<Cell ss:StyleID="s24"><Data ss:Type="String">Jane Doe</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">JaneDoe</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">XYZ</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">(555) 555-5555</Data></Cell>
<Cell ss:StyleID="s22"/>
</Row>
Right now, my query looks like this:
;WITH XMLNAMESPACES ('urn:schemas-microsoft-com:office:spreadsheet' as ss)
select * from (
select X.value('local-name(.)[1]','varchar(max)') as Name,
X.value('.[1]','varchar(max)') as Value
from @allUsers.nodes('//*') as T(X)
) a
where Name = 'Data'
and it gives me these results:
Name  Value
----  --------------
Data  Jane Doe
Data  JaneDoe
Data  XYZ
Data  (555) 555-5555
What I would like to do is separate this into 4 columns, so I have something like:
Name      UserName  Address  Phone
--------  --------  -------  --------------
Jane Doe  JaneDoe   XYZ      (555) 555-5555
I tried selecting a column as
X.value('.[2]','varchar(max)') as UserName
but I just get NULL values for every row.
Is there any way to do this?
The general structure of the XML file looks like:
<Workbook>
<DocumentProperties>
</DocumentProperties>
<ExcelWorkbook>
</ExcelWorkbook>
<Styles>
<Style>
</Style>
</Styles>
<Worksheet>
<Table>
<Column.../>
<Column.../>
<Column.../>
<Row>
<Cell.../>
<Cell><Data>...</Data></Cell>
<Cell><Data>...</Data></Cell>
<Cell><Data>...</Data></Cell>
<Cell><Data>...</Data></Cell>
<Cell.../>
</Row>
...
</Table>
</Worksheet>
The information I am trying to get is in the <Data>...</Data> field.
Edit
From the way I phrased the question, it would seem that the header names are already programmed in, but they are actually read as rows in <Cell><Data>...</Data></Cell>
. I am also not sure what purpose the part serves
This is the beginning of the <Table> section:
<Table ss:ExpandedColumnCount="6" ss:ExpandedRowCount="2685" x:FullColumns="1"
x:FullRows="1">
<Column ss:AutoFitWidth="0" ss:Width="26.25"/>
<Column ss:AutoFitWidth="0" ss:Width="117" ss:Span="3"/>
<Column ss:Index="6" ss:AutoFitWidth="0" ss:Width="29.25"/>
<Row ss:AutoFitHeight="0" ss:Height="60"> --Contains the header names
<Cell ss:StyleID="s22"/>
<Cell ss:StyleID="s23"><Data ss:Type="String">Name</Data></Cell>
<Cell ss:StyleID="s23"><Data ss:Type="String">UserName</Data></Cell>
<Cell ss:StyleID="s23"><Data ss:Type="String">Address</Data></Cell>
<Cell ss:StyleID="s23"><Data ss:Type="String">Telephone Number</Data></Cell>
<Cell ss:StyleID="s22"/>
</Row>
<Row ss:AutoFitHeight="0" ss:Height="30"> --First record I would like to extract
<Cell ss:StyleID="s22"/>
<Cell ss:StyleID="s24"><Data ss:Type="String">John Smith</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">JSmith</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">ABC</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">(999) 999-9999</Data></Cell>
<Cell ss:StyleID="s22"/>
</Row>
Recommended answer
There were two very similar questions by the same user. The OP decided to delete one and combine them here, and asked me to copy my answer from there to this thread.
Be aware of the default xmlns namespace on <Workbook>, which must be declared as DEFAULT in the query:
I simplified your XML, but the idea should be OK...
DECLARE @allUsers XML=
'<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<Worksheet>
<Table>
<Row ss:AutoFitHeight="0" ss:Height="30">
<Cell ss:StyleID="s22"/>
<Cell ss:StyleID="s24"><Data ss:Type="String">Jane Doe</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">JaneDoe</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">XYZ</Data></Cell>
<Cell ss:StyleID="s24"><Data ss:Type="String">(555) 555-5555</Data></Cell>
<Cell ss:StyleID="s22"/>
</Row>
</Table>
</Worksheet>
</Workbook>';
;WITH XMLNAMESPACES ('urn:schemas-microsoft-com:office:spreadsheet' AS ss
                    ,DEFAULT 'urn:schemas-microsoft-com:office:spreadsheet')
-- One result row per <Row>; Cell[n] addresses the n-th <Cell> by position.
SELECT T.X.value('Cell[1]/Data[1]','varchar(max)') AS DontKnow1 -- empty cell in the sample, so NULL
      ,T.X.value('Cell[2]/Data[1]','varchar(max)') AS Name
      ,T.X.value('Cell[3]/Data[1]','varchar(max)') AS UserName
      ,T.X.value('Cell[4]/Data[1]','varchar(max)') AS Address   -- "Address" per the header row in the question
      ,T.X.value('Cell[5]/Data[1]','varchar(max)') AS Telephone
      ,T.X.value('Cell[6]/Data[1]','varchar(max)') AS DontKnow3 -- empty cell in the sample, so NULL
FROM @allUsers.nodes('/Workbook/Worksheet/Table/Row') AS T(X);
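The edit to the question notes that the first <Row> holds the header names, so a query over every Row would return that header as if it were a record. The following is a minimal sketch of one way to skip it, assuming the header is always the first <Row> of the table and that only cells 2 to 5 carry data. Both assumptions come from the sample in the question and are not verified against the full file. The sample XML is cut down for illustration.

```sql
-- Sketch only: a cut-down sample with one header row and one data row.
DECLARE @allUsers XML =
N'<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
            xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet>
    <Table>
      <Row>
        <Cell/>
        <Cell><Data ss:Type="String">Name</Data></Cell>
        <Cell><Data ss:Type="String">UserName</Data></Cell>
        <Cell><Data ss:Type="String">Address</Data></Cell>
        <Cell><Data ss:Type="String">Telephone Number</Data></Cell>
        <Cell/>
      </Row>
      <Row>
        <Cell/>
        <Cell><Data ss:Type="String">John Smith</Data></Cell>
        <Cell><Data ss:Type="String">JSmith</Data></Cell>
        <Cell><Data ss:Type="String">ABC</Data></Cell>
        <Cell><Data ss:Type="String">(999) 999-9999</Data></Cell>
        <Cell/>
      </Row>
    </Table>
  </Worksheet>
</Workbook>';

-- The DEFAULT namespace lets unprefixed names (Row, Cell, Data) match
-- the elements that inherit the namespace declared on <Workbook>.
;WITH XMLNAMESPACES (DEFAULT 'urn:schemas-microsoft-com:office:spreadsheet')
SELECT T.X.value('(Cell[2]/Data/text())[1]', 'nvarchar(200)') AS Name
      ,T.X.value('(Cell[3]/Data/text())[1]', 'nvarchar(200)') AS UserName
      ,T.X.value('(Cell[4]/Data/text())[1]', 'nvarchar(200)') AS Address
      ,T.X.value('(Cell[5]/Data/text())[1]', 'nvarchar(200)') AS Telephone
-- Row[position() > 1] skips the first <Row>, assumed to be the header.
FROM @allUsers.nodes('/Workbook/Worksheet/Table/Row[position() > 1]') AS T(X);
```

With this sample, the query should return a single row for John Smith. The nvarchar(200) length is an arbitrary choice for the sketch; widen it if the real data needs more.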