SQL Server: Replace invalid XML characters from a VARCHAR(MAX) field

Question

I have a VARCHAR(MAX) field which is being interfaced to an external system in XML format. The following errors were thrown by the interface:

mywebsite.com-2015-0202.xml:413005: parser error : xmlParseCharRef: invalid xmlChar value 29
ne and Luke's family in Santa Fe. You know you have a standing invitation,
                                                                               ^
mywebsite.com-2015-0202.xml:455971: parser error : xmlParseCharRef: invalid xmlChar value 25
The apprentice nodded, because frankly, who hadnt? That diseases like chol
                                                      ^
mywebsite.com.com-2015-0202.xml:456077: parser error : xmlParseCharRef: invalid xmlChar value 28
bon mot; a sentimental love of nature and animals; the proverbial British 
                                                                               ^
mywebsite.com-2015-0202.xml:472073: parser error : xmlParseCharRef: invalid xmlChar value 20
"Andyou want that?"
          ^
mywebsite.com-2015-0202.xml:492912: parser error : xmlParseCharRef: invalid xmlChar value 25
She couldnt live like this anymore.

We found that the following characters are invalid:

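[character list not recoverable: the entries were rendered as raw control characters; the parser errors above report values 20, 25, 28 and 29]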

I am trying to clean this data, and I found a SQL function to clean these characters here. However, the function takes NVARCHAR(4000) as its input parameter, so I changed it to use VARCHAR(MAX) instead.

Could anyone please advise whether changing NVARCHAR(4000) to VARCHAR(MAX) would produce wrong results? Sorry, I am not able to test this interface locally, so I thought I would ask for opinions and advice.

Original function:

CREATE FUNCTION fnStripLowAscii (@InputString nvarchar(4000))
RETURNS nvarchar(4000)
AS
BEGIN
IF @InputString IS NOT NULL
BEGIN
  DECLARE @Counter int, @TestString nvarchar(40)

  SET @TestString = '%[' + NCHAR(0) + NCHAR(1) + NCHAR(2) + NCHAR(3) + NCHAR(4) + NCHAR(5) + NCHAR(6) + NCHAR(7) + NCHAR(8) + NCHAR(11) + NCHAR(12) + NCHAR(14) + NCHAR(15) + NCHAR(16) + NCHAR(17) + NCHAR(18) + NCHAR(19) + NCHAR(20) + NCHAR(21) + NCHAR(22) + NCHAR(23) + NCHAR(24) + NCHAR(25) + NCHAR(26) + NCHAR(27) + NCHAR(28) + NCHAR(29) + NCHAR(30) + NCHAR(31) + ']%'

  SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)

  WHILE @Counter <> 0
  BEGIN
    SELECT @InputString = STUFF(@InputString, @Counter, 1, NCHAR(164))
    SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
  END
END
RETURN(@InputString)
END

Modified version:

CREATE FUNCTION [dbo].RemoveInvalidXMLCharacters (@InputString VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    IF @InputString IS NOT NULL
    BEGIN
      DECLARE @Counter INT, @TestString NVARCHAR(40)

      SET @TestString = '%[' + NCHAR(0) + NCHAR(1) + NCHAR(2) + NCHAR(3) + NCHAR(4) + NCHAR(5) + NCHAR(6) + NCHAR(7) + NCHAR(8) + NCHAR(11) + NCHAR(12) + NCHAR(14) + NCHAR(15) + NCHAR(16) + NCHAR(17) + NCHAR(18) + NCHAR(19) + NCHAR(20) + NCHAR(21) + NCHAR(22) + NCHAR(23) + NCHAR(24) + NCHAR(25) + NCHAR(26) + NCHAR(27) + NCHAR(28) + NCHAR(29) + NCHAR(30) + NCHAR(31) + ']%'

      SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)

      WHILE @Counter <> 0
      BEGIN
        SELECT @InputString = STUFF(@InputString, @Counter, 1, ' ')
        SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
      END
    END
    RETURN(@InputString)
END
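
Since the external interface cannot be tested locally, the function itself can still be checked in isolation. The following is a minimal sketch that assumes dbo.RemoveInvalidXMLCharacters has been created exactly as shown above; the sample string is made up for illustration, with CHAR(25) standing in for one of the values reported by the parser.

```sql
-- Hypothetical sample value: CHAR(25) is one of the code points from the parser errors.
DECLARE @Sample VARCHAR(MAX) = 'She couldn' + CHAR(25) + 't live like this anymore.';

-- Show the value before and after cleaning; the control character is replaced
-- by a single space, so the cleaned string keeps the same length.
SELECT
    @Sample                                   AS Original,
    dbo.RemoveInvalidXMLCharacters(@Sample)   AS Cleaned,
    LEN(@Sample)                              AS OriginalLength,
    LEN(dbo.RemoveInvalidXMLCharacters(@Sample)) AS CleanedLength;
```

Because STUFF replaces one character with one character, matching lengths are a quick sign that nothing other than the targeted positions was touched.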

Recommended answer

It is safe to use VARCHAR(MAX), as my data column is a VARCHAR(MAX) field. Also, passing a VARCHAR(MAX) field to a SQL function that accepts an NVARCHAR(MAX) parameter would add the overhead of converting VARCHAR(MAX) to NVARCHAR(MAX).
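
One detail worth noting: the modified function still declares @TestString as NVARCHAR(40) and builds it with NCHAR(), so the VARCHAR(MAX) input is likely to be implicitly converted to NVARCHAR inside each PATINDEX call because of data type precedence. If that conversion is a concern, a variant that keeps everything in VARCHAR could look like the sketch below. The function name is hypothetical, the pattern covers the same code points as the question's (0-8, 11, 12, 14-31), and the behavior of CHAR(0) inside the pattern should be verified on your own server before relying on it.

```sql
-- Sketch only: hypothetical function name; not run against the external
-- interface described in the question.
CREATE FUNCTION dbo.RemoveInvalidXMLCharactersVarchar (@InputString VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    IF @InputString IS NOT NULL
    BEGIN
        DECLARE @Counter INT, @Code INT, @TestString VARCHAR(40);

        -- Build the pattern with CHAR() so it stays VARCHAR.
        -- Tab (9), line feed (10) and carriage return (13) are valid in XML
        -- and are skipped, as in the question's pattern.
        SET @TestString = '%[';
        SET @Code = 0;
        WHILE @Code <= 31
        BEGIN
            IF @Code NOT IN (9, 10, 13)
                SET @TestString = @TestString + CHAR(@Code);
            SET @Code = @Code + 1;
        END;
        SET @TestString = @TestString + ']%';

        -- Replace each matching character with a space until none remain.
        SET @Counter = PATINDEX(@TestString, @InputString COLLATE Latin1_General_BIN);
        WHILE @Counter <> 0
        BEGIN
            SET @InputString = STUFF(@InputString, @Counter, 1, ' ');
            SET @Counter = PATINDEX(@TestString, @InputString COLLATE Latin1_General_BIN);
        END;
    END;
    RETURN @InputString;
END
```

The loop-and-STUFF logic is unchanged from the question; only the pattern's data type differs.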

Thank you very much @RhysJones, @Damien_The_Unbeliever for your comments.
