Number of differences in a column


Question

I would like to retrieve a column showing how many letters differ in each row. For instance:

If one row has the value "test" and another row has the value "testing ", then the difference between "test" and "testing " is 4 letters, so the column would hold the value 4.

I have thought about it, but I don't know where to begin.

id    ||  value     || category   || differences
--------------------------------------------------
 1    ||  test      || 1          || 4
 2    ||  testing   || 1          || null
11    ||  candy     || 2          || -3
12    ||  ca        || 2          || null

In this scenario and context, there is no difference between "Test" and "rest".

Recommended answer

I think what you are looking for is a measure of edit difference rather than a simple count of prefix similarity, and there are a few common algorithms for that. Levenshtein's method is one I have used before, and I have seen it implemented as TSQL functions. The answers to this SO question suggest a couple of TSQL implementations that you might be able to take and use as-is.

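(That said, take the time to test the code and understand the method rather than just copying and using it, so that you can make sense of the output if something goes wrong. Otherwise you may take on technical debt that you will have to pay back later.)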
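To make the method concrete, here is a minimal sketch of Levenshtein distance in Python. It is not one of the TSQL implementations referred to above; the function name `levenshtein` is chosen for illustration, and the comparison is case-sensitive, which is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b (case-sensitive)."""
    # previous[j] = distance between the prefix of a handled so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]
```

With this definition, `levenshtein("test", "testing")` is 3, and `levenshtein("test", "testing ")` (with a trailing space) is 4, so whitespace in the stored values affects the result.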

Exactly which distance calculation you want depends on how you want to count certain things. For instance, do you count a substitution as one change, or as a delete plus an insert? And if your strings are long enough for it to matter, do you want to consider substring moves, and so forth?
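As an illustration of how the counting rule changes the result, the sketch below computes a distance that allows only insertions and deletions, so a substitution counts as two changes. It is written in Python rather than TSQL, and the name `indel_distance` is hypothetical. It relies on the identity that this distance equals the combined length of both strings minus twice the length of their longest common subsequence.

```python
def indel_distance(a: str, b: str) -> int:
    """Edit distance using insertions and deletions only, so a
    substitution costs 2 (one delete plus one insert)."""
    # lcs[j] = length of the longest common subsequence of the
    # part of a processed so far and b[:j]
    lcs = [0] * (len(b) + 1)
    for ch_a in a:
        prev_diag = 0  # previous row's value at column j - 1
        for j, ch_b in enumerate(b, start=1):
            saved = lcs[j]  # previous row's value at column j
            if ch_a == ch_b:
                lcs[j] = prev_diag + 1
            else:
                lcs[j] = max(lcs[j], lcs[j - 1])
            prev_diag = saved
    return len(a) + len(b) - 2 * lcs[-1]
```

For "test" and "rest" this gives 2, whereas a scheme that counts a substitution as a single change would give 1. Neither sketch accounts for substring moves.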
