根据sql server中的相关值对数据进行分组

data grouping according to relevant values in sql server(根据sql server中的相关值对数据进行分组)
本文介绍了根据sql server中的相关值对数据进行分组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

数据以 15 分钟为间隔:

<前>时间价值2010-01-01 00:15 32010-01-01 00:30 22010-01-01 00:45 42010-01-01 01:00 52010-01-01 01:15 12010-01-01 01:30 32010-01-01 01:45 42010-01-01 02:00 122010-01-01 02:15 132010-01-01 02:30 122010-01-01 02:45 142010-01-01 03:00 152010-01-01 03:15 32010-01-01 03:30 22010-01-01 03:45 32010-01-01 04:00 5...............2010-01-02 00:00

通常会有 96 分.

根据数值,我们可能会注意到,00:15 到 01:45 之间的值彼此接近,02:00 到 03:00 之间的值彼此接近,而从 03:15 到04:00 他们靠得很近.

基于彼此接近"规则,我希望将数据分组"为 3 部分:

  • 00:15 到 01:45
  • 02:00 至 03:00
  • 03:15 到 04:00

请注意数据可能是随机的,根据上面定义的规则可以分为 3 个以上的部分,但最多不应超过 10 个部分.并且分组必须遵循时间顺序,例如,您不能将 00:15/02:30/04:45 归为 1 组,因为这 3 个点不连续.

请给出一些如何在 t-sql 中实现它的想法.

更新:该值可能是:

<前>时间价值2010-01-01 00:15 32010-01-01 00:30 22010-01-01 00:45 42010-01-01 01:00 52010-01-01 01:15 12010-01-01 01:30 32010-01-01 01:45 42010-01-01 02:00 122010-01-01 02:15 132010-01-01 02:30 4 --突然减少2010-01-01 02:45 142010-01-01 03:00 152010-01-01 03:15 32010-01-01 03:30 22010-01-01 03:45 32010-01-01 04:00 5...............2010-01-02 00:00

对于这种情况,我们不应该把 02:30 分开分组,因为我们希望分组大小必须至少为 3 点,我们会将那个点 (02:30) 放到前一组(从 02:00 到 03:00).

解决方案

由于你的问题变化太大,这里是新问题的新答案,我只包括代码部分.

声明@t table(time datetime, value int)声明@variation浮动设置@variation = 2不计较插入@t 值('2010-01-01 00:15',3)插入@t 值('2010-01-01 00:30',2)插入@t 值('2010-01-01 00:45',4)插入@t 值('2010-01-01 01:00',5)插入@t 值('2010-01-01 01:15',1)插入@t 值('2010-01-01 01:30',3)插入@t 值('2010-01-01 01:45',4)插入@t 值('2010-01-01 02:00',52)插入@t 值('2010-01-01 02:15',5)插入@t 值('2010-01-01 02:30',52)插入@t 值('2010-01-01 02:45',54)插入@t 值('2010-01-01 03:00',55)插入@t 值('2010-01-01 03:15',3)插入@t 值('2010-01-01 03:30',2)插入@t 值('2010-01-01 03:45',3)插入@t 值('2010-01-01 04:00',5)声明@result 表(最小时间日期时间,最大时间日期时间)A:删除@result; t 为(select *, rn = row_number() over(order by time), log(value) lv from @t where datediff(day, time, '2010-01-01') = 0),作为(从 t 中选择时间、lv、rn、0 grp,其中 rn = 1联合所有选择 t1.time, a.lv, t1.rn,存在的情况(从 t t2 中选择 1 其中 t1.rn 介于 rn + 1 和 rn + 3 之间,并且lv 之间 t1.lv - @variation 和 t1.lv +@variation) 然后 grp else grp + 1 end从 t t1 加入 a ont1.rn = a.rn +1)插入@result通过 grp 从组中选择 min(time), max(time)如果@@rowcount >10开始设置@variation=@variation + .5转到一个结尾从@result中选择*

结果:

mintime maxtime2010-01-01 00:15:00.000 2010-01-01 01:45:00.0002010-01-01 02:00:00.000 2010-01-01 03:00:00.0002010-01-01 03:15:00.000 2010-01-01 04:00:00.000

the data is in 15 minute interval:

Time               Value
2010-01-01 00:15   3
2010-01-01 00:30   2
2010-01-01 00:45   4
2010-01-01 01:00   5
2010-01-01 01:15   1
2010-01-01 01:30   3
2010-01-01 01:45   4
2010-01-01 02:00   12
2010-01-01 02:15   13
2010-01-01 02:30   12
2010-01-01 02:45   14
2010-01-01 03:00   15
2010-01-01 03:15   3
2010-01-01 03:30   2
2010-01-01 03:45   3
2010-01-01 04:00   5
..........
..........
..........
2010-01-02 00:00

Typically there will be 96 points.

According to the values, we may notice that the values from 00:15 to 01:45 are close to each other, and from 02:00 to 03:00 they are close to each other, and from 03:15 to 04:00 they are close to each other.

Based on the "close to each other" rule, I want the data to be "grouped" into 3 parts:

  • 00:15 to 01:45
  • 02:00 to 03:00
  • 03:15 to 04:00

Please consider that the data could be random, and could be grouped into more than 3 parts according to the rule defined above, but maximum should not be more than 10 parts. And the grouping must honor the time sequence, for example, you cannot just put 00:15/02:30/04:45 into 1 group because these 3 points are NOT consecutive.

Please give some thoughts how to implement it in t-sql.

updated: The value could be:

Time               Value
2010-01-01 00:15   3
2010-01-01 00:30   2
2010-01-01 00:45   4
2010-01-01 01:00   5
2010-01-01 01:15   1
2010-01-01 01:30   3
2010-01-01 01:45   4
2010-01-01 02:00   12
2010-01-01 02:15   13
2010-01-01 02:30   4  --suddenly decreased
2010-01-01 02:45   14
2010-01-01 03:00   15
2010-01-01 03:15   3
2010-01-01 03:30   2
2010-01-01 03:45   3
2010-01-01 04:00   5
..........
..........
..........
2010-01-02 00:00

for these kinds of situation, we should not group 02:30 separately, because we want the group size has to be at least 3 points, and we will put that point (02:30) to the previous group (from 02:00 to 03:00).

解决方案

Since your question changed so much, here is a new answer to the new question, I only included the code part.

declare @t table(time datetime, value int)
declare @variation float
set @variation = 2
set nocount on

insert @t values('2010-01-01 00:15',3)
insert @t values('2010-01-01 00:30',2)
insert @t values('2010-01-01 00:45',4)
insert @t values('2010-01-01 01:00',5)
insert @t values('2010-01-01 01:15',1)
insert @t values('2010-01-01 01:30',3)
insert @t values('2010-01-01 01:45',4)
insert @t values('2010-01-01 02:00',52)
insert @t values('2010-01-01 02:15',5)
insert @t values('2010-01-01 02:30',52)
insert @t values('2010-01-01 02:45',54)
insert @t values('2010-01-01 03:00',55)
insert @t values('2010-01-01 03:15',3)
insert @t values('2010-01-01 03:30',2)
insert @t values('2010-01-01 03:45',3)
insert @t values('2010-01-01 04:00',5)


declare @result table(mintime datetime, maxtime datetime)
a:
delete @result

;with t as
(
select *, rn = row_number() over(order by time), log(value) lv from @t where datediff(day, time, '2010-01-01') = 0
), a as
(
select time, lv, rn, 0 grp from t where rn = 1
union all
select t1.time, a.lv, t1.rn, 
case when exists (select 1 from t t2 where t1.rn between rn + 1 and rn + 3 and 
lv between t1.lv - @variation and t1.lv +@variation) then grp else grp + 1 end
from t t1 join a on 
t1.rn = a.rn +1
)
insert @result
select min(time), max(time) from a group by grp

if @@rowcount > 10 
begin 
    set @variation=@variation + .5 
    goto a 
end

select * from @result

Result:

mintime                     maxtime
2010-01-01 00:15:00.000     2010-01-01 01:45:00.000
2010-01-01 02:00:00.000     2010-01-01 03:00:00.000
2010-01-01 03:15:00.000     2010-01-01 04:00:00.000

这篇关于根据sql server中的相关值对数据进行分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

Execute complex raw SQL query in EF6(在EF6中执行复杂的原始SQL查询)
Hibernate reactive No Vert.x context active in aws rds(AWS RDS中的休眠反应性非Vert.x上下文处于活动状态)
Bulk insert with mysql2 and NodeJs throws 500(使用mysql2和NodeJS的大容量插入抛出500)
Flask + PyMySQL giving error no attribute #39;settimeout#39;(FlASK+PyMySQL给出错误,没有属性#39;setTimeout#39;)
auto_increment column for a group of rows?(一组行的AUTO_INCREMENT列?)
Sort by ID DESC(按ID代码排序)