Part One
在字母矩阵中,寻找出”XMAS”特定字符串出现的次数,可以是横、纵或者斜线。
..X... .SAMX. .A..A. XMAS.S .X....
横竖撇捺
将输入拆分到具体的(row, col)坐标后,再重新按照横竖撇捺四个方向进行组合
with origin as (
SELECT row_number() over () as rn, line
FROM lance_input
), splits AS (
SELECT rn :: INTEGER as _row, x.idx :: INTEGER as _col, x.pos
FROM origin, regexp_split_to_table(line, '') with ordinality as x(pos, idx)
), top_down as (
SELECT string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _col
), left_right as (
SELECT string_agg(pos, '' order by _col) as line
FROM splits
GROUP by _row
), slash as (
SELECT string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row + _col
), backslash as (
SELECT string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row - _col
), allstr as (
SELECT line from top_down
union all
SELECT line from left_right
union all
SELECT line from slash
union all
SELECT line from backslash
)
正则匹配
从字符串中提取出XMAS,如果是反方向的话,则是SAMX,那么用正则表达式来表达的话就是
regexp_matches(line, 'XMAS|SAMX', 'g')
但是,使用这样的方式,查询的结果却比正确的结果要少。原因就是XMAS和SAMX连续出现的时候,只会匹配到一处。只有非连续出现,才能匹配到两处。
postgres=# select regexp_matches('MAMXMASAMX', 'XMAS|SAMX', 'g');
regexp_matches
----------------
{XMAS}
(1 row)
postgres=# select regexp_matches('MAMXMASMMSAMX', 'XMAS|SAMX', 'g');
regexp_matches
----------------
{XMAS}
{SAMX}
(2 rows)
因此,还是两次分别匹配拿到结果,如下所示:
SELECT regexp_matches(line, 'XMAS', 'g')
FROM allstr
UNION ALL
SELECT regexp_matches(reverse(line), 'XMAS', 'g')
FROM allstr
Part Two
需要从字母矩阵中,找出X形状的两个”MAS”组合,其中的A作为交叉点。
M.S .A. M.S
交叉点的坐标
这次不需要横竖,仅需要斜线的记录,这次将其分组记录下来
with origin as (
SELECT row_number() over () as rn, line
FROM lance_input
), splits AS (
SELECT rn :: INTEGER as _row, x.idx :: INTEGER as _col, x.pos
FROM origin, regexp_split_to_table(line, '') with ordinality as x(pos, idx)
), slash as (
SELECT _row + _col as _group, string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row + _col
), backslash as (
SELECT _row - _col as _group, string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row - _col
)
这次难点是需要定位这个A的位置,且”MAS”可能出现多次,可以通过如下方式,找到对应的坐标。除了最后一条记录,其他的都是有效的位置。
postgres=# select sum(length(pos) + 3) over (order by idx) - 1 from regexp_split_to_table('aaaabbbaaaabbbaaa', 'bbb') with ordinality as x(pos, idx) ;
?column?
----------
6
13
19
(3 rows)
通过group和position信息,从而推断出这个A在全局的横纵坐标。
slash_mas_pos as (
SELECT _group, case when _group > 140 + 1 then _group - 140 - 1 + a_pos else a_pos end as a_row
FROM (
SELECT _group, length(line) as total, sum(length(pos) + 3) over (partition by _group order by idx) - 1 as a_pos
FROM slash, regexp_split_to_table(line, 'MAS') with ordinality as x(pos, idx)
) t
WHERE a_pos < total
)
完整SQL
slash的坐标和backslash的坐标取交集,即可得出总共可能的位置
with origin as (
SELECT row_number() over () as rn, line
FROM lance_input
), splits AS (
SELECT rn :: INTEGER as _row, x.idx :: INTEGER as _col, x.pos
FROM origin, regexp_split_to_table(line, '') with ordinality as x(pos, idx)
), slash as (
SELECT _row + _col as _group, string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row + _col
), backslash as (
SELECT _row - _col as _group, string_agg(pos, '' order by _row) as line
FROM splits
GROUP by _row - _col
), slash_mas_pos as (
SELECT _group, case when _group > 140 + 1 then _group - 140 - 1 + a_pos else a_pos end as a_row
FROM (
SELECT _group, length(line) as total, sum(length(pos) + 3) over (partition by _group order by idx) - 1 as a_pos
FROM slash, regexp_split_to_table(line, 'MAS') with ordinality as x(pos, idx)
) t
WHERE a_pos < total
), slash_sam_pos as (
SELECT _group, case when _group > 140 + 1 then _group - 140 - 1 + a_pos else a_pos end as a_row
FROM (
SELECT _group, length(line) as total, sum(length(pos) + 3) over (partition by _group order by idx) - 1 as a_pos
FROM slash, regexp_split_to_table(line, 'SAM') with ordinality as x(pos, idx)
) t
WHERE a_pos < total
), backslash_mas_pos as (
SELECT _group, case when _group > 0 then a_pos + _group else a_pos end as a_row
FROM (
SELECT _group, length(line) as total, sum(length(pos) + 3) over (partition by _group order by idx) - 1 as a_pos
FROM backslash, regexp_split_to_table(line, 'MAS') with ordinality as x(pos, idx)
) t
WHERE a_pos < total
), backslash_sam_pos as (
SELECT _group, case when _group > 0 then a_pos + _group else a_pos end as a_row
FROM (
SELECT _group, length(line) as total, sum(length(pos) + 3) over (partition by _group order by idx) - 1 as a_pos
FROM backslash, regexp_split_to_table(line, 'SAM') with ordinality as x(pos, idx)
) t
WHERE a_pos < total
)
SELECT a.*
FROM (
SELECT a_row, _group - a_row as a_col
FROM slash_mas_pos
UNION ALL
SELECT a_row, _group - a_row as a_col
FROM slash_sam_pos
) a
JOIN (
SELECT a_row, a_row - _group as a_col
FROM backslash_mas_pos
UNION ALL
SELECT a_row, a_row - _group as a_col
FROM backslash_sam_pos
) b
ON a.a_row = b.a_row
AND a.a_col = b.a_col;