0. Overview

  • Title : Spatio-Temporal Knowledge Transfer for Urban Crowd Flow Prediction via Deep Attentive Adaptation Networks
  • Authors : Senzhang Wang, Hao Miao, Jiyue Li, Jiannong Cao
  • Year : 2021
  • Publish : TITS (IEEE Transactions on Intelligent Transportation Systems)

1. Introduction

1) Why do we need it?

  • Deep learning is now used for a wide range of spatio-temporal prediction tasks
    • ST-ResNet(2017, Cit. 1606) : forecasts crowd inflow & outflow in each region of a city
    • STDN(2018, Cit. 521) : road network based traffic prediction
    • predicts passenger pickup demand (Attention+ConvLSTM)
    • DeepTransport : predicts the traffic data within a transport network (CNN+RNN)
  • More recently, transfer learning has been used to tackle the problem above
    • RegionTrans(2019, Cit. 88) : matches similar regions between the source and target city → this matching needs additional "other service" data (data viewpoint = region level)
    • MetaST(2019, Cit. 166) : extracts long-term trends from multiple cities and applies them to the target city → no unified model does this automatically
  • This paper: shift the data viewpoint to the distribution level and build a unified framework.
  • Urban Crowd Flow Prediction : a major topic in urban/transportation research. Statistical methods such as ARIMA were traditionally the main tools, but DL methods are now widely used
    • DNN, ST-ResNet, SeqST-GAN, ConvLSTM, MT-ASTN, DCRNN, RegionTrans, MetaST, etc.
  • Transfer Learning : a methodology proposed to address the scarce labeled data problem in ML
    • TCA, TLDA, JAN, JMMD, etc.
  • DAN(2015, Cit. 4413) : generalizes CNNs to the domain adaptation task; a big success in computer vision
    • Neural nets capture general features well and perform well, but applying a CNN directly to a target domain with little labeled data causes many problems
    • Indeed, Yosinski et al.(2014, Cit. 8740) found that Conv 1-3 transfer fine, Conv 4-5 start to degrade, and FC 6-8 break down completely
    • The DAN authors keep Conv 1-3 as they are (freeze), fine-tune Conv 4-5, and for FC 6-8 add multi-kernel MMD as a regularizer when optimizing the CNN parameters
      • Sejdinovic et al.(2013, Cit. 610) : presented MMD(Maximum Mean Discrepancy) as a statistic for testing whether two samples share the same distribution
    • In short: fit the CNN parameters, with an extra constraint that the source and target hidden representations produced at the FC layers become similar
  • ConvLSTM(2015, Cit. 6876) : the original Fully Connected LSTM handles 1D time-series → extended with spatial information (row, column) so that it handles 3D data
    • The Hong Kong Observatory wanted to forecast precipitation from radar echo images, but a plain LSTM performed poorly, probably because it cannot capture spatial structure → proposes running the image through convolutions as part of the LSTM gating (the matrix products inside the cell become convolutions)

3) Formulations

  • Spatio-Temporal Data : features recorded over a 2D space that change over time. A single feature is therefore 3D data by default.
  • The paper handles data generated in different regions and divides each into the same number of grid cells.
    • Seoul, Daejeon, New York, … cities differ in size and shape, but the grid is drawn so that the number of cells is the same.

    The space covered by the data is divided into m*n grid cells. Each cell region holds some information at time t (traffic volume, precipitation, etc.), and the matrix of those values is called a spatio-temporal image.
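A minimal sketch of the gridding step, assuming each record is a (latitude, longitude) point and the city is described by a bounding box. The function name `to_grid_image`, the `bbox` layout, and the sample coordinates are illustrative, not taken from the paper.

```python
import numpy as np

def to_grid_image(lats, lons, bbox, m, n):
    """Count records per cell of an m x n grid laid over bbox.

    bbox = (lat_min, lat_max, lon_min, lon_max); this layout is an assumption.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    # Scale each coordinate to [0, 1), then to a row/column index.
    rows = np.floor((lats - lat_min) / (lat_max - lat_min) * m).astype(int)
    cols = np.floor((lons - lon_min) / (lon_max - lon_min) * n).astype(int)
    # Drop records that fall outside the bounding box.
    keep = (rows >= 0) & (rows < m) & (cols >= 0) & (cols < n)
    image = np.zeros((m, n), dtype=int)
    # np.add.at accumulates correctly when several records share a cell.
    np.add.at(image, (rows[keep], cols[keep]), 1)
    return image

# Two cities with different extents still give images of the same shape.
city_a = to_grid_image([0.1, 0.1, 0.9], [0.2, 0.2, 0.8], (0.0, 1.0, 0.0, 1.0), 4, 4)
city_b = to_grid_image([10.5, 19.0], [5.0, 45.0], (10.0, 20.0, 0.0, 50.0), 4, 4)
```

Because the cell index depends only on the relative position inside the bounding box, cities of any size map onto the same m*n layout.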

  • Calling each grid-shaped matrix an image, the time-series of images recorded at every time step forms a 3D tensor.
    • If Seoul's Ttareungi (public bike) traffic (a feature) were observed for about twelve hours, the data could be described by a spatio-temporal tensor like the one below.

    The image changes over time, and stacking the past k images as of time t gives a 3D tensor like the one above. This tensor is the basic unit of most of the reasoning that follows.

  • Each tensor is the most recent k images counted back from its top (latest) image. Extracting such a bundle again after every 1-step advance yields a 4D list of tensors.
    • List with parameters : Row(m) * Column(n) * Accumulation(k) * Time-stamp(t)
    • Call this list the tensor set and its length ‘L’.
    • A domain with plenty of data (long period) gives a long set; a data-scarce domain gives a short one.

    Tensors carry information, and the amount differs by domain. In this example the Seoul taxi passenger data spans about four days (as of the latest update), while the Ttareungi traffic data covers half a day at most, so the question is how to bring in the taxi information from the other domain. That is the core topic of this paper.
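The tensor set can be sketched as a sliding window over the image series. This is an illustration under my own assumptions: the axis order (L, k, m, n), the function name `build_tensor_set`, and the synthetic Poisson data are not from the paper.

```python
import numpy as np

def build_tensor_set(images, k):
    """Slide a window of k consecutive images over a (T, m, n) series.

    Returns an array of shape (L, k, m, n) with L = T - k + 1, where entry i
    holds images i .. i+k-1 (its last image is the 'latest' one).
    """
    images = np.asarray(images)
    T = images.shape[0]
    if k > T:
        raise ValueError("k must not exceed the number of images")
    return np.stack([images[i:i + k] for i in range(T - k + 1)])

rng = np.random.default_rng(0)
taxi = rng.poisson(5.0, size=(96, 8, 8))   # long source series (hypothetical)
bike = rng.poisson(2.0, size=(12, 8, 8))   # short target series (hypothetical)
taxi_set = build_tensor_set(taxi, k=4)     # L = 93
bike_set = build_tensor_set(bike, k=4)     # L = 9
```

The two sets share k, m and n and differ only in L, which is the data-rich versus data-scarce contrast described above.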

2. Main Architecture

  • Basic features are captured by a stacked ConvLSTM; DAN (generalized CNN) is applied to the resulting hidden state; finally Global Attention is applied and other features are added

The paper's main figure. It splits broadly into 1) ConvLSTM, 2) CNN with MMD (DAN), and 3) Global spatial attention.

1) Representation Learning (ConvLSTM)

  • Input = Tensor set(4D), but the work is done per image(2D) → each image passes through a convolution, producing a new tensor set → this goes into the LSTM input gate + is combined with the previous hidden state tensor set + … (again in 2D units) → repeat
  • Call the final output, after all stacked LSTM layers, ‘H’
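A simplified single-layer, single-channel ConvLSTM step in NumPy, to make the "convolve, then gate" flow concrete. Assumptions to note: no peephole terms, 3x3 kernels, random weights, and helper names (`conv_same`, `convlstm_step`, `run_convlstm`) that are mine and not from the paper or its repository.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same(x, kernel):
    """2D cross-correlation of an (m, n) image with an odd-sized kernel, zero-padded."""
    ph, pw = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    windows = sliding_window_view(padded, kernel.shape)  # (m, n, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, Wx, Wh, b):
    """One ConvLSTM step; Wx, Wh map gate name -> kernel, b maps gate name -> bias."""
    # Every gate sees the convolved input image and the convolved previous hidden state.
    pre = {g: conv_same(x, Wx[g]) + conv_same(h, Wh[g]) + b[g] for g in "ifog"}
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = np.tanh(pre["g"])
    c_next = f * c + i * g            # cell state keeps the m x n layout
    h_next = o * np.tanh(c_next)
    return h_next, c_next

def run_convlstm(tensor, Wx, Wh, b):
    """Feed the k images of one (k, m, n) tensor through the cell; return the final h."""
    h = np.zeros(tensor.shape[1:])
    c = np.zeros(tensor.shape[1:])
    for x in tensor:
        h, c = convlstm_step(x, h, c, Wx, Wh, b)
    return h

rng = np.random.default_rng(0)
Wx = {g: rng.normal(scale=0.1, size=(3, 3)) for g in "ifog"}
Wh = {g: rng.normal(scale=0.1, size=(3, 3)) for g in "ifog"}
b = {g: 0.0 for g in "ifog"}
H = run_convlstm(rng.random((4, 8, 8)), Wx, Wh, b)  # same m x n shape as the input images
```

Stacking would feed each layer's sequence of hidden states into the next layer as its input images; that part is omitted here for brevity.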

2) Knowledge Transfer (DAN)

  • MMD is a distance that measures how different the distributions of two domains are.
  • A CNN is applied to the hidden state of each domain, and the mmd loss is computed at every CNN layer and averaged.
  • Parameter set Θ = argmin Loss Function of (GT vs ConvLSTM & CNN & mmd_loss & …)
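A rough sketch of the MMD term and how it could enter the loss. Assumptions: Gaussian kernels with three equally weighted bandwidths, the biased (V-statistic) estimator, and a weight named `gamma` on the averaged per-layer MMD. The function names and the exact loss form are illustrative; the paper's formulation may differ in detail.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel between rows of a (p, d) and b (q, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd2(source, target, sigmas=(0.5, 1.0, 2.0)):
    """Biased estimate of squared MMD, averaged over several kernel bandwidths."""
    total = 0.0
    for s in sigmas:
        total += (gaussian_kernel(source, source, s).mean()
                  + gaussian_kernel(target, target, s).mean()
                  - 2.0 * gaussian_kernel(source, target, s).mean())
    return total / len(sigmas)

def total_loss(pred_error, layer_features, gamma):
    """Prediction error plus gamma times the mean per-layer MMD.

    layer_features is a list of (source_features, target_features) pairs,
    one pair per CNN layer, each flattened to shape (batch, d).
    """
    mmd_loss = np.mean([mmd2(fs, ft) for fs, ft in layer_features])
    return pred_error + gamma * mmd_loss

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(200, 8))
tgt_near = rng.normal(0.0, 1.0, size=(200, 8))   # same distribution as src
tgt_far = rng.normal(1.5, 1.0, size=(200, 8))    # shifted distribution
near, far = mmd2(src, tgt_near), mmd2(src, tgt_far)  # far should come out larger
```

Minimizing this sum over Θ pushes the per-layer source and target features toward each other while still fitting the ground truth.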

3) Global Spatial Attention

  • Local spatial correlations are captured at the CNN stage, but geographical dependencies over a wider range are not captured well.
    • Two regions that are geographically far apart often have a similar Point of Interest distribution
    • The same goes for spatio-temporal information such as taxi trips and crowd flow
  • Idea: when using source domain data, weighting it by an attention score may have the effect of checking global relations

Morning taxi passengers in Hongdae (source) resemble the bicycle traffic (target) in Hongdae and Nowon at the same hour. The domains differ, but the attention mechanism in effect picks up the underlying ‘commute to work/school’ factor. Seongsu is closer to Hongdae than Nowon is, but it is a ‘culture and arts’ district rather than a residential/business/school district, which could explain why few people cycle there in the morning.

  • Concretely, it checks how similar a given Region (i, j) of the source domain's 2D image is to each of the m*n regions of the target domain
    • All images in the paper are divided into grid cells of the same m*n size, so the matrix computation is straightforward.
    • Taking a dot-product and a softmax to build the attention matrix, I saw no major difference from the widely known attention mechanism
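The dot-product and softmax steps above can be sketched as follows. Each region is assumed to carry a d-dimensional feature vector. How the resulting scores are then applied (here, a weighted sum of target features per source region) is my assumption for illustration, not a claim about the paper's exact layer.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def global_spatial_attention(src_feat, tgt_feat):
    """src_feat, tgt_feat: (m, n, d) feature maps on the same m x n grid.

    Returns (attn, attended): attn[r, s] is how strongly source region r
    attends to target region s; attended has shape (m, n, d).
    """
    m, n, d = src_feat.shape
    q = src_feat.reshape(m * n, d)       # one row per source region (i, j)
    k = tgt_feat.reshape(m * n, d)       # one row per target region
    scores = q @ k.T                     # dot-product similarity, (m*n, m*n)
    attn = softmax(scores, axis=1)       # each source region's weights sum to 1
    attended = (attn @ k).reshape(m, n, d)
    return attn, attended

rng = np.random.default_rng(0)
src_feat = rng.normal(size=(4, 4, 8))
tgt_feat = rng.normal(size=(4, 4, 8))
attn, attended = global_spatial_attention(src_feat, tgt_feat)
```

Row r of `attn` compares one source region against all m*n target regions at once, so a distant but similar region (Nowon in the example above) can receive a high weight regardless of geographic distance.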

3. Algorithm & Code

1) Algorithm

[Figure: Algorithm 1 (algo 1.jpg)]

2) Real Code

https://github.com/MiaoHaoSunny/ST-DAAN

4. Evaluation

  • Evaluate ST-DAAN on crowd flow prediction from historical Taxi and Bike data

Taxi and bike datasets collected in several cities, each made up of variables such as GPS trajectories, origin/destination, time, and ID. Comparing number of trips and time span, DIDI is more data-scarce than TaxiNYC, which is also a taxi dataset.

  • Both Intra-city(TaxiNYC → BikeNYC) and Cross-city(BikeChicago → BikeNYC, DIDI → TaxiBJ) transfer cases were covered
  • Baseline models were drawn from both non-transfer learning and recent transfer learning based methods
    • non-transfer learning based : ARIMA, ConvLSTM, DCRNN, DeepST, ST-ResNet
    • transfer learning based : (the models above with fine-tuning), RegionTrans, MetaST

1) Comparison With Baselines

  • Performance improves in the order ARIMA < non-transfer < non-transfer with fine-tuning < transfer < ST-DAAN
    • Comparing the full ST-DAAN with variants that drop Attention and External features respectively shows that both components also contribute to performance

      ST-DAAN performs well in both the Intra-city and Cross-city cases. nonAtt and nonExt are the ST-DAAN variants without global spatial attention and without the inserted external features, respectively.

2) Effect of Data Amount

  • More data does help. Performance is good when both Source and Target have plenty of data

Prediction performance generally improves as the data length grows. The more, the better.

3) Parameter Sensitivity Analysis

  • In transfer learning with scarce data, stacking the network deeper actually causes overfitting
  • Domain discrepancy needs a moderate penalty. Too small and common knowledge is not transferred; too large and only domain-specific features are transferred

Too many layers at the ConvLSTM and CNN stages hurts, and the penalty hyper-parameter gamma also needs a moderate setting.

5. Others

  • TaxiBJ crowd flows were predicted with RegionTrans and ST-DAAN; in Rush hour, when many taxis are hailed, ST-DAAN beats RegionTrans → an intuitive example that may help in understanding the model?
    • My reading: the existing model is time invariant and fails to separate the characteristics properly, whereas ST-DAAN moves part of the way toward the GT

In the late-night hours, when few taxis are hailed, RegionTrans and ST-DAAN look alike, but in Rush hour ST-DAAN captures the GT fairly closely.
