The Database Society of Japan

dbjapan Mailing List Archive (2020)

[dbjapan] CFP: MediaEval 2020 “Insight for Wellbeing: Multimodal personal health lifelog data analysis” Data Analysis Challenge: Call for Participants

  • To: <dbjapan [at] dbsj.org>
  • Subject: [dbjapan] CFP: MediaEval 2020 “Insight for Wellbeing: Multimodal personal health lifelog data analysis” Data Analysis Challenge: Call for Participants
  • From: <dlzpj [at] nict.go.jp>
  • Date: Wed, 19 Aug 2020 16:06:38 +0900

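Dear members of the Database Society of Japan,

My name is Zhao, from the National Institute of Information and 
Communications Technology (NICT).

We are calling for participants in the MediaEval 2020 data analysis 
challenge “Insight for Wellbeing: Multimodal personal health lifelog 
data analysis”.

This year's challenge is a heterogeneous-data analysis challenge that 
combines atmospheric data collected in Tokyo with lifelog image data.

In addition, the final workshop will be open to online participation 
this year.

We look forward to many of you taking part.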

------------------------------------
Insight for Wellbeing: Multimodal personal health lifelog data analysis
------------------------------------



Task Schedule
31 July: Data release
30 October: Runs due
15 November: Results returned
30 November: Working notes paper
Early December: MediaEval 2020 Workshop


Task Description
Task participants create systems that derive insights from multimodal 
lifelog data that are important for health and wellbeing. The first 
dataset, namely “personal air quality data” (PAQD), includes air 
pollution data (PM2.5, O3, and NO2) and lifelog data (e.g., 
physiological data, tags, and images) collected using sensor boxes, 
lifelog cameras, and smartphones along the predefined routes in a city. 
The second dataset, namely “global air quality data” (GAQD), includes 
weather and air pollution data collected over the city and provided by 
the government and crawled from related websites.

Participants in this task tackle two challenging subtasks:

Personal Air Quality Prediction with public/open data: Participants 
predict the values of personal air pollution data (PM2.5, O3, and NO2) 
using only weather data (wind speed, wind direction, temperature, 
humidity) and air pollution data (PM2.5, O3, and NO2) from public/open 
data sources (e.g., stations, websites). This subtask investigates 
whether public/open data can be used to predict personal air pollution 
data. The personal air pollution data can be regarded as regional air 
pollution data, since they are collected locally by people who carry 
personal equipment. In other words, the ground truth is the data 
collected by sensor boxes carried by people.

Personal Air Quality Prediction with lifelog data: Participants predict 
the personal Air Quality Index using images captured by people (plus 
GAQD). This subtask investigates whether lifelog data alone (i.e., 
pictures of the surrounding environment, annotations, and comments), 
plus some data from open sources (e.g., weather and air pollution 
data), can be used to predict personal air pollution data.
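
As one possible starting point for the first subtask, the sketch below 
estimates a personal PM2.5 value at a GPS position by inverse-distance 
weighting of station readings. It is not the organizers' baseline. The 
function names, the (lat, lon, value) tuple layout, and the coordinates 
and readings in the usage part are illustrative assumptions, not part of 
the released data.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def idw_estimate(point, stations, power=2.0):
    """Inverse-distance-weighted estimate at `point`.

    point    -- (lat, lon) of the person carrying the sensor box
    stations -- iterable of (lat, lon, value) station readings (assumed layout)
    power    -- distance exponent; 2.0 is a common default, not a task setting
    """
    lat, lon = point
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, value in stations:
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d < 1e-9:
            # The point coincides with a station: use its reading directly.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("at least one station reading is required")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Made-up station positions and PM2.5 readings, for illustration only.
    stations = [(35.69, 139.70, 10.0), (35.60, 139.80, 30.0)]
    print(idw_estimate((35.68, 139.76), stations))
```

Such an estimate ignores wind and temperature, so it is at most a 
reference point against which a learned model can be compared.
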

Motivation and Background
The association between people’s wellbeing and the properties of the 
surrounding environment is an essential area of investigation. Although 
these investigations have a long and rich history, they have focused on 
the general population. There is a surprising lack of research 
investigating the impact of the environment on the scale of individual 
people. On a personal scale, local information about air pollution 
(e.g., PM2.5, NO2, O3), weather (e.g., temperature, humidity), urban 
nature (e.g., greenness, liveliness, quietness), and personal behavior 
(e.g., psychophysiological data) plays an essential role. It is not always 
possible to gather plentiful amounts of such data. As a result, a key 
research question remains open: Can sparse or incomplete data be used to 
gain insight into wellbeing? Is there a hypothesis about the 
associations within the data so that wellbeing can be understood using a 
limited amount of data? Developing hypotheses about the associations 
within the heterogeneous data contributes towards building good 
multimodal models that make it possible to understand the impact of the 
environment on wellbeing at the local and individual scale. Such models 
are necessary since not all cities are fully covered by standard air 
pollution and weather stations, and not all people experience the same 
reaction to the same environmental situation. Moreover, images captured 
from a first-person view could give essential cues for understanding the 
environmental situation in cases where precise data from air pollution 
stations are lacking.

Let us imagine the following scenario. Yamamoto-san uses the 
Image-2-AQI app to learn how harmful the air pollution is simply by 
feeding captured images to the app. Simultaneously, at the urban air 
pollution center, the air pollution map is updated with Yamamoto-san’s 
contribution (e.g., images, annotations). With a few clicks on his 
smartphone, Satoh-san has the environment-based risk map application 
show him a good route from A to B with less congestion and less harmful 
air pollution. At the same time, the route from A to B is less congested 
because fewer people happen to be traveling on it. Such simple apps are 
part of a sustainable, co-existing human-environment system that changes 
people’s pro-environmental behaviors.

The critical research question here is, “Can personal air quality be 
predicted using other data that are easy to obtain?”

Target Group
This task targets (but is not limited to) researchers in the areas of 
multimedia information retrieval, machine learning, AI, data science, 
event-based processing and analysis, multimodal multimedia content 
analysis, lifelog data analysis, urban computing, environmental science, 
and atmospheric science.

Data
The personal air quality data (PAQD) were collected from March to April 
2019 along the marathon course of the Tokyo 2020 Olympics and the 
running course around the Imperial Palace using wearable sensors. There 
were five data collection participants assigned to five routes to 
collect the data. Routes 1–4 were along the marathon course for the 
Tokyo 2020 Olympics. Route 5 was the running course around the Imperial 
Palace. The length of each route was approximately 5 km. Each 
participant started data collection at 9 am every weekday, and it took 
approximately one hour to walk each route. Collected data contain 
weather data (e.g., temperature, humidity), atmospheric data (e.g., O3, 
PM2.5, and NO2), GPS data, and lifelog data (e.g., images, annotation).

The global air quality data (GAQD) contain the atmospheric monitoring 
station data collected by the Atmospheric Environmental Regional 
Observation System (AEROS) in Japan (http://soramame.taiki.go.jp). AEROS 
provides real-time atmospheric data every hour for 2032 meteorological 
monitoring stations across Japan. The atmospheric data include eleven 
types of air pollutant data (SO2, NOx, NO, NO2, CO, Ox, NMHC, CH4, THC, 
SPM, and PM2.5) and four types of meteorological data (wind direction, 
wind speed, temperature, and humidity).
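
Because the station data are hourly while the personal data are 
collected continuously during a walk of about one hour, the two sources 
have to be aligned in time before they can be compared. The sketch below 
shows one simple way to do this: truncate each personal timestamp to the 
hour, average the personal readings per hour, and pair each average with 
the station value for that hour. The record layouts and function names 
are assumptions for illustration; the actual CSV columns may differ.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hour_key(ts):
    """Truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def pair_with_hourly(personal, hourly):
    """Pair hourly means of personal readings with station readings.

    personal -- iterable of (datetime, value) from a sensor box (assumed layout)
    hourly   -- dict mapping hour-start datetime -> station value (assumed layout)
    Returns a list of (hour, mean personal value, station value), sorted by
    hour, for the hours present in both sources.
    """
    buckets = defaultdict(list)
    for ts, value in personal:
        buckets[hour_key(ts)].append(value)
    return [
        (hour, mean(values), hourly[hour])
        for hour, values in sorted(buckets.items())
        if hour in hourly
    ]


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    personal = [
        (datetime(2019, 3, 4, 9, 5), 10.0),
        (datetime(2019, 3, 4, 9, 35), 20.0),
    ]
    hourly = {datetime(2019, 3, 4, 9): 12.0}
    print(pair_with_hourly(personal, hourly))
```

Averaging per hour discards the variation within a route, so a model 
that uses GPS position may prefer to keep the individual readings and 
attach the hourly station value to each of them instead.
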

All data are stored in CSV format, except images, which are in JPG 
format. Personal data are privacy protected. All task participants 
should sign the agreement on using these data, released by MediaEval and 
NICT-Japan, for research purposes only.


Task Organizers
Minh-Son Dao (NICT, Japan) dao (at) nict.go.jp

Peijiang Zhao (NICT, Japan) dlzpj (at) nict.go.jp

Ngoc-Thanh Nguyen (UIT, Vietnam) thanhnn.13 (at) grad.uit.edu.vn

Thanh-Binh Nguyen (HCMUS, Vietnam) ngtbinh (at) hcmus.edu.vn

Duc-Tien Dang-Nguyen (UiB, Norway) ductien.dangnguyen (at) uib.no

Cathal Gurrin (DCU, Ireland) cgurrin (at) computing.dcu.ie

Task Auxiliaries
Tan-Loc Nguyen-Tai (UIT, Vietnam)

Dang-Hieu Nguyen (UIT, Vietnam)

Minh-Tam Nguyen (UIT, Vietnam)

Quoc-Dat Duong (HCMUS, Vietnam)

Minh-Quan Le (HCMUS, Vietnam)

Trong-Dat Phan (HCMUS, Vietnam)

-----------------------------------------
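National Institute of Information and Communications Technology (NICT)
Big Data Integration Research Center
Big Data Analytics Laboratory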

Peijiang Zhao
趙 培江(チョウ バイコウ)

e-mail:dlzpj [at] nict.go.jp
tel:042-327-5375
-----------------------------------------