

Xiaohui Yu


School of Information Technology

Associate Professor

Office: 3050 Victor Phillip Dahdaleh Building (DB)
(formerly known as Technology Enhanced Learning Building)
Phone: 416-736-2100 Ext: 33887
Email: xhyu@yorku.ca
Primary website: http://www.yorku.ca/xhyu


I am an Associate Professor in the School of Information Technology, York University. I received a BSc degree from Nanjing University, China, an MPhil from the Chinese University of Hong Kong (under the supervision of Prof. Ada Fu), and a PhD from the University of Toronto (under the supervision of Prof. Nick Koudas). I am a member of the Graduate Program in Electrical Engineering and Computer Science at York University.


My current research centres around managing and analyzing big data arising from a variety of contexts, such as intelligent transportation systems, location-based services, and social networks. In particular, I am interested in developing algorithms and systems to support the real-time processing of streaming data at scale, as well as making sense of such data by learning mobility patterns through modeling the interactions of spatial, temporal, and personal aspects. My research has been supported by NSERC, BRAIN Alliance, and industrial partners such as IBM and Huawei.

Degrees

PhD, University of Toronto
MPhil, Chinese University of Hong Kong
BSc, Nanjing University, China

Research Interests

Information Technologies, Big data management and analytics

Current Research Projects


    Project Type: Funded
    Funders:
    Minor Research Grant (York University)

    Project Type: Funded
    Funders:
    ATK Fellowship (York University)

    Project Type: Funded
    Funders:
    Minor Research Grant (York University)

Managing and Mining Urban Spatio-Temporal Data

    Summary:

    The widespread use of smartphones, sensors and other IoT devices in cities worldwide has given rise to a huge volume of urban spatio-temporal data, which often present themselves as high-velocity continuous streams with considerable noise and uncertainties. These data record a vast amount of movement information of people, vehicles, etc., and serve as the backbone of a variety of applications, such as urban traffic management, road network planning, location-based services, and environmental monitoring. While governments, businesses and other organizations have realized the tremendous value of urban spatio-temporal data, effectively tapping into this potential remains an elusive goal.

    The unifying theme of the project is to address the challenges arising from managing and mining urban spatio-temporal data. Some of the questions we strive to answer are: How can the quality of such data be improved to provide a reliable basis for data analytics? How can continuous queries (such as k-nearest-neighbor queries) be processed efficiently, and patterns discovered, over spatio-temporal streams? How can a probabilistic model be constructed to capture the underlying intention of movement? How can this model be used to support advanced applications, such as traffic flow forecasting, dynamic navigation, and next-location prediction?

    Novel models and methods developed from this project will help lay the data management and analytics foundation for a wide spectrum of applications, and provide a better understanding of human mobility patterns.

    Project Type: Funded
    Role: Principal Investigator

Conference Papers

Publication
Year

J. Dong, X. Yu. CSR+-tree: Cache-conscious Indexing for High-dimensional Similarity Search, in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), Banff, Canada, July 2007. (pdf)

2007

Conference Proceedings

Publication
Year

K. Pu, X. Yu. Keyword Query Cleaning, to appear in Proceedings of the 34th International Conference on Very Large Data Bases (VLDB 2008), Auckland, New Zealand, August 2008.

2008

X. Yu, Y. Liu. Reasoning about Similarity Queries in Text Retrieval Tasks, in Proceedings of the 17th International World Wide Web Conference (WWW 2008), Beijing, April 2008.

2008

M. Hadjieleftheriou, X. Yu, N. Koudas, D. Srivastava. Selectivity Estimation of Set Similarity Selection Queries, to appear in Proceedings of the 34th International Conference on Very Large Data Bases (VLDB 2008), Auckland, New Zealand, August 2008.

2008

Y. Liu, J. Huang, A. An, X. Yu. ARSA: A Sentiment-Aware Model for Predicting Sales Performance Using Blogs, in Proceedings of the 30th Annual International ACM SIGIR Conference (SIGIR 2007), Amsterdam, July 2007. Acceptance rate: 18%. (pdf)

2007

C. Zuzarte, X. Yu. Fast Approximate Computation of Statistics on Views. In Proceedings of the ACM SIGMOD Conference (SIGMOD 2006), Chicago, IL, June 2006. (industry talk)

2006

X. Yu, N. Koudas, C. Zuzarte. HASE: A Hybrid Approach to Selectivity Estimation for Conjunctive Predicates. In Proceedings of the 10th International Conference on Extending Database Technology (EDBT 2006), Munich, Germany, March 2006. Acceptance rate: 16%. (ps, pdf, ppt)

2006

S. Guha, N. Koudas, D. Srivastava, X. Yu. Reasoning About Approximate Match Query Results. In Proceedings of the 22nd International Conference on Data Engineering (ICDE 2006), Atlanta, USA, April 2006. Acceptance rate (full paper): 12.9%. (ps, pdf, ppt)

2006

X. Yu, K. Q. Pu, N. Koudas. Monitoring k-Nearest Neighbor Queries over Moving Objects. In Proceedings of the 21st International Conference on Data Engineering (ICDE 2005), Tokyo, Japan, April 2005. Acceptance rate: 12.9%. (ps, pdf, ppt)

2005

X. Yu, C. Zuzarte, K. Sevcik. Towards Estimating the Number of Distinct Value Combinations for a Set of Attributes. In Proceedings of the ACM 14th Conference on Information and Knowledge Management (CIKM 2005), Bremen, Germany, November 2005. Acceptance rate: 18%. (ps, pdf, ppt)

2005


Current Courses

Term         Course Number    Section  Title                                     Type
Winter 2019  AP/ITEC4230 3.0  M        Data Warehousing & Business Intelligence  LECT
Winter 2019  GS/ITEC6220 3.0  M        Advanced Information Management           LECT


